The thought and knowledge of communities and users need to belong to those communities and users. To people they intentionally and thoughtfully delegate to and trust. We need to decentralize our communications, like how the internet used to be before the arrival of social media and mega forums. We need to revert to small, focused forums, with less anonymous, more persistent communication, run by people we trust. Otherwise, we will continue to see mega companies harvest our data and use it (or not provide it) against our wishes. If we don’t work to mitigate that dynamic, we have nobody to blame for the poor outcomes but ourselves.
Unfortunately the dumps themselves are not a legal requirement, just a gentleman's agreement, so realistically exercising this ability was still at the whim of the company.
This reminds me of the promise OpenAI was built on. Unfortunately, it turned out to be too bold a claim to be honored, and too good to be true [0]
Even if the underlying tech is decentralized, the community will settle around one or a few big instances (for example, Gmail and GitHub) which often end up having significant control over the trajectory of the entire ecosystem. If you run your own email server and you get put onto Google's spam list - you're fucked.
Email is a great example where most people wouldn't be interested in a version of email that only lets you email other @gmail.com users. Having an email address that can contact anyone, a phone number that can ring any other phone number etc. instead of being locked into a single corporation's network is a clear value add that people care about.
The main issue from my perspective is that we only have a select few large tech companies that operate as monopolies so are effectively able to block out new decentralized protocols from coming to be.
RCS messaging is a great example which I think most people would use over alternatives like WhatsApp and iMessage, except that Apple refusing to support it locks a huge fraction of the market out and stops widespread adoption being possible.
I don't think it's a question of preference, or people being uninterested. It's just a boring and repeated story of corporate monopolies intentionally reducing consumer choice.
It's even worse than that: I ran my own email server, and for some reason Gmail delayed any emails from their system to outside of their system. That meant that people would send me an email but I wouldn't get it for 20 minutes. These delays don't exist when using big email providers (it stopped being a problem when I switched to Fastmail, for example) but if you're running a small server Google makes it a nightmare.
The core issue is that user-generated data is owned by one individual company. There are existing systems that don't have this issue, e.g. Usenet or BitTorrent.
We don't need to idiot-proof the web. There are enough people to gather some place for a social network even if it's hard to use. The others can stay and will stay on reddit anyway until one day when they have also had enough and learn to use some alternative.
Well, yes. But the risks of centralized sites have also been demonstrated over and over again. Maybe, just maybe, one of these days that lesson will stick and communities will take a longer term view. It seems like the cycle is happening a bit faster each time now, so maybe folks will get tired of the "damn, time to move to another site, again..." thing.
Or not. Convenience tends to trump lots of other considerations most of the time.
Maybe, maybe not. It's worth a shot, though, isn't it?
Is there not significant innovation and benefit that was designed and implemented in the first place that caused users to contribute their time, thought and energy?
I think the real problem here is when organizations that rely on a crowd-sourced business model decide they just have to be billionaires or solve all the world's problems with their platforms, instead of just staying true to their model. I don't see what's wrong with just running a highly successful business that makes money for its founders and doesn't have to go out and strive every day to be the next Facebook or Google.
Make no mistake. Platforms like Reddit and Stackoverflow are real, serious businesses. But why can't they exist and be a general successful business like your local mom and pop restaurant or toy store or whatever?
I run RadioReference.com and Broadcastify, both of which are significant businesses but also rely almost solely on crowd sourced data and content. We're wildly successful - but I've never seen the need to hire 3,000 people, or IPO, or do series raises to expand into solving world peace. Our premium subscription pricing has been the same for 15 years. I completely eliminated advertising on one of the platforms last year. We make a lot of money. We provide a lot of value to our communities, and we carefully innovate and expand to provide value. It's a nice happy life for everyone involved, and I don't have to deal with a VC who will be determined to either make a trillion dollars or torpedo my business.
Over the past quarter of a century of people trying to create online walled gardens of hosted content we've seen this happen over and over again, and the examples are so numerous that reddit was itself a replacement for Digg and StackOverflow for Experts Exchange. And yet, somehow, today, you suddenly woke up :(.
The reality is that we live in a dystopian Eternal September where as people finally notice what is going on and leave they are just replaced by new people who don't care or simply didn't use the prior service and are attracted to the new shiny, and another 25 years from now you're going to see people making the same unapologetic "I now realize" statements.
What we need to do is figure out how to actually replicate the feeling you are having in a way that doesn't require you to have spent years on a platform and then watching it die so it can be communicated to people before they bother to use a new platform, and in a way that somehow makes them willing to collectively not experience viral lock-in.
(And we also need to figure out how to make people willing to accept doing that at some cost to themselves, whatever form that might take: people on HN continuously do the thing where they give up freedom for a little temporary convenience and then get angry at others for daring to suggest that something a bit harder to use or with any extra friction would ever be a sane thing for anyone to use :/.)
Back in 2017 I gave a talk at Mozilla Privacy Lab called "That's How You Get Dystopia" where I just documented a ton of examples of abuse of centralized power and the reality is that every few days I just come across more stuff to add to the list... and this talk doesn't even bother with all the numerous services that simply enshittified or shuttered.
I was saying to someone yesterday that "enshittification" is a sub-optimal coinage for something that really shouldn't need a new term, and which focuses attention on symptoms rather than root causes. If you give someone a power of attorney over your assets, you'll likely find that they start behaving less well towards you. Or if you give up agency, others will treat you like less of an agent. But what matters is not their behavior at the end but your decisions before that point.
You're seriously underestimating the effort it took to build that platform and how much effort it continues to take to keep it running well. I'm not talking about technical challenges, but social ones. It took a long time for them to get the system and incentives right (and it's still not quite right, IMHO), and it takes continued effort to keep it running well in the form of moderation and stopping abuse (and here it also doesn't quite get things right).
I could bang out a "BufferUnderrun.com" in a few months; many people could. But that's not the hard part.
Once upon a time, most people who wanted or had something to say wrote their own little website and hosted it themselves (be it in a datacenter or a server in their closet). Some even ran forums and got fancy with server-side magic because that's what nerds do. Even the kids who couldn't afford anything had free, basic hosting services to choose from (anyone remember those days?).
The internet was designed as a distributed network and the denizens then were distributed. You only got as centralized as a given ISP or datacenter provider.
Of course, we all know as more and more commoners came onto the internet they didn't want to bother with developing or hosting or maintaining a website or anything. They just wanted to shitpost, for free, with blackjack and hookers.
And so "free" services like Reddit, Facebook, et al. came about to serve that demand. Information became centralized, because who the fuck has time to be responsible? Offload that crap!
The cost of that offloading of responsibility has now come knocking with debt collectors in tow, with interest.
I guess what I'm trying to say is: We don't need to rethink anything. We just need to take some god damn responsibility for ourselves. Responsibility is power, and with power you can tell commercial interests you disagree with to screw off.
In fact, people don't record what they talk about in pubs because the point is the chat experience not the records of previous chats. Data isn't oil and it isn't quite sewage, it's more like quicksand or thickets of weeds growing and tangling around your feet. Like minimalists say 'stuff is bad' but stuff is useful, it's having stuff hidden in cupboards and drawers and a garage full of stuff and wanting a bigger house to hold more stuff and most of the stuff going unused because you can't bring yourself to let go of it, and companies advertising that more and newer stuff will make your life better and solve your problems, which is the biggest problem with stuff. Sufficientism might be a more appropriate name - enough stuff to make your life better and no more.
Enough chat to make your life better isn't "all of it kept forever".
Seems to me SO built and delivered huge, huge amounts of value and it’s now all at risk because multibillion dollar companies are free riding.
It's not that SO has a moral right to control and profit from that content. The reality is that SO holding that content at all is a conditionally granted privilege that the community affords the site, and it is a privilege that was always designed to be revocable and the data moveable if SO started abusing its position of power as a host and trying to lock down access.
Some writing/content sites have taken steps to restrict AI access based specifically on community request. That's a very different situation; if a community (particularly a closed or close-knit community) is collectively and (mostly) uniformly trying to avoid an AI scraping the content that they created, then good for them. There are communities online that are in that position. But "how will the company get reimbursed for our valuable asset" should not be part of that conversation. And SO in particular was set up around norms that deliberately allowed this kind of scraping. It's not their asset to protect.
> rent-seeking AI models
I have issues with modern AI economic models too but I don't think that "rent-seeking" is an accurate term to use. A better word would probably be "parasitic"; I understand (and at least somewhat agree with) the argument that OpenAI is looking to repackage information it didn't create in a way that redirects attention away from the original source of information.
But I'm having a really hard time figuring out how OpenAI is hoarding a scarce asset to extract value by controlling access to that asset. The more obvious rent seeking behavior here is coming from SO, a company trying to restrict access to Creative Commons licensed content created for free by unpaid volunteers, and trying to reclassify that content as their corporate property.
I guess being as charitable as possible, I do worry about the SaaS model of many AIs that are dedicated to content generation, and I worry a little bit about AI models becoming heavily integrated into creative processes and then extracting a kind of monetary "creative tax" from artists/creators while heavily restricting what they are allowed to make. That's at least adjacent to rent seeking, but I'm still not sure it's the term I would use and I'm not convinced it's a scenario that's applicable here.
I would say: Need to be a public resource, belonging to no-one, i.e. no person, or group, or company should have legitimacy in denying access to it. They should all be considered _trustees_ of such a resource.
> when all they did is provide the platform, and store the data.
To be fair, SE Inc. did a lot more than provide the platform. A lot of development and design work, publication, a bit of the curation work, etc. I don't like how they behave but let's give them what they're due.
---
Also note the ongoing Moderator Strike (!): https://meta.stackexchange.com/q/389811/196834
The idea that these networks and communities need to run on centralised servers is archaic. The technology exists where people should be able to own their own network (followers, subs, following, posts).
And that's exactly why (ignoring the scammers or pump-and-dump businesses) it saw such heavy investment from VC/tech types. The promise they were interested in wasn't democratization even if that's what they told their users -- what they were interested in was taking a plentiful resource (digital bits) and building a scarce asset that they could use to further entrench exclusivity, status, and monopolistic control over what that asset represented.
Read back over every sales pitch for web3 games. At some point they always devolve into talking about how ordinary users will be able to rent seek: to "license" characters/weapons/gear and passively earn income from other players, or to hoard exclusive tokens/releases in the game and speculate on their future value. Web3 looked at infinite digital spaces and its response was, "infinity is a problem that we need to solve." And it's revealing to look at most web3 branded metaverse attempts and see just how quickly they reintroduced real-world concepts like housing/space scarcity (why on earth would we want a housing market in a digital space with no physical constraints?), and how quickly they leaned into cosmetics and customization as a monetization strategy rather than a user right to free expression.
In general, if a technological "paradigm" is primarily associated with and primarily popular with VC firms, it's probably not being developed with the user in mind.
On the other hand: federation, interoperability, mobile identities, and legal efforts to build a right to data export existed independently of web3 and have shown a lot more promise when it comes to actually increasing user agency.
Centralized servers aren’t archaic, they’re a natural outcome of how social systems work: finding communities is hard; people want to contribute their ideas, not play sysadmin; spammers and AI researchers will create enormous costs for you; etc. If you federate, you will spend more time dealing with those issues than a single focused competitor and you are unlikely to see free contributions which outweigh those costs.
Everything you mentioned is available now on Mastodon, and it’s really interesting to see how that works. Some people love having a small network of their friends, but a lot of people have trouble finding people they want to follow. Instances can have their own rules but dealing with abuse is now a multiparty process and since a lot of instances are run by volunteers that can be slow, unreliable, and inconsistent. Some small servers get hammered by storage and bandwidth demand but there’s no great path to monetization unless you have a ton of users willing to pay more than most people are used to paying for internet services.
In general, these are social problems and there is only so much technology can do to improve them.
Well no, what we need is web0, the original premise of the Internet.
Every protocol was documented in open RFCs, everything is decentralized and everyone is free to use any client and server (or write their own) and everything interoperates. Nobody can own it, there's no "it" to own. That's the only solution to eliminate the otherwise neverending cycle of proprietary platforms followed by their inevitable "oh s* moment".
This problem is inherent to client/server software, and there are really only three ways to do it:
1. The server side of client/server is centralized and run by corporations
2. The server side is decentralized, meaning everyone has their own server
3. Abandon the server, clients connect directly to each other without a server intermediating
Option 3 would be ideal, but would require significant technological advances - it'll be a looong time before bandwidth is cheap enough that Kim Kardashian can serve photos and movies to all of her fans direct from her phone. Option 1 is what we have now, and is terrible in a variety of ways.
Option 2 would be hard but is not obviously impossible, so still our best bet - sure, it's not viable now, but it sure seems like it could be, if an iPhone's worth of R&D were put into it. I would honestly be amazed if no one at Amazon is working on such a thing, since no one would benefit more than AWS from a future in which a cloud VM becomes one of the things that most middle-class families rent monthly.
You're onto something. Team-BHP [1] is run exactly like this, and it seems to be working.
For those wondering, it's a car-enthusiasts website based in India. They've been around for 18-odd years, I think.
The moderators all have actual dayjobs.
When signing up you have to write a paragraph about why you're really a petrolhead (or dieselhead because Indians love European turbo-diesels :) ), and there's a human on the other end vetting your sign-up application! Plenty, including me, have been rejected at least once. I got in on my 2nd attempt years later.
As a matter of principle they refuse to do car advertisements.
I don't know how well the site is engineered but it works. Check it out. But I suspect most non-Indians (such as most people on HN) wouldn't find it that useful as it's mostly about the Indian car scene.
Imagine writing a text book with a royalty publishing deal. Your publisher decides they're going to use your book, amongst others, to train an LLM that can answer questions on your subject, and they're not going to pay you anything.
It's a legal gray area and they've got teams of lawyers whereas you do not.
I wish this lesson could be learned once and for all.
A long-lived community/repository cannot be built on a proprietary platform owned by some corporation. Full stop, no exception. It can't be done. A corporation will at some point need to maximize profit extraction which will ruin it for everyone. A corporation also won't support a platform forever nor can the entity itself survive forever. A single point of failure can't last forever.
> We need to decentralize our communications
Look at the solutions which have lasted longest. Email & mailing lists, going strong since the 1970s. Completely decentralized, interoperability defined by open standard protocols, anyone can build interoperable clients and servers. Nobody owns it. There's no "it" to own. That is what's needed for long term viability.
I'm kind of curious what is next.
Originally over UUCP (Unix-to-Unix Copy) and done via dial-ups at night (when the rest of the batch transfers were done - email too with the old bang path). The two servers would exchange all the batched email and news posts that were routed to the other side.
RFC 977 ( https://www.w3.org/Protocols/rfc977/rfc977 ) has an example of how files are copied between the two systems (section 4.6) including fetching and receiving mail.
Note that not all outbound posts are necessarily of interest to the other server. An IHAVE message could come back with either an "I want it" type response or a "not interested" one.
> The IHAVE command informs the server that the client has an article whose id is <messageid>. If the server desires a copy of that article, it will return a response instructing the client to send the entire article. If the server does not want the article (if, for example, the server already has a copy of it), a response indicating that the article is not wanted will be returned.
That's how some of the moderation worked - your server would say "I don't want anything that came by way of X host" or "not interested in that newsgroup."
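A rough sketch of that exchange, for anyone who hasn't seen it. This is a toy in-memory model, not a real NNTP server: the class name, fields, and the choice to apply the host and newsgroup filters after the article arrives are my own assumptions for illustration. The numeric codes (335 send it, 435 not wanted, 235 transferred, 437 rejected) are the ones RFC 977 lists for IHAVE.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNewsServer:
    """Hypothetical in-memory peer; not a real NNTP implementation."""
    carried_groups: set[str]
    blocked_hosts: set[str] = field(default_factory=set)
    articles: dict[str, str] = field(default_factory=dict)

    def ihave(self, message_id: str) -> int:
        # IHAVE only carries the message-id, so the first decision is
        # simply "do I already hold this article?"
        return 435 if message_id in self.articles else 335

    def receive(self, message_id: str, headers: dict[str, str], body: str) -> int:
        # Path is a "!"-separated list of hosts the article passed through.
        path_hosts = set(headers.get("Path", "").split("!"))
        groups = {g.strip() for g in headers.get("Newsgroups", "").split(",")}
        if path_hosts & self.blocked_hosts:
            return 437  # rejected: came by way of a blocked host
        if not groups & self.carried_groups:
            return 437  # rejected: none of its newsgroups are carried here
        self.articles[message_id] = body
        return 235  # article transferred OK


server = ToyNewsServer(carried_groups={"comp.lang.c"}, blocked_hosts={"spamhost"})
print(server.ihave("<1@hosta>"))  # offered for the first time
print(server.receive("<1@hosta>", {"Path": "hostb!hosta", "Newsgroups": "comp.lang.c"}, "hello"))
print(server.ihave("<1@hosta>"))  # offered again by another peer
```

In real deployments a lot of this filtering also lived in feed configuration on the sending side, so treat the placement here as one possible arrangement.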
One of the amusing things to me (looking back at this), if you're familiar with HTTP response codes, you'll likely get most of the way through the NNTP ones.
200 server ready - posting allowed
400 service discontinued
411 no such news group
500 command not recognized
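The resemblance comes from the first digit, which RFC 977 uses to group responses much as HTTP later did. A small sketch of that grouping, with the descriptions paraphrased from the RFC; the function name is made up for illustration:

```python
# First-digit meanings for NNTP responses, paraphrased from RFC 977.
NNTP_CLASSES = {
    1: "informative message",
    2: "command ok",
    3: "command ok so far, send the rest of it",
    4: "command was correct, but couldn't be performed",
    5: "command unimplemented, incorrect, or a serious program error occurred",
}


def nntp_class(code: int) -> str:
    """Return the broad class of a three-digit NNTP response code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid NNTP response code: {code}")
    return NNTP_CLASSES[code // 100]


for code in (200, 400, 411, 500):
    print(code, "->", nntp_class(code))
```

The mapping is only loose: the HTTP 3xx class means redirection, while the NNTP one means "keep sending".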
I'd also suggest a read of RFC 850 ( https://www.w3.org/Protocols/rfc850/rfc850.html ) for some other background and section 5: The News Propagation Algorithm
> run by people we trust
People change, or retire, just like corporation goals change.
Focusing on more independence is not enough. If you want truly unbreakable stuff, the first part of the puzzle is saving the user's handle and identity in a way that can't be removed.
Then finding out a way to link that to their content so when place of hosting it goes away people can follow to the new place
Then just have all of that content be signed by that identity so users can verify that it is really that person.
And I can't believe I'm saying that unironically but blockchain might just be the solution for that.
Something like immutable log of:
* user declaring "I'm jeff@example.com, here are my public keys". Servers then validate via DNS record or some .well-known location entry whether user is allowed to declare they are from @example.com
* user declaring "behold! jeff@example.com stuff is <here>, and <here> and <here> are addresses for various federation systems". Only passes if that request is signed with above privkey of course
* user declaring "behold! My new public key is X and Y. And Z key is revoked!"
* user declaring "behold! I am now george.effluent@company.com!" Re-does checks but for new domain, and users previously subscribed to jeff@example.com get served a redirect.
etc.
Then when server admin inevitably goes rogue you can take your posts and subscribers and go somewhere else.
And when @example.com owner decides "well I'm just going to redirect stuff to ads", you can just change your handle and direct people to right place, and other handle is forever taken.
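To make the log idea a bit more concrete, here is a minimal sketch of just two pieces: hash-chaining the entries so earlier ones can't be quietly edited, and following a handle through renames to its latest content location. Everything here is hypothetical (the class, the entry kinds, the field names, the URLs), and the signature and DNS/.well-known checks are deliberately left out because they would need a real public-key library.

```python
import hashlib
import json
from typing import Optional


def entry_hash(entry: dict) -> str:
    # Canonical JSON so the same entry always hashes the same way.
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()


class IdentityLog:
    """Toy append-only log; each entry commits to the hash of the previous one."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, kind: str, handle: str, **data: str) -> dict:
        # NOTE: a real system would verify a signature over the entry here.
        prev = entry_hash(self.entries[-1]) if self.entries else None
        entry = {"kind": kind, "handle": handle, "prev": prev, **data}
        self.entries.append(entry)
        return entry

    def verify_chain(self) -> bool:
        # Editing any entry that has a successor changes its hash,
        # which no longer matches the successor's "prev" field.
        expected = None
        for entry in self.entries:
            if entry["prev"] != expected:
                return False
            expected = entry_hash(entry)
        return True

    def resolve(self, handle: str) -> tuple[str, Optional[str]]:
        """Follow renames; return (current handle, latest content location)."""
        location = None
        for entry in self.entries:
            if entry["handle"] != handle:
                continue
            if entry["kind"] == "rename":
                handle = entry["new_handle"]
            elif entry["kind"] == "location":
                location = entry["url"]
        return handle, location


log = IdentityLog()
log.append("declare", "jeff@example.com", pubkey="KEY1")
log.append("location", "jeff@example.com", url="https://old.example/jeff")
log.append("rename", "jeff@example.com", new_handle="george.effluent@company.com")
log.append("location", "george.effluent@company.com", url="https://new.example/george")
print(log.verify_chain())
print(log.resolve("jeff@example.com"))
```

Someone who only knows the old handle still ends up at the new location, which is the "take your posts and subscribers and go somewhere else" part. Whether this needs an actual blockchain or just a replicated log is a separate question.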
And all Google did was build a search engine.
Were it really a site for helping developers to improve their skills and increase their productivity through the give-and-take model that SO was, at least once upon a time, SO should perhaps take a deep breath and realise that this might not change a thing apart from causing their contributors to feel like they were never part of it in the first place.
I'm not sure if I've correctly articulated that, but I do find SO's stance to be quite revealing. It feels to me like they're crying foul that ChatGPT and however many other systems out there are stealing their revenue. None of the contributors (apart from the employee ones, I suppose) ever got paid any currency other than high-fives in the form of rep, medals, the gamified stuff, moderation rights, and at certain rep levels some swag in the form of t-shirts and the usual.
I never wanted any money from SO, but the revelation of this attitude has left me feeling, well, a little sad to say the least.
The economies of the internet are changing. Now with LLMs being accessible at an exponentially cheaper rate, we're seeing old models crumble and new models rising.
The era of moderated user content is changing drastically and the stalwarts of social networking, or adjacent, services are closing ranks to try and anticipate the change.
Thanks for the insight. I had a vague notion that these new policies were because of some recession or some other basic economic issue. I think a better theory is that the falling economic cost of the LLMs that are becoming available is the reason for all these changes.
History repeats itself albeit in a fascinating way which I'm still trying to grasp.
I don't blame SO. I think they are acting rationally and as anyone would facing such a threat.
Secretly, I want the internet ad economy of nothing to go away. I won't mention any names (cough taboola cough cough) but that might be the only upside to this tech. Let's see what happens six months or so from now.
Side note, I'm running llama on a really crappy old server and that was enough to convince me that I'll be able to run an LLM on my watch in the near future.
Yep, but that's not the point.
> SO didn't opt into being training data for ChatGPT, and I doubt they would have given the chance.
Neither did Wikipedia (at least to my knowledge). I thought the point of opening up information was to benefit the public, first and foremost, and without hidden terms which state something along the lines of "it's free and open information built by the community, but when something disrupts our ads-driven business model, we make it unfree".
It would have been nice if they had at least allowed their contributors to vote on this, or have some sort of a say.
I would love to see some kind of identity and reputation system where the "high-fives in the form of rep" could follow people across communities. It may not feel like much compensation if you've contributed over 2500 answers, but having reputation gained in your area of expertise grant you a high level of trust to interact in other communities could be valuable, at least in my opinion.
Assuming they're making this move to protect against AI / LLMs, I think SO is in an impossible situation here. When all the ChatGPT hype started, one of my first questions was "what happens to the incentive for contributors and creators?" Why would I want to contribute on a platform if I know an AI model is going to come in, take my contribution, and regurgitate it back to the masses in a way that I can't control?
Even if I get some attribution from the AI/LLM, do I even want it? If the LLM is blending content from multiple sources, which changes the context and presentation I put effort into, is the quality going to be high enough to match what I strive to achieve for myself when I'm trying to build a reputation as a high quality contributor? What if the AI is hallucinating objectively poor quality content and giving me partial attribution?
So, for me, part of the social contract with SO is that I provide answers, but I get to control the entire interaction; the context, the presentation (mark up), defending criticism in the comments, etc.. In addition to that, since the entire conversation happens inline, I can be corrected by someone even more knowledgeable than me and use that feedback for self improvement.
I think AI is going to be disruptive and the whole idea, for me anyway, behind disruption is that you break an existing system and then everyone is free to take a shot at claiming part of the new gold rush that occurs while trying to build the replacement. The problem with AI is that it's going to break a lot of services that do a good job of serving the community and shouldn't be broken. SO is a great example of a healthy community that doesn't need disruption, but the massive amount of high quality, curated content is going to make them a prime target for LLM training.
Personally I think the only solution is for "noai" variants of popular open source licenses so contributors have the ability to make it clear they don't want to contribute to AI/LLM companies. If SO had an option to flag contributions as CC-BY-SA-NOAI, I'd enable it on my stuff going forward.
Honestly I think that's an excellent idea - a rep "passport" of sorts which gains you a certain level of trust within certain communities.
> Assuming they're making this move to protect against AI / LLMs, I think SO is in an impossible situation here. When all the ChatGPT hype started, one of my first questions was "what happens to the incentive for contributors and creators?" Why would I want to contribute on a platform if I know an AI model is going to come in, take my contribution, and regurgitate it back to the masses in a way that I can't control?
Sadly, I think this is an unpreventable outcome of what is happening right now. I don't think anyone will have any control over this, at all. We can only hope it will never be the case that being active (actual human contributors) becomes a worthless pursuit.
> Even if I get some attribution from the AI/LLM, do I even want it? If the LLM is blending content from multiple sources, which changes the context and presentation I put effort into, is the quality going to be high enough to match what I strive to achieve for myself when I'm trying to build a reputation as a high quality contributor? What if the AI is hallucinating objectively poor quality content and giving me partial attribution?
Another excellent point, the prospect of this being possible today - AI attribution from a hallucinated version of a human's objective contribution sounds freaking terrifying to me. Not a world I want to live in, to be honest.
> I think AI is going to be disruptive and the whole idea, for me anyway, behind disruption is that you break an existing system and then everyone is free to take a shot at claiming part of the new gold rush that occurs while trying to build the replacement. The problem with AI is that it's going to break a lot of services that do a good job of serving the community and shouldn't be broken. SO is a great example of a healthy community that doesn't need disruption, but the massive amount of high quality, curated content is going to make them a prime target for LLM training.
As will every single human-created/curated content-source, IMHO. I think that "quality" will be really, really hard to objectively measure in the near future as the whole world of digital information becomes tainted with applied statistical models which can do a reasonably good job of predicting what people perceive to be high-quality reasoning, answers, content. I like the idea of underground speakeasies where there's no wifi, just humans.
> Personally I think the only solution is for "noai" variants of popular open source licenses so contributors have the ability to make it clear they don't want to contribute to AI/LLM companies. If SO had an option to flag contributions as CC-BY-SA-NOAI, I'd enable it on my stuff going forward.
That would be great, but I'm pretty sure that no LLM corporation would care about those flags, even with strict regulations in place from governments.
I hope the data that has been found so far is going to be big enough going forward, but it's incredibly unfortunate that this is happening.
I hope all the people making these decisions wake up with a bad headache and severe heartburn tomorrow.
Suppose that deep-pocketed AI companies were paying Reddit, Stack Overflow, etc. to make it harder for other AI upstarts to access those data. I.e., to build a mote by denying competitors access to previously accessible data sets.
Would that violate antitrust laws in various major markets?
mote: a small particle, speck, atom, "mote of dust"
moat: a deep ditch, often filled with water, as a first line of defence around a castle.
What we need is a legal way for companies to keep the data open, but also require OpenAI and friends to pay them for it.
Are there any AGPL-like licenses that address this?
That trick was to simply scroll past the paywall. They had all the answers exposed so that google would index them. It was hilarious and silly.
Yes, it sucks that the SE sites are getting more draconian about allowing access to their content but the SE sites are well insulated against it completely disappearing precisely because they're under a libre/free license. Note that neither Reddit [2] nor HN, I might add [3], has any such licensing terms that allow for commercial reuse.
Decentralization might be a viable option in the future, but for right now, centralized sites are the norm and the way to protect against the content from disappearing is to put it under libre/free licensing. Note that Wikipedia is centralized and it would certainly be a tragedy if they became more draconian about sharing their data but the content itself is and will be available to the general public, effectively the "commons", because of the licensing terms.
To me, this is yet another reminder of why we need to future proof with libre/free/open licensing terms. Or reform copyright, but I don't see that happening within my lifetime.
[0] https://stackoverflow.com/legal/terms-of-service/public#lice...
[1] https://creativecommons.org/licenses/by-sa/4.0/
[2] https://www.redditinc.com/policies/developer-terms#text-cont...
"Stack Exchange doesn't have the right to unilaterally change the license of previously submitted content." - https://meta.stackexchange.com/questions/333089/stack-exchan...
https://meta.stackexchange.com/questions/344491/an-update-on...
Obviously not legal precedent, but there is some discussion on the matter by the Creative Commons organization [0].
[0] https://creativecommons.org/faq/#what-happens-if-the-author-...
Praise the mega mind.
Maybe we won't even have to wait for LLMs to destroy the web we used to know.
I would be willing to bet that the driving force behind the decision was to make it less trivial for LLMs to say "the data was already there under an open license, so we legally undercut stack overflow".
But let's be real about the morality here: Stack Overflow is a badge-powered mechanical Turk. It uses 100% unpaid labor to go and search Google for answers and post them on SO, providing a "service"[1]. For it to moralize about the ownership or sanctity of data is irony.
[1] - There are exceptions, obviously. There are true experts who wander the virtual halls of StackOverflow and dole out wisdom. But overwhelmingly it is clear that answers primarily come from people who rush to Google and then copy/paste from blogs and tech papers. And while Stack Overflow dumps are CC because that's the agreement that it made with contributors, a lot of the content on the site was ripped without attribution and in defiance of IP. So...maybe not too many tears for SO.
> I was recently impacted by the Company's layoff.
> I'm offering what I can to uphold the Company's values of Transparency & being Community-centric.
I wouldn't offer transparency about a former employer's internal operations. Let them respond or at least ping a current employee to respond.
Or, it's exactly the best time to do it. Doing it now allows your news to get blended in with the Reddit news. Doing it later after Reddit chatter settles down means all of the chatter is directed squarely at you.
A mod strike? I hadn't heard about this.
https://meta.stackexchange.com/questions/389811/moderation-s...
The models (AWD-LSTM and GPT-2) weren't good enough back then to usefully answer programming questions -- but it's super cool to see that vision realized with GPT-4 and other modern LLMs.
Today's data dumps/APIs make it easier to train the ML/AI models that put these sites on the path to irrelevance. They're pulling out all the stops like there's no tomorrow, and there might not be, if they're willing to shake things up like this.
If the data dump is gone, that compact is broken and honestly it's time to stop contributing to SO.
It was always a broken system built on dodgy contracts, but it is still sad to see how unceremoniously everything implodes.
Will any lessons be learned? Unlikely.
With one exception, there are no instances of anything crowdsourced/community-supported that aren't later paywalled, gatekept or destroyed to prevent exfiltration. It's always an advance-fee scheme. The longer the duration of time, the more the terms are corrupted until the people expecting delivery on the original promise end up being told "what promise?" (The exception is piracy sites. Ironically the illegal nature of the activity seems to keep the owners honest.)
Never work for free, for any promise of long-term future payout, "exposure," or any other bullshit. When they fuck you over--and they will, because you made it so easy--you'll be too broke (and broken) to sue. Every inch, every day you give them is just more time for them to find ways to cheat you.
(You'll learn this lesson the hardest way in making concessions to a high-conflict ex-spouse armed with a 50/50 child custody agreement...they get you to agree to let the kid stay with them during your scheduled time, more and more, until they can prove the kid is basically with them 100% of the time-- then you get slapped with a vastly-increased child support order. You can't claw anything back because they have commitments now. Thus, you get cheated out of both your relationship and your money.)
For example, my hobby search engine got started because I found out about these dumps and decided it would be an interesting challenge to try to work with them[1]. If I’d needed to build a scraper first the project would never have gotten off the ground.
[1]: https://search.feep.dev/blog/post/2021-09-04-stackexchange
We have companies like Reddit and Stack Overflow not being profitable, despite being wildly successful in usage and internet mind-share. Neither of these companies is particularly over-staffed.
We post our "valuable" contributions there. So valuable that nobody wants to pay for it (structurally). We block ads. AI does the daylight robbery. We expect free APIs and data dumps.
Perhaps this is our wake-up call. The limitations of the "free" model and companies running at a loss for 15 years straight. It was always an anomaly.
IF (and it's definitely an "IF") this is an intentional and permanent change by SE management, they are fundamentally changing the basic understanding between users and SE, and they have to understand that some subset of users are likely to quit using SE in response. Again, it's hard to say how many. Maybe enough to have a material impact, or maybe not. That would be the gamble they'd be taking though.
Given the way they've communicated in the last 2 weeks or so, this seems pretty clear. Before we had employees engaging as real human beings all over the place, and you were talking to Jon, Tim, Robert, Shog, etc. and not "Mr. Ericson, title such-and-such, representing Stack Exchange Inc."
Now all we have is a bunch of announcements, with no discussion, engagement, or even a recognition that anything is even being read. It feels like pissing in the wind. Disagreement is one thing, reasonable people can disagree, but being ignored is so much worse; it's like you're not even taken seriously.
Stack Exchange has gone through various phases (e.g. the "Jeff era" was different from the "stagnation era" that followed after he left), but the implied social contract was always that the community would offer their spare time and in return they would get a platform and some voice in how that platform is run. There have certainly been moments of friction in this relationship, but the basics of it never changed until now (not even with the whole debacle surrounding the firing of a moderator a few years back).
I certainly didn’t sweat it out helping people on SO to pay for Sam Altman’s fucking swimming pool.