This was an interesting dilemma because it was very clear that the money was way less than the loss in ad revenue due to the traffic drop, but it was also clear that if we didn’t take the deal, a more desperate competitor would, which would result in the same traffic loss but without the extra Google money. So the company took the deal.
History repeats itself here, with the difference that instead of paying for the data, the AI crawlers simply take it for free.
This discussion was broached originally when discussing whether or not search engines and aggregators had any compensation obligation in respect of news articles. This was a hot topic in the IP and policy circles for a few years.
When the Canadian government attempted to create a mechanism to compensate content creators for the scraped content, there was widespread outrage from tech circles, despite the same community agreeing, across extensive policy discussions, that action had to be taken to prevent this universal man-in-the-middle value capture by search engines.
I've had fairly extensive discussions with the individuals involved in the academic, policy and internal industry analysis of the issue. Watching industry agree to address the issue, then spend aggressively to shape the public narrative, was eye-opening.
The recent shift into "AI is obviously going to hoover up all your data and there's nothing you can do now that the theft is laundered through an LLM" is just the latest example of the same trend of short sighted capital-over-everything decision making we've become used to in jurisdictions that have dysfunctional legislatures.
You’re free to block them, but the websites cloning your content won’t. So either way they’ll get the content they’re after.
Worse, when/if the time comes that LLMs source their claims they’ll refer you to the websites that cloned your content.
Let the big guys fight it out.
There is money in protecting the websites. If you host with OVH they are interested in you making money so you can pay them.
AI summarization has already caused issues for sites like rtings, where people are no longer visiting the site but are still making use of the data presented there. That leaves rtings without enough traffic to continue posting its data.
It is an existential crisis for websites and when they go away it'll be an existential crisis for AI.
I may be strange and unusual, but I just have never cared about my Google ranking. I know this makes me out of the ordinary among site owners but I have been humming along fine.
This certainly will disrupt traffic but for some of my sites I honestly think this is a good thing. I want you to want to be there, not just stumble upon my site because you happen to hit the right search keyword. Plus if it gets bad, this does create a new opportunity for others with cross linking and search.
Step 2, Google extinguishes the web and nobody has a reason to publish content, consumers lament but are trapped, Google has created a platform to serve content instead of links
Step 3 (or maybe 2a), Google is now monetizing their content machine
Step 4, Google offers people a way to contribute to the content machine, make some $$ per N views, whatever. People create content within the ecosystem
Step 5, Google is now the internet, more content is created overall, quality is lower overall perhaps, algorithmic echo chambers flourish even more than today, old heads on HN lament, everyone else just goes on living
And here I thought denying ad revenue to websites was the morally superior way to navigate the web...
Isn't Stack Exchange the emblematic case?
What about the stories of marketing managers who learned months after the fact that their credit card had expired and their Google ad spend had ceased with no effect on traffic? Google isn't always an effective promotional vehicle.
this kills the entire internet vibe of the 90s, early 2k
FTFY: "couple of decades since has become". The vibes of the passion-driven 1990s started to be overwhelmed by the din of money right when the Internet became a major commerce venue, some time in the early 2000s.
If your site is about your product, Google won't be able to serve the sign-up page from AI; the traffic would come your way. Same for a site that sells something: the traffic you're interested in would arrive at your checkout page.
Paid-content sites and ad-supported sites are screwed though, on top of their being screwed by archive.is and ad blockers.
That is not what "free mirror" is. Like, that is not the same thing at all.
(Torment Nexus rules apply here)
(It doesn't work for ad-funded writing, but while I have substantial sympathy there this has historically been an unpopular argument on HN)
This also could have been fine, it can bring back authenticity however for this to happen no one should be making money from it. Instead, only megacorps make money and they can just ignore your ideas and generate theirs. They control the distribution and the supply now.
It's the news media that will suffer the most.
Websites may go back to being simply labors of love.
The situation may be even worse. Back in the labor of love era, at least webmasters could get feedback from readers. In the LLM era, readers may not even know that the site exists. Without feedback/community, the overall quality of those sites will decrease over time.
ChatGPT/Claude does this today. I barely click or care for the source when they already gave me the info I wanted.
My speculation is all information worth anything is going to be behind some kind of wall.
Similarly, if Gemini uses a website for an answer, it should pay something to those sites for the information it gathered. Sites would need to sign up to earn via Google, and I'd imagine there would be a certain threshold to cross to make it worth cutting checks... but that would make all these AI search tools feel much less scummy while providing site owners an incentive to keep sharing information on the internet.
Where a model like this would get messy is with sites like reddit. It's a very popular source for AI search, but the value comes from the users, not the platform itself.
If I'm gonna lose my job, at least give me that in return
If instead the purpose of your website is to manipulate users for financial gain (for instance by showing media attempting to manipulate their purchasing decisions, after receiving a bribe from a vendor), and the information is just a way to lure users, then maybe this malicious business model will finally be no longer possible.
As far as I know, you don't have a choice. They have no obligation to respect your wishes, and LLMs are legally allowed to scrape & republish your content.
I have no obligation to not send all scraper-looking traffic to a black hole full of zip bombs.
The counter argument is that sites are becoming more AI slop or may intentionally provide poison they don’t want to train on. There may be a cut off date after which training must be carefully curated; and the main body of data has already been collected.
Sites may still get traffic from agents searching for current information. Maybe even the resurgence of RSS? One can dream.
My income isn’t ads, just getting a cut of the sale on the complex products I help you buy. Even that sort of curation takes time and effort.
Even for all the things I do for free without any revenue whatsoever - most of it, really - I do want to feel some recognition. I don’t want the interaction to be mediated by an advertising company.
In that case, the consequence will be that people will stop having webs. It is already happening with personal and niche sites.
Google has always crawled your site and been an arse! Now you get to decide whether they are hallucinating!
You can drop pointers on Masto and other socials to your sites - that has not changed.
Do we need something else? ie you drop a link to somewhere else.
Mention
Site traffic
Mechanisms might exist to make you think you have one, the same way copyright should prevent millions of books being gobbled up by TheZuck, but ultimately do you really have a choice?
Rules and laws don't exist for you.
Making the information available that you put up your site for?
I spent 9 years of my life putting hard-earned information on the internet, and now big tech uses it to enrich themselves while putting me out of work. Even my backup plan - software development - is being devalued to hell. It's so damn depressing. We'll get the internet that we deserve.
I think if you look through this thread you’ll see a lot of skepticism of the AI results, and I think that is a fairly broadly held opinion. The obvious way to check the AI answer is to click through to some sources.
I think for Google to stop sending me traffic, it would have to be essentially perfect at AI answers. It will never get there, especially as so many searches are opinion-based like “what is the best mobile phone right now.”
They keep innovating even if it means cannibalizing their main revenue stream. Which increases the chances that you will not still be stuck producing film emulsions while everyone else is slowly making bank doing digital.
Websites will die on the vine if LLMs intermediate all the content.
The "website" of the future will be an API optimized for LLM crawlers, serving plain-text content that no end-user will ever view directly. The SEO game will change to LLMAO.
[1]: https://alternativeto.net/software/google-search/?license=fr...
[2]: https://alternativeto.net/software/google-search/?license=co...
https://www.epceurope.eu/post/epc-welcomes-landmark-cjeu-rul...
The current zeitgeist of them will, but I think not all.
My first website (GeoCities) was either before Google existed or very close to it. Connected to people via WebRings and directory listings. More recently, RSS feeds.
We had internet before we had browsers, then the browser took over as the main method of consuming the internet. It has a lot of problems and e.g. mobile apps are trying to fill the void, but they have their own problems. Next stage is the personal assistant agent, which will be the single entry point to the internet.
1) Sites will have MCP / APIs for LLMs, so that when I ask my AI agent du jour, it can call any of the sites where I have subscriptions for information.
2) Sites that are passion projects will be harvested by our LLM overlords.
3) Sites that people don't type into their web browser and need ad revenue will die.
4) SEO will finally die.
I recently searched for some generic home appliance term and Google's AI Overview blurb ended with "For more information about repairing home devices check samsung.com" (non clickable)
I am sure SEO companies will claim they can make that happen on purpose, and people will pay them for that.
On the contrary, it will flourish. It’s just that it’ll shift to whatever can trick LLMs into recommending your product.
https://www.anthropic.com/research/small-samples-poison
https://www.bbc.com/future/article/20260218-i-hacked-chatgpt...
This will happen especially with things like conspiracy theories because the choice might be to pollute the output or share the general consensus. For example, searches for Apollo landing conspiracy theories can either choose to present “alternate facts” so that people can “do their own research” and conclude it is fake, or the LLM auto-corrects to “Apollo landing happened”.
Newsletters have been around forever and never taken off like the open web and free blogging have. Slapping a Stripe integration on the backend hasn't led to Substack becoming a sustainable business not propped up by VC cash.
* A large fraction of people are realizing that some search engines are soft-censoring already;
* Another fraction of people will not accept AI agent slop as a replacement for website search;
* Another fraction of users will get annoyed/tired with not getting directed anywhere;
* Another fraction of users does not rely on AI-focused/AI-exclusive search engines.
Between the lot of those, the non-Google-covered Internet has lived on, and will live on. Yes, with _less_ traffic - not _no_ traffic.
But I do agree it will be or already is a paradigm shift. And a painful one.
From that article:
> There’s a theory I’ve had for a long time that I’ve been calling “Google Zero” [...] Regular Decoder listeners have heard me talk a lot about Google Zero in the last year or two.
But the point here is not that he predicted it first, it's that he coined a term for it and has been extensively covering it.
The difference is where once they scraped, wrote a summary and invited users to go to the site, now they just provide the summary.
> Bachelor of Arts degree in political science from the University of Chicago.
His hot takes are best ignored; it's just convenient click bait for their entire negativity angle.
In the future I don't even use Google but my bot does
While there are times where I want pure search (Kagi, Old Google) I mostly use LLMs to search now and have them provide me links for source data.
When I do use LLMs as a search engine I always want it integrated into my AI workflows with access to tools and scripts etc. I never want to have a conversation with a website that is geared towards advertising me products.
I was once very good at advanced Google search queries but they seem to no longer respect such queries - either showing irrelevant results or none at all (that should exist).
I don’t love LLMs, but they seem to not make up stuff very often these days and usually cite links to what they summarized. Sometimes the tone of the summary is slightly wrong “algorithm X was designed for Y” (when I know it wasn’t) but it’s otherwise very close to the mark.
What does amaze me, is the LLM seems to “understand” my question with very little context — I would have to give a human many more details about goals/intent.
I know damn LLMs are not capable of thought and are just a glorified search engine, but they do it well. Perhaps all my education made me little more…
I used to mock Sci-Fi movies where characters lazily dictated questions to the computer and it gave high quality answers.
We’re living in that world now.
Ah, but they are! Kagi is light years better than Google, and is a worthy replacement. You do have to pay for it, but I get my money's worth.
Though I will say I get much better results from the LLMs I pay for than the free ones with Google or DuckDuckGo, which seem to be way way way more prone to just make crap up based on your search and cite web pages that, when followed, don't have the claim being made in the AI search results at all. By contrast every "source" link I've followed in the for-money AIs has 100% backed what the AI said it backed. Don't judge by the free AIs the search engines put out, those things are probably starved of resources and are nearly useless.
(Which I did not intend as a commentary on Google's plans here, but it is a data point of interest... that pressure to cut costs on the "free" services is quite directly at odds with providing quality AI services for the foreseeable future.)
And I’ve tried Google’s once or twice and seen it used once or twice, and used ChatGPT exactly once, last week, and I was not at all impressed by any of them. Their output, for what I’ve personally seen, has been nonsense, obvious, or unverifiable.
Same here. The free version probably gets orders of magnitude less of a compute budget, though, so I am not really surprised.
What I find really surprising though is how many people still have only ever used the free version of any LLM, even those that are heavy users and could easily afford it. It seems like a pretty big and basic product marketing mistake to me to limit capabilities instead of usage time in the free version! How are people supposed to learn what they'd get if they were to pay?
An increasing number of studies are indicating a reliance on "AI" leads to deleterious cognitive effects. I felt this acutely myself.
I've noticed a significant boost to my recall since shunning "AI" as much as possible.
You can't do something like this with search.
I've been trying to use LLMs for things and they make mistakes all the time. Just this week I had multiple instances of various LLMs basically saying "just run the software with --flag-that-fixes-your-problem" or "edit the config and add solve-your-issue=true", hallucinating nonexistent options. Even if I manually link the relevant documentation pages, they will still just make basic mistakes. And if I'm having to read the documentation myself anyway to fix the AI's mistakes, why is the AI even in the loop?
It's infecting search too, because blogspam/slop articles are managing to make their way into search results by just making up untrue information, claiming software can do things it can't, or has options that don't exist.
It's baffling that people have become so devoted to them as a source of information given how inaccurate they are. I've learned not to trust anything they say, ever, especially when it comes to technical subjects.
Using Google search will return roughly infinite recipe sites. The sites were generated to spam AI-generated recipes surrounded by advertisements. None of them are really any good because they were generated by a script and not looked at by a human until I come along and click. The standard is for all recipes to have at least 10-15 screenfuls of vertical spam wrapped by ads for recipe pages. The internet, at least using Search, is now useless for food recipes. I would have better, faster luck driving to the public library and looking in a physical cookbook; at least those recipes were probably tested at least once by humans, unlike the advertising spam sites. Nobody has 45 minutes to watch 44 minutes of filler material surrounded by ads on YouTube either. If you want to cook food, the internet is near dead at this time, unfortunately.
AI search will plagiarize the "Original Nestle Toll House" recipe from the back of every bag of chocolate chips ever made. It's a good recipe and I've baked them many times over the decades.
I wish the internet were more useful, but the people in charge of it don't want it to be useful; here have some ragebait and doomscroll while watching the ads.
This is a wild take, as someone that cooks a lot, and largely from the internet (though I do own a lot of cookbooks)
The reality is that just googling for recipes was never good to begin with. People have been complaining about SEO spam and ads on recipe sites forever, but those recipes were always trash even before they got to the absurd state they're in now. Serious eats, bon appetit, food 52, smitten kitchen, chefsteps, all have great recipes. Some of these have paywalls, although you can get around them. Serious eats though is totally paywall free and has a pretty wide range of recipes. There are other sites for more niche cuisines.
You'll still have ads, and you'll still have a wall of text before the recipe. But the ads are slightly less obtrusive, and the wall of text on the quality sites is why those SEO techniques exist in the first place: a recipe that is just "list of ingredients + instructions" and doesn't include any context is ultimately a crapshoot. The thinking that goes into a recipe shows that you're not going to be wasting your time because it's been tested and optimized.
I don't comprehend how the average person gets any useful information out of Google.
Currently, search engines are pretty bad at the second one because people try to use them as the first one
In other words, I have no use for an LLM summarizer; I want an LLM librarian, working with me to say "beep-boop, here are some resources that seem relevant to your query, feel free to resume this session later if you'd like to further refine your search".
Yet.
Is that useful enough to build a billion dollar advertising business around? My feeling now is not really.
Even for straight up searches, I find using an LLM to do a search and comb through the results is a better experience than Google is now for searching. If I'm specifically looking for esoteric web sites from 27 years ago on vintage computer hardware and software (thank god for Archive.org), Google is just ok for that.
> Can you find the girl who did a bunch of posts critisizing David Graeber's Debt? I thought it was really well done
> I saw a comment on hackernews a while ago about the optimal amount of credit card fraud being higher than zero because of game theory dynamics, can you find it.
In both cases it turned up the exact posts I was looking for in like 30 seconds which would have taken me much longer using traditional search. I've had similar success looking for technical documentation. It's downright magical how they're able to turn my vague idea of what I'm looking for into a pointer to the exact thing.
Surely we all understand that any commercial model is going to inevitably metastasize into this.
yeah man good thing LLMs are structurally incapable of being incentivized to sell you a product or render referral links, this is surely future-proof
Yeah, they probably aren't doing (most of) these now, but it doesn't take much mental energy to extrapolate once you factor nearly every other tech company's ethical trajectory and the current geopolitical environment. Substituting classic search entirely with LLMs is not a savvy move.
As soon as one gets annoying, expensive, advertiser heavy etc. you just rip it out and replace it with the other one. AFAICT there is zero lock-in or moat. I often am able to switch models in one click or command. This is why all the LLM providers are desperate for a product layer/comprehensive tool set.
Sure maybe they all end up that way, but there’s plenty of reasons corporate customers will want private LLM usage that is not skewed towards advertising. I am happy to pay for that.
Also, open source models are a bulwark against another search style ad Monopoly.
The question though: Why is that?
Is your Google search usage down because LLMs are "so much better"? Or because Google actively chose to destroy the quality of their search results to juice advertising revenue, and appears to continue to do so to juice AI adoption?
> and have them provide me links for source data.
And therein lies the answer: You don't care about the LLM, you're just using the LLM as a means to get the good links.
This is a common flow for me and works with other skills such as “find recent PRs in our code base that are related to this research topic”.
Also yes, I don't care about the LLM and I am just using it to get what I want because that is what LLMs are for.
ps: I'm not pro centralized corp. owning data and ai. But so far they are the cheap highway to answers
I've barely used Google for over 2 years.
I've barely driven myself in a year.
I haven't written code in 6 months.
It makes me wonder why like 90% of the apps on my phone exist. I just want everything to be markdown files, skills, MCPs/API and then a nice TUI or voice to text.
Since this is how Google makes all their money, why are they killing it off? Do they think people will eventually pay for LLM search? Do they plan to stuff the results with ads, not even sharing the ad revenue with the content sources?
1. LLM Model providers are starting to charge real costs to users, revealing that AI usage is much more expensive than the subsidized rates we've been seeing for years.
2. Google is now using an LLM to answer every single google search that happens, for which Google bears the entire cost.
But I still want to also be able to do my normal, old school searching.
The advertisements fed the content, which fed the AI, which in turn feeds your AI workflows. AI is still not trusted unless its output is grounded with sources.
My experience with AI searches is that they'll still be wrong a lot of times, but it will condense/flatten the content generating trash sites and give me alternatives from these deeper results. What I'm looking for is usually in there.
I already saw an article recently about how to set up a business domain which can reliably show up in a search result and dump overly positive reviews into anyone's context.
Even though the result is often good and combines information from multiple sources, it can also get things wrong by combining information from different eras or just plain outdated advice. AFAICT, without primary sources, the result is for entertainment purposes only.
And therein lies the rub; for years now Google's search results have returned useless SEO garbage. For now, it definitely seems like an LLM answer is better than what was being returned and I guess this is the reason why Google ripped it out.
Google Search has been terrible for a long time. But you could still dig through it and find those primary sources. That is, in my opinion, the primary purpose of a search engine. Replacing it with what an LLM has invented based on ingesting both reliable and unreliable sources is not viable for a large category of things. The main way we can judge the reliability of something is to look at where it comes from. If I'm looking for, say, official US job market statistics, whether I trust the numbers I find depends on whether I find them published on a US government website or on a random person's blog. A number presented to me by a chat bot would not let me judge, so it's useless.
The best a language model could possibly do, by definition, is to find websites and link them to me, letting me judge their credibility. But then it's just a worse search engine.
Say I want to look up some game from my childhood, which I barely remember any details for. Going to google and trying is likely going to be very difficult unless I happen to get lucky with some key element. But if an LLM can get it right even a minority of the time, it can lead to me quickly finding the game I'm looking for.
This does depend upon the ability to evaluate the answer, like checking against source or some other option where you know a good answer from bad. If you can't, then it does become much more dangerous. Perhaps part of the reason AI seem to empower experts more than novices in some domains?
I worry that the LLMs are just the equivalent of a ‘lagging indicator’ of web quality though - that they will also soon be overwhelmed with the sheer volume of plausible nonsense that is the web now, just like search engines are.
Model collapse everywhere.
The other bots either make up links or simply don't provide any information that is distinguishable from the LLM predictive output.
Ironically Gemini is also very bad at this, while it should have been the best at Web search.
Gemini also does something very patchy, which is to provide "links" which are in fact GET queries into classic Google search. I'm guessing they did it this way because the links generated/hallucinated by the LLM were too unreliable.
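To make the distinction concrete, here is a minimal sketch of what such a "link" amounts to: a plain GET URL into classic search instead of a pointer to a source page. The function names `search_link` and `is_search_link` are hypothetical, and the `https://www.google.com/search?q=` URL shape is the commonly seen public one, not something confirmed about how Gemini builds its links.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: the widely used public search URL shape.
SEARCH_BASE = "https://www.google.com/search"


def search_link(query: str) -> str:
    """Build a GET URL that re-runs `query` in classic search."""
    return SEARCH_BASE + "?" + urlencode({"q": query})


def is_search_link(url: str) -> bool:
    """Return True if `url` is a search query rather than a source page."""
    parts = urlparse(url)
    return (
        parts.netloc in ("google.com", "www.google.com")
        and parts.path == "/search"
        and "q" in parse_qs(parts.query)
    )


link = search_link("apollo 11 transcript")
print(link, is_search_link(link))
```

A check like `is_search_link` is one way to tell, when skimming an answer's citations, which ones point at an actual page and which ones just hand the search back to you.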
I know that deepseek has links for every chain it makes where you can read the source and it's actually a good thing to check on that.
LLMs that can supply valid links give me a completely different variety of results. Either I am too dumb to search manually, too impatient, or Google search is just broken, but Gemini usually gives me something I can work with. I just wish I could blacklist some sources like medium.
I've been paying for Kagi for like four years. I like it but also resent that it's something I pay for now when I remember how good Google was 20 years ago.
This will remove any results from there for you.
Alternatively, site:news.ycombinator.com would search this website explicitly.
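For anyone who wants to script this, a minimal sketch of composing such queries follows. The `site:` operator is the one mentioned above; the `-site:` exclusion form is my assumption about what the parent comment was referring to (it is a commonly supported operator, but check your search engine). The helper name `build_query` is hypothetical.

```python
def build_query(terms, include_site=None, exclude_sites=()):
    """Compose a search query string with site restrictions.

    terms:         the free-text part of the query
    include_site:  restrict results to this domain (site:domain)
    exclude_sites: domains to drop from results (-site:domain)
    """
    parts = [terms]
    if include_site:
        parts.append(f"site:{include_site}")
    # One -site: token per excluded domain.
    parts.extend(f"-site:{domain}" for domain in exclude_sites)
    return " ".join(parts)


# Drop a source you do not want to see.
print(build_query("rust async runtime", exclude_sites=["medium.com"]))
# Search only this website.
print(build_query("google zero", include_site="news.ycombinator.com"))
```

The resulting string can be pasted into the search box as-is.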
For most things I research, there are only secondary sources, reporting on an event, a trend…
Have you tried explicitly asking for links to primary sources?
From the past hundreds of Google searches I've done where I got an AI summary, I'd say the result is actually rarely good. At the very least 80% of the outputs contain critical mistakes, often exactly about the specific thing you're asking.
I have seen it hallucinate things confidently but that is usually when it has no direct sources to pin down the output.
Even though the result is often coherent and confidently synthesizes information from multiple experiences, it can also hallucinate, suffer from recency bias, or accidentally merge memories from different decades. AFAICT, without access to the underlying telemetry, human responses are for entertainment purposes only.
The real problem here is assuming this takes off what incentives will anyone have to provide the information to feed the beast?
An llm rephrasing / regurgitating other websites is imo different, because you lose the direct connection to the original source. Even if llms give sources, they also directly give you a plausible (but unreliable) answer to your question. They are right often enough that you get lulled into the false sense of security of not needing to read the original sites. I'd much prefer them to just give a clean list of sources like early Google, but then why would you need an llm?
It's a pity that probably the main reason you'll need an llm to find anything on the web is to weed out all the llm-generated low quality garbage.
I disagree, Alta Vista had an excellent search UI with Boolean logic. Google discarded that because it thought it 'knew better' in terms of Page Rank.
A-V could be fine-tuned to find a page with exactly the search terms, Google just gave fuzzy approximations from a very large search set.
The end of search traffic will kill all but the largest sites, and prevent countless new ones from being developed or getting traction. Given how global trends are going I expect the remaining sites to be increasingly monitored and censored/biased. I'm not looking forward to a world where social media means talking to some bots tuned specifically to addict you, and don't know too many people who are. Although big tech executives certainly seem to be in the latter group.
Did AltaVista get replaced by the owner of the site to justify a giant investment?
Now, the spam is back and it’s coming from Google itself.
Of course, even Google the search engine has gotten worse at surfacing interesting websites. First came the SEO spam websites, now the slop websites.
I'm glad that alternatives like Kagi exist.
Nearly all other search engines give better results with less annoying ads at the top. First thing I do when installing a new browser is switch the default search engine to duckduckgo. Duckduckgo's results are less good than Google's used to be, but way better than Google's currently are.
I caught myself yesterday starting to ask Claude in my IDE what ship Grace and Rocky took back to Rocky's homeworld.
If their leadership has an itch they'll scratch it until it's raw.
Did Meta patiently wait until exaggerated glass frames were viable in the market? Or did they get lucky?
Or did they have some Machiavellian plot to steer this fashion for years and pave the way for their product..? ;-)
It's very much a Prisoner's Dilemma. Legacy search and the open Internet was an equilibrium that only existed while the majority of people co-operated. Once you allow an individual actor the ability to create large chunks of the Internet, it dies. Your only option is to be that individual actor.
It doesn’t really say in the article search is going away.
A lot of Google search is in the format of “company X”, then clicking the third link down (after two paid ads) to open company X’s website. (I have no idea how much this is, but it’s gotta be a lot)
That’s like free money. It doesn’t look like they’re getting rid of search, but expanding the AI/conversational features.
According to Kagi I search 11-50 times a day, about 600 searches per month. I do about 10-20 AI/assistant conversations per week, so maybe 2-3 a day, and usually when search fails or I can’t get the right query words in. I do this over my AI apps because the Kagi index is faster/better.
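A quick back-of-the-envelope check of those numbers, as a sketch only: the 30-day month and 7-day week are my simplifying assumptions, and the function name `daily_rates` is hypothetical.

```python
DAYS_PER_MONTH = 30  # assumption: average month length
DAYS_PER_WEEK = 7


def daily_rates(searches_per_month, ai_per_week_low, ai_per_week_high):
    """Convert monthly search and weekly AI-conversation counts to per-day rates."""
    searches_per_day = searches_per_month / DAYS_PER_MONTH
    ai_low = ai_per_week_low / DAYS_PER_WEEK
    ai_high = ai_per_week_high / DAYS_PER_WEEK
    return searches_per_day, ai_low, ai_high


searches, ai_low, ai_high = daily_rates(600, 10, 20)
print(f"searches/day: {searches:.0f}")                      # 20
print(f"AI conversations/day: {ai_low:.1f}-{ai_high:.1f}")  # 1.4-2.9
print(f"searches per AI conversation: {searches / ai_high:.0f}-{searches / ai_low:.0f}")  # 7-14
```

So plain search still outnumbers assistant conversations by roughly an order of magnitude under these figures.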
I can’t imagine Google would give up the bulk queries that pull in easy ad revenue. But if Google can push/upsell you into a really high value referral where they can start pulling a claim in your purchase, I could see them pushing to get into that.
Is the idea that by making the new AI chat UX the default, that's how they're forcing people into it and making them not able to search? Or is there something I'm missing?
> Instead of returning a simple list of links, Google Search will drop users into AI-powered interactive experiences at times.
So basically you'd get redirected into a chatbot interface rather than being allowed to browse search results as normal. "AI-powered interactive experience" tends to be a euphemism for chatbot UIs, in my experience at least.
Yes, that is what every user ever wanted! A UI that just randomly changes!
Never give the customers what they want; give them what makes you money.
Going all in like this carries a very real risk of burning users onto other platforms, and the continued evolution of integrated search bars is already slicing off significant user segments.
People who wanted to ask a specific question now won't have that option. Instead, they'll simply be shown whatever Google thinks is most relevant to them at that moment. The "Chat" UI we've grown so accustomed to is on its way out.
Ads have been close enough to covering costs for conventional internet search that even though I'm clearly the product and not the customer the relationship has still generally worked. If AI makes the "searching" 50 times more expensive, though, that could shift the relationship pretty badly in a direction of "if you're not paying for this you're not getting honest results". Paying may not be sufficient for honesty but it may be necessary.
Honest question. But anyone who wants to answer this and who looks at Google's income/profit/revenue and is bedazzled by the size, don't forget to divide out by the number of Google's customers and ponder what that means. The per-user numbers are the much more relevant numbers and much less likely to cause Large Number Syndrome.
This is the end. The fact that they had to say that this is "free of charge" means they are thinking about cost. Both to them now and us in the future. This sucks.
Sometimes I get SO questions from 13 years ago with a version of a library nobody uses anymore. If I search in my native language almost every result is a Reddit thread that was originally in English but was machine translated to Portuguese and Google is fine with that for some reason. Searching for images just gets you AI images.
If you need opinions on "what is the best X" you end up getting some content marketing from a website that offers some online service and probably has an .ai or .io domain.
No matter what you search you get an AI overview wasting space and slowly generating an answer that could be completely made up, just wasting your time in two ways at once.
Most long queries are simply completely ignored by Google. Almost every word ignored in order to show some sort of most popular result. You don't even know if there are no pages on the internet with what you searched for or if Google simply doesn't care to show any website that isn't sufficiently popular. In other words, never personal websites or blogs, only platforms and cloud services' content marketing blogs are allowed to appear in the results.
I've found myself several times asking Claude if there is "research" on one subject or another because I don't want to have to wade through the AI overview, sponsored results, SEO spam, reddit, repeated results on the second page, etc. just to find something that resembles actual relevant information.
Hell yes, set your Google language to English and get auto-translated results. It appears that no one in Google's leadership speaks more than one language. It's frustrating how wrong they get this with YouTube as well.
There’s not much room to squeeze in when your competitors hold the keys to 15 million top websites.
I find it wild that "at scale" we can bypass anti-bot measures, but just "normal" internet use (i.e. a non-Google browser or a VPN) will throw a million captchas at you.
CGNAT is pretty bad too.
Why would website authors _want_ to prevent crawling by other search engines?
So if another search engine does arise, it won't find anything useful, because the useful content on the web has been buried under slop, and largely removed. Your best bet today is a curated directory, sorta like the original Yahoo, where you allowlist the web to only real sites, download them, and make them searchable. I think this is actually Kagi's approach. But the open web as we knew and loved it is dead.
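A rough sketch of the allowlist-then-index idea, just to show how little machinery the filtering step needs. Everything here is made up for illustration: the domains, the function names, and the toy pages. It says nothing about how Kagi or anyone else actually does it.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical hand-curated list of "real" sites.
ALLOWLIST = {"blog.example", "docs.example"}


def is_allowed(url, allowlist=ALLOWLIST):
    """True if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowlist)


def build_index(pages, allowlist=ALLOWLIST):
    """Build a word -> set-of-URLs inverted index over allowlisted pages only.

    `pages` maps URL -> already-downloaded page text.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        if not is_allowed(url, allowlist):
            continue  # anything not curated never enters the index
        for word in text.lower().split():
            index[word].add(url)
    return index


pages = {
    "https://blog.example/grep": "notes on grep and ripgrep",
    "https://slop.example/grep": "top ten grep tips you will not believe",
}
index = build_index(pages)
print(sorted(index["grep"]))
```

The hard part is obviously maintaining the allowlist, not the code.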
https://blogs.microsoft.com/blog/2023/02/07/reinventing-sear...
[1]: https://alternativeto.net/software/google-search/?license=co...
Very few of the smaller search engines actually do their own indexing for exactly this reason.
When I use google, usually from my phone, I am reminded of why I don't use google on desktop.
With the announcement of this move by them, I just manually removed google as an address bar search engine option in all my browsers on desktop and mobile.
Human produced content should be separated from sites primarily hosting slop. That seems solvable?
(For example, a random Redditor once said something, and the AI repeats it confidently and authoritatively, as if it is universal truth widely accepted by experts and applicable to the query.)
We'll see if it works. I use ChatGPT for complex queries, and for throwaway ones I just don't log in to it.
I wouldn't use google for the same queries, since I normally use google to find specific things, not for a chatbot.
Have I been A/B tested into something, or has this been live for months? This doesn't seem new.
I'm aware that most people still use it, but it's nothing like the glory days when Google was far ahead of the pack.
Nowadays, if Google Search were to disappear, I would hardly notice. That would have been unthinkable 20 years ago. The alternatives to Google got better, and Google itself got a lot worse.
They still have the numbers but they don't have the product anymore and they're just juicing it until the end.
Time to switch to old style search engines which still return the 10 blue links, with an AI option.
I've been pretty sceptical about Kagi, feeling that it was a bit too expensive and perhaps relied too much on other companies' indexes, and I expected to spend too much time looking at how many searches I had left. After getting the subscription I just don't want to go back; the price is perfectly reasonable for the value. Being able to just search again, without sorting through junk and spam and ads, and just getting the pages I want and need is amazing.
Honestly it's a slightly weird feeling to look at the results from Kagi and notice it found exactly what you were looking for.
Once my gifted credits run out, that is going to be an easy renewal for me. I do not want to go back, even if I think Ecosia is a good option.
Even after the recent AI run-up, disk prices are about $20/TB for a 20TB drive, so you can store this index on 3-5 hard disks that will cost you about $1200-2000. For self-hosted use you don't need to serve results in 50ms, so you don't need to put the whole thing in RAM like Google did; you can serve off of disk.
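To sanity-check that arithmetic, a tiny sketch. The function name is mine; the $20/TB, 20TB and 3-5 disk figures are the ones quoted above.

```python
def index_storage_cost(num_disks, tb_per_disk=20, usd_per_tb=20):
    """Total cost in USD for num_disks drives of the given size and price."""
    return num_disks * tb_per_disk * usd_per_tb


for disks in (3, 5):
    capacity_tb = disks * 20
    print(f"{disks} disks: {capacity_tb} TB for ${index_storage_cost(disks)}")
```

So the $1200-2000 range corresponds to 60-100 TB of raw capacity, before any redundancy.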
ElasticSearch uses basically the same data structures and gives you the same infrastructure that Google's ~late-00s search stack did, and is actually more advanced in some respects (like ad-hoc queries, debuggability, and updateability), so software isn't much of an issue.
The big part missing that can't really be replicated today is the huge web of authentic hyperlinks. The reason Google was so good at search was because many humans effectively "tagged" a given webpage with a series of short, descriptive words and phrases. When they went to search for a page, Google could mine this huge treasure trove of backlinks to identify exactly what the page was good for, even if those search terms never appeared on the page. SEO and link farms kinda killed this, as did the rise of social media walled gardens, and so the Google of 2009 basically wouldn't work today anyway. Maybe if you pulled old versions of Common Crawl or archive.org you could reconstruct it, but the relevant pages are often offline anyway today.
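The anchor-text idea above can be sketched in a few lines. This is a toy illustration under my own assumptions (the link triples, URLs and scoring by count of distinct linking pages are all invented), not a description of Google's actual ranking.

```python
from collections import defaultdict


def build_anchor_index(links):
    """Index target pages by the words other pages used when linking to them.

    `links` is an iterable of (source_url, anchor_text, target_url) triples.
    Returns term -> target_url -> set of source URLs that used that term.
    """
    index = defaultdict(lambda: defaultdict(set))
    for source, anchor, target in links:
        for term in anchor.lower().split():
            index[term][target].add(source)
    return index


def search(index, query):
    """Score each target by how many distinct pages 'tagged' it with each query term."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for target, sources in index.get(term, {}).items():
            scores[target] += len(sources)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical link graph: the target page's own text is never consulted.
links = [
    ("https://a.example/post", "cheap flights", "https://travel.example/"),
    ("https://b.example/blog", "flights comparison", "https://travel.example/"),
    ("https://c.example/", "cheap hosting", "https://host.example/"),
]
index = build_anchor_index(links)
print(search(index, "cheap flights"))
```

Note that travel.example wins for "cheap flights" without those words having to appear on the page itself, which is exactly the property that link farms later exploited.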
At least if we're speaking of a more generalist web search, it requires dedicated hardware, and that's pretty costly. Marginalia's production server cost about $20k back when RAM and SSDs were cheap. It used to run on $5k of PC hardware before, but that was very limiting.
So no data center, but at the same time, not everyone has that sort of cash to throw around.
If Google Search changes, then Kagi's search will be impacted directly.
> This is not a competitive market. It is a monopoly with a distant second place.
> The search index is irreplaceable infrastructure. Building a comparable one from scratch is like building a parallel national railroad. Microsoft spent roughly $100 billion over 20 years on Bing and still holds single-digit share. If Microsoft cannot close the gap, no startup can do it alone.
* Kagi seems to just scrape and provide a mix of other search engines' results, meaning it's really just a metasearch engine.
The AI-generated summaries are slow, often miss the point of the question, and seem to be focused on user engagement rather than on giving me a set of information to sort out myself.
So there are two different types of queries, and when I want an LLM's answer, I ask ChatGPT directly.
They are surely hearing themselves say the same things about how Google is “everything in one place” that every failed corporation parrots on their way out.
They are making the same mistake as Yahoo did. Ironic.
After I got tired of perplexity's nonsense I realized the workspace account (which I have for custom email domain) came with fancy gemini pro chat.
Was a fucking ripoff for the domain thing...but domain plus premium chat clearly marked as "we won't train on your data"...the math starts mathing better again.
I started using Google because the interface was far superior in the time before adblocking existed and after Flash existed.
Search results were better because they did not contain hidden paid results.
Search was measurably improved with the second generation of Wikipedia. Google did an excellent job understanding this and tended to just place the Wikipedia article at the top. Also helpful for Google was that Wikipedia's original search engine was useless, similar for YouTube whenever it came around.
Today, I use Google less than once per month. I'm not sure I've been there at all this year. Maybe at the end of last year I was using it and found nothing better than I found on other search engines.
"Did you mean?" + excluded word was a pretty clear indication they stopped caring to provide any meaningful search whatsoever.
Web 2.0 was Yahoo Pipes, public APIs, IFTTT, etc. while this new "Web 3.0" acknowledges that those capabilities would rather be gatekept behind AI instead of entirely removed.
At the very least we do get some of that functionality back without resorting to scraping anymore and it's now accessible to the layperson. I would think this would nudge the layperson to demand more and inevitably want the actual data without the training wheels or sandboxes. Is that not a "good" thing?
Is the pushback against this out of genuine concern or just ideological?
What we need now is to go back to the roots: just a simple grep for the internet, augmented by PageRank and eventually some sort of AI and harness to sort the rubbish out. The AI companies have the data and the harnesses.
Google killed themselves when they made sure you can't search direct quotes or outside of your region. If I am going to sort through vague crap, it is better for AI to do it. And AI doesn't look at ads.
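For what it's worth, "grep plus PageRank" fits in a screenful. The sketch below uses the textbook power-iteration form of PageRank with a damping factor of 0.85; the function names, the toy graph and the documents are all hypothetical, and a real crawler-scale version would look nothing like this.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank.

    `graph` maps each page to the list of pages it links to.
    """
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = graph.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new[t] += share
        rank = new
    return rank


def grep_ranked(docs, graph, needle):
    """Plain substring match ("grep"), with hits ordered by PageRank."""
    rank = pagerank(graph)
    hits = [url for url, text in docs.items() if needle.lower() in text.lower()]
    return sorted(hits, key=lambda u: -rank.get(u, 0.0))


# Toy data: "c" is linked from both "a" and "b", so it should rank first.
graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
docs = {"a": "rust search engine", "b": "search notes", "c": "search index"}
print(grep_ranked(docs, graph, "search"))
```

The sorting-the-rubbish-out part is where the real work is; this only covers the easy half.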
There is real opening for a company that just crawls and gives access to other companies to build on top of the collected stuff.
I think we can concede the WWW vision of distributed libertarian publishing has been dead for a long time. LLMs were just the final straw.
We ended up concentrating syndication in a few media companies like Google and the social media companies.
Look at the profit margins of advertising companies vs producers and you’ll get an idea as to why.
But at least I've experienced the golden age. I feel bad for all the kids who will never know what once was.
[1]: https://alternativeto.net/software/google-search/?license=fr...
[2]: https://alternativeto.net/software/google-search/?license=co...