If a customer asks the AI what product can solve their problem and it replies with our product, that is a huge win.
If your business is SEO spam with online ads, ChatGPT might eat it. But if your business is selling some product, ChatGPT might help you sell it.
As of last week, impressions have also dropped. Maybe that is a result of people no longer clicking on my links?
> The best refrigerator on the market varies based on individual needs, but top brands like LG and Samsung are highly recommended for their innovative features, reliability, and energy efficiency. For specific models, consider LG's Smart Standard-Depth MAX™ French Door Refrigerator or Samsung's smart refrigerators with internal cameras.
Optimizing your site for LLMs means that you can direct their gestalt thinking towards your brand.
Yes, SEO can bring traffic to your site, but if your visitors see nothing of value, they'll quickly leave.
Humans get HTML, bots get Markdown. Two tiny tweaks I’d make...
Send Vary: Accept so caches don’t mix Markdown and HTML.
Expose it with a Link: …; rel="alternate"; type="text/markdown" so it’s easy to discover.
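A minimal sketch of what those two tweaks could look like on the server side, using only the Python standard library. The function names `parse_accept` and `negotiate` and the `markdown_url` parameter are hypothetical, and the Accept parsing is simplified (it only reads `q` values and does not implement full specificity rules):

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    items = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable q as "not acceptable"
        items.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(items, key=lambda item: item[1], reverse=True)


def negotiate(accept_header: str, markdown_url: str) -> dict[str, str]:
    """Choose HTML or Markdown and build the response headers.

    markdown_url is a hypothetical caller-supplied URL for the Markdown
    representation; it is only advertised on HTML responses.
    """
    chosen = "text/html"  # default for browsers and unknown clients
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "text/markdown":
            chosen = "text/markdown"
            break
        if media_type in ("text/html", "text/*", "*/*"):
            break
    headers = {
        "Content-Type": f"{chosen}; charset=utf-8",
        # Tell caches that the body depends on the request's Accept header.
        "Vary": "Accept",
    }
    if chosen == "text/html":
        headers["Link"] = f'<{markdown_url}>; rel="alternate"; type="text/markdown"'
    return headers
```

A client that sends `Accept: text/markdown` gets the Markdown content type, while a typical browser Accept header falls through to HTML and receives the `Link` hint. Both responses carry `Vary: Accept`.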
- https://x.com/bunjavascript/status/1971934734940098971
1. The Jina reader API - https://jina.ai/reader/ - add r.jina.ai to any URL to run it through their hosted conversion proxy, e.g. https://r.jina.ai/www.skeptrune.com/posts/use-the-accept-hea...
2. Applying Readability.js and Turndown via Playwright. Here's a shell script that does that using my https://shot-scraper.datasette.io tool: https://gist.github.com/simonw/82e9c5da3f288a8cf83fb53b39bb4...
This is much cheaper to run on a server. For example: https://github.com/ozanmakes/scrapedown
Also, I doubt most large-scale scrapers are running in agent loops with tool calls, so this is probably necessary for those at a minimum.
It seems “obvious” to me that if you have a tool which can request a web page, you can make it so that this tool extracts the main content from the page’s HTML. Maybe there is something I’m missing here that makes this more difficult for LLMs, because before we had LLMs, this was considered an easy problem. It is surprising to me that the addition of LLMs has made this previously easy, efficient solution somehow unviable or inefficient.
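To illustrate how small a client-side extraction step can be, here is a rough sketch using only Python's standard `html.parser`. It is a crude stand-in, not Readability.js or any tool mentioned in this thread: it simply drops text inside a hand-picked set of boilerplate tags (the `SKIP_TAGS` set and the names `MainTextExtractor` and `extract_main_text` are assumptions for illustration) and keeps everything else.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as boilerplate (an assumed, hand-picked set).
SKIP_TAGS = {"head", "script", "style", "nav", "header", "footer", "aside"}


class MainTextExtractor(HTMLParser):
    """Collect text that is not nested inside any of SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # how many skipped elements we are currently inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    """Return the page text with boilerplate regions removed, one chunk per line."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = (
    "<html><head><title>T</title></head><body>"
    "<nav>Menu</nav><article><h1>Title</h1><p>Body text.</p></article>"
    "<footer>Foot</footer></body></html>"
)
print(extract_main_text(page))
```

Real pages are messier (boilerplate in plain `div`s, malformed nesting), which is where heuristics like Readability come in, but the basic step does not require the site's cooperation.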
I think we should also assume here that the web site is designed to be scraped this way—if you don’t, then “Accept: text/markdown” won’t work.
https://toffelblog.xyz/blog/gemini-overview/ https://news.ycombinator.com/item?id=23730408
https://gemini.circumlunar.space/ https://news.ycombinator.com/item?id=23042424
Yes, for prompts. Given how little XML is out on the public internet it'd be surprising if it also applies to data ingestion from web scraping functions. It'd be odd if Markdown works better than HTML to be honest, but maybe Markdown also changes the content being served e.g. there's no menu, header, or footer sent with the body content.