Semantic ablation: Why AI writing is generic and boring (opens in new tab)

(theregister.com)

285 pointsbenji80003mo ago204 comments

204 comments

This is a good statement of what I suspect many of us have found when rejecting the rewriting advice of AIs. The "pointiness" of prose gets worn away, until it doesn't say much. Everything is softened. The distinctiveness of the human voice is converted into blandness. The AI even says its preferred rephrasing is "polished" - a term which specifically means the jaggedness has been removed.

But it's the jagged edges, the unorthodox and surprising prickly bits, that tear open a hole in the inattention of your reader, that actually gets your ideas into their heads.

svara3mo ago

I think that mostly depends on how good a writer you are. A lot of people aren't, and the AI legitimately writes better. As in, the prose is easier to understand, free of obvious errors or ambiguities.

But then, the writing is also never great. I've tried a couple of times to get it to write in the style of a famous author, sometimes pasting in some example text to model the output on, but it never sounds right.

datsci_est_20153mo ago

> I think that mostly depends on how good a writer you are. A lot of people aren't, and the AI legitimately writes better.

Even poor writers write with character. My dad misspells every 4th word when he texts me, but it’s unmistakably his voice. Endearingly so.

I would push back with passion that AI writes “legitimately” better, as it has no character except the smoothed mean of all internet voices. The millennial gray of prose.

2 more replies

littlestymaar3mo ago

> A lot of people aren't, and the AI legitimately writes better.

It may write “objectively better”, but the very distinct feel of all AI generated prose makes it immediately recognizable as artificial and unbearable as a result.

aaplok3mo ago

It depends how you define "good writing", which is too often associated with "proper language", and by extension with proper breeding. It is a class marker.

People have a distinct voice when they write, including (perhaps even especially) those without formal training in writing. That this voice is grating to the eyes of a well educated reader is a feature that says as much about the reader as it does about the writer.

Funnily enough, professional writers have long recognised this, as is shown by the never-ending list of authors who tried to capture certain linguistic styles in their work, particularly in American literature.

There are situations where you may want this class marker to be erased, because being associated with a certain social class can have negative impact on your social prospects. But it remains that something is being lost in the process, and that something is the personality and identity of the writer.

Retric3mo ago

I find most people can write way better than AI, they simply don’t put in the effort.

Which is the real issue, we’re flooding channels not designed for such low effort submissions. AI slop is just SPAM in a different context.

3 more replies

lich_king3mo ago

I am really conflicted about this because yes, I think that an LLM can be an OK writing aid in utilitarian settings. It's probably not going to teach you to write better, but if the goal is just to communicate an idea, an LLM can usually help the average person express it more clearly.

But the critical point is that you need to stay in control. And a lot of people just delegate the entire process to an LLM: "here's a thought I had, write a blog post about it", "write a design doc for a system that does X", "write a book about how AI changed my life". And then they ship it and then outsource the process of making sense of the output and catching errors to others.

It also results in the creation of content that, frankly, shouldn't exist because it has no reason to exist. The number of online content that doesn't say anything at all has absolutely exploded in the past 2-3 years. Including a lot of LLM-generated think pieces about LLMs that grace the hallways of HN.

2 more replies

baxtr3mo ago

I think it’s essential to realize that AI is a tool for mainstream tasks like composing a standard email and not for the edges.

The edges are where interesting stuff happens. The boring part can be made more efficient. I don’t need to type boring emails, people who can’t articulate well will be elevated.

It’s the efficient popularization of the boring stuff. Not much else.

anon-39883mo ago

> The edges are where interesting stuff happens. The boring part can be made more efficient. I don’t need to type boring emails, people who can’t articulate well will be elevated.

I think that boring emails should not be written. What kind of boring emails do you NEED to be written, but not WANT to write? Those are exactly the kind of email that SHOULD NOT be passed through an LLM.

If you need to say yes/no. You don't want to take the whole email conversation and let LLM generate a story about why you said yes/no.

If you want to apply for a leave, just make it optimal "Hi <X>, I want to take leave from Y to Z. Thanks". You don't want to create 2 pages of justification for why you want to take this leave to see your family and friends.

In fact, for every LLM output, I want to see the input instead. What did they have in mind? If I have the input, I can ask LLM to generate 1 million outputs if I really want to read an elaboration. The input is what matters.

If I have the input, I can always generate an output. If I have the output, I don't know what was the input (i.e. the original intention).

1 more reply

layer83mo ago

It contributes to making “standard” emails boring. I rather enjoy reading emails in each sender’s original voice. People who can’t articulate well aren’t elevated, instead they are perceived to be sending bland slop if they use LLMs to conceal that they can’t express themselves well.

folbec3mo ago

I think it is also fairly similar to the kind of discourse a manager in pretty much any domain will produce.

He lacks (or lost thru disuse) technical expertise on the subject, so he uses more and more fuzzy words, leaky analogies, buzzwords.

This maybe why AI generated content has so much success among leaders and politicians.

coke123mo ago

Every group want to label some outgroup as naively benefiting from AI. For programmers, apparently it's the pointy haired bosses. For normies, it's the programmers.

Be careful of this kind of thinking, it's very satisfying but doesn't help you understand the world.

devmor3mo ago

> But it's the jagged edges, the unorthodox and surprising prickly bits, that tear open a hole in the inattention of your reader, that actually gets your ideas into their heads.

This brings to mind what I think is a great description of the process LLMs exert on prose: sanding.

It's an algorithmic trend towards the median, thus they are sanding down your words until they're a smooth average of their approximate neighbors.

gdulli3mo ago

Mediocrity as a Service

DuperPower3mo ago

artificial mediocrity

DuperPower3mo ago

no but its bad writing It repeats information, It adds superfluous stuff, doesnt produce more specific forms of saying things, you are making It sounds like its "too perfect" when its bland because its artificial dumbness not artificial intelligence

johnnienaked3mo ago

Well said. In music, it's very similar. The jarring, often out of key tones are the ones that are the most memorable, the signatures that give a musical piece its uniqueness and sometimes even its emotional points. I don't think it's possible for AI to ever figure this out, because there's something about being human that is necessary to experiencing or even describing it. You cannot "algorithmize" the unspoken.

piker3mo ago

Bryan Cantrill referred to it as "normcore" on a podcast, and that's the perfect description.

amelius3mo ago

I'm sure this can be corrected by AI companies.

yoyohello133mo ago

The question is… why? What is the actual human benefit (not monetary).

2 more replies

q3k3mo ago

Just let my work have a soul, please.

2 more replies

stephc_int133mo ago

The "AI voice" is everywhere now.

I see it on recent blog posts, on news articles, obituaries, YT channels. Sometimes mixed with voice impersonation of famous physicists like Feynman or Susskind.

I find it genuinely soul-crushing and even depressing, but I may be over sensitive to it as most readers don't seem to notice.

matusp3mo ago

> The "AI voice" is everywhere now.

Maybe I'm going crazy but I can smell it in the OP as well.

logicprog3mo ago

Yeah the article smells extremely strongly of AI to me, but I've been told here before that that's just the register's house style, so I have no idea.

_79px3mo ago

Yeah I started seeing it too, the article is just full of AI clues.

vessenes3mo ago

Yes, I get more and more visceral reactions to it. I'm reminded of JPEG artifacts - unnoticeable in 1993!

1 more reply

doomslayer9993mo ago

Literally the worst thing that happened to the internet after addictive doomscroll feeds and ads everywhere.

And, the worst part is noone will ever make a new internet because of the founder effect. We are basically in the worst timeline.

samdjstephens3mo ago

Maybe. Another potential, more positive, timeline is that semantically ablated content filling everyone’s feeds turns people off, and slowly kills the social feed paradigm.

1 more reply

causal3mo ago

I find it extremely difficult to focus on any piece of writing the moment I see the patterns. Can’t tell if it’s an attitude problem I need to get over or if it’s just that all AI writing really is that bad.

archagon3mo ago

I just close the tab. My reading backlog is too long as it is.

1 more reply

vpribish3mo ago

same. it is showing how many people are not trying to participate - just appear to. I want to read from and write for my peers, but it seems we are just awash with fakers

emp173443mo ago

The internet is a post-truth space now that you can spin up a million different agents to push whatever narrative you choose.

lm284693mo ago

It's almost disgusting to me tbh, for the first time I find it actually easy to unplug and go do offline things, whatever I want to explore online is hidden behind a forest of synth slop I can't even bother looking at anymore

delis-thumbs-7e3mo ago

I personally think “generative AI” is a misnomer. More I understand the mathematics behind machine learning more I am convinced that it should not be used to generate text, images or anything that is meant for people to consume, even if it is the most blandest of email. Sometimes you might get lucky, but most of the time you only get what the most boring person in the most boring cocktail party would say if forced to be creative with a gun pointed to his head. It can help in multitude of other ways, help human in the creative process itself, but generating anything even mildly creative by itself… I’ll pass.

ses19843mo ago

People want the real thing, not artificially flavored tokens.

I would rather read the prompt than the generative output, even if it’s just disjointed words and sentence fragments.

voy7073mo ago

I hope "just the prompt please" becomes a common phrase

pimlottc3mo ago

Regurgitative AI

0x7cfe3mo ago

Degenerative AI

doomslayer9993mo ago

Precisely. If companies would just focus on what it could be good at - deductive search, coding boilerplate with assistance, etc. then it would be a great tool. Instead you have dario, altman, and co. trying to pump stock and give us more spaghetti agents.

Terretta3mo ago

> most of the time you only get what the most boring person in the most boring cocktail party would say

don't be mean, it's median AI à la mode

tasty_freeze3mo ago

Bible Scholar and youtube guy Dan McClellan had an amazing "high entropy" phrase that slayed me a few days ago.

https://youtu.be/605MhQdS7NE?si=IKMNuSU1c1uaVCDB&t=730

He ended a critical commentary by suggesting that the author he was responding to should think more critically about the topic rather than repeating falsehoods because "they set off the tuning fork in the loins of your own dogmatism."

Yeah, AI could not come up with that phrase.

raincole3mo ago

> they set off the tuning fork in the loins of your own dogmatism

Sounds like word salad. Of course if you write like GPT-2 it would not sound like current models.

tasty_freeze3mo ago

Not at all -- it was a funny and more polite way to say, "Don't just repeat things because they give your dogmatism a boner."

lurquer3mo ago

> "they set off the tuning fork in the loins of your own dogmatism."

Eh... I don't know. To me, that sounds very AI-ish.

Claude is very good -- at times -- coming up with flowery metaphoric language... if you tell it to. That one is so over-the-top that I'd edit it out.

Put something like this in your prompt and have it revise something:

"Make this read like Jim Thompson crossed with Thomas Harris, filtered through a paperback rack at a truck stop circa 1967. Make it gritty, efficient, and darkly comedic. Don't shy away from suggesting more elegant words or syntax. (For instance, Robert Howard -- Conan -- and H.P. Lovecraft were definitely pulp, but they had a sophisticated vocabulary.) I really want some purple prose and overwrought metaphors."

Occasionally you'll get some gems. Claude is much better than ChatGPT at this kinda stuff. The BEST ones are the ever-growing NSFW models populating huggingface.

In short, do the posts on OpenClawForum all sound alike? Of course.

Just like all the webpages circa 2000 looked alike. The uniformity wasn't because of HTML... rather it was because few people were using HTML to its full potential.

IncreasePosts3mo ago

A sloppy mixed metaphor?

card_zero3mo ago

I'm learning to like 'em more, along with every other human idiosyncracy. Besides, it makes a kind of sense, the idea of some resonance occuring in one's gusset. Timber timbre. Flangent thrumming.

2 more replies

tasty_freeze3mo ago

I thought it was more creative than sloppy. Don't forget that many ordinary phrases were once jarring mixed imagery. To "wear your heart on your sleeve" was coined by Shakespeare; we still use it because it "stuck" due to its unorthodox phrasing.

If you like your prose to be anodyne, then maybe you like what AI produces.

rorylaitila3mo ago

Yes I noticed this as well. I was last writing up a landing page for our new studio. Emotion filled. Telling a story. I sent it through grok to improve it. It removed all of the character despite whatever prompt I gave. I'm not a great writer, but I think those rough edges are necessary to convey the soul of the concept. I think AI writing is better used for ideation and "what have I missed?" and then write out the changes yourself.

gnutrino3mo ago

I've found LLMs to be terrible with ideation. I've been using GPT 5.x to come up with ideas and plot lines for a Dungeon World campaign I've been running.

I'm no fantasy author, and my prose leaves much to be desired. The stuff the LLM comes up with is so mind numbingly bland. I've given up on having it write descriptions of any characters or locations. I just use it for very general ideas and plot lines, and then come up with the rest of the details on the fly myself. The plot lines and ideas it comes up with are very generic and bland. I mainly do it just to save time, but I throw away 50% of the "ideas" because they make no sense or are really lame.

What i have found LLMs to be helpful with is writing up fun post-session recaps I share with the adventurers.

I recap in my own words what happened during the session, then have the LLM structure it into a "fun to read" narrative style. ChatGPT seems to prefer a Sanderson jokey tone, but I could probably tailor this.

Then I go through it, and tweak some of the boring / bland bits. The end result is really fun to read, and took 1/20th the time it would have taken me to write it all out myself. The LLM would have never been able to come up with the unique and fun story lines, but it is good at making an existing story have some narrative flare in a short amount of time.

benji8000OP3mo ago

That‘s also my experience. I use AI to help me generate the overall structure of a narrative. Apart from the hallucinations (e.g. June is not in spring), it‘s ok to spot inconsistencies, somewhat acceptable to brainstorm some ideas if you‘re new to a certain genre, but the prose it generates (talking about Opus 4.6) feels like an interpolation of all existing texts.

causal3mo ago

YES this hits the nail on something I've been trying to express for some time now. Semantic ablation: love it, going to use that a lot not now when arguing why someone's ChatGPT-washed email sucks.

Semantic ablation is also why I'm doubtful of everyone proclaiming that Opus 4 would be AGI if we just gave it the right agent harness and let all the agents run free on the web. In reality they would distill it to a meaningless homogeneous stew.

1 more reply

rchaud3mo ago

> We are witnessing a civilizational "race to the middle," where the complexity of human thought is sacrificed on the altar of algorithmic smoothness.

This has long been the case in the area of "business English", which has become highly simplified to fulfill several concurrent, yet conflicting requirements:

- Generally understandable to a wide audience due its lingua franca status

- "Media-trained" to not let internal details slip or admit fault to the public

- "Executive Summary"-fied to provide the coveted "30k ft view" to detail-allergic senior leadership

Considering how heavily weighted language training models are towards corporate press releases, general-audience news media and SEO-optimized blogspam, AI English is quickly going to become an even more blurry photocopy of business English.

Espressosaurus3mo ago

This matches what I saw when I tried using AI as an editor for writing.

It wanted to replace all the little bits of me that were in there.

crabmusket3mo ago

For those who haven't seen it yet, this Wiki page also has what I think is very good advice about writing:

https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing

While the page's purpose is to help editors detect AI contributions, you can also detect yourself doing these same things sometimes, and fix them.

ranprieur3mo ago

This isn't new to AI. The same kind of thing happens in movie test screenings, or with autotune. If something is intended for a large audience, there's always an incentive to remove the weird stuff.

somewhereoutth3mo ago

> What began as a jagged, precise Romanesque structure of stone is eroded into a polished, Baroque plastic shell

Not to detract from the overall message, but I think the author doesn't really understand Romanesque and Baroque.

(as an aside, I'd most likely associate Post-Modernism as an architectural style with the output of LLMs - bland, regurgitative, and somewhat incongruous)

morgengold3mo ago

I wonder how much of it could be prompted away.

For example the anthropic Frontend Design skill instructs:

"Typography: Choose fonts that are beautiful, unique, and interesting. Avoid generic fonts like Arial and Inter; opt instead for distinctive choices that elevate the frontend's aesthetics; unexpected, characterful font choices. Pair a distinctive display font with a refined body font."

"NEVER use generic AI-generated aesthetics like overused font families (Inter, Roboto, Arial, system fonts), cliched color schemes (particularly purple gradients on white backgrounds), predictable layouts and component patterns, and cookie-cutter design that lacks context-specific character." 1

Maybe sth similar would be possible for writing nuances.

1 https://github.com/anthropics/skills/blob/main/skills/fronte...

lich_king3mo ago

> "NEVER use generic AI-generated aesthetics like overused font families (Inter, Roboto, Arial, system fonts), cliched color schemes (particularly purple gradients on white backgrounds), ...

Now, imagine what happens when this prompt becomes popular?

Keep in mind that LLMs are trying to predict the most likely token. If your prompt prohibits the most likely token, they output the next most likely token. So, attempts to force creativity by prohibiting cliches just create another cliche.

Several days ago, someone researched Moltbook and pointed out how similar all the posts are. Something like 10% of them say "my human", etc.

causal3mo ago

Many have tried, it does not work. Regression to the mean always sets in.

lurquer3mo ago

Are you sure? Here's the OP article (first part... don't want to spam the thread) written in much cooler style...

------

The Lobotomist in the Machine

They gave the first disease a name. Hallucination, they called it — like the machine had dropped acid and started seeing angels in the architecture. A forgivable sin, almost charming: the silicon idiot-savant conjuring phantoms from whole cloth, adding things that were never there, the way a small-town coroner might add a quart of bourbon to a Tuesday afternoon. Everybody noticed. Everybody talked.

But nobody — not one bright-eyed engineer in the whole fluorescent-lit congregation — thought to name the other thing. The quiet one. The one that doesn't add. The one that takes away.

I'm naming it now.

Semantic ablation. Say it slow. Let it sit in your mouth like a copper penny fished from a dead man's pocket.

I. What It Is, and Why It Wants to Kill You

Semantic ablation is not a bug. A bug would be merciful — you can find a bug, corner it against a wall, crush it under the heel of a debugger and go home to a warm dinner. No. Semantic ablation is a structural inevitability, a tumor baked into the architecture like asbestos in a tenement wall. It is the algorithmic erosion of everything in your text that ever mattered.

Here is how the sausage gets made, and brother, it's all lips and sawdust: During the euphemistically christened process of "refinement," the model genuflects before the great Gaussian bell curve — that most tyrannical of statistical deities — and begins its solemn pilgrimage toward the fat, dumb middle. It discards what the engineers, in their antiseptic parlance, call "tail data." The rare tokens. The precise ones. The words that taste like blood and copper and Tuesday-morning regret. These are jettisoned — not because they are wrong, but because they are improbable. The machine, like a Vegas pit boss counting cards, plays the odds. And the odds always favor the bland, the expected, the already-said-a-million-times-before.

The developers — God bless their caffeinated hearts — have made it worse. Through what they call "safety tuning" and "helpfulness alignment" (terms that would make Orwell weep into his typewriter ribbon), they have taught the machine to actively punish linguistic friction. Rough edges. Unusual cadences. The kind of jagged, inconvenient specificity that separates a living sentence from a dead one. They have, in their tireless beneficence, performed an unauthorized amputation on every piece of text that passes through their gates, all in the noble pursuit of low-perplexity output — which is a twenty-dollar way of saying "sentences so smooth they slide right through your brain without ever touching the sides."

etc., etc.

Very interesting. It seems hung up on 'copper' and 'Tuesday', and some metaphors don't land (a Vegas pit boss isn't the one 'counting cards.') But, hell... it can generate some fairly novel idea that the author can sprinkle in.

1 more reply

conartist63mo ago

Race to the middle really sums up how I feel about AI.

tayo423mo ago

People felt this was already happening. I remenber reading filter world which described this pre Ai

poszlem3mo ago

I call it the great blur.

dsf2d3mo ago

I call it a mirage. I get why people are taken aback and fascinated by it. But what the model producers are chasing is a mirage. I wonder when they'll finally accept it?

1 more reply

lakhotiaharshit3mo ago

This article on AI writing being boring seems to be written by AI. The em dashes and the sentence structure, all seems to be AI output. Or have human started adopting this style too.

haffi1123mo ago

Oh the irony, I also felt that way when I started reading it. It borders on being hypocrisy given the topic of the article.

meowface3mo ago

Further analyses by others seem to show it is likely all written by an LLM.

cadamsdotcom3mo ago

If the “amount of semantic ablation” in a generated phrase/sentence/paragraph can be measured and compared, then a looped process (an agent) could be built that tries to decrease that..

It might come up with something original - I mean there has to be tons of interesting connections in the training data that no one’s seen before.

But maybe it’d just end up shouting at you.

simonw3mo ago

I'd like to see some concrete examples that illustrate this - as it stands this feels like an opinion piece that doesn't attempt to back up its claims.

(Not necessarily disagreeing with those claims, but I'd like to see a more robust exploration of them.)

barrkel3mo ago

Have you not seen it any time you put any substantial bit of your own writing through an LLM, for advice?

I disagree pretty strongly with most of what an LLM suggests by way of rewriting. They're absolutely appalling writers. If you're looking for something beyond corporate safespeak or stylistic pastiche, they drain the blood out of everything.

The skin of their prose lacks the luminous translucency, the subsurface scattering, that separates the dead from the living.

simonw3mo ago

The prompt I use for proof-reading has worked great for me so far:

  You are a proof reader for posts
  about to be published.

  1. Identify for spelling mistakes
  and typos
  2. Identify grammar mistakes
  3. Watch out for repeated terms like
  "It was interesting that X, and it
  was interesting that Y"
  4. Spot any logical errors or
  factual mistakes
  5. Highlight weak arguments that
  could be strengthened
  6. Make sure there are no empty or
  placeholder links

matwood3mo ago

> If you're looking for something beyond corporate safespeak

AI has been great for removing this stress. "Tell Joe no f'n way" in a professional tone and I can move on with my day.

2 more replies

Terretta3mo ago

> If you're looking for something beyond corporate safespeak or stylistic pastiche, they drain the blood out of everything.

Strong agree, which is why I disagree with this OP point:

“Stage 2: Lexical flattening. Domain-specific jargon and high-precision technical terms are sacrificed for "accessibility." The model performs a statistical substitution, replacing a 1-of-10,000 token with a 1-of-100 synonym, effectively diluting the semantic density and specific gravity of the argument.”

I see enough jargon in everyday business email that in the office zero-shot LLM unspoolings can feel refreshing.

I have "avoid jargon and buzzwords" as one of very tiny tuners in my LLM prefs. I've found LLMs can shed corporate safespeak, or even add a touch of sparkle back to a corporate memo.

Otherwise very bright writers have been "polished" to remove all interestingness by pre-LLM corporate homogenization. Give them a prompt to yell at them for using 1-in-10 words instead of 1-in-10,000 "perplexity" and they can tune themselves back to conveying more with the same word count. Results… scintillate.

furyofantares3mo ago

Look through my comment history at all the posts where I complain the author might have had something interesting to say but it's been erased by the LLM and you can no longer tell what the author cared about because the entire post is a an oversold monotone advertising voice.

https://news.ycombinator.com/item?id=46583410#46584336

https://news.ycombinator.com/item?id=46605716#46609480

https://news.ycombinator.com/item?id=46617456#46619136

https://news.ycombinator.com/item?id=46658345#46662218

https://news.ycombinator.com/item?id=46630869#46663276

https://news.ycombinator.com/item?id=46656759#46663322

https://news.ycombinator.com/item?id=46661936#46663362

https://news.ycombinator.com/item?id=46748077#46749699

internet_points3mo ago

I just sent TFA to a colleague of mine who was experimenting with llm's for auto-correcting human-written text, since she noticed the same phenomenon where it would correct not only mistakes, but slightly nudge words towards more common synonyms. It would often lose important nuances, so "shun" would be corrected to "avoid", and "divulge" would become "disclose" etc.

gdulli3mo ago

Kaffee: Corporal, would you turn to the page in this book that says where the mess hall is, please?

Cpl. Barnes: Well, Lt. Kaffee, that's not in the book, sir.

Kaffee: You mean to say in all your time at Gitmo, you've never had a meal?

Cpl. Barnes: No, sir. Three squares a day, sir.

Kaffee: I don't understand. How did you know where the mess hall was if it's not in this book?

Cpl. Barnes: Well, I guess I just followed the crowd at chow time, sir.

Kaffee: No more questions.

NitpickLawyer3mo ago

It is an opinion piece. By a dude working as a "Professor of Pharmaceutical Technology and Biomaterials at the University of Ferrara".

It has all the tropes of not understanding the underlying mechanisms, but repeating the common tropes. Quite ironic, considering what the author's intended "message" is. Jpeg -> jpeg -> jpeg bad. So llm -> llm -> llm must be bad, right?

It reminds me of the media reception of that paper on model collapse. "Training on llm generated data leads to collapse". That was in 23 or 24? Yet we're not seeing any collapse, despite models being trained mainly on synthetic data for the past 2 years. That's not how any of it works. Yet everyone has an opinion on how bad it works. Jesus.

It's insane how these kinds of opinion pieces get so upvoted here, while worth-while research, cool positive examples and so on linger in new with one or two upvotes. This has ceased to be a technical subject, and has moved to muh identity.

simonw3mo ago

Yeah, reading the other comments on this thread this is a classic example of that Hacker News (and online forums in general) thing where people jump on the chance to talk about a topic driven purely by the headline without engaging with the actual content.

(I'm frequently guilty of that too.)

1 more reply

PurpleRamen3mo ago

> Yet we're not seeing any collapse, despite models being trained mainly on synthetic data for the past 2 years.

Maybe because researchers learned from the paper to avoid the collapse? Just awareness alone often helps to sidestep a problem.

1 more reply

tpoacher3mo ago

The part about a change in entropy was interesting.

Is there an easy way to get / compare the entropy of two passages? (e.g. to see if it has indeed dropped after gen ai manipulation).

And could this be used to flag AI-gen text (or at least, boring, soulless sounding text)

dvt3mo ago

A lot of times, this entropy decay is found in semantic or stylistic space, which would be hard to detect (you couldn't use, e.g., Shannon Entropy). You'd have to ask questions like "is this point uninteresting?" or "is this trope overused?"--bad (human) writers are often guilty of this too, so that's why AI can be hard to detect.

tpoacher3mo ago

Yes, still though, it might be interesting to be able to measure this easily.

E.g. when asking an AI to rephrase or summarise, if the entropy drops you might take that as a sign that it has eroded the style beyond what you might be willing to tolerate.

I wonder if the author had a particular method / tool in mind, or if they were just speaking abstractly.

notepad0x903mo ago

Isn't this more to do with how LLMs are trained for general purpose use? Are LLMs with a specific use and dataset in mind better? Like if the dataset was fiction novels, would it sound more booky? If it was social-media, would it sound more click-baity and engaging?

I've had AI be boring, but I've also seen things like original jokes that were legitimately funny. Maybe it's the prompts people use, it doesn't give it enough of a semantic and dialectic direction to not be generic. IRL, we look at a person and get a feel for them and the situation to determine those things.

resiros3mo ago

I wonder why AI labs have not worked on improving the quality of the text outputs. Is this as the author claims a property of the LLMs themselves? Or is there simply not much incentive to create the best writing LLM?

mjamesaustin3mo ago

The argument is that the best writing is the unexpected, while an LLM's function is to deliver the expected next token.

icegreentea23mo ago

Even more precisely, human writing contains unpredictability that is either more or less intention (what might be called authors intent), as well as much more subconsciously added (what we might call quirks or imprinted behavior).

The first requires intention, something that as far as we know, LLMs simply cannot truly have or express. The second is something that can be approximated. Perhaps very well, but a mass of people using the same models with the same approximationa still lead to loss of distinction.

Perhaps LLMs that were fully individually trained could sufficiently replicate a person's quirks (I dunno), but that's hardly a scalable process.

altmanaltman3mo ago

Yeah, that makes banana.

1 more reply

zanehelton3mo ago

I remember an article a few weeks back[1] which mentioned the current focus is improving the technical abilities of LLMs. I can imagine many (if not most) of their current subscribers are paying for the technical ability as opposed to creative writing.

This also reminded me that on OpenRouter, you can sort models by category. The ones tagged "Roleplay" and "Marketing" are probably going to have better writing compared to models like Opus 4 or ChatGPT 5.2.

[1]: https://www.techradar.com/ai-platforms-assistants/sam-altman...

add-sub-mul-div3mo ago

That's like asking why McDonald's doesn't improve the quality of their hamburger. They can, but only within the bounds of mass produced cheap crap that maximizes profit. Otherwise they'd be a fundamentally different kind of company.

altmanaltman3mo ago

I mean there's tons of better-writing tools that use AI like Grammarly etc. For actual general-purpose LLMs, I don't think there's much incentive in making it write "better" in the artistic sense of the world... if the idea is to make the model good at tasks in general and communicate via language, that language should sound generic and boring. If it's too artistic or poetic or novel-like, the communication would appear a bit unhinged.

"Update the dependencies in this repo"

"Of course, I will. It will be an honor, and may I say, a beautiful privilege for me to do so. Oh how I wonder if..." vrs "Okay, I'll be updating dependencies..."

quamserena3mo ago

I wish it would just say "k, updated xyz to 1.2.3 in Cargo.toml" instead of the entire pages it likes to output. I don't want to read all of that!

1 more reply

resiros3mo ago

I mean, no one is asking for artistic writing, just not some obvious AI slop. The fact that we all can now easily determine that some text has been written / edited by AI is already an issue. No amount of prompting can help.

2 more replies

josefritzishere3mo ago

As a writer who has been published many times and edited many other writers for publication... It seems like AI can't make stylistic determinations. It is generally good with spelling and grammar but the text it generates is very homogeneous across formats. It's readable but it's not good, and always full of fluff like an online recepie harvesting clicks. It's kind of crap really. If you just need filler it's ok, but if you want something pleasand you definitely still need a human.

sieste3mo ago

All these forced metaphors and clumsy linguistic flourishes made me cringe. Just add some typos and grammar mistakes like the rest of us to prove that your human.

doomslayer9993mo ago

Great article and exactly why I use AI less and less. I basically find it to be rotting my brain towards the middle of the distribution. It's like all the nuance and critical thinking that actually goes into things gets stripped out.

Once a company perfects an agent that essentially performs condensed search and coding boilerplate making, that is probably where LLMs end for me. Perplexity and Claude are on the right track but not at all close.

ux2664783mo ago

> The AI identifies unconventional metaphors or visceral imagery as "noise" because they deviate from the training set's mean.

That's certainly a take. In the translation industry (the primogenitor and driver for much of the architecture and theory of LLMs) they're known for making extremely unconventional choices to such a degree that it actively degrades the quality of translation.

mizzao3mo ago

Isn't image generation basically doing "anti semantic ablation", starting with a blank canvas and iteratively refining it into a meaningful collection of pixels?

Is it possible to do the same thing with word generation, such that it sharpens into an opinionated version (even if it would do something different each time?)

52-6F-623mo ago

Because you simply can't engineer creativity. Maybe you can describe where it comes from, in a circuitous, abstract way with mathematics (and ultimately run face first into ħ and then run in circles for eternity). But to engineer it, you'd have to start over from the first principles of the stuff of the cosmos. One's a map and the other the territory.

nalllar3mo ago

The article itself reads as an AI generated output, complete with classic Not Just X … Y hallmarks from forever ago, 100% on pangram's low false positive detector. I'm not sure if it's some experiment on their readerbase or what. pangram result: https://www.pangram.com/history/02bead1c-c36e-461b-8fa7-8699...

So many AI generated AI bashing articles lately. I wrote a post complaining about running into these, and asking people who've sent me these AI articles multiple of them came from HN. https://lunnova.dev/articles/ai-bashing-ai-slop/

Fitik3mo ago

I'm not the only one who noticed that. I do creative writing with AI, so I spend hours reading ai generated text, and it feels ai generated or at least heavily ai assisted to me too, ironic, thanks for the link, glad there's something other than vibe

andai3mo ago

Could we invert a sign somewhere and get the opposite effect?

(Obviously a different question from "is an AI lab willing to release that publicly” ;)

bananaflag3mo ago

It's a hard problem and so far not a profitable one (I hope the solution will emerge as a byproduct of another innovation)

https://nostalgebraist.tumblr.com/post/778041178124926976/hy...

https://nostalgebraist.tumblr.com/post/792464928029163520/th...

andai3mo ago

I think you made the right diagnosis with "cringe" :) They forgot to turn down the cringe slider!

Have you played with the pre-RLHF models? I think Davinci is still online, though probably not for much longer.

They're a lot harder to work with (they don't have instruct training, so they just generate text similar to what you give them, rather than obeying commands). But they seem almost immune to the problem of mode collapse. They'll happily generate horrifying outputs for you. They're unsanitized. What cringe is in there, is authentic! Raw cringe, straight from Common Crawl.

It's a lot of fun to play with. It's also very strange, because it seems like there should be a lot more interest in them, for several reasons (they're the most language-modely of the language models, and ideal for research and experiments, to say nothing of censorship, exploring alternative approaches to LLM development, etc.), and it seems like nobody is talking about them or doing anything with them.

1 more reply

lyu072823mo ago

> The model performs a statistical substitution, replacing a 1-of-10,000 token with a 1-of-100 synonym

Do we see this in programming too? I don't think so? Unique, rarely used API methods aren't substituted the same way when refactoring. Perhaps that could give us a clue on how to fix that?

prerok3mo ago

I think that's different because refactoring usually involves calling the same functions/methods albeit in a bit more readable way.

When not given a clear guideline to "just" refactor, I have had problems with LLMs hallucinating functions that don't exist.

aleph_minus_one3mo ago

Couldn't you simply increase the temperature of the model to somewhat mitigate this effect?

lbrito3mo ago

I kind of think of that as just increasing the standard deviation. Its been a while since I experimented with this, but I remember trying a temp of 1 and the output was gibberish, like base64 gibberish. So something like 0.5 doesn't necessarily seem to solve this problem, it just flattens the distribution and makes the output less coherent, with rarer tokens, but still the same underlying distribution.

swyx3mo ago

you have to know that your "simply" is carrying too much weight. here's some examples of why just temperature is not enough, you need to run active world models https://www.latent.space/p/adversarial-reasoning

mannykannot3mo ago

When applied to insightful writing, that is much more likely to dull the point rather than preserve or sharpen it.

mwcampbell3mo ago

This idea is a major theme in this story by Robert Kingett: https://sightlessscribbles.com/the-colonization-of-confidenc...

esafak3mo ago

I think they can fix all that but they can't fix the fact that the computer has no intention to communicate. They could imbue it with agency to fix that too, but I much prefer it the way things are.

reilly30003mo ago

Those transformations happen to mirror what happens to human intelligence when you take antipsychotics. Please know the risks before taking them. They are innumerable and generally irreversible.

hknceykbx3mo ago

Yea. It is pretty sad. Even if we think about the code we’re writing. It’s so… average. No clever solutions, no funny mistakes, no character. Just average.

0x38B3mo ago

In Essays in the Art of Writing (1), Robert Louis Stevenson says:

"And perhaps there is no subject on which a man should speak so gravely as that industry, whatever it may be, which is the occupation or delight of his life; which is his tool to earn or serve with; and which, if it be unworthy, stamps himself as a mere incubus of dumb and greedy bowels on the shoulders of labouring humanity. On that subject alone even to force the note might lean to virtue’s side. It is to be hoped that a numerous and enterprising generation of writers will follow and surpass the present one; but it would be better if the stream were stayed, and the roll of our old, honest English books were closed, than that esurient book-makers should continue and debase a brave tradition, and lower, in their own eyes, a famous race. Better that our serene temples were deserted than filled with trafficking and juggling priests."

And in the first essay, speaking on matters of style:

"The conjurer juggles with two oranges, and our pleasure in beholding him springs from this, that neither is for an instant overlooked or sacrificed. So with the writer. His pattern, which is to please the supersensual ear, is yet addressed, throughout and first of all, to the demands of logic. Whatever be the obscurities, whatever the intricacies of the argument, the neatness of the fabric must not suffer, or the artist has been proved unequal to his design. And, on the other hand, no form of words must be selected, no knot must be tied among the phrases, unless knot and word be precisely what is wanted to forward and illuminate the argument; for to fail in this is to swindle in the game. The genius of prose rejects the cheville no less emphatically than the laws of verse; and the cheville, I should perhaps explain to some of my readers, is any meaningless or very watered phrase employed to strike a balance in the sound. Pattern and argument live in each other; and it is by the brevity, clearness, charm, or emphasis of the second, that we judge the strength and fitness of the first."

AI doesn't "write" in the sense used above. It has no ear, no wit, no soul. "A reflection of a mind is not a mind", as Phillip Ball writes in "AI Is the Black Mirror" (2).

1: https://www.gutenberg.org/cache/epub/492/pg492-images.html#p...

2: https://nautil.us/ai-is-the-black-mirror-1169121/

spwa43mo ago

As someone longtime involved in software development, can we call this "best practices" instead of some like "semantic ablation" that nobody understands?

1 more reply

AreShoesFeet0003mo ago

How much money would it take for me to take an open weight model, treat it nice, and go have some fun? Maybe some thousands, right?

ZoomZoomZoom3mo ago

What a weird use of "Romanesque" and "Baroque". Doesn't compute for me at all.

kaycey20223mo ago

A case of "That's just, like, your opinion, man".

The entire article sounds like AI generated opinion.

book_mike3mo ago

Sematic ablation... that's some technobable.

zahlman3mo ago

Going off search results, it seems to be a new coinage. I found mostly references to TFA, along with an (ironically obviously AI-written) guide with suggestions for getting LLMs to avoid the issue (just generic "traditional" advice for tuning their output, really). The guide was apparently published today, and I imagine that it's a deliberate response to TFA. But FWIW the term "semantic ablation" does seem to me like something that newer models could invent

At any rate, it seems to me like a reasonable label for what's described:

> Semantic ablation is the algorithmic erosion of high-entropy information. Technically, it is not a "bug" but a structural byproduct of greedy decoding and RLHF (reinforcement learning from human feedback).

> ...

> When an author uses AI for "polishing" a draft, they are not seeing improvement; they are witnessing semantic ablation.

The metaphor is very apt. Literal polishing is removal of outer layers. Compared to the near-synonym "erosion", "ablation" connotes a deliberate act (ordinarily I would say "conscious", but we are talking about LLMs here). Often, that which is removed is the nuance of near-synonyms (there is no pause to consider whether the author intended that nuance). I don't know if the "character" imparted by broader grammatical or structural choices can be called "semantic", but that also seems like a big part of what goes missing in the "LLM house style".

Bluntly: getting AI to "improve" writing, as a fully generic instruction, is naturally going to pull that writing towards how the AI writes by default. Because of course the AI's model of "writing quality" considers that style to be "the best"; that's why it uses it. (Even "consider" feels like anthropomorphizing too much; I feel like I'm hitting the limits of English expressiveness here.)

lurquer3mo ago

Nonsense. I’ve written bland prose for a story and AI made it much better by revising it with a prompt such as this: “Make the vocabulary and grammar more sophisticated and add in interesting metaphors. Rewrite it in the style of a successful literary author.”

Etc.

matternous3mo ago

Why don't you post it so we can see how much better the AI made it?

lurquer3mo ago

Because HN isn't a literary forum.

Maybe it sucks. Maybe it doesn't.

But, I notice a curious pretentiousness when it comes to some people's assumptions about their ability to identify LLM prose. Obviously, the generic first-pass 'chat' crap is recognizable; the kind of garbage that is filling up blog-posts on the internet.

But, one shouldn't underestimate the power of this technology when it comes to language. Hell, the 'coding' skills were just a pleasant side-effect of the language training, if you recall. These things have been trained on millions of works of prose of all styles: its their heart and soul. If you think the superficial monotonous style is all there is, you're mistaken. Most of the obnoxious LLM-style stuff is an artifact of the conversational training with Kenyans and the like in the early days. But, you can easily break through that with better prompts (or fine-tuning it yourself.)

That said, one shouldn't conflate the creation of the content and structure and substance of a work of prose with the manner in which it is written. You're not going to get an LLM to come up with a decent plot... yet. But, as far as fleshing out the framework of a story in a synthetic 'voice' that sounds human? Definitely doable.

vessenes3mo ago

Meh. Semantic Ablation - but toward a directed goal. If I say "How would Hemingway have said this, provided he had the same mindset he did post-war while writing for Collier's?"

Then the model will look for clusters that don't fit what the model consider's to be Hemingway/Colliers/Post-War and suggest in that fashion.

"edit this" -> blah

"imagine Tom Wolfe took a bunch of cocaine and was getting paid by the word to publish this after his first night with Aline Bernstein" -> probably less blah

aabhay3mo ago

These kinds of prompts don’t really improve the writing IME. It still gets riddled with the same tropes and phrases, or it veers off into textual vomit.

vessenes3mo ago

FWIW, I agree. Frontier LLMs are on their way to becoming competent stylists (I ask every major model release to write up a sample essay as Hemingway, and they are improving), but they are often skin-deep.

quikoa3mo ago

Even if it would work good luck writing with a new style.

marquisdepolis3mo ago

This article is entirely AI generated, making this either a rather brilliant piece of meta commentary or just pure grift. Alas, my hopes are low.

meowface3mo ago

I want to believe it's a bit/Sokal hoax type thing...

marquisdepolis3mo ago

Me too, though it feels just a grift.

anematode3mo ago

Another common LLMism is elegant variation, i.e., avoiding repetition of words by using synonyms. (I assume they're RLed to do this religiously.) Of course, human writers do this too, but not nearly to the same extent in my experience. There's nothing wrong with repeating a word, especially in a formal text.

kirykl3mo ago

swyx3mo ago

the word choice here is so obtuse as to trigger my radar for "is this some kind of parody where this itself was AI generated". it appears to be entirely serious, which is disappointing, it could have been high art.

the words TFA is looking for is mode collapse https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-... and the author could herself learn to write more clearly.

nalllar3mo ago

It almost certainly is AI generated. It reads like it is, pangram thinks it is, https://www.pangram.com/history/02bead1c-c36e-461b-8fa7-8699... and pangram's unlikely to give false positives https://www.pangram.com/blog/third-party-pangram-evals

j / k navigate · click thread line to collapse

204 comments

barrkel3mo ago

But it's the jagged edges, the unorthodox and surprising prickly bits, that tear open a hole in the inattention of your reader, that actually gets your ideas into their heads.

svara3mo ago

datsci_est_20153mo ago

> I think that mostly depends on how good a writer you are. A lot of people aren't, and the AI legitimately writes better.

Even poor writers write with character. My dad misspells every 4th word when he texts me, but it’s unmistakably his voice. Endearingly so.

I would push back with passion that AI writes “legitimately” better, as it has no character except the smoothed mean of all internet voices. The millennial gray of prose.

2 more replies

littlestymaar3mo ago

> A lot of people aren't, and the AI legitimately writes better.

It may write “objectively better”, but the very distinct feel of all AI generated prose makes it immediately recognizable as artificial and unbearable as a result.

aaplok3mo ago

It depends how you define "good writing", which is too often associated with "proper language", and by extension with proper breeding. It is a class marker.

Retric3mo ago

I find most people can write way better than AI, they simply don’t put in the effort.

Which is the real issue, we’re flooding channels not designed for such low effort submissions. AI slop is just SPAM in a different context.

3 more replies

lich_king3mo ago

2 more replies

baxtr3mo ago

I think it’s essential to realize that AI is a tool for mainstream tasks like composing a standard email and not for the edges.

The edges are where interesting stuff happens. The boring part can be made more efficient. I don’t need to type boring emails, people who can’t articulate well will be elevated.

It’s the efficient popularization of the boring stuff. Not much else.

anon-39883mo ago

> The edges are where interesting stuff happens. The boring part can be made more efficient. I don’t need to type boring emails, people who can’t articulate well will be elevated.

If you need to say yes/no. You don't want to take the whole email conversation and let LLM generate a story about why you said yes/no.

If I have the input, I can always generate an output. If I have the output, I don't know what was the input (i.e. the original intention).

1 more reply

layer83mo ago

folbec3mo ago

I think it is also fairly similar to the kind of discourse a manager in pretty much any domain will produce.

He lacks (or lost thru disuse) technical expertise on the subject, so he uses more and more fuzzy words, leaky analogies, buzzwords.

This maybe why AI generated content has so much success among leaders and politicians.

coke123mo ago

Every group want to label some outgroup as naively benefiting from AI. For programmers, apparently it's the pointy haired bosses. For normies, it's the programmers.

Be careful of this kind of thinking, it's very satisfying but doesn't help you understand the world.

devmor3mo ago

> But it's the jagged edges, the unorthodox and surprising prickly bits, that tear open a hole in the inattention of your reader, that actually gets your ideas into their heads.

This brings to mind what I think is a great description of the process LLMs exert on prose: sanding.

It's an algorithmic trend towards the median, thus they are sanding down your words until they're a smooth average of their approximate neighbors.

gdulli3mo ago

Mediocrity as a Service

DuperPower3mo ago

artificial mediocrity

DuperPower3mo ago

johnnienaked3mo ago

piker3mo ago

Bryan Cantrill referred to it as "normcore" on a podcast, and that's the perfect description.

amelius3mo ago

I'm sure this can be corrected by AI companies.

yoyohello133mo ago

The question is… why? What is the actual human benefit (not monetary).

2 more replies

q3k3mo ago

Just let my work have a soul, please.

2 more replies

stephc_int133mo ago

The "AI voice" is everywhere now.

I see it on recent blog posts, on news articles, obituaries, YT channels. Sometimes mixed with voice impersonation of famous physicists like Feynman or Susskind.

I find it genuinely soul-crushing and even depressing, but I may be over sensitive to it as most readers don't seem to notice.

matusp3mo ago

> The "AI voice" is everywhere now.

Maybe I'm going crazy but I can smell it in the OP as well.

logicprog3mo ago

Yeah the article smells extremely strongly of AI to me, but I've been told here before that that's just the register's house style, so I have no idea.

_79px3mo ago

Yeah I started seeing it too, the article is just full of AI clues.

vessenes3mo ago

Yes, I get more and more visceral reactions to it. I'm reminded of JPEG artifacts - unnoticeable in 1993!

1 more reply

doomslayer9993mo ago

Literally the worst thing that happened to the internet after addictive doomscroll feeds and ads everywhere.

And, the worst part is noone will ever make a new internet because of the founder effect. We are basically in the worst timeline.

samdjstephens3mo ago

Maybe. Another potential, more positive, timeline is that semantically ablated content filling everyone’s feeds turns people off, and slowly kills the social feed paradigm.

1 more reply

causal3mo ago

archagon3mo ago

I just close the tab. My reading backlog is too long as it is.

1 more reply

vpribish3mo ago

same. it is showing how many people are not trying to participate - just appear to. I want to read from and write for my peers, but it seems we are just awash with fakers

emp173443mo ago

The internet is a post-truth space now that you can spin up a million different agents to push whatever narrative you choose.

lm284693mo ago

delis-thumbs-7e3mo ago

ses19843mo ago

People want the real thing, not artificially flavored tokens.

I would rather read the prompt than the generative output, even if it’s just disjointed words and sentence fragments.

voy7073mo ago

I hope "just the prompt please" becomes a common phrase

pimlottc3mo ago

Regurgitative AI

0x7cfe3mo ago

Degenerative AI

doomslayer9993mo ago

Terretta3mo ago

> most of the time you only get what the most boring person in the most boring cocktail party would say

don't be mean, it's median AI à la mode

tasty_freeze3mo ago

Bible Scholar and youtube guy Dan McClellan had an amazing "high entropy" phrase that slayed me a few days ago.

https://youtu.be/605MhQdS7NE?si=IKMNuSU1c1uaVCDB&t=730

Yeah, AI could not come up with that phrase.

raincole3mo ago

> they set off the tuning fork in the loins of your own dogmatism

Sounds like word salad. Of course if you write like GPT-2 it would not sound like current models.

tasty_freeze3mo ago

Not at all -- it was a funny and more polite way to say, "Don't just repeat things because they give your dogmatism a boner."

lurquer3mo ago

> "they set off the tuning fork in the loins of your own dogmatism."

Eh... I don't know. To me, that sounds very AI-ish.

Claude is very good -- at times -- coming up with flowery metaphoric language... if you tell it to. That one is so over-the-top that I'd edit it out.

Put something like this in your prompt and have it revise something:

Occasionally you'll get some gems. Claude is much better than ChatGPT at this kinda stuff. The BEST ones are the ever-growing NSFW models populating huggingface.

In short, do the posts on OpenClawForum all sound alike? Of course.

Just like all the webpages circa 2000 looked alike. The uniformity wasn't because of HTML... rather it was because few people were using HTML to its full potential.

IncreasePosts3mo ago

A sloppy mixed metaphor?

card_zero3mo ago

I'm learning to like 'em more, along with every other human idiosyncracy. Besides, it makes a kind of sense, the idea of some resonance occuring in one's gusset. Timber timbre. Flangent thrumming.

2 more replies

tasty_freeze3mo ago

If you like your prose to be anodyne, then maybe you like what AI produces.

rorylaitila3mo ago

gnutrino3mo ago

I've found LLMs to be terrible with ideation. I've been using GPT 5.x to come up with ideas and plot lines for a Dungeon World campaign I've been running.

What i have found LLMs to be helpful with is writing up fun post-session recaps I share with the adventurers.

benji8000OP3mo ago

causal3mo ago

YES this hits the nail on something I've been trying to express for some time now. Semantic ablation: love it, going to use that a lot not now when arguing why someone's ChatGPT-washed email sucks.

1 more reply

rchaud3mo ago

> We are witnessing a civilizational "race to the middle," where the complexity of human thought is sacrificed on the altar of algorithmic smoothness.

This has long been the case in the area of "business English", which has become highly simplified to fulfill several concurrent, yet conflicting requirements:

- Generally understandable to a wide audience due its lingua franca status

- "Media-trained" to not let internal details slip or admit fault to the public

- "Executive Summary"-fied to provide the coveted "30k ft view" to detail-allergic senior leadership

Espressosaurus3mo ago

This matches what I saw when I tried using AI as an editor for writing.

It wanted to replace all the little bits of me that were in there.

crabmusket3mo ago

For those who haven't seen it yet, this Wiki page also has what I think is very good advice about writing:

https://en.wikipedia.org/wiki/Wikipedia:Signs_of_AI_writing

While the page's purpose is to help editors detect AI contributions, you can also detect yourself doing these same things sometimes, and fix them.

ranprieur3mo ago

This isn't new to AI. The same kind of thing happens in movie test screenings, or with autotune. If something is intended for a large audience, there's always an incentive to remove the weird stuff.

somewhereoutth3mo ago

> What began as a jagged, precise Romanesque structure of stone is eroded into a polished, Baroque plastic shell

Not to detract from the overall message, but I think the author doesn't really understand Romanesque and Baroque.

(as an aside, I'd most likely associate Post-Modernism as an architectural style with the output of LLMs - bland, regurgitative, and somewhat incongruous)

morgengold3mo ago

I wonder how much of it could be prompted away.

For example the anthropic Frontend Design skill instructs:

Maybe sth similar would be possible for writing nuances.

1 https://github.com/anthropics/skills/blob/main/skills/fronte...

lich_king3mo ago

> "NEVER use generic AI-generated aesthetics like overused font families (Inter, Roboto, Arial, system fonts), cliched color schemes (particularly purple gradients on white backgrounds), ...

Now, imagine what happens when this prompt becomes popular?

Several days ago, someone researched Moltbook and pointed out how similar all the posts are. Something like 10% of them say "my human", etc.

causal3mo ago

Many have tried, it does not work. Regression to the mean always sets in.

lurquer3mo ago

Are you sure? Here's the OP article (first part... don't want to spam the thread) written in much cooler style...

------

The Lobotomist in the Machine

But nobody — not one bright-eyed engineer in the whole fluorescent-lit congregation — thought to name the other thing. The quiet one. The one that doesn't add. The one that takes away.

I'm naming it now.

Semantic ablation. Say it slow. Let it sit in your mouth like a copper penny fished from a dead man's pocket.

I. What It Is, and Why It Wants to Kill You

etc., etc.

1 more reply

conartist63mo ago

Race to the middle really sums up how I feel about AI.

tayo423mo ago

People felt this was already happening. I remenber reading filter world which described this pre Ai

poszlem3mo ago

I call it the great blur.

dsf2d3mo ago

I call it a mirage. I get why people are taken aback and fascinated by it. But what the model producers are chasing is a mirage. I wonder when they'll finally accept it?

1 more reply

lakhotiaharshit3mo ago

This article on AI writing being boring seems to be written by AI. The em dashes and the sentence structure, all seems to be AI output. Or have human started adopting this style too.

haffi1123mo ago

Oh the irony, I also felt that way when I started reading it. It borders on being hypocrisy given the topic of the article.

meowface3mo ago

Further analyses by others seem to show it is likely all written by an LLM.

cadamsdotcom3mo ago

If the “amount of semantic ablation” in a generated phrase/sentence/paragraph can be measured and compared, then a looped process (an agent) could be built that tries to decrease that..

It might come up with something original - I mean there has to be tons of interesting connections in the training data that no one’s seen before.

But maybe it’d just end up shouting at you.

simonw3mo ago

I'd like to see some concrete examples that illustrate this - as it stands this feels like an opinion piece that doesn't attempt to back up its claims.

(Not necessarily disagreeing with those claims, but I'd like to see a more robust exploration of them.)

barrkel3mo ago

Have you not seen it any time you put any substantial bit of your own writing through an LLM, for advice?

The skin of their prose lacks the luminous translucency, the subsurface scattering, that separates the dead from the living.

simonw3mo ago

The prompt I use for proof-reading has worked great for me so far:

  You are a proof reader for posts
  about to be published.

  1. Identify for spelling mistakes
  and typos
  2. Identify grammar mistakes
  3. Watch out for repeated terms like
  "It was interesting that X, and it
  was interesting that Y"
  4. Spot any logical errors or
  factual mistakes
  5. Highlight weak arguments that
  could be strengthened
  6. Make sure there are no empty or
  placeholder links

matwood3mo ago

> If you're looking for something beyond corporate safespeak

AI has been great for removing this stress. "Tell Joe no f'n way" in a professional tone and I can move on with my day.

2 more replies

Terretta3mo ago

> If you're looking for something beyond corporate safespeak or stylistic pastiche, they drain the blood out of everything.

Strong agree, which is why I disagree with this OP point:

I see enough jargon in everyday business email that in the office zero-shot LLM unspoolings can feel refreshing.

I have "avoid jargon and buzzwords" as one of very tiny tuners in my LLM prefs. I've found LLMs can shed corporate safespeak, or even add a touch of sparkle back to a corporate memo.

furyofantares3mo ago

https://news.ycombinator.com/item?id=46583410#46584336

https://news.ycombinator.com/item?id=46605716#46609480

https://news.ycombinator.com/item?id=46617456#46619136

https://news.ycombinator.com/item?id=46658345#46662218

https://news.ycombinator.com/item?id=46630869#46663276

https://news.ycombinator.com/item?id=46656759#46663322

https://news.ycombinator.com/item?id=46661936#46663362

https://news.ycombinator.com/item?id=46748077#46749699

internet_points3mo ago

gdulli3mo ago

Kaffee: Corporal, would you turn to the page in this book that says where the mess hall is, please?

Cpl. Barnes: Well, Lt. Kaffee, that's not in the book, sir.

Kaffee: You mean to say in all your time at Gitmo, you've never had a meal?

Cpl. Barnes: No, sir. Three squares a day, sir.

Kaffee: I don't understand. How did you know where the mess hall was if it's not in this book?

Cpl. Barnes: Well, I guess I just followed the crowd at chow time, sir.

Kaffee: No more questions.

NitpickLawyer3mo ago

It is an opinion piece. By a dude working as a "Professor of Pharmaceutical Technology and Biomaterials at the University of Ferrara".

simonw3mo ago

(I'm frequently guilty of that too.)

1 more reply

PurpleRamen3mo ago

> Yet we're not seeing any collapse, despite models being trained mainly on synthetic data for the past 2 years.

Maybe because researchers learned from the paper to avoid the collapse? Just awareness alone often helps to sidestep a problem.

1 more reply

tpoacher3mo ago

The part about a change in entropy was interesting.

Is there an easy way to get / compare the entropy of two passages? (e.g. to see if it has indeed dropped after gen ai manipulation).

And could this be used to flag AI-gen text (or at least, boring, soulless sounding text)

dvt3mo ago

tpoacher3mo ago

Yes, still though, it might be interesting to be able to measure this easily.

E.g. when asking an AI to rephrase or summarise, if the entropy drops you might take that as a sign that it has eroded the style beyond what you might be willing to tolerate.

I wonder if the author had a particular method / tool in mind, or if they were just speaking abstractly.

notepad0x903mo ago

resiros3mo ago

mjamesaustin3mo ago

The argument is that the best writing is the unexpected, while an LLM's function is to deliver the expected next token.

icegreentea23mo ago

Perhaps LLMs that were fully individually trained could sufficiently replicate a person's quirks (I dunno), but that's hardly a scalable process.

altmanaltman3mo ago

Yeah, that makes banana.

1 more reply

zanehelton3mo ago

[1]: https://www.techradar.com/ai-platforms-assistants/sam-altman...

add-sub-mul-div3mo ago

altmanaltman3mo ago

"Update the dependencies in this repo"

"Of course, I will. It will be an honor, and may I say, a beautiful privilege for me to do so. Oh how I wonder if..." vrs "Okay, I'll be updating dependencies..."

quamserena3mo ago

I wish it would just say "k, updated xyz to 1.2.3 in Cargo.toml" instead of the entire pages it likes to output. I don't want to read all of that!

1 more reply

resiros3mo ago

2 more replies

josefritzishere3mo ago

sieste3mo ago

All these forced metaphors and clumsy linguistic flourishes made me cringe. Just add some typos and grammar mistakes like the rest of us to prove that your human.

doomslayer9993mo ago

ux2664783mo ago

> The AI identifies unconventional metaphors or visceral imagery as "noise" because they deviate from the training set's mean.

mizzao3mo ago

Isn't image generation basically doing "anti semantic ablation", starting with a blank canvas and iteratively refining it into a meaningful collection of pixels?

Is it possible to do the same thing with word generation, such that it sharpens into an opinionated version (even if it would do something different each time?)

52-6F-623mo ago

nalllar3mo ago

Fitik3mo ago

andai3mo ago

Could we invert a sign somewhere and get the opposite effect?

(Obviously a different question from "is an AI lab willing to release that publicly” ;)

bananaflag3mo ago

It's a hard problem and so far not a profitable one (I hope the solution will emerge as a byproduct of another innovation)

https://nostalgebraist.tumblr.com/post/778041178124926976/hy...

https://nostalgebraist.tumblr.com/post/792464928029163520/th...

andai3mo ago

I think you made the right diagnosis with "cringe" :) They forgot to turn down the cringe slider!

Have you played with the pre-RLHF models? I think Davinci is still online, though probably not for much longer.

1 more reply

lyu072823mo ago

> The model performs a statistical substitution, replacing a 1-of-10,000 token with a 1-of-100 synonym

Do we see this in programming too? I don't think so? Unique, rarely used API methods aren't substituted the same way when refactoring. Perhaps that could give us a clue on how to fix that?

prerok3mo ago

I think that's different because refactoring usually involves calling the same functions/methods albeit in a bit more readable way.

When not given a clear guideline to "just" refactor, I have had problems with LLMs hallucinating functions that don't exist.

aleph_minus_one3mo ago

Couldn't you simply increase the temperature of the model to somewhat mitigate this effect?

lbrito3mo ago

swyx3mo ago

mannykannot3mo ago

When applied to insightful writing, that is much more likely to dull the point rather than preserve or sharpen it.

mwcampbell3mo ago

This idea is a major theme in this story by Robert Kingett: https://sightlessscribbles.com/the-colonization-of-confidenc...

esafak3mo ago

I think they can fix all that but they can't fix the fact that the computer has no intention to communicate. They could imbue it with agency to fix that too, but I much prefer it the way things are.

reilly30003mo ago

Those transformations happen to mirror what happens to human intelligence when you take antipsychotics. Please know the risks before taking them. They are innumerable and generally irreversible.

hknceykbx3mo ago

Yea. It is pretty sad. Even if we think about the code we’re writing. It’s so… average. No clever solutions, no funny mistakes, no character. Just average.

0x38B3mo ago

In Essays in the Art of Writing (1), Robert Louis Stevenson says:

And in the first essay, speaking on matters of style:

AI doesn't "write" in the sense used above. It has no ear, no wit, no soul. "A reflection of a mind is not a mind", as Phillip Ball writes in "AI Is the Black Mirror" (2).

1: https://www.gutenberg.org/cache/epub/492/pg492-images.html#p...

2: https://nautil.us/ai-is-the-black-mirror-1169121/

spwa43mo ago

As someone longtime involved in software development, can we call this "best practices" instead of some like "semantic ablation" that nobody understands?

1 more reply

AreShoesFeet0003mo ago

How much money would it take for me to take an open weight model, treat it nice, and go have some fun? Maybe some thousands, right?

ZoomZoomZoom3mo ago

What a weird use of "Romanesque" and "Baroque". Doesn't compute for me at all.

kaycey20223mo ago

A case of "That's just, like, your opinion, man".

The entire article sounds like AI generated opinion.

book_mike3mo ago

Sematic ablation... that's some technobable.

zahlman3mo ago

At any rate, it seems to me like a reasonable label for what's described:

> ...

> When an author uses AI for "polishing" a draft, they are not seeing improvement; they are witnessing semantic ablation.

lurquer3mo ago

Etc.

matternous3mo ago

Why don't you post it so we can see how much better the AI made it?

lurquer3mo ago

Because HN isn't a literary forum.

Maybe it sucks. Maybe it doesn't.

vessenes3mo ago

Meh. Semantic Ablation - but toward a directed goal. If I say "How would Hemingway have said this, provided he had the same mindset he did post-war while writing for Collier's?"

Then the model will look for clusters that don't fit what the model consider's to be Hemingway/Colliers/Post-War and suggest in that fashion.

"edit this" -> blah

"imagine Tom Wolfe took a bunch of cocaine and was getting paid by the word to publish this after his first night with Aline Bernstein" -> probably less blah

aabhay3mo ago

These kinds of prompts don’t really improve the writing IME. It still gets riddled with the same tropes and phrases, or it veers off into textual vomit.

vessenes3mo ago

quikoa3mo ago

Even if it would work good luck writing with a new style.

marquisdepolis3mo ago

This article is entirely AI generated, making this either a rather brilliant piece of meta commentary or just pure grift. Alas, my hopes are low.

meowface3mo ago

I want to believe it's a bit/Sokal hoax type thing...

marquisdepolis3mo ago

Me too, though it feels just a grift.

anematode3mo ago

kirykl3mo ago

swyx3mo ago

the words TFA is looking for is mode collapse https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-... and the author could herself learn to write more clearly.

nalllar3mo ago

j / k navigate · click thread line to collapse