Suddenly, while talking with ChatGPT, it appears almost in every other conversation...
Is it because I talk (sometimes) about PHP/Laravel/Eloquent, and it somehow fixed the "Eager loading of relationships," or has the language changed, and I did not notice?
Did you notice other words like this?
I found this article about the word "delve":
https://www.theguardian.com/technology/2024/apr/16/techscape...
A little off-topic: I recently learned that children in Portugal have started to speak the Brazilian variant of Portuguese, as videos from there are flooding the Internet. It is interesting how technology affects our lives in surprising areas.
So the chameleon is a bad chameleon because it looks too much like a leaf, too much like the thing it is attempting to imitate?
A few years ago I was tasked with editing the performance reviews for my unit (military). Every supervisor had a finite number of characters in which to describe each soldier's performance. I went through and removed all the extra/useless words. Oh the anger! While supervisors agreed I had improved their writing, they now felt obligated to fill up all the blank space I had created.
[and so on].
I hooked ChatGPT up to a speech recognizer, far-field mics, etc., trying to build my own Alexa, and I had to add "Please be terse" to the prompt. And that wasn't enough, so I said "Please limit yourself to as few words as possible to convey the answer. Be terse. Try to keep your speech very short." before I finally started getting reasonable replies.
> We study vocabulary changes in 14 million PubMed abstracts from 2010–2024, and show how the appearance of LLMs led to an abrupt increase in the frequency of certain style words.
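For anyone curious what a frequency comparison like that looks like mechanically, here is a minimal sketch. It is not the paper's method or data: the function name `word_share_by_year` and the four "abstracts" are made up for illustration, and it only measures the share of documents per year that contain an exact word.

```python
import re
from collections import Counter


def word_share_by_year(docs, word):
    """Return {year: fraction of documents from that year containing `word`}.

    `docs` is an iterable of (year, text) pairs. Matching is exact and
    case-insensitive, so "delves" does NOT count as "delve".
    """
    totals = Counter()  # number of documents per year
    hits = Counter()    # number of documents per year containing the word
    for year, text in docs:
        totals[year] += 1
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        if word in tokens:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Hypothetical toy corpus, invented for this example only.
toy_docs = [
    (2022, "We examine the effect of treatment on outcome."),
    (2022, "We report results from a cohort study."),
    (2024, "We delve into the underlying mechanism."),
    (2024, "We report results from a randomized trial."),
]

shares = word_share_by_year(toy_docs, "delve")
print(shares)
```

A real analysis would need lemmatization, far more text, and a baseline for normal year-to-year drift before claiming an "abrupt increase".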
There isn't a single English: learners are generally told only about the choice between UK and US English, but most English is spoken outside the UK and USA, in other places and other dialects. Since there are multiple different Englishes, any English speaker will probably find something to be surprised by, and there is an economic incentive to get data from people other than the relatively expensive native speakers of UK or US English.
It was submitted as https://news.ycombinator.com/item?id=40623629
Again, there is effectively zero real data showing this. Further, RLHF isn't likely to reinforce such word selection regardless.
A more logical, likely scenario is that training data is biased heavily towards higher grade level material, so word selection veers towards writings that you find in those realms.
Have a look at the BBC translation to get a taste, and tell me it's not a hoax: https://www.bbc.com/pidgin
On another note, while the paper itself is pretty cool, in discussions on it I thought people were kind of looking down on using LLMs to help you write. There's a philistine moat in many fields around writing style. While writing well is in my experience correlated with paper quality, it is not a prerequisite for it. And introducing tools that help people write more readable papers is probably a net benefit overall.
What made LLMs suddenly interesting was that the responses were much more like answers and much less like additional questions in the same vein as the prompt.
If it can be overcome in existing models, it’s probably going to involve different aspects including vocabulary, style, and organization.
"Eager" isn't an especially uncommon word (e.g. "eager beavers" is a somewhat common saying), even though it's not used in most convos.
I feel like "delve" is a YouTube phenomenon (as in "let's delve into this topic") as a weird proxy for "deep dive". Maybe a side effect of D&D's resurgence over the last decade, where it's often used to describe small adventures/dungeons...?
(Not kidding, from today’s NYT crossword column: https://www.nytimes.com/2024/08/11/crosswords/daily-puzzle-2...)
I'd say it's very common, at least in my part of the US. It's one of the words I hear on a daily basis, anyway.
"Delve" used to be a very commonly used term before "deep dive" largely replaced it. I'm sure there are a whole lot of writings online that use "delve" because of the time period they were produced in.
As a graybeard, I'm personally still much more likely to say "delve" than "deep dive".
No judgment! I'm delighted, however, that language is so supple ("leverages domain-local synergies")
- it's conversationally-aligned with dumping large amounts of information
- it's an easy emotional state to hold unilaterally (without factoring in the other participant)
- it's unlikely to offend or cause a PR nightmare
- it's flattering!
(we use these in ainews summaries so that we don't delve too much: https://buttondown.email/ainews)
But generally, 'eager' isn't particularly rare in English.
Not uncommon pre-gpt either.
Hence we suddenly started using two words “reaching out” rather than one “contact”.
Whether or not the text came out of an LLM, the real question for the user is: does this technical term actually apply to this situation?
If it does, then it's an appropriate word choice carrying additional information.
As a non-native speaker of English, I speak and write some weird mix of British and US English, and I always keep forgetting how strong the words "bugger" and "cunt" are in each context. Here's globalization for you.
For texts, it uses "furthermore" more than any other word, followed by "lastly", imho.