But as to the hype, we are in a brief pause before the election where no company wants to release anything that would hit the news cycle in a bad way and cause knee-jerk legislation. Are there new architectures and capabilities waiting? Likely some. Sora showed state of the art video generation, OpenAI has demoed an impressive voice mode, and Anthropic has teased that Opus 3.5 will be even more capable. OpenAI also clearly has some gas in the tank as they have focused on releasing small models such as GPT-4o and 4o mini. And many have been musing about agents and methods to improve system 2 like reasoning.
So while there’s a soft moratorium on showing scary new capability, there is still evidence of progress being made behind the scenes. But what will a state-of-the-art model look like when all of these techniques have been scaled up on brand-new exascale data centers?
It might not be AGI, but I think it will at least be enough for the next hype investment bubble.
It's done: you can’t make another LLM, all knowledge from here on out is corrupted by them, and you can never deliver an epistemic “update.” GPT will become a relic of the 21st century, like a magazine from the 1950s.
The knee-jerk legislation has mostly been caused by Altman's statements though. So I wouldn't call it knee-jerk, but an attempt by OpenAI to get a legally granted monopoly.
I think we will see a very nice boost in capability within 6 months of the election. I don’t personally believe all the apocalyptic AGI predictions, but I do think that AI will continue to feed a nice growth curve in IT investment and productivity growth, similar to the last few decades of IT investment.
Yes. There is also the Hype of the "End of the Hype Cycle". There is Hype that the Hype is ending.
When really, there is something amazing being released weekly.
People are so desensitized that, just because we don't have androids walking the streets or Blade Runner-like space colonies staffed with robots, they conclude AI is over.
What people, including me, are massively fed up with is all the companies (I mean ALL) jumping on the AI bandwagon in a beautiful show of how FOMO works and how even CEOs/shareholders are not immune to basic instincts. A literal hammer looking desperately for nails. Very few companies have amazing or potentially amazing products; the rest, not so much.
I absolutely don't want every effin' thing infused with some AI, since it will be used to 1) monitor my usage or me directly for advertising / credit & insurance scoring purposes, absolutely 0 doubt there; and 2) it may stop working once the wifi is down, the product is deprecated, or the company changes its policy (Sonos, anyone?). Internet of Things hate v2.0.
I hate this primitive AI fashion wave: negative added value in most cases, zero in the rest, yet users have to foot the bill. Seeing some minor industry crash due to unfulfilled expectations is just logical in such a case.
I disagree. Palantir is trading at 200X earnings.
So I've seen how the field has progressed and have also been able to look at it from a perspective most AI/engineering people don't have: what does this artificial intelligence look like compared to biological intelligence? And I must say I am absolutely astonished people don't see this as opening the flood-gates to staggeringly powerful artificial intelligence. We've run the 4-minute mile. There are hundreds of billions of dollars going into figuring out how to get to the next level, and it's clear we are close. Forget what the current models are doing; it is what the next big leap (most likely with some new architecture change) will bring.
In focusing on intelligence we forget that it's most likely a much easier challenge than decentralized cheap autonomy, which is what took the planet 4 billion years to figure out. Once that was done, intelligence as we recognize it took an eye-blink. Just as with powered flight, we don't need biological intelligence to transform the world. Artificial intelligence that guzzles electricity, is brittle, has blind spots, but is still capable of 1000 times more than the best among us is going to be here within the next decade. It's not here yet, no doubt, but I have yet to see any reasoned argument for why it is far more difficult and will take far longer. We are in for radical non-linear change.
The breakthroughs where deep AI has excelled -- object recognition in images, voice recognition and generation, and text-based info embedding and retrieval -- require none of the multilevel abstraction that characterizes higher cognition (Kahneman's System 2 thinking). Only when we see steady progress on such frontiers can a plausible case be made that the essentials of AGI are indeed within our grasp. Until then, plateauing at a higher level of pattern matching than we had expected -- which is what we have seen many times before from narrow AI -- is not sufficient evidence that the other requisite skills needed for AGI are surely just around the corner.
Coding assistants today are useful, image generation is useful, speech recognition/generation is useful.
All of these can support businesses, even in their current (early) state. Those businesses see value in funding even 1% improvements in engineering/science.
I think this is different than before: even in the 80s there were fewer well-defined products, and most everything was a prototype that needed just a bit more research to be commercially viable.
Whereas in the past, hopes for the technology waned and funding for research dropped off a cliff, today's stuff is useful now, and so companies will continue to spend some amount on the research side.
You can see this in action with multiplication. Much like humans asked to guess the answer, they'll get it wrong unless they know the answer from rote-learned multiplication tables; this is System 1 thinking. In many cases, when asked, they can reason further and solve it by breaking it down and working step by step, much like a human; this is System 2 thinking.
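A quick sketch of the two modes against a chat API. The openai client and call shown here are real; the model name and prompts are just placeholders, and results will vary:

```python
# Sketch: System-1-style one-shot guessing vs System-2-style decomposition.
# Assumes the openai Python package (>=1.0) and an API key in the environment.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat model works
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# System 1: a one-shot guess, often wrong for large operands.
print(ask("What is 734 * 886? Reply with just the number."))

# System 2: forcing step-by-step decomposition usually does much better.
print(ask("Compute 734 * 886 by summing partial products, step by step."))
```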
In my opinion, it seems nearly everything is there for it to take the next leap in intelligence; it's just a matter of putting it all together.
The things that LLMs are bad at are largely solved problems using much simpler technology. There is no reason that LLMs have to be the only component in an intelligent agent. Biological brains have specialized structures for specialized tasks like arithmetic. The solution is probably integration of LLMs as part of a composite system that includes database storage, a code execution environment, and multiple agents forming a goal-directed posit-evaluate loop.
I’ve had pretty remarkable success with this architecture running on 12b models and I’m a nobody with no resources.
LLMs by themselves just come up with the first thing that crosses their “mind”. It shouldn’t be surprising that the very first unfiltered guess at a solution might be suboptimal.
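A minimal sketch of what such a posit-evaluate loop could look like. The llm() stub, the trivial pass/fail evaluator, and the round limit are all assumptions for illustration, not a real implementation (a real one would sandbox the execution):

```python
import subprocess, sys, tempfile

def llm(prompt: str) -> str:
    raise NotImplementedError("wire up a chat-completion API of your choice")

def run_python(code: str) -> tuple[bool, str]:
    """Code execution environment: run the posited solution, capture output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
    proc = subprocess.run([sys.executable, f.name],
                          capture_output=True, text=True, timeout=30)
    return proc.returncode == 0, proc.stdout + proc.stderr

def solve(goal: str, max_rounds: int = 5) -> str:
    memory = []  # stand-in for the database-storage component
    for _ in range(max_rounds):
        code = llm(f"Goal: {goal}\nPrior attempts: {memory}\nWrite Python.")
        ok, output = run_python(code)   # evaluate the posited solution
        memory.append((code, ok, output))
        if ok:
            return output
    return "gave up"
```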
There is a vast amount of knowledge embedded in our cultural matrix, and a lot of that is captured in the Common Crawl and other datasets. LLMs are like a search engine for that data, based on meaning rather than keywords.
Some of the most advanced AIs are tool users and can write and, crucially, also execute Python, embedding the output in their responses.
> or even solve really simple logic puzzles for little kids.
As given in a recent discussion: https://chatgpt.com/share/ee013797-a55c-4685-8f2b-87f1b455b4...
(Custom instructions, in case you're surprised by the opening of the response).
Finding bugs in some models doesn’t mean you have a point about intelligence. If a similar argument could be applied to dismiss human intelligence, you don’t have a point. And here it goes: the most advanced human intelligence can’t reliably multiply large numbers or recall digits of Pi. Obviously humans are dumber than pocket calculators.
I have yet to see any reasoned argument for why it is easy to build real AI and why it will come fast.
As you said, AI has been there for decades and stagnated for pretty much the whole time. We've just had a big leap, but nothing says (except BS hype) that we're not in for a long plateau again.
We have "real ai" already.
As for future progress, have you tried just simple extrapolation of the progress so far? Human-level intelligence is very near. (Though of course artificial intelligence will never exactly match human intelligence: it will be ahead/behind in certain aspects...)
The thing that's different this time is the hardware capacity in TFLOPs and the like passing human brain equivalence.
There's a massive difference between much-worse-than-human AI (a bit meh) and better-than-human AI (changes everything).
>any reasoned argument for why it is easy to build real AI and that it will come fast
It probably won't be easy but the huge value of better than human AI will ensure loads of the best and brightest working on it.
For language models specifically, they are trained on data and have historically been improved by increasing the size of the model (by number of parameters) and by the amount and/or quality of training data.
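For a rough feel of that size/data relationship, here is the published Chinchilla fit (Hoffmann et al., 2022) in a few lines; the constants are their reported estimates, and the outputs should be read as rough guides only:

```python
def chinchilla_loss(N: float, D: float) -> float:
    """Predicted LM loss for N parameters trained on D tokens."""
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / N**alpha + B / D**beta

print(chinchilla_loss(70e9, 1.4e12))  # roughly Chinchilla-scale
print(chinchilla_loss(70e9, 2.8e12))  # same size, twice the data: lower loss
```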
We are basically out of new, non-synthetic text to train models on and it’s extremely hard work to come up with novel architecture that performs well against transformers.
Those are some simple reasons why it will be far more difficult to improve general language models.
There are also papers showing that training models on synthetic data causes “model collapse” and greatly reduces output quality by magnifying errors already present in the model, so it’s not a problem we can easily sidestep.
It’s an easy mistake to see something like ChatGPT not exist, then suddenly exist, and assume a major breakthrough happened. But behind the scenes there was like 50 years of R&D that led to it; it’s not like there was suddenly a breakthrough and now the gates are open.
A general intelligence for CS is like the elixir of life for medicine.
This is not even remotely true.
There is an astronomical amount of data siloed by publishers, professional journals etc. that is yet to be tapped.
OpenAI is making inroads by making deals with these content owners for access to all that juicy data.
Are we really now?
The smart people I've spoken to on the subject seem to agree that the current technology based on LLMs is at the end of the road and that there are no breakthroughs in sight.
So what is your take on the next level?
People focused on the products are missing out on the dawn of an epoch. It's a failure of perspective and creativity that's thankfully not universal.
Powered flight offers a cautionary tale for AI. The first confirmed powered flight was in 1903. For the next 60 years, someone broke the airspeed record almost every year. The current record was set in 1976. Nobody has broken that record for 48 years. There are concerns that the state of AI will show a similar pattern, with rapid improvements followed by a plateau.
This means you are sure we are close to automated driving, engineering and hospitality?
We already have "automated driving" in some sense. Some cities have fully autonomous taxi services that have operated for a year or more, iirc.
I can put your brain in a vat and stimulate your sensory neurons with a statistical distribution with no actual meaning, and nothing about how your brain works would change either.
The LLM and your brain would both attempt to interpret meaning with referents from training, and both would be confused by the information-free stimuli, because during "training" in both cases, the stimuli received from the environment are structured and meaningful.
So what's your point?
By the way, pretty sure a neuroscientist with 20 years of ML experience has a deeper understanding of what "meaning" is than you do. Not to mention, your response reveals a significant ignorance of unresolved philosophical problems (hard problem of consciousness, what even is meaning) which you then use to incorrectly assume a foregone conclusion that whatever consciousness/meaning/reasoning is, LLMs must not have it.
I'm partial to doubting that LLMs as they are now have the magic sauce, but it's more that we don't actually know enough to say otherwise, so why state that we do know?
We can't even say we know our own brains.
Can you explain what this means? Do you have a degree in neuroscience?
>We are in for radical non-linear change.
We aren't running miles much quicker than 4 minutes though. The current record is 3:43, set by Hicham El Guerrouj in 1999.
Humankind tried to break the 4 minute mile for hundreds of years - since measuring distance and time became accurate enough to be sure of both in the mid-18th century, at least - and failed.
In May 1954, Roger Bannister managed it. By late June it was done again by a different runner. Within a couple of decades the record was under 3:50, and today there are some runners who have achieved it more than 100 times and nearly 1800 runners who have done it at all.
Impossible for hundreds of years, and then somebody did it, and people stopped thinking it was impossible and started doing it themselves. That’s the metaphor: sometimes the barriers we imagine are really mental, not real.
I’m not sure that applies here either, but the point is not that progress is continuously exponential, but that once a barrier is conquered, we take on a perspective as if the barrier were never real in the first place. Powered flight went through this. Computing hardware too. It’s not an entirely foolish notion.
I'd like to believe it more than you do. Unfortunately, in spite of these millions of dollars, the progress on LLMs has stalled.
PS. I'm buying your book right now.
What signs do you see that make you believe that the next level (biological intelligence) is on the horizon?
If I had to bet, I would start with:
- Error-correcting specialized architectures for increasing signal-to-noise (as far as I can tell these are what everyone is racing to build this year, and should be doable with just conventional programming systems wrapping LLMs)
- Improved energy efficiency. (Yes, human brains are currently much more efficient! But there are also simple architecture improvements, both software and hardware, that look to save 100x. Specialized ternary ASIC chips using 1999's tech should be here quite soon, a lot more efficient in price and energy.)
- No backpropagation. (Yes, the brain does seem to do it all with forward propagation only. This is possible and promising in neural networks too - see the Forward-Forward algorithm - though such networks haven't been trained at the same scales as backprop-heavy transformers (and likely perform worse in terms of noise/accuracy). If I'm not mistaken, the brain does have forward-backward loops, but the signals travel through separate neurons for each direction rather than reusing one - if so, that's close to backprop by itself, though it probably imposes a tradeoff: the same signal can't be perfectly reproduced backwards, yet it can perhaps be distilled down to just the most relevant information by the separate specialized neuron. I'm obviously mostly ignorant of the neuroscience here but halfway-knowledgeable on the ML theory, haha.)
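On that last point, a toy numpy sketch of the Forward-Forward idea (one dense layer, layer-local updates only; heavily simplified from Hinton's paper, so treat it as illustration rather than a faithful implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, size=(784, 500))  # a single dense layer

def forward(x):
    return np.maximum(x @ W, 0.0)  # ReLU activations

def goodness(h):
    return (h ** 2).sum(axis=1)  # sum of squared activations

def ff_step(x_pos, x_neg, lr=0.03, theta=2.0):
    """Push goodness above theta for positive data and below it for negative
    data, using only this layer's local gradient - no backprop through any
    other layer is needed."""
    global W
    for x, sign in ((x_pos, +1.0), (x_neg, -1.0)):
        h = forward(x)
        p = 1.0 / (1.0 + np.exp(-sign * (goodness(h) - theta)))
        grad_h = (sign * (1.0 - p))[:, None] * 2.0 * h  # d log p / d h
        W += lr * (x.T @ grad_h) / len(x)               # local gradient ascent
```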
But yes, I completely agree - the flood gates are already open. This is a few architecture quibbles away from an absolute deluge of artificial intelligence that will dwarf (drown?) anything we've known. Good point on decentralized cheap autonomy - the real accomplishment of life. Intelligence, as it appears, is just a fairly generous phenomenon where any autonomous process continually improves its signal-to-noise ratio... many ways to accomplish that one! Looking forward to seeing LLMs powered by ant colonies and slime molds, though I suspect by then there will be far more interesting and terrifying realities unlocked.
- we had some big breakthroughs recently
- some AI “godfathers” are “really worried”
And by the way I can copy your post character by character, without hallucinating. So I am definitely better than this crop of "AI" in at least one dimension.
This looks like cognitive dissonance, and that is addressed by revisiting your assumptions.
No flood-gates have been opened. ChatGPT definitely found uses in a few areas but the number is very far from what many people claimed. A few things are really good and people are using them successfully.
...But that's it. Absolutely nothing even resembling the beginnings of AGI is on the horizon and your assumption that the rate of progress will remain the same -- or even accelerate -- is a very classic mistake of the people who are enthusiasts in their fields.
> There are hundreds of billions of dollars figuring out how to get to the next level, and it's clear we are close.
This is not clear at all. If you know something that nobody else does, please let us know as well.
Perhaps it's confirmation bias?
This wasn't the case with GPT-4/o. This capability is very new.
When I spoke to a colleague at Microsoft about these changes, they were floored. Microsoft has made themselves synonymous with AI, yet their company is barely even leveraging it. The big cos have put in the biggest investments, but also will be the slowest to change their processes and workflows to realize the shift.
Feels like one of those "future is here, not evenly distributed yet" moments. When a tool like Sonnet is released, it's not like big tech cos are going to transform over night. There's a massive capability overhang that will take some time to work itself through these (now) slow-moving companies.
I assume it was the same with the internet/dot-com crash.
I decided to fire up GPT-4o again today to see if maybe things have gotten better over the past few months.
I asked GPT to write code to render a triangle using Vulkan (a 3D graphics API). There are about 1000 tutorials on this that are almost certainly in GPT-4's training data. I gave GPT two small twists so it's not a simple case of copy/paste: I asked it 1) to apply a texture to the triangle and 2) to keep all the code in a single function. (Most tutorials break the code up into about a dozen functions, but each of these functions is called only once, so it should be trivial to inline them.)
Within the first ten lines, the code is already completely nonfunctional:
GPT-4o declares a pointer (VkPhysicalDevice) that is uninitialized. It queries the number of graphics devices on the host machine. A human being would allocate a buffer with that number of elements and store the reference in the pointer. GPT-4o just ignores the result. Completely ignores it. So the function call was just for fun, I guess? It then tries to copy an entire array of VkPhysicalDevice_T objects into this uninitialized pointer. So that's a guaranteed memory access violation right off the bat.
Some basic things are fine, but once you get into specialised things, everything goes terribly wrong and weird. I can't even put it into words. I see people saying they are 10x more productive (I'd like to see actual numbers and proof of this), but I just don't see how. Maybe we're working on very custom stuff, or very specific things, but all of these tools give very deep-sounding, confident answers that are just plain wrong and shallow. Just yesterday I used GPT-4o for some basic help with Puppet, and the examples it printed, even though the task was quite basic, were just wrong - in the sense that I had to debug for 2 hours just to figure out how ridiculous the error was.
I fear that people will end up releasing unsafe, insecure, and simply wrong code every day - code that they never debug and don't even understand, that maybe works for a basic set of inputs, but once the real world hits it, it will fail like those self-driving cars driving full speed into a trailer that has the same color as the road or sky.
"They clearly aren't using the right model!"
"It's obvious they don't know how to prompt, or they would see the value."
"Maybe it can't do that today, but GPT-5 is just around the corner."
I feel more and more that people have just decided that this is a technology that will do everything you can imagine, and no evidence to the contrary will change their priors.
By the way - I think AI or ML, whatever, has some valid uses right now, but mostly in the image-processing domain - recognizing shapes in some bounded domain, OK, yeah. Generative image is NOT bad, but there's always this "AI glow" to each image; something is always off. It's a neat set of tools, but it's a race to the bottom, and mostly users want to generate explicit content, let's be real - and they will become increasingly more creative and obtuse to get around the guards. Nothing is stopping you from entering the * industry and making tons of money; that industry is always doing well.
A friend recently suggested using AI to generate generic icons for my game. That's a really good use case. But does that radically change the current economy?
(BTW, generic stuff only until I could hire someone, because I prefer that experience way more. You can get more interesting results; 4 eyes are better than 2.)
For example, I can ask the LLM things like "What are the most common mistakes when using the Vulkan API to render a triangle with a texture?" and I'll very rapidly learn something about working with an API that I don't have deep understanding of, and I might not find a specific tutorial article about.
As another example, if I'm an experienced OpenGL programmer, I can ask directly "what's the Vulkan equivalent of this OpenGL API call?" and get quite good results back, most of the time.
So I'm asking questions where an 80% answer is still very valuable, and it's much faster than searching for documentation and doing a lot of comparison and headscratching, and it works well enough even when there's no specific article I could find in a variety of searches.
Any improvement the technology makes from here just makes things easier still!
For example, just now my NAS stopped working because the boot device went offline. So I got to thinking about writing a simple syslog server. I've never looked at the syslog protocol before, and I've never done any low-level TCP/UDP work in C# yet.
So I asked ChatGPT to generate some code[1], and while the result is not perfect it's certainly better than nothing, and would save me time to get going.
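For flavor, the core of a toy syslog listener is genuinely tiny - here in Python rather than C#, assuming plain RFC 3164-style syslog over UDP on port 514 (the traditional default), with no parsing of severity/facility:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 514))  # syslog's traditional UDP port (needs privileges)
while True:
    data, (host, _port) = sock.recvfrom(2048)
    print(host, data.decode(errors="replace"))
```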
As another example, a friend who's not very technical wanted to make an Arduino circuit to perform some automated experiment. He's dabbled with programming and can modify code, but struggles to get going. Again, just for kicks, I asked ChatGPT and it provided a very nice starting point[2].
For exploratory stuff like this, it seems to provide a nice alternative to searching and piecing together the bits. Revolutionary is a quite loaded word, but it's certainly not just a slight improvement on what we had before LLMs and instead feels like a quantum leap.
[1]: https://chatgpt.com/share/f4343939-74f1-404d-bfac-b903525f61... (modified, see reply)
[2]: https://chatgpt.com/share/fc764e73-f01f-4a7c-ab58-f43da3e077...
It seems to work best if I start with something very simple, and then layer on instructions ("now make it do X").
Where I have found it saves me time is in having to look up syntax or "gotchas" which I would otherwise search StackOverflow for. But as far as "writing code" -- it still feels a long way from that.
For example:
* I need a simple bash script for file manipulation or some simple task like setting up a project (example: download a secret from AWS SSM, check if an executable exists, and if it doesn't, print instructions on how to install it on the most popular systems, etc.)
* I need a simple HTTP API, nothing fancy, maybe some simple database usage, maybe running some commands, simple error handling
* I need a YAML file for Kubernetes. I say what I need and it usually gets most of it right
* I want an Ansible task for some simple thing. Ansible is quite verbose, so it's often saving me time
* I have a Kubernetes YAML file, but I want to manage it in terraform - I'll then ask to convert YAML to a terraform entry (and in general converting between formats is nice, cause even if you have only a piece of what you want to convert, LLMs will most of the time get it right)
* Surprisingly, it often gets openssl and ffmpeg commands right - something I always have to google anyway, especially openssl certificates generation or manipulation
* Giving it a function I wrote and having it write test samples after providing a list of what it should test (and asking if it can come up with more, though sadly it rarely generates anything useful on top of what I suggest)
A friend, whose SQL knowledge is minimal, used an LLM to query data from a database over a couple of tables. Yes, after a lot of trial and error he (most probably) got the correct data; however, the only one able to read the query is the LLM itself. It's full of coalesces and subselects that repeat the same joins again and again.
LLMs will do a lot for you, but I really hate this "this will [already did] solve everything". No, it did not, and no, because its quality is that of a junior dev, at best.
And I think it'd be extremely easy to convince oneself of this. Look at where 'AI' was 5 years ago, look at where it is today and then try to imagine where it will be in another 5 years. Of course you have to completely blind yourself to the fact that the acceleration has clearly sharply stalled out, but humans are really good at cognitive dissonance, especially when your perception of your future depends on it.
And there's also the point that even though I'm extremely critical of LLMs in general, they have absolutely 'transformed' my workflow in that natural language search of documentation is really useful. Being able to describe a desired API, but in an overly broad way that a search engine can't really pick up on, but that an LLM [often] can, is just quite handy. On the other hand, this is more a condemnation of search engine tech being frozen 20 years in the past than it is about an imminent LLM revolution.
Especially if it's a question that's hard to Google, like "I remember there is more than one way to split an array in this language, list them". This saves me minutes every day.
But it's especially helpful if you are working on projects outside your own domain where you are a newbie.
Cursor is a purpose-built IDE for software development. The Cursor team has put a lot of research and sweat into providing the used LLMs (also from OpenAI/Anthropic) with:
- the right parts of your code
- relevant code/dependency documentation
- and, importantly, the right prompts.
to successfully complete coding tasks. It's an apples-and-oranges situation.
I work on 9 different projects now and I would say that around 80% of the functional code for these projects comes from Sonnet (like GP). These are not (all) trivial either; there is a very niche (for banking) key/value store written in Go, for instance, which has a lot of edge cases, etc. All the plumbing (x, err = etc., aka stuff people find annoying) comes from Sonnet and works one-shot. A lot of business logic comes from Sonnet too; it works but usually needs a little tweaking to make it correct.
Tests are all done by Sonnet. I think 80% is low balling it on Go code really.
We have a lot of complex code generator stuff and DSLs in TS which also works well often. Sometimes it gets some edge cases wrong, but either re-prompting with more details or fixing it ourselves, will do it. At a fraction of the time/money of what a fully human team would deliver.
I wrote a 3d editor for fun with Sonnet in a day.
I have terrible results with GPT/Copilot (Copilot is good for editing rather than complete files/functions; ChatGPT is not good for much compared with Sonnet); it doesn't get close at all; it simply keeps giving me the same code over and over again when I say it's wrong; it hardcodes things I specifically asked it to make flexible, etc. Not sure why the difference is so massive all of a sudden.
Note: I use the Sonnet API, not the web interface, but the same goes for GPT, so...
However once you step outside JS or Python, the models are essentially useless. Comprehension of pointer semantics? You wish. Anything with Lisp outside its training corpus of homework assignments? LOL. Editing burden quickly exceeds any possible speed-up.
But, I agree with your sentiment that asking it to do stuff like that often doesn’t work. I’ve found that what it _can_ do is stuff like “here’s a Model object, write a query to fetch it with the schema I told you about ages ago”. It might not give perfect results, but I know how to write that query and it’s faster to edit Claude’s output than it is to write it from scratch.
Fwiw, I've had some helpful successful prompts here and there, and in some very narrow scopes I'll get something usable, like parsing JSON or scaffolding some test cases, which is real saved time, but I stopped thinking about these tools long ago.
To get real value out of something like your example, I'd be using it as a back and forth to help me understand how some concepts work or write example questions I can drill on my own, but nothing where precision matters
There are also some more gotchas, like the generated code using slightly different package versions than the installed ones.
Same; it can't even fix an Xcode memory-leak bug in a simple app. It will keep trying and breaking it non-stop. Garbage.
If you define "productive" as writing a simple CRUD web application that your 13-year-old cousin could write between two gaming sessions, then you'll consider LLMs as sacred monsters.
Snake oil vendors always had great appeal over people who didn't know better.
AI is great for me, but it is more like a junior developer you are pairing with than a replacement.
Simple Python scripts for Home Assistant, for example, it just nails on the first go.
Give Anthropic a shot (it's even better via the API: console.anthropic.com/workbench).
OpenAI is yesterday's news.
Why? I see it like querying a database of human knowledge. I wouldn't expect a SQL database to infer information it's never seen before, why would I expect an LLM to do so?
I use it where I know a solution exists but I'm stumped on the syntax or how to implement it in an unfamiliar environment, or I want to know what could have caused a bug based on others' experience etc.
Here's a chat for a microcontroller and LCD chip that I picked at random (and got the name wrong for) (and didn't want Raspberry Pi code for, so I stopped it short on that response):
https://chatgpt.com/share/2004ac32-b08b-43d7-b762-91543d656a...
Would be really interesting if anyone had blog posts on their actual workflow with LLMs, in case there's something I'm doing different.
When you are familiar with LLMs, a question from someone who doesn't use AI is very obvious. It's the same feeling you get when you roll your eyes and say "you could have googled that in 10 seconds".
It's explaining code you don't even know the lingo for, or where you don't know what the question should be. Or touching code in a framework you've never used. Or tedious tasks like converting parts of text into code or JSON. Or when your mind is stuck or drifts off - ask AI for an idea to get the ball rolling again.
Yes, discovering what works and what doesn't is tedious and slower than "just doing it yourself", like switching IDEs. But once you've found a handful of use cases that solve your problems, it is very refreshing.
I saw an LLM demo at one point where it was asked to write FFT and add unit tests for it which really drove this point home for me.
A programmer is a nicer term for a code monkey. You ask them to write FFT and they'll code it. All problems can be solved with more code. They can edit code, but on the whole it's mostly just adding more code. LLMs are actually pretty good at this job, in my experience. And this job is important; not all tasks can be engineered thoroughly. But this job has its scaling limits.
A software engineer is not about coding per se; it's about designing software. It's all about designing the right code, not more code. Work smarter, not harder, for scale. You ask them to write FFT and they'll find a way to depend on a common implementation so they don't have to maintain an independent one. I've personally found LLMs very bad at this type of work, the same way you and others replying to you describe it. (OK, maybe FFT is overly simple; I'm sure an LLM can import that for you. But you get the idea.) LLMs have statistical confidence, not intellectual confidence. But software engineering generally works with code too complex for pure statistical confidence.
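The "depend on it" version is a couple of lines, for contrast (numpy's FFT here, purely as an example of leaning on a maintained implementation):

```python
import numpy as np

signal = np.sin(2 * np.pi * 5 * np.arange(256) / 256)  # a 5-cycle sine wave
spectrum = np.fft.fft(signal)
print(int(np.argmax(np.abs(spectrum[:128]))))  # -> 5, the dominant frequency
```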
No offense to the LLM fans here, but I strongly suspect most of them are closer to the programmer category of work. An important job, but one more easily automated away by LLMs (or better software engineering long-term). And we can see this by how a lot of programming has been outsourced for decades to cheap labor in third-world countries: it's a simpler type of job. That plus the people biased because their jobs and egos depend on LLMs succeeding.
I’ve found that it’s sometimes amazing and sometimes wastes a lot of my time. A few times it’s really come up with a good insight I hadn’t considered because the conversation has woken up some non-obvious combination. I use ChatGPT, Claude, Perplexity and one or two IDE tools.
It seems especially strong with Python but a bit medium with Swift.
Every thread like this over the past year or so has had comments similar to yours, and it always remains quite vague, or when examples are given, it’s about self-contained tasks that require little contextual knowledge and are confined to widely publicly-documented technologies.
What exactly floored your colleague at Microsoft?
So I don't know how this would go in a much larger codebase.
What floored him was simply how much of my programming I was doing with an LLM / how little I write line-by-line (vs edit line-by-line).
If you're really curious, I recorded some work for a friend. The first video has terrible audio, unfortunately. This second one I think gives a very realistic demonstration – you'll see the model struggle a bit at the beginning:
However, there are some very frustrating limitations to Greptile, so severe that I basically only use it to ask implementation questions on existing codebases, not for anything like general R&D: 1) answers are limited to about 150 lines; 2) it doesn't re-analyze a repo after you link it in a conversation (you need to start a new conversation, re-link the repo, then wait 20+ minutes for it to parse your code); 3) it is very slow (maybe 30 seconds to answer a question); 4) there's no prompt engineering.
I think it's a bit strange that no other AI solution lets you ask questions about existing codebases. I hope that will become more widespread soon.
Speaking of understanding context… they floored him, not the other way round.
Cursor offers some super-marginal UX improvements over the latter (given that it's a fork of VS Code), since it allows you to switch models. But Claude and GPT have been interchangeable, at least for my workflows, so I'm not sure the hype is really deserved.
I can only imagine the excitement comes from the fact that cursor has a full-fat free trial, and maybe most people have never bothered paying for copilot?
Perhaps it's my language of choice (Elixir)? Claude absolutely nails it, rarely gives me code with compilation errors, seems to know and leverage the standard library very well, idiomatic. Not the same with GPTs.
did a quick check, it's $20/month, and it has a vim plugin: https://github.com/pasky/claude.vim
going to give it a spin
Maybe GPT4o changed things.
I was an early advocate for Copilot but, honestly, nowadays I really don't find it that useful, compared to GPT-4o via ChatGPT.
ChatGPT not being directly integrated into my editor turns out to be an advantage. The problem with Copilot is it gets in the way. It's too easy to unintentionally insert a line or block completion that isn't what you want, or is out and out garbage, and it's constantly shoving up suggestions as I type, which can be distracting. It's particularly irritating when I'm trying to read or understand a piece of code, or maybe do a refactor, and I leave my caret in one position for half a second too long, and suddenly it's ghost-inserted a block of code as a suggestion that's moved half of what I'm reading down the screen and now I have to find my place again.
Whereas, with ChatGPT being separate, it operates at a much less intrusive cadence and only responds when I ask it to, which turns out to be much more useful and productive.
I'm seriously considering binning my Copilot subscription as a result.
Most software vendors are selling their version of AI as hallucination free though. So that's terrifying.
I think that's why I like to compare the current state of AI to the state of the CPU industry maybe around the 286-486 era going towards the Pentium.
But outside of that, beyond needing to remember a certain syntax, I have found that any time I tried to use it for anything more complex I am finding myself spending more time going back and forth trying to get code that works than I would have if I had just done it myself in the first place.
Even if the code works, it just isn't maintainable code if you ask it to do too much. It will just remove entire pieces of functionality.
I have seen a situation where someone submitted a PR after very clearly copying a method, sticking it in AI, and saying "improve this". It made changes for no good reason, and when we asked the person who submitted the PR why they made the changes, we of course got no answer. (These were not just linter changes.)
That's concerning: pushing code up when you can't even explain why you did something?
Like you said about the hard work: sure, it can churn out code. But you need to have a completely clear picture of what that code needs to look like before you start generating, or you will not like the end result.
It is of vital importance (imho) to get open models to the same level before another jump comes (if it comes, of course; maybe another winter instead, but at least we'll have something I use every day/all day - so not all hype, I think).
...do you not see the obnoxious CoPilot(TM) buttons and ads everywhere? It's even infected the Azure Portal - and every time I use it to answer a genuine question I have I get factually-incorrect responses (granted, I don't ask it trivial or introductory-level questions...).
1. Use cursor with Claude Sonnet
2. Pick a programming language you don't know at all
3. Build an app in that language prompting only, don't stop prompting until you've run it successfully
Business advice, including marketing, reaching out to investors, understanding SAFE notes (follow-up questions after watching the Y Combinator videos), and customer interview design. All of which, as an engineer, I had never done before.
Create SQL queries for all kinds of business metrics, including monthly/daily active users, breakdown of users by country, abusive-user detection, and more (see the sketch after this list).
Automated unit test creation. Not just the happy path either.
Automated data repository creation, based on a one shot example and MySQL text output describing the tables involved. From this, I have super fast data repositories that use raw SQL to get/write data.
Helping with challenging code problems that would otherwise need hours of searching google or reading the docs.
Database and query optimization.
Code Review. This has caught edge case bugs that normal testing did not detect.
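For the SQL-metrics item above, here's a sketch of the DAU flavor of query - the events(user_id, event_time) schema is hypothetical, and sqlite is used just to keep it self-contained:

```python
import sqlite3

conn = sqlite3.connect("app.db")  # hypothetical database
rows = conn.execute(
    """
    SELECT date(event_time) AS day,
           COUNT(DISTINCT user_id) AS daily_active_users
    FROM events
    GROUP BY day
    ORDER BY day
    """
).fetchall()
for day, dau in rows:
    print(day, dau)
```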
I'm going to try out aider + claude sonnet 3.5 on my codebases. I have heard good things about it and some rave reviews on X/twitter. I watched a video where an engineer had a bug, described it to some tool (which wasn't specified, but I suspect aider), then Claude created a test to reproduce the bug and then fixed the code. The test passed, they then did a manual test and the bug was gone.
I'm glad this has been working for you -- generally any time I actually have a really difficult problem, ChatGPT just makes up the API I wish existed. Then when I bring it up to ChatGPT, it just apologizes and invents new API.
When I look at the kinds of AI projects I have visibility into, there's a parallel where the public are expecting a centralized, all knowing, general purpose AI, but what it's really going to look like is a graph of oddball AI agents tuned for different optimizations.
One node might be slow and expensive but able to infer intent from a document, but its input is filtered by a fast and cheap one that eliminates uninteresting content, and it could offload work to a domain-specific one that knows everything about URLs, for example. More like the network of small, specialized computers scattered around your car than a central know-it-all computer.
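A stub-level sketch of that shape - a cheap filter gating an expensive intent model, plus one narrow specialist. Every heuristic and stand-in function here is made up for illustration:

```python
import re

def cheap_filter(doc: str) -> bool:
    """Fast, cheap node: drop obviously uninteresting content."""
    return len(doc) > 50 and "unsubscribe" not in doc.lower()

def url_specialist(doc: str) -> list[str]:
    """Narrow node that knows everything about URLs."""
    return re.findall(r"https?://\S+", doc)

def intent_model(doc: str) -> str:
    """Stand-in for the slow, expensive intent-inference model."""
    return "complaint" if "refund" in doc.lower() else "other"

def pipeline(doc: str):
    if not cheap_filter(doc):  # the cheap node gates the expensive one
        return None
    return {"intent": intent_model(doc), "urls": url_specialist(doc)}
```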
Every question I've asked of ChatGPT, Meta, and Gemini has returned results that were either obvious or wrong. Pointing out how wrong the answers were got the obligatory "I apologize" response, followed by another obvious answer.
I consider all these AI engines to be interactive search engines where the results need to be double checked. The only thing these engines do, for me perhaps, is save some search time so I don't have to click around on a lot of sites to scroll for some semblance of an answer to verify.
In fields I have less experience with it seems feasible. In fields I am an expert in, I know it's dangerous. That makes me worry about the applicability of the former and people's critical evaluation ability of the whole idea.
I err on the side of "run away".
But the Rubicon is still crossed. There is a general purpose computer system that understands human language and can write real sounding human language. That's a sea change.
All in all, it helps assist us in new ways. Had somebody take a picture of a car part that had no markings and it identified it, found the maker/manufacturer/SKU and gave all the details etc. That stuff is useful.
But now we're looking at inauthentic stuff: artists and writers being plagiarized, job cuts (for said marketing/pitches, BS presentations to downsize teams). It's not just losing its hype; it's losing any hype around building humanity up for the better. It's just more buzzwords, more 'glamour', more 'pop' shoved in our faces.
The layoffs aren't looking pretty.
Works well to help us code though. Viva, sysadmins unite.
Document embeddings from transformers are great and fit into existing search paradigms.
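A minimal sketch of that fit, assuming the sentence-transformers package (the model name is one common public choice, not a recommendation):

```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
docs = ["how to reset a password", "quarterly revenue report", "VPN setup guide"]
doc_vecs = model.encode(docs, normalize_embeddings=True)

query_vec = model.encode(["I can't log in"], normalize_embeddings=True)[0]
scores = doc_vecs @ query_vec       # cosine similarity (vectors are unit length)
print(docs[int(np.argmax(scores))])  # semantic match: the password-reset doc
```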
Computer vision and image segmentation are at a level I thought impossible 10 years ago.
Text to speech that sounds natural? I might actually use Siri and Alexa! (Ok, that one might be considered “generative”)
The sooner people start to find it boring, the sooner we can stop wasting time on all the hot air and just use the bits that work.
For what it’s worth hype doesn’t mean sustainability anyway. If all the jokers go onto a new fad it’s hardly the skin off the back of anyone taking this seriously, they’ve been through worse times.
Large and small, entire development teams are completely unaware of the basics of "prompt engineering" for coding, and corporate has an entirely regressive anti-AI policy that doesn't factor in the existence of locally run language models and just assumes ChatGPT and cloud-based ones digesting trade secrets. People aren't interested in seeing what the hype is about, and are disincentivized from bothering on a work computer. I'm on one team where the Engineering Manager is advocating for Microsoft Copilot licenses - as in, it's a concept that hasn't happened yet and needs buy-in to even start considering.
I would say most people really haven't looked into it. Work is work, the sprint is the sprint, on to the next part of the product, rinse, repeat. Time flies for those people; that's probably most of the people here.
We are running out of textual data now to train on… so now they have switched to VIDEO. Geez now they can train on all the VIDEOS on the internet.
And when they finally get bots working, they will have limitless streams of TACTILE data…
Writing it off as the next fad seems fun. But to be honest, I was shocked by what openai did the first time. So they have my respect. I don’t think many of us saw it coming. And I think writing their creativity off again may not be wise.
So when they say the bubble is about to break… I get it. But I don’t see how.
I hardly ever pay for anything.
But I gladly spend money on ai to get the answers I need. Just makes my work work!
Also I would say the economic benefit of this tech for workers is that it will 2x the average worker as they catch on. Seriously I am a 2x coder compared to what I was because of this.
Therefore, if I, a person who hardly ever spends money, have to buy it… I think eventually all businesses will realize all their employees need it, driving massive revenue for those who sell it.
But it may not be the companies we think.
ChatGPT truly is impressive. Nonetheless, I still think most companies integrating "AI" into their products is buzzword BS that is all going to collapse in on itself.
You probably shouldn't advertise that.
Isn't the energy consumption of this technology pretty catastrophic? Do you consider the issue of energy consumption so abstracted you don't worry about it? Do you do anything to offset your increased carbon emissions?
There are a lot of smallish tasks/problems that people/systems need to deal with - some of them even waste notable real engineering capacity - and that a highschooler could quite easily do by hand.
Example: find out if a text contains an email address, including all kinds of shenanigans people do to mask it (because it may not be allowed, ... whatever). From a purely coding standpoint, this is a cat-and-mouse game of improving regex solutions to also catch the more sophisticated patterns, but there will always be uncaught/new ways, or simply errors that produce false positives. A highschooler, though, can be given a text and instantly spot the email address (or confirm none is in there).
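To make the cat-and-mouse concrete, here's a regex-side sketch; the pattern and the two de-masking rules are ad hoc, and the point is exactly that this list of maskings never ends:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # naive, misses masked forms

def demask(text: str) -> str:
    """Undo two common maskings; real obfuscations keep evolving."""
    text = re.sub(r"\s*[\(\[]\s*at\s*[\)\]]\s*", "@", text, flags=re.I)
    text = re.sub(r"\s*[\(\[]\s*dot\s*[\)\]]\s*", ".", text, flags=re.I)
    return text

print(EMAIL.findall(demask("reach me: bob (at) example (dot) com")))
# -> ['bob@example.com'] ... until someone writes "bob AT example DOT com"
```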
In order to "solve" these types of small problems, LLMs are pretty much fantastic. They need only be reliable enough to produce a structured answer within a few attempts and cheap enough not to be a concern for finance/operations. That's why it makes absolute sense to me that the #1 priority for OpenAI since GPT-4 has been building smaller/faster/cheaper models. Automators need exactly that, not genius-level AGI.
Also, I think we're not even scratching the surface of how many tasks can be automated away within the current constraints/flaws of LLMs (hallucination, accuracy, ...). Everyone tries to hype up some super-generic powerful future (that usually falls flat after a while), whereas the true value of LLMs is in the many small things where hardcoding a solution is expensive but an intern could do it right away.
They work when there's a lot of examples on github or google, but once you get into something that doesn't have a lot of examples like closed source code or rarely used libraries, it will start hallucinating and even mixing up different API versions to create a mess that doesn't work at all.
I don't believe LLMs will get any better than this without a new major breakthrough, but this is already better than using Google search.
The shortcomings are aplenty, but they don't bother me. The things it can do weren't possible 2 years ago. I'll leverage those and take the bad with the good.
Similar experience with Tesla FSD. I know other Tesla owners who tried it a few times and think it's trash because they had to disengage. I disengage preemptively all the time but the other 90% of my drive being done for me is not something that used to be possible. I tried to give up my subscription because it's expensive and couldn't hold out two days.
I kind of like LLMs for learning new languages. Claude or ChatGPT are good for asking questions. But copilot really stunts learning for me. I find nothing really sticks in my brain when I have copilot running. I feel like I just turn my brain off, which seems kind of dangerous to me in the long run.
I see a lot of negative reactions from programmers precisely because it is good at what they do. If you’re feeling threatened you’re much more likely to focus on the things it can’t do
Just have to learn to let it go, despite xkcd 386.
Regarding LLMs, I bet we will see them evolve. Don't forget about https://news.ycombinator.com/item?id=41269791 - there are many problems that LLMs are no good for, but they are better than many Google search results, and that means something from the economic point of view.
I am sure that The Economist's analytics team is having a good moment /s.
Seemingly every non-tech company in the world has been trying to figure out an "AI strategy," driven by hype and FOMO, but most corporate executives have no clue as to what they're doing or ought to be doing. They are spending money on poorly thought-out ideas.
Meanwhile, every tech company providing "AI services" has been spending money like a drunken sailor, fueled by hype and FOMO. None of these AI services are generating enough revenue to cover the cost of development, training, or even, in many cases, inference.
Nvidia, the dominant software-plus-hardware platform (CUDA is a big deal), appears to be the only financial beneficiary of all this hype and FOMO.
According to the OP, the business of "AI" is losing hype, suggesting we're approaching a bust.
This isn't a bull bet, it's a bear. AI would need to be perfectly monopolized to capture all the gains, and it's increasingly looking like that won't be the case - as all the component pieces are already open source at competitive levels, and any final architecture improvements that cross the final thresholds could be leaked in a 50GB file. Whoever gets to it first has a few months head start, at most, and probably not enough time or control to sell products - or shovels. After that it's a neverending race to zero, to the benefit of the consumer and the detriment of the investor.
Nvidia is a great example. They currently dominate the GPU market, the "essential hardware for AI", yet ternary ASIC chips specialized for transformer-only architectures are looking quite viable at 1999's tech levels. I wouldn't bet on that monopoly sticking around much longer.
It depends how you look at it. A lot of the spend by big tech can be seen as protecting what they already have from disruption. It's not all about new product revenues; it's about keeping revenue share in the markets they already have.
On the other hand, we are nowhere near approaching hard limits on LLMs. When LLMs start to be trained for smaller subject areas with massive hand-curated examples of problem solving, they will reach expert performance in those narrow tech areas. These specialized models will then be combined in general-purpose MoEs.
Then new approaches beyond LLMs, RL, etc. will be discovered, perfected, made more efficient.
Seriously, any hard limits are far into the future.
Now the one API wrapper projects that I love are my meeting transcription and summarization apps. You can tear those from my cold, dead hands.
With regard to art AI, I think the debates are going to die off, and the artists and people making stuff are going to just keep doing that - and some of them will use AI in ways that challenge people, as good art often does.
In other words, a lot of people seem to think that human attention spans are what determine everything, but the technological cycles at work here are much, much deeper.
Personally I have used Midjourney and ChatGPT in ways that will have huge impacts on many activities and industries. Denying that because of media trendiness about AI seems shortsighted.
Please tell that to all types on HN who downvote anything related to Rust without even reading past the title. :D
> In other words, a lot of people seem to think that human attention spans are what determine everything, but the technological cycles at work here are much, much deeper.
IMO no reasonable person denies this, it's just that the "AI" technology regularly over-promises and under-delivers. At one point it's no longer discrimination, it's just good old pattern recognition.
> Personally I have used Midjourney and ChatGPT in ways that will have huge impacts on many activities and industries. Denying that because of media trendiness about AI seems shortsighted.
Some examples with actual links would go a long way. I, for one, am skeptical of your claim, but I am open to having my mind changed (f.ex. my CFO told me once that ChatGPT helped him catch several bad contract clauses).
• text generators
• code generators
• image generators
• video generators
• speech generators
• sound/music generators
• various robotics vision and control systems (often trained in virtual environments)
• automated factories / warehouses / fulfillment centers
• self-driving cars (trucks/planes/trains/boats/bikes/whatever)
• scientific / reasoning / math AIs
• military AIs
I find all of these categories already have useful AIs. And they are getting better all the time. The progress might slow down here and there, but it keeps on going.
Self-driving was pretty bad a year ago, and now we have Tesla FSD driving uninterrupted for multiple hours in complex city environments.
Image generators now exceed 99.9% of humans in painting/drawing abilities.
Text generators are decent. There are hallucination issues, and they are not creative at the best human level, but I'd say they write better than 90% of humans. When it comes to poetry/lyrics, they all still suck pretty badly.
Video generators are in their infancy - we get decent quality, but absolutely mental imagery.
Reasoning is the weakest point, in my opinion. Current-gen models are just not good at reasoning. Sometimes they are brilliant, but then they make very silly mistakes that a 10-year-old child wouldn't make. You just can't rely on their logical abilities. I have really high hopes for this area. If they can figure out reasoning, our science research will become a lot more reliable and a lot faster.
The threshold for acceptable self-driving is genuine effort from the automated system to avoid accidents as we can't punish it for bad driving. And I want auditable proof of that.
> Image generators now exceed 99.9% of humans in painting/drawing abilities.
I'm pretty sure the number of people who can draw is less than that. And they can beat image generators by a mile, as those generators are mostly doing automated matte painting. Yes, copy-paste is faster than typing, but that's not writing a novel.
> Text generators are decent...but I'd say they write better than 90% of humans.
Humans use language to communicate. And while there are bad communicators, I think lots of people are doing OK on that front. Text generators can be perfect syntax-wise, but the intent has to come from someone, and the produced text's quality is proportional to the amount of intent behind it (that's why corporate language is so bland).
> Video generators are in their infancy - we get decent quality, but absolutely mental imagery.
See Image Generator section, but in motion.
> Reasoning is the weakest point, in my opinion... If they can figure out reasoning
That's the billion-dollar question.
I couldn't care less about any hype, so neither about the LLM hype. I especially didn't bother going to a new website (ChatGPT) or installing new IDEs, etc.
I checked Codeium's mycompany-customized landing page: a one-liner vim plug-in installation and copy-pasting an auth token.
I started typing in the very same editor, very same environment, very same everything, and the thing just works - most of the time it guesses well what I would want to write, so I just press tab to accept, and voila.
I wasn't expecting such a seamless experience.
I still haven't integrated its "chat" functionality into my workflow (maybe I won't at all). I'm not hyped about it, it just feels like a companion to already working (and correct) code completion.
I read a lot about other people's usage (I'm a devXP engineer), and I feel that, for whatever reason, there is more love/hype/faith in their chosen AI companion than actual improvement they could get by taking the humble route of understanding code, reading (and writing) docs, and reasoning about the engineering solution.
As with everything, AI is now losing hype, but somehow (in my bubble) it seems engineers are still high on it. But I also see that this will further distill the set of people who I look up to and want to collaborate with, because of that mentioned humbleness, as opposed to just mindlessly accepting text-predicted solutions.
In my experience, a different LLM excels at each task, and where one was good it might fail at another. They can do great things, but it's not guaranteed, and a lot of manual intervention and back-and-forth is still needed.
We are not at the point where using AI in a company is just a blanket win for everyone involved. Companies are investing a lot, but the return is hard to measure and not always guaranteed.
This is the problem with early technologies: they sometimes work, but success isn't guaranteed, and we build our expectations by extrapolating their usefulness. We should not judge this technology by its current success rate, but by how much impact it will have once we push that success rate higher and higher.
Still, what we can say is that for certain occupations it already reduces their work by something like 15% (software engineers), and probably more for some (writers, product owners, office warriors and the like). This is a great achievement in and of itself; think how much this adds up to in a company as large as MS or Google.
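To make that concrete, a back-of-envelope sketch (the headcount and cost figures below are invented round numbers for illustration; only the 15% comes from the comment above):

    # Back-of-envelope: what a 15% reduction in engineering work could be
    # worth. Headcount and fully loaded cost are assumptions, not real data.
    engineers = 30_000            # assumed engineering headcount
    cost_per_engineer = 250_000   # assumed fully loaded cost, $/year
    time_saved = 0.15             # the 15% figure from above

    savings = engineers * cost_per_engineer * time_saved
    print(f"~${savings / 1e9:.1f}B/year")  # ~$1.1B/year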
It is the dotcom bust again. The mainstream is losing interest, but at the same time I see our internal chatbots / AI agents showing hockey-stick growth, and I am using Copilot several hours daily.
Are you referencing something specific here, or is there something you can link to? To be honest the only significant 'disruption' I've seen for LLMs so far has been cheating on homework assignments. I'd be happy to read something if you have it.
So again, let's see some proof of this extensive use and large improvements to productivity.
I think that's what people like about AI: it's hope. Maybe you won't have to learn anything but still be productive. Sounds nice?
That’s gonna be a bad take I think.
It has made people lump all AI technology into a bubble, regardless of whether it is functional or not.
You are using this stuff to do some really cool things, but having hype attached to it can be very positive in the short term, damaging in the medium term, and neutral in the long term. We are moving into the medium term.
Meanwhile we’re seeing the first of the new generation of on-device inference chips being shipped as commodity edge compute.
When the devices you use every day — cars, doorbells, TV remotes, points-of-sale, roombas — can interpret camera and speech input locally in the time it takes to draw a frame and with low enough power to still give you 10h between charges: then we’ll be due another round of innovation.
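The implied budgets are easy to put numbers on; a quick sketch (the battery capacity below is an assumption):

    # Latency and power budgets implied above: inference within one frame,
    # and 10 hours between charges. Battery capacity is an assumption.
    fps = 60
    frame_ms = 1000 / fps          # ~16.7 ms to run inference per frame
    battery_wh = 15                # assumed phone-sized battery, watt-hours
    avg_power_w = battery_wh / 10  # ~1.5 W average draw for 10 h runtime
    print(f"{frame_ms:.1f} ms/frame, {avg_power_w:.1f} W average budget")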
The article points to how few parts of the economy are leveraging the text-only API products currently available. That still feels very Web 1.0 to me.
- AI is currently hyped to the gills
- Companies may find it hard to improve profits using AI in the short term
- A crash may come
- We may be close to AGI
- Current models are flawed in many ways
- Current-level generative AI is good enough to serve many use cases
Reality is nobody truly knows - there's disagreement on these questions among the leaders in the field.
An observation to add to the mix:
I've had to deliberately work full time with LLMs in all kinds of contexts since they were released. That means forcing myself to use them for tasks whether they are "good at them" yet or not. I found that a major inhibitor to my adoption was my own set of habits around how I think and do things. We aren't used to offloading certain cognitive/creative tasks to machines; we still have the muscle memory of wanting to grab the map when we've got GPS in front of us. I found that once I pushed through this barrier and formed new habits, it became second nature to create custom agents for all kinds of purposes to help me in my life. One learns which tasks to offload to the AI and how to offload them, and when and how to step in with the different capabilities of the human mind.
I personally feel that pushing oneself to be an early adopter holds real benefit.
We have to realize that there is a ton of money right now behind pushing AI everywhere. We have entire conventions for leadership pushing that a year later "is the time to move AI to Prod" or for "Moving past the skeptics".
We have investors seemingly asking every company they invest in "how are you using generative AI" before investing. We have Microsoft, Google, and Apple (to a lesser degree) forcing AI down our throats whether we like it or not, ignoring any reliability (inaccuracy) issues.
FFS Microsoft is pushing AI as a serious branding part of Windows going forward.
We have too much money committed to pushing the idea that we already have general AI, too much marketing, etc.
Consumer hype and money in this situation are going to be very different things. I do think a bust is going to happen, but I don't think the "hype" has died down in any meaningful way. I think, and hope, it will, since we keep seeing how the technology simply can't do what they are claiming. But I honestly don't think that's going to happen until something catastrophic does, and it is going to be ugly when it does. Hopefully your company won't be so reliant on it that it can't recover.
AI ain't going nowhere, and it certainly isn't overhyped. LLMs, however, certainly are.
Then again, I find it a good interface for assistants, and for actual AI and APIs that it can call on your behalf.
NVDA's high closes were $135.58 on June 18, down to $134.91 on July 10th, and $130 at close today. Its highest sale ever is $140.76. So its close today is 8% off its highest sale ever and 4% off its highest close ever; not a big thing for a volatile stock. Its earnings are next week, and we'll see how it does.
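For what it's worth, those percentages check out; a quick sanity check using the quoted prices:

    # Drawdown math on the quoted NVDA prices.
    high_sale, high_close, today = 140.76, 135.58, 130.00
    print(f"{(1 - today / high_sale) * 100:.1f}% off highest sale")    # 7.6%
    print(f"{(1 - today / high_close) * 100:.1f}% off highest close")  # 4.1%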
Nvidia and SMCI are the ones who have been earning money selling equipment for "AI". For Microsoft, Google, Facebook, Amazon, OpenAI, etc., it is all big initial capital expenditure, which they (and the scolding investment-bank analysts) hope to recoup in the future.
Personally, I'd wager the latter.
among which audience? is the hype necessary for further development? we attained much, if not all, of the recent achievements without hype. if anything, i'm strongly in favor of ai losing all the hype so that our researchers can focus on what's necessary, not on what will win the loudest applause from so fickle a crowd. i'd be worried if ai were attracting fewer researchers than, say, two or three years ago. that doesn't seem to be the case.
The future is most definitely exciting though, and sadly quite scary, too.
Those who do not know history are doomed to repeat it.
But then, the current hype wasn't there to produce something useful, but for "serial entrepreneurs" to get investor money. They'll just move to the next hyped thing.
Yann LeCun had a great tweet on this:

Sometimes, the obvious must be studied so it can be asserted with full confidence:
- LLMs can not answer questions whose answers are not in their training set in some form,
- they can not solve problems they haven't been trained on,
- they can not acquire new skills or knowledge without lots of human help,
- they can not invent new things.
Now, LLMs are merely a subset of AI techniques. Merely scaling up LLMs will not lead to systems with these capabilities.
link https://x.com/ylecun/status/1823313599252533594?ref_src=twsr...
To focus on this:
- LLMs can not answer questions whose answers are not in their training set in some form,
- they can not solve problems they haven't been trained on
Given that we are close to the maximum possible size of the training set, this means they are not going to improve without some technical breakthrough that is completely unknown at the moment. Going from "not intelligent" to "intelligent" is a massive shift.
The problem is that, by the standards of most human beings, they are in fact doing what we informally call "inference" or "creating new things".
That this is being accomplished by something that is "technically a fancy autocomplete" doesn't seem to matter practically... it's still doing all this useful and surprising stuff.
You're doing yourself and readers a disservice when you quote him without mentioning his conflict of interest.
His research is in analytical approaches to ML, hence his bitterness toward current LLM techniques and his skepticism toward Sutton's Bitter Lesson.
Things that ARE coming to an end:
- Startups whose entire business model is to just provide a wrapper around OpenAI's API.
- Social Media "AI Influencers" and their mindless "7 Ways To Become A Millionaire With ChatGPT" videos.
- Non-technical pundits claiming we are 1-2 years from AGI (and AGI talk in general).
- The stock market assigning insane valuations to any company that claims to somehow be "doing AI".
Things that are NOT coming to an end:
- Ongoing R&D in AI (and not just LLMs).
- Companies at the frontier of AI (OpenAI, Anthropic, Mistral, Google, Meta) releasing ever more capable models and tooling around those models.
- Forward looking companies in all industries using AI both to add capabilities to their products and to drive efficiencies in internal processes.
This collapses as soon as this collapses:
> - The stock market assigning insane valuations to any company that claims to somehow be "doing AI".
Either way, if it is indeed a bubble that will burst at some point, it doesn't bode well for the tech industry. With the mass layoffs, which are ongoing, it seems like there won't be enough jobs for everyone.
For the record, before spelling the recipes out, it made sure I understood that collecting elk eggs may be unlawful in some jurisdictions.
I think part of it is due to the politically and internet-induced death of nuance. But part of it I can't fully understand.
Personally, I think it's rather useful. I don't consider myself a heavy user, and I still use it almost every day to help code; I ask it a lot of questions about specific and general stuff. It has partially or totally replaced for me: Stack Overflow, Google Search, Google Translate, most tech references. In the office I see people using it all the time; there's almost always a ChatGPT window open on some of the displays.
I think it's very difficult to say this is 100% hype and/or a "phase". It's almost a proven fact that it's useful and that people will want it in their lives, even if it never improves again. It's a new tool in the toolbox, and there will be businesses providing it as a service, or perhaps we will get to open-source general availability.
On the other extreme, all the AI doomerism and AGI stuff seems to me almost as unfounded as before generative AI. Sure, it's likely we'll get to AGI one day. But if you thought we were 100 years away, I don't think ChatGPT put us any closer, and I just don't get people who now say 5. I'd rather they worried about the impact of image-gen AI on deepfakes and misinformation. That's _already_ happening.
My take on this is that those 2 developers are often working on very different tasks.
If you're a very smart coder working in a large codebase with tons of domain knowledge you'll find it's useless.
If you're a very smart coder working in a consultancy and your end result looks like a few thousand lines of glue code, then you're probably going to get a lot out of LLMs.
It's a bit like "software engineering" vs "coding". Current iterations of LLMs are good at "coding" but crap at "software engineering".
> The new crop of intelligent agents are different from the automated devices of earlier eras because of their computational power. They have Turing-machine powers, they take over human tasks, and they interact with people in human-like ways, perhaps with a form of natural language, perhaps with animated graphics or video. Some agents have the potential to form their own goals and intentions, to initiate actions on their own without explicit instruction or guidance, and to offer suggestions to people. Thus, agents might set up schedules, reserve hotel and meeting rooms, arrange transportation, and even outline meeting topics, all without human intervention.
they need to find a different derogatory slur to refer to tech workers
ideally one that isn't sexist and doesn't erase the contributions of women to industry
I have mixed feelings. On the one hand, I have a ton of schadenfreude for the AI maximalists (see: Leopold Aschenbrenner and the $1 trillion cluster that will never be), hype men (LinkedIn gurus and Twitter “technologists” that post threads with the thread emoji regurgitating listicles) or grifters (see: Rabbit R1 and the LAM vaporware).
On the other hand, I’m worried about another AI winter. We don’t need more people figuring out how to make bigger models, we need more fundamental research on low-resource contexts. Transformers are really just a trick to be able to ingest the whole internet. But there are many times where we don’t have a whole internet worth of data. The failure of LLMs on ARC is a pretty clear indication we’re not there yet (although I wouldn’t consider ARC sufficient either).
AI follows more of a seasonal pattern, with AI winters; can we expect a new winter soon?
> “An alarming number of technology trends are flashes in the pan.”
this has been a trend that seems to keep recurring, but it does not stop the tech bros from pushing the marketing beyond the realities.
raising money in the name of the future will give you results similar to self-driving cars or vr. the potential is crazy, but it is not going to make you double your money in a couple of financial years. this should help serious initiatives find better-aligned investors.
The Economist, seriously?
The first started with simple non-ML image manipulation and video analysis (like spotting baggage left unmoved for a certain amount of time in a hall, trespassing alerts for gates, and so on) and reached the level of live video analysis for autonomous driving. The second dates back a very long time, maybe to Conrad Gessner's library of Babel / Bibliotheca Universalis (~1545), with a simple consideration: a book is good for developing and sharing a specific topic, a newspaper for knowing "at a glance" the most relevant facts of yesterday, and so on, but we still need something to elicit specific bits of information out of "the library" without a human needing to read everything manually. Search engines do work but have limits. LLMs are the failed promise of being able to compress information into a model and then extract it, well distilled, on user prompt. That's the promise; the reality is that pattern matching/prediction can't do much more, for the same problem we have with images: there is no intelligence.
For an LLM, if a known scientist (as per tags in some part of the model's ingested information) says, joking in a forum, that eating a small rock a day is good for your health, the LLM will suggest the practice simply because it has no concept of a joke. Similarly, having no knowledge of humans, a hand with ten fingers is perfectly sound to it.
That's the essential bubble: PR people and people without knowledge have seen Stable Diffusion producing an astronaut riding a horse, have asked some questions of ChatGPT, and have said "WOW! OK, not perfect, but it will just be a matter of time," and the answer is no, it will NOT be, at least not with the current tech. There are some uses, like automatic translation: imperfect, but good enough to be arranged so that one human translator can do the job ten did before; some low-importance ID checks could be done with electronic IDs plus face recognition, so a single human guard can operate ten gates alone in an airport, intervening only where face recognition fails. Essentially, a FEW low-skill jobs might be automated; the rest is just classic automation, like banks closing offices simply because people use internet banking and pay with digital means, so there is almost no need to pick up and deposit cash anymore, no reason to go to the bank anymore. The potential so far can't grow much more, so the bubble bursts.
Meanwhile, big tech wants to keep the bubble up because LLM training is not something single humans can do at home, the way we can run a home server for our email, VoIP phone system, or file sharing. Yes, it's doable as a community, like search with YaCy or maps with OpenStreetMap, but the need for data and patient manual tagging is simply too cumbersome for a real community-born model to match or surpass one made by big tech. IT knowledge has very lately, and in a very limited way, started to spread just enough to endanger the big tech model, so they need something users can't do at home on a desktop. That's one part of the fight.
Another part is the push toward no-ownership for the 99%, the better to lock in (or enslave) users. So far the cloud+mobile model has created lock-in, but users can still get their data and host things themselves; if they no longer operate computers at all, just "smart devices", the option to download and self-host is next to none. Hence the push for autonomous taxis instead of personal cars, connected dishwashers that send 7+ GB/day home, and so on. This does not work technically, so despite the immense amounts of money and the struggles of the biggest players, people are starting to smell a rat and their mood drops.
Q: How many N's are there in Normation?
A: There is one N in the word "Normation"
Note that the answer is the same when asked about n's instead of N's.
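For reference, the ground truth is ambiguous only in casing; a quick check:

    # "Normation" contains 1 uppercase N, and 2 n's case-insensitively.
    word = "Normation"
    print(word.count("N"))          # 1 (case-sensitive)
    print(word.lower().count("n"))  # 2 (case-insensitive)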
And this is but one example of many simple cases demonstrating that these models are indeed not reasoning in a manner similar to humans. However, the outputs are useful enough that I myself use Claude and GPT-4o for some work, but with full awareness that I must review the outputs in cases where factual accuracy is required.
Q: How many N's are there in Normation?
A: There are 2 N's in the word "Normation".
Q: Confidence level?
A: A more precise question! After rechecking, I can confirm that there is indeed 2 N's in the word "Normation". Confidence level: 100%
Q: How many N's are there in the string "Normation"?
A: There are 2 N's in the string "Normation".
On several regenerations, at some point it gives:
A: There are 3 N's in the word "Normation".
Q: Confidence level?
A: A slight challenge! After re-checking, I see that there are actually 2 N's in the word "Normation".