But many jobs are not like that. Imagine an AI nurse giving bad health advice on the phone. Somebody might die. Or an AI salesman making promises that are against company policy? The company is likely to be held legally liable, and may lose significant money.
For legal reasons, my company couldn't enable full LLM generative capabilities on the chatbot we use, because we would be legally responsible for anything it generates. Instead, the LLM is simply used to determine which of the pre-determined answers fits the query best, which it does well where more traditional technologies fail. But that's not revolutionary, just an improvement. I suspect there are many barriers like that hindering its use in many fields, even where it could work most of the time.
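Roughly this pattern, as a minimal sketch assuming an OpenAI-style chat API (model name, prompt wording, and canned answers are all made up for illustration):

    // LLM-as-router sketch: the model only picks among vetted, pre-approved
    // answers, so nothing it generates reaches the user directly.
    import OpenAI from "openai";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

    const cannedAnswers = [
      "You can reset your password from the account settings page.",
      "Our support team is available weekdays from 9am to 5pm.",
      "Refunds are processed within 5-7 business days.",
    ];

    async function routeQuery(userQuery: string): Promise<string | null> {
      const menu = cannedAnswers.map((a, i) => `${i}: ${a}`).join("\n");
      const resp = await client.chat.completions.create({
        model: "gpt-4o-mini",
        temperature: 0,
        messages: [
          {
            role: "system",
            content:
              "Pick the single pre-approved answer that best fits the " +
              "user's question. Reply with only its number, or -1 if none fit.",
          },
          { role: "user", content: `Answers:\n${menu}\n\nQuestion: ${userQuery}` },
        ],
      });
      const idx = parseInt(resp.choices[0].message.content ?? "-1", 10);
      return cannedAnswers[idx] ?? null; // null -> fall back to a human agent
    }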
So, nearly all use cases I can think of now will still require a human in the loop, simply because of the unreliability. That way it can be a productivity booster, but not a replacement.
The healthcare system has always killed plenty of people because humans are notoriously unreliable, fallible, etc.
It is such a stubborn, critical, and well-known issue in healthcare that I welcome AI being deployed slowly and responsibly to see what happens, because the situation hasn't significantly improved with everything else we've thrown at it.
This problem is not unique to AI; you see it with human medical professionals too. People are regularly misdiagnosed, or not diagnosed at all. At least with AI you could compare the results of different models almost instantly and get confirmation. An AI doctor also wouldn't miss information on a chart the way a human can.
> So, nearly all use cases I can think of now will still require a human in the loop, simply because of the unreliability. That way it can be a productivity booster, but not a replacement.
This is exactly what your parent said, yet you replied seemingly disagreeing. AI tools are here to stay and they do increase productivity, be it coding, writing papers, or strategizing. Those who continue to think of AI as not useful will be left behind.
Human in the loop can add reliability, but the most common use cases I'm seeing with AI are helping people see the errors they are making, or their lack of sufficient effort to solve the problem.
IME LLMs are great at giving you the experience of learning, in the same way sugar gives you the experience of nourishment
How much is "enough"? Neither I nor my coworkers have found LLMs to be all that useful in our work. Almost everybody has stopped bothering with them these days.
Its output depends on your input.
E.g. say you have the swagger documentation for an API and you want to generate a TypeScript type definition from it: you just copy-paste the docs into a comment above the type, and Copilot autofills your TypeScript type definition, even adding ? for properties that are not required.
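Something like this, with a made-up endpoint and fields:

    // GET /users/{id} response (docs pasted from swagger, fields illustrative):
    //   id        integer  required  Unique user id
    //   email     string   required  Primary email address
    //   nickname  string   optional  Display name
    //   lastSeen  string   optional  ISO 8601 timestamp
    type UserResponse = {
      id: number;
      email: string;
      nickname?: string; // optional properties get `?`
      lastSeen?: string;
    };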
If you clearly define the goal of a function in a JSDoc comment, you can implement very complex functions. E.g. you define it in steps, and in the function outline each step. This also helps your own thinking. With GPT-4o you can even draw diagrams in e.g. Excalidraw, or take screenshots of the issues in your UI, to complement your question relating to that code.
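For instance, a step-wise spec like this (function and domain invented for illustration) gives both you and the model a checklist to implement against:

    /**
     * Aggregates raw order rows into per-customer totals.
     * Steps:
     * 1. Drop rows with a missing or non-positive amount.
     * 2. Group the remaining rows by customerId.
     * 3. Sum the amounts within each group.
     * 4. Return the groups sorted by total, descending.
     */
    function totalsByCustomer(
      rows: { customerId: string; amount: number }[],
    ): { customerId: string; total: number }[] {
      const sums = new Map<string, number>();
      for (const row of rows) {
        if (!row.amount || row.amount <= 0) continue; // step 1
        sums.set(row.customerId, (sums.get(row.customerId) ?? 0) + row.amount); // steps 2-3
      }
      return [...sums.entries()]
        .map(([customerId, total]) => ({ customerId, total }))
        .sort((a, b) => b.total - a.total); // step 4
    }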
this really rings true for me. especially as a junior, I always thought one of my best skills was that I was good at Googling. I was able to come up with good queries and find some page that would help. Sometimes, a search would be simple enough that you could just grab a line of code right off the page, but most of the time (especially with StackOverflow) the best approach was to read through a few different sources and pick and choose what was useful to the situation, synthesizing a solution. Depending on how complicated the problem was, that process might have occurred in a single step or in multiple iterations.
So I've found LLMs to be a handy tool for making that process quicker. It's rare that the LLM will write the exact code I need - though of course some queries are simple enough to make that possible. But I can sort of prime the conversation in the right direction and get into a state where I can get useful answers to questions. I don't have any particular knowledge on AI that helps me do that, just a kind of general intuition for how to phrase questions and follow-ups to get output that's helpful.
I still have to be the filter - the LLM is happy to bullshit you - but that's not really a sea change from trying to Google around to figure out a problem. LLMs seem like an overall upgrade to that specific process of engineering to me, and that's a pretty useful tool!
Yeah, but there are other ways to think through problems, like asking other people what they think, which you can evaluate based on who they are and what they know. GPT is like getting advice from a cross-section of everyone in the world (and you don't even know which one), which may be helpful depending on the question and the "people" answering it, but it may also be extraordinarily unhelpful, especially for very specialized tasks (and specialized tasks are where the profit is).
Like most people, I have very specific knowledge of a few things that fewer than 100 people in the world know better than me, but that thousands or even millions more have some poorly conceived general idea about.
If you asked GPT a question in that area, it would be biased toward those millions, the statistically greater quantitative answer over the qualitative one. But maybe GPT only has a few really good sources in its training data that it uses for its response, and then it's extremely helpful, because it's like accidentally landing on a Stack Overflow response by some crazy genius who reads all day, lives out of a van in the woods, and uses public library computers to answer queries in his spare time. But that's sheer luck, and no more so than a regular search will get you.
Also, you can look into Cursor.
There are actually quite a few tools.
I have my own agent framework in progress that has many plugins with different commands, including reading directories, tree, reading and writing files, running commands, and reading spreadsheets. So I can tell it to read all the Python in a module directory, run a test script, and compare the output to a spreadsheet tab. Then I ask it to come up with ideas for making the Python code match the spreadsheet better, and have it update the code and rerun the tests iteratively until it's satisfied.
If I am honest about that particular process last night, I am going to have to go over the spreadsheet manually to some degree today, because neither GPT-4o nor Claude 3.5 Sonnet was able to get the numbers to match exactly.
It's a somewhat complicated spreadsheet in a domain I know nothing about and am just grudgingly learning. I think the agent got me 95% of the way through the task.
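For illustration only, a stripped-down version of that read/run/compare loop might look like the following; the tool names and dispatch shape are hypothetical, not the actual framework:

    import { execSync } from "node:child_process";
    import { readFileSync, writeFileSync, readdirSync } from "node:fs";

    type ToolCall = { tool: string; args: Record<string, string> };

    // The model replies with JSON tool calls; we execute them and feed the
    // results back into the transcript.
    function runTool(call: ToolCall): string {
      switch (call.tool) {
        case "list_dir":
          return readdirSync(call.args.path).join("\n");
        case "read_file":
          return readFileSync(call.args.path, "utf8");
        case "write_file":
          writeFileSync(call.args.path, call.args.content);
          return "ok";
        case "run_command":
          return execSync(call.args.command, { encoding: "utf8" });
        default:
          return `unknown tool: ${call.tool}`;
      }
    }

    // callModel stands in for whatever chat API is being used; the loop stops
    // when the model says it is satisfied or the step budget runs out.
    async function agentLoop(callModel: (transcript: string) => Promise<string>) {
      let transcript = "Goal: make the Python module's output match the spreadsheet.";
      for (let step = 0; step < 20; step++) {
        const reply = await callModel(transcript);
        if (reply.includes("DONE")) break;
        const call = JSON.parse(reply) as ToolCall;
        transcript += `\n${call.tool} -> ${runTool(call)}`;
      }
    }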
I have Copilot suggestions bound to an easy hotkey to turn them on or off. If I'm writing code that's entirely new to the code base, I toggle the suggestions off; they'll be mostly useless. If I'm following a well-established pattern, even a complicated one, I turn them on; they'll be mostly good. When writing tests in C#, I reflexively give the test a good name and write a tiny bit of the setup, then Copilot will usually be pretty good about the rest. I toggle it multiple times an hour; it's about knowing when it'll be good, and when not.
Beyond that, I get more value from interacting with the LLM by chat. It's important to have preconfigured personas, and it took me a good 500 words and some trial and error to set them up and get their interaction styles where I need them to be. There's the ".NET runtime expert", the "infrastructure and release mentor", and so on. As soon as I feel the least bit stuck or unsure, I consult one of them, possibly in voice mode while going for a little walk. It's like having the right colleague always available to talk something through, and I now rarely find myself spinning my wheels, bike-shedding, or what have you.
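The setup can be as simple as one named system prompt per persona. A minimal sketch (prompts abbreviated here; the real ones run to those ~500 words):

    // One reusable system prompt per "colleague"; wording is illustrative.
    const personas = {
      dotnetRuntimeExpert:
        "You are a .NET runtime expert. Be terse, cite the relevant runtime " +
        "behavior, and push back when my assumptions look wrong.",
      releaseMentor:
        "You are an infrastructure and release mentor. Ask clarifying " +
        "questions before proposing a rollout plan, and flag risks explicitly.",
    } as const;

    // askPersona stands in for whatever chat API is being used.
    async function askPersona(
      ask: (system: string, user: string) => Promise<string>,
      who: keyof typeof personas,
      question: string,
    ) {
      return ask(personas[who], question);
    }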
The text interface can also be useful for skipping across complex documentation and/or learning. Example: you can ask GPT-4 to "decode 0xdf 0xf8 0x44 0xd0 (thumb 2 assembly for arm cortex-m)" => this will tell you what instruction is encoded, what it does and even how to cajole your toolchain into providing that same information.
If you are an experienced developer already, with a clear goal and understanding, LLMs tend to be less helpful in my experience (the same way that a mentor you could ask random bullshit would be more useful to a junior than a senior dev)
or it will hallucinate something that's completely wrong but you won't notice it
If you can beat Copilot in a typing race, then you're probably well within your comfort zone. It works best when you're working on things you're less confident about - typing speed doesn't matter when you have to stop to think.
LLM outputs aren't always perfect, but that doesn't stop them from being extremely helpful and massively increasing my productivity.
They help me to get things done with the tech I'm familiar with much faster, get things done with tech I'm unfamiliar with that I wouldn't be able to do before, and they are extremely helpful for learning as well.
Also, I've noticed that using them has made me much more curious. I'm asking so many new questions now; I had no idea how many things I was casually curious about, but not curious enough to google.
There is an old documentary about the final days of typesetters at newspapers. These were the (very skilled) people who rapidly placed each individual carved steel character block into the printing frame in order to print thousands of page copies. Many were incredulous that a machine could ever replicate their work.
I don't think programmers are going to go away, but I do think those juicy salaries and compensation packages will.
So the same programmer with the same 8-hour workday will be able to output more value.
Some will undoubtedly transition to broader business consultancy services. For those unable or unwilling to do so, the future is bleak.
I think that's inevitable with or without LLMs in the mix. I also think the industry as a whole will be better for it.