But many jobs are not like that. Imagine an AI nurse giving bad health advice on the phone. Somebody might die. Or an AI salesman making promises that are against company policy? The company is likely to be held legally liable, and may lose significant money.
For legal reasons, my company couldn't enable full LLM generative capabilities on the chatbot we use, because we would be legally responsible for anything it generates. Instead, the LLM is simply used to determine which of the pre-determined answers fits the query best, which it does well where more traditional technologies fail. But that's not revolutionary, just an improvement. I suspect there are many barriers like that hindering its use in many fields, even where it could work most of the time.
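Roughly this pattern, as a minimal sketch assuming an OpenAI-style chat API (model name, prompt wording, and canned answers are all made up for illustration):

    // LLM-as-router sketch: the model only picks among vetted, pre-approved
    // answers, so nothing it generates reaches the user directly.
    import OpenAI from "openai";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

    const cannedAnswers = [
      "You can reset your password from the account settings page.",
      "Our support team is available weekdays from 9am to 5pm.",
      "Refunds are processed within 5-7 business days.",
    ];

    async function routeQuery(userQuery: string): Promise<string | null> {
      const menu = cannedAnswers.map((a, i) => `${i}: ${a}`).join("\n");
      const resp = await client.chat.completions.create({
        model: "gpt-4o-mini",
        temperature: 0,
        messages: [
          {
            role: "system",
            content:
              "Pick the single pre-approved answer that best fits the " +
              "user's question. Reply with only its number, or -1 if none fit.",
          },
          { role: "user", content: `Answers:\n${menu}\n\nQuestion: ${userQuery}` },
        ],
      });
      const idx = parseInt(resp.choices[0].message.content ?? "-1", 10);
      return cannedAnswers[idx] ?? null; // null -> fall back to a human agent
    }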
So, nearly all use cases I can think of now will still require a human in the loop, simply because of the unreliability. That way it can be a productivity booster, but not a replacement.
The healthcare system has always killed plenty of people because humans are notoriously unreliable, fallible, etc.
It is such a stubborn, critical, and well-known issue in healthcare that I welcome AI being deployed slowly and responsibly to see what happens, because the situation hasn't significantly improved with everything else we've thrown at it.
This problem is not unique to AI; you see it with human medical professionals too. People are regularly misdiagnosed, or not diagnosed at all. At least with AI you could compare the results of different models almost instantly and get confirmation. An AI doctor also wouldn't miss information on a chart the way a human can.
> So, nearly all use cases I can think of now will still require a human in the loop, simply because of the unreliability. That way it can be a productivity booster, but not a replacement.
This is exactly what your parent said, yet you replied seemingly disagreeing. AI tools are here to stay and they do increase productivity, be it coding, writing papers, or strategizing. Those who continue to think of AI as not useful will be left behind.
Human in the loop can add reliability, but the most common use cases I'm seeing with AI are helping people see the errors they are making, or their lack of sufficient effort to solve the problem.
IME LLMs are great at giving you the experience of learning, in the same way sugar gives you the experience of nourishment
How much is "enough"? Neither I nor my coworkers have found LLMs to be all that useful in our work. Almost everybody has stopped bothering with them these days.
Its output depends on your input.
E.g. say you have the swagger documentation for an API and you want to generate a TypeScript type definition from it: you just copy-paste the docs into a comment above the type, and Copilot autofills your TypeScript type definition, even adding ? for properties that are not required.
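Something like this, with a made-up endpoint and fields:

    // GET /users/{id} response (docs pasted from swagger, fields illustrative):
    //   id        integer  required  Unique user id
    //   email     string   required  Primary email address
    //   nickname  string   optional  Display name
    //   lastSeen  string   optional  ISO 8601 timestamp
    type UserResponse = {
      id: number;
      email: string;
      nickname?: string; // optional properties get `?`
      lastSeen?: string;
    };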
If you clearly define the goal of a function in a JSDoc comment, you can implement very complex functions. E.g. you define it in steps, and in the function outline each step. This also helps your own thinking. With GPT-4o you can even draw diagrams in e.g. Excalidraw, or take screenshots of the issues in your UI, to complement your question relating to that code.
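For instance, a step-wise spec like this (function and domain invented for illustration) gives both you and the model a checklist to implement against:

    /**
     * Aggregates raw order rows into per-customer totals.
     * Steps:
     * 1. Drop rows with a missing or non-positive amount.
     * 2. Group the remaining rows by customerId.
     * 3. Sum the amounts within each group.
     * 4. Return the groups sorted by total, descending.
     */
    function totalsByCustomer(
      rows: { customerId: string; amount: number }[],
    ): { customerId: string; total: number }[] {
      const sums = new Map<string, number>();
      for (const row of rows) {
        if (!row.amount || row.amount <= 0) continue; // step 1
        sums.set(row.customerId, (sums.get(row.customerId) ?? 0) + row.amount); // steps 2-3
      }
      return [...sums.entries()]
        .map(([customerId, total]) => ({ customerId, total }))
        .sort((a, b) => b.total - a.total); // step 4
    }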
this really rings true for me. especially as a junior, I always thought one of my best skills was that I was good at Googling. I was able to come up with good queries and find some page that would help. Sometimes, a search would be simple enough that you could just grab a line of code right off the page, but most of the time (especially with StackOverflow) the best approach was to read through a few different sources and pick and choose what was useful to the situation, synthesizing a solution. Depending on how complicated the problem was, that process might have occurred in a single step or in multiple iterations.
So I've found LLMs to be a handy tool for making that process quicker. It's rare that the LLM will write the exact code I need - though of course some queries are simple enough to make that possible. But I can sort of prime the conversation in the right direction and get into a state where I can get useful answers to questions. I don't have any particular knowledge on AI that helps me do that, just a kind of general intuition for how to phrase questions and follow-ups to get output that's helpful.
I still have to be the filter - the LLM is happy to bullshit you - but that's not really a sea change from trying to Google around to figure out a problem. LLMs seem like an overall upgrade to that specific process of engineering to me, and that's a pretty useful tool!
Yeah, but there are other ways to think through problems, like asking other people what they think, which you can evaluate based on who they are and what they know. GPT is like getting advice from a cross-section of everyone in the world (and you don't even know which one), which may be helpful depending on the question and the "people" answering it, but it may also be extraordinarily unhelpful, especially for very specialized tasks (and specialized tasks are where the profit is).
Like most people, I have very specific knowledge of a few things that fewer than 100 people in the world know better than me, but that thousands or even millions more have some poorly conceived general idea about.
If you asked GPT a question in that area, it would be biased toward those millions, the statistically greater quantitative answer over the qualitative one. But maybe GPT only has a few really good sources in its training data that it uses for its response, and then it's extremely helpful, because it's like accidentally landing on a Stack Overflow response by some crazy genius who reads all day, lives out of a van in the woods, and uses public library computers to answer queries in his spare time. But that's sheer luck, and no more so than a regular search will get you.
Also, you can look into Cursor.
There are actually quite a few tools.
I have my own agent framework in progress that has many plugins with different commands, including reading directories, tree, reading and writing files, running commands, and reading spreadsheets. So I can tell it to read all the Python in a module directory, run a test script, and compare the output to a spreadsheet tab. Then I ask it to come up with ideas for making the Python code match the spreadsheet better, and have it update the code and rerun the tests iteratively until it's satisfied.
If I am honest about that particular process last night, I am going to have to go over the spreadsheet manually to some degree today, because neither GPT-4o nor Claude 3.5 Sonnet was able to get the numbers to match exactly.
It's a somewhat complicated spreadsheet in a domain I know nothing about and am just grudgingly learning. I think the agent got me 95% of the way through the task.
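For illustration only, a stripped-down version of that read/run/compare loop might look like the following; the tool names and dispatch shape are hypothetical, not the actual framework:

    import { execSync } from "node:child_process";
    import { readFileSync, writeFileSync, readdirSync } from "node:fs";

    type ToolCall = { tool: string; args: Record<string, string> };

    // The model replies with JSON tool calls; we execute them and feed the
    // results back into the transcript.
    function runTool(call: ToolCall): string {
      switch (call.tool) {
        case "list_dir":
          return readdirSync(call.args.path).join("\n");
        case "read_file":
          return readFileSync(call.args.path, "utf8");
        case "write_file":
          writeFileSync(call.args.path, call.args.content);
          return "ok";
        case "run_command":
          return execSync(call.args.command, { encoding: "utf8" });
        default:
          return `unknown tool: ${call.tool}`;
      }
    }

    // callModel stands in for whatever chat API is being used; the loop stops
    // when the model says it is satisfied or the step budget runs out.
    async function agentLoop(callModel: (transcript: string) => Promise<string>) {
      let transcript = "Goal: make the Python module's output match the spreadsheet.";
      for (let step = 0; step < 20; step++) {
        const reply = await callModel(transcript);
        if (reply.includes("DONE")) break;
        const call = JSON.parse(reply) as ToolCall;
        transcript += `\n${call.tool} -> ${runTool(call)}`;
      }
    }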
I have Copilot suggestions bound to an easy hotkey to turn them on or off. If I'm writing code that's entirely new to the code base, I toggle the suggestions off; they'll be mostly useless. If I'm following a well-established pattern, even a complicated one, I turn them on; they'll be mostly good. When writing tests in C#, I reflexively give the test a good name and write a tiny bit of the setup, then Copilot will usually be pretty good about the rest. I toggle it multiple times an hour; it's about knowing when it'll be good, and when not.
Beyond that, I get more value from interacting with the LLM by chat. It's important to have preconfigured personas, and it took me a good 500 words and some trial and error to set them up and get their interaction styles where I need them to be. There's the ".NET runtime expert", the "infrastructure and release mentor", and so on. As soon as I feel the least bit stuck or unsure, I consult one of them, possibly in voice mode while going for a little walk. It's like having the right colleague always available to talk something through, and I now rarely find myself spinning my wheels, bike-shedding, or what have you.
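The setup can be as simple as one named system prompt per persona. A minimal sketch (prompts abbreviated here; the real ones run to those ~500 words):

    // One reusable system prompt per "colleague"; wording is illustrative.
    const personas = {
      dotnetRuntimeExpert:
        "You are a .NET runtime expert. Be terse, cite the relevant runtime " +
        "behavior, and push back when my assumptions look wrong.",
      releaseMentor:
        "You are an infrastructure and release mentor. Ask clarifying " +
        "questions before proposing a rollout plan, and flag risks explicitly.",
    } as const;

    // askPersona stands in for whatever chat API is being used.
    async function askPersona(
      ask: (system: string, user: string) => Promise<string>,
      who: keyof typeof personas,
      question: string,
    ) {
      return ask(personas[who], question);
    }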
The text interface can also be useful for skipping across complex documentation and/or learning. Example: you can ask GPT-4 to "decode 0xdf 0xf8 0x44 0xd0 (thumb 2 assembly for arm cortex-m)" => this will tell you what instruction is encoded, what it does and even how to cajole your toolchain into providing that same information.
If you are an experienced developer already, with a clear goal and understanding, LLMs tend to be less helpful in my experience (the same way that a mentor you could ask random bullshit would be more useful to a junior than a senior dev)
or it will hallucinate something that's completely wrong but you won't notice it
If you can beat Copilot in a typing race, then you're probably well within your comfort zone. It works best when you're working on things you're less confident about - typing speed doesn't matter when you have to stop to think.
LLM outputs aren't always perfect, but that doesn't stop them from being extremely helpful and massively increasing my productivity.
They help me to get things done with the tech I'm familiar with much faster, get things done with tech I'm unfamiliar with that I wouldn't be able to do before, and they are extremely helpful for learning as well.
Also, I've noticed that using them has made me much more curious. I'm asking so many new questions now; I had no idea how many things I was casually curious about, but not curious enough to google.
There is an old documentary about the final days of typesetters at newspapers. These were the (very skilled) people who rapidly placed each individual carved steel character block into the printing frame in order to print thousands of page copies. Many were incredulous that a machine could ever replicate their work.
I don't think programmers are going to go away, but I do think those juicy salaries and compensation packages will.
So the same programmer with the same 8-hour workday will be able to output more value.
Some will undoubtedly transition to broader business consultancy services. For those unable or unwilling to do so, the future is bleak.
I think that's inevitable with or without LLMs in the mix. I also think the industry as a whole will be better for it.