The fact that it works is insufficient proof that it was the right thing to do. Building a habit of relying on LLM-generated code is an inherently risky practice, and ChatGPT will literally warn you against trusting its outputs. Sure, it lets you growth-hack your way through short-term problems, but in the long term I’m not convinced this is responsible decision-making at the current level of LLM technology.
Or maybe I’m just a Luddite, stuck in my old ways.
People who write code also make mistakes, yet we don't consider that an "inherently risky practice". We just review others' code, tweak it, make it more appropriate for prod, and voilà. The same thing applies here.
Nice caveat, doing a heck of a lot of heavy lifting. I understand that we're talking about coders and sort of have this inferred impression that coders will have this understanding, but... that's an awfully broad brush you've used to paint over the simple fact that most people using LLMs (in general) do not understand this.
You’re using this as a soapbox to cast holier-than-thou, elitist aspersions on an imagined Everyone Else that isn’t as enlightened as you.
This exact “LLMs bad and here’s why!” thread of conversation is getting so old. The fact that it always has the same few talking points is evidence enough that those indulging themselves in it have been party to these conversations before. They know how it goes. And now it’s their turn to say the same old tired and in this case largely irrelevant things to sound smart and to pat themselves on the back.
And why does that matter? Plenty of people "use JavaScript without a fundamental understanding of the language's inner workings" and things are fine (not). My point is, people have always misunderstood the TOOLS they use, but I didn't see the same kind of rejection before. Yes, some people use LLMs thinking it's the end of programming, but you can become way, way more productive as a programmer if you use one as a TOOL. The other day I used it to create a simple Python function to generate N distinct colors. It seems to work, and it even suggested using hue instead of RGB since it's better for perceptual distinctness (I looked it up, and it is), so I just used it. Should I spend a week going on a deep dive into human eye perception and reading articles on this?
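The comment doesn't show the generated function, but the usual approach is the one it hints at: space hues evenly around the HSV color wheel. A minimal sketch using Python's stdlib `colorsys` (the function name and float-tuple output are my choices, not from the original comment):

```python
import colorsys

def distinct_colors(n):
    """Generate n visually distinct RGB colors by spacing hues
    evenly around the HSV wheel at full saturation and value."""
    return [
        colorsys.hsv_to_rgb(i / n, 1.0, 1.0)  # hue cycles through [0, 1)
        for i in range(n)
    ]

# Each color is an (r, g, b) tuple of floats in [0, 1].
print(distinct_colors(3))  # [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
```

Spacing in hue rather than in raw RGB is exactly the suggestion the commenter describes: equal hue steps look roughly equally different to the eye, whereas equal RGB steps do not.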
How do you know most people using LLMs are not understanding this?
"Your job will not be taken by an AI. Your job will be taken by someone assisted by an AI."
The process touched on in the article, with thorough review before commit by a human with in-depth experience of the language and APIs and the domain in question, is exactly how AI-generated code should be incorporated into a workflow. The earlier slander against the author's technical ability seems misguided and unsupportable.
As long as you keep the scope small ("Write some example code that calls $API in Python", "Make it asynchronous, so I can queue up n calls and execute them in parallel"), it generates perfectly good code that is easy to understand too.
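The second prompt above typically yields something like the following `asyncio` pattern. This is a sketch of that shape, not the commenter's actual output; `call_api` is a placeholder standing in for the real `$API` call, with `asyncio.sleep` simulating network latency:

```python
import asyncio

async def call_api(i):
    # Placeholder for a real API call (e.g., an HTTP request).
    await asyncio.sleep(0.01)  # simulate network latency
    return f"result-{i}"

async def run_parallel(n):
    # Queue up n calls and execute them concurrently.
    tasks = [call_api(i) for i in range(n)]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_parallel(5))
print(results)  # ['result-0', 'result-1', 'result-2', 'result-3', 'result-4']
```

`asyncio.gather` preserves the order of the input coroutines, so the results line up with the queued calls even though they run concurrently.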
If a company prevents me from using ChatGPT, I will use it clandestinely unless they offer an equivalent. There's no going back.
Of course not, that’s ridiculous. You probably searched, read a few Stack Overflow comments, found a relevant GitHub repo, a library for Python/language of choice, and probably also a SAAS offering solely focused on the 3 lines of code you need. You quickly parsed all that and decided to modify some code in one of the SO comments for your needs. Next time, you looked past half the junk, went straight to the first SO result, and were able to tweak and use it. The next time, it didn’t help directly, but it did help you write some inspired custom code for the problem; at least you knew what not to try.
My point being: AI is useful. It’s not meant to be a “first result is the final answer” kind of solution; if that’s how you use it, you will have issues.
I’m (not OP!) a cloud engineer but also work on a lot of FE (React) code for internal tools. ChatGPT has saved me countless hours (literally tens a month) writing super simple code that I am able to easily write myself, but typing it out just takes time. After months of using it, I find myself still quite excited whenever cGPT saves me another hour. We also use Retool, but I find myself writing code ‘myself’ more often since cGPT launched.
No, I wouldn’t just copy paste production code handling PII, but prototyping or developing simple tools is sooooo much faster, for me.
The problem with chatGPT usage is not imperfect code. The problem, when there is one, is not treating its code the way one would treat a human’s.
But in regards to production code, I agree. When code is committed to a codebase, a human should review it. Assuming you trust your review process, it shouldn't matter whether the code submitted for review was written by a human or a language model. If it does make a difference, then your review process is already broken. It should catch bad code regardless of whether it was created by human or machine.
It's still worth knowing the source of commits, but only for context in understanding how it was generated. You know humans are likely to make certain classes of error, and you can learn to watch out for the blind spots of your teammates, just like you can learn the idiosyncrasies and weak points of GPT generated code.
Personally, I don't think we're quite at "ask GPT to commit directly to the repo," but we're getting close. The constant refrain of "try GPT-4" has become a trope, but the difference is immediately noticeable. Whereas GPT-3.5 will make a mistake or two in every 50-line file, GPT-4 is capable of producing fully correct code that you can immediately run successfully. At the moment it works best for isolated prompts like "create a component to do X" or "write a script to do Y," but if you can provide it with the interface to call an external function, then suddenly that isolated code is just another part of an existing system.
As tooling improves for working collaboratively with large language models and providing them with realtime contextual feedback on code correctness (especially for statically analyzable or type-checked languages), they will become increasingly indispensable to the workflow of productive developers. If you haven't used Copilot yet, I encourage you to try it for at least a month. You'll develop an intuition for what it's capable of and will eventually wonder how you ever coded without it. Also make sure to try prompting GPT-4 to create functions, components, or scripts. The results are truly surprising and exciting.
It can be hard to remember though when there are unrealistic deadlines and helping someone inexperienced to do the work is twice the effort.
ChatGPT and Copilot are like intern software devs who can produce code in seconds. They generate code that's usually close, but not always correct, trading the time you'd spend typing the whole thing for the time spent verifying correctness.
For critical and complex algorithms, it's not worth using ML coding assistants right now, but it will be in the future. It's obvious that's where this is headed: massive efficiency gains for non-technical and barely-technical people, a decline in demand for software engineers, and with it a decline in software engineering salaries.
The root of the problem here is people making production stuff who don't know wtf they're doing. If they turn to SO posts, LLMs, or "developers" on fiverr/upwork doing the same thing, is there really much of a difference? LLMs seem to mostly be tightening the loop of horror that's already been happening.
Same downward trajectory, increased velocity.
I guess, to your point, it's only trouble if the margarita mixer guy is put in charge of something that matters? :D
(might be a bad example, I've known some fine engineers and mechanics that are absolutely margarita mixer guy, but hopefully my point is taken lol)
I'd love to know about the nirvana you've been in up till now, because having worked around code from numerous large companies, the vast majority of it is the crappest-ass crap straight from the crapper with no redeeming qualities, and it has been this way forever. I'm not saying there aren't good parts; there are core routines where the sheer need for them to be performant and non-data-corrupting forced some Sr engineer to fix them.