I've definitely given it a try for some of that, but the truth is I just don't have that many greenfield projects. Most of my work is maintaining existing code, and Copilot/GPT-4 haven't proven as useful to me for that.
When it comes to new, greenfield projects, I really do try to be ambitious, and that tends to get me into trouble. I tried having it generate code for something where I needed a slightly tricky data structure maneuver. It was a nicely self-contained problem, pretty much ideal for this sort of thing, but it still generated code that didn't compile and wasn't optimal.
To be fair, it was kinda close, which is great. But overall it wasn't worth the time: I'm pretty slow, but most of my time is NOT spent implementing basic things, it's spent figuring out what I need to implement in the first place. So I also tried spitballing design ideas with GPT-4 a bit, to see whether it could help there. Truth is, it's hit-or-miss.
Here's my take:
- For programming tasks, it just skips too much. I think the future of LLMs doing complex tasks will depend heavily on huge context windows and approaches like LangChain. That said, today you still can't really get something that resembles a "self-driving" robot programmer, because it's lacking the necessary depth.
- For language tasks, it's fantastic. It doesn't feel like it's skipping anything; it feels like it has a genuine understanding of language. I'm not sure we're there yet, but I'm hopeful for the future of machine translation using LLMs: you can't tell Google Translate about the context your string will be used in, but you CAN tell GPT-4. While I'm not fluent in any language aside from English, from cross-checking its work I think it's at least a lot better than what you can currently get from DeepL or Google Translate for common languages. (To be fair, even LLaMA finetunes have been SHOCKINGLY useful for this use case in my experience. They still hallucinate a bit too much to be relied on, but the fact that the hallucinations usually lie close to the right answer leaves me impressed, given that it's a several-GiB bundle of data on my SSD running on CPU only.)
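For what it's worth, the "give the translator context" bit is easy to sketch. This is a hypothetical example of my own; the function name and prompt wording aren't from any real library, and you'd feed the resulting string to whatever chat model you're using:

```python
# Hypothetical sketch: none of these names come from a real API.
# The point is just that an LLM prompt has room for usage context
# that a traditional MT box like Google Translate has no slot for.

def build_translation_prompt(text: str, target_language: str, context: str) -> str:
    """Build a single prompt that carries the string plus its usage context."""
    return (
        f"Translate the following UI string into {target_language}.\n"
        f"Context: {context}\n"
        f'String: "{text}"\n'
        "Reply with only the translation."
    )

prompt = build_translation_prompt(
    text="Save",
    target_language="German",
    context="Button label in a file-editing app; imperative, keep it very short.",
)
print(prompt)
```

The context line is what lets the model pick, say, the right register or a short imperative form instead of a literal dictionary gloss.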
Also, GPT-4 is just going to be epic for people learning new things. I'm genuinely excited about being able to auto-generate some example code for a thing I'm not familiar with. Even if the output is nearly useless, it can probably get you pointed in the right direction and teach you what terms you need to google for.