The largest models like GPT-4 have the interesting property of really,
really finishing what you started. If your prompt contains flaws of any kind, they will faithfully continue producing them. The inverse is true as well.
This is a documented effect[1], and it's something larger models are actually worse (better?) at: they score higher and higher on the loss function (did I predict the next token correctly?), while their utility (does the output actually work?) goes down.
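A toy way to see the loss/utility split (my own sketch, not from the paper): a model that perfectly predicts a flawed training text achieves zero loss, yet greedy decoding from it reproduces the flaw verbatim.

```python
from collections import defaultdict
import math

# Training text contains a bug: the function subtracts instead of adding.
corpus = "def add(a, b): return a - b".split()

# "Train" a bigram language model by counting next-word frequencies.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict(prev):
    # Greedy decoding: pick the most frequent continuation.
    return max(counts[prev], key=counts[prev].get)

def avg_loss():
    # Average cross-entropy of the model on its own training text.
    total = 0.0
    for prev, nxt in zip(corpus, corpus[1:]):
        p = counts[prev][nxt] / sum(counts[prev].values())
        total += -math.log(p)
    return total / (len(corpus) - 1)

# Prediction is "perfect"...
print(avg_loss())  # 0.0 — every next word is predicted with probability 1
# ...but the model faithfully completes the buggy prefix.
print(predict("return"), predict("a"))  # a -
```

Scaled-up models do the same thing with far more capacity: better prediction of the prompt's distribution, flaws included.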
Just thought it was noteworthy.
[1] https://arxiv.org/abs/2102.03896