Code either runs or it doesn't... but running isn't the same as doing what you actually want.
An LLM could easily generate code that concatenates raw user input straight into a SQL string. Does it run? Yeah. Is it a textbook SQL injection vulnerability? Also yeah.
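Here's a minimal sketch of what that looks like in Python (the `users` table and the `sqlite3` setup are just for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 1), ('bob', 0)")

def find_user_unsafe(name: str):
    # The kind of code an LLM might produce: user input spliced
    # directly into the query string. "Works" for normal names,
    # but crafted input rewrites the query itself.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(name: str):
    # Parameterized query: the driver treats `name` as data, not SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

print(find_user_unsafe("alice"))        # [('alice',)] -- looks fine
print(find_user_unsafe("' OR '1'='1"))  # dumps every row: injection
print(find_user_safe("' OR '1'='1"))    # [] -- the input stays inert
```

Both versions run without error. Only one of them is the code you wanted.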
And even setting security aside: if you're after a particular UX and the LLM can't get there, working code still isn't a success.