OpenAI Models Dominate Structured Code Edit Benchmark

(blog.mentat.ai)

5 points · biobootloader · 2y ago · 1 comment

1 comment

I saw the same thing with Mailogy.

I only tested variants of gpt-3.5 and gpt-4, but got invalid-syntax errors roughly 50% of the time with 3.5 and virtually none with 4.
