
0 points · thomashabets2 · 6d ago · 0 comments

I pointed plain old GPT 5.5 at OpenBSD and found plenty of bugs.

Sent patches for two just in "find".

OpenBSD, like all other projects, needs a large-scale, LLM-powered bug-squashing effort.

My recent experience: https://blog.habets.se/2026/05/Everything-in-C-is-undefined-...

tiffanyh · 5d ago

> This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview after a thousand runs through our scaffold. Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings.

Anthropic did that for OpenBSD.

https://red.anthropic.com/2026/mythos-preview/

thomashabets2 OP · 5d ago

I know.

I'm saying you don't even need Mythos to find bugs in OpenBSD. GPT 5.5 is SO much better than humans at finding these things.

The fact that we don't even need Mythos, or $20k (I just pay $24/month and this was one of my MANY uses), to find bugs in OpenBSD shatters the dream that any human, given enough expertise, dedication, and time, can write C properly.

jjav · 5d ago

> (I just pay $24/month and this was one of my MANY uses), to find bugs in OpenBSD

Bugs, or exploitable security vulnerabilities?

If the latter, have you reported them all?

thomashabets2 OP · 5d ago

As I mentioned in the post, I only did a brief exploration of OpenBSD in order to cheer myself up. I took some findings, confirmed that they were true bugs, and ended there.

As I said in the patch for the out-of-bounds null-termination write, I don't believe it's exploitable. If it were, I would have gotten a CVE, website, and logo (kidding!). But it was UB. And one-byte overflows have in the past been exploitable by better sploit authors than me.
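For readers unfamiliar with this bug class, here is a minimal sketch of how a one-byte out-of-bounds null-termination write typically arises and how it is avoided. This is not the OpenBSD `find` code or the patch; `copy_terminated` and its parameters are hypothetical names made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT taken from OpenBSD: copy up to len bytes of
 * src into dst, which has room for cap bytes, and NUL-terminate it.
 * Returns the number of bytes copied, excluding the terminator.
 */
size_t
copy_terminated(char *dst, size_t cap, const char *src, size_t len)
{
	size_t n;

	if (cap == 0)
		return 0;	/* no room even for the terminator */

	/*
	 * The classic mistake is clamping to cap instead of cap - 1:
	 *     n = len < cap ? len : cap;
	 *     dst[n] = '\0';      // writes dst[cap] when len >= cap
	 * That is a one-byte write past the end of dst, which is UB.
	 */
	n = len < cap - 1 ? len : cap - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';		/* always within dst[0..cap-1] */
	return n;
}
```

With a 4-byte buffer and a 6-byte input, this version copies 3 bytes and puts the terminator in the last valid slot rather than one past it.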

In any case, I reported that since I felt it was clear that OpenBSD folks would obviously care about it, exploitable or not.

Confirming these findings takes time, even though I found GPT to almost always be correct. I will NOT report upstream until I understand the bug. I ain't no slop reporter. As I said in the post, OpenBSD (and every other code base) needs a larger effort. The Mythos/Glasswing effort, which focuses on actually exploitable bugs, may be a good method for getting them fixed without overwhelming projects with patches, even when the patches are correct.

I did confirm at least one more UB, and did consider whether to report that OpenBSD `find` reads `status` via `WIFEXITED(status)` without checking `waitpid()` for errors. This is UB since `status` is uninitialized. (https://github.com/openbsd/src/blob/ae684bfaed6cae797cd90e27...)
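The pattern being described can be sketched as follows: check the return value of `waitpid()` before handing `status` to any of the `W*` macros. This is an illustration under my own assumptions, not the OpenBSD source or a proposed patch; `run_and_wait` is a hypothetical helper.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, NOT OpenBSD's code: run argv[0] with argv and
 * return its exit code, or -1 if it could not be run or waited for.
 */
int
run_and_wait(char *const argv[])
{
	int status;	/* deliberately uninitialized, as in the report */
	pid_t pid, w;

	pid = fork();
	if (pid == -1) {
		perror("fork");
		return -1;
	}
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* only reached if exec failed */
	}

	/* Retry if interrupted by a signal. */
	do {
		w = waitpid(pid, &status, 0);
	} while (w == -1 && errno == EINTR);

	if (w == -1) {
		/* status was never written, so it must not be read. */
		perror("waitpid");
		return -1;
	}

	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	return -1;	/* terminated by a signal or otherwise abnormal */
}
```

The point is only the ordering: `status` is read after `waitpid()` is known to have succeeded, so there is no path on which an uninitialized value reaches `WIFEXITED()`.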

The reason I hesitated is my previous experience with OpenBSD, where the reply may be "<some standard> is wrong in this regard", and because they control their whole system, they don't care. E.g. in this case they may go "we build with GCC x.y.z exactly, and we know what actually happens in this controlled domain". This may be a bit unfair to them, but not by much.

GPT also flagged the extremely surprising behavior of `cat -n file1 file2` when `file1` doesn't end with a newline. And that `find /etc/passwd -execdir[…]` doesn't run the command. But maybe that's how they want it? I don't want to go through the whole thing only for them to go "yeah we won't do that" again. So I think this project is for them. GPT is as available to them as it is to me.

Tangent: in running GPT against `cat` I learned that not only is `cat -n` not standardized, but it also behaves COMPLETELY differently than on Linux if you provide more than one file.
