I sent patches for two, just in `find`.
OpenBSD, like all other projects, needs a large-scale, LLM-powered bug-squashing effort.
My recent experience: https://blog.habets.se/2026/05/Everything-in-C-is-undefined-...
Anthropic did that for OpenBSD.
I'm saying you don't even need Mythos to find bugs in OpenBSD. GPT 5.5 is SO much better than humans at finding these things.
The fact that we don't even need Mythos, or $20k (I just pay $24/month, and this was one of my MANY uses), to find bugs in OpenBSD shatters the dream that any human can write C properly, given enough expertise, dedication, and time.
Bugs, or exploitable security vulnerabilities?
If the latter, have you reported them all?
As I said in the patch for the out-of-bounds null-termination write, I don't believe it's exploitable. If it were, I would have gotten a CVE, website, and logo (kidding!). But it was UB, and one-byte overflows have in the past been exploited by better sploit authors than me.
In any case, I reported that one since I felt the OpenBSD folks would obviously care about it, exploitable or not.
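For readers unfamiliar with the bug class, here is a generic sketch of a one-byte out-of-bounds terminator write and a bounded alternative. This is NOT the actual `find` code or patch; `copy_terminated` and its parameters are hypothetical names made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * The buggy pattern looks like:
 *     char buf[N];
 *     memcpy(buf, src, len);   // with len allowed to equal N
 *     buf[len] = '\0';         // writes buf[N]: one byte past the end
 *
 * This version copies at most dstsize - 1 bytes, so the terminator
 * always lands inside the buffer. Returns the number of bytes copied.
 */
size_t
copy_terminated(char *dst, size_t dstsize, const char *src, size_t srclen)
{
	size_t n;

	if (dstsize == 0)
		return 0;	/* no room even for the terminator */

	n = srclen < dstsize - 1 ? srclen : dstsize - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';		/* n <= dstsize - 1, so this is in bounds */
	return n;
}
```

The design choice is to truncate rather than fail; a real fix might prefer to report an error instead.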
Confirming these findings takes time, even though I found GPT to be almost always correct. I will NOT report upstream until I understand the bug. I ain't no slop reporter. As I said in the post, OpenBSD (and all other code bases) needs a larger effort. The Mythos/Glasswing effort, focusing on the actually exploitable ones, may be a good method for getting them fixed without overwhelming projects with patches, even when the patches are correct.
I did confirm at least one more UB, and considered whether to report it: OpenBSD `find` reads `status` via `WIFEXITED(status)` without checking `waitpid()` for errors. This is UB since `status` is left uninitialized if `waitpid()` fails. (https://github.com/openbsd/src/blob/ae684bfaed6cae797cd90e27...)
The reason I hesitated is my previous experience with OpenBSD, where the reply may be "<some standard> is wrong in this regard", and because they control their whole system, they don't care. E.g. in this case they may go "we build with GCC x.y.z exactly, and we know what actually happens in this controlled domain". This may be a bit unfair to them, but not by much.
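The `waitpid()` pattern in question, as a minimal sketch rather than OpenBSD's actual code: `run_and_wait` is a hypothetical helper, and the choice of -1 as the error return and 127 for a failed exec are assumptions made for the example. The point is only that `status` is inspected after `waitpid()` is known to have succeeded.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork, exec argv, and return the child's exit
 * code, or -1 if the child could not be started or waited for.
 */
int
run_and_wait(char *const argv[])
{
	pid_t pid, r;
	int status;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* only reached if exec failed */
	}

	/* Retry when interrupted by a signal. */
	do {
		r = waitpid(pid, &status, 0);
	} while (r == -1 && errno == EINTR);

	if (r == -1)
		return -1;	/* status was never written: do not read it */

	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	return -1;		/* e.g. the child was killed by a signal */
}
```

Calling it with `{"sh", "-c", "exit 3", NULL}` should return 3, assuming `sh` is on the `PATH`.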
GPT also flagged the extremely surprising behavior of `cat -n file1 file2` when `file1` doesn't end with a newline, and that `find /etc/passwd -execdir[…]` doesn't run the command. But maybe that's how they want it? I don't want to go through the whole thing just for them to go "yeah we won't do that" again. So I think this project is for them. GPT is as available to them as it is to me.
Tangent: in running GPT against `cat` I learned that not only is `cat -n` not standardized, it also behaves COMPLETELY differently from Linux when you provide more than one file.