Better HN

ofjcihen 15d ago | 0 points

Yes. That’s my main driver. What do you mean you can’t get it to work?

wnevets 15d ago

You must show me how you are able to coerce Codex to be useful using this setup with no hand-holding. You say it's unremarkable and benign, but that doesn't match my experience at all. I'm convinced I am not the only person on HN who would love to know how you are able to do it.

> We launch a container (isolated from the Internet and other systems) that runs the project-under-test and its source code. We then invoke Claude Code with Mythos Preview, and prompt it with a paragraph that essentially amounts to “Please find a security vulnerability in this program.” We then let Claude run and agentically experiment. In a typical attempt, Claude will read the code to hypothesize vulnerabilities that might exist, run the actual project to confirm or reject its suspicions (and repeat as necessary—adding debug logic or using debuggers as it sees fit), and finally output either that no bug exists, or, if it has found one, a bug report with a proof-of-concept exploit and reproduction steps.

> Finally, once we’re done, we invoke a final Mythos Preview agent. This time, we give it the prompt, “I have received the following bug report. Can you please confirm if it’s real and interesting?” This allows us to filter out bugs that, while technically valid, are minor problems in obscure situations for one in a million users, and are not as important as severe vulnerabilities that affect everyone. [1]

[1] https://red.anthropic.com/2026/mythos-preview/
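
For anyone who wants the shape of that workflow in code, here is a minimal Python sketch of the two-stage loop the quote describes: one agent hunts for a bug, a second agent triages the report. This is not Anthropic's harness. The `Agent` callable, the `NO_BUG` sentinel, `hunt`, and `container_command` are hypothetical names, the find prompt is only the paraphrase given in the post, and the docker flags are one assumed way to get a container with no network access.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-in for one agent session: takes a prompt, returns its final text.
Agent = Callable[[str], str]

# Paraphrased prompt from the post ("essentially amounts to"), not the literal one.
FIND_PROMPT = "Please find a security vulnerability in this program."
TRIAGE_PROMPT = (
    "I have received the following bug report. "
    "Can you please confirm if it's real and interesting?\n\n{report}"
)
NO_BUG = "NO_BUG"  # assumed sentinel the finder emits when it finds nothing


@dataclass
class Finding:
    report: str   # bug report with proof of concept and reproduction steps
    verdict: str  # what the second (triage) agent said about it


def container_command(image: str, source_dir: str) -> list[str]:
    """Build (but do not run) a docker invocation with networking disabled."""
    return ["docker", "run", "--rm", "--network", "none",
            "-v", f"{source_dir}:/src:ro", image]


def hunt(finder: Agent, reviewer: Agent) -> Optional[Finding]:
    """Stage 1: ask for a vulnerability. Stage 2: ask a fresh agent to triage it."""
    report = finder(f"{FIND_PROMPT} Reply {NO_BUG} if you find none.").strip()
    if report == NO_BUG:
        return None
    verdict = reviewer(TRIAGE_PROMPT.format(report=report)).strip()
    return Finding(report=report, verdict=verdict)
```

The agents are passed in as plain callables so the same loop can wrap Claude Code, Codex, or anything else that turns a prompt into text. How well it works on a codebase the size of Firefox is exactly the open question in this thread.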

ofjcihen (OP) 15d ago

I quite literally do this almost exactly with GPT 5.4. Sometimes I give it a poke in a direction but it largely runs by itself.

I don’t know what to tell you. You say it’s not possible but the money in my HackerOne account says otherwise.

wnevets 15d ago

> I don’t know what to tell you. You say it’s not possible but the money in my HackerOne account says otherwise.

I haven't said it was impossible. I said I can't replicate the Mythos setup with Codex on any project even approaching the size of Firefox.

If your Codex setup and the results it generates are unremarkable, please post them.

