I'd be keen to hear more about the Unikraft setup and other deeper details of the agent sandboxes: the tradeoffs and optimizations made. All the components are there, but has someone open-sourced a more plug-and-play setup like this?
We haven't open-sourced the control plane glue yet but it's something we're thinking about. browser-use itself is open source. The sandbox infra on top is the proprietary part for now.
https://github.com/unikraft/cli
Feedback is very much appreciated, we're listening! :)
Essentially it's just: remove .py files and execute del os.environ["SESSION_TOKEN"]? This doesn't really sound very secure; there are a number of ways to bypass both of these.
It’s just security through obscurity
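To make the bypass concrete: on Linux, deleting a variable from os.environ only mutates the process's live environment, while the environment the process was exec'd with remains readable from /proc/self/environ. A minimal sketch (the token name and value are hypothetical, and this is Linux-only):

```python
import os
import subprocess
import sys

# The child simulates the sandbox's "env stripping" step, then reads back the
# kernel's exec-time environment record via /proc (Linux-only).
child = r"""
import os
del os.environ["SESSION_TOKEN"]                # the env-stripping step
raw = open("/proc/self/environ", "rb").read()  # exec-time snapshot
print("SESSION_TOKEN" not in os.environ, b"SESSION_TOKEN=" in raw)
"""

env = dict(os.environ, SESSION_TOKEN="hypothetical-secret")
out = subprocess.run([sys.executable, "-c", child],
                     env=env, capture_output=True, text=True)
print(out.stdout.strip())  # on Linux: True True -> gone from os.environ,
                           # still present in the exec-time record
```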
The actual security model is the architecture itself: the sandbox runs in its own VM inside a private VPC. It has no AWS keys, no database credentials, no LLM API tokens. The only thing it can do is talk to the control plane, which validates every request and scopes every operation to that one session.
So even if you bypass all three hardening steps, you get a session token that only works inside that VPC, talking to a control plane that only lets you do things scoped to your own session. There's nothing to escalate to.
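A rough sketch of what that per-session scoping might look like; all names here (SESSIONS, handle_request, the action names) are illustrative, not browser-use's actual API:

```python
# Hypothetical control-plane check: every request carries a token, and the
# token is only valid for its own session and a fixed set of actions.
SESSIONS = {
    "tok-abc": {"session_id": "sess-1", "allowed": {"navigate", "screenshot"}},
}

def handle_request(token: str, action: str, session_id: str) -> dict:
    sess = SESSIONS.get(token)
    if sess is None:
        raise PermissionError("unknown token")
    if session_id != sess["session_id"]:
        raise PermissionError("token not valid for this session")
    if action not in sess["allowed"]:
        raise PermissionError(f"action {action!r} not permitted")
    return {"ok": True, "action": action, "session": session_id}
```

Even a leaked token buys nothing outside its own session: the same token against another session_id is rejected at the control plane.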
The bytecode removal, privilege drop, and env stripping are just there to make the agent's life harder if it tries to inspect its own runtime. Not the security boundary.
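Those three steps, very roughly. This is a sketch under assumptions: the workdir layout, the token name, and uid 65534 ("nobody") are all placeholders:

```python
import glob
import os

def harden(workdir: str) -> None:
    """Best-effort hardening sketch: source/bytecode removal, env stripping,
    privilege drop. Friction, not a security boundary."""
    # 1. Remove orchestrator source and bytecode so the agent can't read it.
    for pattern in ("*.py", "*.pyc"):
        for path in glob.glob(os.path.join(workdir, "**", pattern),
                              recursive=True):
            os.remove(path)
    # 2. Strip the secret from the live environment (bypassable, per above).
    os.environ.pop("SESSION_TOKEN", None)
    # 3. Drop root privileges if we have them (gid before uid; order matters,
    #    since a non-root uid can no longer change its gid).
    if os.getuid() == 0:
        os.setgid(65534)
        os.setuid(65534)
```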
The problem is not what the LLM shouldn't have access to, it's what it does have access to.
The usefulness of LLMs is severely limited while they lack the ability to separate instructions from data or, as Yann LeCun put it, to predict the consequences of their actions.
Of all the problems in agent security, sandboxing solves the easiest problem.
The missing layer is pre-installation scanning. Runtime isolation + supply chain vetting together is the real answer.
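For the pre-install side, even a blunt allowlist gate in front of the package installer illustrates the idea; the package names, pinned versions, and the vet() helper are all made up for this sketch:

```python
# Hypothetical pre-install gate: a requirement is only handed to the
# installer if it resolves to a vetted name==version pair.
ALLOWLIST = {
    "requests": {"2.31.0", "2.32.3"},
    "numpy": {"1.26.4"},
}

def vet(requirement: str) -> bool:
    """Accept only exact-pinned requirements present in the allowlist."""
    name, _, version = requirement.partition("==")
    return version in ALLOWLIST.get(name, set())
```

Unpinned or unknown packages fail closed, which is the point: the runtime sandbox contains what runs, and the gate controls what gets in.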