I really want more security people to get involved in the LLM space because everyone seems to have just lost their minds.
Looked at through a security lens, this thing is horrifying, which is why it was so frustrating when Anthropic changed their TOS to ban the use of alternative clients with a subscription. I don't want to use that Swiss cheese.
[0] assuming a human with security training was involved in the design/prompting of the sandbox development.
[1] Claude has well-used mechanisms for asking the user before taking potentially dangerous actions. Why that isn't part of the "disable my own SANDBOX" branches of the code is confusing.
https://github.com/anthropic-experimental/sandbox-runtime/is...
I ended up making my own sandbox wrapper instead: https://github.com/arianvp/landlock-nix
> The restrictive policy was designed with these goals in mind:
> 1. No bypass of security by executing programs via ld.so.
> 2. Anything requesting execution must be trusted.
One correction on the table: SELinux and AppArmor shouldn't be grouped under "rename-resistant: No". AppArmor is path-based, but SELinux labels live on the inode, so a rename doesn't change the security context. The copy attack doesn't apply either: a process in `sandbox_t` creating a file in /tmp gets `tmp_t` via type transition, and the policy does not grant `sandbox_t` execute permission on `tmp_t`.
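The inode point is easy to demonstrate without SELinux at all: a rename within a filesystem is a pure metadata operation that keeps the same inode, so anything attached to the inode (such as a label stored in the `security.selinux` xattr) travels with the file. A minimal sketch (the filenames are made up):

```python
import os
import tempfile

d = tempfile.mkdtemp()
src = os.path.join(d, "payload")
dst = os.path.join(d, "innocent-name")

with open(src, "w") as f:
    f.write("#!/bin/sh\necho hi\n")

before = os.stat(src).st_ino   # inode number before the rename
os.rename(src, dst)            # same filesystem: only the directory entry changes
after = os.stat(dst).st_ino    # same inode, new path

print(before == after)         # True: inode-attached state follows the file
```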
[1] https://github.com/linux-application-whitelisting/fapolicyd
On the copy attack: the `sandbox_t` -> `tmp_t` type transition you describe is a real defense, but it's policy-dependent. My understanding is that `sandbox_t` is one of the most locked-down SELinux domains, while most interactive users (AI agents included) run as `unconfined_t`, where `tmp_t` files are executable and the copy attack succeeds. So whether a copied binary gets an executable type depends entirely on the transition rules in the loaded policy.
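The underlying reason the copy case is policy-dependent: a copy is a brand-new inode, so it carries none of the original's inode metadata, and whatever label the destination's transition rules assign is what applies. A quick sketch (filenames invented):

```python
import os
import shutil
import tempfile

d = tempfile.mkdtemp()
src = os.path.join(d, "blocked-tool")
dst = os.path.join(d, "copy-of-tool")

with open(src, "w") as f:
    f.write("#!/bin/sh\necho hi\n")

shutil.copy(src, dst)  # copy: reads the bytes, writes a brand-new file

# Different inodes: the copy inherits nothing from the original's inode,
# so on SELinux its label is decided by the destination's transition rules.
print(os.stat(src).st_ino != os.stat(dst).st_ino)  # True
```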
By contrast, content-addressable enforcement doesn't depend on policy configuration: the hash follows the content regardless of where it lands or what label it gets.
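That's the whole pitch in one property: the digest is a function of the bytes alone, not of the path, the inode, or any label, so both a copy and a rename leave it unchanged:

```python
import hashlib
import os
import shutil
import tempfile

def sha256_of(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

d = tempfile.mkdtemp()
orig = os.path.join(d, "tool")
with open(orig, "wb") as f:
    f.write(b"#!/bin/sh\necho hi\n")

copied = os.path.join(d, "copied")
renamed = os.path.join(d, "renamed")
shutil.copy(orig, copied)   # new inode, same bytes
os.rename(copied, renamed)  # new path, same bytes

# Same digest everywhere the content lands:
print(sha256_of(orig) == sha256_of(renamed))  # True
```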
Two architectural differences worth noting, which I guess you are already aware of. First, `fapolicyd` is a userspace daemon... The kernel blocks until the daemon responds. This works, but doesn't the daemon itself become a single point of failure? If it stalls or is killed, the system either deadlocks or fails open (hence the dead man's switch). Veto keeps hash computation and enforcement inside the BPF LSM hook: the BPF program can't crash (hopefully, lol) and requires no context switch for the decision.
Second, `fapolicyd` defaults to an allowlist model: anything requesting execution must be in the trust database. That's a stronger default posture than our current denylist. We're starting with a denylist because it's the lower-friction entry point for teams adopting agent security incrementally: you block known-bad things without having to enumerate all the good things first. In two words: different tradeoffs.
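The difference between the two postures fits in a few lines. In a hypothetical hash-keyed policy check (the function, sets, and placeholder digests below are mine, not `fapolicyd`'s or Veto's API), the allowlist fails closed on anything unknown while the denylist fails open:

```python
import hashlib

KNOWN_GOOD = {"<sha256 of an approved binary>"}  # allowlist / trust database
KNOWN_BAD = {"<sha256 of a blocked binary>"}     # denylist, the lower-friction start

def allowed(content: bytes, mode: str) -> bool:
    digest = hashlib.sha256(content).hexdigest()
    if mode == "allowlist":
        return digest in KNOWN_GOOD   # unknown binary => denied (fails closed)
    return digest not in KNOWN_BAD    # unknown binary => allowed (fails open)

novel = b"freshly compiled, never-seen binary"
print(allowed(novel, "allowlist"))  # False: not in the trust database
print(allowed(novel, "denylist"))   # True: not on the denylist either
```

Same lookup, opposite default for the binary nobody has catalogued yet, which is exactly the incremental-adoption tradeoff.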
Hooks alone aren't a security boundary — Anthropic and Trail of Bits both say "guardrails, not walls." The missing piece is continuous behavioral measurement: tracking tool failures, subagent spawns, and risk drift in real time, then blocking dangerous calls before execution based on a live risk score — not just pattern matching.
I've been working on this at P-MATRIX (open source, Apache-2.0). The core idea: a 4-axis trust model that produces a real-time risk score R(t), and a Safety Gate that intercepts tool calls based on that score. Kill switch activates automatically when risk crosses a threshold.
npm: @pmatrix/claude-code-monitor | GitHub: github.com/p-matrix/claude-code-monitor
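For what it's worth, the "gate on a live score" idea is simple to state even if the scoring model isn't. A toy sketch (the signal names, weights, and threshold here are invented for illustration, not P-MATRIX's actual model):

```python
from dataclasses import dataclass

@dataclass
class Signals:
    tool_failures: int    # recent failed tool calls
    subagent_spawns: int  # recent subagent launches
    risky_patterns: int   # matches against known-bad call shapes
    drift: float          # 0..1 deviation from the session baseline

def risk_score(s: Signals) -> float:
    # Invented weights; a real system would calibrate these.
    return min(1.0, 0.1 * s.tool_failures + 0.15 * s.subagent_spawns
                    + 0.3 * s.risky_patterns + 0.5 * s.drift)

THRESHOLD = 0.8  # invented kill-switch threshold

def safety_gate(s: Signals) -> str:
    # Intercept the tool call before execution, not after.
    return "block" if risk_score(s) >= THRESHOLD else "allow"

print(safety_gate(Signals(0, 0, 0, 0.1)))  # allow
print(safety_gate(Signals(4, 2, 1, 0.6)))  # block
```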
Good lord, why do people use LLMs to write on this topic? It destroys credibility.
HN users continue to upvote LLM written submissions.
The default for me is that every LLM submission has little credibility unless proven otherwise. Enshittified.
For many it's not worth the effort to even try anymore. Particularly when the content of a submission is about LLMs: why worry?
Leo di Donato, who helped create Falco, the cloud-native runtime security tool, wrote a technical deep dive into how Claude Code bypassed its own denylist and sandbox. He also introduces Veto, a kernel-level enforcement engine built into the Ona platform.