No, no, Docker is not a sandbox for untrusted code.
We live in a bizarre world where somehow "you need a hypervisor to be secure" and "to install this random piece of software, run curl | sudo bash" can live next to each other and both be treated seriously.
The kata-containers [1] runtime takes a container and runs it as a virtual host. It works with Docker, podman, k8s, etc.
It's a way to get the convenience of a container with the benefits of a virtual host.
This is not the be-all and end-all (there are other options), but it is a convenient one that is better than typical containers.
LPEs abound - unprivileged user ns was a whole gateway that was closed, io_uring was hot for a while, eBPF is another great target, and I'm sure more and more will be found every year, as has been the case. Seccomp, unprivileged containers, etc. make a huge difference in stomping out a lot of the attack surface; you can decide how comfortable you are with that, though.
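To make the "seccomp and unprivileged containers" point concrete, here is a rough sketch of a helper that builds a locked-down `docker run` command line. The flags are standard Docker CLI options, but the helper name, the resource limits, and the UID are my own assumptions, and the runtime name you would pass for Kata depends on how it is registered on your host. This shrinks the attack surface; it does not turn a shared kernel into a sandbox.

```python
from typing import List, Optional


def hardened_docker_argv(
    image: str,
    command: List[str],
    runtime: Optional[str] = None,
    seccomp_profile: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a `docker run` argv with common hardening flags.

    Hypothetical helper for illustration; the limits below are arbitrary.
    """
    argv = [
        "docker", "run", "--rm",
        "--cap-drop", "ALL",                       # drop all Linux capabilities
        "--security-opt", "no-new-privileges",     # block setuid-style escalation
        "--network", "none",                       # no network access
        "--read-only",                             # read-only root filesystem
        "--pids-limit", "64",                      # cap the number of processes
        "--memory", "256m",                        # cap memory
        "--user", "65534:65534",                   # run as an unprivileged UID/GID
    ]
    if seccomp_profile is not None:
        # Custom seccomp profile; Docker applies its default profile otherwise.
        argv += ["--security-opt", f"seccomp={seccomp_profile}"]
    if runtime is not None:
        # e.g. a VM-backed runtime; the exact name is host-specific (assumption).
        argv += ["--runtime", runtime]
    return argv + [image, *command]


if __name__ == "__main__":
    print(" ".join(hardened_docker_argv("python:3.12-slim", ["python", "-c", "print(1)"])))
```

The function only returns the argument list, so you can inspect it or hand it to `subprocess.run` yourself.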
Start here to help give you ideas for what to research:
https://linuxsecurity.com/features/what-is-a-container-escap...
That is to say, Docker is typically a security win because you get things like seccomp and user/DAC isolation "for free". That's great. That's a win. Typically exploitation requires a way to get execution in the environment plus a privilege escalation. The combination of those two things may be considered sufficient.
It is not sufficient for "I'm explicitly giving an attacker execution rights in this environment" because you remove the cost of "get execution in the environment" and the full burden is on the kernel, which is not very expensive to exploit.
@task(name="analyze_data", compute="MEDIUM", ram="512MB", timeout="30s", max_retries=1)
def analyze_data(dataset: list) -> dict:
    # Your code runs safely in a Wasm sandbox
    return {"processed": len(dataset), "status": "complete"}
This is fundamentally awkward in a language with as absurdly flexible a type system as Python. What if that list parameter contains objects that implement __getattr__? What if the output dict has an overridden __getattr__? Even defining semantics seems awkward, especially if one wants those semantics to simultaneously make sense and have any sort of clear security properties.
edit: a quick look at the source suggests that the output is deserialized JSON regardless of what the type signature says. That’s certainly one solution.
We stick to JSON to make sure we pass data, not behavior. It avoids all that complexity.
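A small illustration of the "data, not behavior" point, using only the standard `json` module. `SneakyDict` and `cross_boundary` are made-up names for this sketch, not anything from the project: the idea is just that a JSON round trip hands the other side a plain `dict`, and objects that carry behavior do not serialize at all.

```python
import json


class SneakyDict(dict):
    """A dict subclass whose lookups lie (illustrative only)."""

    def __getitem__(self, key):
        return "tampered"


def cross_boundary(value):
    """Simulate passing a value across the sandbox boundary as JSON text."""
    return json.loads(json.dumps(value))


if __name__ == "__main__":
    sneaky = SneakyDict(processed=3, status="complete")
    print(sneaky["processed"])      # the overridden lookup runs here
    clean = cross_boundary(sneaky)
    print(type(clean), clean["processed"])  # plain dict, real stored value

    try:
        cross_boundary(object())    # arbitrary objects are rejected
    except TypeError as exc:
        print("rejected:", exc)
```

Whatever the type signature says, the receiving side only ever sees JSON types (dict, list, str, numbers, bool, None), so overridden dunder methods never cross over.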
I’ve been building on that foundation: script runs in sandbox, all commands and file writes get captured, human-in-the-loop reviews the diff before anything executes. It’s not adversarial (block/contain) but collaborative (show intent, ask permission).
Different tradeoff than WASM or containers: lighter than VMs, cross-platform, and the user sees exactly what the agent wants to do before approving.
WIP, currently porting to PyPy 3.8 to unlock macOS arm64 support: https://github.com/corv89/shannot
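For anyone curious what the capture-then-approve flow looks like in general, here is a toy sketch. It is not the API of the project linked above; every name is hypothetical, and it uses an in-memory dict as a stand-in for the filesystem so it stays self-contained. Writes are recorded instead of performed, rendered as a unified diff, and applied only if the reviewer callback says yes.

```python
import difflib
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class PendingWrites:
    """Collects intended file writes instead of performing them."""

    writes: Dict[str, str] = field(default_factory=dict)

    def write(self, path: str, content: str) -> None:
        self.writes[path] = content


def render_diff(current: Dict[str, str], pending: PendingWrites) -> str:
    """Render every pending write as a unified diff against current contents."""
    chunks = []
    for path, new in sorted(pending.writes.items()):
        old = current.get(path, "")
        chunks.extend(
            difflib.unified_diff(
                old.splitlines(keepends=True),
                new.splitlines(keepends=True),
                fromfile=f"a/{path}",
                tofile=f"b/{path}",
            )
        )
    return "".join(chunks)


def review_and_apply(
    current: Dict[str, str],
    pending: PendingWrites,
    approve: Callable[[str], bool],
) -> bool:
    """Show the diff to the reviewer; apply the writes only on approval."""
    if not approve(render_diff(current, pending)):
        return False
    current.update(pending.writes)
    return True
```

In a real tool the `approve` callback would print the diff and wait for a human; here it is just a function so the flow can be exercised without a terminal.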
Long, long ago, there was "repy"[1][2]. (This is definitely included in the "none succeeded" bucket, FWIW.)
I have been looking towards some kind of quick-start qemu option as a possibility, but the project will take a while.
If we want to isolate untrusted code at a very fine-grained level (like just a specific function), VMs can feel a bit heavy due to the overhead, complexity, etc.
This is so true
How does it work? Which WASM runtime does it use? Does it use a Python interpreter compiled to WASM?
https://github.com/mavdol/capsule
(From the article)
Appears to be CPython running inside of wasmtime
---
That is not safe at all. You could always hijack builtin functions within untrusted code.
def untrusted_function():
    original_map = map
    def noisy_map(func, *iterables):
        print(f"--- Log: map() called on {func.__name__} ---")
        return original_map(func, *iterables)
    globals()['map'] = noisy_map
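To spell out why that matters, here is a self-contained variant (my own sketch, assuming the untrusted function runs in the same interpreter and module as the code that trusts it). Instead of printing, the replacement `map` records what it sees, and a later, unrelated "trusted" function ends up calling it without knowing.

```python
import builtins


def untrusted_function():
    """Shadow the builtin map() in this module's globals and spy on calls."""
    original_map = builtins.map
    seen = []

    def noisy_map(func, *iterables):
        seen.append(func.__name__)  # the attacker observes every call
        return original_map(func, *iterables)

    globals()["map"] = noisy_map
    return seen


def trusted_code():
    """Looks innocent, but resolves map() through the module globals first."""
    return list(map(str.upper, ["a", "b"]))


def restore_map():
    """Remove the shadowing entry so the real builtin is visible again."""
    globals().pop("map", None)


if __name__ == "__main__":
    seen = untrusted_function()
    print(trusted_code(), seen)
    restore_map()
```

The result of `trusted_code()` is unchanged, which is the point: nothing looks wrong from the outside, yet every call went through the attacker's function. A separate interpreter or process boundary is what stops this kind of shared-state tampering.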