https://blog.plan99.net/why-not-capability-languages-a8e6cbd...
But as pointed out by others, this particular exploit wouldn't be stopped by capabilities. Nor would it be stopped by micro-kernels. The filesystem is a trusted entity on any OS design I'm familiar with as it's what holds the core metadata about what components have what permissions. If you can exploit the filesystem code, you can trivially obtain any permission. That the code runs outside of the CPU's supervisor mode means nothing.
The only techniques we have to stop bugs like this are garbage collection or use of something like Rust's affine type system. You could in principle write a kernel in a language like C#, Java or Kotlin and it would be immune to these sorts of bugs.
This essay only addresses my second point - capabilities within a program. It doesn't address OS level capabilities at all.
But even in the space of programming languages, I find this essay extremely unconvincing. Like, you raise points like this:
> Here are some problems you’ll have to solve in order to sandbox libraries: What is your threat model? How do you stop components tampering with each other’s memory?
The threat model is left pad cryptolockering your computer via a supply chain attack. The solution is to design a language such that if I import leftpad, then call it, my computer can't get hacked.
You stop components tampering with each other's memory by using a memory safe language.
> its main() method must be given a “god object” exposing all the ambient authorities the app begins with
So what? The main function already takes arguments. I don't understand the problem.
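To make that concrete, here is a rough sketch of what "main takes the authority as an argument" could look like. It is written in Rust syntax, but it assumes a hypothetical language rule that Rust does not have today: no ambient `std::fs`, so holding an `FsCap` is the only way to reach the filesystem. `FsCap`, `Authority`, `save_report` and `app_main` are all made-up names for illustration.

```rust
// Hypothetical capability types; none of this is a real std or crate API.
// Assumption: the language offers no ambient filesystem access, so holding
// an `FsCap` is the only way to touch files.
pub struct FsCap {
    _private: (), // private field: code outside this module cannot construct one
}

// The "god object" handed to main: every authority the app starts with.
pub struct Authority {
    pub fs: FsCap,
}

impl Authority {
    // Stand-in for the runtime building the object before main runs.
    pub fn from_runtime() -> Authority {
        Authority { fs: FsCap { _private: () } }
    }
}

// A leftpad-style library function. It is handed no capability, so under
// the assumption above it can only compute on its arguments.
pub fn left_pad(s: &str, width: usize, fill: char) -> String {
    let len = s.chars().count();
    let mut out: String = std::iter::repeat(fill)
        .take(width.saturating_sub(len))
        .collect();
    out.push_str(s);
    out
}

// Code that needs the filesystem has to say so in its signature.
pub fn save_report(_fs: &FsCap, body: &str) -> usize {
    // A real implementation would write through `_fs`; this sketch only
    // reports how many bytes it would have written.
    body.len()
}

// What main would look like: it receives all authority and passes on
// only the pieces each callee needs.
pub fn app_main(auth: Authority) -> usize {
    let line = left_pad("42", 5, '0'); // no authority passed
    save_report(&auth.fs, &line) // filesystem authority passed explicitly
}
```

The point of the sketch is only the shape of the signatures: you can tell from `left_pad`'s type that it was never given anything dangerous.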
Haskell already passes a type object as an argument to anything which does IO. They don't do it for security. Turns out having pure functions separated from non-pure functions is a beautiful thing.
Then there's these weird claims:
> Any mutable global variable is a problem as it may allow one component to violate expectations held by another.
You don't need to ban mutable global variables! Let's imagine we did this in safe Rust. I think the only constraint is that a global variable can't be shared over the boundary between crates. But - nobody does that anyway. Even if you did share a global over a crate boundary, the child crate would still only be able to access it through methods on the type.
Sneaky developers could leverage globals to violate the security boundary. But it would be hard to do by accident. Maybe just, don't do that.
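A small sketch of what I mean, using modules as a stand-in for crates (the privacy rule works the same way across a crate boundary). The names `request_stats` and `plugin` are invented for the example.

```rust
pub mod request_stats {
    use std::sync::atomic::{AtomicUsize, Ordering};

    // Mutable global state, private to this module.
    static HANDLED: AtomicUsize = AtomicUsize::new(0);

    // The only ways other code can touch the global.
    pub fn record() {
        HANDLED.fetch_add(1, Ordering::Relaxed);
    }

    pub fn handled() -> usize {
        HANDLED.load(Ordering::Relaxed)
    }
}

pub mod plugin {
    // Code on the other side of the boundary can call the public
    // functions and nothing else.
    pub fn run() -> usize {
        super::request_stats::record();
        super::request_stats::handled()
        // `super::request_stats::HANDLED.store(0, ...)` would not compile
        // here, because `HANDLED` is private to `request_stats`.
    }
}
```

So the global is mutable, but the set of things another component can do to it is exactly the set of functions its owner chose to export.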
Your essay talks about some research project making a capability based Java subset. And I understand that the resulting ergonomics weren't very good. But that isn't evidence that capabilities themselves are a bad idea. If a research student wrote a half-baked C compiler one time, you wouldn't take that as evidence that C compilers are a bad idea. I do, however, accept that the burden of proof is on me to demonstrate that it's a good idea. I hope that I can some day rise to that challenge.
> The filesystem is a trusted entity on any OS design I'm familiar with
That's not how capability-based microkernels like seL4 work. The filesystem is owned by a specialised process. Other processes only modify files by sending messages to the filesystem process via a capability handle. If nobody created a writable file handle, the file can't be arbitrarily mutated by another module. Copyfail happened because in Linux, any code can by default interact with the page table. One piece of code was missing access control checks. In capability-based systems, it's basically impossible to accidentally forget access control checks like that.
> The only techniques we have to stop bugs like this are garbage collection or use of something like Rust's affine type system. You could in principle write a kernel in a language like C#, Java or Kotlin and it would be immune to these sorts of bugs.
Copyfail is a logic bug. C#, Java or Kotlin wouldn't save you from it at all.
> The solution is to design a language such that if I import leftpad, then call it, my computer can't get hacked.
That requirement may seem clear right now, but the moment you talk to other people about your language you'll find there's no agreement on what "get hacked" means. Some people will consider calling exit(0) repeatedly to be "hacked" because it's a DoS attack, others will say no code execution or priv escalation happened, so that's not being hacked. Some will say that left-pad being able to read arbitrary bytes from your address space is being hacked, others will say no harm done and thus it wasn't being hacked. The details matter and you need to nail them down in advance.
It turns out for example that one of the top uses of the Java SecurityManager was just to stop plugins accidentally calling System.exit() and tearing down the whole process. It wasn't even a security goal, really.
> You stop components tampering with each other's memory by using a memory safe language.
That's not enough. See languages like Ruby or JavaScript, which are memory safe but not sandboxable due to all the monkeypatching they allow.
> Haskell already passes a type object as an argument to anything which does IO. They don't do it for security. Turns out having pure functions separated from non-pure functions is a beautiful thing.
But almost nobody uses Haskell, partly because of poor ergonomics like this! So if you want a language that gets wide usage and has a good library ecosystem, monads for everything probably isn't going to take off.
> If nobody created a writable file handle, the file can't be arbitrarily mutated by another module.
We're talking about critical bugs in the filesystem so what the FS process's idea of a file handle is doesn't really matter. If you can confuse or buffer overflow the FS process by sending it messages, you can then edit state inside that process you weren't supposed to be able to access, and as that process controls the security system for everything it's game over. Microkernels have no way to stop this, which is one reason very few operating systems move the core FS out into a separate process. You can't easily survive a crash of the core FS code, and it being exploited is equivalent to an exploit of the core microkernel anyway in terms of adversarial goals. So you might as well just run it in-kernel and reap the performance benefits.
> But almost nobody uses Haskell
Sad, but true
> partly because of poor ergonomics like this!
I'm somewhat dubious that's the reason, partly because I find such ergonomics excellent! Especially those provided by my capability system Bluefin: https://hackage.haskell.org/package/bluefin
The copyfail bug wasn’t a bug in the filesystem code. It was a bug in the crypto algorithm code, which wrote to the filesystem page table without checking if the process invoking it had permission to write to the passed file handle. In a monolithic kernel like Linux, every subsystem can access the memory of every other subsystem by default. It’s up to each subsystem to be careful. As we keep discovering, “be really careful” is not a successful security strategy.
A capability-based OS like seL4 is more secure. With seL4, you would put the crypto algorithms and filesystem in separate user space processes. These processes would only communicate by RPC, by invoking capabilities. We can imagine how the copyfail scenario would play out: A user process has a capability representing its (read only) access to some privileged file on disk. It passes that capability to the crypto algorithm process. A bug - or even complete takeover - of the crypto algorithm process still doesn’t change that the file cap is read only. The crypto algorithm process doesn’t have direct access to the memory representing that file. It only has the read only file handle. All it can do with that handle is invoke it, which will only give it read access. Even with a bug in the crypto algorithms process, the OS would stay secure.
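Here is a toy model of that scenario in Rust. To be clear, this is not the seL4 API: module privacy stands in for process isolation, method calls stand in for IPC, and `ReadCap`, `WriteCap`, `fs::create` and `crypto_service` are names I made up. The XOR "digest" is just a placeholder for whatever the crypto service really computes.

```rust
pub mod fs {
    use std::cell::RefCell;
    use std::rc::Rc;

    // The file's backing memory. It is private to this module, which plays
    // the role of the filesystem process's address space.
    type Shared = Rc<RefCell<Vec<u8>>>;

    // A read-only handle: the only operation it offers is `read`.
    pub struct ReadCap {
        data: Shared,
    }

    // A separate, more powerful handle that also allows writing.
    pub struct WriteCap {
        data: Shared,
    }

    // Create a file and return both handles to whoever owns it.
    pub fn create(initial: &[u8]) -> (ReadCap, WriteCap) {
        let data = Rc::new(RefCell::new(initial.to_vec()));
        (ReadCap { data: Rc::clone(&data) }, WriteCap { data })
    }

    impl ReadCap {
        pub fn read(&self) -> Vec<u8> {
            self.data.borrow().clone()
        }
    }

    impl WriteCap {
        pub fn write(&self, bytes: &[u8]) {
            *self.data.borrow_mut() = bytes.to_vec();
        }
    }
}

// Stand-in for the crypto service. It is handed only a `ReadCap`, so however
// buggy this function is, there is no call it can make that writes the file.
pub fn crypto_service(input: &fs::ReadCap) -> u8 {
    input.read().iter().fold(0u8, |acc, byte| acc ^ byte)
}
```

Note there is no permission check inside `crypto_service` to forget. The missing authority is simply not representable with the handle it was given, which is the property I'm claiming would have stopped copyfail.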
Yes, capability OSes aren't a magic bullet. A bug in the filesystem process could still result in filesystem corruption. But better is better. OS capabilities provide defence in depth. They would have prevented copyfail.
As far as I can tell, your argument against capabilities is that they might be slow. Some implementations have poor ergonomics. They don’t magically solve every possible security bug. You also, personally, used a bad implementation of capabilities this one time years ago in Java. Is that accurate?
You must see how unconvincing I find your argument. What are you even trying to do? Convince people to not explore different ideas in computer science? When I close my eyes I see an old man yelling: “Hey you kids! What are you doing up there, trying new things? You stop that right now!”
The assumption here is that the FS is the root of trust for the kernel. (A claim I consider dubious, but what do I know about knowing things?) It's another way to say that if you don't harden your root of trust, you're SOL. Which, ok, fair enough. But that's frankly irrelevant because hardening the root of trust is table stakes. The system cannot be secured without it, regardless of the threat model.
All of the concerns about a definition of "getting hacked" fall out of ignoring the hardening of the root of trust. I don't wish to put words in your mouth, but my interpretation of the argument is essentially, "we can't have nice things because the root of trust cannot be hardened sufficiently to prevent all intrusions."
Iff the FS is the root of trust, and it is not possible to confuse the FS by sending it messages, then there is no game over. You have a root of trust that cannot be broken.
> Microkernels have no way to stop this, which is one reason very few operating systems move the core FS out into a separate process.
My reading of the history reaches a very different conclusion. First, the primary reason that very few operating systems in practice use a microkernel design is that Linus Torvalds believed it was too slow for early '90s hardware [1]. And everyone else just does whatever Linux is doing.
Second, security through surface area reduction (and more broadly, defense-in-depth) was always the point of the microkernel design [2]. Trivially, the principle of least privilege is how one arrives at a secure system. Monolithic kernels, to this very day, continue to prove that they cannot be secured in any practical manner. I can only assume we need things to get worse before kernel developers will tighten up and take security seriously.
> So you might as well just run it in-kernel and reap the performance benefits.
There's that same mentality. Apparently "speed at all costs" is the willful trading of security for performance. That position is just as flawed as trading essential liberty for temporary safety [3]. It doesn't matter how fast the thing is when the slightest bump always causes it to explode, killing everyone on board.
[1]: https://web.archive.org/web/20040210002251/http://people.flu...
[2]: https://www.cosy.sbg.ac.at/~clausen/PVSE2006/linus-rebuttal....
[3]: https://old.reddit.com/r/todayilearned/comments/k0c8o6/til_b...