Yes, this thing is gonna make IPv4 NAT look like a nice thing in comparison.
Yes, it will probably mean a horribly kludgy mapping of isolated-applications to UIDs, but until you reach somewhere between 2^15 and 2^16 isolated-applications it should work fine.
Yes, this will be on a per-system basis: the resulting filesystem will only be usable by your system, and by no other system.
What I'm saying is that in theory the "filesystem" and the "UIDs are 32 bit" parts are mostly there. They're left over from the multi-user-big-box days and mostly go unused (except by Android/Linux).
> With 32-bit identifiers, if you make each level 16-bit, you only have room for two levels. What if you have need for a third?
The main reason why 65536 UIDs and GIDs are often submapped to every user is that POSIX systems often have a hardcoded assumption that user nobody is UID 65534, GID 65534, and if you want to run nested POSIX systems under POSIX systems without too many changes, reserving that many UIDs and GIDs is required.
There's no good place for that universal "nobody" user anyway, and if you're rethinking how the UID and GID mechanisms relate to security, there's definitely no place for a universal nobody, so you might as well map only the required amount of UIDs/GIDs per isolated-application.
That then leads to leaving unmapped UIDs unmapped on both host and isolated-applications.
As long as you stay below that 2^15 ~ 2^16 count of isolated-applications, it should work fine.
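To put rough numbers on that, here's a minimal sketch of the arithmetic, assuming each isolated-application gets one contiguous block of host UIDs. The names (`block_for_app`, `max_apps`) and the choice to skip the first and last blocks are my own illustration, not something the kernel or any container runtime prescribes.

```python
UID_SPACE_BITS = 32   # UIDs/GIDs are 32-bit on Linux
BLOCK_BITS = 16       # 65536 IDs per app, so a nested "nobody" (65534) still fits


def max_apps(block_bits=BLOCK_BITS):
    """How many per-app blocks fit in the 32-bit UID space.

    Assumption: block 0 stays with the host, and the last block is skipped
    because it contains 0xFFFFFFFF, which calls like chown() treat as
    "leave unchanged" rather than as a real UID.
    """
    return (1 << (UID_SPACE_BITS - block_bits)) - 2


def block_for_app(app_index, block_bits=BLOCK_BITS):
    """Return (first_host_uid, count) for the app_index-th isolated-application."""
    if app_index < 0 or app_index >= max_apps(block_bits):
        raise ValueError("out of UID blocks for this block size")
    slot = app_index + 1  # slot 0 is the host's own block
    return slot << block_bits, 1 << block_bits


if __name__ == "__main__":
    print(max_apps())         # full 16-bit blocks per app
    print(max_apps(4))        # only 16 IDs per app
    print(block_for_app(0))   # first app's (start, count)
```

With full 16-bit blocks you run out at roughly 2^16 apps; if each app only gets the handful of IDs it actually needs (16 in the second call), the ceiling moves up to roughly 2^28.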
Another option would be doing what Android (and supposedly flatpak) does: you should not be able to simply run whatever you want if you're an isolated-application. If, as an isolated-application, you need to run another isolated-application, you need to invoke "the platform" via `am` or `flatpak-spawn` and use it to spawn the other isolated-application.
> What happens with filesystems though? I would assume the filesystem is using the root user namespace. Which means if you have two different UIDs and they map to the same UID in the root namespace, they get collapsed into one for file ownership/etc. That seems a rather major limitation.
As far as I know most of what can be considered normal Linux filesystems (ext4, btrfs and I think xfs) support said 32-bit UIDs so you would not need to change filesystem code (and I believe changing and bugfixing filesystem code is always a scary proposition) to use a 32-bit mapping.
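As a rough illustration of why the collapse only happens if you map two things onto the same host UID, here's a sketch that translates IDs using entries shaped like the lines of `/proc/<pid>/uid_map` (inside start, outside start, length). The two example maps and the helper names are hypothetical; the only assumption is that the filesystem stores the host (root namespace) UID.

```python
OVERFLOW_UID = 65534  # Linux default for /proc/sys/kernel/overflowuid


def to_host(uid_map, ns_uid):
    """Namespace UID -> host UID, or None if the UID is not mapped."""
    for inside, outside, count in uid_map:
        if inside <= ns_uid < inside + count:
            return outside + (ns_uid - inside)
    return None


def to_namespace(uid_map, host_uid):
    """Host UID -> namespace UID; unmapped IDs show up as the overflow UID."""
    for inside, outside, count in uid_map:
        if outside <= host_uid < outside + count:
            return inside + (host_uid - outside)
    return OVERFLOW_UID


# Hypothetical maps: each app sees UIDs 0..65535, backed by its own host block.
app_a = [(0, 65536, 65536)]
app_b = [(0, 131072, 65536)]

if __name__ == "__main__":
    # Same UID inside each app, different owner on disk.
    print(to_host(app_a, 1000), to_host(app_b, 1000))
    # A file owned by app_b's user looks like "nobody" from inside app_a.
    print(to_namespace(app_a, to_host(app_b, 1000)))
```

As long as the host-side blocks are disjoint, UID 1000 in one app and UID 1000 in another stay distinct in file ownership, and each app sees the other's files as owned by the overflow user.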
Nothing restricts you to using only UIDs/GIDs; there are other security mechanisms that could be used:
* present every isolated-application with a different overlay filesystem. So you can have several things read/write to the same places, but each one has its own view of what is being read/written.
* displaying an entirely different file system for every isolated-application (bindfs as an example)
* every isolated-application has its own SELinux context (labels) or other forms of ACL applied to those files.
But I find this birthday attack scenario dubious. Why would you "need" this UID overlap if the isolated-applications don't overlap outside of both namespaces?
If they aren't the same isolated-application it's the wrong thing to do and a security risk.
If they do map to the same isolated-application with the same set of data, then trusting that everything will be fine is a reasonable assumption. It isn't getting any more data or more privileges from being at different UIDs or GIDs in different contexts.