In my understanding, once you remove all the layers of abstraction, at some point it's a bunch of databases and data stores. Someone has to manage them. Why wouldn't a breach of those users let an attacker do whatever they want?
And at a higher level, someone is writing the code that implements such a stringent access system. Why wouldn't a breach of those users (or a rogue employee) be able to accomplish bad things?
Building a large-scale information system is like building a nuclear power station. There are a million ways to screw it up and only a few recognized right ways. If you ignore the best practices, it will eventually destroy your company and harm your users. Twitter has nuked itself here. How can they come back from this? It sure looks like an insider risk mitigation system would have been money well spent.
We all know access controls and multiple operators are good, yeah. But at the heart of it there is still a bunch of Linux machines that have to be managed and deployed to, and as far as I know Linux has no built-in mechanism for checking with operator X before running a command from operator Y.
- at-rest encryption of the datastores, with the content encryption key protected by an HSM. A KMS (key management system) would be the interface to retrieve the key, with access control enabled. An even better solution would be to have the HSM encrypt/decrypt the data directly, so the encryption key never leaves the HSM (or the encryption key is itself encrypted by the HSM). But performance-wise that is not realistic.
- in-transit encryption from the client to the datastore. More likely there is no end-to-end encryption, so admins with access to encryption-termination hosts (reverse proxy, Twitter backend app, datastore, etc.) can read (and maybe alter) the data by doing memory dumps.
- access control for datastore operations: allowing only the Twitter backend and some privileged users to read/write in the datastores, etc.
Doing end-to-end encryption from the client to the datastore with a key per client is possible, but it would make the solution very complex to operate and would hurt performance.
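To make the first bullet concrete, here is a rough sketch of envelope encryption with a KMS-held master key wrapping per-record data keys. Everything here is a toy stand-in: an in-process class plays the HSM/KMS, and an HMAC-SHA256 counter-mode keystream plays the role of real AES.

```python
import hashlib
import hmac
import os

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy stream cipher: HMAC-SHA256 keystream in counter mode (NOT real AES).
    out = bytearray()
    for block in range((len(data) + 31) // 32):
        ks = hmac.new(key, nonce + block.to_bytes(8, "big"), hashlib.sha256).digest()
        chunk = data[block * 32:(block + 1) * 32]
        out.extend(b ^ k for b, k in zip(chunk, ks))
    return bytes(out)

class ToyKMS:
    """Stands in for an HSM-backed KMS: the master key never leaves this object."""
    def __init__(self):
        self._master = os.urandom(32)

    def wrap(self, data_key: bytes) -> bytes:
        nonce = os.urandom(16)
        return nonce + keystream_xor(self._master, nonce, data_key)

    def unwrap(self, wrapped: bytes) -> bytes:
        nonce, ct = wrapped[:16], wrapped[16:]
        return keystream_xor(self._master, nonce, ct)

# Envelope encryption: each record gets its own data key; only the
# wrapped (KMS-encrypted) key is stored next to the ciphertext.
kms = ToyKMS()
data_key = os.urandom(32)
nonce = os.urandom(16)
ciphertext = keystream_xor(data_key, nonce, b"a private DM")
stored = (kms.wrap(data_key), nonce, ciphertext)

# An admin dumping the datastore sees only wrapped keys and ciphertext;
# decryption requires an authorized call back to the KMS.
wrapped_key, nonce, ct = stored
plaintext = keystream_xor(kms.unwrap(wrapped_key), nonce, ct)
```

The point of the layering is that datastore access and key access are separate privileges, so a single compromised admin account gets only one of the two.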
The tl;dr is that they use hardware security modules (HSMs) with quorum-based access controls. Any administrative action, such as deploying software or changing the list of authorized operators, requires a quorum of operators to sign a command for that action using their respective private keys.
While this system was designed specifically around protecting customers' private keys, you could imagine a similar system around large databases.
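A quorum check like that could look roughly like this. It's a toy sketch: HMAC stands in for the operators' real asymmetric keys (so signing and verification share a key, which a real HSM would avoid by holding only public keys), and the operator names and quorum size are made up.

```python
import hashlib
import hmac

# Each operator holds a secret key; the verifier (standing in for the HSM)
# holds the matching verification material.
OPERATOR_KEYS = {op: hashlib.sha256(op.encode()).digest()
                 for op in ("alice", "bob", "carol", "dave")}
QUORUM = 3  # signatures required before a command is executed

def sign(operator: str, command: bytes) -> bytes:
    return hmac.new(OPERATOR_KEYS[operator], command, hashlib.sha256).digest()

def quorum_approved(command: bytes, signatures: dict) -> bool:
    # Count distinct operators whose signature verifies over this exact command.
    valid = sum(
        1 for op, sig in signatures.items()
        if op in OPERATOR_KEYS
        and hmac.compare_digest(sig, sign(op, command))
    )
    return valid >= QUORUM

cmd = b"deploy build 1234 to prod"
sigs = {op: sign(op, cmd) for op in ("alice", "bob", "carol")}
approved = quorum_approved(cmd, sigs)                       # 3 of 4 -> approved
solo = quorum_approved(cmd, {"alice": sign("alice", cmd)})  # 1 of 4 -> rejected
```

Because the signature covers the command bytes themselves, a rogue operator can't take signatures gathered for one action and replay them against a different one.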
Not necessary
> or filesystem access
Also no
> or ability to modify the fleet.
Not that either. It feels like the conversation around these things is stuck in the far past. Large-scale organizations can and have driven the number of people with root passwords to zero. "Filesystem access" shouldn't be as easy as you're implying, and it also shouldn't be of any use, since everything in the files ought to be separately encrypted with keys that can only be unwrapped by authorized systems.
As for the last thing you said about Linux systems starting processes: even a modest application of imagination leads you to an init daemon that can enforce the pedigree of every process on the machine.
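A minimal sketch of that idea, assuming the simplest possible pedigree check (a digest allowlist; a real init would verify a signed manifest instead): a launcher that refuses to start any binary whose hash isn't approved.

```python
import hashlib
import os
import subprocess
import sys
import tempfile

def sha256_of(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

class ToyInit:
    """Launcher that refuses to start any program not on its digest allowlist."""
    def __init__(self, approved_digests: set):
        self.approved = approved_digests

    def spawn(self, interpreter: str, script: str) -> int:
        if sha256_of(script) not in self.approved:
            raise PermissionError(f"unapproved binary: {script}")
        return subprocess.run([interpreter, script]).returncode

# Demo: approve one script, run it, then try again after tampering with it.
with tempfile.TemporaryDirectory() as d:
    good = os.path.join(d, "good.py")
    with open(good, "w") as f:
        f.write("print('hello')\n")
    init = ToyInit({sha256_of(good)})
    ok = init.spawn(sys.executable, good)  # digest matches, script runs

    with open(good, "a") as f:             # "attacker" modifies the script
        f.write("print('pwned')\n")
    try:
        init.spawn(sys.executable, good)
        tampered_blocked = False
    except PermissionError:
        tampered_blocked = True            # modified binary is refused
```

Real-world equivalents of this exist (signed-binary enforcement, verified boot chains); the sketch just shows the enforcement point is the launcher, not the operator.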
Presumably this database runs on some machine? And this machine was logged into in order to install and set up the database?
Encrypted rows of data are meaningless to an "admin" who can query to their heart's content but will never be able to decrypt the result set. On the other hand, the layers on top (such as the web tier that emits the plaintext) may have the keys to decrypt, but lack the privileges to run around in the database; from that level, they must pass along the user's credentials to obtain user-specific content.
Since people don't search by content on Twitter (afaik) and only metadata indexes are used (such as hashtags, followers, following, date), this is entirely doable for something like Twitter.
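The split described above (queryable plaintext metadata, opaque content blobs) can be sketched with sqlite3. The schema and column names are made up, and an HMAC-SHA256 keystream stands in for real authenticated encryption; the key lives only in the web tier, not with the DB admin.

```python
import hashlib
import hmac
import os
import sqlite3

KEY = os.urandom(32)  # held by the web tier, never by the database admin

def toy_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Symmetric toy cipher: XOR with an HMAC-SHA256 counter-mode keystream.
    ks, counter = b"", 0
    while len(ks) < len(data):
        ks += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, ks))

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE dms (
    sender TEXT, recipient TEXT, sent_at TEXT,  -- plaintext metadata, indexable
    nonce BLOB, body BLOB                       -- content ciphertext
)""")
db.execute("CREATE INDEX idx_recipient ON dms(recipient, sent_at)")

def store_dm(sender: str, recipient: str, sent_at: str, text: str) -> None:
    nonce = os.urandom(16)
    db.execute("INSERT INTO dms VALUES (?, ?, ?, ?, ?)",
               (sender, recipient, sent_at, nonce,
                toy_crypt(KEY, nonce, text.encode())))

store_dm("alice", "bob", "2020-07-15", "the actual secret")

# An admin can run metadata queries freely...
row = db.execute(
    "SELECT sender, nonce, body FROM dms WHERE recipient = 'bob'").fetchone()
admin_view = row[2]  # ...but the body column is just ciphertext bytes

# The web tier, holding KEY, decrypts on behalf of the authenticated user.
plaintext = toy_crypt(KEY, row[1], row[2]).decode()
```

All the indexes the product needs (recipient, date, hashtag) live in plaintext columns, so query performance is unaffected by the content being opaque.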
There is also homomorphic encryption, but I'm not sure the tech there has reached acceptable performance levels.