Bezos, of all people, was like "make it happen." And it happened. It was basically work for no reason except future-proofing. Having someone up the food chain OK that much work for the future (with no hard-dollar benefit) is highly unusual.
And beyond that, they've done some incredible things with their infrastructure, like authorization. Distributed authorization is really hard, but at AWS it's completely invisible. Remove a permission from an IAM role and the change propagates through AWS really, really fast. It's totally magic. Anyone who was abused by CORBA knows how hard that is to do well.
Their newer stuff (like Cognito) is sort of weird, but other services are surprisingly solid given how big AWS is. Small shops have trouble shipping feature-complete software, and BigCorps can be even worse. AWS has gotten really good at it.
As for AWS, as far as I remember, Bezos was initially against the idea. It was the brainchild of one Andy Jassy, who, along with Rick Dalzell, convinced a reluctant board to try it out. They realized that they had been unintentionally building this cloud platform for years in order to provide sellers with computing resources. Opening it up to public users was just a small sales move. Whether they did it or not, they were going to keep investing in their cloud platform, and nothing would change as far as their technical direction was concerned, so the board finally relented.
It's unfortunate that only Amazon themselves can add new permissions to IAM to secure their services. Why can't our applications add new permissions to IAM and query those? This is going to be a shameless plug, but it was this very problem that caused my cofounders and me to quit our jobs and start a company. Together (and now with a community of hundreds of users and contributions from a few well-known companies) we built SpiceDB[0], the culmination of state-of-the-art distributed-systems and authorization technology, developed in the open instead of behind closed doors at a hyperscaler. We were mostly inspired by Google's internal system, which is actually more powerful than AWS's or Google Cloud's IAM services, despite a fork of it actually powering GCP's IAM.
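For a flavor of the relationship-based model SpiceDB implements (it's inspired by Google's Zanzibar paper), here's a toy sketch in Python. The tuples, schema, and `check` function are illustrative only, not SpiceDB's actual API:

```python
# Zanzibar-style relationship tuples: (resource, relation, subject).
# A permission is computed from relations, e.g. read = reader + writer.
tuples = {
    ("document:readme", "writer", "user:alice"),
    ("document:readme", "reader", "user:bob"),
}

def check(resource: str, permission: str, subject: str) -> bool:
    """Toy permission check: 'read' is the union of reader and writer."""
    if permission == "read":
        return any((resource, rel, subject) in tuples
                   for rel in ("reader", "writer"))
    return (resource, permission, subject) in tuples

assert check("document:readme", "read", "user:alice")      # writer implies read
assert check("document:readme", "read", "user:bob")        # reader implies read
assert not check("document:readme", "read", "user:carol")  # no relation, deny
```

The real system stores these tuples in a distributed datastore and computes unions/intersections from a declared schema, but the data model is this simple at its core.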
For the purposes of authorization, services integrate with a library that handles retrieving and caching policies based on caller identity. Services create a context that includes all of the relevant metadata (service, operation, resources, etc.), and the library evaluates the policy and returns allow or deny.
Doing it all in the application means that if the control/distribution systems for auth go down, most things that are in motion remain in motion, and that the authentication/authorization code deploys out at per-service granularity, which also scopes the blast radius.
There are some pretty obvious pain points (doing anything as a library means updating the world for new features), but it has nice degradation properties and is relatively straightforward to grok as a service owner.
At some level every API call is authorized (and tracked).
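As a rough illustration of the library-plus-context pattern described above (all names are hypothetical, not AWS internals):

```python
from dataclasses import dataclass

@dataclass
class AuthContext:
    """Metadata the calling service assembles for each request."""
    caller: str     # identity of the calling service
    service: str    # service being called
    operation: str  # operation being invoked
    resource: str   # resource being acted on

class PolicyLibrary:
    """Sketch of an in-process library that caches policies per caller
    and evaluates them locally, so checks survive a control-plane outage."""
    def __init__(self):
        # caller -> set of (service, operation) pairs it may invoke
        self._cached_policies: dict[str, set[tuple[str, str]]] = {}

    def load_policy(self, caller: str, rules: list[tuple[str, str]]) -> None:
        # In reality this would be fetched and refreshed from a policy
        # distribution system; here we just load it directly.
        self._cached_policies[caller] = set(rules)

    def evaluate(self, ctx: AuthContext) -> bool:
        # Deny by default; allow only on an explicit matching rule.
        allowed = self._cached_policies.get(ctx.caller, set())
        return (ctx.service, ctx.operation) in allowed

lib = PolicyLibrary()
lib.load_policy("orders-service", [("shipments", "GetShipment")])

ok = AuthContext("orders-service", "shipments", "GetShipment", "shipment/123")
assert lib.evaluate(ok)
bad = AuthContext("orders-service", "shipments", "DeleteShipment", "shipment/123")
assert not lib.evaluate(bad)
```

Because the policy lives in a local cache, an unreachable distribution system only delays updates; in-flight evaluation keeps working, which is the degradation property the comment describes.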
To be honest, this is one of the secret sauces that makes AWS go. Someone once told me that they're not doing anything exciting, just caching, but I'm pretty sure they didn't really know what was going on.
Essentially I think we've gone too far: service-oriented architectures turned into "micro" services, which come with a lot of complexity and distributed-systems issues. I think for most small companies monoliths are right, for medium-sized companies (say 50+) it makes sense to carefully introduce a few separate services, and only for large companies (say 300+) do many services (which may or may not be "micro") start to pay off. I've heard it said that "microservices solve a people problem, not a technical one", and I think that's true.
That last part, to me, is the key to success: getting the whole business to do things in a new way. That is fucking hard. If you can get your business to do it, you have an invaluable superpower. The more things you can reinvent, and the faster you can reinvent them, the more superpowers you gain. It's one thing to change your architecture. But also imagine getting every employee to change how they deal with vacations, suppliers, customers, finance, or entirely new industries. The easier it is to adapt and change, the longer you survive and the more you thrive. Evolution, baby.
Interesting example. Why would changing distributed computing architecture have an impact on vacation policy?
Maybe you work at a company that sometimes works with the government. As a result, the whole company might develop a hiring process that is very slow, very detailed, and excludes certain people from being hired. But probably only a very small number of employees actually have to conform to those government requirements. You can apply them to all new hires "for simplicity", but that makes it harder to hire for non-government positions. So changing how you hire, to make it easier and faster to hire people from a wider range of backgrounds, benefits your organization. If your org can't easily make those changes, it will be disadvantaged.
I feel like the first time I heard the term was the early 2000s, and wasn't it a mainframe thing first? Dunno, just wondering.
Anyhow, it's nicely written, very concise, and worth noting how the original author focuses more on "What kind of realistic options do we have?" than winning the A vs. B vs. C argument in one fell swoop.
Ironically, when .NET was launched, Microsoft's vision was web services everywhere, with orchestration servers like BizTalk.
We got there eventually, only using REST (aka JSON-RPC) and gRPC instead.
It is really interesting to see a recent(ish) trend away from this three tier design and back towards tighter coupling between application layers. Usually due to increased convenience & developer ergonomics.
We've got tools that 'generate' business layers from/for the data layer (Prisma, etc).
We've got tools that push business logic to the client (Meteor, Firebase, etc).
The thing about Amazon's systems is that they are horrendously complex. In ~2016 I was working on the warehousing software, and it was a set of some hundreds of microservices in that space, which also communicated (via broad abstraction) with other spaces (orders, shipments, product, accounting, planning, ...), themselves abstractions over hundreds of other microservices.
So what I observed at the time was a broad increase in abstraction horizontally, rather than vertically. This manifesto describes splitting client-server into client-service-server; the trend two decades later was splitting <a few services, one for each domain> into <many services, one for each slice of each domain>, often with services that simply aggregated the results of their subdomains for general consumption in other domains.
I'm sure things have only gotten more complicated since then (in particular, a large challenge at the time was the general difficulty in producing maintainable asynchronous workflows, so lots of work was being done synchronously or very-slightly-asynchronously that should have been done in long-running or even lazy workflows).
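The horizontal split described above, one service per slice of a domain plus a service that aggregates them for other domains, might look like this in miniature (service names and payloads are made up for illustration):

```python
# Each "slice" service owns one piece of a domain and answers only for it.
def inventory_slice(order_id: str) -> dict:
    return {"in_stock": True}

def shipping_slice(order_id: str) -> dict:
    return {"carrier": "UPS", "eta_days": 2}

def order_status_aggregator(order_id: str) -> dict:
    """Fans out to the subdomain services and merges their results into
    one response for general consumption by other domains."""
    result = {"order_id": order_id}
    for slice_fn in (inventory_slice, shipping_slice):
        result.update(slice_fn(order_id))
    return result

assert order_status_aggregator("42") == {
    "order_id": "42", "in_stock": True, "carrier": "UPS", "eta_days": 2}
```

In the real thing each function is a network call, which is where the complexity (retries, partial failure, latency) comes from.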
Of course, there’s some cargo culting around services where people jump to that architecture before they need it, but for most apps YAGNI. It’s cool that their architecture was driven by clear needs “just in time”, allowing them to continue to scale.
For example,
> In the case of DC processing, customer service and other functions need to be able to determine where a customer order or shipment is in the pipeline. The mechanism that we propose using is one where certain nodes along the workflow insert a row into some centralized database instance to indicate the current state of the workflow element being processed.
definitely doesn’t seem to reflect the hiding of a database behind an interface. (From a workflow node’s perspective, rows in that centralized database should be an implementation detail it has no knowledge of.)
Then again, this is part of their pitch for workflow processing, not service-oriented architecture.
There are companies started later than 2010 where this was still the case. It's interesting to think about how shipping things quickly is so different from scaling them up.
“Distributed Computing Manifesto
Created: May 24, 1998”
> Amazon's distributed computing manifesto (1998) (2022)