I have a reusable library that lets me switch between any of the models I’ve chosen to support, or any new model in the same family that uses the same request format.
On every project I’ve done, switching has been a simple matter of changing a config setting to a different model.
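A minimal sketch of what that kind of config-driven switch can look like. The registry, family names, and model IDs here are made up for illustration; the point is just that the calling code never changes, only the config value does:

```python
# Hypothetical sketch: the model choice lives in config; swapping models is a
# one-line config change. Family names and model IDs below are illustrative.
MODEL_FAMILIES = {
    "anthropic": {"request_format": "messages"},
    "meta": {"request_format": "messages"},
}

def resolve_model(config: dict) -> str:
    """Return the model ID from config, accepting any model whose family we support."""
    model_id = config["model_id"]            # e.g. "anthropic.claude-v-next"
    family = model_id.split(".", 1)[0]
    if family not in MODEL_FAMILIES:
        raise ValueError(f"unsupported model family: {family}")
    return model_id

# A brand-new model in a supported family works without a code change:
print(resolve_model({"model_id": "anthropic.claude-v-next"}))
```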
If the model provider goes out of business, it’s not like the model is going to disappear from AWS the next day.
This sounds so enterprise. I've been wanting to talk to people that actually use it.
Why use Bedrock instead of OpenRouter, Fal, etc.? Doesn't that tie you down to Amazon forever?
Isn't the API worse? Aren't the p95 latencies worse?
The costs higher?
This is the API; it’s basically the same across all supported languages:
https://docs.aws.amazon.com/code-library/latest/ug/python_3_...
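For concreteness, a hedged sketch of what a call to Bedrock’s Converse API looks like with boto3. The payload shape follows the Converse request format, but the model ID is a placeholder and the actual network call (commented out) needs AWS credentials:

```python
# Sketch: building a Bedrock Converse request. The model ID is a placeholder,
# not a real Bedrock model identifier.
def build_converse_request(model_id: str, prompt: str, max_tokens: int = 512) -> dict:
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

request = build_converse_request("anthropic.claude-example-model", "Hello")

# With credentials configured, the call itself is:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**request)
# print(response["output"]["message"]["content"][0]["text"])
```

Because the request body is the same dict regardless of which model ID you drop in, this is where the "change a config setting" switch happens.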
Real companies aren't as concerned about cost as they are about working with other real companies, compliance, etc. They compare the cost and opportunity of doing a thing versus not doing it.
One of my specialties is call centers. Every call deflected to AI instead of a human agent can save $5 to $15.
Even letting your cheaper human agents handle a problem, with AI assisting them in the background, saves money. $15 saved buys a lot of inference.
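Back-of-the-envelope on that claim. The per-token price below is an assumption for illustration, not a quoted rate from any provider:

```python
# Illustrative arithmetic: how much inference does one deflected call buy?
# The $/1M-tokens figure is a made-up blended rate, not a real price sheet.
savings_per_deflected_call = 15.00   # upper end of the $5-$15 range
price_per_million_tokens = 3.00      # assumed blended $/1M tokens

tokens_bought = savings_per_deflected_call / price_per_million_tokens * 1_000_000
print(f"${savings_per_deflected_call:.2f} buys {tokens_bought:,.0f} tokens")
```

At those assumed numbers, a single deflected call pays for five million tokens of inference.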
And the lock-in boogeyman is something only geeks care about. Migrating from one provider to another costs so much at even medium scale that it's hardly ever worth it, between the direct costs, the distraction from value-added work, and the risks of regressions and downtime.
> Isn't the API worse?
No, for general inference the norm is to use provider-agnostic libraries that paper over individual differences. And if you're doing non-standard stuff? Throw the APIs at Opus or something.
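That kind of provider-agnostic layer is roughly a dispatch over a normalized request. Libraries like LiteLLM do this for real; here is only a toy sketch, with provider names and payload shapes simplified for illustration:

```python
# Toy sketch of papering over provider differences: one call site, per-provider
# payload builders. Real provider-agnostic libraries handle far more than this.
def to_openai_payload(model: str, prompt: str) -> dict:
    # OpenAI-style chat shape: content is a plain string.
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def to_bedrock_payload(model: str, prompt: str) -> dict:
    # Bedrock Converse-style shape: content is a list of blocks.
    return {
        "modelId": model,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
    }

BUILDERS = {"openai": to_openai_payload, "bedrock": to_bedrock_payload}

def build_payload(provider: str, model: str, prompt: str) -> dict:
    return BUILDERS[provider](model, prompt)
```

Callers only ever see `build_payload`; swapping providers is a string change, which is why the lock-in concern is smaller than it looks.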
> Aren't the p95 latencies worse?
> The costs higher?
The costs for Anthropic models are the same, and the p95 latencies are not higher; if anything, they're more stable. The open-weights models do look a bit more expensive, but as said, many businesses don't pay sticker price for AWS spend, or they find it worth it anyway.
https://cloud.google.com/blog/products/networking/aws-and-go...
This isn’t some kind of VPN solution; think more like Direct Connect, but between AWS and GCP instead of AWS and your colo.
It’s posited that AWS agreed to this so sales could tell customers they don’t have to move their workloads off AWS to take advantage of Google’s AI infrastructure without experiencing extreme latency.