Always use your own proxy where the egress is well within your free tier, i.e. do this: Azure|Amazon > Hetzner|Linode > Cloudflare
Why?
Because Cloudflare's cache is a massively multi-tenant LRU cache, and whilst hot files will be cached well (and with Cloudflare Tiered Cache even better - but this itself is a cost), anything else is still going to expose you to some degree of egress cost.
When I exposed AWS to the web I paid $3k per month to AWS. With Cloudflare in front of AWS I paid $300 per month to AWS. With Linode in front of AWS and behind Cloudflare I paid $20 per month to Linode and about $12 per month to AWS.
A Linode or Hetzner instance... or any other dumb cheap web server that comes with a healthy free tier of bandwidth is all you need to set up a simple nginx reverse proxy and have it cache things to disk https://docs.nginx.com/nginx/admin-guide/content-cache/conte...
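For what it's worth, the disk-cache setup those docs describe boils down to a handful of directives. A minimal sketch - hostnames, paths, zone name and sizes here are all hypothetical:

```nginx
# cache zone: metadata in RAM (10m), files on disk, evicted after 7 idle days
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=origin_cache:10m
                 max_size=50g inactive=7d use_temp_path=off;

server {
    listen 80;  # a real setup would add TLS here
    server_name files.example.com;

    location / {
        proxy_cache origin_cache;
        proxy_cache_valid 200 7d;                      # keep good responses for a week
        proxy_cache_use_stale error timeout updating;  # serve stale if the origin flaps
        proxy_pass https://origin.example.com;         # your S3/Azure origin
    }
}
```

Point DNS (or Cloudflare) at this box and the origin only sees cache misses.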
Or if caching is your biggest priority then Fastly or Akamai will shine too.
But if you're balancing all considerations and want the cheap "good enough" caching with the DDoS protection, free TLS certs, and unmetered bandwidth (assuming you aren't imgur or something)... then Cloudflare does a great job at being good enough. And for those sharp edges... drop in a proxy of your own, or layer your CDNs.
Why not directly Hetzner|Linode > Cloudflare?
I hate these 'cloud economics' optimizations that people tend to try.
(search HN and reddit for that URL, you'll see they've been around and recommended for a really long time).
see https://blog.cloudflare.com/introducing-r2-object-storage/
From the Cloudflare blog, it seems R2 would've handled this exact situation - auto-migration of cloud S3-like-storage objects - download from cloud-storage just once and cache in R2 for Cloudflare to serve.
Would love to find out if you can write to any/every region and have things replicate, or if you have to write to a single region. BunnyCDN's edge storage solution looked interesting until I found out it only supported writes to a single region.
Hoping R2 might be my savior here, otherwise will probably have to roll my own active-active minio cluster, which I'm not looking forward to maintaining. Other suggestions welcome!
The team I'm in at the moment is in the early stages of cloud adoption, but the company as a whole has fallen hook, line and sinker for AWS. Whenever I mention the cost there is always an excuse.
The main one being that you don't have to hire sysadmins anymore as that's all taken care of by AWS. Ah yes, but they have actually been replaced with a "DevOps" team, and just our department now spends > £1 million per year on AWS hosting costs. A 20% reduction in those fees could pay for a few sysadmins.
The next one is that no other vendor would be able to supply the kit. You know StackOverflow is able to run on a single webserver (https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...). Plus many of the other providers have loads of instances available.
I mean I'm not against cloud it's just not the cheapest option if you choose one of the big 3 providers. I use a company called scaleway (https://www.scaleway.com/en/) they have all the essential cloud services you need and everything else you can run yourself in docker or k8s.
Dealing with hardware failures, hardware vendors, confusing licensing, having to know SKUs, racking new cabinets, swapping hard drives, patching servers - it's all awful work. When you go cloud only, you can be more productive instead of dealing with some of that nonsense work.
And, honestly, I miss the old days. Today, $cloud has some weird spasms where you suddenly get an influx of connection timeouts or tasks waiting for aeons to get scheduled and you just can't log in to a switch or a machine and figure out what the exact hell is going on. You just watch the evergreen $cloud status page, maybe file some tickets and pray someone bothers to investigate, or maybe live with those random hiccups "sorry $boss, everything is 100% good on our side, it's $cloud misbehaving today", adding more resilience -> complexity -> unreliability in the name of reliability to the system. Either way, with the clouds I feel handicapped, lacking the ability to diagnose things when they go wrong.
I don't miss those three days we spent fighting a kernel panic. It was about a decade ago - we outgrew the hardware and had to get a new machine with a badass-at-the-time 10GB SFP+ NIC that worked nicely for the first few weeks, but then its driver suddenly decided to throw tantrums on an almost hourly basis. I don't even remember the details - a lot of time has passed since then - but thankfully we found some patch in the depths of the LKML and the server ran like clockwork ever since. That wasn't fun, but it was a once-in-many-years incident.
Either way, I do feel that in the ancient ages hardware and software used to be so much simpler and more reliable. Like, today people start with those multi-node, high-availability, all-the-buzzwords Kubernetes-in-the-cloud monstrosities that still fail now and then (because with so many moving parts, shit's just bound to fail at an incredible rate), and in the good old days people somehow managed with a couple of servers in the rack - some proper, some just desktop towers sitting by - and with some duct tape and elbow grease those ran without incident for years and years.
Have I turned old and sour? Or maybe it's just nostalgia for my youth, and I've forgotten or diminished most of the issues while warmly remembering all the good moments?
If cloud improved QOL for ALL employees I'd agree but I think it just shifts work around and costs more.
I've met plenty of datacenter technicians who loved their work and the opportunities for growth it provided.
Some companies really know how to manage a datacenter with minimum pain. Some don't.
Each to their own, but I think you'll find there's a fairly significant portion of sysadmins who love that work!
It's basically a form of permanent debt. Faster product market fit, higher long term infrastructure costs until you have enough breathing room to start pulling it into your own datacenter. At that point you have some negotiating leverage with the cloud provider.
On the other hand, if you're not looking for explosive growth, man oh man is DigitalOcean (or any one of a number of good providers of good old VPSes / Cloud-lite) a great option.
I've worked with teams on both sides, and everyone is gonna have to deal with figuring out how to run at scale, it's just different ways of achieving that.
I've worked with teams that manage their own infrastructure with dedicated servers, and not having to think about scaling for a long time as the one beefy server could just take whatever load you threw at it.
I've also worked with teams who don't manage their own infrastructure and thought they were ready to scale without issues, but once the scale actually happened, it turned out there were more things to consider than just the number of servers you run - race conditions were everywhere, but no one had thought about that.
Definitely a case of "right tool for the right job", but I don't think it's as easy as "Self-managed: harder to scale, PaaS/Cloud: easy peazy to scale".
For ~100eur/month on Hetzner you can get a 16-core Zen 3 with 128GB RAM and 8TB of NVMe SSD.
Unless your stack is horrendously badly optimised you can serve SO MUCH traffic off that - definitely billions of postgres records without breaking a sweat.
So the scale argument somewhat disappears - if anything, people end up adding much more complexity to the product to get round the high hardware costs of the cloud (complex caching systems for example, instead of just throwing loads of hardware at the problem).
Actually, AWS won't help you here. I have literally been on a 2-day training course on Aurora with AWS, and the explanation of how to scale was exactly the same as any traditional non-cloud explanation: correct usage of indexes, partitioning data, optimising queries (especially any non-trivial query output by an ORM) and read replicas.
In terms of explosive growth, if you're talking about something like Google or TikTok, again, slapping it all in AWS will not automatically just work. There is a lot of engineering you'll need to get to their level.
I also think you haven't really looked at the SO link I sent through; with thoughtful engineering they serve a huge user base with a tiny footprint.
> DigitalOcean or anyone of a number good providers of good old VPSes / Cloud-lite
Not sure why you are dunking on DO here; they are a fully fledged cloud provider with much the same stuff you would need. You can also run up a huge bill on DO.
Depends on the team size of said startup [0]. In my opinion, tech-shops are better off using new-age cloud providers like fly.io / glitch.com / render.com / railway.app / replit.com / deno.com / workers.dev etc [1].
[0] https://tailscale.com/blog/modules-monoliths-and-microservic...
Most of the problems here will be DBA problems, like understanding query plans and such. Even with AWS RDS, I've had to upload various settings files to tweak tunables to get things working.
They still serve a lot more traffic than I do and I have hundreds of instances; thousands of containers.
A similar “scale” e-commerce site would be significantly more load, have more dynamic data, and just be overall harder to run.
You can hire a "few" sysadmins for 200k/year?
https://uk.indeed.com/jobs?q=System%20Administrator&vjk=5149...
Probably not at FAANG level salaries but I doubt there are many sysadmins working for FAANG companies anymore.
DevOps, btw, are more expensive, and in fact in the UK DevOps can be paid more than a developer. I suspect most of the DevOps working for this company are on £65k+. According to:
https://ifs.org.uk/tools_and_resources/where_do_you_fit_in
That puts those earners in the top 3%, or, from that website:
" In the below graph, the alternatively shaded sections represent the different decile groups. As you can see, you are in the 10th decile group.
In conclusion, Your income is so high that you lie beyond the far right hand side of the chart. "
I'm curious about your workload. I tend to only use cloud for workloads where it's either (1) by far the only feasible option (e.g. need GPUs for short periods of time), or else (2) basically free.
> I mean I'm not against cloud it's just not the cheapest option
This is certainly true for most workloads. It's also true that buying is better than renting, but here I am living in a rented apartment.
The logic from on high might be something like "if demand is uncertain and capex is risky, why buy when you can rent?"
Exactly this. As a low-level / embedded / non-cloud stuff dev, I've been getting up to speed through all the cloud-ification of the industry, but I'm still scared (not literally ofc) of running most things on my own on any big cloud provider (smaller ones seem more manageable).
I'm reading this and it seems like being a customer of cloud services is like walking a dangerous path filled with gotchas and caveats, just jumping from cover to cover while hiding from danger, and hoping you're safe and didn't mess it up so far, "fingers crossed".
Like this tiny detail that he didn't realize was critical, so I would fall for it too, plus another 500 small papercuts: "oh, I set the cache up, so I hope all is well". "Yeah, no you didn't - I guess you didn't think of this detail about maximum cached file size! Gotcha, game over!"
Yeah, cloud providers should have clearer communications etc., but the fact today is that they don't. So I'd never sleep well feeling 100% confident that I had covered and taken into account every minuscule detail and possible scenario that could end up being a disaster.
Another advantage is the big network he can ask for help. There's also a chance that his blog post will reach the right person at Azure and he'll get a reduced bill.
As someone who doesn't have the same network or the "fame", I am concerned about what would have happened to me in that situation.
I am still waiting for a cloud without these dark patterns. But that will never happen, because not being hostile means leaving a large amount of money on the table.
As you say they make it hard deliberately.
Edit: Turns out Azure has this:
https://docs.microsoft.com/en-us/azure/cost-management-billi...
"Predatory death-trap pricing" captures the spirit of the thing with rather more clarity. It is wholly intentional after all.
Funny enough... Oracle (OCI) does this better: you can buy Oracle "coins" 1:1 with $ and load your account with just what you think you need.
This is how mobile and landline phone companies made enormous fortunes before flat rate billing. It’s called post-paid vs pre-paid billing.
This is very straight forward from their view, before: almost no traffic = almost no costs, now: huge traffic = $$$.
On the other hand, it doesn't seem that Troy tried to talk to them about this; he seems to want to eat the costs himself, as it was his mistake. I think that's commendable. I also think, with the amount of free advertising Troy has done for them, they'd be open to it, and I can imagine we might see a follow-up post like "MS was so nice they waived my costs".
Maybe it's not a problem when you're dealing with millions of VC money, but there's no way in hell I would host anything in a bandwidth-metered cloud service when my or my own company's money is involved.
Eh, I don't know - he is, after all, a Microsoft Regional Director and MVP for Cloud (as well as security), runs courses on cloud deployment on Pluralsight, and has given talks on Azure and reducing cloud bills, so if he can get stung, it doesn't say a whole lot of good about my chances.
There are both spending limits and alerting that you could use, but the right values would be impossible for Azure to predetermine, so they rightly ask you to set them yourself.
The result is that you get a lowest common denominator type of dashboard. And hence a whole industry of providing just a prettier dashboard on top of AWS / GCP / Azure metrics.
Datadog started with a prettier dashboard for Cloudwatch data.
Cloudability started with a prettier dashboard for the Cost and Usage Report.
And also works the other way around. The individual product teams buy development environments to circumvent the console restrictions.
For example, a few years ago, the Redshift team purchased "DataRow".
From TFA it looks like that would be 10 cents per "time series". Or, as I translate it, 10 cents every 5 minutes (I think, but I haven't used Azure in some time). $1.20/hour, $28.80/day, almost $900/month. Not too hard to drop that by making the alert less frequent. (edit: I think I saw AU$ there, so maybe it is AU$900.)
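For what it's worth, that back-of-envelope math works out; a tiny script, assuming (as this comment does, and it may not match actual Azure Monitor pricing) $0.10 per 5-minute evaluation:

```python
# assumption from the comment above: $0.10 charged per alert
# evaluation, with the rule evaluated every 5 minutes
rate_per_eval = 0.10
per_hour = rate_per_eval * (60 // 5)   # 12 evaluations per hour
per_day = per_hour * 24
per_month = per_day * 30
print(per_hour, per_day, per_month)    # roughly $1.20, $28.80, $864
```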
Monitoring CPU? Another $0.10 per month. Memory? Another $0.10.
Thankfully, not $900.
As an aside, their (Azure's) pricing docs are written in the same fishy way their technical docs are written (my opinion only)...
The alert emails are way more meaningful (with the projected amount in the subject, for example) unlike the generic ones from Azure Alerts, so you see a real alert and are prompted to take immediate action.
1: https://cloudalarm.in/Home/Docs/#how-is-budget-alarm-differe...
Also, Azure has an option to alert you beforehand if it looks like you'll go over; struggling to see how your service is any better.
In a business setting, you want your service to stay up, at the cost of spike in costs if accidents or mistakes happen.
In a personal project, you want there to be hard limit on cost, and your service to go down if spikes call for it. (I'm relatively sure that no one wants their personal projects to incur a bill of thousands of dollars by accident.)
No you don’t. This is absolutely not a given. Being a “business” doesn’t mean you suddenly have unlimited budget.
The vast majority of businesses are not “web scale” and are better off taking an availability outage than suddenly handling 1,000,000x the normal volume of traffic.
If you are selling your product via your web site and you're suddenly on TV with millions watching and accessing your site, you definitely don't want the server to go down, and autoscaling plus a bit of extra cost would be great.
But it's also the case that if they did implement hard limits of some sort, you'd be reading blog posts about how AWS destroyed my project just when it was going big because someone stuck a circuit breaker foot gun in some corner and everything stopped working properly when usage spiked.
I do think there should probably be a hard circuit breaker. It should be simple and therefore inflexible. And it should come with a big warning sign. Still people will get burned because someone will set it, a project grows, and one day it goes off.
If you're using a cloud provider I'd highly recommend setting one of those up.
In Azure it's under your Subscription and then Budgets
I truly believe they want you to use a lot of their resources on a consistent, long-term basis; they don't get long-term value from people having short, one-off anomalies, so budgets and monitoring are aligned with their customers - just not total cost of ownership calculations :)
The fact is that there are low end VPS, middle end VPS, high end VPS, and dedicated servers. If you started from a low end VPS, it is very easy to gradually upgrade your VPS.
A $5/month VPS can be used to play with tons of things. I just don't get people who use free tier cloud, unless you just want to learn about the cloud hosting per se.
Making your application scalable is a significant effort that may involve different trade-offs. Your typical Prestashop or Magento e-commerce site will still max out the DB and go down, cloud or not, but with the cloud you'll end up with a huge bill in addition to your downtime.
Engineering your application to be scalable is an option that's often not made for cost/time to market reasons which is fine, but in this case the cloud will give you much less scalability than a lot of people believe.
Their interest is in keeping you as a long-term customer. So they will help you if they can. Unexpectedly high bills like that can end the relationship in no time. And 10K is not a lot on a yearly basis - that's a few months of normal usage for lots of companies. So protecting that revenue is worth something to them. That's also worth realizing when you deal with cloud providers: you are spending non-trivial amounts of money on their services, and support is part of that deal.
I've seen several cases on both Azure and AWS where bills got waived after someone opened a support ticket starting with "oops, I just did..."
He did not enable alerts.
The post applies to everyone and I’d second it. Ask nicely for a refund in these situations, the worst that can happen is they say no.
Where did they say that “only Troy Hunt shall receive a refund, for only Troy Hunt is a good faith actor, so say we all”?
Even if you run a relatively opaque cost structure business like a restaurant, you can still calculate the maximum cost of ingredients for one month, the salaries, energy, etc. if you simply use the "best case scenario" of having every seat at every table booked for all opening hours, with people ordering your most sold dishes. Cloud computing is still leagues above that in terms of cost predictability.
I once worked for a small, non-startup software company that pondered moving its servers to Azure. The Azure partner shop analysed the needs and came up with a monthly cost "between 30k and 120k per month". They were really surprised the company stuck with its non-cloud setup, because "everybody is moving into the cloud!!"
If the restaurant suddenly ordered ten thousand times more ingredients than usual, their supplier would probably call back and say "is that really what you want?" rather than just shrugging and shipping them tonnes of tomatoes with a bill for one billion dollars.
Though in those cases the billing isn't really complex or opaque, and you _can_ monitor it if you care to check your meter regularly throughout the month. But, for the electrical case anyway, you can't drill into what exactly is consuming watts without either fancy monitoring equipment or potentially tedious investigation.
In contrast, with the cloud the bill is directly proportional to the amount of inbound requests from the Internet, with no out of the box way to implement a limit (I guess you could install Apache/Nginx and enforce a limit there, but doesn't that kinda defeat the whole point of the cloud?).
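If you did go the Nginx route mentioned above, a per-client request cap is only a couple of directives. A sketch - zone name, rates and the upstream are all illustrative:

```nginx
# allow each client IP ~10 req/s, absorbing bursts of up to 20 requests
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

server {
    location / {
        limit_req zone=per_ip burst=20 nodelay;
        proxy_pass http://backend;   # hypothetical upstream
    }
}
```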
Just ask Texans.
- when we were hit with very high traffic due to a bug or something else, most of the time it would lead to customer outages. Depending on the contract, we sometimes had to pay penalties because SLAs were not met. An outage could also lead to customers cancelling their subscriptions.
We swapped one type of problem for another.
But the overprovisioned server might still be a lot cheaper than the cloud bill. It can be totally reasonable to have a server running at 1-5% load 98% of the time if you really need the capacity for the remaining 2%.
Also, neither "scaling up" as in "re-deploying the same setup on a beefier instance" nor "scaling out" as in "let's expand to the US and have a server there" is too difficult if the setup is automated (Ansible).
Credit card chargebacks, especially.
The only thing that would really help is a hard spending limit that stops all services except storage. If your site is important, there will be such an amount of user feedback that it is impossible to miss it for long.
Or they can fail completely.
And the alerts themselves cost money if you want something reliable, so you have to weigh that against the danger. Pay-as-you-go cloud can be a maze of costing concerns.
> The only thing that would really help were a hard spending limit that stops all services except storage.
Yep. Though that is small comfort if you need to guarantee more than a couple of 9s of uptime; hopefully those with that requirement can soak up the unexpected billing blips.
Sadly, I haven't found a way to do that with AWS
Actions weren't there last time I checked (a few years ago).
No enterprise wants such limits on spend; they would be a lot more pissed if service was pulled because a spending cap set by someone sometime in the past was now exceeded. That's likely why such a feature is optional, not mandatory.
Excess/unexpected billing would be negotiated in typical sales-cycle discussions. A default hard cap, however, would result in a lot of senior people getting midnight calls for emergency budget approvals; management would get annoyed by that.
AWS has decent tools in this regard, but it pales compared to Oracle. Azure is a product I've never used at any scale (just small projects), but the fact that it actually costs money to set up alerts is gross (and morally reprehensible). Even if it's a trivial amount, that alone sours the product in my eyes. I mean, Azure is already pretty uncompetitive unless you're running on free credits, as Troy apparently is (purportedly some $13K per year, so unsure what the pitch for donations to cover a bill is about).
Oracle Cloud has an enticing free tier, but I'm too afraid to use it because it requires a credit card and I don't see any way to put a monthly cap on my budget. (I'm sure hobby projects with ~$5 - 10/month budgets isn't their target market, but I can dream :)
Edit to add the page I was reading: https://docs.oracle.com/en/cloud/get-started/subscriptions-c...
I think that the cloud provider business model that allows for uncapped maximum costs is a bit of a commercial dark pattern. What makes it somewhat more nefarious is that it is relatively easy to blame the customer.
I’m not surprised that the cloud providers are quick to refund users as it’s likely that they only do it in a fraction of cases and it buys a lot of goodwill.
It would be interesting to try to design a cloud that supports OutOfMoneyExceptions, with gradual degradation and capped liability for costs built in.
I don't actually believe so. Cloud providers are known to refund bills incurred by mistake. They make so much margins on legitimate usage by big companies & startups that it's just not worth burning developer goodwill & potentially waste efforts trying to collect a bill the customer legitimately can't pay (and will guarantee he will never use nor advocate for your service again).
https://azure.microsoft.com/en-au/pricing/details/bandwidth/
My conclusion: Troy still doesn't know how much he is paying.
Actually, wow, it seems AWS charges the same price for egress as Linode and DO. While Linode and DO do come with decent free bandwidth, this is a surprise to me.
AWS charges $0.09/GB, and Azure charges $0.0875/GB.
Maybe Troy Hunt gets a discount for being a Microsoft Regional Director and MVP. (Neither of which make him an employee of Microsoft, confusingly enough.)
https://docs.digitalocean.com/products/billing/bandwidth/
https://www.linode.com/docs/guides/network-transfer/
https://aws.amazon.com/ec2/pricing/on-demand/
https://azure.microsoft.com/en-us/pricing/details/bandwidth/
https://azure.microsoft.com/en-au/pricing/details/bandwidth/...
The AUD $0.014/GB is only for data transfer between Availability Zones.
When everything was moved to production, URL went live, nobody ever did any kind of bandwidth checking, caching, no CDN, no cost tracking. $10,000 in our first week. That's about 1/4 what our total spend on the co-located servers was for the whole year. Boss flipped his lid and wanted to kill the new guy who was on the project.
After about 2 years we got rid of all the co-located stuff and were spending about 1.5x, but we had more apps, they served heavier pages, etc.
We overspent quite heavily on our on-prem stuff for a game I helped launch, for political reasons the next game ended up running on the cloud.
The price was roughly 10x before discounts. With our heavy discounts and a wide amount of slimming down/cost optimisation (easily 3 months of work) we got it to 2.3x
There will always be a need for sysadmins/cloudops/devops for that environment, so we didn't save any headcount either.
I can't imagine getting anywhere close to parity in costs, Functions-as-a-service ended up costing more than compute instances too so we went back to compute instances in places where we thought we'd get away from it.
That said, it was a lot nicer to use!
It is important to remember that not all cloud providers play this game. For example, Hetzner Cloud explicitly states the maximum amount you are going to pay for a given instance or service in a given month. You are guaranteed not to pay more. Everybody knows why Amazon etc. refuse to do it this way.
"Your account has exceed $100 spend. Reply 'SHUTDOWN' to shutdown all services, 'STOP ALERTS' to never see this alert again, or 'DOUBLE TRIGGER' to double the alert trigger value to $200."
$100 is arbitrary, it could be any nominal sum. The idea being that the user can double the alert each time they get it just from SMS. I bet 95% of users would double their alert limit to a comfortable point. The other ~5% will be power users who customize their alerts.
The idea that these companies couldn't know what limits customers want is kinda silly. We can use the same techniques for alerts that we use in algorithms for expanding vector storage, for example. We can "amortize" alerts, so to speak.
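A toy sketch of that amortized-doubling idea (all names and numbers here are made up): if the user replies DOUBLE TRIGGER each time, a 10,000x overspend only generates on the order of log2(10,000) messages.

```python
def alerts_fired(spend, start=100.0):
    """Count alerts sent if the user doubles the threshold each time:
    thresholds go 100, 200, 400, ... so alert volume grows
    logarithmically with spend, like amortized vector growth."""
    fired, threshold = 0, start
    while spend >= threshold:
        fired += 1
        threshold *= 2
    return fired

print(alerts_fired(1000))  # thresholds 100, 200, 400, 800 crossed -> 4 alerts
```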
We have some recent case studies where we've successfully reduced cloud costs by 95%
https://www.cloudexpat.com/case-studies/
hi(at)cloudexpat.com - happy to help!
Or someone maliciously bypasses CF cache e.g. by parameters.
Cloud just is not suitable for any kind of volume egress. It's a death trap. Like going on vacation with data roaming enabled.
> I removed the direct download links from the HIBP website and just left the torrents which had plenty of seeds so it was still easy to get the data. Since then, Cloudflare upped that 15GB limit and I've restored the links for folks that aren't in a position to pull down a torrent. Crisis over.
Mice cried and stung themselves, but kept eating the cactus.
Most CDN providers have a lot of machines out on the edges of their networks, and it's understandable that they don't stuff these machines with large disks, likely preferring smaller faster SSDs. But this is a very common pitfall of CDNs that needs more attention, along with messaging on the dashboards and settings pages.
I've had problems with no warning on Cloudfront, Cloudflare, Bunny.net all from not realising that my files were beyond the CDN's cache size limit, but none of them seem to do a good job at surfacing this other than "talk to customer support".
Cloudfront does list the max size clearly in the limits and quotas page, though, and if you front your S3 bucket with Cloudfront, you could turn caching off and still get the discounted bandwidth out rates (S3 -> Cloudfront is always free, even if the file is fetched every time).
I see S3 is initial $0.09/GB, going down to $0.07 after 50TB or $0.05 after 150TB.
Cloudfront North America is $0.085 for first 10TB; but $0.110 and up for other regions. going down to $0.060 north america after 100TB, and okay $0.025 after 1PB. (but $0.050 and up in other regions even after 1PB).
So okay, Cloudfront gets cheaper egress at large scale, I guess. By about 50% though, not an order of magnitude, and could be much less depending on region.
A large bill is probably chump change for someone like Troy, for others it's a year or two of savings. The risk is not worth it.
AUD $0.014 is roughly USD $0.01, which I thought was reasonable. But on [1] only "Data transfer between Availability Zones (Egress and Ingress)" costs $0.01. Does transferring from Azure to CF count as that? Other Internet egress (routed via Routing preference: transit ISP network) starts at $0.08.
I hope someone from Azure CS can give him a custom discount.
It is also worth considering that the cost HIBP saved on cloud/serverless over the years could have been wiped out (if not more) by this single incident.
[1] https://azure.microsoft.com/en-au/pricing/details/bandwidth/...
To be clear - we would not have been able to catch this one right now :'(
Would love to hear thoughts / brainstorm ideas - is there any way we can proactively catch these types of cost spikes?
Setting up limits and alerts as part of the system creation is usually the best strategy.
Cloudflare has the same model, but they distribute the costs. The vast majority of people never use anywhere close to their share, so they subsidize the outliers and the free tier.
https://blog.cloudflare.com/the-relative-cost-of-bandwidth-a...
Convenience always costs money, there is no (big) cloud provider doing it out of their own pocket or rather not optimizing for huge profits.
It's the same as with any other service, really. So I don't understand, why some people assume it would be different here.
(Note: I am not saying that Troy Hunt assumed this, but I know people who go to the cloud because "it's cheaper". It was never cheaper on any project I worked on. It was more convenient, but in the end it was mostly more expensive.)
No matter what the traffic is, the first thing to do with any cloud service provider is to set budget alerts according to your wallet, be it one with credits or otherwise. At this point, I don't even try any new cloud service provider that doesn't offer credible budget alerts.
Another key takeaway is,
> Huh, no "CacheControl" value. But there wasn't one on any of the previous zip files either and the Cloudflare page rule above should be overriding anything here by virtue of the edge cache TTL setting anyway.
Even this could blow up. Cloud storage providers typically leave "CacheControl" unset by default, and if you want to cache something that CF doesn't cache by default (e.g. *.html using Page Rules), then you need to set CacheControl (e.g. max-age) at the cloud storage end too.
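A quick way to reason about this: parse the header you actually got back and see what a cache would plausibly do with it. A simplified sketch (real CDNs layer their own rules and size limits on top of the header):

```python
def cacheable_max_age(cache_control):
    """Return the max-age in seconds a shared cache would honour,
    or 0 if the response is effectively uncacheable.
    Simplified: ignores s-maxage, Expires, vendor-specific headers."""
    if not cache_control:
        return 0  # no header at all -> many CDNs won't cache it
    directives = [d.strip() for d in cache_control.lower().split(",")]
    if "no-store" in directives or "no-cache" in directives or "private" in directives:
        return 0
    for d in directives:
        if d.startswith("max-age="):
            return int(d.split("=", 1)[1])
    return 0

print(cacheable_max_age("public, max-age=31536000"))  # 31536000
```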
P.S. I've written about these recently on my blog titled 'Saving Cloud Costs'[1] from a frugal solopreneur PoV.
It's my opinion that it's better to work with known limitations and optimize for them.
In the case of bandwidth, work with a fixed pipe size, or do the math and set up a QoS that implements a throttle to avoid exceeding your bandwidth allotment.
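The "do the math and throttle" approach is usually a token bucket: refill tokens at your allowed rate and refuse to serve when the bucket runs dry. A minimal sketch (the `clock` parameter is only there to make it testable):

```python
import time

class TokenBucket:
    """Token-bucket throttle: refills `rate` tokens (e.g. bytes) per
    second up to `capacity`; consume() returns False once the budget
    for this interval is spent."""
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate, self.capacity, self.clock = rate, capacity, clock
        self.tokens, self.last = capacity, clock()

    def consume(self, n):
        now = self.clock()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

Size `rate` so that a full month at that rate stays inside your bandwidth allotment, and requests beyond it get a 429 or a "bandwidth exceeded" page instead of a surprise invoice.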
This happened to Murfie a couple of years ago, and that's why I had to step in to try to fix things. I'm still trying, and there are still challenges, but I won't allow landlords and cloud costs to disrupt things again.
It's time for Cloudflare to work a bit on its UX
Uh no - it's on cloudflare and azure. Why don't they have a global setting that says Max Charges Per Month: $X and it just shuts down when it hits that number? This is why I don't really like using big cloud services like this.
Turns out I wasn't setting x-ms-blob-cache-control when writing all the blobs, so that's a win right there.
(interestingly, it appears that rclone, which I was in the process of moving to, doesn't do that, so I might have to keep my custom Azure storage library around)
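For reference, at the REST level this comes down to two request headers on the blob PUT. A sketch of what such a request would carry — the one-year max-age is just an illustrative value:

```python
# Headers a Put Blob request needs so that downstream caches (like
# Cloudflare) see a long max-age. The one-year value is illustrative;
# x-ms-blob-type and x-ms-blob-cache-control are the standard
# Azure Blob REST headers for this.
def blob_put_headers(max_age_seconds=31536000):
    return {
        "x-ms-blob-type": "BlockBlob",
        "x-ms-blob-cache-control": f"public, max-age={max_age_seconds}",
    }
```

Whatever client library you use (SDK, custom code, rclone) ultimately has to emit something like this, or the blob ends up with no Cache-Control at all.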
$ shard="$(echo "${sha1}" | cut -c 1)"   # first hex digit of the SHA-1 selects the shard
$ cdb -q "pwned-passwords-v8-sha1-${shard}.cdb" "${sha1}"
But as a cloud evangelist at Microsoft, you may sing the corporate IT gospel anyway.
¹https://mro.name/agakdfa
He should have known better: there is always a risk that you don't know some detail that ends up costing you a lot of money.
Cloud bandwidth is soooooooooo expensive. If there is a risk that you have to pay this, please use a provider like Hetzner with fixed costs. If you like your serverless things, just host the big files at Hetzner.
It's suspicious that cloud providers STILL don't have any sort of "circuit breaker" infrastructure for this sort of thing - yes, you can set up alerts, but you can't say, "shut the whole thing down before the costs go above a certain threshold".
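Since providers don't offer such a breaker, the only place you can put one is in your own application, tracking what you serve yourself. A sketch under that assumption (the byte counter is hypothetical — nothing in Azure or Cloudflare feeds it for you):

```python
# Application-level egress circuit breaker. Assumes you count the
# bytes you serve yourself (hypothetical counter; providers don't
# expose a real-time one).
class EgressBreaker:
    def __init__(self, monthly_cap_bytes):
        self.cap = monthly_cap_bytes
        self.served = 0

    def allow(self, nbytes):
        """True if serving nbytes more stays under the monthly cap."""
        if self.served + nbytes > self.cap:
            return False  # serve a "bandwidth exceeded" page instead
        self.served += nbytes
        return True
```

It's crude — a denied large request can still be followed by an allowed small one — but it turns "costs above a certain threshold" into an error page rather than an invoice.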
Kind of sad that a service we are accustomed to using, and that various software integrates (whether via the HIBP API or the downloaded pwned passwords archive), rests on the shoulders of a single guy who now has to pay for his mistake.
Great that Cloudflare helps him with the service; otherwise, who knows whether we'd still have access to HIBP at this scale?
The fact is that stuff like this can happen. Considering how many variables are in play to determine the final cost of a cloud service, it is very much a double-edged sword. Sometimes you cut yourself unintentionally.
So now we all learn from this, I suggest we help him out.
If you're not Troy Hunt or another celebrity with special access to Cloudflare, I don't think you really have Cloudflare working with you to ensure that your data gets cached and your egress stays minimal for large files on a very cheap Cloudflare plan. (Based on the costs Hunt reported as catastrophic, I don't think he's paying Cloudflare for a large enterprise plan.)
(Also, it's unclear if caching large data like this is even within the ToS of Cloudflare?)
I don't think Cloudflare promises to cache any particular URLs for any particular amount of time (they respect cache headers as an upper bound, but don't promise never to evict sooner; they evict LRU according to their own policies). Cloudflare's marketed purposes include globally distributed performance, and security. I don't think they include "saving egress charges by long-term caching your data".
I have a much smaller project, but egress charges for data are an increasingly large part of my budget. I've been trying to figure out what, if anything, can be done about it. I wish I had a guaranteed, within-ToS way to get ultra-long caching for very large data files from Cloudflare at an affordable fixed-rate price. (Maybe I do? But I just haven't reassured myself of it yet.)
> In desperation, I reached out to a friend at Cloudflare… I recalled a discussion years earlier where Cloudflare had upped the cacheable size… Since then, Cloudflare upped that 15GB limit…
Since I'm looking for solutions for this same problem (delivering lots of data at very cheap prices), I am finding myself a bit annoyed that Hunt is talking about how he solved it, using tools/price-levels not available to most of us who don't have his level of access due to position.
Interestingly, Microsoft/Azure is part of the "Bandwidth Alliance" with Cloudflare, which one might initially think means there are no egress charges when delivering to Cloudflare. (That is what it means for some other alliance members, like Backblaze.) But that's clearly not the case, or this story wouldn't have happened, right? It turns out Azure gives you a fairly small egress discount when delivering to Cloudflare, and only if you set things up in a non-standard way.
as long as your human admin costs are lower than cloud services
I get that everyone has an obsession with dirt cheap providers instead of cloud solutions like aws/azure. But that doesn’t mean it’s better. Everything has pros and cons.
What would happen if a credit card limit was exceeded, a site would just stop working?
But still, I couldn't help getting the following lasting impression after reading it: these days, being able to click around the UIs of the cloud providers should be a billable skill by itself.
I took a good course on pluralsight about AWS and the first lesson was to setup a billing alert.
What would hard limits do to your infra? You can't take down / suspend DBs, EC2 instances, etc. just because you set a 1k USD limit and that's it.
Alerts are the first thing you should set up, IMHO.
I really appreciate the work that Troy is doing, but seeing much-needed money ending up at Microsoft or Amazon leaves a bitter taste. I hope at some point it will become cool again to just rent a VM or dedicated server for small projects and stop throwing so much money at the already richest people in the world.
As far as I am concerned, I just don't understand why people use cloud services.
Edit: Consider this article, and Geoff's statement about Azure credits.
https://www.theregister.com/2021/04/21/microsoft_revokes_mvp...
I really don't understand the cloud craze. Everything is more complex to debug, more expensive, and more shitty in all the possible ways you can imagine. I mean, I was not exactly a fan of the VPS craze 10-15 years ago either, but at least it wouldn't automatically ruin your bank account whenever you got a little traffic.
Kudos to the author for having so much money (thousands in one month?!) to waste. I wish I did too :)
Cloud providers love it when people do this, and they are famously easy to talk to when you get an unexpected invoice high enough that remortgaging your house would only begin to address it. But unless you're working on a side hustle that inherently needs to run in the cloud regardless of scale, or are experimenting with cloud technologies in an explicitly time-boxed toy project, I think using cloud services is the financial equivalent of handing a hobbyist craftsperson one of those chainsaw angle-grinder attachments that even professionals find hard to keep from bouncing into their body.
If you do want to use cloud services for anything you pay out of your own pocket, the first consideration should be cost management and monitoring. Your employer might have big enough pockets to shrug off a runaway compute instance you forgot about for a month, but that can quickly translate into money that can be anything from inconvenient to life altering if it comes out of your personal budget.
Or just stick with the free tier and make sure everything simply shuts down if you run out. Sure, a "bandwidth exceeded" error page might not get you as many upvotes on HN, Reddit or social media, but it also won't impair your finances.
I'm sure there becomes a point where cost of (hardware + maintenance + staffing) > (cloud + staffing), in which case sure crack on. But like you, I'll stick to a rented server for my stuff.
Their monthly cost is something between 0 and a few cents.
Stuff like Hetzner is fine, but if you know your way around AWS you can realize massive cost savings. Probably the same for Azure.
Finally, in many places 40 EUR for a pet project is actually a lot of money.
Also, as you can see in a screenshot in TFA: some services are simply dirt cheap. The storage account and its various “sub-services” are such a case. It’s hard to compete with dedicated hardware here.
Depending on your dedicated hosting provider, the traffic cost trap exists, too. Hetzner is a bit of a special case.
He has a writeup here on how he gets costs down in a big way: https://www.troyhunt.com/serverless-to-the-max-doing-big-thi...
Current workplace is considering a fully self-hosted stack as a unique selling point for the customers and segments we're in. That means we have storage and Linux admins available, as well as the tooling and know-how to run this securely and efficiently. Thus, placing large and often-downloaded files on our file stores at Hetzner is very much a no-brainer, because it adds very little workload to the teams maintaining these stores and it's cheap.
However, this can be a daunting thing if you don't have this skillset in the org. It can be learned, but that's time spent not working on the product (and it's not trivial to learn good administrative practices from the hell that google results can be). At such a point, a cloud service just costs you less man-hours. And again - it wouldn't be much time for me, but it would be a lot of time if you had to figure all of that out on the fly. That's essentially why the saying goes that cloud services save you time, but cost money.
1. When they need to adjust rapidly between different resource usage profiles, e.g. because they are growing rapidly and can't predict what the usage will be X days in advance
2. When they have huge resource requirements and don't care to invest in their own infrastructure, but can negotiate lower rates with a cloud provider
3. When their resource usage is modest but profitability is high enough that cloud expenditure is a rounding error
A month later, an NTP security vulnerability was discovered; soon the server was taken offline, and some not-so-nice 'patch your things asap' emails came in. Since then, my take is that one should spend some time, probably daily, on one's own server if one wants to maintain it.
I’m a huge Hetzner fan, and their cloud offering is definitely growing but still isn’t as convenient and featureful as it could be (and they don’t share their roadmap currently so hard to tell what they’re working on next).
I’m trying to do something about it though, working on Nimbus Web Services[0]. In my mind all we need is something to bridge the managed services gap and make it very easy to set up the basic 3 tier app with some amount of scale/performance elasticity!
[0]: https://nimbusws.com
- Security Information and Event Management - exports, alerts, OS configuration
- OS/Application Hardening - Encryption, Password/keys rotation, CIS/other baselines, Drift Management
- Backup - Encryption, (don't forget your passwords/keys are changing), retention, data protection compliance, monitoring, alerting, test days
- High Availability - replication, synchronisation, monitoring, alerts, test days
This is just the tip of the iceberg. If you operate in an environment where insurance, reputation, regulatory compliance, certification, etc. are important, then it's easy to see why PaaS solutions are desirable.
This recurring question of "why AWS/Azure instead of Hetzner/OVH ?" keeps happening because people are incorrectly comparing higher-level PaaS to lower-level IaaS without realizing it.
PaaS and IaaS are not equivalent. IaaS is not a direct drop-in replacement for PaaS to save money if the workload is using PaaS features that IaaS does not include.
The author Troy Hunt is using the higher-level Azure services like Table Storage (like AWS DynamoDB/SimpleDB) and Azure Functions (like AWS Lambda), and others. E.g. One of the article's hyperlinks talks about using Azure Functions.[1]
If he used Hetzner, he'd have to reinvent the Azure services stack with open-source projects (some of which are buggy and immature) and expend extra sysadmin/programming work for something that's not as integrated. The Azure/AWS stack includes many desirable housekeeping tools such as provisioning, monitoring, routing, etc which he'd also have to re-invent.
TLDR: People choose Azure/AWS because it has more features out of the box. You just have to figure out on a case-by-case basis if the PaaS value-add makes financial sense for your particular workload.
EDIT to downvoters: if Hetzner actually has built-in equivalents to AWS Lambda and DynamoDB, please reply with a correction because I don't want to spread misinformation.
[1] https://www.troyhunt.com/serverless-to-the-max-doing-big-thi...
I use the credit card of my employer. For my own projects I use my own server for everything. Granted, it doesn't get much traffic.
Some offers from cloud providers are pretty good. If you want to scale to more (virtual) machines, it can be more easily done with the usual providers. I also expect Amazon to know more about firewall and reverse proxy configuration, it renews my certificates automatically and has rudimentary services for monitoring of server state. There is a certain convenience to it.
Would I recommend cloud based hosting? Absolutely not. You become dependent on the provider and prices are often steep. Even if you do not know much about server security, your unsecured s3 bucket will be far more exposed than your standard db installation on your own server. Better build expertise for systems you have full control over than to invest the time on the details of AWS which are more subjected to change.
For companies the benefits are the abiltiy to get new servers at a click of a button and get rid of a server. For example, asking the ops team to setup a snapshot of a database for a few hours while I do something is super useful.
There is also the ability to use autoscaling and other features to automagically scale your system to handle traffic peaks. With dedicated servers you need to always have those resources available. It's attractive to managers that they're only paying for resources when they're using them.
There are also managed services like DynamoDb, Lambda, S3, etc that can make things easier and reduce your sysadmin work. And allow you to get up and running very quickly.
Obviously, a major downside is that the pricing is extremely vulnerable to spikes like this. I think we see an article like this every 3 months or so. This one is rather tame compared to some others that were 10x as much for a 24-hour period.
Well that's the first issue. Many people have automated large parts of their infrastructure in this way so that distributing one huge file becomes part of that whole mess. The goal is of course to keep costs down to a minimum. You can actually do a lot with little money using cloud services.
But the careful balance is that you can easily miss little details. But how does that differ from any systems administration? The details are just in new areas that didn't exist 5-10 years ago.
And the details you miss are more likely to increase cost. And when you process a lot of traffic, you're popular, that can go real fast.
20 years ago in hosting we might get a porn stash on a hacked NT4 server that would draw bandwidth. And back then a whole company might have 100Mbit fiber so you'd notice.
The reason to use cloud-style services is so you can focus on building the product quickly instead of building and maintaining architecture. But once the product is stable, a cost-reduction pass is in order.
To handle that day of getting 1 million customers, which you've been forever optimising for.
Any.. day.. now...
From the article:
> This was about AU$350 a day for a month… priced at AU$0.014 per GB
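Those two figures imply a daily egress volume worth doing as a back-of-envelope check:

```python
# Back-of-envelope from the article's figures:
# AU$350/day at AU$0.014 per GB.
daily_cost_aud = 350.0
price_per_gb_aud = 0.014
gb_per_day = daily_cost_aud / price_per_gb_aud
print(round(gb_per_day))  # 25000 GB, i.e. roughly 25 TB of egress per day
```

Roughly 25 TB a day — the kind of volume where per-GB egress pricing and flat-rate pipes live in completely different universes.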
A company could not stay in business if every one of their “unlimited 1 Gbps” customers for €40 per month actually used that bandwidth.
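The oversubscription math behind that claim is easy to check:

```python
# Why "unlimited 1 Gbps" must be oversubscribed: one fully saturated
# 1 Gbps link for a 30-day month moves roughly a third of a petabyte.
gbit_per_s = 1
seconds_per_month = 30 * 24 * 3600          # 2,592,000 s
tb_per_month = gbit_per_s * seconds_per_month / 8 / 1000  # Gbit -> GB -> TB
print(tb_per_month)  # 324.0 TB
```

324 TB for €40 would be well under €0.001 per GB — orders of magnitude below any cloud's per-GB egress rate, which only works because almost nobody saturates the link.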
Take a look at their datacenter in Germany: https://www.youtube.com/watch?v=5eo8nz_niiM