We had about 25 Dell R730xd servers. When the cluster started to fill up, we would just replace drives with larger ones. Upgrading drives with SwiftStack is a piece of cake. When I left we were upgrading to 10TB drives, as that was the best pricing. We didn't buy the drives from Dell as they were crazy expensive. We just bought drives from Amazon/Newegg and kept some spares onsite. We got a better warranty that way too. Dell only had a 1-year warranty, but the drives we were buying had a 5-year warranty.
Idk what your team’s expertise is, but I’d advise avoiding the cloud as long as possible. If you can build out an on-premise infrastructure, it will be a huge competitive advantage for your company because it will allow you to offer features that your competitors can’t.
Examples of this:
- Cloudflare built up their own network and infrastructure and it’s always been their biggest asset. They set the standard for free tier of CDN pricing, and nobody who builds a CDN on top of an existing cloud provider will ever beat it.
- Zoom. By hosting their own servers and network, Zoom can similarly offer a free tier without free customers costing them money in variable bandwidth charges.
- WhatsApp. They scaled to hundreds of millions of users with fewer than a dozen engineers, a few dozen (?) servers, and some Erlang code.
IMO defaulting to the cloud is one of the worst mistakes a young company can make. If your app is not business critical, you can probably afford up to a day of downtime or even some data loss. And that is unlikely to happen anyway, as long as you’ve got a capable team looking after it who chooses standard and robust software.
And cheap.
If you put people in charge who are looking for ways of expanding their empire and budget through spending money on EMC/VMWare/Oracle/etc/etc then you can quickly wind up spending a lot more money.
Simple network designs, simple server designs, and simple storage designs, with mostly open source software used everywhere, can be highly competitive with cloud services.
Mostly, all Amazon did to create AWS/EC2 was fire anyone who said words like SAN or EMC, do everything very cheaply using open source software, and evolve away from enterprise vendors toward commodity hardware.
If you make "frugality" a core competency in your datacenter design like Amazon did, then you can easily beat the cloud.
You also need [dev]ops people who are inclined to say "yes" to the business, who know how to debug things, and who can operate without needing to phone up EMC.
Buy storage servers from 45Drives; they basically build the same hardware Backblaze uses. Add copper 10G NICs to the servers.
Get the necessary 10G switches with 40G uplink ports, whatever your favorite brand is. Use 10GBASE-T to the servers.
Install the hardware in a quality data center, like one of these:
https://www.digitalrealty.com/
And get 10G virtual cross connects to AWS.
Back-of-the-envelope calculation: you need 30PB raw, so about 60 servers. They aren't really that power hungry, so 10 per cabinet. That's 6 cabinets and at least 6+2 switches.
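To make that sizing easy to redo with your own numbers, here is a rough sketch of the arithmetic. The 500 TB raw per server is my assumption (it is what 60 servers works out to if the total is read as 30 PB), and treating "6+2" as one switch per cabinet plus two extra is my reading, not something stated above. The function and parameter names are made up for illustration.

```python
import math


def size_cluster(raw_tb_needed, tb_per_server, servers_per_cabinet, extra_switches=2):
    """Back-of-the-envelope sizing: returns (servers, cabinets, switches)."""
    # Round up: you cannot buy a fraction of a server or a cabinet.
    servers = math.ceil(raw_tb_needed / tb_per_server)
    cabinets = math.ceil(servers / servers_per_cabinet)
    # Assumption: one switch per cabinet, plus a couple of extra
    # (aggregation or spare) switches on top.
    switches = cabinets + extra_switches
    return servers, cabinets, switches


# Assumed inputs: 30 PB raw total, 500 TB raw per server, 10 servers per cabinet.
plan = size_cluster(raw_tb_needed=30_000, tb_per_server=500, servers_per_cabinet=10)
print(plan)  # (60, 6, 8)
```

Change `tb_per_server` to match whatever chassis and drive size you actually buy; everything else follows from it.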
Software-wise, you have lots of options with this infra. It has a high upfront cost but a low MRC (monthly recurring cost) compared with all the other options, assuming you have skilled sysadmins who know what they are doing.
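If you want to sanity-check the "high upfront, low MRC" trade-off, a break-even calculation is the quickest way. This is only a sketch: the function name is hypothetical, and the dollar figures in the example are placeholders I picked to make the arithmetic obvious, not real quotes from any vendor or cloud provider.

```python
def break_even_months(upfront_cost, onprem_monthly, alternative_monthly):
    """Months until the upfront spend is paid back by the lower monthly cost.

    Returns None when the on-prem monthly cost is not actually lower,
    since in that case the upfront spend is never recovered.
    """
    monthly_savings = alternative_monthly - onprem_monthly
    if monthly_savings <= 0:
        return None
    return upfront_cost / monthly_savings


# Placeholder numbers only, chosen for easy arithmetic.
months = break_even_months(
    upfront_cost=1_000_000,      # hardware purchase (hypothetical)
    onprem_monthly=20_000,       # colo, power, cross connects (hypothetical)
    alternative_monthly=70_000,  # the option you are comparing against (hypothetical)
)
print(months)  # 20.0
```

Remember to fold sysadmin salaries into `onprem_monthly` if the alternative would let you run with fewer people.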