What part of the cost gets out of hand? Having to have a Machine for every process? Do you remember what napkin math pricing you were working with?
For example, I could get a digitalocean vm with 2gb ram, 1vcpu, 50gb storage, 2tb bandwidth for $12/mo.
For the same specs at fly.io, it'd be ~$22/mo not including any bandwidth. It could be less if it scales to zero/auto stops.
I recently tried experimenting with two different projects at fly. One was an Attic server to cache packages for NixOS, used only by me and my own VMs. Even with auto scaling to zero, I think it was still around $15-20/mo.
The other was a fly GPU machine with Ollama on it. The cold start time + downloading a model each time was kind of painful, so I opted for just adding a 100gb volume. I don't actually remember what I was paying for that, but probably another $20/mo? I used it heavily for a few days to play around and then not much after that. I do remember doing the math and thinking it wouldn't be sustainable if I wanted to use it for stuff like home-assistant voice assistant or going through pdfs/etc with paperless.
On their own, neither of these is super expensive. But if I want to run multiple home services, the cost is just going to skyrocket with every new app I run. If I can rent a decent dedicated server for $100-$200/mo, then I at least don't have to worry about the cost increasing on me if a Machine never scales to zero due to a healthcheck I forgot about or something like that.
Sorry if it's a bit rambly, happy to answer questions!
I would be curious how the Attic server would have gone with a Tigris bucket and local caching. Not sure how hard that is to pull off, but Tigris should be substantially cheaper than our NVMes, and if you don't really NEED the I/O performance you're not getting anything for that money. Which is a long-winded way of saying "we aren't great at block storage for anything but OLTP workloads and caches".
One thing we've struggled to communicate is how _cheap_ autosuspend/autostop make things. If that Machine is alive for 8 hours per day you're probably below $8/mo for that config. And it's so fast that it's viable for it to start/stop 45 times per day.
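The duty-cycle math here is simple enough to sketch. Taking the ~$22/mo full-time figure quoted earlier in the thread for that 2 GB / 1 vCPU config (an assumption for illustration; real billing is per-second and varies by region):

```python
# Back-of-the-envelope: compute-billed time scales with hours awake.
# $22/mo is the assumed always-on price from the thread, not an
# official quote; 8 h/day is the hypothetical uptime mentioned above.
full_time_monthly = 22.0      # $/mo if the Machine never stops
hours_awake_per_day = 8

effective_monthly = full_time_monthly * hours_awake_per_day / 24
print(f"~${effective_monthly:.2f}/mo")
```

Which lands around $7.33/mo, consistent with the "probably below $8/mo" claim. (Volume storage, if any, still bills whether or not the Machine is stopped.)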
It's kind of hard to make the thing stay alive with health checks, unless you're meaning external ones?
We are suboptimal for things that make more sense as a bunch of containers on one host.
I'll have to look at autosuspend again too. I remember having autostop configured, but not autosuspend. I could see that helping with start times a lot for some stuff. It's not supported on GPU machines though, right? I thought I read that but don't see it in the docs at a quick glance.
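For anyone following along, the relevant knobs live in the `[http_service]` section of `fly.toml` (field names as I recall them from Fly's docs, so double-check before relying on this sketch):

```toml
[http_service]
  internal_port = 8080
  auto_stop_machines = "suspend"   # "stop" fully stops; "suspend" resumes much faster
  auto_start_machines = true
  min_machines_running = 0         # allow scaling all the way to zero
```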
> It's kind of hard to make the thing stay alive with health checks, unless you're meaning external ones?
Sorry, I did mean external healthchecks. Something like Zabbix/Uptime Kuma. For something public-facing, I'd want a health check just to make sure it's alive. With any type of serverless/functions, I'd probably want to reduce the healthcheck frequency to avoid the machine constantly running if it's normally low-traffic.
> We are suboptimal for things that make more sense as a bunch of containers on one host.
I think my ideal offering would be something where I could install a fly.io management/control plane on my own hardware for a small monthly fee and use that until it runs out of resources. I imagine it's a pretty niche case for enterprise unless you can get a bunch of customers with on-prem hardware, but homelabbers would probably be happy.