When it comes to companies I mostly support cloud these days, but when it comes to me and my family I accept every downside and host almost all of our digital lives in a 42U rack in a gutted closet in our house, with static IPs and business fiber.
I know where our data lives and no one can access it without a warrant and my explicit knowledge. I also save myself several hundred dollars a month in third-party cloud provider fees to host the same services, and I can reboot, upgrade, or repair anything whenever I want, yet in general it needs no more maintenance than cloud servers. I also never end up with exciting bills when experiments are forgotten about.
You pretty much get all the pros and cons of home ownership. For me it is mostly pros. Also keeps me dogfooding all the same practices I recommend to my clients.
IIRC we ended up using it as a disposable replica for some non-real time but heavy operations.
API access for managing configuration, version updates/rollbacks, and ACL.
A solution for unlimited scheduled snapshots without affecting performance.
Close to immediate replacement of identical setup within seconds of failure.
API-managed VPC/VPN built in.
No underlying OS management.
(Probably forgot a few...) I get that going bare metal is a good solution for some, but comparing costs this way without a lot of caveats is meaningless.
My wife is quite concerned about being left with a web of home automation, hosted email, etc. I understand her and am trying to find a way out.
My current idea is to document how to de-automate the home and how to deal with email and fiber access (the main things to worry about).
Any ideas are very much welcome.
Technically option 3 should be the best since it also engineers around generations (options 1 and 2 would roughly be locked to a narrower age group). But it can also be a double-edged sword: what if they don't like tech? Or what if you overdo it trying to make them interested in tech and self-hosting and it backfires?
So yeah, no real solution yet. But I’d subscribe to that newsletter if there was one
Maybe some kind of script(s) that could be run that just do all the de-automation?
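Even something small might do; everything in the sketch below is hypothetical (service names, paths), it just illustrates the idea of one "break glass" script plus a plain-text runbook:

#!/bin/bash
# Hypothetical de-automation script: service names and paths are examples
# and would have to match the actual setup.
set -euo pipefail
# Stop the automation stack so lights/thermostats fall back to their manual controls
sudo systemctl disable --now home-assistant zigbee2mqtt mosquitto || true
# Remind whoever runs this of the manual steps that cannot be scripted
echo "Next: switch the domain's MX records at the registrar to a hosted mail provider."
# Leave a plain-text runbook somewhere a non-technical person will find it
cp /srv/docs/manual-fallback.md "$HOME/Desktop/READ-ME-IF-SOMETHING-BREAKS.txt"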
People who are experts in cars can own very expensive cars and tools to tune them.
People who have been working in music can have very expensive instruments and expensive headphones, microphones, sequencers, etc.
We seem to be looking down on experienced "computer experts" and wanting to take their tools away. It's been grinding my gears lately.
> You pretty much get all the pros and cons of home ownership.
And that's the heart of the matter. Everyone here is arguing in circles based on their feelings about cloud vs their feelings on bare metal (and from what I can tell, pride in their own abilities), but at the end of the day it's a cost-benefit tradeoff. Everyone picks that tradeoff for themselves. As my immigrant parents are getting older, I'm thinking of getting them a business internet line and a SIP phone in their house so that if they need me in an emergency (health or otherwise) they can reach me quickly/reliably. It's something I'm weighing based on the cost of the service/infrastructure/maintenance against my parents' technical and social capabilities (which are limited as older immigrants from a non-Western country.)
It's all about tradeoffs. As computing professionals we have the knowledge and the skills to uniquely take advantage of these tradeoffs in our personal lives. Much like I know quite a few cabinetmakers who do actually make furniture for their house with their skills, their tools, and the shop they work in. That doesn't stop most people from buying furniture from Ikea or most businesses from using cloud hosting.
Your main point still stands though, not everyone can or wants to do that.
Then there's a set of about six or seven $4-per-month KVM VMs at geographically distributed offsite third-party hosting companies for things like secondary authoritative nameservers and such.
And an offsite backup system that mirrors everything on the system in colocation to a medium sized disk array that lives in the corner of a closet in a family member's basement.
I don't think systems administration is a lost art; people just wanted to do things more easily with less money and staff, and no one can blame them or the industry as a whole. The art is still right there: Linux hasn't changed that much in 20+ years, and I still have my first discs. It also takes some experience to "feel" your way through systems administration problems, working from the bottom up of course; feeling like it's a network problem still has a high rate of success when I'm troubleshooting!
I try everyday to teach some of these skills to people I work with, I just call it devops, or SRE, or some cyber cloud support position someone makes up, they are all systems admins/engineers to me still. I enjoy watching people learn and apply that knowledge to future problems, getting to the root cause or close to it, and the satisfaction that comes from fixing the issue from start to finish on your own, looking things up is not cheating in systems administration!
[0] https://raspberrypi.stackexchange.com/questions/135610/conne...
[ 150.076220] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:468 dev_watchdog+0x308/0x30c
[ 150.076255] NETDEV WATCHDOG: eth0 (bcmgenet): transmit queue 1 timed out
Essentially Ethernet chucks a wobbly until reboot and it's only possible to connect via Wi-Fi.

I run my stuff from home too, though it is smaller scale than yours currently. Off-site & soft-offline backups are on encrypted volumes on servers & VMs elsewhere.
I get my static IP by using the smallest VPS from vultr with a wireguard tunnel forwarding http/s traffic to a docker container running nginx proxy manager.
For those wanting to learn, I highly recommend joining r/homelab and r/selfhosted. Those communities have a lot in common and you can learn a lot.
Most modern routers have an option to integrate with dynamic dns providers.
Knowing how to keep your server running and understanding internals is a great skill, but that doesn't mean that progress should stop.
Standalone servers are great, but this greatness comes at a price. It takes time to maintain a server, and it takes time to configure additional services. But at the same time they bring you joy (and frustrations) and much more knowledge and a deeper understanding of what goes on under the hood.
I'm gonna run my private cloud merging 3 different un-backed up physical computers, and migrate services off Google.
That's my second free 42U rack; the other was mostly used as shelf space. I've also got a third rack rusting in my backyard, which I bought locally for $200, originally intended to run my former employer's test infra, and which I brought back home after they laid us off.
I have venting into my ceiling which connects outside, but I am mostly using consumer gear like Raspberry Pis and Intel NUCs, and the rack fully fills the doorway, so noise and heat are not a real issue.
As far as you know?
Your data is exposed to The Internet so someone could be accessing it.
Who will keep maintaining your infra when you die, say you get hit by the infamous bus later today?
I however mostly detest how almost every third party SaaS sells out our metrics and data for profit, long term consequences be damned. The less I empower profit maximizing machine learning to manipulate me and my family the better.
I do not even need to trust my friend, as Duplicity encrypts all data against a YubiKey-held PGP key before it leaves.
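For anyone unfamiliar with Duplicity, a minimal sketch of that kind of setup (key ID, paths and destination are made up; the private key never leaves the YubiKey):

KEYID=0xDEADBEEFCAFEBABE
# encrypts locally to the public key before anything is sent
duplicity --encrypt-key "$KEYID" /srv/data sftp://backup@friends-house.example.net//tank/backups/myhost
# restoring is the only step that actually needs the YubiKey plugged in
duplicity restore sftp://backup@friends-house.example.net//tank/backups/myhost /srv/restore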
The backup NAS could phone home from anywhere with internet access.
I'm pretty sure most people sysadmin'ing their Linux servers are actually doing it with rented dedicated servers. TFA btw specifically mentions: "don't manage physical hardware". Big companies like Hetzner and OVH have hundreds of thousands of servers and they're not the only players in that space.
They don't take care of "everything" but they take care of hardware failure, redundant power sources, Internet connectivity, etc.
Just to give an idea: 200 EUR / month gets you an EPYC 3rd gen (Milan) with shitloads of cores, shitloads of ECC RAM, and fat bandwidth.
And even then, it's not "dedicated server vs the cloud": you can very well have a dedicated server and slap a CDN like CloudFlare on your webapp. It's not as if CloudFlare were somehow only available to people using an "entire cloud stack" (whatever that means). It's the same for cloud storage / cloud backups etc.
I guess my point is: being a sysadmin for your own server(s) doesn't imply owning your own hardware and it doesn't imply either "using zero cloud services".
Here is my Wireguard server (cheap VPS) and client (my home servers) config:
#
# Client (the actual self-host local server)
#
[Interface]
## This Desktop/client's private key ##
PrivateKey = redacted

## Client IP address ##
Address = 10.10.123.2/24

[Peer]
## Ubuntu 20.04 server public key ##
PublicKey = redacted

## Set ACL ##
AllowedIPs = 0.0.0.0/0

## Your Ubuntu 20.04 LTS server's public IPv4/IPv6 address and port ##
Endpoint = redacted:12345

## Keep connection alive ##
PersistentKeepalive = 15

#
# Server (in the WireGuard context, exposed to the Internet)
#
[Interface]
## My VPN server private IP address ##
Address = 10.10.123.1/24

## My VPN server port ##
ListenPort = 12345

## VPN server's private key, i.e. /etc/wireguard/privatekey ##
PrivateKey = redacted

## Forward inbound web traffic from the VPS to the home server; add lines for more ports if desired ##
PostUp = iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 10.10.123.2
PostDown = iptables -t nat -D PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 10.10.123.2

[Peer]
## Desktop/client VPN public key ##
PublicKey = redacted

## Client VPN IP address (note the /32 subnet) ##
AllowedIPs = 10.10.123.2/32
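To actually bring that up (assuming the two configs above are saved as /etc/wireguard/wg0.conf on their respective machines; package names are the Debian/Ubuntu ones):

sudo apt install wireguard            # on both the VPS and the home server
sudo wg-quick up wg0                  # bring the tunnel up now
sudo systemctl enable wg-quick@wg0    # and on every boot
sudo wg show                          # check for a recent handshake and transfer counters
# the VPS also needs forwarding enabled for the DNAT rules to work:
echo 'net.ipv4.ip_forward=1' | sudo tee /etc/sysctl.d/99-wireguard.conf && sudo sysctl --system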
Enter big-three cloud egress pricing. Designed to make sure that you have to go all-in.
Their prices have spiked because of electricity costs; the entry level used to be around 20 euros. They come with 1 Gbit connectivity.
Running your own server is an investment that doesn't make sense for everyone. If you can get it, it is better than you might imagine. Being in full control--the master of your own destiny--is so liberating and empowering. It feels like the difference between constantly ordering Lyft/Uber/riding with friends vs. owning your own car.
Not to mention, again, my hardware resources are so much better. This one server can run multiple profitable SaaS apps / businesses and still have room for experimental projects and market tests. Couldn't be happier with my decision to get off the cloud.
And when some component of the server fails, your app is unavailable until you can repair it. So you need another server for redundancy. And a load balancer. And a UPS. And a second internet connection.
If your app is at all critical, you need to replicate all of this at a disaster recovery site. And buy/run/administer DR software.
And hardware has a limited lifespan, so the $3000 was never a one-time investment.
I think there is often still a case to be made for self-hosting but the numbers are not as rosy as they seem at first glance.
This sounds expensive if you're talking about one server vs. a year of AWS charges, but it's a tiny bump if it turns out you need to buy a dozen servers to replace a large AWS bill.
Plus, I think most people underestimate how reliable server-grade hardware is. Most of it gets retired because it's functionally obsolete, not because a power supply or whatever fails. Which brings up the point that the vast majority of failures with server-grade hardware are on replaceable components like power supplies, disks, SFPs, etc. Three or four years out, those parts are available on the secondary market, frequently for pocket change.
Also "if some component fails or the app is critical" has a lot of nuance, I agree with your sentiment but you should know:
1) Component failures in hardware are much rarer than you think
2) Component failures in hardware can be mitigated (dead RAM, dead PSU, dead hard disk, even dead CPUs in some cases: all mitigated). The only true failure of a machine is an unmitigated failure due to not configuring something like memory mirroring, or a motherboard failure (which is extremely uncommon)
3) The next step after "single server" isn't "build a datacenter", it's buying a couple more servers and renting half a rack from your local datacenter; they'll have redundant power, redundant cooling and redundant networking. They'll even help you get set up with their own hardware techs if it's only 2-3 machines.
I do this last one at a larger scale in Bahnhof.
also, $3000 will get you about 3-5 years out of hardware, at which point, yeah, you should think about upgrading, if for no other reason than it's going to be slower.
So you have some downtime. Big deal. If this happens once every few years and you need a day to repair it, your uptime is still better than AWS.
Not everyone hosts a realtime API that millions of users depend on every second of the day.
If you are trying to go commercial you might have a different attitude, but for those of us who do this mostly for fun and for some donations on the side, overcomplicating our setups to add a tenth of a percent to our uptime stats just isn't worth it.
Most applications don't actually have a four-nines uptime requirement. Like, how many otherwise healthy businesses closed up shop during the cloud provider outages we've seen in the last year because they didn't have their stuff implemented and deployed such that it would remain fully functional when those issues happened?
https://news.ycombinator.com/item?id=13198157
In one meeting we had a typical discussion with the ops guys:
- "why wouldn't we optimise our hardware utilisation by doing things a, b, and c."
- "hardware is crap cheap these days. If you need more capacity, just throw more servers at that"
- "is $24k a month in new servers crap cheap by your measure?"
- "comparatively to the amount of how much money these servers will make the same month, it is crap cheap. It is just a little less than an annual cost of mid-tier software dev in Russian office. We account only 12% increase in our revenue due to algorithmic improvements and almost 80 to more traffic we handle. A new server pays back the same month, and you and other devs pay off only in 2 years"
Performance is a complex, many-faceted thing. It has hidden costs that are hard to quantify.
Customers leave in disgust because the site is slow.
No amount of “throwing more cores at it” will help if there’s a single threaded bottleneck somewhere.
Superlinear algorithms will get progressively worse, easily outpacing processor speed improvements. Notably this is a recent thing: single-threaded throughput was improving exponentially for decades, so many admins internalised the idea that simply moving an app with a "merely quadratic" scaling problem to new hardware will always fix it. Now… this does nothing.
I’ve turned up at many sites as a consultant at eyewatering daily rates to fix slow apps. Invariably they were missing trivial things like database indexes or caching. Not Redis or anything fancy like that! Just cache control headers on static content.
Invariably, doing the right thing from the beginning would have been cheaper.
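To make that concrete, the "trivial things" are often literally this small; a sketch in nginx syntax with hypothetical paths:

# long-lived caching for fingerprinted static assets
location /static/ {
    expires 30d;
}
# ...and on the database side, an index on whatever column the slow query filters by:
# CREATE INDEX idx_orders_customer_id ON orders (customer_id);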
Listen to Casey explain it: https://youtu.be/pgoetgxecw8
You need to have efficiency in your heart and soul or you can’t honestly call yourself an engineer.
Learn your craft properly so you can do more with less — including less developer time!
Basically, I think the cloud takes care of a loooooot of details that you now have to take on yourself if you self-host (at least if you want to do it "legitimately and professionally" as a reliable service). It's not clearly a win-win.
That all said, I recently canceled my cloud9 dev account at amazon because the resources I needed were getting too expensive, and am self-hosting my new dev env in a VM and accessing it from anywhere via Tailscale, so that's been nice.
So yes, for those of us who have done Systems Administration as a lifestyle/career, yeah you do all of those things and it's part of the fun. I started doing OS upgrades, monitoring, firewalls, and home backups of my own Linux Servers some time in High School. Over-utilization of bandwidth isn't really a "problem" unless you're doing something weird like streaming video, a 1Gbps circuit can support thousands upon thousands of requests per second.
How do you handle traffic spikes, especially from the networking point of view? What kind of connection do you have? How do you make your service fast for all customers around the world (say you have a successful SaaS)? How do you prevent a local blackout from taking down your service? Where do you store your backups, in case your building gets flooded or your machine blows up? What would you do if a malicious process takes over the machine? These are some of the things that are managed for you in a cloud environment.
I understand investing in a datacenter rack where you own your hardware, if you have the skills, but running it in a home office cannot support a successful business nowadays IMO.
You need to disassociate yourself from the start-up mindset when you DIY a side project app or site. Having said that, there are ways to cache and improve your write performance and maintain HA on a budget. The only thing that's hard to replicate in self-hosting is a high performance global presence.
To be fair, I'm not 100% off the cloud. Backups are on an hourly snapshot thru Restic https://restic.net/ and stored in Google Cloud Storage off-prem in case of catastrophes. Also, my Postgres database is hosted in Cloud SQL because frankly I'm not feeling experienced enough to try hosting a database myself right now.
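Roughly what that looks like with restic's Google Cloud Storage backend (bucket name, key path and retention are made up), run from an hourly cron or systemd timer:

export GOOGLE_PROJECT_ID=my-project
export GOOGLE_APPLICATION_CREDENTIALS=/root/gcs-key.json
export RESTIC_PASSWORD_FILE=/root/.restic-pass
restic -r gs:my-backup-bucket:/server init          # first run only
restic -r gs:my-backup-bucket:/server backup /srv /etc
restic -r gs:my-backup-bucket:/server forget --keep-hourly 24 --keep-daily 14 --prune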
It's really not as unrealistic as most people seem to think. People have been building online businesses for years without the cloud. Believing it's suddenly not possible is just their marketing going to work for them making them new customers imo.
I don't know about GP but managing your own server doesn't mean you cannot use a CDN with your webapp.
I have a homelab too but getting “enterprise grade” service from comcast seems to be my biggest barrier to scaling without leaning on aws.
Comcast doesn't actually change your public IP address between DHCP renewals and thus it's effectively static. The only time that it'll change is when the modem is powered off for an amount of time, or the upstream DOCSIS concentrator is powered off for maintenance or otherwise.
Especially when someone else is making the money decisions. Of course, in horror stories, admins have to run their servers in broom closets or sheds because the business owner is too cheap to get a proper space for something the whole company runs on.
Most businesses have nightly cronjobs generating some kind of report that is then emailed to stakeholders. Why on Earth would you run a dedicated Linux box for that anymore? Glue a nightly trigger to AWS Lambda, send the report via AWS SES, and it's free. Literally, because it fits quite easily within the free plan. No $5/month VPS box, no patching, no firewalling, no phone calls from execs at 6 AM wondering why their report isn't in their inbox and you track it down to the server being down when the cronjob was supposed to fire.
With that said, if you come to me and tell me what you want to add a new business feature to stream video for our customers off AWS, I'll first ask you why didn't you tell me you won the lottery, then I'll berate you for being stupid enough to spend your lottery winnings on the AWS bill.
Pick the right tool for the job.
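For reference, the wiring for the nightly-report-to-Lambda idea above is only a few CLI calls (account ID, region and function name are placeholders, and the function itself would render the report and send it via SES):

aws events put-rule --name nightly-report --schedule-expression "cron(0 6 * * ? *)"
aws lambda add-permission --function-name nightly-report \
  --statement-id eventbridge-nightly --action lambda:InvokeFunction \
  --principal events.amazonaws.com \
  --source-arn arn:aws:events:us-east-1:123456789012:rule/nightly-report
aws events put-targets --rule nightly-report \
  --targets Id=1,Arn=arn:aws:lambda:us-east-1:123456789012:function:nightly-report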
You are forgetting the future cost of losing knowledge and control over your infrastructure.
This is such a tired expression. It basically means nothing in the industry, and exactly because of comments like yours.
Exactly who are you to say what my infrastructure desires are? Software is personal and people ignore this completely.
Measure twice, cut once. "Fail fast" is a load of nonsense and burns a lot of money for no good reason.
Thankfully one person's cloud is another person's on prem infrastructure so sysadmin skills will always be in demand.
From my perspective in enterprise computing, I now see people taking 2 paths. One where they become super deep sysadmins and work on infra teams supporting large scale deployments (cloud or not) and the other being folks who write code and think of infra as an abstraction upon which they can request services for their code.
Both are noble paths and I just hope folks find the path which brings them the most joy.
At least in my experience, my hobby of maintaining my own home server helped out immensely in my path in the industry due to knowing what tools are available when working on multi-faceted software designs.
You don't even know if the sysadmin is doing any backups at all.
From the perspective of a person who does not know anything about administering systems, tweaking stuff in AWS is, I would say, a lot easier than setting up a server properly.
So people that know nothing about administering systems pay more because they don't have the knowledge.
If you have the knowledge then yes, it is cheaper to run your own server, but what is obvious or easy for one person is not really true for someone else.
... and adding "cloud administration".
What is it with people doing completely one-sided analysis even when they've tried the thing out themselves? Is cloud administration less time-consuming than system administration? That's not my experience, so I'm quite interested in how it got that way.
This is of course the highly subjective opinion of a greybeard Unix admin.
While cloud may have a lot of advantages, I don't think it's trivial to run or manage. The AWS dashboard is simply overwhelming. Trying to decide between different, but overlapping services is time consuming. And while you can rebuild somewhat easily, you're also almost certainly going to have to do that as you learn about Amazon's little quirks. Your general RDBMS experience doesn't map to DynamoDB very well and you'll be in for a rough time when you learn you can't just add a new index or whatever.
Then you have all these provider-specific APIs. My experience with both Amazon and Google is that their services will return errors that they claim should be impossible, so you get to have fun debugging that in a service that you don't manage. Your application will invariably add a bunch of handling for exceptional cases and accumulate your best guess about how this impossible situation came about.
Then you have the constantly shifting devops orchestration tooling space and "best practices". I've lost track of the number of times I've needed to pull my Terraform state file and manually edit this gigantic JSON file because some plugin updated an internal struct in an incompatible way.
I'm sure there are people who get an environment up and running on one of the IaaS platforms using just the web console. I've never seen it managed that way at any company I've been at. Instead, the devs own the IaC stuff. It's certainly easier to be in multiple regions that way, but I have a hard time believing any time or money is really saved. Sure, no dedicated ops people, but now your expensive devs have to deal with it and probably be on call. Moreover, all that archaic time-sucking Unix admin knowledge acquisition everyone is worried about is just replaced by time-sucking knowledge acquisition of a proprietary service and all its quirks.
Maybe we're talking about different levels of "cloud"? I can buy that Heroku is easier than AWS.
The amount of money set aflame is astounding.
You don’t go cloud to save money. You go cloud to get flexible and reduce capital expense. It’s like leasing a building vs buying. More about tax and accounting.
The cloud is also too damn expensive.
People are getting ripped off big time and don't want to be embarrassed by the truth, plain and simple.
However, every team I've been on recently has spent a lot of time struggling with gluing their AWS stuff together, diagnosing bugs etc. It didn't seem to save a heck of a lot of time at all.
I couldn't figure out AWS. But I could figure out how to host sites on a linux VPS.
So what's the story here: is serverless something that only makes sense at a certain scale? Because with tools like Caddy the 'old fashioned' way of doing it seems really, really easy.
Other way around. With enough scale you should be able to make hosting your own datacenter work.
The problem is that the people you hire tend to go off buying too much Enterprise-class shit and empire building, and the whole thing winds up costing 10 times as much as it should, because they want stuff to play with, résumé padding, and to share risk with the vendor so they have someone to blame.
Only thing Amazon did to build out their internal IT ops exceptionally cheaply and eventually sell it as the AWS cloud service was to focus on "frugality" and fire anyone who said expensive words like "SAN". And they were ordered in no uncertain terms to get out of the way of software development and weren't allowed to block changes the way that ITIL and CRBs used to.
I didn't realize how difficult that would be to replicate anywhere else and foolishly sold all my AMZN stock options thinking that AWS would quickly get out competed by everyone being able to replicate it by just focusing on cheap horizontal scalability.
These days there is some more inherent stickiness to it all since at small scales you can be geographically replicated fairly easily (although lots of people still run in a single region / single AZ -- which indicates that a lot of businesses can tolerate outages so that level of complexity or cost isn't necessary -- but in any head-to-head comparison the "but what if we got our shit together and got geographically distributed?" objection would be raised).
I did not know about it until I googled it just now. I have spent days, even two weeks, figuring out how to set up Nginx, and for all I know I did it terribly wrong. I paired it with other tools that I do not even remember. But I would be starting from scratch again if I needed to set another one up.
So a lot might come down to that. I was on a team that transitioned from an owned server to cloud after one of the test servers went down one day and, after a week of trying, nobody knew how to fix it. We realized at that point that if a server caused a production error, we were utterly screwed, since someone who had left had set it up and nobody had a clue where to begin fixing it beyond reading endless tutorials and whatever came up in Google searches.
The server infrastructure was cobbled together in the first place and for a period was theoretically maintained by people who didn't even know the names of all the parts.
At least with cloud, there is an answer of sorts that can be had from the support team.
- Small API that is used by a couple people everyday? Lambda.
- Need to store some data? DynamoDB.
- Need to store some files? S3.
- Cron? Step Functions.
- Need services to communicate with each other? SQS.
Those end up being either free or very cheap.
If you have high traffic, serverless is actually really expensive. It is only worth it if you have high scale but unpredictable / bursty traffic.
> However, every team I've been on recently has spent a lot of time struggling with gluing their AWS stuff together, diagnosing bugs etc. It didn't seem to save a heck of a lot of time at all.
I understand AWS so others don't need to and I'm making the money of my life. All 3 big cloud providers have really terrible developer experience and I feel really sorry for folks who just want to get their shit done.
I used to think some other player would come in and offer something much easier and simpler to take over this space, but I'm not really seeing any serious contender. Just low code / no code platforms and managed k8s stuff.
But what if the cost is $.0001 per request? It becomes a very convenient way to make all of my personal projects permanently accessible by hosting on S3 + Lambda.
Even in large workloads it makes sense. Much of AWS is migrating from instances to AWS Lambda. There are some workloads where persistent instances make sense, but a lot of common use cases are perfect for Lambda or similar serverless technologies.
Numerous times there's something weird going on and you're stuck trying to guess and retry based on largely useless logs until it somehow works better but you never really know what the root cause truly was.
Meanwhile on my own server I'll ssh in and have complete visibility, I can trace and dump network traffic, bpftrace userspace and kernel code, attach debuggers, there's no limit to visibility.
Yes lambda/serverless saves you a day or three in initial setup but you'll pay that time back with 10-100x interest as soon as you need to debug anything.
Your competitors would salivate at this statement, fyi. Speed is a competitive advantage. AWS is not "let's rent a big ball of EC2 servers and call it a day", and anyone who treats it like that is going to get eaten alive. If you have not looked at -- for example -- Dynamo, you should. If you have not looked at SQS, you should. The ability to have predictable, scalable services for your engineers to use and deploy against is like dumping kerosene onto a fire, it unlocks abilities and velocity that more traditional software dev shops just can't compete against.
I wonder how you folks manage to work with AWS and not hate it.
They can't be doing one-off undocumented config, package, and network/firewall changes which make it impossible to set up another server reliably. At $company I moved us to Terraform+Packer (to get them used to immutable deploys, but still just an EC2 instance), then Pulumi+Docker+Fargate, so we could fix our deployment velocity. The CTO was constantly afraid everything would break; mostly because it actually would break all the time. Now basically anyone can deploy, even if they're not a sysadmin.
That's not to say you can't automate a Pet Server, but it's a lot more likely for someone to "just once" make some changes and now you don't trust your automation. In our case we had SaltStack and we were blocked by the CTO from running it unless it was off-hours/weekend.
I'm completely missing something. I have searched around and found some solutions, but in the back of my head I feel some people have something else in mind.
The part before Ansible or Puppet kicks in.
It was so trivial to terminate and restart dozens of servers at any given time since unless there was a mistake in the cloud-init, we could bootstrap our entire infrastructure from scratch within an hour.
It was amazing, never had to deal with something missing on a server or a config being wrong in a special case. Dozens of hosts just purring along with 0 downtime since the moment anything became unhealthy, hosts would start auto-booting and terminate the old instance.
Senior sysadmins are really hard to come by today, not to mention someone who wants to do architecture also.
My hunch is that the 5000 onprem pet servers are not going away any day soon, because a massive amount of it is legacy systems that take a long time to migrate to cloud, if ever. Also the work stress is just ridiculous. So much stuff to do, even with automation. Only reason I still do this is that I like the "old school" tech stack vs. cloud IaaS/PaaS alternatives.
I am not so sure... I am a well-seasoned sysadmin; I've done servers, networking, architecture. I consider myself a solid Linux/network expert and have managed datacenters. When I look for a new/more exciting job, or for a pay raise, all I see are "cloud, AWS, devops". I never see "old school" sysadmin jobs, e.g. as you say, "we have a room full of Linux boxes and we manage them with Ansible/scripts/etc., but we design and maintain them ourselves; come join our team".
You don't need pet servers. Puppet or Ansible make your baremetal cattle.
However! When I spin up my own side projects, it is sooo much easier to just go into the command line and spin something up directly. It does make me wonder whether some small amount of expertise can really change things. By the time you're orchestrating AWS services, Docker containers, Kubernetes and more, would it have been so bad to run a 10-line bash script on a few cheap VMs to set yourself up?
Even typing that, I realize how much time managed services saves you when you need it. Change management is really what those services offer you - even if a momentary setup is easier by hand.
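For what it's worth, that "10-line bash script" really can carry a small project a long way; a sketch with made-up app name, repo URL and unit file:

#!/bin/bash
set -euo pipefail
apt-get update && apt-get -y upgrade
apt-get install -y nginx postgresql git ufw
ufw allow OpenSSH && ufw allow 'Nginx Full' && ufw --force enable
adduser --system --group --home /srv/myapp myapp
git clone https://example.com/me/myapp.git /srv/myapp/app
cp /srv/myapp/app/deploy/myapp.service /etc/systemd/system/
systemctl enable --now myapp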
Last week my choice was vindicated: I ran into a critical hardware issue on my linux instance which required a complete OS reinstallation. Wiped my server clean, and was back up and running in an hour. I feel much more secure in the fact that I KNOW I can spin up a completely functional version of my app on any Linux server in the world in less than an hour, rather than relying on opaque cloud backup/load balancers/serverless configs which could fail in unexpected ways, and are usually locked in to a particular vendor. As for a few hours downtime here and there, my business is designed to handle it.
Only thing I dislike is YML, which I think is yucky!
That was our 'perfect world'. Reality was different and we still have a lot of servers running stuff, but what we did push into K8s really reduced our operations workload and we're pretty happy about that.
After I discovered https://efs2.sh I switched over to this simple config management solution, which simply executes commands and scripts over ssh. It is so much simpler and faster (both in regards to creation and execution) than Ansible.
Get your hands a little dirtier installing a lightweight desktop environment like LXDE, a programming language of your choice & an IDE. Install VNC and you then have a cloud desktop you can code in from anywhere, at the same time that it runs your personal website.
Cost: ~$5 per month and a bunch of good experience. Or just do it once as an exercise and cancel after a month.
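Roughly (Ubuntu/Debian package names; everything else is adjustable):

sudo apt install -y lxde-core tigervnc-standalone-server
vncpasswd                                     # set a VNC password
vncserver :1 -geometry 1280x800 -localhost    # listen on localhost only
# point ~/.vnc/xstartup at startlxde, then tunnel in from your laptop:
ssh -L 5901:localhost:5901 user@my-vps        # VNC client connects to localhost:5901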
This "cattle not pets" mentality doesn't make sense for everything and is highly inefficient if the OS itself seamlessly supports immutable workloads and configuration.
The worst thing I’ve had to deal with recently is debugging some faulty RAM sticks and NVMe failure. Obviously hardware quirks are still at play and there’s not much that can be done there, but in terms of making life easier on the software side, NixOS and reproducible config definitely helps over traditional distros.
It was probably the most difficult thing I ever tried, unsuccessfully, to use on my desktop. I imagine learning how to use Vim/Emacs as a complete beginner would probably be several orders of magnitude easier than learning the Nix DSL and the Nix way of doing things. And from what I've read about the experience of other people who do use NixOS and talk about both the good AND the bad, using it seems like an unhealthy relationship.
Not to mention that the Nix package manager feels slow as hell and reminds me of my unpleasant hours spent using rpm and dnf.
In terms of the DSL, I'm really surprised that people find it as much of a problem as they do. When I first tried Nix I moved my first nonprod server over to it that very afternoon. It helped to approach the syntax at the start as "hmm, this is a bit like JSON" and to worry about things like lazy evaluation later on. I was also familiar with Jsonnet beforehand, so maybe that helped.
My first few servers I got by with just a basic understanding of the language and copying/pasting examples from the website and using https://search.nixos.org/options
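For the curious, that first server config can be as small as this (a fragment of /etc/nixos/configuration.nix; hostname and services are just examples):

{ config, pkgs, ... }:
{
  networking.hostName = "web1";
  services.openssh.enable = true;
  services.nginx.enable = true;
  environment.systemPackages = with pkgs; [ htop git ];
}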
Using on a desktop, now that is a whole other experience. I personally quite enjoy it for my desktop but there is definitely more of a learning experience and you might not find the benefits of Nix worth it anyway as your desktop is an always changing environment.
You're wise to keep staff around who understand the low level stuff, in addition to the shiny new abstraction based tools.
You'll only find those jobs at one of the handful of cloud companies. Nobody will know how to do anything for themselves anymore and all this experience and knowledge will be lost.
There are no more actual administrators. Just users paying rent.
Rent to AWS actually drives demand up quite a lot since the bills are huge and very few people understand what is under the hood and how it can be optimized.
I doubt very much things will change in the near future. In the far one... who knows.
Edit: car mechanics with their own shop make significantly more money than me and it seems to only get better for them as cars become more complex.
A few years ago I participated in a Splunk deployment and the cloud solution utterly dwarfed an in-house enterprise solution, in regards to cost. Even in the event that cost was irrelevant, certain sectors (financial institution(s)) are going to have a difficult time pivoting to a cloud-based solution and relinquishing control over the underlying infrastructure.
Yes, I know that isn't what DevOps is supposed to be, but we all know how Agile turned out, management has a magic touch to distort such concepts.
wait, what? definitely not in eastern eu
it seems like there's one mechanic per a few kms
but maybe due to the fact that average car is relatively old
I've always been a fan of "standing on the shoulders of giants" and it's served me very well to have this mindset. I'm fine with diving deep when I have to, but diving deep just to dive deep.... not so much.
Semi-recently I had need of a simple blog for a friends/family thing; I spun up a WordPress and a MySQL container and was done. Over a decade ago I used to set up and manage WordPress installs, but it's not a skill I need anymore.
I find this article a little odd, since they talk about server admin but then also about scripting a setup script for your server, which is more in the "cattle" category for me and less in the "pet" category that I would consider "server administration".
I sometimes wonder whether we need another metaphor, something like a dairy cow, where you only have one, but when it fails you can shoot it and plug in another very quickly and simply (e.g. using a script).
This rings true to me. On Azure anyway. Like the rest of tech, you gotta keep up on the hamster wheel! Example: they canned Azure Container Service because of k8s; just imagine if you had tightly integrated with that and now you need to rewrite.
Also not mentioned in the article is cost. Hetzner is loved on HN for this reason.
That said, k8s is probably a stable and competitive enough platform that it makes a good tradeoff, and by using it you invest in ops skills rather than specifically sysadmin skills. I believe k8s skills will be long-lasting and less faddish than proprietary vendor cloud skills.
Does anyone have a good source of learning that is comprehensive and practical? I’m talking about a good guided book/tutorial on how to administer a server properly and what things one should know how to fix, not just how to set up Wordpress.
http://www.linuxcommand.org/tlcl.php/
From there picking up configuration management should be pretty straightforward.
That advice can cause substantial headache on Ubuntu/Debian, where the Almquist shell is /bin/sh. This does not implement much of bash and will fail spectacularly on the simplest of scripts. This is also an issue on systems using Busybox.
A useful approach to scripting is to grasp the POSIX shell first, then facets of bash and Korn as they are needed.
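A quick illustration of why this matters (the first three lines are bashisms that dash, i.e. /bin/sh on Debian/Ubuntu, rejects or mangles; the last two are the portable equivalents):

[[ $answer == y* ]] && echo yes    # dash: "[[: not found"
echo {1..5}                        # dash prints the literal {1..5}
arr=(a b c)                        # dash: syntax error
case $answer in y*) echo yes ;; esac
seq 1 5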
-"As a practical goal, you should be able to recreate your host with a single Bash script."
This already exists as a portable package:
https://relax-and-recover.org/
-"For my default database, I picked MySQL."
SQLite appears to have a better SQL implementation, and is far easier for quickly creating a schema (set of tables and indexes).
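To illustrate: with SQLite the entire "database server" setup is one file plus a schema, e.g. (table is hypothetical):

-- sqlite3 app.db < schema.sql   # no daemon, no users, no grants
CREATE TABLE posts (
  id         INTEGER PRIMARY KEY,
  title      TEXT NOT NULL,
  body       TEXT,
  created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_posts_created_at ON posts (created_at);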
At least for Debian and Ubuntu, that's why we start bash scripts with #!/bin/bash, of course.
Your point is valid for Busybox, though.
That's not really a problem as long as you use #!/bin/bash shebang, and there is nothing wrong in doing that.
It's frustrating that most Google search results, and shell script answers on SO, almost always treat bash and sh as the same thing.
#!/bin/bash
There, I fixed your "what shell is /bin/sh" problem.
I disagree with this. A single bash script configuring an entire host can be overly complex and very difficult to follow. As someone who has created complex bash scripts, I can say this becomes very time-consuming and prevents you from making many changes without significant effort. I'd suggest familiarizing yourself with tools like cloud-init and Ansible.
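For comparison, a minimal cloud-init user-data sketch (package list, user name and key are placeholders), which most VPS providers and cloud images accept at instance creation time:

#cloud-config
package_update: true
packages:
  - nginx
  - fail2ban
users:
  - name: deploy
    groups: sudo
    shell: /bin/bash
    ssh_authorized_keys:
      - ssh-ed25519 AAAA... deploy@laptop
runcmd:
  - systemctl enable --now nginx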
The hardware is the cheapest part; then you have to pay for electricity, manage backups, fix RAID problems, and have good internet. You have to pay attention to how the server is doing. And if you're serving a business, you have to be available to debug any issue. That's a lot of time invested that you could be spending actually working on the project.
But definitely most devs should have a small home server for trying unimportant things. Nothing complicated, just keep the standard hardware config. There are second-hand servers available for $50. Install some Linux and have it running 24/7. Quite fun for experimenting and hosting simple things.
This means I can run my own servers and the only thing they do is run rke2.
I can take out a node and upgrade the base OS without issues or anything.
And I still get all the benefits of a high-quality cluster (k8s).
I love it.
And yes, it's easier in my opinion, and more streamlined, to install storage software (OpenEBS) on my rke2 cluster and back up those persistent volumes than to do backups of my hard drives.
And my expectation is that, while it already works very, very well, it will only get even more stable and easier.
We are back in timesharing days, only using SSH and Web instead.
I guess I can at least switch my cloud shell colours to green to feel at home.
There are plenty of books around. And there are literally thousands of people worldwide practicing this "lost" art daily.
Starting from small corps up to the major cloud providers. (Someone has to support those computers running the "serverless" things.)
My word of advice: start with the "philosophy". One program doing only one task but extremely well, "everything is a file" etc.
Understand why people are unhappy with systemd. :-) Find out how kernel schedulers impact databases' IO. Write a boring program in C: a network server which forks on accept4. Dip your toe into Perl 5; there is lots of it in *nix and BSD, and it's still the most stable and efficient way of writing a CGI script... Find out why ksh is faster than bash.
It is truly exciting world, and the best news is that it "fits" as a glove the modern world of JS and async programming etc.
I wouldn't call it "lost"; it is just dozens of levels of abstraction down: efficient, boring, and complex. But powerful, and unforgiving of typos :-)
I am glad someone actually is reading about all that.
The claim seems to be that you either have pets on tin or cattle via the cloud, but that was never the case. I worked at a hosting company around 2007 which was an early IaaS provider. We PXE-booted Xen nodes that automatically connected to our management layer, allowing customers to provision virtual machines. Most of our own fleet was cattle well before this was meme-worthy.
Today, you could bootstrap a k8s cluster with almost no effort on tin. You'll quickly have autoscaling cattle and a distributed cron. Sure you'd probably pet etcd and maybe the API servers. Running a database, API, and small management layer is well within the responsibilities of a professional system administrator. If this is beyond your orgs / teams capabilities you probably should use the cloud provider.
P.S. Not having a team that can run production services without outsourcing the database is fine. We all have different specialisms.
The storage layer is a bit more complex if you want to roll PVC.
You shouldn't bootstrap a $1m team to defeat a $500k cloud bill.
There is not a day going by where a recruiter doesn't tell me "we are urgently looking for an experienced Linux sysadmin. Are you interested?"
I will steal this.
As for the term "DevOps", I am never sure what people mean when they use it. You seem to be using in contrast to traditional linux sysadmin. What exactly does DevOps mean in your definition?
Or the absurd prices for stuff that basically does not make sense.
Or for practices that would otherwise be absolutely unlawful but that people let be because cloud providers are just too big to fight.
For business related services I use root servers hosted by e.g. Hetzner. I don't want to deal with hardware maintenance nor the 24/7 power bill.
For private stuff (pictures, videos, movies) I have a cheap old desktop machine at home with lots of storage running Ubuntu. Easy to administer, and I can switch it off if not needed. Data is mirrored and snapshotted.
For long-term backup I encrypt my data and upload it to Amazon Glacier Deep Archive (around $1/TB per month!)
That said the cloud in general is great and you can do some things today for cheap that weren't possible for most companies 10 years ago. For some use cases it's the best choice.
In general a lot of workloads can be served orders of magnitudes cheaper than 10 years ago.
Any good resources / practices on making your server safe? (And maybe not kernel-level tricks.)
Also automated deployment, so I can commit and it'll be deployed on the server.
I thought about using GitHub Actions so that when I push, the server receives an HTTP ping, clones the repo, and sets up the app.
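One common variant, instead of having the server receive a ping, is to let the workflow ssh in and pull; a sketch (host, path and service name are placeholders; DEPLOY_KEY is a repository secret holding a private deploy key):

# .github/workflows/deploy.yml
name: deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: ssh and pull
        run: |
          install -m 600 -D /dev/null ~/.ssh/id_ed25519
          echo "${{ secrets.DEPLOY_KEY }}" > ~/.ssh/id_ed25519
          ssh -o StrictHostKeyChecking=accept-new deploy@my-server.example.com \
            'cd /srv/myapp && git pull --ff-only && sudo systemctl restart myapp'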
Don't use "here documents" or "here strings" for passwords. Even in bash versions as recent as 2020, they create a temporary file within `/tmp/` with the secret inside. If the timing is unlucky, it will get written to disk and therefore leave permanent traces even after reboot. Only shredding will securely delete the data.
In my opinion: when you have choice, get to know all the options (within reason). I have Apache as my default, purely because nginx didn't exist for many years. When nginx turned up, I gave it a while to calm down and now I deploy it quite often. I deploy something like 75% Apache and 25% nginx.
I tend to Apache from inertia but I quite like the clean easy setup for a simplish site with nginx - this is with Debian/Ubuntu style defaults, which do not favour nginx.
It's like manufacturing tires without knowing how an engine works. Don't you want to know how torque and horsepower affect acceleration and velocity? How else will you know what forces will be applied to the tires and thus how to design for said forces?
The total flexibility of such a server (compared to un/managed services) is a (great) bonus (not only at the beginning).
Everything else is per-software config files and running commands from the software's setup documentation.
Plus, I would run a server with a DE simply because I want to be able to look into databases with a GUI and edit config files with a nice text editor.
Or, the way things are going, systemd, systemd[1], systemd[2], systemd[3], systemd[4] and systemd[5].
[1] https://www.freedesktop.org/software/systemd/man/journalctl.... [2] https://www.freedesktop.org/software/systemd/man/systemd-jou... [3] https://www.freedesktop.org/software/systemd/man/systemd-mac... [4] https://www.freedesktop.org/software/systemd/man/systemd-log... [5] https://www.freedesktop.org/software/systemd/man/systemd-nsp...
There's much less of a margin for error now.
I think this is ideal, but I've yet to be able to do this or see a solid example.
At my old job I had to do exactly this, and it was really hard to get things right.
I'm much more seasoned now, but I still don't think I could do it lol
Interviewer: That's nice, but how much AWS experience? :(
it would up the status of the industry overnight if everyone was at this level...