* rsyncs the directories containing the files you want to back up
* mysqldumps/pg_dumps your databases
* zips/gzips everything up into a dated archive file
* deletes the oldest backup (the one with X days ago's date)
Put this program on a VPS at a different provider, on a spare computer in your house, or both. Create a cron job that runs it every night. Run it manually once or twice, then actually restore your backups somewhere to ensure you've made them correctly.
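A minimal sketch of a nightly script covering those four steps, assuming the backup box can reach the web server over SSH (hostnames, paths, credentials, and the 7-day retention window are all placeholders):

#!/bin/sh
# Hypothetical nightly pull backup: rsync files, dump the DB, archive, rotate
DATE=$(date +%Y-%m-%d)
OLD=$(date -d "7 days ago" +%Y-%m-%d)
BACKUP_DIR=/srv/backups

rsync -a --delete web01:/var/www/ ${BACKUP_DIR}/files/
ssh web01 "mysqldump -u backup -pSECRET mydb" > ${BACKUP_DIR}/files/mydb.sql
tar -czf ${BACKUP_DIR}/backup-${DATE}.tar.gz -C ${BACKUP_DIR} files
rm -f ${BACKUP_DIR}/backup-${OLD}.tar.gz

Deleting exactly "the archive from 7 days ago" mirrors the last bullet; a find with -mtime +7 -delete is a more forgiving variant if the job ever skips a night.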
I don't delete and/or gzip my oldest uploads though.
#!/bin/sh
DATE=$(date +%d-%m-%Y@%H:%M:%S.%3N)
DB_USER="qux"
DB_PASS="foo"
DB_NAME="bar"
DROPBOX_TOKEN="baz"
/usr/bin/mysqldump -u${DB_USER} -p${DB_PASS} ${DB_NAME} > /tmp/${DATE}.sql
/usr/bin/curl -H "Authorization: Bearer ${DROPBOX_TOKEN}" https://api-content.dropbox.com/1/files_put/backup/ -T /tmp/${DATE}.sql

One alternative is to put these backups into S3 using pre-signed requests rather than Dropbox. An S3 pre-signed request gives permission only to upload files, perhaps only to a certain location in a certain bucket.
It's a bit harder to set up, but the shell script will look almost the same.
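To sketch what the pre-signed variant looks like on the box itself: the URL is generated out of band (for example with the AWS SDK's put_object pre-signing) and only allows uploading that one key before it expires; the bucket and key here are made up.

PRESIGNED_URL="https://my-backup-bucket.s3.amazonaws.com/backup/${DATE}.sql?<signature-params>"
/usr/bin/curl -T /tmp/${DATE}.sql "${PRESIGNED_URL}"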
And whatever you do, check that you can actually recover from these backups every once in a while.
That is the _only_ reason I have for looking at something else.
Then on the local Linux box I had a separate script that would take snapshots of that directory to a completely different place on the filesystem.
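The comment doesn't say how those snapshots were made; one common approach is rsync with --link-dest, which hard-links files that haven't changed so each snapshot costs almost nothing. A sketch with made-up paths:

SRC=/srv/backups/mirror
DEST=/srv/snapshots
STAMP=$(date +%Y-%m-%d)

# Files identical to the previous snapshot become hard links, not copies
rsync -a --link-dest="${DEST}/latest" "${SRC}/" "${DEST}/${STAMP}/"
ln -sfn "${DEST}/${STAMP}" "${DEST}/latest"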
I've used it for many, many years. Setup is a bit of a pain, especially if it's your first time, but it's a totally reliable backup system and gives you something much better than just a pile of zip archives.
All of our servers get BackupPC'd (rsync-over-ssh, pulled) twice a day to an in-house server that's totally unreachable from the internet. I get emails from BackupPC when something goes wrong, which is pretty much never. Backups aren't a thing I have to worry about much anymore.
We basically create a backup folder (our assets and a MySQL dump), then rsync it to rsync.net. Our source code is already in git, so it's effectively backed up on GitHub and on every developer's computer.
On top of that, rsync has very clear and simple documentation, so it's quick to set up on any Linux distribution.
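In practice that can be as small as this (user, host, and paths are placeholders):

# Dump the database into the backup folder, then push the folder to rsync.net over SSH
mysqldump -u backup -pSECRET mydb > /srv/backup/mydb.sql
rsync -az -e ssh /srv/backup/ user@hostname.rsync.net:backup/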
I hope that you know that your account, like all accounts at rsync.net, is on a ZFS filesystem.
This is important because it means that inside your account, in the .zfs directory, are 7 daily "snapshots" of your entire rsync.net account, free of charge.
Just browse right in and see your entire account as it existed on those days in the past. No configuration or setup necessary. Also, they are immutable/readonly so even if an attacker gains access to your rsync.net account and uses your credentials to delete your data, the snapshots will still be there.
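For anyone who hasn't seen it, the snapshots are just directories you can list remotely; something like this (snapshot names are illustrative):

ssh user@hostname.rsync.net ls .zfs/snapshot
ssh user@hostname.rsync.net ls .zfs/snapshot/daily_2016-01-15/backup/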
Not sure I'd agree there, but it's not inscrutable. I use rsync for almost all file transfers, backups included, so I'm used to it. But there are oddities here and yon.
I do believe you can take images and snapshots and download them, so using the API a user could probably rig up a script to make it redundant if it was mission critical.
I use tarsnap, as many others in this thread have shared. I also have the Digital Ocean backups option enabled, but I don't necessarily trust it. For the handful of servers I run, the small cost is worth it. Tarsnap is incredibly cheap if most of your data doesn't change from day to day.
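A hedged sketch of the sort of nightly tarsnap job this implies (archive names and paths are placeholders):

# Create a dated archive; tarsnap dedupes against everything already stored,
# so unchanged data costs almost nothing
tarsnap -c -f "web01-$(date +%Y-%m-%d)" /etc /var/www /srv/db-dumps

# Later: see what exists, or restore an archive somewhere safe to test it
tarsnap --list-archives
tarsnap -x -f "web01-2016-01-15" -C /tmp/restore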
[1] info@rsync.net
The only annoying thing is that duplicity uses an old version of the boto s3 library that errors out if your signatures tar file is greater than 5gb unless you add `DUPL_PARAMS="$DUPL_PARAMS --s3-use-multiprocessing "` to your duply `conf` file. Took me days to figure that out.
It took a little time to set up, but it is conceptually simple, very inexpensive (especially if you set up S3 to automatically send older files to Glacier, and/or remove old backups every now and then)... and I like that the backups are off-site and stored by a different company than the web hosts.
Here are the relevant docs: http://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecy...
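As a rough sketch, a rule along those lines can be attached to the bucket with the AWS CLI; the bucket, prefix, and day counts below are placeholders:

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-backup-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "archive-then-expire",
      "Filter": {"Prefix": "backups/"},
      "Status": "Enabled",
      "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
      "Expiration": {"Days": 365}
    }]
  }'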
My main site runs a complex series of workers, CGI scripts, and daemons. I can deploy them from scratch onto a remote node via fabric & ansible.
That means I don't need to back up the whole server "/" (although I do!). Since I can set up a new instance immediately, the only data that needs to be backed up is the contents of some databases, and to do that I run an offsite backup once an hour (crontab sketch below).
Github takes care of code and config.
AWS S3 takes care of uploaded static files.
But Tarsnap takes care of my database backups.
The only thing to be aware of is that restore times can be very slow.
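The "once an hour" part mentioned above is just a crontab entry; something like this, with a hypothetical script name:

# m h dom mon dow  command
0 * * * * /usr/local/bin/offsite-db-backup.sh >> /var/log/db-backup.log 2>&1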
For a database-driven dynamic site, or a site with content uploads, you can also use your version control system via a cron job to capture that content. Have the database journal out the tables you need to back up before syncing to your DVCS host of choice.
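A rough sketch of that, assuming a git repo dedicated to content and a Postgres database (table, repo, and remote names are made up):

#!/bin/sh
# Run from cron: dump the content tables, commit, push
cd /srv/site-content || exit 1
pg_dump -t posts -t uploads mydb > db/content-tables.sql
git add -A
git commit -m "content backup $(date +%Y-%m-%d)" || exit 0   # nothing changed
git push origin master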
If you're looking for a backup service to manage multiple servers with reporting, encryption, deduplication, etc., I'd love your feedback on our server product: https://www.jungledisk.com/products/server (starts at $5 per month).
Lots of people only do a full test of their backup solution when first installing it. Without constant validation of the backup->restore pipeline, it is easy to get into a bad situation and not realize it until it is too late.
OVH has a backup-by-FTP premium service, but the FTP server is accessible only from the VPS it backs up. Pretty useless, because in my experience when an OVH VPS fails, technical support has never been able to bring it back online.
[1] http://duplicity.nongnu.org/
http://mindfsck.net/incremental-backups-amazon-s3-centos-usi...
For the database, I use a second VPS running as a read-only slave. A script runs daily to create database backups on that VPS.
Make sure you check the status of backups; I send journald and syslog stuff to papertrail [0] and have email alerts on failures.
I manually verify the back-ups at least once a year, typically on World Back-up Day [1]
[0] https://papertrailapp.com/ [1] http://www.worldbackupday.com/en/
Stupid simple and stupid cheap. Install, select directories you want backed up, set it and forget it.
All for $7.00 a month.
Collect your files, rsync/scp/sftp them over.
Read only snapshots on the rsync.net side means even an attacker can't just delete all your previous backups.
I just use a simple scheduled AWS lambda to PUT to the redeploy webhook URL.
I use an IAM role with put-only permissions to a certain bucket. Then, if your box is compromised, the backups cannot be deleted or read. S3 can also be set up to automatically remove files older than X days, which is also very useful.
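A sketch of that kind of put-only policy attached to the instance's role; the role, policy, and bucket names are placeholders:

aws iam put-role-policy \
  --role-name backup-writer \
  --policy-name put-only-backups \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-backup-bucket/*"
    }]
  }'

Without s3:GetObject, s3:DeleteObject, or list permissions, a compromised box can only add new objects.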
Then the script sends it to S3 using aws s3 sync. If versioning is enabled on the bucket, you get versioning applied for free, and you can ship your actual data and webdocs-type stuff up extremely fast; it's browsable via the console or tools. Set a retention policy however you like. The industry's best durability, and nearly the cheapest too.
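Something like this (bucket and paths are placeholders):

# Versioning on the bucket keeps old copies even though sync overwrites
aws s3 sync /srv/backup/ s3://my-backup-bucket/web01/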
I can't praise restic enough. It's fast, secure, easy to use and set up (golang) and the developer(s) are awesome!
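For anyone curious, the basic restic workflow is short; a sketch against an SFTP repository, with host and paths made up:

export RESTIC_PASSWORD="change-me"            # or use --password-file
restic -r sftp:user@backuphost:/srv/restic-repo init
restic -r sftp:user@backuphost:/srv/restic-repo backup /etc /var/www
restic -r sftp:user@backuphost:/srv/restic-repo snapshots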
Use pg_dump and tar, then just s3cp.
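A sketch of that pipeline, using aws s3 cp in place of whichever s3 copy tool you prefer (bucket and paths are placeholders):

STAMP=$(date +%Y-%m-%d)
pg_dump mydb | gzip > /tmp/mydb-${STAMP}.sql.gz
tar -czf /tmp/files-${STAMP}.tar.gz /var/www
aws s3 cp /tmp/mydb-${STAMP}.sql.gz s3://my-backup-bucket/db/
aws s3 cp /tmp/files-${STAMP}.tar.gz s3://my-backup-bucket/files/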
All the databases and other data are backed up to s3. For mysql, we use the python mysql-to-s3 backup scripts.
But the machines themselves are "backed up" by virtue of being able to be rebuilt with saltstack. We verify through nightly builds that we can bring a fresh instance up, with the latest dataset restored from s3, from scratch.
This makes it simple for us to switch providers, and lets us run our "production" instances locally on virtual machines running the exact same version of CentOS or FreeBSD we use in production.
If you're not using a modern Unix variant with ZFS... well, there isn't a good reason not to be.
You can also use https://r1softstorage.com/ and receive storage + an R1Soft license (block-based incremental backups) -- or just purchase the $5/month license from them and use whatever storage you want.