This means that, of course, it works perfectly with rsync.net and that we have yet another chance to offer the HN readers discount, which you may email us to find out about.
I see the calculator floating around in the comments, but it's not formatted in a backup-friendly fashion / use case (ie. set storage size with XYZ% churn for changed files, or continuously expanding snapshots with aged snapshot deletion).
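A back-of-the-envelope model of that use case might look like the sketch below. The retention policy, churn model, and the $0.01/GB-month rate are all illustrative assumptions, not Amazon's actual pricing:

```python
# Rough model of steady-state backup storage for a set with churn and
# aged-snapshot deletion. All numbers here are illustrative assumptions.

def monthly_storage_gb(base_gb, churn_pct, retention_months):
    """Steady-state GB stored: the live set plus one month of churned
    (changed-file) data for each retained monthly snapshot."""
    churn_gb = base_gb * churn_pct / 100.0
    return base_gb + churn_gb * retention_months

def monthly_cost(base_gb, churn_pct, retention_months, price_per_gb=0.01):
    # price_per_gb is an assumed Glacier-like rate, not a quoted price
    return monthly_storage_gb(base_gb, churn_pct, retention_months) * price_per_gb

# Example: 500 GB with 5% monthly churn, keeping 12 monthly snapshots
print(round(monthly_cost(500, 5, 12), 2))  # prints 8.0
```

Plugging your own size, churn percentage, and retention window into something like this gives a rough monthly figure, though it ignores request and retrieval costs.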
I've always been curious about using it for backup storage. The retrieval costs are high enough (for the time frames I'd care about for a personal computer) that I think I'll avoid that aspect for the time being.
Having used the past Glacier support, I expect the new S3 Glacier lifecycle approach to be much better.
I am wondering when (if?) the open-source arq_restore tool and format documentation will be updated.
The software I use: CrashPlan, Arq, DropBox, Carbon Copy Cloner.
CrashPlan: I have it backing up my data and configuration (/Users, /Library). I have a bunch of regex exclusions so it doesn't back up various files such as caches, VMware images, etc. CrashPlan backs up to their cloud servers every 15 minutes. When I am at home (where I work from), it also backs up to a local copy of CrashPlan on a server.
Arq: I am doing daily backups with Arq. These now go to Glacier for long-term / last-resort backups. It only backs up my /Users, with heavy restrictions on which files.
Dropbox: I have many of my documents stored in Dropbox with the Packrat feature to keep copies of every version and deletion. I don't consider Dropbox to be a backup by itself, but I often find it much faster to find and restore something via Dropbox than by other methods. I am also careful about the types of data I put in Dropbox.
Carbon Copy Cloner: as I mentioned in another part of this thread, I think SuperDuper is better for most people. However, I do use CCC's ability to remotely do a bootable bare-metal backup to my home office server. When I travel, I typically take an external backup drive with a current mirror of my system.
I don't use Apple's Time Machine, though I think it is a good choice for most home users. As Apple adds more features to Time Machine, I keep thinking about adding it to my mix.
That covers most things. I do have some things under SVN or Git, which could be considered another layer of backup.
Currently, the biggest pain point in my backups is VMware images. I have 4 Linux and 3 Windows images on this system, and they can create a huge amount of data that needs to be backed up every time they are used.
1. Bug in your backup software. This is addressed by using more than one piece of backup software.
2. Corruption in your live data (i.e. your filesystem corrupts your favourite baby photo). This is addressed by having lots of incremental backups going back into history. Note that Time Machine throws away historical incrementals over time, so given a long enough time window it does not protect against this.
3. Failure of your backup hardware. This is addressed by using more than one piece of backup hardware.
4. Destruction of your backup hardware. This is addressed by having your backups exist in more than one physical location, so you can never lose your live data and all your backups because of, say, a house fire.
5. User-error deletion of data. This is addressed by having backups that run frequently.
My strategy is:
* Time Machine to a Time Capsule on my LAN
* Time Machine to an external disk on my Mac
* Nightly Carbon Copy Cloner clone of my entire disk to (the same) external disk on my Mac
* Nightly Arq backup to Glacier's Ireland location (I live in London)
So (in addition to the live copy of my data on my Mac's main disk) I have 4 copies of my data, from 3 different pieces of backup software, on 3 different pieces of hardware, in 2 different locations. The CCC clone is there mainly because it's bootable, so if my Mac's SSD fails, I can reboot while holding a key and be no more than 24 hours behind.
It's unfortunate that a few things appear to be backwards: why can you include Wi-Fi APs, yet not exclude them, despite the example suggesting you'd want to exclude tethered devices?
Likewise, why can you email on success, but not on failure?
If you want to stick with Arq 3 you can. Delete Arq 4. Download Arq 3 (http://www.haystacksoftware.com/arq/Arq_3.3.4.zip). Launch Arq 3. You'll have to find your old backup set under "Other Backup Sets", select it, and click the "Adopt This Backup Set" button. Sorry for the hassle.
CrashPlan: I have used it since soon after they first appeared on the Mac. They have really strong compression and de-duplication to minimize and speed up data transfers. I personally use their consumer and small-business solutions, and I also maintain their CrashPlan PROe enterprise backup for several clients. The fact that they have a very strong enterprise product gives me a great deal of trust in the quality of CrashPlan's work. I think it is the best solution I have used for a notebook that is on the move. I back up both to their remote servers and to my own home office server, so I have the option to restore quickly from my own local server, to restore from their much slower remote servers, or to have CrashPlan send me a copy of my data on a disk drive the next day. I do wish they could get rid of the Java dependency in their Mac client software, since it is a RAM hog. CrashPlan rates very well at preserving OS X metadata.
Arq: I like the approach of backing up to Amazon S3, which I know is a very reliable storage environment, and Glacier has made it dirt cheap for last-resort archival backups. I like that, at least through version 3, there has been an open-source restore tool hosted on GitHub; if Haystack Software disappears, there are still options to restore. I believe Arq is one of the very few OS X remote backup systems that preserves ALL metadata.
I have used lots of backup software over 30 years. Every backup system has failings and bugs, and the operator (normally me) is capable of making mistakes. That is why I use multiple products for backup.
I am interested in exploring Arq's new features, especially SSH/SFTP support, which will allow me to self-host and may cause me to re-evaluate my overall backup approach.
I left JungleDisk because it went sideways and S3 was too expensive. After that was CrashPlan; I liked its free remote backup option. But then my backup destination disappeared behind carrier grade NAT. That left me with paying for regular CrashPlan or looking elsewhere. Enter Arq.
Based on my estimated usage for two computers, I calculated the following estimated yearly costs:
JungleDisk (S3): $288
CrashPlan: $120
Arq (Glacier): $32
Assuming I didn't screw up my estimate, Glacier was a no-brainer, even with the up-front cost of two Arq licenses ($70). This month is the first full month in which I'm not seeding my initial Arq backup to Glacier. I'm hopeful that the cost will be significantly lower than CrashPlan.
Even if it were more expensive for me, I would still switch, because I don't trust CrashPlan completely. There have been stories from users of backups being corrupted when they needed to recover, and upload to CrashPlan is so slow that it took months for the full 300 GB to upload (I'm getting around 0.5-2 Mbps up on my 100/100 Mbps connection; I believe they are artificially throttling it to discourage people from storing a ton of data). This means new data takes a long time to be 100% safe, for example when I dump my camera's memory to disk.
On top of that, if their upload speed is this low, their download speed probably is, too. If my data crashes, I need the backup yesterday. I can't wait a week to download the 300GB at 10Mbps.
I believe Amazon's speeds would be much higher.
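For a sense of scale, here is the transfer-time arithmetic for 300 GB at those throttled rates (a simple calculation, using the decimal approximation 1 GB = 8,000 megabits):

```python
def upload_days(gb, mbps):
    """Days to transfer `gb` gigabytes at `mbps` megabits per second."""
    megabits = gb * 8000        # 1 GB ~ 8,000 Mb in decimal units
    seconds = megabits / mbps
    return seconds / 86400      # seconds per day

print(round(upload_days(300, 2.0), 1))   # ~13.9 days at 2 Mbps
print(round(upload_days(300, 0.5), 1))   # ~55.6 days at 0.5 Mbps
```

So even at the high end of the observed throttle, a full 300 GB upload takes about two weeks of continuous transfer, and nearly two months at the low end, which matches the "months" figure above.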
I'm curious what backup tools people use on Linux if they want to back up files on Glacier. I use git-annex[0] for certain files (it works well for pictures and media). The rest of my backup process is a fairly rudimentary (though effective) rsync script, but it doesn't use Glacier.
My current setup works fine for me, but I imagine there are better tools out there.
It tars a directory, naming the output with a hash of the original directory name. Then it encrypts it with gpg and breaks it into small parts (100M) so I can pace any needed Glacier restores so as not to break the bank. Then it runs par2 on each part, to make it more likely that I can recover from any file corruption. Then it uploads each part and the par2 files to an S3 bucket which is set (via the S3 admin web dashboard UI) to automatically transition the files to Glacier.
The shortcoming is it's not a whole-system backup. Also it doesn't do differential backups, though that's not a problem for me because I organize things such that old stuff doesn't change often if ever. It's dirt cheap, one-command simple, and feels pretty reliable... though I must admit I haven't tested a restore!
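The tar/hash/split steps above can be sketched with nothing but the Python standard library. This is only a sketch of the idea, not the commenter's actual script: the gpg encryption, par2, and S3 upload steps shell out to external tools, so they are shown here only as comments, and all names are made up.

```python
import hashlib
import os
import tarfile

CHUNK = 100 * 1024 * 1024  # 100M parts, so Glacier restores can be paced

def archive_and_split(src_dir, out_dir, chunk_size=CHUNK):
    """Tar src_dir, name the archive with a hash of the directory name,
    and split the result into chunk_size parts. Returns the part paths."""
    name = hashlib.sha256(src_dir.encode()).hexdigest()
    tar_path = os.path.join(out_dir, name + ".tar")
    with tarfile.open(tar_path, "w") as tar:
        tar.add(src_dir, arcname=os.path.basename(src_dir))

    # In the real pipeline you would encrypt before splitting, e.g.:
    #   gpg --symmetric --output archive.tar.gpg archive.tar
    parts = []
    with open(tar_path, "rb") as f:
        i = 0
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            part = "%s.part%03d" % (tar_path, i)
            with open(part, "wb") as p:
                p.write(chunk)
            parts.append(part)
            i += 1
    # Then, per part: run par2 to generate recovery data, and upload the
    # part plus its .par2 files to an S3 bucket whose lifecycle rule
    # transitions objects to Glacier.
    return parts
```

A differential layer could be bolted on top by comparing hashes of each part against the previous run before uploading, but as noted, that matters less when old data rarely changes.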
Today v4 was released, with new storage options (GreenQloud, DreamObjects, Google Cloud Storage, and SFTP, i.e. your own server), multiple backup targets, a unified budget across S3 and S3/Glacier, email notifications, and many more clever features.
Can anyone recommend a SFTP backup provider?
My Arq backups are designed to be worst-case. I have other, local backup options in case of failure. I was using Glacier, but I ran into Arq 3 sync problems and need to re-upload all my data. Glacier is very slow from where I live; I assume SFTP will be a bit faster.
The new-user HN discount is 10 cents per GB per month, and there are no other (traffic/bandwidth/usage) costs. Our platform is ZFS, and 7 daily snapshots are included for free.
We would be happy to serve you, as we've been serving thousands of users since 2001.
It's called Filosync and is like Dropbox but secure and with your own (or with Amazon) servers. Check out http://www.filosync.com/ for more information. And be warned, it's pricey!
Could anyone clarify whether my calculation is wrong?
48 a day * 31 days a month = 1488 uploads a month
5.5 cents per 1000 uploads
so this would cost roughly 1.5 × 5.5 cents, about $0.0825 a month.
There would be data and storage costs but I'm just going by your upload request calculations.
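The arithmetic above checks out; spelled out exactly (without the round-up from 1,488 to 1,500 requests):

```python
# Request-cost estimate for 48 uploads a day at $0.055 per 1,000 requests
uploads_per_month = 48 * 31           # 48 backups a day, 31-day month
price_per_1000 = 0.055                # quoted request price, in dollars
cost = uploads_per_month / 1000 * price_per_1000
print(uploads_per_month, round(cost, 4))  # 1488 requests, ~$0.0818
```

So the exact figure is about $0.0818; the $0.0825 above comes from rounding 1,488 up to 1,500 requests, and either way the request cost is negligible next to storage.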
S3 is more suitable for things you are actively working on, like code.
I'm traveling, staying in hotels with shitty wifi most of the time. It's hard enough to browse, so I'd really only like to back up while I'm sleeping. Also, in the interest of not hogging all the bandwidth, I'd like it to stop by, say, 6am, so that guests waking up can use the bad internet.
You can. After you set up a target, open the Preferences window, select the target, and click the "Edit..." button.
The dialog that follows has an option to "Pause between [00:00] and [00:00]", where [00:00] is a drop-down which lets you pick the top of any hour of the day.
1. In former versions, it was difficult to see what was actually part of the backup. Has that changed?
2. Is there any reliable way of calculating how much an Arq backup will cost? Storage costs are easy to calculate but with Amazon S3, changes etc. are a major cost factor.
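There probably isn't a fully reliable way, since churn depends on your usage, but a rough estimator covering the two dominant S3 cost factors (storage plus PUT requests for changed data) might look like this. All prices and the average-object-size guess are placeholders, not Amazon's actual rate card:

```python
def estimate_monthly_cost(stored_gb, changed_gb_per_month,
                          avg_object_mb=10,        # guessed object size
                          storage_price=0.01,      # assumed $/GB-month
                          put_price_per_1000=0.055):
    """Very rough monthly estimate: storage plus PUT requests for
    changed data. Check Amazon's current prices before trusting this."""
    storage = stored_gb * storage_price
    puts = changed_gb_per_month * 1024 / avg_object_mb   # PUTs needed
    requests = puts / 1000 * put_price_per_1000
    return storage + requests

# Example: 300 GB stored, 10 GB of changes per month
print(round(estimate_monthly_cost(300, 10), 2))
```

The request term only dominates if your backup tool writes many tiny objects, so the average object size is the number worth measuring on your own data.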
In other words, can someone sell me on the idea of paying for AWS storage when I have dirt cheap storage around my house and even a remote location that I can stuff a huge drive in.
I'd recommend that anyone using Sync for backup purposes take snapshots of what they're syncing.
Alternatively, you could roll your own solution on top of Sync by encrypting your files on your own and creating a Sync folder of the encrypted files.
I generally prefer SuperDuper for its simplicity, and recommend it to most people.
CCC is not really harder to use, but it presents a bunch more options that most people don't need and that they may get into trouble with. One great feature of CCC is the ability to do a bootable backup to a remote volume; I have my MacBook set up to back up to a server this way. However, this requires configuring your remote server with root SSH access via certificates.
http://www.haystacksoftware.com/arq/ links to ?product=arq3 though, text "per computer for Arq 3".
Please consider integrating or linking http://liangzan.net/aws-glacier-calculator/ (found in the HN comments somewhere around here); it's been invaluable when talking with people about Arq today.
pay close attention to the part about providing a clear call-to-action. :-)
For my own Mac, I back up to a sort-of off-site NAS with Time Machine, which is fine for all but the worst-case, meteor-hits-the-city situations. Glacier, however, makes a perfect option for dealing with that actual worst case.