First, rsync probably took too long because you ran it single-threaded and didn't tune your command-line options - most rsync performance problems on large filesystem trees come from running everything as one command, something like:
rsync -av /source/giant/tree /dest/giant/tree
And that single process of crawling, checksumming, and transferring is not only slow in general, but makes very poor use of today's multicore processors.
Much better to break it up into many threads, something like:
rsync -av /source/giant/tree/subdir1 /dest/giant/tree/subdir1
rsync -av /source/giant/tree/subdir2 /dest/giant/tree/subdir2
rsync -av /source/giant/tree/subdir3 /dest/giant/tree/subdir3
That alone probably would have dramatically sped things up, BUT you do still have your speed of light issues.
This is where AWS Import/Export comes in - do a one-time tar/rsync of your data onto an external 9TB array, ship it to Amazon, have them import it into S3, then load it onto your Amazon machines.
You now have two copies of your data - one on S3, and one on your Amazon machine.
Then you run your optimized rsync to bring it up to a roughly consistent state - i.e. if the catch-up pass takes 8 hours, you're now only 8 hours behind.
Then you take a brief downtime and run the optimized rsync one more time, and now you have two fully consistent filesystems.
No need for drbd and all the rest of this - just rsync and an external array.
I've used this method to duplicate terabytes and terabytes of data, and tens of millions of small files. It works, and it has far fewer moving parts than drbd.
But it took you 3 weeks anyway?
That way, instead of scanning the whole filesystem, you just rsync the files that have changed.
[1] Assuming you can't hook into the app(s) making the changes directly. You can even just look for new/changed files if deletions are not a priority.
rsync in general is quite optimized, and usually the limiting factor in a data transfer is the network or the disk rather than CPU speed (unless crypto is involved, for example over an ssh connection).
A truck? I think two 3.5" 5TB hard drives would be enough, no?
1. Ask someone else who has already done what you're thinking of doing. They have already made all the mistakes you might make and have figured out a way that works.
2. Assume that whatever you think will work will fail in an unexpected way with probably-catastrophic results. Test everything before you try something new.
(I jest)
Here, you have your entire business in one non-distributed Amazon instance. Amazon does not provide excellent service, availability, flexibility, or value for this model. It is in every way inferior to what you would get from colo or managed dedicated server hosting. Hosting your whole business on a single anonymous Amazon cloud instance that you can't walk up to and touch is engineering malpractice.
I've done this sort of thing with rsync, and with ZFS send/receive.
And of course, mailing hard disks.