Also, intentional namespace pollution with an existing backup tool, which IS GPL'ed.
Not cool. Not cool at all.
____________________________________
(response, since I'm submitting 'too fast'... ):
Github has commercial repos, and private repos.
It's pretty simple, really. If you want the free options on GH, you choose from a list of standard Open Source licenses. https://github.com/blog/1964-open-source-license-usage-on-gi...
It also asks you to create a LICENSE file to go along with this.
Their license, however, is very much NON-FREE. As in, if I click clone, since I work for an employer of 50k people, I'm in violation. Full stop. And we're not even talking about developing on it, or submitting PR's, or what have you. This is simple copy which puts me in violation.
It's very much against the spirit of GitHub, and probably against the license on GH as well.
And it also is attempting to dilute another project that does similarly. Just so happens they're 2 letters different. Duplicacy vs Duplicity. That's an asshole thing to do.
Here's a few names I just devised: ClouDuplicate , Clouder, DupliCloud, CfC (cloud file cloud)..
Instead, it's very uncool to try to pollute an existing namespace of the same thing. Talking about pro-level bad will here.
I agree the name is confusing, however this is not intentional. As I explained in the other comment, I chose duplicacy because the domain name was available and this is a very good name for a backup tool (even better than duplicity).
I chose this fair source license because this is basically the only free-for-personal-use license. Many people here ask why I didn't go with a free license like GPL. Here is why. I believe software should be free for personal users, but I don't like for-profit companies using it for free. This software can potentially help companies solve a painful everyday problem (and therefore make more money) and yet there isn't a license to require them to pay if they don't distribute the software. In my opinion, this is extremely unfair to independent developers like me.
GitHub does not restrict licensing on their public repositories; I'm not interested in declaring myself a shaman for the "spirit of GitHub" to address that point.
http://www.infoworld.com/article/2615869/open-source-softwar...
I do agree with this part of your post:
Not cool. Not cool at all.
Either naivety or guerrilla growth hacking / marketing; we'll see how things shake out.
I'm not a lawyer, but the user cap would seem to apply to "use" of the software, not simple copying.
I don't think this is against the spirit of GitHub, either. I ain't GitHub, though, so that opinion is by no means authoritative.
All this is different from, say, SourceForge, where using SourceForge to host your code did (does?) require licensing your code under a FOSS license.
----
Regardless, still scummy to take a name so close to an existing actually-FOSS project with similar goals. Additionally scummy to call the license "fair" (if it ain't free, it ain't fair), though that's probably not the developer's fault.
Or you know, since it's written in Go, how about GoDuplicate, GoBackup, etc.
I like giving people the benefit of the doubt, but it's just so similar and they have so many obvious pun options that even the most uncreative person probably would have come up with a more unique name.
"It is the only cloud backup tool that allows multiple computers to back up to the same storage simultaneously without using any locks (thus readily amenable to various cloud storage services)"
"What is novel about lock-free deduplication is the absence of a centralized indexing database for tracking all existing chunks and for determining which chunks are not needed any more. Instead, to check if a chunk has already been uploaded before, one can just perform a file lookup via the file storage API using the file name derived from the hash of the chunk."
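The quoted idea can be sketched in a few lines. This is a toy illustration, not Duplicacy's actual code: the `ChunkStore` class and its in-memory dict stand in for a cloud bucket and its file-lookup API.

```python
import hashlib

class ChunkStore:
    """Toy stand-in for a cloud bucket: chunk files keyed by content hash."""
    def __init__(self):
        self.files = {}  # file name -> bytes, mimicking the storage's file list

    def upload(self, chunk: bytes) -> str:
        # The file name is derived from the chunk's hash, so the same
        # content always maps to the same name (content addressing).
        name = hashlib.sha256(chunk).hexdigest()
        # "Has this chunk been uploaded before?" is just a file lookup --
        # no centralized indexing database and no lock required.
        if name not in self.files:
            self.files[name] = chunk
        return name

store = ChunkStore()
a = store.upload(b"hello world")
b = store.upload(b"hello world")  # duplicate chunk: lookup hits, nothing re-stored
assert a == b and len(store.files) == 1
```

Because the name is a pure function of the content, two writers racing on the same chunk simply produce the same file, which is what removes the need for coordination.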
Tahoe-LAFS's immutable file model (based on convergent encryption) was capable of doing this same thing a decade ago, and also features a pretty nifty capability-based security model:
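A rough sketch of the convergent-encryption idea: the key is derived from the content itself, so identical plaintexts encrypt to identical ciphertexts and remain deduplicable. The SHA-256 counter keystream below is a dependency-free toy stand-in; Tahoe-LAFS actually uses AES in CTR mode with a hash-derived key.

```python
import hashlib

def convergent_encrypt(plaintext: bytes):
    """Toy convergent encryption: key = hash(content), so equal plaintexts
    produce equal ciphertexts, which is what makes deduplication of
    encrypted chunks possible. NOT a real cipher -- illustration only."""
    key = hashlib.sha256(plaintext).digest()
    out = bytearray()
    for i in range(0, len(plaintext), 32):
        # Derive a 32-byte keystream block per offset and XOR it in.
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        for p, k in zip(plaintext[i:i + 32], block):
            out.append(p ^ k)
    return key, bytes(out)

k1, c1 = convergent_encrypt(b"same chunk")
k2, c2 = convergent_encrypt(b"same chunk")
assert c1 == c2  # dedup-friendly: equal plaintext, equal ciphertext
```

The trade-off, well documented by the Tahoe-LAFS folks, is that convergent encryption leaks whether two parties hold the same file and enables confirmation attacks on low-entropy content.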
I recently released the source code under the Fair Source 5 License (https://fair.io/) which means it is free for individuals or businesses with less than 5 users. Otherwise the license costs only $20 per user/year.
Questions and suggestions are welcome.
> Fair Source has the power to promote diversity within the developer community. To date, contributing to open source has been an expensive proposition for developers. You have to have a stable income and a lot of extra time to work on side projects for free, which means talented developers from underprivileged backgrounds often aren’t able to contribute. Fair Source allows developers to monetize their side projects, which means more people can afford to join the ranks of developers who pursue these initiatives.
I find it funny that some people feel a need to justify charging money for something by coming up with bogus social justice rationalizations.
I'm not sure this license is the way to go, though. Unusual licenses tend to turn people off, and it's not clear how profits from this license would go to contributors.
I object to the name. It is clearly an attempt to rebrand proprietary licensing by capitalizing on associations with open source. Trademark law of course doesn't apply, and I wouldn't want it to if it did, but this is pretty much the definition of causing confusion in the marketplace.
1. Does this support encryption?
2. Can this do one-command restore of files to a previous revision or day?

Duplicacy follows the git/hg command model. To initialize the repository (the directory to be backed up), run the init command:

duplicacy init repository_id storage_url -e

The -e option turns on encryption. To back up:

duplicacy backup

To restore:

duplicacy restore -r revision_number [files]

Also, how do you see this as being different from Duplicati and Arq?
The main use case supported by Duplicacy but not by any of the others, including Duplicati and Arq, is backing up multiple clients to the same storage while still taking advantage of cross-client deduplication. This is because Duplicacy saves each chunk as an individual file using its hash as the file name (as opposed to using a chunk database to maintain the mapping between chunks and actual files), so no locking is required with multiple clients. Another implication is that the lock-free implementation is actually simpler without the chunk database, and thus less error-prone.
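The cross-client deduplication described above can be sketched as follows. This is a hypothetical illustration, not Duplicacy's implementation: it uses fixed-size chunking (Duplicacy actually uses variable-size chunking) and a shared dict as a stand-in for cloud storage.

```python
import hashlib

storage = {}  # shared "cloud" storage: chunk-hash -> chunk bytes

def backup(client_files, chunk_size=4):
    """Each client independently splits its files into chunks and writes
    each chunk under its content hash. Writes are idempotent, so clients
    need no lock, and chunks shared across clients are stored only once."""
    uploaded = 0
    for data in client_files:
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            name = hashlib.sha256(chunk).hexdigest()
            if name not in storage:  # existence check is a file lookup
                storage[name] = chunk
                uploaded += 1
    return uploaded

n1 = backup([b"shared-data", b"client1-only"])
n2 = backup([b"shared-data", b"client2-only"])  # shared chunks dedup'ed
assert n2 < n1
```

Even if both clients raced and uploaded the same chunk twice, the second write would just overwrite an identical file, so correctness does not depend on coordination.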
One of our users wrote a long post (https://duplicacy.com/issue?id=5651874166341632) comparing Duplicacy with other tools, including Arq, based on his experience. I also added a comment to that thread comparing Duplicacy with Arq based on my reading of their documentation.
Cloud storage back-ends are a somewhat similar story. It wouldn't be that complex, although locking is a problem due to the eventual-consistency (EC) model of most of these services. Plans have existed for quite some time now to enable this — just no time to implement them, and other features are requested more frequently.
The only operation which inherently has to be guarded by a lock in Borg is inserting the archive pointer into the manifest (root object, see https://borgbackup.readthedocs.io/en/latest/internals/data-s...). I suppose it would be possible to work around that without locking or to use the usual hacks around EC, put/get/check/get/check?put/get/check?put etc. until it's "probably there".
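The "put/get/check until it's probably there" workaround for eventually consistent stores can be sketched like this. The `FlakyStore` class is a contrived stand-in whose reads lag its writes; none of this is Borg's actual code.

```python
def put_until_visible(storage_put, storage_get, name, data, attempts=10):
    """Sketch of the put/get/check loop for eventually consistent stores:
    write, read back, and retry until the readback matches, instead of
    taking a lock. Assumes writes of identical content are idempotent."""
    for _ in range(attempts):
        storage_put(name, data)
        if storage_get(name) == data:
            return True
    return False  # still not visible; caller must handle this

class FlakyStore:
    """Toy eventually consistent store: reads lag writes for a few calls."""
    def __init__(self, lag=3):
        self.committed, self.pending, self.lag = {}, {}, lag
    def put(self, name, data):
        self.pending[name] = data
    def get(self, name):
        self.lag -= 1
        if self.lag <= 0:
            self.committed.update(self.pending)
        return self.committed.get(name)

s = FlakyStore()
assert put_until_visible(s.put, s.get, "chunk-abc", b"data")
```

Note the loop only establishes "probably there": with no read-after-write guarantee there is no hard bound on how long visibility takes, which is why it is a hack rather than a correctness mechanism.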
Deleting / pruning archives would still require a full lock due to the same conceptual issues that your two-phase GC avoids. The same goes for "check".
- Off-site storage, preferably not costing too much.
- Option for on-site storage (e.g., to store a backup "in the cloud" and on my NAS)
- Keeps version history, with the associated goodies (purging old backups, etc)
- Able to run on FreeBSD and Linux, with Windows and MacOS being nice to have but not required.
- Able to back up multiple machines to one account.
I strongly suspect that my solution will involve two separate things - one to actually do the backups and another for the storage.
So far, not having looked at Duplicacy, I'm leaning strongly towards attic/borg with rsync.net for off-site storage. At first glance, Duplicacy looks like it will meet my requirements so I will have to give it a closer look before I pick a solution.
https://www.stavros.io/posts/holy-grail-backups/
I have posted it to maybe help a few people who want to do backups: https://news.ycombinator.com/item?id=14507656
Are you using rsync.net's "hidden" attic/borg option? This makes the price very attractive.
You mention using "attic check" to guard against bitrot on the provider's storage. How is this in terms of bandwidth used? Does it have to transfer every byte or does it compute a checksum on the encrypted data (since rsync.net doesn't have the raw data) and just send that?
It would be a case of multiple machines each backing up their /etc, /home, /var, etc to one place.
A quick Google search gives me the feeling that it runs under FreeBSD's Linux emulation and the port seems to break occasionally. I could run it on a real Linux in a VM on FreeBSD though, so that's a potential option.
Duplicity is a pretty straightforward, good old-fashioned incremental backup program.
Duplicacy, on the other hand, uses hash-based deduplication (BorgBackup / Attic, Restic, etc. are some others).
The design of Duplicacy is slightly different from that of e.g. BorgBackup. Duplicacy, as the title says, uses a lock-free approach. BorgBackup and the handful of open source tools in the same spirit use a synchronized approach.
The only question remaining would be the amount of data (i.e. filenames) you'll have to download per amount of data in the backups, which you can vary by adjusting the chunking size.
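A back-of-envelope estimate of that listing overhead, with purely illustrative numbers (the 4 MiB average chunk size and ~70 bytes per listing entry are assumptions, not figures from any particular tool):

```python
def listing_overhead(backup_bytes, avg_chunk_bytes, bytes_per_entry):
    """Metadata fetched per backup run: one listing entry per chunk name."""
    chunks = backup_bytes / avg_chunk_bytes
    return chunks * bytes_per_entry

MiB = 1024 ** 2
TiB = 1024 ** 4

# 1 TiB backed up with 4 MiB average chunks -> ~262,144 chunk names;
# at ~70 bytes per listing entry that's roughly 17.5 MiB of listing data.
overhead = listing_overhead(1 * TiB, 4 * MiB, 70)

# Doubling the chunk size halves the number of file names to fetch,
# at the cost of coarser deduplication.
half = listing_overhead(1 * TiB, 8 * MiB, 70)
```

So for realistic chunk sizes the listing traffic is a tiny fraction of the data itself; the chunk-size knob trades that overhead against deduplication granularity.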
[1]: https://github.com/borgbackup/borg
[2]: https://restic.github.io/
[3]: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET...
Edit: Because S3's PUT Object[4] is idempotent in this case (i.e. ignoring hash collisions, as their probability should be orders of magnitude lower than a doomsday scenario), you could of course just transfer each chunk every time. Realistically, all this would do is hog your bandwidth and ruin your performance. That's why it's possible to make the whole thing lock-free; otherwise you could always run into the problem of uploading the same chunk twice.
[4]: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPUT...