https://arstechnica.com/gadgets/2020/01/linus-torvalds-zfs-s...
Replacing a core system component with an out-of-repo version is always going to hurt, yes.
> I switched to btrfs; it just working is worth the few extra warts over ZFS.
I'm not sure I'd call "catastrophic failure and data loss" a "wart". In all my years of distro hopping, I've had 3 root filesystems become unbootable: 1 F2FS system early on, which I actually did manage to fsck out of, and 2 on an openSUSE Tumbleweed system using BTRFS as root.
https://gist.github.com/xenophonf/76fd44ae24772e457cb63d00c0...
`apt-get update && apt-get dist-upgrade -y` works as expected. I plan to switch to a similar config on my Lenovo laptop when I upgrade it to the next Ubuntu LTS release.
So cold data (cold write, cold/hot read) will take less and less space over time while still having the same read performance.
(It would also be a performance nightmare - you'd have a permanent indirection table you'd need to use for _everything_, and if you've ever seen how ZFS dedup performs with its indirection table not on dedicated SSDs, you can understand why this is terrible.)
Compression settings are set at a per dataset level, so applying this to only some files in a dataset isn't practical.
* https://openzfs.github.io/openzfs-docs/Basic%20Concepts/dRAI...
Would be great for home use, where I have a lot of drives that I collected over the years that are not the same size.
EDIT: The more I read into this, it still seems to assume that all drives must be the same size.
That way, if one disk fails, the reserved space is used to write the data necessary to keep the array consistent. Because the free space is distributed randomly across the array, the write performance of a single drive doesn't become a bottleneck.
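To put rough numbers on the bottleneck argument, here is a toy model, not anything taken from the dRAID code. It assumes rebuild writes are split evenly across the surviving drives when the spare space is distributed; the function name and the 8 TiB / 10 drive figures are made up for illustration.

```python
TIB = 1024 ** 4


def rebuild_write_load(failed_bytes, surviving_drives, distributed):
    """Bytes the busiest drive has to write during a rebuild (toy model)."""
    if surviving_drives < 1:
        raise ValueError("need at least one surviving drive")
    if distributed:
        # Assumption: spare space is spread evenly over every surviving
        # drive, so each one absorbs an equal share of the rebuilt data.
        return failed_bytes / surviving_drives
    # A dedicated hot spare has to absorb the whole rebuild by itself.
    return float(failed_bytes)


dedicated = rebuild_write_load(8 * TIB, surviving_drives=10, distributed=False)
spread = rebuild_write_load(8 * TIB, surviving_drives=10, distributed=True)
print(f"dedicated spare:   {dedicated / TIB:.1f} TiB written by one drive")
print(f"distributed spare: {spread / TIB:.1f} TiB written per drive")
```

Real layouts are not perfectly even, so treat the ratio as an upper bound on the benefit rather than a prediction.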
This is unrelated to the ability to remove drives from a pool (which is difficult to support in ZFS due to design constraints)
dRAID, Finally![0]
[1] https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@h...
[2] https://lore.kernel.org/linux-btrfs/20200627030614.GW10769@h...
[3] https://lore.kernel.org/linux-btrfs/20200520013255.GD10769@h...
One thing I am wondering about is this:
> Redacted zfs send/receive - Redacted streams allow users to send subsets of their data to a target system. This allows users to save space by not replicating unimportant data within a given dataset or to selectively exclude sensitive information. #7958
Let’s say I have a dataset tank/music-video-project-2020-12 or something, it is about 40 GB, and I want to send a snapshot of it to a remote machine over an unreliable connection. Can I use the redacted send/recv functionality to send the dataset a chunk at a time and end up with a perfect copy of it that I can then send incremental snapshots to?
> Redacted send/receive is a three-stage process. First, a clone (or clones) is made of the snapshot to be sent to the target. In this clone (or clones), all unnecessary or unwanted data is removed or modified. This clone is then snapshotted to create the "redaction snapshot" (or snapshots).
Think of it like a selective sync in Dropbox or SyncThing at the FS level.
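For anyone who wants to see the stages as commands, here is a rough sketch that only builds the command lines and never runs them. The quoted text covers the first stage; the `zfs redact` and `zfs send --redact` steps are my reading of the zfs-redact(8) and zfs-send(8) man pages, and the dataset, clone, and bookmark names are hypothetical.

```python
def redacted_send_plan(snapshot, clone, bookmark):
    """Return the zfs invocations for a redacted send, in order (not executed)."""
    redaction_snapshot = f"{clone}@redacted"
    return [
        # Stage 1: clone the snapshot, then (by hand) delete or modify the
        # unwanted data inside the clone and snapshot the result.
        ["zfs", "clone", snapshot, clone],
        ["zfs", "snapshot", redaction_snapshot],
        # Stage 2 (assumed from the man page): record what was redacted
        # in a redaction bookmark.
        ["zfs", "redact", snapshot, bookmark, redaction_snapshot],
        # Stage 3 (assumed from the man page): send the original snapshot
        # minus the blocks named by the bookmark.
        ["zfs", "send", "--redact", bookmark, snapshot],
    ]


plan = redacted_send_plan("tank/music@full", "tank/music-redact", "book1")
for argv in plan:
    print(" ".join(argv))
```

Note that this is about leaving data out of a stream; it does not by itself split a transfer into chunks.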
That's not to say rsync doesn't work. It does. But it doesn't scale well, and the data integrity guarantees aren't there.
btrfs seems like the main alternative if you want native kernel support, but when I checked a couple years ago there seemed to be a lot of concerns about the stability. Is that still the case?
[1] https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@h...
[2] https://www.man7.org/linux/man-pages/man8/mkfs.btrfs.8.html#...
[1] https://lore.kernel.org/linux-btrfs/
[2] https://lore.kernel.org/linux-btrfs/CAD7Y51i=mTDnEWEJtSnUsq=...
[3] https://lore.kernel.org/linux-btrfs/CAMXR++KUj2L7qpR7QZeiM2T...
(But as others have pointed out, there are options for using zfs on linux, too)
1. It often happens that the main repo offers a new kernel, but the corresponding module is not ready on obs yet. This means upgrading to the latest rolling release cannot just happen at any time, but requires careful planning. This is a big inconvenience.
2. In the past dracut sometimes just failed to pick up the module for the initrd, causing a boot failure at the next system start. I could not figure out why; however, this never happened with the first-class supported ext/xfs.
3. The distro's boot/rescue media do not contain the driver. This means a third-party boot medium is required to go into a broken system, and repairing it when chroot is involved is now much more complicated because of the different distro.
  fileSystems."/zfs/media" =
    { device = "tank/media";
      fsType = "zfs";
    };
in my hardware-configuration.nix. tank/media is defined as using a legacy mount-point or whatever the ZFS terminology is. Done.
ETA: I mean, I had to do all the gruntwork to get the pool built, yeah. But once it was defined, getting it mounted and all the kernel bits and bobs set was trivial like that.
A friend did a video based on my blog: https://www.youtube.com/watch?v=PILrUcXYwmc
That said, I generally agree with you that "do one thing and do it well" is a laudable design goal. However, I am also very excited about encrypted ZFS for one main reason: backups.
Okay two. Snapshots and backups!
ZFS is absolutely amazing to use as a home NAS that does daily (or more) snapshots and then nightly differential syncs to a second location. In the past I had to run all my own infrastructure to do this, as the data was in the clear.
Now my ZFS nerd friend and I can simply swap backup space and have "zero knowledge" of each other's files, while retaining the amazing features of ZFS snapshots+zfs send/receive.
This also tickles the "create an encrypted-ZFS-backups-as-a-service service" itch for me, but then I realize I'd be creating it for all 13 potential users of the service. That said, I'm sure rsync.net will offer this functionality shortly - which would make them a viable backup target for me.
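A sketch of what the nightly swap might look like, assuming it is done with raw sends (`zfs send --raw`), which ship the blocks still encrypted so the receiving side never needs the key. The host, dataset, and snapshot names are hypothetical, and the function only builds the pipeline string rather than running it.

```python
import shlex


def raw_incremental_backup(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Build a shell pipeline for an encrypted incremental send (not executed)."""
    # --raw keeps the stream encrypted; -i sends only the changes since prev_snap.
    send = ["zfs", "send", "--raw", "-i",
            f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # The friend's machine just stores the stream; it never sees plaintext.
    recv = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    return " | ".join(" ".join(shlex.quote(arg) for arg in cmd) for cmd in (send, recv))


print(raw_incremental_backup("tank/home", "monday", "tuesday",
                             "friend.example", "backup/alice/home"))
```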
This is why I really wish btrfs would get native encryption, but maybe my info is out of date.
Or you can use the latest Ubuntu that is shipped with ZFS.
For the most part, yes. Occasionally a kernel developer who seems to be bitter about a company that doesn't exist any more tries to break compat with ZFS, but it's generally smooth sailing on Fedora, Debian, and CentOS, with dkms handling the building of modules seamlessly.
Do we have encryption yet?
Use BTRFS, trust me, it's stable now... well, the commands are terrible compared to ZFS. All my servers run FreeBSD, but on the laptop and on one workstation I have had openSUSE Tumbleweed for about 2 years and it works great.
Really? I don’t think so, I find btrfs usage extremely straightforward and easy to grok. ZFS on the other hand has all that confusing lingo about vdevs, etc...
I get that this is subjective but I disagree.
As an example, you're running low on space and need to find out which datasets (subvolumes) are using the most space. How do you do that? With ZFS it's a single command which runs in a few milliseconds. With Btrfs...
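The single command is presumably something like `zfs list -o name,used -S used`. If you want to post-process it, here is a small sketch that ranks datasets from the tab-separated output of `zfs list -Hp -o name,used`; the sample text below is invented, and `largest_datasets` is just an illustrative helper.

```python
def largest_datasets(zfs_list_output, top=3):
    """Rank datasets by 'used' bytes from `zfs list -Hp -o name,used` output."""
    rows = []
    for line in zfs_list_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # -H gives tab-separated fields, -p gives exact byte counts.
        name, used = line.split("\t")
        rows.append((name, int(used)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]


# Invented sample output; real values would come from the zfs command.
SAMPLE = "tank/home\t52428800\ntank/media\t838860800\ntank/vm\t209715200\n"

for name, used in largest_datasets(SAMPLE, top=2):
    print(f"{name}: {used / 1024 ** 2:.0f} MiB")
```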
what does that mean?
ZFS on the other hand has just two commands for common administration tasks: zpool and zfs. zpool controls pool-level operations, mainly ones that deal with the storage layer; zfs controls the logical file systems and volumes that are contained within a pool. The zpool and zfs commands have been meticulously crafted to not expose much of the underlying software architecture and focus only on what administrators want, and all of it is clearly documented.
There are actually a few other commands that come with ZFS if you really want or need to deal with low-level and difficult details, commands like zdb, zinject, zstreamdump. You almost never need any of them.
So I guess that the GP considers /usr/sbin/{zfs,zpool} more intuitive than /usr/sbin/btrfs.
> what does that mean?
Not functional but logical (for me)
I switched my freebsd box over to debian about two years ago. No complaints so far :)
For me, that gives a unicorn 100% of the time (tried across several minutes), instead of showing the developer profile.
Anyone else seeing that?
Many thanks to the various OpenZFS contributors.
I've seen people use it as a rootfs on RPis, and have personally run it on Pis for brief occasions without encountering any RAM problems.
(Sorry if noise; I'm just trying to get an idea of how relevant this 2.0 release is to me.)
Previously it was called ZFS on Linux, but ZFS development is now unified on the "OpenZFS" codebase, shared between Linux and FreeBSD, as much of the development effort for ZFS in general ended up there.
I realized how bad the performance was when it took about 2 hours to delete 1000 files.
Deduplication is the process for removing redundant data at the block level, reducing the total amount of data stored. If a file system has the dedup property enabled, duplicate data blocks are removed synchronously. The result is that only unique data is stored and common components are shared among files.
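A toy illustration of the block-level idea, emphatically not how ZFS implements its dedup table: blocks are keyed by checksum, and a block that has been seen before only bumps a reference count. The `DedupStore` class and the tiny 4-byte block size are made up for the example.

```python
import hashlib


class DedupStore:
    """Minimal in-memory block store that keeps one copy per unique block."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}    # checksum -> block contents
        self.refcount = {}  # checksum -> number of references

    def write(self, data):
        """Store data block by block and return the list of block checksums."""
        pointers = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block  # first time we see this block
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            pointers.append(digest)
        return pointers

    def read(self, pointers):
        """Reassemble data from a list of block checksums."""
        return b"".join(self.blocks[p] for p in pointers)


store = DedupStore(block_size=4)
pointers = store.write(b"AAAABBBBAAAA")
print(len(pointers), "blocks referenced,", len(store.blocks), "blocks stored")
```

Every write has to consult the checksum table, which is why the table needs to live somewhere fast.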
Deduplicating data is a very resource-intensive operation. It is generally recommended that you have at least 1.25 GiB of RAM per 1 TiB of storage when you enable deduplication. Calculating the exact requirement depends heavily on the type of data stored in the pool.
Enabling deduplication on an improperly-designed system can result in performance issues (slow IO and administrative operations). It can potentially lead to problems importing a pool due to memory exhaustion. Deduplication can consume significant processing power (CPU) and memory as well as generate additional disk IO.
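Taking the quoted 1.25 GiB per TiB rule of thumb at face value, the sizing arithmetic is trivial. The helper name is made up, and as the passage says, the real requirement depends on the data.

```python
def dedup_ram_gib(pool_tib, gib_per_tib=1.25):
    """Minimum RAM (GiB) suggested by the rule of thumb for a pool of pool_tib TiB."""
    if pool_tib < 0:
        raise ValueError("pool size cannot be negative")
    return pool_tib * gib_per_tib


for size_tib in (4, 8, 40):
    print(f"{size_tib} TiB pool -> at least {dedup_ram_gib(size_tib):.2f} GiB RAM")
```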
ZFS also has a huge legacy. Right now the license (probably) prevents you from legally shipping a compiled ZFS module with the Linux kernel; just solving that seems insurmountable. It's also supported on Illumos and FreeBSD, and trying to refactor it to use the Linux page cache would risk introducing bugs on those platforms.