
BTRFS works fine. I use it on my everyday laptop without problems. Compression can help on devices without much disk space, and so can copy-on-write. However, BTRFS has its drawbacks; for example, it's tricky to have a swapfile on it (it's possible now with some special attributes).
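For anyone looking for the swapfile recipe: the special attribute is NoCoW, and the file has to live somewhere that never gets snapshotted. Roughly, from memory (the paths here are just examples, and newer btrfs-progs can do it in one step):

    # Classic recipe (kernel 5.0+): create the file empty, mark it
    # NoCoW *before* writing any data, then allocate and format it.
    truncate -s 0 /swap/swapfile
    chattr +C /swap/swapfile       # disable copy-on-write for this file
    fallocate -l 4G /swap/swapfile
    chmod 600 /swap/swapfile
    mkswap /swap/swapfile
    swapon /swap/swapfile

    # btrfs-progs 6.1+ wraps all of the above in one command:
    btrfs filesystem mkswapfile --size 4g /swap/swapfile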

Also, I wouldn't trust BTRFS for data archiving, because ext4 is a proven (and simpler) filesystem: it's less likely to become corrupt, and you're more likely to be able to recover data from it if it does become corrupt (or the disk has some bad sectors and that sort of thing).



> Also, I wouldn't trust BTRFS for data archiving, because ext4 is a proven (and simpler) filesystem: it's less likely to become corrupt, and you're more likely to be able to recover data from it if it does become corrupt (or the disk has some bad sectors and that sort of thing).

On the contrary: I'm using btrfs and not ext4 on my NAS (Synology) specifically because the former does checksumming and bitrot detection and the latter does not.
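The checksums only pay off if something actually re-reads and verifies them, so it's worth making sure a periodic scrub is scheduled (DSM exposes this as "data scrubbing"; below is the plain btrfs equivalent, with an example mount point):

    btrfs scrub start /volume1     # re-read everything, verify checksums
    btrfs scrub status /volume1    # progress plus error/repair counters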


I was using urbackup with ext4 and was having issues that caused corruption, and I couldn't figure out why. I saw a recommendation to use urbackup with BTRFS and have had no corruption since. I've used ext4 in every other use case without issue, so I'm not saying ext4 is at fault, but so far BTRFS has worked great for me.


A NAS is not a backup; it's something you use to store data.

The backup I was referring to is the offline one. If I need to back up data (something that, unfortunately, I don't do as often as I should), I need a filesystem that is reliable and proven. Ext4 has been around for more than a decade, and counting its predecessors even longer, so I'm confident that in 20 years I'd be able to mount an ext4 hard drive I forgot in the garage on a modern system; with BTRFS, who knows. I also need a filesystem with lots of tools in case something goes wrong: there are tons of tools to recover data from damaged ext4 drives, but are we sure it's as easy with BTRFS? If the filesystem uses compression, I don't think recovering data is as simple as running photorec...

Also, the filesystem of a backup drive is not something you can change easily. I still have an old 1TB drive that I formatted as NTFS a long time ago, and I've never changed the filesystem, because backing up all the data to another drive (after finding another 1TB drive), reformatting, and copying the data back would take a day. Not that there's anything super important on that drive (mostly stuff I downloaded from the internet years ago), but it's an example of why, for a backup drive, I don't want the cutting-edge choice that creates problems down the road.

Ext4 is ubiquitous, so it's my filesystem of choice for anything that requires the data to be archived for more than 2 years.


I'm not sure how you prove that ext4 is less likely to become corrupt. But it is easily shown that it's less likely to inform you that there's corruption.

Earlier filesystems largely assume the hardware either returns correct data or reports a problem, e.g. an uncorrectable read error or media error. That's been shown to be untrue even with enterprise-class hardware, largely by the ZFS developers; it's why ZFS exists. It's also why ZFS has had quite a lot less "bad press": it was developed in a kind of skunkworks, whereas Btrfs was developed out in the open, where quite a lot of early users were running it on ordinary, everyday hardware.

And as it turns out, most hardware, by make/model, does mostly the right things; a small number of make/models, making up a significant minority of usage volume, don't. Hence, Btrfs has always had full checksumming of data and metadata. Both XFS and ext4 were running into the same kinds of problems Btrfs (and ZFS before it) revealed: torn writes, misdirected writes, bit rot, memory bit flips; even SSDs exhibit pre-fail behavior by returning zeros or garbage instead of data (or metadata). XFS and ext4 subsequently added metadata checksums, which further reinforced the understanding that devices sometimes do the wrong thing and also lie about it.
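If you're curious whether your existing filesystems have those metadata checksums (both have been mkfs defaults on current distros for a while now), something like this should show it (example device and mount point):

    # ext4: look for metadata_csum in the feature list
    dumpe2fs -h /dev/sdX1 | grep -i features

    # XFS: the v5 on-disk format shows crc=1
    xfs_info /mountpoint | grep crc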

It is true that overwriting filesystems have a better chance of repairing metadata inconsistencies. A big reason why is locality: they have fixed locations on disk for different kinds of metadata, so a lot of correct assumptions can be made about what should be in a given location. Btrfs doesn't have that at all; it has very few fixed locations for metadata (pretty much just the super blocks). Since no assumptions can be made about what's found in metadata areas, it's harder to fix.

So the strategy is different with Btrfs (and probably ZFS too, since its fsck is fairly nascent even compared to Btrfs's): cheap and fast replication of data via snapshots and send/receive, which requires no deep traversal of either the source or the destination, and equally cheap and fast restore (replication in reverse) using the same method. Conventional backup and restore, by contrast, are meaningfully different operations when reversed, so you have to test both the backup and the restore to really know whether your backup method is reliable.

Replication is going to be your disaster go-to rather than trying to fix a broken filesystem; fixing is almost certainly going to take much longer than restoring. If you don't have current backups, Btrfs at least now has various rescue mount options that make mounting more tolerant of a broken filesystem, at the cost of mounting read-only. There's a pretty good chance you can still get your data out, even if it's inconvenient to have to wipe the filesystem and create a new one. It'll still be faster than mucking with repair.
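The replication workflow is just a couple of commands per cycle. A sketch with made-up paths:

    # Initial replication: read-only snapshot, then a full send
    btrfs subvolume snapshot -r /data /data/snap-1
    btrfs send /data/snap-1 | btrfs receive /backup

    # Subsequent cycles send only the changes since the last snapshot
    btrfs subvolume snapshot -r /data /data/snap-2
    btrfs send -p /data/snap-1 /data/snap-2 | btrfs receive /backup

Restore is the same thing pointed in the other direction, which is the point: you exercise the exact same code path both ways.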

Also, since kernel 5.3 Btrfs has both read-time and write-time tree checkers that verify certain trees for consistency rather than just blindly accepting checksums. Various problems are exposed and stopped before they can cause worse ones, and the checkers even help find memory bitflips and Btrfs bugs. Btrfs doesn't just complain about hardware-related issues; it'll rat itself out if it's to blame for the problem, which at this point isn't happening any more often than with ext4 or XFS in very large deployments (millions of instances).


> I'm not sure how you prove that ext4 is less likely to become corrupt. But it is easily shown that it's less likely to inform you that there's corruption.

I wasn't talking only about corruption of the filesystem itself (I don't know if it's more or less likely with BTRFS; some say BTRFS is more likely to become corrupt after power failures, I don't know if that's true), but also about hardware failures. In the case of a disk with damaged sectors (I know we should have 3 backups with one offsite, but you always have that one disk with important data that you've been promising to back up "tomorrow" for a year, until it breaks), I think a filesystem with a simpler structure gives a higher probability of recovering the data, while with BTRFS, or any filesystem that does COW, compression, volumes, etc., it's more difficult, because files are not stored as plain blocks on the disk but in a more complex structure that must be decoded.

Also, BTRFS is a relatively new filesystem, which has two disadvantages: it doesn't have all the tools that were developed over the years for ext4, and the BTRFS driver is still evolving. I can be pretty confident that if I format a hard disk with ext4 today, in 20 years I'll find a driver for a modern Linux (or whatever OS replaces it) that can mount it; can we have the same assurance with BTRFS? I don't know.

So for the purpose of making backups and archiving data, I think that I will stick with ext4 for a while. On my laptop, and the systems I use day to day, I use BTRFS without any problems.


>some say BTRFS is more likely to become corrupt after power failures

No. If the drive honors flush/FUA, Btrfs is less likely to corrupt data or metadata than overwriting filesystems, because an interruption can't result in incomplete overwrites. The same holds for any copy-on-write filesystem vs. an overwriting one (and probably also log-based filesystems). The trouble is when the drive transiently lies about flush/FUA success and there's an ill-timed crash. Then there's a chance the super blocks point to trees that don't exist yet, because the write order wasn't honored. There are backup trees, so it may be possible to work around this defect with the `rescue=usebackuproot` mount option, but sometimes the defect is so bad that the write reordering leaves Btrfs finding only trees with the wrong generation, and it fails to mount. Often it's still possible to get your data out with the offline scrape tool, `btrfs restore`. But it's a difficult problem to deal with.

In theory it's similar on ZFS, but I know nothing about its on-disk format; maybe its metadata has some locality, in which case certain assumptions could be made that let it better work around such a firmware defect. I'm not sure. On a power fail, Btrfs can lose the most recently written data if writes were in progress and thus not yet fully committed to stable media. How much data really depends on the application doing the writes.
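In practice the escalation path looks something like this (device name is just an example):

    # Try the backup tree roots first; rescue= options require read-only
    mount -o ro,rescue=usebackuproot /dev/sdX /mnt

    # If it won't mount at all, scrape files out without mounting
    btrfs restore -v /dev/sdX /path/to/recovery/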

>In the case of a disk with damaged sectors

Btrfs by default keeps two copies of metadata, so it deals with this problem automatically, self-healing when such problems are encountered.
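You can see which profiles are in use, and convert an existing filesystem's metadata to two copies if it was created with something else, along these lines:

    btrfs filesystem df /mnt                 # shows e.g. "Metadata, DUP"
    btrfs balance start -mconvert=dup /mnt   # rewrite metadata as DUP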

>a filesystem with a simpler structure gives a higher probability of recovering the data, while with BTRFS, or any filesystem that does COW, compression, volumes, etc., it's more difficult, because files are not stored as plain blocks on the disk but in a more complex structure that must be decoded

The on-disk format is fairly simple and extensible, and metadata isn't subject to compression. In the case of bad sectors in compressed (user) data, you'll certainly lose more data than if it weren't compressed. That's an expected trade-off, and not really a Btrfs issue, just the way all compression works: a small corruption has a bigger effect.
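The damage is also bounded: if I recall correctly, Btrfs compresses in extents of at most 128 KiB of uncompressed data, so a bad sector costs you at most that one extent, not the whole file. And you can opt irreplaceable files out of compression per directory or per file, something like (example paths):

    # New files in this directory will inherit zstd compression
    btrfs property set /mnt/archive compression zstd

    # Or flag a file/directory as never-compress
    chattr +m /mnt/archive/precious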

>So for the purpose of making backups and archiving data, I think that I will stick with ext4 for a while.

I used to hedge my bets by having multiple copies of data on different file systems (including ZFS) but haven't done that in years. I've seen too many cases of (hardware induced) data corruption being replicated into backups and archives without any warning it was happening until it was too late - and only corrupt copies remained.


With this, I wonder why ext4 and not XFS.


I didn't know about the swapfile thing... but TIL. I had been wondering how to make a non-snapshotted volume for some other reasons, though, so that's a two-birds-with-one-stone thing. Thank you!
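For the non-snapshotted part: snapshots don't descend into nested subvolumes (they show up as empty directories in the snapshot), so anything you want excluded just needs its own subvolume, e.g. (path is hypothetical):

    btrfs subvolume create /home/user/no-snapshots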



