
There’s something disturbing about the idea of silent data loss; it totally undermines the peace of mind of having backups. ZFS is good, but you can also just run rsync periodically with the --checksum and --dry-run flags and check the output for diffs.
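
Something along these lines (paths are just placeholders; -c compares checksums, -n is a dry run, and I've added -a and -i to itemize what differs):

    rsync -acni /data/ /mnt/backup/data/

With -i and -n, any line of output means the two copies differ in content, not just in timestamps.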


Absolutely, if you can't use a filesystem with checksums (zfs, btrfs, bcachefs) then rsync is a great idea.

I think filesystem checksums have one big advantage over rsync: with rsync, if there's a difference, it isn't clear which copy is wrong.
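
With ZFS, for example, a scrub will tell you exactly which blocks failed their checksums, and repair them if the pool has redundancy. Something like (the pool name "tank" is just an example):

    zpool scrub tank
    zpool status -v tank   # shows scrub progress and lists any files with unrecoverable errors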


I have an MP3 file that still skips to this day, because a few frames were corrupted on disk twenty years ago.

I could probably find a new copy of it online, but that click is a good reminder that backups aren’t just copies; they have to be verified.


It happens all the time. Have a plan, perform fire drills. It's a lot of time and money, but there's no feeling quite like unfucking yourself by getting your lost, fragile data back.


The challenge with silent data loss is that eventually your backups won't have the data either - it will just be gone, silently.

After having that happen a few times (pre-ZFS), I started running periodic find | md5sum > log.txt type jobs and keeping archives.
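
Something like this, where the paths and log names are just examples (the sort keeps diffs between runs stable, and md5sum -c re-verifies files against an old log):

    find /data -type f -exec md5sum {} + | sort -k2 > md5-$(date +%F).log
    md5sum -c --quiet md5-2020-01-01.log   # prints only the files that fail verification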

It’s caught more than a few problems over the years, and it allows manual double-checking even when using things like ZFS. In particular, some tools/settings just aren’t sane for copying large data sets, and I only discovered that when… some of it didn’t make it to its destination.



