I dream of having a BTFS that would fix my "damaged" media files, e.g. ones I've media-shifted: if my disk was scratched and portions are missing, or if the codec options I picked suck, it could download the "damaged" portions of my media and fix them seamlessly.
Not the same as what you are talking about, but your comment reminded me of AccurateRip [1] which I used to make extensive use of back when I was ripping hundreds of CDs every year.
Pretty sure AccurateRip is only a collection of checksums to validate your rips. http://cue.tools/wiki/CUETools_Database actually improved on it to provide that healing feature (via some kind of parity, I guess?).
Do you have any tricks you can share on how to rip a large library of CDs? It would be nice to semi-automate the ripping process, but I haven't found any tools to help with that. Also, the MusicBrainz audio tagging library (the only open one I am aware of?) almost never has tags for my CDs that are good enough not to need editing afterwards.
I’ll be honest, this was around 2005-2008 — it was a long time ago and at the time I really enjoyed the ritual of it all.
The main advice I can give you is to use ripping software that integrates with AccurateRip (XLD, EAC, etc) and use a widely supported lossless format (like FLAC).
Also, I can’t remember all the details, but there’s a way to store a CUE file and some metadata alongside your rip such that you can recreate an exact copy of the original physical media.
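IIRC the CUE sheet itself is just a small text file describing the track layout, something like this (values made up, obviously):

    PERFORMER "Example Artist"
    TITLE "Example Album"
    FILE "Example Album.flac" WAVE
      TRACK 01 AUDIO
        TITLE "First Track"
        INDEX 01 00:00:00
      TRACK 02 AUDIO
        TITLE "Second Track"
        INDEX 00 04:31:70
        INDEX 01 04:33:20

I believe both EAC and XLD can write one of these out next to the rip.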
At least for now, I’ve moved on to streaming services, but I’m happy to know that I have a large library of music that I ripped myself to fall back to using instead, should I ever choose to.
I never had it fully working because the last time I tried, I was too focused on using VMs or Docker instead of just dedicating a small, older computer to it. But I think about it often, and I may finally take the time to set up a station to properly rip all the Columbia House CDs I bought when I was a teen and held on to.
In the distant past iTunes was great at this (really). Insert a disc, its metadata is pulled in automatically, it’s ripped and tagged using whatever codec settings you want, and when it’s done the disc is ejected.
Watch a show or do some other work, and when the disc pops out like toast, put a new one in.
Ripping DVDs with HandBrake was almost as easy, but it wouldn’t eject the disc afterwards (though it could have supported running a script at the end, I don’t recall).
It really was. In the early 2000s I had a stack of Mac laptops doing exactly this. Made some decent cash advertising locally to rip people's CD collections!
I was ripping my CDs with KDE's own KIO interface, which also does CDDB checks and embeds the original information in ID3 tags. Passing them through MusicBrainz Picard always gave me good tags, but I remember fine-tuning it a bit.
Now I'll start another round with DBPowerAmp's ripper on macOS, and then I'll see which tool gives me better metadata.
another use of this is to share media after I've imported it into my library. if I voluntarily scan hashes of all my media, a smart torrent client could offer just those files (a partial torrent, since I always remove the superfluous files), and that would help seed a lot of rare media files.
I used to use magnetico and wanted to make something that would use crawled info hashes to fetch the metadata and retrieve the file listing, then search a folder for any matching files. You'd probably want to pre-hash everything in the folder and cache the hashes.
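Something like this is what I had in mind for the pre-hash-and-cache part (just a sketch; the cache filename is a placeholder):

    import hashlib, json, os

    CACHE = "hashes.json"  # hypothetical cache location

    def file_sha1(path, bufsize=1 << 20):
        h = hashlib.sha1()
        with open(path, "rb") as f:
            while chunk := f.read(bufsize):
                h.update(chunk)
        return h.hexdigest()

    def build_index(root):
        # path -> sha1 for every file under root, reusing cached values
        cache = {}
        if os.path.exists(CACHE):
            with open(CACHE) as f:
                cache = json.load(f)
        for dirpath, _, names in os.walk(root):
            for name in names:
                p = os.path.join(dirpath, name)
                if p not in cache:
                    cache[p] = file_sha1(p)
        with open(CACHE, "w") as f:
            json.dump(cache, f, indent=2)
        # invert so you can ask "do I have a file with this hash?"
        return {v: k for k, v in cache.items()}

One wrinkle: v1 torrents only hash fixed-size pieces, not whole files, so unless a file happens to align with piece boundaries you'd still have to check it piece-by-piece against the torrent's piece list (BitTorrent v2 does have per-file hashes).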
I hope bitmagnet gets that ability, it would be super cool
I’ve done a lot of archival of CD-ROM based games, and it’s not clear to me this is possible without a lot of coordination and consistency (there are like 7 programs that use AccurateRip, and those only deal with audio). I have found zero instances where a bin/cue I’ve downloaded online perfectly matches (hashes) to the same disc I’ve ripped locally. I’ve had some instances where different pressings of the same content hash differently.
I’ve written tools to inspect content (say in an ISO file system), and those will hash to the same value (so different sector data but the same resulting file system). Audio converted to CDDA (16-bit PCM) will hash the same as well.
If audio is transcoded into anything else, there’s no way it would hash the same.
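Not exactly what my tools do, but as a sketch of the idea (assuming ffmpeg is on your PATH): decode whatever lossless container you have back to raw CDDA and hash that, and identical audio will compare equal regardless of how it’s packaged.

    import hashlib, subprocess

    def cdda_hash(path):
        # decode to raw 16-bit / 44.1 kHz stereo PCM (i.e. CDDA) and hash that
        out = subprocess.run(
            ["ffmpeg", "-v", "error", "-i", path, "-f", "s16le",
             "-acodec", "pcm_s16le", "-ar", "44100", "-ac", "2", "-"],
            capture_output=True, check=True).stdout
        return hashlib.sha1(out).hexdigest()

As above, this is only meaningful when both sources are lossless; a lossy transcode will never hash the same.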
At my last job I did something similar for build artifacts. You need the same compiler, same version, same settings, and the ability to look inside the final artifact and ignore all the variable information (e.g. timestamps). That requires a bit of domain-specific knowledge to get right.
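Not what we actually used, but the general shape of it, for something like a zip/jar artifact (a sketch; real builds have more sources of variation than just timestamps):

    import hashlib, zipfile

    def zip_content_hash(path):
        # hash member names and contents in a stable order, skipping the
        # per-entry timestamps and other metadata that vary between builds
        h = hashlib.sha256()
        with zipfile.ZipFile(path) as z:
            for name in sorted(z.namelist()):
                h.update(name.encode("utf-8"))
                h.update(z.read(name))
        return h.hexdigest()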
Sorry I think you misunderstood me. I mean when I download a torrent called "Rachmaninov Complete Discography" I copy the files to the Music/Classical folder on my NAS. I can no longer seed the torrent unless I leave the original in the shared folder. But if I voluntarily let a crawler index and share my Music folder, it could see the hash of track1.flac and know that it associates with a particular file in the original torrent, thus allowing others to download it.
If you’re worried about bit-flipping, you could just store multiple copies of the hash and then do voting, since it’s small. If you’re worried about correlated sources of error that helps less, though.
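Something as simple as this would do for the voting (sketch):

    from collections import Counter

    def voted_hash(copies):
        # return the value a strict majority of the stored copies agree on,
        # or None if the copies disagree too much to trust
        value, count = Counter(copies).most_common(1)[0]
        return value if count > len(copies) // 2 else None

    voted_hash(["abc123", "abc123", "abc124"])  # -> "abc123"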
As someone with no storage expertise I'm curious: does anyone know the likelihood of an error resulting in a bit flip rather than an unreadable sector? Memory bit flips during I/O are another thing, but I'd expect a modern HDD/SSD to return an error if it isn't sure about what it's reading.
Thanks for the link. I think that 10^14 figure is the likelihood of the disk's error correction failing to produce a valid result from the underlying media, returning a read error and adding the block to the pending sector list: a typical read error that is caught by the OS and prompts the user to replace the drive.
What I understand by bit flip is a corruption that gets past that check (i.e. the flips "balance themselves out" and produce a valid ECC) and returns bad data to the OS without producing any errors. Only the few filesystems that maintain their own checksums (like ZFS) would catch this failure mode.
It's one reason I still use ZFS despite the downsides, so I wonder if I'm being too cautious about something that essentially can't happen.
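You can approximate what ZFS does one layer up, e.g. keep a sidecar manifest of checksums and "scrub" it periodically (a sketch; the manifest name is made up, and hashlib.file_digest needs Python 3.11+):

    import hashlib, json, os

    MANIFEST = "checksums.json"  # hypothetical sidecar file

    def digest(path):
        with open(path, "rb") as f:
            return hashlib.file_digest(f, "sha256").hexdigest()

    def record(paths):
        with open(MANIFEST, "w") as f:
            json.dump({p: digest(p) for p in paths}, f, indent=2)

    def scrub():
        # a mismatch with no I/O error is exactly the silent-corruption case
        with open(MANIFEST) as f:
            manifest = json.load(f)
        return [p for p, d in manifest.items()
                if os.path.exists(p) and digest(p) != d]

It only tells you something went bad, though; without parity or a second copy it can't repair anything.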
Maybe this is a joke that’s over my head, but the OP wants a system where damaged media can be repaired. They have the damaged media so there’s no way to make a hash of the content they want.
Distribute parity files together with the real deal, like they do on Usenet? Usenet itself is pretty much this anyway. Not sure if the NNTP filesystem implementations work. Also, there's nzbfs [1]
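If you want that locally, par2cmdline is the usual tool (roughly; tune the redundancy percentage to taste):

    # create ~10% recovery data alongside the files
    par2 create -r10 album.par2 *.flac

    # later: check the files and rebuild damaged ones from the recovery volumes
    par2 verify album.par2
    par2 repair album.par2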