Good points. As for compression: I think most people compress logs post-rotation, so you're unlikely to end up with a corrupted compressed log. Either the file is compressed and then the original removed, or compression fails and the original remains untouched.
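Roughly this shape, as a sketch (not logrotate's actual code, just the compress-then-remove-on-success pattern):

    import gzip, os, shutil

    def compress_rotated_log(path):
        # Compress a rotated log, removing the original only after the
        # compressed copy has been fully written. If anything fails before
        # that point, the original is left untouched.
        tmp = path + ".gz.tmp"
        try:
            with open(path, "rb") as src, gzip.open(tmp, "wb") as dst:
                shutil.copyfileobj(src, dst)
            os.replace(tmp, path + ".gz")   # atomic rename on POSIX
            os.remove(path)                 # original goes away only on success
        except Exception:
            if os.path.exists(tmp):
                os.remove(tmp)              # drop the partial output, keep the original
            raise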
It's more of an issue with whatever is written to disk by running processes. As you point out, it's debatable whether unicode/ASCII vs "binary" is a sensible distinction... I'd say taking a hex editor to a mangled but mostly-ASCII text file is easier than doing the same to any binary format, but perhaps you know of some easy-to-use tool that will take a file description and give you back the data? Things like figuring out the integer encoding and offset of timestamps across different files are much easier with a rather redundant ASCII timestamp than with some binary number.
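To make that concrete, here's a toy Python sketch of what "figuring out the encoding" means for a made-up 8-byte timestamp field (the layout here is invented for illustration, not journald's actual one):

    import struct
    from datetime import datetime, timezone

    # Hypothetical field: a 64-bit little-endian count of microseconds since
    # the Unix epoch - the kind of thing you'd have to guess at in a hex editor.
    ts = datetime(2021, 3, 4, 12, 0, tzinfo=timezone.utc)
    raw = struct.pack("<Q", int(ts.timestamp() * 1_000_000))   # 8 opaque bytes on disk

    # To read it back you need to know width, endianness, epoch and unit:
    (usec,) = struct.unpack("<Q", raw)
    print(datetime.fromtimestamp(usec / 1e6, tz=timezone.utc))

    # The ASCII form carries its interpretation with it, at the cost of more bytes:
    iso = ts.strftime("%Y-%m-%dT%H:%M:%SZ")
    print(iso, "->", len(iso), "bytes of ASCII vs", len(raw), "bytes of binary")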
If your filesystem eats your files... well, then that's a different problem.
Exactly my point: this isn't a complex database format. It's a tagged binary format, written in an append-only fashion. So you're only going to lose data if the tool decides to write bad data, but that's just as true of a text log format - your logs are useless if all those numbers don't actually relate to the values they claim to.
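Something like this toy sketch is what I mean by "tagged and append-only" (an invented layout for illustration, nothing like journald's real on-disk format):

    import struct

    # Each entry is a length-prefixed blob of KEY=VALUE fields; entries are
    # only ever appended to the end of the file, never rewritten in place.
    def append_entry(f, fields):
        payload = b"\0".join(k.encode() + b"=" + v for k, v in fields.items())
        f.write(struct.pack("<I", len(payload)))   # fixed little-endian length header
        f.write(payload)
        f.flush()

    with open("toy.journal", "ab") as f:
        append_entry(f, {"MESSAGE": b"service started", "PRIORITY": b"6"})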
So any tool which can read a journald journal can happily do so until it hits hard corruption - which is about as well as you'll ever do with syslog. I'll gladly trade an unlikely and really narrow recovery scenario for smaller, easily machine-readable, well-defined log files (in the sense that, to write an entry, someone wrote down the exact struct somewhere and had to keep using it that way - no regexes that fail on some case which happens once every million lines of log file). Especially since the compatibility layer is just "forward text logs to syslog".
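Reading that same toy format back illustrates the failure mode: everything up to the first damaged entry is still recoverable (again, just a sketch of the idea, not a journal reader):

    import struct

    def read_entries(path):
        with open(path, "rb") as fh:
            while True:
                header = fh.read(4)
                if len(header) < 4:
                    return                      # clean EOF or a truncated header: stop here
                (length,) = struct.unpack("<I", header)
                payload = fh.read(length)
                if len(payload) < length:
                    return                      # torn write: every entry before it is intact
                pairs = [field.split(b"=", 1) for field in payload.split(b"\0")]
                if any(len(p) != 2 for p in pairs):
                    return                      # garbled payload: stop at the damage
                yield dict(pairs)

    for entry in read_entries("toy.journal"):
        print(entry)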
Fair enough. But then you still need to fit an additional tool into your recovery image. As long as it can be done with a small static binary that can be expected to be available (say, built into a version of busybox), I don't have a great problem with it.