
They found a memory safety bug in their Rust code and assumed it was the only memory safety bug in their Rust codebase. And then they compared it to the historical average in C++ code that's been around for almost two decades in production. I can't be the only one here who sees how biased this comparison is, right?


Rather, they found one memory safety bug in their Rust codebase, and measured it against the legions of memory safety bugs they found in their C++ codebase. In neither case are they measuring against bugs not found, so no, it's not biased.


Except it's not an apples-to-apples comparison. The C++ code has been around a lot longer, and a lot of it was written with older versions of C++ which didn't have modern safety features. I'm sure there's still a bunch of new/delete in their codebase. And I'm sure they're actively looking for memory safety issues in C++, and probably not as hard (if at all) in Rust.


> The C++ code has been around a lot longer

They made an earlier report where they found that older C/C++ code actually has a lot fewer new vulnerabilities than new code, so I guess here they are comparing against new C/C++ code to get the higher ratio, meaning the comparison should actually be apples-to-apples.


Is it a usual thing you do that when you're given data about a literal thousandfold improvement, in a context where there are well-understood theoretical and practical reasons why you might have expected to see such an improvement, you make up reasons why it is actually not an improvement at all, without either investigating to see whether those reasons are actually true or demonstrating that even if they were true, they could possibly explain more than a tiny fraction of the improvement?


I usually am skeptical about a literal thousandfold improvement, yes. And I'm not saying it's impossible, but rather that the data and the way it's presented have inherent biases. It's on the people making grandiose claims to prove them.


The fact that the safe subset of Rust can indeed be made safe, and that unsafe code can be encapsulated like this, has already been formally verified. This is an empirical demonstration that matches these formal results across a large company with a large amount of code (technically, it exceeds them, but this is only under the assumption that memory safety issues per line of unsafe Rust are identical to memory safety issues per line of C, which is really an unwarranted simplifying assumption).
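
To make "encapsulated like this" concrete, here's a minimal sketch of the usual pattern (my own toy example, not code from Google or the article): the unsafe block lives behind a safe function that enforces its invariants, so reviewers only need to audit that one function rather than every caller.

    pub struct FixedBuffer {
        data: Vec<u8>,
    }

    impl FixedBuffer {
        pub fn new(len: usize) -> Self {
            FixedBuffer { data: vec![0; len] }
        }

        // Safe wrapper: the bounds check makes the unchecked access sound.
        pub fn get(&self, idx: usize) -> Option<u8> {
            if idx < self.data.len() {
                // SAFETY: idx was checked against the length just above.
                Some(unsafe { *self.data.get_unchecked(idx) })
            } else {
                None
            }
        }
    }

    fn main() {
        let buf = FixedBuffer::new(4);
        assert_eq!(buf.get(2), Some(0));
        assert_eq!(buf.get(10), None);
    }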

Do you really believe that "they're not actively looking for memory safety issues in Rust" is (1) true (at least outside of Google, there is actually a ton of security work done specifically targeting just the unsafe blocks, since those are obviously where the memory safety issues lie) or (2) could possibly be responsible for a literal thousandfold reduction in memory safety issues? Remember that the Rust code is often integrated with C++ code--there is not necessarily a way to just test the C++ part even if you wanted to. Additionally, Google has explicitly prioritized code that interacts with untrusted data (like parsers and networking code), meaning it's likely to be easier to fuzz most of this Rust code than most new C++ code.

Also remember that, again, there are mechanized proofs of memory safety for a large subset of the safe portion of Rust, which constitutes 96% of the code under consideration here. The rate of memory safety bugs would have to be 25x as high per LOC in unsafe Rust code as in C code for the number of vulnerabilities to match. It would be far more shocking if we didn't see a dramatic reduction. Google is empirically demonstrating that the observed rate of memory safety bugs per line of unsafe Rust is actually far lower than the rate per line of C, but my point is that even if you think that is the result of bias, or of them not applying the same scrutiny to Rust code (something that is certainly not true of Rust vs. C code in the wild), the effect of this underrepresentation cannot possibly explain most of the reduction they observe.
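
If you want to sanity-check that 25x figure, it falls straight out of the unsafe fraction. A back-of-the-envelope sketch, assuming the roughly 96% safe / 4% unsafe split mentioned above:

    fn main() {
        // Assumption from the comment above: ~4% of the Rust code is unsafe.
        let unsafe_fraction: f64 = 0.04;
        // For total memory safety bugs to match an all-C codebase of the same
        // size, the per-line bug rate inside unsafe Rust would have to be this
        // many times the per-line rate in C:
        let breakeven = 1.0 / unsafe_fraction;
        println!("breakeven multiplier: {breakeven}x"); // prints 25x
    }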

Google deployed numerous state-of-the-art mitigations prior to adopting Rust and still found that 70% of their CVEs were due to memory safety issues--your assertion that they are engaged in motivated reasoning and just wanted Rust to work out is pretty ill-founded. In fact, when I worked for Google prior to Rust's release, they were strongly averse to adopting any new language and believed that good engineering practices, automation, and a rigorous review process always outweighed the benefits of adopting a new language beyond their core ones, whatever its purported reliability or performance benefits. Security researchers are, as a rule, highly skeptical of these sorts of claims and have a lot of say at Google. They themselves changed their minds based on this sort of internal evidence.

I agree that skepticism is warranted, because we are constantly being sold things by industry. At a certain point, though, when the effect size is massive and persistent and the mechanism extremely clear, that skepticism (not in general, but of a particular claim) becomes the unscientific position. We are well past that point with Rust wrt memory safety.


You're sure of a lot of things.


Because I don't blindly accept bad science? It's more like others are sure that this data confirms their biases.


Where is your data? You seem to be biased.


Why do you blindly accept the status quo, though? You should be skeptical of both sides.


If you want something more comparable, they estimate that only 5% of their Rust code is within unsafe blocks. That means only 5% of the code has any potential for memory safety issues: that's already a 20x improvement. Let's call it 10x, since unsafe blocks tend to be trickier, but even in unsafe code you still keep a lot of Rust's guarantees.

The thing with Rust is that you know where to look for memory safety issues: the unsafe blocks. C and C++? GLHF, that's your whole codebase. As they mentioned, you don't opt out of all of Rust's guarantees by going the unsafe route. Of course you can ditch them, but that'll be hugely visible during code review. Overall, you can be much more confident saying "yup, there's no bug there" in Rust than in C or C++.
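
To illustrate how visible that opt-out is (again, my own sketch, nothing from the article): a crate can forbid unsafe outright, so reintroducing an unsafe block has to be an explicit, reviewable change rather than something that slips in quietly.

    // Crate-level attribute: any `unsafe` block in this crate is a hard
    // compile error, so opting out of the guarantees can't go unnoticed.
    #![forbid(unsafe_code)]

    fn main() {
        // Uncommenting the next line would fail to compile in this crate:
        // let v = unsafe { std::ptr::null::<u8>().read() };
        println!("no unsafe code allowed here");
    }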


Even if they are off by a factor of 100, it's a huge win and the point stands.

It's fair to point this out and worth the mention. Still, I'd like to think that the engineers behind this can at least gauge the benefit of this endeavor with some accuracy despite the discrepancy in available data, and stating the data that is available only makes sense.


and they found 1000 memory safety bugs and assumed those were the only 1000 memory safety bugs in that code, which has been in production for 2 decades. How naive can they be?


Even so, relatively speaking, if C is not 50,000x worse, it must be at least 2,000x worse.



