
Generally speaking, the purpose of a program is not to minimize the number of memory safety bugs. All other things being equal, yes, having fewer memory safety bugs is better than having more. But perhaps you're trading legible bugs for illegible bugs? The Rust implementation is most likely going to be more complex than the C implementation (which is fair, since it almost eliminated a whole class of bugs), and in that complexity there is extra room for bugs unrelated to memory safety.

There are probably also 500x more people who know C to a given level than know Rust to that level.

If we have an analyzer that can find memory safety bugs in C, we could also just put that in the CI pipeline, or as a pre-submit hook before you're allowed to add code to a code base.



This idea that if Rust doesn't have all those memory safety bugs it must somehow have loads of other bugs we haven't discovered reminds me of Americans insisting that countries which don't have their lousy gun safety problems must have the same effects by some other means they haven't detected. Like, OK, England doesn't have lots of gun murders like America, but surely loads of English people are just dropping dead because someone threw a yoghurt at them, or used harsh language, and we just missed them off our statistics?

No man, it is possible to just do better, and this is an example of just doing better. The Rust is just better software. We can and should learn from this sort of thing, not insist that better is impossible and the evidence suggesting otherwise must be a mirage.



I don't see the connection, did you reply in the wrong place?


You're not fully understanding the issue with memory safety. When you write C or C++, you're promising that you won't violate memory safety at all. That's just a basic requirement of what it means to write in those languages.

The graph about reverted code also addresses the "illegible bugs" argument.

As for an analyzer, that's what ASAN is. I hope I don't need to explain why that's not a universal solution (even though everyone should be using it).


> You're not fully understanding the issue with memory safety. When you write C or C++, you're promising that you won't violate memory safety at all.

The post you reply to does not indicate a misunderstanding of memory safety at all.


The comment I'm responding to implicitly assumes memory safety violations are like other bugs, where it's meaningful to speak of programs being more or less correct depending on the number of issues. What I'm emphasizing is that code with safety violations, strictly speaking, isn't C/C++ at all. It's more like parsing paint splatters as Perl [0]. You might get something resembling what you want if you're lucky, but you also might not, depending on how the compiler feels that day.

Let's use an example: https://godbolt.org/z/TP6n4481j

The code shows main immediately calling a nullptr. What the compiler generates is a program that calls unreachable() instead. These are two different programs. If memory safety is "just" a bug, this would be a miscompilation. It's not a miscompilation though, because what I've given the compiler is something that resembles C++, but is actually some similar language where null dereferences are meaningful. The compiler only knows about C++ though and C++ doesn't have nullptr dereferences, so it assumes I haven't done that. Instead it generates a program corresponding to an execution trace that is valid C++, even if it can't see the call to NeverUsed(). If you use -O0, you get the segfault as expected.

A single instance of memory unsafety (or other UB) can take your program arbitrarily far from "correct". All other things being equal, a program with 1 violation might be just as incorrect as a program with 100. I could add a hundred more lines of safety violations after Do() without changing the compiled behavior. You don't even need to execute the unsafety to have "spooky action at a distance" cause that change.
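
For readers who can't open the Compiler Explorer link, the general shape of that kind of example can be sketched as follows. This is not the linked code: only the names `Do` and `NeverUsed` come from the comment above, while `Target`, `CallOr` and `GuardDemo` are hypothetical. The null check is added here so that the sketch stays within defined behavior; dropping it and calling `Do()` while it is still null is the undefined behavior being discussed.

```cpp
#include <cassert>

using Function = int (*)();

// File-local function pointer; static storage means it starts out null.
static Function Do = nullptr;

// Hypothetical callee, the only value ever assigned to Do.
static int Target() { return 42; }

// The only function that ever writes to Do.
void NeverUsed() { Do = Target; }

// Guarded call: defined whether or not NeverUsed() has run.
// Calling Do() unconditionally would be UB while Do is still null.
int CallOr(int fallback) { return Do != nullptr ? Do() : fallback; }

// Example use: the same call before and after the pointer is set.
void GuardDemo() {
    assert(CallOr(-1) == -1);  // Do is still null: the fallback is returned
    NeverUsed();
    assert(CallOr(-1) == 42);  // Do now points at Target
}
```

With the check in place the compiler has to preserve both outcomes, because both are reachable in a valid program.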

[0] https://web.archive.org/web/20190406194101/http://colinm.org...


Finally someone who actually gets it.

Many Rust proponents are experienced C and C++ developers who have dealt with this situation for decades. Given the language, it's understandable that compilers make the choices that they do. It's also understandable that programmers find it unreasonably difficult to reason about code written in such a language.


> What I'm emphasizing is that code with safety violations, strictly speaking, isn't C/C++ at all.

This isn't really correct, and many programming language standards (including those of C and C++) don't support this view. Many language standards define a notion of conformance. Strictly conforming programs aren't allowed to invoke behaviors that are undefined[1].

Conforming programs do not have this requirement and basically any non-trivial C and C++ programs are written to this rather than the notion of "strictly conforming".

Most non-trivial programs are not strictly conforming (including some C compilers themselves), generally because restricting the set of targets to something smaller than "any possible C implementation" is useful.

It is perfectly legal (and very desirable in cases where the standards fall short of usefulness) for a C compiler to define undefined behavior. What you compiled is still a C program, just one that isn't portable across the entire potential set of implementations.

[1]: Or unspecified or implementation-defined, for that matter, but this part tends to get left out of discussions.


The C++ ISO document describes conforming implementations of the language, i.e. compilers and similar tools. That conformance isn't a property of your program at all.

So far as I can tell there is no mention of the program conformance you're describing.


There's a line in the standards that basically says a conforming program is anything acceptable by a conforming implementation. In theory you could have an implementation that gives semantics to UB like Fil-C or CCured do. No mainstream implementation does that for memory unsafety due to the performance overhead, and conforming implementations are required to document those extensions. I don't think there's a sane argument for an implementation to intentionally choose the behavior in the example I provided and Clang certainly doesn't, so it's non-conformant regardless.


> No mainstream implementation does that for memory unsafety due to the performance overhead

It depends on what is considered memory safety here (especially when some of the violations are arguably unforced errors in the standards), but many implementations do in fact have options for this. "No delete null pointer checks" (GCC's -fno-delete-null-pointer-checks) is one such option, and it is used extensively by the Linux kernel.

The performance impact tends to be negligible outside of (sometimes contrived) benchmarks, especially when compared to algorithmic efficiencies or the like.


> There's a line in the standards that basically says a conforming program is anything acceptable by a conforming implementation.

Perhaps it "basically" says that, but it certainly doesn't appear to literally say any such thing, so you're going to need to specify where you believe you saw this so that I can have any idea what it actually says.


C standard N3096, Section 4:

    A conforming program is one that is acceptable to a conforming implementation.

That definition goes all the way back to C89. The C++ standard drops it for the term "well-formed program", but adds enough clarifications in 1.4 to mean essentially the same thing.


Ah, no. Most C++ programs that compile are not well-formed programs. This functions as an escape hatch for Rice's Theorem. You see, C++ even more so than C has semantic requirements, but Rice says all non-trivial semantic requirements are undecidable. So if you want what C++ says it wants, it appears that compilers would be entirely impossible, and that's awkward. To "fix" that, C++ says it's fine if the compiler compiles your program even though it is not well-formed. The program doesn't have any meaning, of course (only well-formed programs have meaning), but it did compile, so as a programmer you're happy...

C++ has a recurring phrase in its standard document "Ill-formed No Diagnostic Required" or IFNDR which carries this intent. The compiler can't tell you made a mistake, but you didn't actually write a valid C++ program so -shrug-

Because there's no way to tell for sure without exhaustive human examination, we don't know how many C++ programs aren't actually well-formed, but experts who've thought about it tend to think the answer for large C++ software projects is all or most of them.


Chances are, a Rust implementation of certain things may be simpler than a C implementation. C is a low-level language, so you have to do more housekeeping and express things obliquely, via implementation, vs expressing things more declaratively in Rust.

Being simpler is not a given though.

"Knowing C" as being able to read and understand what's happening is quite separate from "knowing C" as being able to write it competently. Same thing with Rust: an algorithm written in Rust is far from impenetrable for a non-expert, even for someone who sees Rust for the first time but has enough experience with other languages.


The idea that people occasionally throw around that C is more 'simple' and less 'complex' than C++ or Rust and therefore it leads to more maintainable or easy to understand code is, IMO, completely bogus.

C is not simple, it is inept. There are so, so many bargain-bin features and capabilities that it just cannot do that it ends up creating much MORE complex code, not less complex code.

I mean, the pretense that simple tool = simple engineering just isn't necessarily true. Building a home using an excavator and drills is fairly straightforward. You know what's complicated? Trying to build a home using only a screwdriver. Yeah. Good luck with that, you're gonna have to come up with some truly insane processes to make that work. Despite a screwdriver being so much simpler than an excavator.

Trivial example: you want to build a container that can hold data of different types and perform generic operations on them.

C++ and Rust? Easy. Templates and generics. C? Up until a few years ago, your options were: 1. copy and paste (awful) or 2. use void * (also awful).

Copy and paste means your implementations will diverge and you just artificially multiplied your maintenance burden and complexity. And void pointer completely throws away any semblance of type safety, forces you to write stupid code that's way more complex than it needs to be, and, to top it off, is horrible for performance!
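
To make the contrast concrete, here is a minimal C++ sketch of the two approaches. All names (`Stack`, `VoidStack`, `StackDemo`) are hypothetical and chosen only for illustration, and `VoidStack` uses `std::vector` for brevity where a real C version would manage the array by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Generic stack: one implementation, type-checked by the compiler for every T.
template <typename T>
class Stack {
public:
    void push(T value) { items_.push_back(std::move(value)); }

    // Removes and returns the top element. Precondition: !empty().
    T pop() {
        T top = std::move(items_.back());
        items_.pop_back();
        return top;
    }

    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;
};

// void* equivalent: elements are untyped pointers, so the element type is
// known only to the caller, who must cast back correctly on every access.
struct VoidStack {
    std::vector<void*> items;
};

void StackDemo() {
    Stack<std::string> names;
    names.push("alpha");
    names.push("beta");
    assert(names.pop() == "beta");
    // names.push(42);  // would not compile: 42 is not a std::string

    int value = 7;
    VoidStack raw;
    raw.items.push_back(&value);
    // Nothing stops a caller from casting to the wrong type here.
    int* back = static_cast<int*>(raw.items.back());
    assert(*back == 7);
}
```

The `void*` version also forces an extra indirection per element, which is one plausible source of the performance cost mentioned above.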

That's just one example, but there's so, so many when you look around C++ or Rust enough. And these are not rare things, to me. To me, these are everyday coding problems.

Anonymous functions? There's another one. Encapsulation? Just making not literally every piece of data universally mutable? Not possible in C. Trivial in C++ and Rust, and it makes your programs SO much easier to reason about.
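
As a rough C++ illustration of those two features (the `Account` class and `CountAbove` function are made-up names, not from any real codebase): private data can only change through a member function that enforces an invariant, and a lambda supplies the predicate inline.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Encapsulation: the balance can only change through deposit(), so the
// "never negative" invariant is enforced in one place.
class Account {
public:
    // Returns false and leaves the balance unchanged for non-positive amounts.
    bool deposit(long cents) {
        if (cents <= 0) return false;
        balance_cents_ += cents;
        return true;
    }
    long balance_cents() const { return balance_cents_; }

private:
    long balance_cents_ = 0;
};

// Anonymous function: the predicate is written inline and captures `limit`.
std::ptrdiff_t CountAbove(const std::vector<Account>& accounts, long limit) {
    return std::count_if(
        accounts.begin(), accounts.end(),
        [limit](const Account& a) { return a.balance_cents() > limit; });
}

void AccountDemo() {
    std::vector<Account> accounts(3);
    accounts[0].deposit(500);
    accounts[1].deposit(50);
    assert(!accounts[2].deposit(-10));  // rejected, balance stays 0
    assert(CountAbove(accounts, 100) == 1);
}
```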


> Just making not literally every piece of data universally mutable? Not possible in C. Trivial in C++ and Rust, and it makes your programs SO much easier to reason about.

And Rust is significantly better at this than C++ for the simple reason that mut is a modifier. I’ve lost track of how many times I’ve listened to Kate Gregory extol the virtues of const-ing all the things, but people still don’t systematically add it, and, as readers, we’re left wondering whether things actually need to be mutable, or the author forgot/didn’t know to add const-ness to their code. With Rust having opt-in mutability, you know for a fact that mutability was a deliberate choice (even if sometimes the only motivation was “make the compiler happy”).


> I’ve lost track of how many times I’ve listened to Kate Gregory extol the virtues of const-ing all the things, but people still don’t systematically add it

Adding const to _function-local_ variables only really matters when you "leak" a pointer or ref, whether mutable or const, to a function or variable the compiler can't optimize away:

    std::size_t sz = 4096;
    const std::size_t &szRef = sz;
    some_opaque_func(szRef);
    if (sz != 4096) std::abort(); // cannot be optimized away unless sz is const

as there is no way to know if something obtains a mutable ref to sz down the line.

In other cases like RVO, adding const is actually detrimental as it prevents the move-constructor from being selected (likewise with the move assignment operator).
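
A small sketch of the move-suppression effect, assuming C++17 (so that initializing the result from the returned prvalue is elided). `Tracker`, `PassThrough`, `PassThroughConst` and `Measure` are hypothetical names. By-value parameters are used because they are never eligible for copy elision on return, which makes the counts deterministic; a const local can still be elided under NRVO, and it only falls back to a copy when elision does not happen.

```cpp
#include <cassert>

// Counts how often Tracker objects are copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// The return statement must construct the result from the parameter `t`.
Tracker PassThrough(Tracker t) { return t; }             // implicit move
Tracker PassThroughConst(const Tracker t) { return t; }  // const forces a copy

struct Counts {
    int copies;
    int moves;
};

// Calls fn(source) on an lvalue and reports the copies/moves it caused.
template <typename Fn>
Counts Measure(Fn fn) {
    Tracker source;
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker result = fn(source);  // one copy initializes the parameter
    (void)result;
    return {Tracker::copies, Tracker::moves};
}

void MoveDemo() {
    Counts plain = Measure(PassThrough);
    Counts with_const = Measure(PassThroughConst);
    // The const version trades the move on return for an extra copy.
    assert(with_const.copies == plain.copies + 1);
    assert(plain.moves == with_const.moves + 1);
}
```

For a trivially cheap type the difference is irrelevant; it matters for types where a copy allocates and a move does not.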

Rust _needs_ to have const by default due to its aliasing model ("only one mutable ref per object"), and you can't have cheap bounds checks without this. But that, too, is a tradeoff (some classes of programs are hard to code in Rust).


Pretty sure the std::abort() can't be optimized away if sz is mutable since it's legal for some_opaque_func() to cast away szRef's const and modify sz via that. sz itself needs to be const for the if statement to be removable as dead code.

https://cpp.godbolt.org/z/Pa3bMh9Ee shows that both GCC and Clang keep the abort when sz is not const. Add const and the abort goes away.
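
A minimal sketch of why the compiler has to keep the check (this is not the code behind the link; `OpaqueWrite` is a hypothetical stand-in for some_opaque_func): writing through a const reference via const_cast is well-defined as long as the referenced object itself was not declared const.

```cpp
#include <cassert>
#include <cstddef>

// Takes a const reference but writes through it. This is legal only because
// the caller's object is not declared const; for a const object it is UB.
void OpaqueWrite(const std::size_t& ref) {
    const_cast<std::size_t&>(ref) = 1;
}

void ConstRefDemo() {
    std::size_t sz = 4096;  // not const: the write below is defined behavior
    const std::size_t& szRef = sz;
    OpaqueWrite(szRef);
    assert(sz == 1);  // the callee really did change sz
}
```

Declaring `sz` itself as const turns the write into undefined behavior, which is what lets the optimizer assume `sz` is still 4096 and drop the comparison.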


Yes, that is what I said - sorry if this wasn't clear


Looking back I think I might have misread your comment and thought you meant that the const on the reference was what mattered. Sorry about that!


> The idea that people occasionally throw around that C is more 'simple' and less 'complex' than C++ or Rust and therefore it leads to more maintainable or easy to understand code is, IMO, completely bogus.

This, this, this.

C compilers are simple, but the C language is not, and let’s not even talk about C++.


I like to focus on the ways that C is actually quite complicated, especially the complications that directly provoke UB when you don't know about them. Integer promotion and strict aliasing are at the top of my list.
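
Two small sketches of those complications, written in C++ but applying to C as well. The function names are hypothetical, and the second part assumes a 32-bit IEEE 754 `float`, which the static_asserts check.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Integer promotion: operands narrower than int are widened to int before
// arithmetic, so ~a is an int, not a uint8_t.
bool ComplementIsZero(std::uint8_t a) {
    return ~a == 0;  // never true: the promoted result is always negative
}

// Same test with the result converted back to the narrow type first.
bool ComplementIsZeroNarrow(std::uint8_t a) {
    return static_cast<std::uint8_t>(~a) == 0;
}

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "assumes IEEE 754 float");

// Strict aliasing: reading a float object through a uint32_t* is UB.
// Copying the bytes with memcpy is a well-defined way to get at the bits.
std::uint32_t FloatBits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

void PromotionDemo() {
    assert(!ComplementIsZero(0xFF));      // ~0xFF is -256 as an int
    assert(ComplementIsZeroNarrow(0xFF)); // truncated back to 8 bits: 0
    assert(FloatBits(1.0f) == 0x3F800000u);
}
```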


> Trivial example: you want to build a container that can hold data of different types and perform generic operations on them.

Do I?

I would simplify the problem to not need different types or generic operations.

Or if I really need generic operations, break them down to smaller operations so you don't need to take a bunch of type parameters everywhere.

For containers, for example: instead of having container<T>, have the container operations return an index or 'opcode', which the user then applies to their own data. The container doesn't need to know about T, void pointers or sizes, just its own internal bookkeeping.
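
One possible reading of that idea, sketched in C++ with hypothetical names (`SlotPool`, `acquire`, `release`): the container only hands out and recycles slot indices, and the caller keeps the actual data in its own array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Bookkeeping-only container: hands out slot indices and recycles freed ones.
// It never sees the element type; the caller stores data in its own array.
class SlotPool {
public:
    // Returns an index the caller should use for its own storage.
    std::size_t acquire() {
        if (!free_.empty()) {
            std::size_t slot = free_.back();
            free_.pop_back();
            return slot;
        }
        return next_++;
    }

    // Marks a previously acquired slot as reusable.
    void release(std::size_t slot) { free_.push_back(slot); }

    // Number of distinct slots ever handed out; caller arrays need this size.
    std::size_t capacity() const { return next_; }

private:
    std::vector<std::size_t> free_;
    std::size_t next_ = 0;
};

void SlotPoolDemo() {
    SlotPool pool;
    std::vector<std::string> names;  // caller-owned data, indexed by slot

    std::size_t a = pool.acquire();
    names.resize(pool.capacity());
    names[a] = "first";

    std::size_t b = pool.acquire();
    names.resize(pool.capacity());
    names[b] = "second";

    pool.release(a);
    std::size_t c = pool.acquire();  // reuses slot a
    names[c] = "third";

    assert(c == a && names[b] == "second" && names[a] == "third");
}
```

The pool needs no templates or void pointers, at the cost of the caller doing the indexing and resizing itself.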


That's valid, even sensible, but you have now left significant performance on the table.



