> We have automated validation and automated proofs.

Example?

> Proof is necessary. Do you validate the theorem prover, or trust that it works? Do you prove the compiler is correctly compiling the program (when it matters, you should, given they do sometimes re-write things incorrectly) or trust the compiler?

I trust that the people who wrote the compiler and use it will fix mistakes. I trust the same people to discover compiler backdoors.

As for the rest of what you wrote: you're missing the point entirely. Rowhammer, the fdiv bug, they're all mistakes. And sure, malevolence also exists. But when mistakes or malevolence are found, they're fixed, or worked around, or at least documented as mistakes. With an LLM you don't even know how it's supposed to behave.



> Example?

Unit tests. Lean. Typed languages. Even more broadly, compilers.

> I trust the same people to discover compiler backdoors.

https://micahkepe.com/blog/thompson-trojan-horse/

> you're missing the point entirely. Rowhammer, the fdiv bug, they're all mistakes. And sure, malevolence also exists.

Rowhammer was a thing because the physics was ignored. Calling it a mistake misses the point; it demonstrates the falseness of the previous claim:

  We have properly abstracted away the physics of computation. A modern computer operates in a way where, if you use it the way you've been instructed to, the physics underlying the computations cannot affect the computation in any undocumented way.

Rowhammer *is* the physics underlying the computations affecting the computation in a way that was undocumented prior to it getting discovered and, well, documented. Issues like this exist before they're documented, and by definition nobody knows how many unknown things like this have yet to be found.

> But when mistakes or malevolence are found, they're fixed, or worked around, or at least documented as mistakes.

If you vibe code (as in: never look at the code), then find an error with the resulting product, you can still just ask the LLM to fix that error.

I only had a limited time to experiment with this before Christmas (last few days of a free trial, thought I'd give it a go to see what the limits were), and what I found it doing wrong was piling up technical debt, not that it was a mysterious ball of mud beyond its own ability to rectify.

> With an LLM you don't even know how it's supposed to behave.

LLM-generated source code: it only becomes as interpretable as psychology if you've forgotten how to read the source code it made for you to solve your problem, can't learn how to read it, and can't run its tests.

The LLMs themselves: yes, this is the "interpretability" problem; people are working on it.


> Unit tests.

Not proof.
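
To make that concrete, here is a minimal sketch. The function `is_leap_year` and its test suite are hypothetical, made up for illustration: the suite passes, yet the function is wrong for an input the tests never try.

```python
import unittest


def is_leap_year(year: int) -> bool:
    # Hypothetical, deliberately incomplete implementation:
    # it ignores the century rule (1900 was not a leap year).
    return year % 4 == 0


class LeapYearTests(unittest.TestCase):
    # A plausible-looking suite that happens to miss the bug.
    def test_typical_years(self):
        self.assertTrue(is_leap_year(2024))
        self.assertFalse(is_leap_year(2023))
        self.assertTrue(is_leap_year(2000))


def suite_passes() -> bool:
    # Run the suite quietly and report whether every test passed.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LeapYearTests)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("suite passes:", suite_passes())
    # The tests are green, but this answer is wrong.
    print("is_leap_year(1900):", is_leap_year(1900))
```

Green tests only say the checked inputs behave; they say nothing about the rest.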

> Lean.

Fantastic. But what proportion of developers are ready to formalize their requirements in Lean?
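
For a sense of what that asks, a minimal Lean 4 sketch, assuming a recent toolchain where the `omega` tactic is available. The function `double` and the theorem name are hypothetical, made up for illustration. Even a trivial requirement has to be stated as a theorem over all inputs and then proved:

```lean
-- Hypothetical function, for illustration only.
def double (n : Nat) : Nat := n + n

-- The requirement "double n equals 2 * n", stated for every n.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double  -- expose the definition: n + n = 2 * n
  omega          -- linear arithmetic over Nat closes the goal
```

Real requirements are rarely this easy to state, let alone prove.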

> Typed languages. Even more broadly, compilers.

For sufficiently strong type systems, sure! But then we're back to the point above.

> https://micahkepe.com/blog/thompson-trojan-horse/

I am of course aware. Any malevolent backdoor in your compiler could also exist in your LLM. Or the compiler that compiled the LLM. So you can never do better.

> Rowhammer is the physics underlying the computations affecting the computation in a way that was undocumented prior to it getting discovered and, well, documented. Issues like this exist before they're documented, and by definition nobody knows how many unknown things like this have yet to be found.

Yep. But it's a bug. It's a mistake. The unreliability of LLMs is not.

> If you vibe code (as in: never look at the code), then find an error with the resulting product, you can still just ask the LLM to fix that error.

Of course. But you need skills to verify that it did.

> LLM-generated source code: it only becomes as interpretable as psychology if you've forgotten how to read the source code it made for you to solve your problem, can't learn how to read it, and can't run its tests.

Reading source code is such a minute piece of the task of understanding code that I can barely understand what you mean.



