
We also have machines that can perfectly and deterministically check written code for correctness.

And the stochastic LLM can use those tools to check whether its work was sufficient. If not, it will try again, without human intervention, and repeat this loop until the deterministic checks pass.
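A rough sketch of that loop, for illustration only. `fake_model` is a hypothetical stand-in for an LLM call, `check_add` is a toy deterministic check, and the `max_attempts` cap is my own assumption so the loop can't spin forever:

```python
from typing import Callable, Optional, Tuple

CheckResult = Tuple[bool, str]


def check_add(source: str) -> CheckResult:
    """Toy deterministic check: the source must run and add(2, 3) must equal 5."""
    namespace: dict = {}
    try:
        exec(compile(source, "<candidate>", "exec"), namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:  # syntax errors, missing function, runtime errors
        return False, f"{type(exc).__name__}: {exc}"
    if result != 5:
        return False, f"add(2, 3) returned {result}, expected 5"
    return True, ""


def repair_loop(
    generate: Callable[[Optional[str]], str],
    check: Callable[[str], CheckResult],
    max_attempts: int = 5,  # assumption: cap retries instead of looping forever
) -> Tuple[Optional[str], int]:
    """Ask for a candidate, check it, and feed failures back until it passes."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        passed, feedback = check(candidate)
        if passed:
            return candidate, attempt
    return None, max_attempts


def fake_model(feedback: Optional[str]) -> str:
    """Hypothetical LLM stand-in: wrong on the first try, right after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


if __name__ == "__main__":
    code, attempts = repair_loop(fake_model, check_add)
    print(f"passed after {attempts} attempt(s)")
```

The loop only guarantees what `check` actually checks, nothing more.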



> We also have machines that can perfectly and deterministically check written code for correctness.

Please do provide a single example of this preposterous claim.


It's not like testing code is a new thing. JUnit is almost 30 years old today.

For functionality: https://en.wikipedia.org/wiki/Unit_testing

With robust enough test suites you can vibe code an HTML5 parser:

- https://ikyle.me/blog/2025/swift-justhtml-porting-html5-pars...

- https://simonwillison.net/2025/Dec/15/porting-justhtml/

And code correctness:

- https://en.wikipedia.org/wiki/Tree-sitter_(parser_generator)

- https://en.wikipedia.org/wiki/Roslyn_(compiler)

- https://en.wikipedia.org/wiki/Lint_(software)

You can make analysers that check for deeply nested code, methods called in the wrong order, or whatever else you want to check. At work we've added multiple Roslyn analysers to our build pipeline to check for invalid/inefficient code. No human will be pinged by a PR until the tests pass, and an LLM can't claim "Job's Done" before the analysers say the code is OK.

And you don't need to make one yourself, there are tons you can just pick from:

https://en.wikipedia.org/wiki/List_of_tools_for_static_code_...
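To give an idea of how small a "deeply nested code" analyser can be, here is a minimal sketch using Python's standard `ast` module. This is not our Roslyn setup; the function names and the `limit=3` threshold are made up for the example:

```python
import ast
from typing import List, Tuple

# Statements that count as one extra level of nesting.
# Note: an `elif` is parsed as an `if` inside `orelse`, so it counts as deeper.
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting(node: ast.AST, depth: int = 0) -> int:
    """Return the deepest nesting level found below `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def find_deeply_nested(source: str, limit: int = 3) -> List[Tuple[str, int]]:
    """List (function name, depth) for functions nested deeper than `limit`."""
    tree = ast.parse(source)
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            depth = max_nesting(node)
            if depth > limit:
                findings.append((node.name, depth))
    return findings


SAMPLE = """
def shallow(x):
    if x:
        return 1
    return 0

def deep(rows):
    for row in rows:
        if row:
            for cell in row:
                if cell:
                    print(cell)
"""

if __name__ == "__main__":
    for name, depth in find_deeply_nested(SAMPLE):
        print(f"{name}: nesting depth {depth} exceeds limit")
```

In a pipeline you would fail the build whenever the findings list is non-empty.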


> It's not like testing code is a new thing. Junit is almost 30 years old today.

Unit tests check whether code behaves in specific ways. They certainly are useful to weed out bugs and to ensure that changes don't have unintended side effects.

> And code correctness:

These are tools to check for syntactic correctness. That is, of course, not what I meant.

You're completely off the mark here.


What did you mean then if unit tests and syntactic correctness aren't what you're looking for?


Algorithmic correctness? Unit tests are great for quickly poking holes in obviously algorithmically incorrect code, but far from good enough to ensure correctness. Passing unit tests is necessary, not sufficient.
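A contrived example of "necessary, not sufficient" (all names here are made up for illustration): the function below passes every test in its small suite and is still wrong, which only shows up when it is compared against the full Gregorian rule.

```python
from typing import List


def is_leap_year(year: int) -> bool:
    """Deliberately incomplete: ignores the century rule."""
    return year % 4 == 0


def run_unit_tests() -> bool:
    """A plausible-looking suite that the incomplete version passes."""
    assert is_leap_year(2024)
    assert is_leap_year(2000)
    assert not is_leap_year(2023)
    assert not is_leap_year(2019)
    return True


def reference_is_leap_year(year: int) -> bool:
    """Full Gregorian rule, used here as the specification."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def counterexamples(start: int, stop: int) -> List[int]:
    """Years in [start, stop) where the implementation disagrees with the spec."""
    return [
        year
        for year in range(start, stop)
        if is_leap_year(year) != reference_is_leap_year(year)
    ]


if __name__ == "__main__":
    print("unit tests pass:", run_unit_tests())
    print("still wrong for:", counterexamples(1890, 2110))
```

The suite is green, yet 1900 and 2100 are misclassified. Of course, the comparison only works because a trusted specification exists to compare against.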

Syntactic correctness is more or less a solved problem, as you say. Doesn't matter if the author is a human or an LLM.


It depends on the algorithm of course. If your code is trying to prove P=NP, of course you can't test for it.

But it's disingenuous to claim that even the majority of code written in the world is so difficult algorithmically that it can't be unit-tested to a sufficient degree.


Suppose you're right and the "majority of code" is fully specified by unit testing (I doubt it). The remaining body of code is vast, and the comments in this thread seem to overlook that.



