
It's not a new phenomenon. Time was, people would copy-paste from blog posts with the same effect.




Always the same old tiring "this has always been possible before in some remotely similar fashion hence we should not criticise anything ever again" argument.

You could intuitively think it's just a difference of degree, but it's more akin to a difference of kind. Same for a nuke vs. a spear: both are weapons, but no one argues they're similar enough that we can treat them the same way.


Yes, I'm so over this argument. It can literally be made for anything, and it is!

At the end of the day we're not waging war by poking other people with long sticks, and we're not getting the word out by sending carrier pigeons.

Methods and medium matter.


I would bet in most organizations you can find a copy-pasted version of the top SO answer for email regex in their language of choice, and if you chase down the original commit author they couldn't explain how it works.

I think it's impossible to actually write an email regex because addresses can have arbitrarily deeply nested escaping. I may have that wrong. I'd hope that regex would be .+@.+ and that's it (watch me get Cunninghammed because there is some valid address wherein those plusses should be stars).
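
For reference, a deliberately permissive check along those lines is about all a regex can sensibly promise. A sketch in Python; the names are mine, not from any particular SO answer:

    import re

    # Deliberately loose: something, an @, something. Full RFC 5322
    # validation isn't practical with a simple regex, since quoted
    # strings and comments allow nesting that a regex can't track.
    PERMISSIVE_EMAIL = re.compile(r".+@.+")

    def looks_like_email(address: str) -> bool:
        return bool(PERMISSIVE_EMAIL.fullmatch(address))

The only real validation is sending a confirmation mail to the address anyway.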

TIL Cunningham's Law[0]. I knew about that phenomenon but not the proper name. Thanks!

[0] https://en.wikipedia.org/wiki/Ward_Cunningham#Law


Yeah, but being able to produce nuclear-sized 10k+ LOC PRs to open-source projects in minutes with near-zero effort definitely is new. At least you had to use your brain to know which blog posts/SO answers to copypasta from.

I don't see the problem with fentanyl given that people have been using caffeine forever.

I used to do that in simpler days. I'd add a link to where I copied it from so we could reference it if there were problems. This was for relatively small projects with just a few people.
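
In practice that was just a comment above the pasted block, something like this (the link and the helper here are made up for illustration):

    # Adapted from https://stackoverflow.com/a/XXXXXXX
    # (kept the link so we can revisit the original answer and its
    # comment thread if this ever misbehaves)
    def chunked(items, size):
        """Yield successive chunks of length `size` from items."""
        for i in range(0, len(items), size):
            yield items[i:i + size]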

> I'd add a link to where I copied it from

LLMs can't do this.

Your code is unambiguously better than any LLM code if you can comment a link to the stackoverflow post you copied it from.


Agreed on the first part for sure since an LLM is the computer/software version of a blender.

So I agree on the second part too, then.


> Your code is unambiguously better than any LLM code if you can comment a link to the stackoverflow post you copied it from.

This is not a truism. "My" code might come from an LLM and that's fine if I can be reasonably confident it works. I might try to gain that confidence by testing the code and reading it to understand what it's doing. The same is true of blog post code, regardless of how I refer to the code; if I link to the blog post, it's because it does a better job of explaining than I ever could in code comments. Whether LLMs make one more productive is hard to measure, but a claim like that seems to miss the point.
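
As a concrete, made-up illustration of the testing part: if the LLM hands me, say, a slugify helper, a few parameterized pytest cases pin down the behaviour I actually care about before the code goes anywhere.

    # test_slugify.py -- hypothetical; slugify() is whatever the LLM produced
    import pytest

    from myproject.text import slugify  # made-up module path

    @pytest.mark.parametrize("raw, expected", [
        ("Hello, World!", "hello-world"),
        ("  spaces   everywhere  ", "spaces-everywhere"),
        ("already-a-slug", "already-a-slug"),
    ])
    def test_slugify(raw, expected):
        assert slugify(raw) == expected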

The point is, including the code is a choice and one should be mindful of it, no matter the code's origin. At that point, this comes off like you just have something to prove; there doesn't seem to be a reason not to use the LLM code if you know it works and you know why it works.


Believing you know how it works and why it works is not the same as that actually being the case. If the code has no author (in that it's been plagiarised by a statistical process that introduces errors), there's nowhere to go if you realise "oops, I didn't understand that as well as I had thought!".

> If the code has no author ... there's nowhere to go if you realise "oops, I didn't understand that as well as I had thought!"

That's also true if I author the code myself; I can't go to anyone for help with it, so if it doesn't work then I have to figure out why.

> Believing you know how it works and why it works is not the same as that actually being the case.

My series of accidental successes producing working code is honestly starting to seem like real skill and experience at this point. Not sure what else you'd call it.


> so if it doesn't work then I have to figure out why.

But it's built on top of things that are understood. If it doesn't work, then either:

• You didn't understand the problem fully, so the approach you were using is wrong.

• You didn't understand the language (library, etc) correctly, so the computer didn't grasp your meaning.

• The code you wrote isn't the code you intended to write.

This is a much more tractable situation to be in than "nobody knows what the code means, or has a mental model for how it's supposed to operate", which is the norm for a sufficiently-large LLM-produced codebase.

> My series of accidental successes

That somewhat misses the point. To write working code, you must have some understanding of the relationship between your intention and your output. LLMs have a poor-to-nonexistent understanding of this relationship, which they cover up with the ability to regurgitate (permutations of) a large corpus of examples – but this does not grant them the ability to operate outside the domain of those examples.

LLM-generated codebases very much do not lie within that domain: the clues and signs of underlying understanding that human readers (and, to an extent, LLMs) rely on aren't backed by any actual understanding. Worse, LLMs do replicate those signals, but the signals no longer encode anything coherent. Unless you are very used to critically analysing LLM output, this can be highly misleading. (It reminds me of how chess grandmasters blunder on, and struggle to even remember, unreachable board positions.)

Believing you know how LLM-generated code works, and why it works, is not the same as that actually being the case – in a very real sense that is different to that of code with human authors.


> "nobody knows what the code means, or has a mental model for how it's supposed to operate"

> Believing you know how LLM-generated code works, and why it works, is not the same as that actually being the case

This is a strawman argument which I'm not really interested in engaging with. You can assume competence. (In a scenario where one doesn't make these mistakes, what's left of your argument? Claiming that these mistakes cannot be avoided is a strong enough claim that it's reasonable to dismiss it unless supporting evidence is provided. In other words, the solution is as simple as not making these mistakes.) As I wrote up-thread, including the code is a choice and one should be mindful of it.


I am assuming competence. Competent people make these mistakes.

If "assume competence" means "assume that people do not make the mistakes they are observed to make", then why write tests? Wherefore bounds checking? Pilots are competent, so pre-flight checklists are a waste of time. Your doctor's competent: why seek a second opinion? Being mindful involves compensating for these things.

It's possible that you're just that good – that you can implement a solution "as simple as not making these mistakes" – in which case, I'd appreciate it if you could write up your method and share it with us mere mortals. But could it also be possible that you are making these mistakes, and simply haven't noticed yet? How would you know if your understanding of the program didn't match the actual program, if you've only tested the region in which the behaviours of both coincide?


Just like there are some easy "tells" with LLM-generated English, vibecode has a certain smell to it. Parallel variables that do the same thing are probably the most common one I've seen in the hundreds of thousands of lines of vibecode I've generated and then reviewed (and fixed) by now.

That's the philosophical Chinese room thought experiment, though. It's a computer. Some sand that we melted into a special shape. Can it "understand"? Leave that for the philosophers to decide. There's code that was generated via an LLM and not yacc, fine. Code is code, though. If you sit down and read all of the code to understand what each variable, function, and class does, it doesn't matter where the code came from; that is what we call understanding what the code does. Sure, most people are too lazy to actually do that, and again, vibecode has a certain smell to it, but to claim that because some artificial intelligence generated the code it is incomprehensible to humans seems unsupported. It's fair to point out that there may not be any humans who have bothered to, but that's a different claim.

If we simplify the question: if ChatGPT generates the code to produce the Fibonacci sequence, can we, as humans, understand that code? Can we understand it if a human writes those same seven lines? As we scale up to more complex code, at what point does it become incomprehensible to human-grade intelligence? If it's all vibecode that isn't being reviewed and is just being thrown into a repo, then sure, no human understands it. But it's just code. With enough bashing your head against it, even if there are three singleton factory classes doing almost the exact same thing in parallel, and they only share state on Wednesdays over an RPC mechanism that shouldn't even work in the first place but somehow does, code is still code. There's no arcane hidden whitespace that whispers to the compiler to behave differently because AI generated it. It may be weird and different, but have you tried Erlang? Huff enough of the right kind of glue and you can get anything to make sense.

Back to the Chinese room thought experiment, though: if I, as a human, am able to work on tickets and make intentional changes to the vibecoded program/system that result in the desired behavior, at what point does that become actual understanding rather than merely thinking I understand the code?
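
To make the Fibonacci question concrete, here's roughly what those few lines look like (my sketch, not any particular model's output); reading and understanding them is the same act whoever typed them:

    def fibonacci(n: int) -> list[int]:
        """Return the first n Fibonacci numbers."""
        sequence = []
        a, b = 0, 1
        for _ in range(n):
            sequence.append(a)
            a, b = b, a + b
        return sequence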

Say you start at BigCo and are given access to their million-line repo(s) with no docs and a ticket to work on. Ugh. You've only just started. But after you've been there for five years, it's obvious to you what the Pequad service does, and you might even know who gave it that name. If the claim is that LLMs generate code that's simply incomprehensible to humans, the two counterexamples I have for you are TheDailyWtf.com and Haskell.


> but to claim that some because some artificial intelligence generated the code makes it incomprehensible to humans seems unsupported

That's not my claim. My claim is that AI-generated code is misleading to people familiar with human-written code. If you've grown up on AI-generated code, I wouldn't expect you to have this problem, much like how chess newbies don't find impossible board states much harder to process than possible ones.



