> The idea is unsettling because it reframes human agency

Not really, it's called discovery, aka science.

This weird framing is just perpetuating the idea of LLMs being some kind of magic pixie dust. Stop it.





Like magic pixie dust, nobody knows in detail how AI models work. They are not explicitly created the way GOFAI or any ordinary software is. The training algorithms are written explicitly by humans, but the model itself is "written" by the training algorithm, in the form of billions of neural network weights.

I think we do know how they work, no? We give a model some input, it travels through the big neural net of weights (obtained through training), and a result comes out.
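
Roughly, in code it's nothing more mysterious than this (a minimal sketch in numpy, with made-up layer sizes and random placeholder weights standing in for trained ones):

    import numpy as np

    # Toy 2-layer "neural net". In a real model these weights come from
    # training; here they are just random placeholders.
    rng = np.random.default_rng(0)
    W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)
    W2, b2 = rng.standard_normal((3, 8)), np.zeros(3)

    def forward(x):
        h = np.maximum(0.0, W1 @ x + b1)              # hidden layer + ReLU
        logits = W2 @ h + b2
        return np.exp(logits) / np.exp(logits).sum()  # softmax -> "probabilities"

    print(forward(np.array([1.0, 0.0, -0.5, 2.0])))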

Sure, you don't know upfront what the exact constellation of weights in a trained model will be. But similarly, you don't know what, e.g., the average age of some group of people is until you compute it.


If it solves a problem, we generally don't know how it did it. We can't just look at its billions of weights and read off what they do; they are incomprehensible to us. This is very different from GOFAI, which is just a piece of software whose code can be read and understood.

Any statistical model does this.

Statistical models have just a few parameters; machine learning models have billions, possibly more than a trillion.

The parameter count can be anything; is there a number at which "we don't know" starts?

The model's parameters are in your RAM; you feed in the prompt, it runs through the model, and you get a result. I'm sure that if you spent a bit of time, you could add some software scaffolding around the process to show you each step of the way. How is this different from a statistical model, where you "do know"?
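
That scaffolding really is only a few lines, e.g. with PyTorch forward hooks (a sketch using a toy stand-in network; a real LLM would just be a much bigger nn.Module loaded from a checkpoint):

    import torch
    import torch.nn as nn

    # Toy stand-in model; a real LLM is also just an nn.Module, only larger.
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

    # "Scaffolding": print every layer's output as the input flows through.
    def make_hook(name):
        def hook(module, inputs, output):
            print(f"{name}: shape {tuple(output.shape)}, mean {output.mean().item():.4f}")
        return hook

    for name, layer in model.named_modules():
        if name:  # skip the top-level container itself
            layer.register_forward_hook(make_hook(name))

    model(torch.randn(1, 16))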


With just a few parameters you can understand the model, because you can hold it in your mind. For machine learning models that's not possible; they are far more complex.
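
Concretely, a two-parameter model can be read off directly (a toy sketch with made-up data):

    import numpy as np

    # Made-up data: y is roughly 2*x + 1 plus a little noise.
    x = np.arange(10, dtype=float)
    y = 2 * x + 1 + np.random.default_rng(0).normal(0, 0.1, size=10)

    slope, intercept = np.polyfit(x, y, 1)
    # Two numbers, each with an obvious meaning you can hold in your head.
    print(f"y ≈ {slope:.2f} * x + {intercept:.2f}")
    # Good luck doing the same with billions of weights.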

So we "don't understand how" a 150-parameter model works?

May I point out that we don't know in detail how most code runs? I'm not talking about assembly; I'm talking about edge cases, instabilities, etc. We know the happy path and a bit around it. All complex systems based on code are unpredictable from the static code alone.

At least we know fairly well how it runs if we look at the code. But we know almost nothing about how a specific AI model works. Looking at the weights is pointless. It's like looking into Beethoven's brain to figure out how he came up with the Moonlight Sonata.

This applies to pretty much every technology:

When we built nuclear power plants, we had no idea what really mattered for safety or maintenance, or even what day-to-day operations would be like, and we discovered a lot of things as we ran them (which is why we have been able to keep extending their lifetimes well beyond what they were designed for).

Same for airplanes: there's tons of empirical knowledge about them, and people are still trying to build better models for why the things that work do work the way they do (a former roommate of mine did a PhD on modeling combustion in jet engines, and she told me how many of the details were still unknown, despite the technology being widely used for the past 70 years).

By the way, this is the fundamental reason why waterfall often fails: we generally don't understand something well enough before we have built it and used it extensively.


GOFAI software ≈ airplane

ML model ≈ bird



