
Look. I get that we can debate about what's truly novel. I never even actually claimed that humans regularly do things that are actually all that novel. That wasn't the point. The point is that LLMs struggle with novelty because they struggle to generalize. Humans clearly are able to generalize vastly better than transformer-based LLMs.

Really? How do I know that with such great certainty?

Well, I don't know how much text I've read in one lifetime, but I can tell you it's less than the literally multiple terabytes of text fed into the training process of modern LLMs.

Yet LLMs can still be made to fail logic puzzles and simple riddles that even children can figure out, just by slightly tweaking some of the puzzle's parameters, and it seems like the best thing we can do here is throw more terabytes of data and more reinforcement learning at the problem, only for the models to keep failing, if a little more sparingly each time.

So what novel things do average people do anyway, since beating animals with rocks apparently took 100,000 years to figure out? Hard call. There's no definitive bar for novelty. You could argue almost everything we do is basically just mixing things we've seen before, yet I'd argue humans are much better at that mixing than LLMs, which need a metric shitload of training data and burn tons of energy. In return, you get some superhuman abilities, but superhuman doesn't mean smarter or better than people; a sufficiently powerful calculator is superhuman. The breadth of an LLM is much wider than any individual human's, but the breadth of knowledge across humanity is obviously still much wider than any individual LLM's, and there remain things people do well that LLMs definitely still don't, even just in the realm of text.

So if I don't really believe humans are all that novel, why judge LLMs by that criterion? There are really two reasons:

- I think LLMs are significantly worse at it, so allowing your critical thinking abilities to atrophy in favor of using LLMs is really bad. Therefore people need to be very careful about ascribing too much to LLMs.

- Because I think many people want to use LLMs to do truly novel things. Don't get me wrong, a lot of people also just want it to shit out another React Tailwind frontend for a Node.js JSON HTTP CRUD app or something. But a lot of AI skeptics are no longer the type of people who downplay it as a cope or out of fear; they're people who were at least somewhat excited by the capabilities of AI and then let down when they tried to color outside the lines and it failed tremendously.

Likewise, imagine trying to figure out how novel an AI response is; the training data set is so massive that humans can hardly comprehend the scale. Our intuition about what couldn't possibly be in the training data is completely broken. We can only ever easily prove that a given response isn't novel, not that it is.

But honestly, maybe it's just too unconvincing to say all of this in the abstract. Maybe it would be better to at least try to come up with some demonstration of something I think I've come up with that is "novel".

There's this sort-of trick for handling input that I came up with when implementing falling-block puzzle games, and I think it's pretty unique. See, in most implementations, to handle things like auto-repeating movements, you might have a counter that increments, and once it hits the repeat delay, it gets reset again. You could get slightly more clever by having it count down and repeat at zero: this makes it easier to, for example, have the delay be longer for only the first repeat. That's how DAS normally works in Tetris and other games, and it more or less mirrors keyboard key-repeat delay. It's easier with the count-down because on the first input you set the counter to the high initial delay, and whenever it hits zero you set it to the shorter repeat delay.
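Roughly, that conventional count-down timer looks something like this (a minimal sketch; the function name, the constants, and the delay values are mine and just illustrative, not taken from any particular game):

    #define INITIAL_DELAY 16  /* frames before the first auto-repeat (illustrative) */
    #define REPEAT_DELAY   6  /* frames between subsequent repeats (illustrative) */

    int das_timer = 0;  /* extra per-direction state you have to carry around and reset */

    /* Returns nonzero on frames where the held direction should move. */
    int das_step(int key_down, int key_was_down) {
        if (key_down && !key_was_down) {     /* fresh press: move now, arm the long delay */
            das_timer = INITIAL_DELAY;
            return 1;
        }
        if (key_down && --das_timer <= 0) {  /* still held: repeat on the shorter delay */
            das_timer = REPEAT_DELAY;
            return 1;
        }
        return 0;
    }

Note that the caller still has to track key_was_down and reset the timer in the right places, which is exactly the kind of extra state I was trying to avoid.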

I didn't like this, though, because I didn't like having to deal with a bunch of state. I really wanted the state to be as simple as possible. So instead, for each game input, I allocate one signed integer, all initialized to zero. When a key is pressed down, its integer is set to 1 if it is less than 1. When a key is released, it is set to -1 if it is greater than 0. And at the end of each frame of game logic, every input greater than 0 is incremented and every input less than 0 is decremented. These counters live in the game state, and while the game logic is paused you simply don't update them.
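In code, the whole input state and its update rule come out to something like this (a minimal sketch; the enum names and the input[] array are just illustrative):

    /* One signed counter per game input, all starting at 0.
       n > 0: held, n = number of frames held (counting from 1).
       n < 0: released, -n = number of frames since release (counting from 1). */
    enum { INPUT_LEFT, INPUT_RIGHT, INPUT_SOFT_DROP, INPUT_ROTATE, NUM_INPUTS };
    int input[NUM_INPUTS] = {0};

    void on_key_down(int i) { if (input[i] < 1) input[i] = 1; }   /* repeat key-downs are no-ops */
    void on_key_up(int i)   { if (input[i] > 0) input[i] = -1; }

    /* Run at the end of each frame of game logic; simply not called while paused. */
    void tick_inputs(void) {
        for (int i = 0; i < NUM_INPUTS; i++) {
            if (input[i] > 0) input[i]++;
            else if (input[i] < 0) input[i]--;
        }
    }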

With this scheme, the following side effects occur:

- Like most other schemes, there's no need to special-case key repeat events, as receiving a second key down doesn't do anything.

- Game logic can now do a lot "statelessly", since the input state itself encodes a lot of useful information. For example, you can trigger an event on the frame an input is pressed by checking n == 1, and on the frame it is released by checking n == -1. You can do something every five frames while an input is held by checking n % 5 == 0, or something slightly more involved for a proper input repeat with an initial delay (see the sketch after this list). On any given frame of game logic, you always know how long an input has been held, and after it's released you know how many frames it has been since it was released.
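For concreteness, here's roughly what those "stateless" checks look like, building on the input[] counters above and reusing the illustrative INITIAL_DELAY/REPEAT_DELAY constants from the earlier sketch:

    int pressed(int i)  { return input[i] == 1; }    /* first frame of the press  */
    int released(int i) { return input[i] == -1; }   /* first frame after release */
    int held(int i)     { return input[i] > 0; }

    /* Auto-repeat with a longer initial delay, derived purely from the counter:
       fires on the press frame, then every REPEAT_DELAY frames once
       INITIAL_DELAY frames have passed. */
    int repeats(int i) {
        int n = input[i];
        if (n == 1) return 1;
        if (n <= INITIAL_DELAY) return 0;
        return (n - INITIAL_DELAY - 1) % REPEAT_DELAY == 0;
    }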

Now, I don't talk to tons of other game developers, but I've never seen or heard of anyone doing this, and if someone else did come up with it, then I discovered it independently. It was something I came up with while playing around with making deterministic, rewindable game logic. I played around with this a lot in high school (not that many years ago; about 15 now).

I fully admit this is not as useful for the human race as "hitting animals with a rock", but I reckon it's the type of thing that LLMs basically only come up with if they've already been exposed to the idea. If I try to instruct LLMs to implement a system that has what I think is a novel idea, it really seems to rapidly fall apart. If it doesn't fall apart, then I honestly begin to suspect that maybe the idea is less novel than I thought... but it's a whole hell of a lot more common, so far, for it to just completely fall apart.

Still, my point was never that AI is useless; a lot of things humans do aren't very novel after all. However, I also think it is definitely not the time to let one's critical thinking skills atrophy, as today's models have some very bad failure modes, and some of the ways they fail are ways we can't afford in many circumstances. Today the biggest challenge, IMO, is that despite all of that data, the ability to generalize really feels lacking. If that problem gets conquered, I'm sure more problems will rise to the top. Universally superhuman AI has a long way to go.



That seems fair and accurate.

I guess disagreement about this question often stems from what we mean by "human", even more than what we mean by "intelligence".

There are at least 3 distinct categories of human intelligence/capability in any given domain:

1) average human (non-expert) - LLMs are already better (mainly because the average human doesn't know anything about the domain, while LLMs at least have some basic knowledge),

2) domain expert humans - LLMs are far behind, but can sometimes supplement human experts with additional breadth,

3) collective intelligence of all humans combined - LLMs are like retarded cavemen in comparison.

So when asking whether AI has human-level intelligence, it really makes sense to first ask what "human-level" means.



