It’s called emergent behavior. We understand how an LLM works, but do not have even a theory about how the behavior emerges from among the math. We understand ants pretty well, but how exactly does anthill behavior come from ant behavior? It’s a tricky problem in systems engineering, where predicting emergent behavior (such as emergencies) would be lovely.
> but do not have even a theory about how the behavior emerges from among the math
Actually we have an awful lot of those.
I'm not sure emergent is quite the right term here. We carefully craft a scenario to produce a usable gradient for a black-box optimizer. We fully expect that nontrivial prediction of future state will, out of necessity, result in increasingly rich world models.
It gets back to the age-old observation that any sufficiently accurate model is as complex as the system it models. "Predict the next word" is but a single example of the general principle at play.
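To make "predict the next word" concrete, here's a minimal sketch of the cross-entropy loss that next-token training minimizes. The toy vocabulary, logits, and function name are all illustrative, not any particular framework's API:

```python
import math

def cross_entropy_next_token(logits, target_index):
    """Negative log-probability of the token that actually came next.

    logits: unnormalized scores over a toy vocabulary.
    target_index: index of the observed next token.
    """
    # Softmax: turn raw scores into a probability distribution.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    prob = exps[target_index] / sum(exps)
    return -math.log(prob)

# Toy example: 4-token vocabulary, model strongly favors token 2.
loss_good = cross_entropy_next_token([0.1, 0.2, 3.0, 0.1], 2)  # correct guess
loss_bad = cross_entropy_next_token([0.1, 0.2, 3.0, 0.1], 0)   # wrong guess
# A confident correct prediction gives a low loss; a wrong one, a high loss.
```

Minimizing this quantity over huge corpora is the entire externally specified objective; everything else is whatever internal structure helps the model do it.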
No, as I said, we have _lots_ of theories about exactly that at various levels of detail. The theories vary based on (at least) the specifics of the loss function being employed to construct the gradient. Giving an overview of that is far beyond the scope of this comment section (but it's well trodden ground so you can just go ask an LLM).
The "black box" bit refers to a generic, interchangeable optimization algorithm that simply makes the number go down (or up or whatever).
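A sketch of what "makes the number go down" means: a generic gradient-descent loop that knows nothing about the thing it optimizes beyond a gradient signal. The quadratic toy loss and all names here are illustrative:

```python
def gradient_descent(loss_grad, x0, lr=0.1, steps=100):
    """Generic optimizer: repeatedly step against the gradient.

    loss_grad: function returning the gradient of the loss at x.
    The optimizer treats the loss as a black box -- it needs only
    the gradient, not any understanding of what is being modeled.
    """
    x = x0
    for _ in range(steps):
        x = x - lr * loss_grad(x)
    return x

# Toy loss (x - 3)^2 has gradient 2*(x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

The same loop (modulo variants like momentum or Adam) is interchangeable across models; the interesting part is the scenario that produces the gradient, not the optimizer.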
There are certainly various details about the internal workings of models that we don't properly understand, but a blanket claim about the whole is erroneous.
The good news is that despite being incredibly complex, it’s still a lot simpler than ants because it is at least all statistical linguistics (as far as LLMs are concerned anyways).
> but do not have even a theory about how the behavior emerges
We fully do. There is a significant quality difference between English language output and other languages which lends a huge hint as to what is actually happening behind the scenes.
> but how exactly does anthill behavior come from ant behavior?
You can't smell what ants can. If you could, I'm sure it would be evident.
The dynamics of ant nest creation are way more complicated than that. It's the evolved biological parallel of a procedural generation algorithm. In addition, the completed structure has to be compatible with the various programmed behaviors of the workers.
OK, but then that goes back to their other assertion that it gives a huge hint at what is going on behind the scenes. Is that huge hint just "more data gives better results"? If so, it doesn't seem at all important, since that is the absolutely central idea of an LLM. That is not behind the scenes at all; that is the introduction to the play as written by the author.
Not your fault obviously, but they have not yet described what that huge hint is, and I'm on the edge of my seat with anticipation here.