>I think the test is better than many other commenters are giving credit. The te...

godelski · on Oct 5, 2024

  > The conclusion drawn from it, not so much. If humans fail your test for x and you're certain humans have x then you're not really testing for x

I think you misunderstand, but it's a common misunderstanding.

Humans have the *ability* to reason. This is not equivalent to saying that humans reason at all times (this was also started in my previous comment)

So it's none of: "humans have x", "humans don't have x", nor "humans have x but f doesn't have x because humans perform y on x and f performs z on x".

It's correct to point out that not all humans can solve this puzzle. But that's an irrelevant fact because the premise is not that human always reason. If you'd like to make the counter argument that LLMs are like humans in that they have the ability to reason but don't always, then you got to provide strong evidence (just like you need to provide strong evidence that LLMs can reason). But this (both) is quite hard to prove because humans aren't entropy minimizers trained on petabytes of text. It's easier to test humans because we generally have a much better idea of what they've been trained on and we can also sample from different humans that have been trained on different types of data.

And here's a real kicker, when you've found a human that can solve a problem (meaning not just state the answer but show their work) nearly all of them can adapt easily to novel augmentations.

So I don't know why you're talking about trickery. The models are explicitly trained to solve problems like these. There's no slight of hand. There's no magic tokens, no silly or stage wording that would be easily misinterpreted. There's a big difference between a model getting an answer wrong and a promoter tricking the model.

famouswaffles · on Oct 5, 2024

>I think you misunderstand, but it's a common misunderstanding. Humans have the ability to reason. This is not equivalent to saying that humans reason at all times (this was also started in my previous comment)

>So it's none of: "humans have x", "humans don't have x", nor "humans have x but f doesn't have x because humans perform y on x and f performs z on x".

This is all rather irrelevant here. You can sit a human for some arbitrarily long time on this test and he/she will be unable to solve it even if the human has theory of mind (the property we're looking for) the entire duration of the test, ergo the test is not properly testing for the property of theory of mind.

>So I don't know why you're talking about trickery. The models are explicitly trained to solve problems like these.

Models are trained to predict text. Solving problems is just what is often the natural consequence of this objective.

It's trickery the same way it can be considered trickery when professors would do it to human testers. Humans and Machines that memorize things take shortcuts in prediction when they encounter what they've memorized "in the wild". That's the entire point of memorization really.

The human or model might fail not because it lacks the reasoning abilities to solve your problem, but because its attention is diverted by misleading cues or subtle twists in phrasing.

And if you care about the latter, fine!, that's not a bad thing to care about but then don't pretend you are only testing raw problem solving ability.

godelski · on Oct 5, 2024

  > You can sit a human for some arbitrarily long time on this test and he/she will be unable to solve it even if the human has theory of mind

Correct. I suggest you sit longer

empath75 · on Oct 5, 2024

This test does not require theory of mind or test for "theory of mind" because there are many people who have a well formed theory of mind who cannot solve this problem, and well formulated, it can be solved by a simple logic program, which again, would not have any kind of theory of mind. It'd produce a large number of false positives _and_ false negatives.

godelski · on Oct 5, 2024

  > it can be solved by a simple logic program

Which relies on understanding that Albert and Bernard have mental states and disjoint information.

  A theory of mind includes the knowledge that others' beliefs, desires, intentions, emotions, and thoughts may be different from one's own.
  - https://en.wikipedia.org/wiki/Theory_of_mind