This argument by Penrose using Godel's theorem has been discussed (or, depending on who you ask, refuted) before in various places; it's very old. The first time I saw it was in Hofstadter's "Godel, Escher, Bach", but a more accessible version is this lecture[1] by Scott Aaronson. There's also an interview Aaronson did with Lex Fridman where he talks about it some more[2].
Basically, Penrose's argument hinges on Godel's theorem showing that a computer is unable to "see" that something is true without being able to prove it (something he claims humans are able to do).
To see how the argument makes no sense, one only has to note that even if you believe humans can "see" truth, it's undeniable that sometimes humans can also "see" things that are not true (i.e., sometimes people truly believe they're right when they're wrong).
In the end, if we strip away all talk about consciousness and the other stuff we "know" makes humans different from machines, and confine the discussion entirely to what Godel's theorem can say about this, humans are no different from machines, and we're left with very little of substance: both humans and computers can say things that are true but unprovable (humans can "see" unprovable truths, and LLMs can hallucinate), and both also sometimes say things that are wrong (humans are sometimes wrong, and LLMs hallucinate).
By the way "LLMs hallucinate" is a modern take on this: you just need a computer running a program that answers something that is not computable (to make interesting, think of a program that randomly responds "halts" or "doesn't halt" when asked whether some given Turing machine halts).
(ETA: if you don't find my argument convincing, just read Aaronson's notes, they're much better).
I think you're being overly dismissive of the argument. Admittedly my recollection is hazy but here goes:
Computers are symbol manipulating machines and moreover are restricted to a finite set of symbols (states) and a finite set of rules for their transformation (programs).
When we attempt to formalize even a relatively basic branch of human thinking, simple whole-number arithmetic, as a system of finite symbols and rules, then Goedel's theorem kicks in. Such a system can never be complete - i.e. there will always be holes or gaps where true statements about whole-number arithmetic cannot be reached using our symbols and rules, no matter how we design the system.
We can of course plug any holes we find by adding more rules but full coverage will always evade us.
The argument is that computers are subject to this same limitation. I.e. no matter how we attempt to formalize human thinking using a computer - i.e. as a system of symbols and rules, there will be truths that the computer can simply never reach.
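A standard way to make the hole-plugging point precise (this assumes the starting system F_0, say PA, is sound; G(F) denotes a Godel sentence for F):

    \[
    \begin{aligned}
    F_0 &\ \text{sound, effectively axiomatized} &&\Longrightarrow\ F_0 \nvdash G(F_0) \\
    F_{n+1} &= F_n + G(F_n) &&\Longrightarrow\ F_{n+1} \nvdash G(F_{n+1}) \\
    F_\omega &= \textstyle\bigcup_n F_n\ \text{(still effectively axiomatized)} &&\Longrightarrow\ F_\omega \nvdash G(F_\omega)
    \end{aligned}
    \]

Every round of plugging yields a new system with a new gap, and even the limit of all the rounds is itself an effectively axiomatized system with a gap of its own.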
> Computers are symbol manipulating machines and moreover are restricted to a finite set of symbols (states) and a finite set of rules for their transformation (programs).
> [...] there will be truths that the computer can simply never reach.
It's true that if you give a computer a list of consistent axioms and restrict it to only output what their logic rules can produce, then there will be truths it will never write -- that's what Godel's Incompleteness Theorem proves.
But those are not the only kinds of programs you can run on a computer. Computers can (and routinely do!) output falsehoods. And they can be inconsistent -- and so Godel's Theorem doesn't apply to them.
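A tiny concrete example of that distinction, using the MIU system from the GEB book mentioned upthread rather than real arithmetic (the string "MU" is famously not derivable in MIU, by an invariant on the number of I's, so a derivation-bound program never prints it, while an unrestricted program can):

    from collections import deque

    # The MIU system from "Godel, Escher, Bach": axiom "MI" plus four rewrite
    # rules.  A program restricted to derivation only ever prints strings
    # reachable this way; "MU" is not one of them, no matter how long it runs.
    def successors(s):
        out = set()
        if s.endswith("I"):                 # rule 1: xI  -> xIU
            out.add(s + "U")
        if s.startswith("M"):               # rule 2: Mx  -> Mxx
            out.add("M" + s[1:] * 2)
        for i in range(len(s) - 2):         # rule 3: III -> U
            if s[i:i + 3] == "III":
                out.add(s[:i] + "U" + s[i + 3:])
        for i in range(len(s) - 1):         # rule 4: UU  -> (deleted)
            if s[i:i + 2] == "UU":
                out.add(s[:i] + s[i + 2:])
        return out

    def derivable(limit=20000):
        """Breadth-first enumeration of MIU theorems (a derivation-bound program)."""
        seen, queue = {"MI"}, deque(["MI"])
        while queue and len(seen) < limit:
            for nxt in successors(queue.popleft()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    print("MU" in derivable())   # False, and it stays False however far you search
    print("MU")                  # ...whereas an unrestricted program can just print it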
Note that nobody is saying that it's definitely the case that computers and humans have the same capabilities -- it MIGHT STILL be the case that humans can "see" truths that computers will never be able to. But this argument involving Godel's theorem simply doesn't work to show that.
I don’t see the logic of your argument. The fact that you can formulate inconsistent theories - where all falsehoods will be true - does not invalidate Gödel’s theorem. How does the fact that I can take the laws of basic arithmetic and add the axiom “1 = 0” to my system mean that Gödel doesn’t apply to basic arithmetic?
Godel's theorem only applies to consistent systems. From Wikipedia[1]:
First Incompleteness Theorem: Any consistent formal system F within which a certain amount of elementary arithmetic can be carried out is incomplete; i.e. there are statements of the language of F which can neither be proved nor disproved in F.
If a system is inconsistent, the theorem simply doesn't have anything to say about it.
All this means is that an "inconsistent" program is free to output unprovable truths (and obviously also falsehoods). There's no great insight here, other than trivially refuting Penrose's claim that "there are truths that no computer can ever output".
You’re equating computer programs producing “wrong results” and the notion of inconsistency - a technical property of formal logic systems. This is not what inconsistency means. An inconsistent formalization of human knowledge in the form of a computer program is trivial and uninteresting - it just answers “yes that’s true” to every single question you ask it. Such formalizations are not interesting or even relevant to the discussion or argument.
I think much of the confusion arises from mixing up the object language (computer systems) and the meta language. Fairly natural since the central “trick” of the Gödel proof itself is to allow the expression of statements at the meta level to be expressed using the formal system itself.
> An inconsistent formalization of human knowledge in the form of a computer program is trivial and uninteresting - it just answers “yes that’s true” to every single question you ask it.
That's only true if you make the program answer by following the rules of some logic that contains the principle of explosion. Not all systems of logic are like that. A computer could use fuzzy logic. It could use a system we haven't thought of yet.
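To make that concrete, here is a minimal sketch of fuzzy logic with the usual min/max/1-x connectives, where a contradiction does not force every other statement to become true:

    # Fuzzy logic with the standard Zadeh connectives: truth values live in
    # [0, 1], so a statement and its negation can both be half-true without
    # the "principle of explosion" dragging everything else up to 1.0.
    def f_not(a): return 1.0 - a
    def f_and(a, b): return min(a, b)
    def f_or(a, b): return max(a, b)

    A = 0.5                       # a statement the system is unsure about
    B = 0.1                       # an unrelated statement
    print(f_and(A, f_not(A)))     # 0.5: a tolerated "contradiction"
    print(B)                      # 0.1: B is unaffected; nothing explodes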
You're imposing constraints on how a computer should operate, and at the same time allowing humans to "think" without similar constraints. If you do that, you don't need Godel's theorem to show that a human is more capable than a computer -- you just built computers that way.
I’m not imposing any constraints - the point is that inconsistent formulations are not interesting or relevant to the argument no matter what system of rules you look at. This has nothing to do with any particular formalism. I think the difficulty here is that words like completeness and inconsistency have very specific meanings in the context of formal logic - which do not match their use in everyday discussion.
I think we're talking past each other at this point. You seem to have brushed past without acknowledging my point about systems without the principle of explosion, and I'm afraid I must have missed one or more points you tried to make along the way, because what you're saying doesn't make much sense to me anymore.
This is probably a good point to close the discussion -- I'm thankful for the cordial talk, even if we ultimately couldn't reach common ground.
Yes! I think this medium isn’t helpful for understanding here but it’s always pleasant to disagree while remaining civil. It doesn’t help that I’m trying to reply on my phone (I’m traveling at moment) - in an environment which isn’t conducive to subtle understanding. All the best to you!
> We can of course plug any holes we find by adding more rules but full coverage will always evade us.
So if we assume that clever software can automate the process of plugging these holes, is it then like the human mind? Are there still holes that cannot be plugged, not due to lack of cleverness in the software but due to limitations of the hardware, sometimes called the substrate?
> The argument is that computers are subject to this same limitation. I.e. no matter how we attempt to formalize human thinking using a computer - i.e. as a system of symbols and rules, there will be truths that the computer can simply never reach.
If computers are limited by their substrate, it seems like humans might be limited by their substrate too, though the limits might be different.
Yes I think this is one way to attack the argument but you have to break the circularity somehow. Many of the dismissals of the Hofstadter/Penrose argument I’ve read here, I think, do not appreciate the actual argument.
Without Penrose giving solid evidence, people making counterarguments tend to get dismissive and then sloppy. Why put in the time to make well-tuned arguments filled with evidence when the other side doesn't bother, after all?
I've read Hofstadter's "I Am a Strange Loop", which goes over those ideas too. The point is how you define consciousness (he does it in a more or less computable way, as a sort of self-referential loop), so it may be within the reach of what we are doing with AIs.
But in any case, it is about definitions: we don't have very strict ones for consciousness, intelligence and so on, and it comes down to human perception and subjectivity (the Turing Test is not so much about "real" consciousness as about whether an observer can tell if they are talking with a computer or a human).
He misrepresents Penrose's argument. I remember Scott Aaronson met Penrose later on, and there was a clarification, though they still don't agree.
In any case, here's a response to the questions (some responses are links to other comments in this page).
> Why does the computer have to work within a fixed formal system F?
The hypothesis is that we are starting with some fixed program which is assumed to be able to simulate human reasoning (just like starting with the largest prime, on the assumption that there are finitely many primes, in order to show that there are infinitely many primes). Of course, one can augment it to make it more powerful, and this augmentation is in fact how we show that the original system is limited.
Note that even a self-improving AI is itself a fixed process. We apply the reasoning to this program as a whole, including its self-improvement capability.
This argument would fall apart if we could simulate a human mind, and there are good reasons to think we could. Human brains and their biological neurons are part of the physical world; they obey the laws of physics and operate in a predictable manner. We can replicate them in computer simulations, and we already have, although not on the scale of human minds (see [1]).
It's on Penrose and dualists to show why simulated neurons would act differently than their physical counterparts. Hand-waving about supposed quantum processes in the brain is not enough, as even quantum processes could be emulated. So far, everything seems to indicate that accurate models of biological neurons behave like we expect them to.
It stands to reason then, that if a human mind can be simulated, computers are capable of thought too.
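For what it's worth, here is a minimal sketch of one of the simplest standard neuron models (leaky integrate-and-fire), just to make "simulating neurons" concrete; the parameters are illustrative, not fitted to any real cell:

    # Leaky integrate-and-fire neuron, integrated with a simple Euler step.
    # The membrane potential leaks toward rest, is driven by an input current,
    # and emits a spike (then resets) whenever it crosses the threshold.
    def simulate_lif(input_current=1.5, steps=1000, dt=0.1,
                     tau=10.0, v_rest=0.0, v_threshold=1.0, v_reset=0.0):
        v, spike_times = v_rest, []
        for step in range(steps):
            dv = (-(v - v_rest) + input_current) / tau
            v += dv * dt
            if v >= v_threshold:
                spike_times.append(step * dt)
                v = v_reset
        return spike_times

    print(len(simulate_lif()), "spikes in 100 time units")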
Hand-waving is not required, as one does not go into microfoundations at all in the original argument. Someone making a thermodynamic argument for some conclusion can later try to speculate about how the conclusion is implemented with statistical mechanics, but the original argument is itself independent. Similarly, Penrose speculates about Orch-OR (which is not dualism, btw; Penrose is a physicalist), but the truth of the argument isn't dependent on Orch-OR itself.
The first question is not the question I'd like answered. What I want to know is this:
> Why does the computer have to work within a CONSISTENT formal system F?
Humans are allowed to make mistakes (i.e., be inconsistent). If we don't give the computer the same benefit, you don't need Godel's theorem to show that the human is more capable than the computer: it is so by construction.
Take a group of humans who each make observations and deductions, possibly faulty. Then, they do extensive checking of their conclusions by interacting with humans and computer proof assistants etc. Let us name this process as HC.
A program which can simulate individual humans should also be able to simulate HC - ie. generate proofs which are accepted by HC.
---
Penrose's conclusion in the book is weaker: that a knowably correct process cannot simulate humans.
We now have LLMs, which hallucinate and are not knowably correct. But, with reasoning-based methods, they can try to check their output and arrive at better conclusions, as is happening currently in popular models. This is fine, and is allowed by Penrose's argument. The argument is applied to the 'generate, check, correct' process as a whole.
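A hedged sketch of the kind of 'generate, check, correct' loop being described; both callables are hypothetical stand-ins, not any particular model's API. The point is just that whatever Godelian limit applies, it applies to the loop as a whole, not merely to the raw generator inside it:

    # The 'generate, check, correct' process packaged as a single program.
    def generate_check_correct(question, generate, check, max_rounds=5):
        answer = generate(question)
        for _ in range(max_rounds):
            ok, feedback = check(question, answer)
            if ok:
                return answer
            answer = generate(question + "\n(previous attempt rejected: " + feedback + ")")
        return answer

    # Trivial demo with toy stand-ins: the first guess is wrong, the check
    # rejects it, and the corrected second guess is accepted.
    print(generate_check_correct(
        "2 + 2 = ?",
        generate=lambda q: "4" if "rejected" in q else "5",
        check=lambda q, a: (a == "4", "wrong sum")))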
(I don't see how that relates to Godel's theorem, though. If that's the current position held by Penrose, I don't disagree with him. But seeing the post's video makes me believe Penrose still stands behind the original argument involving Godel's theorem, so I don't know what to say...)
That a knowably correct program can't simulate human reasoning is basically Godel's theorem. One can use a diagonalization argument, similar to Godel's proof, for programs which try to decide which Turing machines halt. Given a program P which is partially applicable but always correct, we can use diagonalization to construct P', a more widely applicable and correct program, i.e. P' can say that some Turing machine will not halt while P is undecided on it. So this doesn't involve any logic or formal systems, but it is more general: Godel's result is a special case, as the fact that a Turing machine halts can be encoded as a theorem, and provability in a formal system can be encoded as a Turing machine.
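A minimal sketch of that diagonalization, with Python source strings standing in for Turing machines; the analyser P here is a toy, and in a real construction G would obtain its own source via the recursion theorem rather than a global string:

    # P is a sound but partial halting analyser: it answers only when certain.
    def P(program_source):
        src = program_source.strip()
        if src == "pass":
            return "halts"
        if src == "while True: pass":
            return "doesn't halt"
        return None                       # undecided on everything else

    # The diagonal program G built from P (held as text, never executed):
    # it asks P about itself and then does the opposite of P's prediction,
    # looping forever if P stays silent.
    G_SOURCE = '''
    answer = P(G_SOURCE)
    if answer == "halts":
        while True: pass                  # defy a "halts" verdict by looping
    elif answer == "doesn't halt":
        pass                              # defy a "doesn't halt" verdict by halting
    else:
        while True: pass                  # P is silent, so G never halts
    '''

    # Soundness forces P to be undecided on G (either definite answer would be
    # wrong), and an undecided P means G loops forever.  So P can be extended:
    def P_prime(program_source):
        if program_source == G_SOURCE:
            return "doesn't halt"         # a truth P itself can never assert
        return P(program_source)

    print(P(G_SOURCE), "->", P_prime(G_SOURCE))   # None -> doesn't halt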
Penrose, indeed, believes both the stronger claim, that a program can't simulate humans, and the weaker claim, that a knowably correct program can't simulate humans.
The weaker claim being unassailable firstly shows that most of the usual objections are not valid, and secondly that it is hard to split the difference, i.e. to generate the output of HC using a program which is not knowably correct: a program whose binary is uninterpretable but which, by magic, only generates true theorems. Current AI systems, including LLMs, don't even come close.
Any theory which purports to show that Roger Penrose is able to "see" the truth of the consistency of mathematics has got to explain Edward Nelson being able to "see" just the opposite.
[1] https://www.scottaaronson.com/democritus/lec10.5.html
[2] https://youtu.be/nAMjv0NAESM?si=Hr5kwa7M4JuAdobI&t=2553