[[Published in a PSYCHE symposium on Roger Penrose's book Shadows of the Mind in 1995. Penrose's reply is here.]]
In his stimulating book SHADOWS OF THE MIND, Roger Penrose presents arguments, based on Gödel's theorem, for the conclusion that human thought is uncomputable. There are actually two separate arguments in Penrose's book. The second has been widely ignored, but seems to me to be much more interesting and novel than the first. I will address both forms of the argument in some detail. Toward the end, I will also comment on Penrose's proposals for a "new science of consciousness".
The best way to address Gödelian arguments against artificial intelligence is to ask: what would we expect, given the truth of Gödel's theorem, if our reasoning powers could be captured by some formal system F? One possibility is that F is essentially unsound, so that Gödel's theorem does not apply. But what if F is sound? Then we would expect that:
(a) F could not prove its Gödel sentence G(F);
(b) F could prove the conditional "If F is consistent, then G(F) is true";
(c) F could not prove that F is consistent.
If our reasoning powers are capturable by some sound formal system F, then, we should expect that we will be unable to see that F is consistent. This does not seem too surprising, on the face of it. After all, F is likely to be some extremely complex system, perhaps as complex as the human brain itself, and there is no reason to believe that we can determine the consistency of arbitrary formal systems when those systems are presented to us.
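In symbols, expectations (a)-(c) can be sketched as follows. The turnstile and the abbreviation Con(F) are notation assumed here for illustration, not taken from Penrose's text; the block needs the amsmath and amssymb packages.

```latex
% Assumed notation: F \vdash A means "F proves A"; Con(F) is an arithmetized
% statement of F's consistency; G(F) is the Gödel sentence of F.
\begin{align*}
\text{(a)}\quad & F \nvdash G(F) \\
\text{(b)}\quad & F \vdash \mathrm{Con}(F) \rightarrow G(F) \\
\text{(c)}\quad & F \nvdash \mathrm{Con}(F)
\end{align*}
% (c) follows from (a) and (b): if F \vdash Con(F), then modus ponens with (b)
% would give F \vdash G(F), contradicting (a).
```

Seen this way, (c) is not an extra hypothesis but a consequence of the first two items.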
There does not seem to be anything especially paradoxical about this situation. Many arguments from Gödel's theorem, such as that given by Lucas, founder at just this point: they offer us no reason to believe that we can see the truth of our own Gödel sentence, as we may be unable to see the consistency of the associated formal system. How does Penrose's argument fare?
Penrose is much more cautious in his phrasing. In Chapter 2, he argues carefully for the conclusion that our reasoning powers cannot be captured by a "knowably sound" formal system. This seems to be correct, and indeed mirrors the analysis above. If we are a sound formal system F, we will not be able to determine that F is sound. So far, this offers no threat to the prospects of artificial intelligence. The real burden of Penrose's argument is carried by Chapter 3, then, where he argues that the position that we are a formal system that is not "knowably sound" is untenable.
One position that an advocate of AI might take is to argue that our reasoning is fundamentally unsound, even in an idealization. I will not take this path, however. For a start, I have some sympathy with Penrose's idea that we have an underlying sound competence, even if our performance sometimes goes astray. But further, it seems to me that to hold that this is the only problem in Penrose's argument would be to concede too much power to the argument. It would follow, for example, that there are parts of our arithmetical competence that no sound formal system could ever duplicate; it would seem that our unsoundness would be essential to our capacity to see the truth of Gödel sentences, for example. This would be a remarkably strong conclusion, and does not seem at all plausible to me. So I think that the deepest problems with Penrose's argument must lie elsewhere.
I will concede to Penrose that we are fundamentally sound, then. As before, the natural position for an advocate of AI is that our powers are captured by some sound formal system F that cannot demonstrate that F is sound. What is Penrose's argument against this position? He has two sub-arguments here, depending on whether we can know that F is the formal system that captures our reasoning.
If we could know that F captures our reasoning, Penrose's argument would be very straightforward:
(1) We know that we are sound;
(2) We know that F captures our reasoning;
so (3) We know that F is sound.
One might question premise (1) - I will raise some problems with it later - but it does have a certain plausible quality. Certainly, it seems antecedently more plausible than the much stronger position that we know that F is sound. But all this is irrelevant, as premise (2) is so implausible. There is very little reason to believe that if our reasoning is captured by F, then we could know that fact.
It might seem plausible that we could know that F underlies our processing - why couldn't we just investigate our underlying brain processes? But to do this would be to change the game. It is of no help to Penrose if we can know using external resources (such as perceptual inputs) that F captures our reasoning. For to use external resources would be to go beyond the resources provided by F itself. And there would be no contradiction in the supposition that F could know, using external resources, that F is consistent, and therefore that G(F) is true. A contradiction would only arise if F could know this wholly under its own steam.
For this argument to be at all relevant, then, we would need to know that F captures our reasoning powers wholly using our internal resources - that is, the resources that F itself provides. But there is not the slightest reason to believe that we could do this. If we are a formal system, we certainly cannot determine which formal system we are on the basis of introspection! So again, the advocate of artificial intelligence is in no danger. She need simply hold the unsurprising position that we are a formal system F, but that we can't tell through introspection that we are F.
To make his case, Penrose needs to argue that if we are a sound formal system F, then we could determine that F is sound, independently of any knowledge that we are F. That is, he needs to make the case that if F is presented to us, we could determine that it is sound through an analysis of F alone. This is the burden that Penrose tries to meet in section 3.3. It is this section that effectively carries all the crucial weight; if it does not succeed, then this line of Penrose's argument simply fails.
How does Penrose argue that we could see that F is sound? He argues in 3.3 that we can see F as a system of axioms and inference rules. Clearly, we can see that each of the axioms is true: if F can see their truth, so can we. Further, Penrose argues, we must be able to see that each of the basic inference rules is valid, as it is extremely implausible that our reasoning could rely on inference rules that we regard as "fundamentally dubious". And if we know that the axioms are true and that the inference rules are valid, then we know that F is sound.
But why should we accept that F consists of a set of axioms and inference rules? F, after all, is supposed to potentially correspond to any sort of computational system - it might be a simulation of the whole of the human brain, for example. This will not look anything like a neat logical system: we will not be able to decompose it into an underlying set of "axioms" and "rules of procedure". Rather, it will be a big computational system that churns away on a given statement as input, and eventually outputs "yes" or "no".
It is true that for any Turing machine that accepts a certain class of statements, we can find a corresponding axiom-plus-rules system that accepts the same class (or at least the closure of that class under logical consequence). There is a lemma by Craig to this effect; without it applications of Gödel's theorem to draw conclusions about Turing machines would not even get off the ground. But the "axiom-plus-rules" system that we end up with may be extraordinarily complex. In particular, the "inference rules" may be just about as complex as the original system - perhaps equivalent to a complex connectionist procedure for generating further theorems. And as before, there is no reason why we should be able to see that this sort of "rule" should be valid, any more than we could see from an analysis that an overall computational brain process is sound. This is not to say that we think we are relying on "fundamentally dubious" procedures - it is just that the procedures that govern the dynamics of our brain are too complex for us to analyse them as sound or otherwise.
In this section, Penrose seems to assume that the relevant class of computational systems are all something akin to theorem-provers in first-order logic, but of course there is no reason to make such an assumption. For his argument to have its full generality, proving that our physical processes could not even be simulated computationally, it must apply to any sort of computational process. Even within the realm of existing AI research, there are many computational procedures, such as connectionist networks, which are not decomposable into axioms and rules of inference.
(I suspect that even an advocate of logic-based AI might have a response to make here. It might be held, for example, that we may occasionally use certain complex inference rules (when we generate Gödel sentences by transfinite counting, for example), whose validity is not obvious to us on analysis, without this in any way impugning the reliability of our reasoning. We might soundly "use" a procedure despite its resistance to our analysis. This indeed is just what we might expect around the "outer limits" of Gödelization, which after all is really where Penrose's argument gains its force. There is no difficulty in the idea that the reasoning methods we use in everyday mathematics can be seen to be sound - Penrose's arguments really apply at the level of our unusual "Gödelizing" procedures, which rely on our ability to count transfinite ordinals. But to be able to see that some Gödelizing rule is valid would be akin to making that last step in a Gödelization procedure, the one that is just complex enough to be beyond us. But I leave these difficult issues aside for now.)
It is section 3.3 that carries the burden of this strand of Penrose's argument, but unfortunately it seems to be one of the least convincing sections in the book. By his assumption that the relevant class of computational systems are all straightforward axiom-and-rules systems, Penrose is not taking AI seriously, and certainly is not doing enough to establish his conclusion that physics is uncomputable. I conclude that none of Penrose's arguments up to this point puts a dent in the natural AI position: that our reasoning powers may be captured by a sound formal system F, where we cannot determine that F is sound.
Hiding at the back of Chapter 3, however, Penrose has a new argument that escapes many of these problems. It is unfortunate that this argument was so deeply buried; most commentators seem to have missed it. Unlike the previous argument, this argument does not depend on the claim that if we are a sound formal system F, we would be able to see that F is sound. Because of this, it is a more novel and interesting argument, and more worthy of attention.
The argument is developed in a roundabout way (which may have led some readers astray), but is summarized in the fantasy dialogue with a robot mathematician in 3.23. The argument is given in a somewhat indirect form, involving complex procedures by which a given formal system might have evolved, but its basic structure is very simple. In a simplified and somewhat loose form, the argument goes as follows:
(1) Assume my reasoning powers are captured by some formal system F (to put this more briefly, "I am F"). Consider the class of statements I can know to be true, given this assumption.
(2) Given that I know that I am F, I know that F is sound (as I know that I am sound). Indeed, I know that the larger system F' is sound, where F' is F supplemented by the further assumption "I am F". (Supplementing a sound system with a true statement yields a sound system.)
(3) So I know that G(F') is true, where this is the Gödel sentence of the system F'.
(4) But F' could not see that G(F') is true (by Gödel's theorem).
(5) By assumption, however, I am now effectively equivalent to F'. After all, I am F supplemented by the knowledge that I am F.
(6) This is a contradiction, so the initial assumption must be false, and F must not have captured my powers of reasoning after all.
(7) The conclusion generalizes: my reasoning powers cannot be captured by any formal system.
Strictly speaking, the conclusion that must be drawn is that I cannot know that I am identical to a formal system F; in showing that I can see the truth of G(F'), we assumed not just that I am F but that I know I am F. But this is still a strong conclusion. For example, it would rule out even the possibility that we could empirically discover that we were identical to some system F - if we were to "discover" this, the reasoning would lead us to a contradiction. So even this would be threatening to the prospects of AI.
The power of this argument stems from the fact that it does not depend on one's ability to determine that a system F is sound, or to determine that we are F. Rather, it relies on the assumption that one is F to reach the relevant conclusions, thus contradicting the assumption. On the face of it one might have thought that making such an assumption would show only that the larger system F' could prove the Gödel sentence of the smaller system F, but the insight of the argument is that things can be bootstrapped into a situation where F' sees its own Gödel sentence, leading to trouble.
As far as I can determine, this argument is free of the obvious flaws that plague other Gödelian arguments, such as Lucas's argument and Penrose's earlier arguments. If it is flawed, the flaws lie deeper. It is true that the argument has a feeling of achieving its conclusion as if by magic. One is tempted to say: "why couldn't F itself engage in just the same reasoning?". But although there are various directions in which one might try to attack the argument, no knockdown refutation immediately presents itself. For this reason, the argument is quite challenging. Compared to previous versions, this argument is much more worthy of attention from supporters of AI.
On reflection, I have come to believe that the greatest vulnerability in this argument lies in the assumption that we know (unassailably) that we are consistent. This assumption seems relatively innocuous, compared to the previous strong claim that we could determine that F is consistent; on the face of it, it does not seem vastly stronger than the assumption that we are consistent. But I think that in fact, it is this assumption, and not the assumption that we know we are F, that carries the central responsibility for generating the contradiction. I have largely become convinced of this through discussions with Daryl McCullough, and the central argument below (an adaptation of a result of Löb's) was suggested by him.
The best way to see this is to show that the assumption that we know we are consistent already leads to a contradiction in its own right, even without the further assumption that we know we are F. Specifically, we can argue that any system that "unassailably" believes in its own consistency will in fact be led to a contradiction (under certain plausible further assumptions). This can be done as follows.
In these matters, we are concerned with a system's reasoning about its own beliefs, as well as about mathematics. So we can assume it has a symbol B, representing belief, where B(n) corresponds to the statement that it believes the statement with Gödel number n. (Below, I abbreviate by writing "B(A)" instead of "B(`A')", where `A' is the Gödel number of A.) And let us write |- A if the system has the power to "unassailably" assert A. (By using this notation I do not intend to beg the question about whether the system is computational!) Then the following assumptions are reasonable (suppressing universal quantifiers):
(1) If |- A, then |- B(A).
(2) |- B(A_1) & B(A_1 -> A_2) -> B(A_2)
(3) |- B(A) -> B(B(A))
(1) says that if the system has the power to assert A, it has the power to assert B(A). (2) says essentially that the system knows it has the power to reason by modus ponens. (3) says, in effect, that the system knows (1). All of these assumptions seem unproblematic. To these we add the key assumption:
(4) |- not B(false)
which says that the system asserts that it is not inconsistent. It turns out that these assumptions, along with the assumption that the system has the resources to do Peano arithmetic, lead to a contradiction.
To see this, we simply construct a sentence G such that
(5) |- G <-> not B(G).
This is a standard diagonal construction, and does not rely on any assumptions about the system's computability. We define the function diag in Peano arithmetic so that diag(`C(x)') is `C(`C(x)')' for any predicate C. (For clarity, I reintroduce the `' notation for Gödel numbering.) Then let G be the sentence not B(diag(`not B(diag(x))')). It is straightforward to show that G <-> not B(`G'). As long as the system has at least the capacities of Peano arithmetic, it can replicate this reasoning, so that |- G <-> not B(`G').
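The self-reference in this construction can be checked mechanically with plain strings. The following is a minimal sketch, assuming that quotation marks stand in for Gödel numbering and that the variable x occurs only as the free variable of the predicate; the helper names quote, TEMPLATE and G_SENTENCE are hypothetical, and diag here works on unquoted strings rather than on Gödel numbers.

```python
def quote(expr: str) -> str:
    """Stand-in for Gödel numbering: wrap expr in the text's `...' quotes."""
    return "`" + expr + "'"


def diag(pred: str) -> str:
    """Substitute the quotation of pred for its free variable x."""
    # Assumes the letter x appears in pred only as the free variable.
    return pred.replace("x", quote(pred))


# The predicate "not B(diag(x))" and the sentence G built from it.
TEMPLATE = "not B(diag(x))"
G_SENTENCE = "not B(diag(" + quote(TEMPLATE) + "))"

print(diag("C(x)"))                  # the text's example predicate C
print(diag(TEMPLATE) == G_SENTENCE)  # diag applied to the template yields G
```

Since the term diag(`not B(diag(x))') inside G denotes G itself, G amounts to the claim not B(`G'), which is what (5) asserts.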
G is effectively a sentence that says "I do not believe G", much like a standard Gödelian construction, but without any assumptions about computability. It is not hard to see how the contradiction arises. The system knows that if it believes G, it is unsound; so it knows that if it is sound, it does not believe G. But this is to say that it knows that if it is sound, G is true. By assumption, it knows that it is sound, so it knows that G is true. So now it must be unsound, as it has fallen into a contradiction. This reasoning is easily formalized:
(6) |- B(G) -> B(not B(G)) [from (5), (1), (2)]
(7) |- B(G) -> B(B(G)) [from (3)]
(8) |- B(G) -> B(false) [from (6), (7), (2)]
(9) |- B(false) -> B(G) [from (2), along with |- B(false -> G)]
(10) |- G <-> not B(false) [from (5), (8), (9)]
(11) |- B(G) [from (10), (4), (1)]
(12) |- B(false) [from (11), (8)]
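For readers who want the intermediate moves made explicit, the derivation can be typeset as below. This is a sketch in assumed notation (the symbol \bot for "false", and the amsmath package), with each justification expanded; the last step applies (8) to (11), and its result contradicts (4).

```latex
% Assumed notation: \vdash A means the system can unassailably assert A;
% B(A) abbreviates B applied to the Gödel number of A; \bot is "false".
\begin{align*}
&(5)  && \vdash G \leftrightarrow \neg B(G) && \text{diagonal construction} \\
&(6)  && \vdash B(G) \to B(\neg B(G)) && \text{(1) applied to } G \to \neg B(G) \text{ from (5), then (2)} \\
&(7)  && \vdash B(G) \to B(B(G)) && \text{instance of (3)} \\
&(8)  && \vdash B(G) \to B(\bot) && \text{(6), (7), and (2) applied to } B(G) \to (\neg B(G) \to \bot) \\
&(9)  && \vdash B(\bot) \to B(G) && \text{(1) applied to } \bot \to G \text{, then (2)} \\
&(10) && \vdash G \leftrightarrow \neg B(\bot) && \text{(5), (8), (9)} \\
&(11) && \vdash B(G) && \text{(10) and (4) give } \vdash G \text{, then (1)} \\
&(12) && \vdash B(\bot) && \text{(11), (8); contradicts (4)}
\end{align*}
```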
We can see, then, that the assumption that we know we are sound leads to a contradiction. One might try to pin the blame on one of the other assumptions, but all these seem quite straightforward. They include the sort of implicit assumptions that Penrose appeals to in his arguments all the time; indeed, one could make the case that all of premises (1)-(4) are implicitly appealed to in Penrose's main argument. For the purposes of the argument against Penrose, it does not really matter which we blame for the contradiction, but I think it is fairly clear that it is the assumption that the system knows that it is sound that causes most of the damage. It is this assumption, then, that should be withdrawn.
Penrose has therefore pointed to a false culprit. When the contradiction is reached, he pins the blame on the assumption that our reasoning powers are captured by a formal system F. But the argument above shows that this assumption is inessential in reaching the contradiction: a similar contradiction, via a similar sort of argument, can be reached even in the absence of that assumption. It follows that the responsibility for the contradiction lies elsewhere than in the assumption of computability. It is the assumption about knowledge of soundness that should be withdrawn.
Still, Penrose's argument has succeeded in clarifying some issues. In a sense, it shows where the deepest flaw in Gödelian arguments lies. One might have thought that the deepest flaw lay in the unjustified claim that one can see the soundness of certain formal systems that underlie our own reasoning. But in fact, if the above analysis is correct, the deepest flaw lies in the assumption that we know that we are sound. All Gödelian arguments appeal to this premise somewhere, but in fact the premise generates a contradiction. Perhaps we are sound, but we cannot know unassailably that we are sound.
A reader who is not convinced by Penrose's Gödelian arguments is left with little reason to accept his claims that physics is noncomputable and that quantum processes are essential to cognition, although these speculations are interesting in their own right. But even if one accepts that human behavior can be accounted for computationally, there is still the question of human consciousness, which after all is Penrose's ultimate target.
Penrose is clear that the puzzle of consciousness is one of his central motivations. Indeed, one reason for his skepticism about AI is that it is so hard to see how the mere enaction of a computation should give rise to an inner subjective life. Why couldn't all the computation go on in the dark, without consciousness? So Penrose postulates that we have to appeal to physics instead, and suggests that the locus of consciousness may be a quantum gravity process in microtubules. But this seems to suffer from exactly the same problem. Why should quantum processes in microtubules give rise to consciousness, any more than computational processes should? Neither suggestion seems appreciably better off than the other.
Although Penrose's quantum-gravity proposal might at least conceivably help explain certain elements of human behavior (if behavior turned out to be uncomputable, for example), it simply seems to be the wrong sort of thing to explain human consciousness. Indeed, Penrose nowhere claims that it does, and by the end of the book the "Missing Science of Consciousness" seems as far off as it ever was. As things stand, even by the end of Penrose's book, we seem to be left in Penrose's position D: these physical theories leave consciousness entirely unexplained.
This might seem odd, given that Penrose says he embraces position C, but in fact C and D are quite compatible. This is because Penrose's four positions run together a number of separate issues. For convenience, I repeat the positions here:
A: All thinking is computation; in particular, feelings of conscious awareness are evoked merely by the carrying out of appropriate computations.
B: Awareness is a feature of the brain's physical action; and whereas any physical action can be simulated computationally, computational simulation cannot by itself evoke awareness.
C: Appropriate physical action evokes awareness, but this physical action cannot even be properly simulated computationally.
D: Awareness cannot be explained by physical, computational, or any other scientific terms.
Note that A, B, and C all concern how awareness is evoked, but D concerns how awareness is explained. These are two very different issues. To see the contrast, note that almost everybody would accept that the brain evokes awareness - if we were to construct a duplicate brain, there would be conscious experience associated with it. But it is far from clear that a physical description of the brain can explain awareness - many people have argued that given any physical account of brain processes, the question of how those processes evoke conscious experience will be unanswered by the physical account.
To really clarify the positions in the vicinity, we have to distinguish three questions:
(1) What does it take to simulate our physical action?
(2) What does it take to evoke conscious awareness?
(3) What does it take to explain conscious awareness?
In answer to each question, one might say that (a) Computation alone is enough, (b) Physics is enough, but physical features beyond computation are required, or (c) Not even physics is enough. Call these positions C, P, and N. So we have a total of 27 positions, that one might label CCC, CPN, and so on.
Question (1) is the question Penrose is concerned with for most of the book, and the issue that separates B and C above. He argues for position P-- over C--. Descartes might have argued for N--, but few would embrace such a position these days.
Question (2) is the issue at the heart of Searle's Chinese room argument, and the issue that separates A from B and C above. Searle argues for -P- over -C-, and Penrose is clearly sympathetic with this position. Almost everyone would accept that a physical duplicate of me would "evoke" consciousness, so position -N- is not central here.
Question (3) is the central question about the explanation of consciousness (a question that much of my own work is concerned with). Penrose's positions A, B, and C are neutral on this question, but D is solely concerned with it; so in a sense, D is independent of the rest. Many advocates of AI might hold --C, some neurobiologists might hold --P, whereas my own position is --N.
The four positions Penrose describes come down to CC- (A), CP- (B), PP- (C), and --N (D). Penrose seems to think that in arguing for position C (PP-) he is arguing against position D (--N), but it is clear from this analysis that this is not so. In the end, nothing in Penrose's book bears on question (3), which is a pity, though it is certainly understandable. It would be very interesting to hear Penrose's position on just how physical theories might or might not explain human consciousness.
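The compatibility claim can be checked by brute force over the 27 three-letter positions. The sketch below assumes the encoding just given, with "-" read as a wildcard; the names PENROSE, matches and compatible are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

# Answers to questions (1)-(3): C = computation, P = physics, N = neither.
LEVELS = "CPN"
positions = ["".join(t) for t in product(LEVELS, repeat=3)]

# Penrose's four positions as patterns; "-" leaves a question open.
PENROSE = {"A": "CC-", "B": "CP-", "C": "PP-", "D": "--N"}


def matches(position: str, pattern: str) -> bool:
    """True if position agrees with pattern wherever pattern is not '-'."""
    return all(p == "-" or p == q for q, p in zip(position, pattern))


def compatible(first: str, second: str) -> list[str]:
    """Full positions consistent with both of the labelled patterns."""
    return [
        pos for pos in positions
        if matches(pos, PENROSE[first]) and matches(pos, PENROSE[second])
    ]


print(len(positions))        # 27
print(compatible("C", "D"))  # ['PPN']
print(compatible("A", "D"))  # ['CCN']
print(compatible("A", "B"))  # []
```

On this encoding, A, B, and C exclude one another, while D can be combined with any of them.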
Indeed, one might even combine positions A and D, as I do, embracing CCN. On this position, human-like behavior can be produced computationally, and indeed enacting the right computation will give rise to consciousness, but neither a computational account nor a physical account alone will explain consciousness. It might seem odd that computation should evoke but not explain consciousness, but this is no more odd than the corresponding position that neurophysiology might evoke but not explain consciousness. In either case, consciousness emerges from some underlying basis, but we need a further element in the theory to explain just how and why it emerges.
One can have a lot of fun cataloging positions (Dennett is CCC; Searle may be CPP; Eccles is NNN; Penrose is PPP; I am CCN; some philosophers and neuroscientists are CPN or PPN; note that all these are "non-decreasing" in C->P->N, as we might expect), but this is enough for now. The main point is that Penrose's treatment runs together question (3) with questions (1) and (2), so that in the end the question of how consciousness might be explained is left to one side.
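The "non-decreasing" remark is easy to verify for the catalogued positions. This is a small sketch, assuming the ordering C < P < N; the dictionary keys are informal labels chosen here, and non_decreasing is a hypothetical helper.

```python
from itertools import product

# Assumed ordering of the three answers: C < P < N.
ORDER = {"C": 0, "P": 1, "N": 2}


def non_decreasing(position: str) -> bool:
    """True if the answers never step back toward C from left to right."""
    ranks = [ORDER[letter] for letter in position]
    return ranks == sorted(ranks)


# Positions as catalogued in the text.
CATALOG = {
    "Dennett": "CCC",
    "Searle (perhaps)": "CPP",
    "Eccles": "NNN",
    "Penrose": "PPP",
    "the author": "CCN",
    "some philosophers and neuroscientists (i)": "CPN",
    "some philosophers and neuroscientists (ii)": "PPN",
}

print(all(non_decreasing(pos) for pos in CATALOG.values()))  # True

# Of the 27 possible positions, these are the non-decreasing ones.
monotone = [
    "".join(t) for t in product("CPN", repeat=3) if non_decreasing("".join(t))
]
print(len(monotone))  # 10
```

So the seven catalogued positions sit among the ten non-decreasing ones; the remaining three are CCP, CNN, and PNN.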
A true science of consciousness will have to address all of these questions, and especially question (3). Penrose has produced an enormously enjoyable and challenging book, but it seems to me that for all his hard work, the science of consciousness is still missing.