A Theory of Mechanized Thought
A Blueprint for Building a Creating Subject

The creating subject (the English by van Dalin) it was called the 'creative subject' by Kreisel') was imagined (by Brouwer) to be a thinking entity that (over time) would come to learn things about its world. I have chosen this as a metaphor for our goal. Our thinker is not as smart as the original. It can forget and might have to retreat from 'mistakes'. But the reason that this idea is relavent is that it considers the dynamic aspect of reasoning and recognizes that the semantics of such a system is, of necesity, different than the 'classical' understanding of reasoning.

For us, reasoning is not intuitionistic. Below we discuss finite structures, called contexts, which can be used to reason and talk about both classical and intuitionistic theories. It answers the question of how intuitionists can understand enought classical reasoning to say they believe its 'wrong'. Any thinker worth their salt would have to be able to understand and discuss this difference.

Introduction

One of the origins of this work was a remark of John McCarthy (I don't know if has been published but it has certainly been articulated to me and others many times). To paraphase: "You can't expect a computer system to find a proof if you can't actually tell it the proof explicitly". This thinking led to the development of the FOL computer program in the early 1970s (ref). As a newly minted PhD in logic (from the mathematics department under the direction of Soloman Feferman at Stanford), I thought it would be an easy task to write a computer system that could 1) repeat the work of Russell and Whitehead in Principa Mathematica using modern notation and 2) add 'modern' formalisms for set theory and mathematics and have the system that would be able to accept and print out proofs of undergraduate and graduate math homework problems that were good enough to be handed in and graded. Was I mistaken (details in [MATH-HOMEWORK] !!!!!

The present work aims at reformulating the ideas of logic, recursion theory and model theory in a way that can be used as a blueprint for building an independent/autonomous artifact—a physical system—that has an understanding of its world; can come to know new things; and can influence the world around it. Furthermore, its understanding of the world, of nesessity, must be good enough so that it can survive over time (perhaps a very long time (maybe forever?)) and can be a sucesseful participant in society.

We start with the problem of building the mind/head of our artifact. We understand full well that our artifact will have a boundry and both sensors and effectors, but we beg (to start with) the question of how this 'head' is connected to such devices and instead consentrate on what John McCrthy called the 'epistemolical adaquacy' of our ideas.

Among the goals of This work is:

To describe how to build such an intelligent artifact

and

To give a view of sets, computation and theories that can act as a blueprint for this project.

We start this adventure by listing some properties our head/reasoner needs to have:

Seeds of Hope

What does it mean to provide a blueprint for building a Creating Subject? It is here that we driven to develop a shift in the way we look at the notion of theory and how it is xsused.

We start this adventure by listing some properties our thinker needs to have:

1) Heads and computer systems are finite structures. The consequences of this observation cannot be overestimated. It gets directly to the heart of the philosophical discussions of nominalism, finitism, intuitionism, ... and the pragmatic discussions of system building. If a reasoner is finite then any 'theory' whose model theory requires an infinite collection of things to be present simply cannot be used as a blueprint for building it. Every AI researcher laughs when I say this because they all know that infinite collections of things cannot be put into a computer (and in practice they don't even consider it), but the reality is that AI researchers frequently try to explain what they are doing by using the vocabulary of logic and by implication its infinitary notions, ignoring the fact that logical ideas like deductive closure, function, the integers, ... all explicitly involve infinite collections. We need to rethink this.

The systems AI researchers build and the descriptions they give of how these systems work generally don't match. The attempt to use the vocabulary and notions of logic is not satisfying and has frequently been a mistake. In fairness there is little else in the literature for them to use. This disconnect has resulted in many AI researchers disparaging logic and proposing to reformulate the workings of reasoning using generally far less thought out solutions. I believe The analyses logicians have made of reasoning is, in many ways, correct but their use of formalism has been too narrow and needs rethinking in light of what we know about real physical computers. One of the goals of this work to explain and eliminate these differences.

2) Mental Images Our thinker must be able to form 'mental images' or 'mental models' of things. These images must be realizable as finite data structures. Our proposal introduces a notion of 'type' and use them to represent mental models. Our thinker will use these models for thinking about (and reasoned about) things using other mental models. We call the mental models of the things used in reasoning contexts.

When a speaker points and says "Let me introduce you to Fred." ("Meet Fred") they are clearly making a statement about how the speaker proposes to use language later in the conversation. The 'meaning' of this interaction is that the speaker wants the listener to know that the word 'Fred' should be used to refer to whatever 'mental model' the listener has of the thing the speaker is pointing at. This is always the case when one person introduces something new to someone else. In building our 'creating subject', this kind of interaction has to be accomodated because the connection between words and the 'mental images' we conjure up when we hear them is a fundamental part of communication. One of the results of this work is to describe a collection of data structures (types) and show how they can serve as our thinker's mental models of things.

This seems like a good place to mention the note [WHY-LISP] which describes the idea of interning and its importance to communication (and natural language processing.)

3) Metatheory. The conversation described in the previous section is metamathematical in nature. The sentence quoted is about how we should use language. Logicians are just beginning to take seriously the `formalization of metatheory'. Kleene explicitally states that IM is an informal mathematical discussion of formalization. (p. 65). It is my belief that a large percentage (40%) of natural language is about how the words used in future dialog should be understood. This implies that understanding conversation uses metatheoretical ideas in a fundamental way. Any implicit assumption that the sentences of discourse mainly relay 'facts' is not well grounded.

In addition, more straitforwardly, we use metatheory (eg to describe new subsidiary deduction rules) all the time in our discussions of theories. For the more logically minded how exactly do we seamlessly switch between elementary and non-elementry theories of things. Certainly anything that can discuss mathematics must be able to do this, but what principle is being used here and in what place in out 'thinker' is it found. For AI researchers involved in natural language processing this issue appears all the time. (we discuss this in detail below)

While we are on the subject of metatheory, I think we also need to address the question of meta-metatheory, meta-meta-metatheory, ... and how we can put this 'infinite' tower of theories in a head. In the original FOL computer system we showed how this 'hierarchy' was finitized and gave examples of it use. (Prolegamina (the preprint) 'self reflection' Section 10 p 25.)

4) Reasoning. Reasonoing must be carried out using finite structures. We propose structures (mental images) of a type we call contexts and we develop logic in a way that is both consistent with ordinary first order semantics and describle as operations on finite structures. Satisfaction can be defined for contexts and we propose taking operations that are satisfaction preserving (but not nesceearily validity proserving) as fundemental. Ordinary validity preserving rules of inference the become special cases, but forgeting is an acceptable operation.

5) Contextualization. Our thinker must be able to isolate the context in which any reasoning takes place. The idea of context was used in 1974 in the computer program developed at the Stanford AI lab called FOL (ref) which formalized the notion of a rule of inference as a consistency, i.e., satisfaction, preserving map on contexts.

6) Contradictions. We believe that all reasoning takes place in some context. There is no global mutually accepted collection of facts. Locally the facts of a context are imagined to be consistent. We believe that if a context implies a contradiction, i.e., 'False' is a consequence of the facts of the context, then either we can look at the justifications of the assumbtions used in the derivation of 'False' and determine which to 'blamed' (in the usual Natural deduction style) or just declare the context is unusable for problem solving. In this way contradictions never impede our ability to do problem solving. The development of 'papa-consistent' logic is yet another instance of attempting to imagine that all 'facts' are in one big (happy?) theory (the swamp revisited).

The idea of context was later discussed by McCarthy (ref) where

7) Reasoning over Time. Human understanders (the only real examples of understanders we have) reason over time. They forget things and make new discoveries. Any adaquate description of reasoning must be able to account for this behavior. The currently fashional ideas of logic (e.g., using the idea that theories are deductively closed, for example) do not account for this.

8) Interruptability without Death. We are able to recover from wide variety of interuptions but neither computers nor functions are very good at it. In Peano arithmetic the addition and multiplication functions are not computed. They are simply lookup tables!!!! The effort to multiply two numbers does not depend on their size. From this point of view the arithmetic we learned in elementary school (as well as what even the most hardened mathematicians use in their every day life) cannot be Peano arithmetic or we could all multiply arbitrarily large integers in our heads. So what did we learn when we learned 'arithmetic'? Functions and prograns which claim to 'compute' sums are not prima facie interruptable. In case of functions interuption doesn't even seem relevant notion and in the usual computation theories of CS programs are imagined to compute 'functions' and so run to completion.

Our thinker has to be more grounded. It must be able to realize that computing the Ackermann function for the argument 300 will take longer than it has and even if it has started to 'compute' its value it has to have a mechanism to abort the attempt. The question I ask is by what machinery might our thinker notice that it wasn't going to finish in the 'available' time? It must be able to use what I called "the gossip principle" which means, that no matter what it's doing, a system needs to be able to interupt itself and ask questions about what it has been doing lately, i.e., gossip about itself, and change its behavior on the basis of what it notices. Of course, the hard part is that (by the gossip principle itself) this 'gossiping' must also be interuptable!!!! I believe a theory that would allow you to formalize a general version of this principle is not straight forward and implementing it is problematic. The proposal of this work includes a theory of computation systems that includes a form of interuption.

9) Information theoretic vs set theoretic semantics We propose to base our semantics on information content rather then membership. Instead of subset we use the notion of subtype. One thing is a subtype of another just in case it provides more information. This type of thinking occurs in mathematical algebra (not high school algebra). In our theory 3D space is a subtype of 2D space because it contains more details, i.e., information about its third dimension.

10) static vs before/after semantics Formalizing behaviors, actions and events.

11) perception

So, having made our list, the question is what to do?

Reasoning Questions

1) Distinguishing between fact and consequence

2) Mixing theories The usual proof of Wilson's theorem.

3) Elementary ve non-elementary theories Axions of group theory vs theorems about normal subgroups - what principle was used to 'switch' and in what reasong environment was this principle applied.

4) Classical Rreasoning We believe that basic reasoning is

5) Contextualized reasoning We intend to show that contexts can be used to formalize non-monotonic reasoning, default logic, modal logic, cross world indiviuation problens, ...

Representation Questions

1) representing one's own structure or meta theory fails.

1a) implementation issues

1a-1) representations ie colored quotes

2) what can I see

Philosophy issuse

1) classical reasoning carried out in a finite contextualized reasoning environment.

2) de re/de dicto - there is ni de re its all de dicto.

3) foundations vs providing a model The very idea of foundations hides the fact that most of the time this means reductionist foundations. That is, we strive to describe a small collection of notions (a foundation) and then proceed to 'build' everything from there. So, for example, the usual formulations of first order senantics omit the use of 'if the else' for two reasons: 1) it isn't a binary propositional function and 2) it requires the definition of satisfaction and valitity to involve a mutual recursive between the 'value' of a term and the 'truth' value of a formula. On the other hand every computer programming language has some form of conditional and no one bats an eye. Logicians frequently simplify the 'complexity' of something by 'reducing' it to previously understood notions.

This reduction actually affects our task in some bothersome ways. Logicians renounce the use of complex data structures and reduce thinking to PRA because 'any other reasonable data type can be encoded using Godel numbering' and so there is no 'need' for anything but numbers. This also is reflected in the use of set theory where it has become 'obvious' that (at lesst much) of mathematics including integers can be encoded into a set theory built entirerly out of the empty set (and some 'large' cardinals eg omega). Of course there are set theories with urelements but as a statistical matter their use is religated to special problems.

For us this way of looking at foundations is an interesting collection of observations but misses the point. Things for us (and our creating subject) had better include (at least mental images of) numbers, trees, people, ... So we have adopted the following:

1) we need mental images of things (we will build these out of types) that we assign an information theoretic semantics. A type deliniates a collection of objects call examples. In general these collections are 'infinite'. They are distinguished by the existence of a recognizer for each type - that is, for any object the epistemic question we can ask is 'is this object of type T' We build recognizers as programs in a specified computation system.

Note that this discription mentions both computations and criterioa (propositions) as part of the 'meaning' of a type. This is an example of our use of mutual recursion to define our basic notions,

A simpler version of this idea of type was first articulated by Luca Cardelli when he was thinking about designing programming languages. The recognizer semantics of types can be understood is a realization of the ideas in Skolem 1928 where we use a computation system to realize the semantics. The logic of FOL contexts is many sorted with a partial order on sorts that mimics the notion of subtype for types (which in tern are used to define the semantics of FOL contexts). The use of partialy ordered many sorted first order logic was developed in the original implementation of FOL contexts and was used for (among other things) formalizing GBN, which specifies 'element of' in a style characterized by CS as polymorphic. The canonical description of many sorted first order logic (Wang 19??) insists that different sorts are disjoint. (Which clearly requires 'encoding' to formalize GBN.) (see [doing Math homework]) ============================================================================ Artifacts exist in time and space and they change over time. Theories do not provide an answer to these questions!!!! Our goal is not to understand how correct reasonong can happen but rather to buikd a reasoner. The question then becomes in what sense a theory might constitute a blueprint for building something that thinks. What do we know now What we learn from computers is that we can actually build physical structures in machines (computers) What's different now that we have computers 'I see no wffs here' ==> 'I see no wffs here BUT I can make one' Actual Objects bool integer rational char string symbol pair list sexp :true :false :idb -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 make-rational(int,int) rational-get-deno,eter rational-get-numerater #\1 ... "" make-pair(a,b) pair-get-1st(p) pair-get-2nd(p) What We Mean by a Head What are Things - Types How do we 'do' Anything - Restartable Computations How do we Representate Theories - Contexts Heads Why do we do Anything Boundries Physics 2) New World a) TYPEsystem (replaces set theory) Set Theory - Tsys element set class - type example - object - example subset - subtype (computaable) epsilon - exampleOf (computable) comprehension - recognizer new notions: type-name type-def refinement scaler isin ex b) COMPsystem (replaces recursion theory) Recursion Theory - Csys domain - type function - program for restartable computation S/T predicate - stepper new notions: computation (as data) continuation c) FOLsystems (replaces theory and deduction) Logic - FOLsystem theory - context language - language signature sort - type-name domain - type model - simulation structure interpretation - IDK logic validity - satisfaction proof - fact rules of inference - satisfaction preserving maps new notions: IDK logic d) ReasoningSystem = < a , b , c > appapp 3) Plan a) 2.a-2.d) descriptive papers b) formal docs c) realization/implematation This is weak idea HERE we want to actuallY build a reasoning structure!!!