Quick Note on Proof Theory

I read Aristotelian demonstration as more a making explicit than a proof of truths. The logical expressivism of the author of Making It Explicit (Robert Brandom) does something similar with modern logic. Nonetheless it would be very wrong to conclude that proof theory has no philosophical relevance.

To begin with, proof theory is itself not concerned with proving this or that truth. It is the study of proofs, the beginning of which is to recognize that proofs and proof calculi are themselves mathematical objects. Proofs are functions from premises to conclusions. This has profound consequences.

At an utterly simple level, one small piece of a far larger result is that the notion of an implication A => B is formally interchangeable with the notion of a mathematical function A => B. Frege very explicitly treats logical predication as a function as well.
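As a minimal illustration of this interchangeability (a sketch in Haskell, with names of my own invention), an implication under this reading simply is a function type, and modus ponens is ordinary function application:

-- Under the propositions-as-types reading, a term of type a -> b is at
-- once a function from a to b and a proof that a implies b.
-- Modus ponens: apply the implication to a proof of the antecedent.
modusPonens :: (a -> b) -> a -> b
modusPonens f x = f x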

Category theory builds all of mathematics on such morphisms, starting from a single basic operation of composition of arrows. Homotopy type theory suggests that we think of the arrows as paths through spaces. All this is an elaboration and abstraction of the utterly simple but crucial notion of “follows from”, or what Brandom calls subjunctive robustness.

Then an Aristotelian syllogism can be seen on the model of the composition of two predications or functions or morphisms or arrows or paths A => B and B => C around a common type or middle term B that is the output of one and the input of the other. This is not intended to capture a sophisticated result like a mathematical theorem, but rather to express sound reasoning in the simplest, most perspicuous, and most universal way possible.
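To make the picture concrete (again a hedged Haskell sketch, not anything Aristotle wrote), the middle term appears as the shared type b, and the syllogism is literally function composition:

-- Given a "predication" from a to b and another from b to c,
-- composition around the common middle type b yields one from a to c.
syllogism :: (a -> b) -> (b -> c) -> (a -> c)
syllogism f g = g . f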


Searching for a Middle Term

“But nothing, I think, prevents one from in a sense understanding and in a sense being ignorant of what one is learning” (Aristotle, Posterior Analytics; Complete Works revised Oxford edition vol. 1, p. 115). The kind of understanding spoken of here involves awareness “both that the explanation because of which the object is is its explanation, and that it is not possible for this to be otherwise” (ibid). To speak of the “explanation because of which” something is suggests that the concern is with states of affairs being some way, and the “not… otherwise” language further confirms this.

Following this is the famous criterion that demonstrative understanding depends on “things that are true and primitive and immediate and more familiar than and prior to and explanatory of the conclusion…. [T]here will be deduction even without these conditions, but there will not be demonstration, for it will not produce understanding” (ibid). The “more familiar than” part has sometimes been mistranslated as “better known than”, confusing what Aristotle carefully distinguishes as gnosis (personal acquaintance) and episteme (knowledge in a strong sense). I think this phrase is the key to the whole larger clause, giving it a pragmatic rather than foundationalist meaning. (Foundationalist claims only emerged later, with the Stoics and Descartes.) The pedagogical aim of demonstration is to use things that are more familiar to us — which for practical purposes we take to be true and primitive and immediate and prior and explanatory — to showcase reasons for things that are slightly less obvious.

Independent of these criteria for demonstration, the whole point of the syllogistic form is that the conclusion very “obviously” and necessarily follows, by a simple operation of composition on the premises (A => B and B => C, so A => C). Once we have accepted both premises of a syllogism, the conclusion is already implicit, and that in an especially clear way. We will not reach any novel or unexpected conclusions by syllogism. It is a kind of canonical minimal inferential step, intended not to be profound but to be as simple and clear as possible.

(Contemporary category theory grounds all of mathematics on the notion of composable abstract dependencies, expressing complex dependencies as compositions of simpler ones. Its power depends on the fact that under a few carefully specified conditions expressing the properties of good composition, the composition of higher-order functions with internal conditional logic — and other even more general constructions — works in exactly the same way as composition of simple predications like “A is B“.)
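For illustration (the example and its names are invented), the same Haskell composition operator that chains the simplest classifications also chains functions with internal conditional logic, with no change at all in the mechanics:

-- A function with internal conditional logic.
classify :: Int -> String
classify n = if n < 0 then "negative" else "non-negative"

describe :: String -> String
describe s = "The number is " ++ s

-- Composed exactly as a simple predication would be.
pipeline :: Int -> String
pipeline = describe . classify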

Since a syllogism is designed to be a minimal inferential step, there is never a question of “searching” for the right conclusion. Rather, Aristotle speaks of searching for a “middle term” before an appropriate pair of premises is identified for syllogistic use. A middle term like B in the example above is the key ingredient in a syllogism, appearing both in the syntactically dependent position in one premise, and in the syntactically depended-upon position in the other premise, thus allowing the two to be composed together. This is a very simple example of mediation. Existence of a middle term B is what makes composition of the premises possible, and is therefore what makes pairings of premises appropriate for syllogistic use.

In many contexts, searching for a middle term can be understood as inventing an appropriate intermediate abstraction from available materials. If an existing abstraction is too broad to fit the case, we can add specifications until it does, and then optionally give the result a new name. All Aristotelian terms are essentially implied specifications; the names are just for convenience. Aristotle sometimes uses pure specifications as “nameless terms”.
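As a small invented example of this refinement (the domain and names are mine, purely for illustration): if an abstraction like Animal is too broad for the inference at hand, we can add a specification, use it namelessly, and then optionally name the result:

-- Too broad for the case at hand.
data Animal = Animal { animalName :: String, legCount :: Int }

-- A pure specification, usable as a "nameless term".
twoLegged :: Animal -> Bool
twoLegged a = legCount a == 2

-- Optionally giving the specification a name, for convenience.
newtype Biped = Biped Animal

mkBiped :: Animal -> Maybe Biped
mkBiped a = if twoLegged a then Just (Biped a) else Nothing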

Named abstractions function as shorthand for the potential inferences that they embody, enabling simple common-sense reasoning in ordinary language. We can become more clear about our thinking by using dialectic to unpack the implications of the abstractions embodied in our use of words. (See also Free Play; Practical Judgment.)

Syllogism

Aristotle invented logic as a discipline, and in Prior Analytics developed a detailed theory of so-called syllogisms to codify deductive reasoning, which also marks the beginning of formalization in logic. Although there actually were interesting developments in the European middle ages with the theory of so-called supposition as a kind of semi-formal semantics, Kant famously said Aristotle had said all there was to say about logic, and this went undisputed until the time of Boole and De Morgan in the mid-19th century. Boole himself said he was only extending Aristotle’s theory.

The fundamental principle of syllogistic reasoning is best understood as a kind of function composition. Aristotle himself did not have the concept of a mathematical function, which we owe mainly to Leibniz, but he clearly used a concept of composition of things we can recognize as function-like. In the late 19th century, Frege pointed out that the logical meaning of grammatical predication in ordinary language can be considered as a kind of function application.

Aristotle’s syllogisms were expressed in natural language, but in order to focus attention on their form, he often substituted letters for concrete terms. The fundamental pattern is

(quantifier A) op B
(quantifier B) op C
Therefore, A op C

where each instance of “quantifier” is either “some” or “all”; each instance of “op” is either what Aristotle called “combination” or “separation”, conventionally represented in natural language by “is” or “is not”; and each letter is a type aka “universal” aka higher-order term. (In the middle ages and later, individuals were treated as a kind of singleton type with implicit universal quantification, so it is common to see examples like “Socrates is a human”, but Aristotle’s own concrete examples never included references to individuals.) Not all combinations of substitutions correspond to valid inferences, but Prior Analytics systematically described all the valid ones.
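As one hedged illustration, the valid mood traditionally called Barbara (all A are B; all B are C; therefore all A are C) can be rendered in Lean, where the proof term is visibly just composition around the middle term B:

-- Barbara as composition: the proof of "every A is C" chains the two
-- premises through the middle term B.
theorem barbara {α : Type} (A B C : α → Prop)
    (h₁ : ∀ x, A x → B x) (h₂ : ∀ x, B x → C x) :
    ∀ x, A x → C x :=
  fun x hA => h₂ x (h₁ x hA)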

In traditional interpretations, Aristotle’s use of conventionalized natural language representations sometimes led to analyses of the “op” emphasizing grammatical relations between subjects and predicates. However, Aristotle did not concern himself with grammar, but with the more substantive meaning of (possibly negated) “said of” relations, which actually codify normative material inferences. His logic is thus a fascinating hybrid, in which each canonical proposition represents a normative judgment of a material-inferential relation between types, and then the representations are formally composed together.

The conclusion B of the first material inference, which is also the premise of the second, was traditionally called the “middle term”; its role in reasoning, through its licensing of composition, lies behind all of Hegel’s talk about mediation. The 20th century saw the development of category theory, which explains all mathematical reasoning and formal logic in terms of the composition of “morphisms” or “arrows” corresponding to primitive function- or inference-like things. Aside from many applications in computer science and physics, category theory has also been used to analyze grammar. The historical relation of Aristotle to the Greek grammarians goes in the same direction — Aristotle influenced the grammarians, not the other way around. (See also Searching for a Middle Term; Aristotelian Demonstration; Demonstrative “Science”?)

Propositions, Terms

Brandom puts significant emphasis on Kant and Frege’s focus on whole judgments — contrasted with simple first-order terms, corresponding to natural-language words or subsentential phrases — as the appropriate units of logical analysis. The important part of this is that a judgment is the minimal unit that can be given inferential meaning.

All this looks quite different from a higher-order perspective. Mid-20th century logical orthodoxy was severely biased toward first-order logic, due to foundationalist worries about completeness. In a first-order context, logical terms are expected to correspond to subsentential elements that cannot be given inferential meaning by themselves. But in a higher-order context, this is not the case. One of the most important ideas in contemporary computer science is the correspondence between propositions and types. Generalized terms are interpretable as types, and thus also as propositions. This means that (higher-order) terms can represent instances of arbitrarily complex propositions. Higher-order terms can thus be given inferential meaning, just like sentential variables. This is all in a formal context rather than a natural-language one, but so was Frege’s work; and for what it’s worth, some linguists have also been using typed lambda calculus in the analysis of natural language semantics.
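A minimal Haskell sketch of the correspondence (illustrative, with my own names): each type stands for a proposition, and any term inhabiting the type counts as a proof of it:

-- A proof that A and B together imply B and A: conjunction as pairing.
andComm :: (a, b) -> (b, a)
andComm (x, y) = (y, x)

-- A proof that (A implies B) and (B implies C) together imply
-- (A implies C): syllogistic composition, read as a proposition.
chain :: (a -> b, b -> c) -> (a -> c)
chain (f, g) = g . f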

Suitably typed terms compose, just like functions or category-theoretic morphisms and functors. I understand the syllogistic principle on which Aristotle based a kind of simultaneously formal and material term inference (see Aristotelian Propositions) to be just a form of composition of things that can be thought of as functions or typed terms. Proof theory, category theory, and many other technical developments explicitly work with composition as a basic form of abstract inference. Aristotle developed the original compositional logic, and it was not Aristotle but mid-20th century logical orthodoxy that insisted on the centrality of the first-order case. Higher-order, compositionally oriented logics can interpret classic syllogistic inference, first-order logic, and much else, while supporting more inferentially oriented semantics on the formal side, with types potentially taking pieces of developed material-inferential content into the formal context. We can also use natural-language words to refer to higher-order terms and their inferential significance, just as we can capture a whole complex argument in an appropriately framed definition. Accordingly, there should be no stigma associated with reasoning about terms, or even just about words.

In computer-assisted theorem-proving, there is an important distinction between results that can be proved directly by something like algebraic substitution for individual variables, and those that require a more global rewriting of the context in terms of some previously proven equivalence(s). At a high enough level of simultaneous abstraction and detail, such rewriting could perhaps constructively model the revision of commitments and concepts from one well-defined context to another.
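A minimal Lean sketch of the distinction, using deliberately trivial examples of my own choosing: the first goal closes by direct term construction, while the second must first be rewritten using a known equation before the hypothesis applies:

-- Direct construction: the proof term is just function application.
example (A B : Prop) (f : A → B) (a : A) : B := f a

-- Rewriting: the goal a + 0 = b is first rewritten with the equation
-- Nat.add_zero, after which the hypothesis h closes it.
example (a b : Nat) (h : a = b) : a + 0 = b := by
  rw [Nat.add_zero]
  exact h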

The potential issue would be that global rewriting still works in a higher-order context that is expected to itself be statically consistent, whereas revision of commitments and concepts, taken simply, implies a change of higher-level context. I think this just means a careful distinction of levels would be needed. After all, any new, revised genealogical recollection of our best thoughts will in principle be representable as a new static higher-order structure, and that structure will include something that can be read as an explanation of the transition. It may itself be subject to future revision, but in the static context that does not matter.

The limitation of such an approach is that it requires all the details of the transition to be set up statically, which can be a lot of work, and it would also be far more brittle than Brandom’s informal material inference. (See also Categorical “Evil”; Definition.)

I am fascinated by the fact that typed terms can begin to capture material as well as purely formal significance. How complete or adequate this is would depend on the implementation.