Syllogism

Aristotle invented logic as a discipline, and in Prior Analytics developed a detailed theory of so-called syllogisms to codify deductive reasoning, which also marks the beginning of formalization in logic. Although there were in fact interesting developments in the European Middle Ages with the theory of so-called supposition as a kind of semi-formal semantics, Kant famously said that Aristotle had said all there was to say about logic, and that view went largely undisputed until the time of Boole and De Morgan in the mid-19th century. Boole himself said he was only extending Aristotle’s theory.

The fundamental principle of syllogistic reasoning is best understood as a kind of function composition. Aristotle himself did not have the concept of a mathematical function, which we owe mainly to Leibniz, but he clearly used a concept of composition of things we can recognize as function-like. In the late 19th century, Frege pointed out that the logical meaning of grammatical predication in ordinary language can be considered as a kind of function application.

Aristotle’s syllogisms were expressed in natural language, but in order to focus attention on their form, he often substituted letters for concrete terms. The fundamental pattern is

(quantifier A) op B
(quantifier B) op C
Therefore, (quantifier A) op C

where each instance of “quantifier” is either “some” or “all”; each instance of “op” is either what Aristotle called “combination” or “separation”, conventionally represented in natural language by “is” or “is not”; and each letter is a type, aka “universal”, aka higher-order term. (In the Middle Ages and later, individuals were treated as a kind of singleton type with implicit universal quantification, so it is common to see examples like “Socrates is a human”, but Aristotle’s own concrete examples never included references to individuals.) Not all combinations of substitutions correspond to valid inferences, but Prior Analytics systematically described all the valid ones.
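
As a minimal sketch of the composition reading (my own toy illustration in Haskell, with made-up type names, not anything found in Aristotle), each “all A is B” premise can be modeled as a function taking evidence of A-hood to evidence of B-hood, and the syllogism in the mood traditionally called Barbara is then literally function composition:

-- Toy types standing in for universals (hypothetical names, for illustration only).
data Greek = Greek
data Human = Human
data Mortal = Mortal

-- "All Greeks are human", read as a function from evidence of the first to evidence of the second.
greeksAreHuman :: Greek -> Human
greeksAreHuman _ = Human

-- "All humans are mortal", read the same way.
humansAreMortal :: Human -> Mortal
humansAreMortal _ = Mortal

-- The syllogism is then just composition; the middle term Human is consumed in the process.
greeksAreMortal :: Greek -> Mortal
greeksAreMortal = humansAreMortal . greeksAreHuman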

In traditional interpretations, Aristotle’s use of conventionalized natural language representations sometimes led to analyses of the “op” emphasizing grammatical relations between subjects and predicates. However, Aristotle did not concern himself with grammar, but with the more substantive meaning of (possibly negated) “said of” relations, which actually codify normative material inferences. His logic is thus a fascinating hybrid, in which each canonical proposition represents a normative judgment of a material-inferential relation between types, and then the representations are formally composed together.

The conclusion B of the first material inference, which is also the premise of the second, was traditionally called the “middle term”; its role in reasoning, which is precisely to license the composition, lies behind all of Hegel’s talk about mediation. The 20th century saw the development of category theory, which explains all mathematical reasoning and formal logic in terms of the composition of “morphisms” or “arrows” corresponding to primitive function- or inference-like things. Aside from many applications in computer science and physics, category theory has also been used to analyze grammar. The historical relation of Aristotle to the Greek grammarians goes in the same direction: Aristotle influenced the grammarians, not the other way around. (See also Searching for a Middle Term; Aristotelian Demonstration; Demonstrative “Science”?)

New Approaches to Modality

I periodically peek at the groundbreaking work on formal systems that is going on in homotopy type theory (HoTT), and in doing so just stumbled on an intriguing treatment of modal HoTT that seems much more philosophically promising to me than standard 20th century modal logic.

Types can be taken as formalizing major aspects of the Aristotelian notions of substance and form. Type theory, developed by Swedish philosopher Per Martin-Löf building on earlier 20th century work by the British philosopher Bertrand Russell and the American mathematician Alonzo Church, is the most important thing in the theory of programming languages these days. It is both a higher-order constructive logic and an abstract functional programming language, and was originally developed as a foundation for constructive mathematics. Several variants of type theory have also been used in linguistics to analyze meaning in natural language.

Homotopy type theory combines this with category theory and the categorical logic pioneered by American mathematician William Lawvere, who was also the first to suggest a category-theoretic interpretation of Hegelian logic. HoTT interprets types as spaces in the sense of homotopy theory, identifications of their elements as paths, identifications of paths as higher-order paths between paths, and so on, in a hierarchy of levels that also subsumes classical logic and set theory. It is a leading alternative “foundation” or framework for mathematics, in the less epistemologically “foundationalist” spirit of previous proposals for categorical foundations. It is also a useful tool for higher mathematics and physics that includes an ultra-expressive logic, and has a fully computational interpretation.

There is a pretty readable new book on modal HoTT by British philosopher David Corfield, which also gives a nice introductory prose account of HoTT and of type theory in general. (I confess I prefer pages of mostly prose — of which Corfield has a lot — to forests of symbolic notation.) Corfield offers modal HoTT as a better logic for philosophy and natural language analysis than standard 20th century first-order classical logic, because its greater expressiveness allows for much richer distinctions. He mentions Brandom several times, and says he thinks type theory can formally capture many of Brandom’s concerns, as I previously suggested. Based on an admittedly elementary acquaintance with standard modal logic, I’ve had a degree of worry about Brandom’s use of modal constructs, and this may also help with that.

The worry has to do with a concept of necessity that occasionally sounds overly strong to my ear, and is related to my issues with necessity in Kant. I don’t like any universal quantification on untyped variables, let alone applied to all possible worlds, which is the signature move of standard modal logic. But it seems that adding types into the picture changes everything.

Before Corfield brought it to my attention, I was only dimly aware of the existence of modal type theory (nicely summarized in nLab). This apparently associates modality with the monads (little related to Leibnizian ones) that I use to encapsulate so-called effects in functional programming for my day job. Apparently William Lawvere already wrote about geometric modalities, in which the modal operator means something like “it is locally the case that”. This turns modality into a way of formalizing talk about context, which seems far more interesting than super-strong generalization. (See also Modality and Variation; Deontic Modality; Redding on Morals and Modality).
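
For concreteness, here is a minimal Haskell sketch of what I mean by using a monad to encapsulate an effect, with the Reader monad read as “relative to a context”; the gloss, the Context type, and the example names are my own illustration (using the mtl library), not a claim about Lawvere’s geometric modalities:

import Control.Monad.Reader (Reader, asks, runReader)

-- A hypothetical context: the assumptions in force at a particular site or situation.
newtype Context = Context { interestRate :: Double }

-- A "locally the case" value: it only has a determinate answer relative to a context.
monthlyCharge :: Reader Context Double
monthlyCharge = do
  rate <- asks interestRate
  return (1000 * rate / 12)

-- The same computation run against two contexts yields two different local answers:
-- runReader monthlyCharge (Context 0.05)  is about 4.17
-- runReader monthlyCharge (Context 0.10)  is about 8.33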

It also turns out Corfield is a principal contributor to the nLab page I previously reported finding, on Hegel’s logic as a modal type theory.

Independent of his discussion of modality, Corfield nicely builds on American programming language theorist Robert Harper’s notion of “computational trinitarianism”, which stresses a three-way isomorphism between constructive logic, programming languages, and mathematical category theory. The thesis is that any sound statement in any one of these fields should have a reasonable interpretation in both of the other two.
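
A standard toy instance of the correspondence, in a Haskell sketch of my own (not an example taken from Harper): the same few lines can be read as a constructive proof, as a program, and as a categorical construction.

-- Logic: (A and B) implies A.  Programming: the first component of a pair.
-- Category theory: the first projection out of a product.
proj1 :: (a, b) -> a
proj1 (x, _) = x

-- Logic: if B implies C and A implies B, then A implies C.
-- Programming: composing two functions.  Category theory: composing two morphisms.
compose :: (b -> c) -> (a -> b) -> (a -> c)
compose g f = g . f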

In working life, my own practical approach to software engineering puts a high value on a kind of reasoning inspired by a view of fancy type theory and category theory as extensions or enrichments of simple Aristotelian logic, which on its formal side was grounded in the composition of pairs of informally generated judgments of material consequence or material incompatibility. I find the history of these matters fascinating, and view category theory and type theory as a kind of vindication of Aristotle’s emphasis on composition (or what could be viewed as chained function application, or transitivity of implicit implication, since canonical Aristotelian propositions actually codify material inferences) as the single most important kind of formal operation in reasoning.

Categorical Hegel?

I just discovered a book-length nLab web draft with an extremely detailed interpretation of Hegel’s Science of Logic into higher category theory and homotopy type theory. (Reading category theory into Hegel was originally suggested by William Lawvere in the 1960s.) A lot of it is way beyond me, but there is much of interest. nLab in general hosts world-class work in math and logic, as well as applications of it to physics and philosophy. Remarks there about historical philosophers are uneven in quality, but a number of them are interesting, and the more mathematical or logical they are, the better the quality gets. The aforementioned draft does reference old, inadequate generalizations about Hegel as “mystical”, but the detail and scope of the interpretation into state-of-the-art mathematics are awe-inspiring. It also includes a nice formalization of Aristotelian logic, which is mathematically much simpler and relatively easy to understand. I previously found a much shorter page there that explicitly mentions Brandom, and connects with his interest in modal logic. (See also Identity, Isomorphism; Categorical “Evil”; Higher Order.)

Propositions, Terms

Brandom puts significant emphasis on Kant and Frege’s focus on whole judgments — contrasted with simple first-order terms, corresponding to natural-language words or subsentential phrases — as the appropriate units of logical analysis. The important part of this is that a judgment is the minimal unit that can be given inferential meaning.

All this looks quite different from a higher-order perspective. Mid-20th century logical orthodoxy was severely biased toward first-order logic, due to foundationalist worries about completeness. In a first-order context, logical terms are expected to correspond to subsentential elements that cannot be given inferential meaning by themselves. But in a higher-order context, this is not the case. One of the most important ideas in contemporary computer science is the correspondence between propositions and types. Generalized terms are interpretable as types, and thus also as propositions. This means that (higher-order) terms can represent instances of arbitrarily complex propositions. Higher-order terms can thus be given inferential meaning, just like sentential variables. This is all in a formal context rather than a natural-language one, but so was Frege’s work; and for what it’s worth, some linguists have also been using typed lambda calculus in the analysis of natural language semantics.

Suitably typed terms compose, just like functions or category-theoretic morphisms and functors. I understand the syllogistic principle on which Aristotle based a kind of simultaneously formal and material term inference (see Aristotelian Propositions) to be just a form of composition of things that can be thought of as functions or typed terms. Proof theory, category theory, and many other technical developments explicitly work with composition as a basic form of abstract inference. Aristotle developed the original compositional logic, and it was not Aristotle but mid-20th century logical orthodoxy that insisted on the centrality of the first-order case. Higher-order, compositionally oriented logics can interpret classic syllogistic inference, first-order logic, and much else, while supporting more inferentially oriented semantics on the formal side, with types potentially taking pieces of developed material-inferential content into the formal context. We can also use natural-language words to refer to higher-order terms and their inferential significance, just as we can capture a whole complex argument in an appropriately framed definition. Accordingly, there should be no stigma associated with reasoning about terms, or even just about words.

In computer-assisted theorem-proving, there is an important distinction between results that can be proved directly by something like algebraic substitution for individual variables, and those that require a more global rewriting of the context in terms of some previously proven equivalence(s). At a high enough level of simultaneous abstraction and detail, such rewriting could perhaps constructively model the revision of commitments and concepts from one well-defined context to another.
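
As a rough Lean 4 sketch of the distinction I have in mind (a toy of my own, not drawn from any particular proof development): the first goal below closes by direct computation, while the second requires rewriting the goal with previously proven equivalences.

-- Closed by direct computation / substitution.
example : 2 + 2 = 4 := rfl

-- Closed by rewriting with already-proven library lemmas about addition,
-- which globally transforms the goal rather than merely substituting for a variable.
example (a b : Nat) : a + b + 0 = b + a := by
  rw [Nat.add_zero, Nat.add_comm]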

The potential issue would be that global rewriting still works within a higher-order context that is expected to be statically consistent, whereas revision of commitments and concepts, taken by itself, implies a change of higher-level context. I think this just means a careful distinction of levels would be needed. After all, any new, revised genealogical recollection of our best thoughts will in principle be representable as a new static higher-order structure, and that structure will include something that can be read as an explanation of the transition. It may itself be subject to future revision, but in the static context that does not matter.

The limitation of such an approach is that it requires all the details of the transition to be set up statically, which can be a lot of work, and it would also be far more brittle than Brandom’s informal material inference. (See also Categorical “Evil”; Definition.)

I am fascinated by the fact that typed terms can begin to capture material as well as purely formal significance. How complete or adequate this is would depend on the implementation.

Categorical “Evil”

If we are aiming at any kind of true unity of apperception, then in any given logical moment we should aim to reason in ways that are invariant under isomorphism. Over time our practical and theoretical reasoning may and will iteratively change, but synchronically we should aim to ensure that reasoning about equivalent things will be invariant within the scope of each iteration.

In higher mathematics, difficulties arise when one structure is represented by or in another structure that has a different associated notion of equivalence: the expected consequence relation for the represented notion may not work well with the representation, so a careful distinction of levels has to be maintained. Such failures of reasoning to be invariant under isomorphism are informally, half-jokingly referred to by practitioners of higher category theory as “evil”. This is a mathematical idea with a clear normative aspect and a very high relevance to philosophy.

The serious slogan implied by the half-joke is that evil should be avoided. More positively, a principle of equivalence-invariance has been articulated for this purpose. One version states that all grammatically correct properties of objects in a fixed category should be invariant under isomorphism. Another states that isomorphic structures should have the same structural properties. On the additional assumption that the only properties of objects we are concerned with are structural properties, this is said to be equivalent to the first.

There are numerous examples of such “evil”, usually associated with uncareful use of equality (identity) between things of different sorts. A significant foundational one is that material set theories such as ZFC allow arbitrary sets to be putatively compared for equality, without providing any means to effect the comparison. Comparison of completely arbitrary things is of course not computable, so it cannot be implemented in any programming language. It is also said to violate equivalence invariance, which means that material set theories allow evil. The root of this evil is that such theories inappropriately privilege pre-given, arbitrary elements over definable structural properties. (This issue is another reason I think definition needs to be dialectically preserved or uplifted in our more sophisticated reflections, rather than relegated to the dustbin in favor of a sole emphasis on recollective genealogy. A concern to define structures and structural properties of things appears in this context as the determinate negation of the effective privileging of putatively pre-given elements over any and all rational considerations.) ZFC set theory offers a nice illustration of the more general evil of Cartesian-style bottom-up foundationalism.
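
To make the point concrete, here is a small Haskell sketch of my own, a toy model of hereditarily finite sets with invented names, in which the two standard set-theoretic encodings of the natural numbers disagree about a membership question even though they encode the same abstract structure. Membership of this kind is exactly the sort of non-structural property that the equivalence-invariance principle rules out.

-- Hereditarily finite sets modeled as finite lists (a toy; duplicates and efficiency ignored).
newtype HF = HF [HF]

-- Extensional equality: two sets are equal when they have the same elements.
instance Eq HF where
  HF xs == HF ys = all (`elem` ys) xs && all (`elem` xs) ys

member :: HF -> HF -> Bool
member x (HF ys) = x `elem` ys

empty :: HF
empty = HF []

-- von Neumann encoding of the naturals: succ n = n together with {n}  (for non-negative n).
vonNeumann :: Int -> HF
vonNeumann 0 = empty
vonNeumann n = let m@(HF xs) = vonNeumann (n - 1) in HF (m : xs)

-- Zermelo encoding: succ n = {n}.
zermelo :: Int -> HF
zermelo 0 = empty
zermelo n = HF [zermelo (n - 1)]

-- "Is 0 an element of 2?" has no encoding-independent answer:
-- member (vonNeumann 0) (vonNeumann 2)  ==> True
-- member (zermelo 0)    (zermelo 2)     ==> False

Both encodings are isomorphic as models of the natural numbers, but the membership question fails to transfer across the isomorphism.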

The evil-generating supposition that utterly arbitrary things can be compared (and that we don’t need to care that we can’t even say how this would be accomplished) implicitly presupposes that all things whatsoever have a pre-given “Identity” that is independent of their structural properties, yet is mysteriously somehow contentful and somehow magically, immediately epistemically available as such. This is a mathematical version of the overly strong but still common notion of Identity that I and many others have been concerned to reject. Such bad notions of Identity are deeply involved with the ills of Mastery diagnosed by Hegel and Brandom.

We should not allow evil in foundations, so many leading mathematicians interested in foundations are now looking for an alternative to the 20th century default of ZFC. Some combination of dependent type theory for syntax with higher category theory for semantics seems most promising as an alternative. The recent development of homotopy type theory (HoTT) is perhaps the most vigorous candidate.

Another way to broadly characterize this mathematical “evil” is that it results from treating representation as prior to inference in the order of explanation, as Brandom might say, which means treating correspondence to something merely assumed as given as taking precedence over coherence of reasoning. This is a variant of what Sellars famously called the Myth of the Given. It is a philosophical evil as well as a mathematical one. Besides their intrinsic importance, these mathematical issues make more explicit some of the logical damage done by the Myth of the Given.

Another broad characterization has to do with mainstream 20th century privileging of classical logic over constructive logic, of first-order logic over higher-order logic, and of model theory over proof theory. Prior to the late 19th century, nearly all mathematics was constructive. Cantor’s development of transfinite mathematics was the main motivation for mathematicians to begin working in a nonconstructive style. Gödel’s proof that first-order logic is complete, in the sense that every proposition true in all models is also provable, was thought to make it better suited for foundational use. Logical completeness and even soundness are standardly defined in ways that privilege model theory, which is the formal theory of representation.

It is now known, however, that there are several ways of embedding and representing classical logic, with no loss of fidelity, on a constructive foundation, so the old claim that constructive logic was less powerful has been refuted. Going in the other direction, classical logic has no way of recovering the computability that is built into constructive logic, once that computability has been given up, so it is increasingly recognized that a constructive logic provides the more flexible and comprehensive starting point. (Also, transfinite mathematics can reportedly now be given a constructive foundation under HoTT.)
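
One classic route, among the several alluded to above, is the double-negation translation. A tiny Haskell sketch of my own (using only the standard Data.Void type; the names Not and doubleNegatedLEM are mine) shows the flavor: the law of excluded middle has no general constructive proof, but its double-negation translation does, and that single program is the proof.

import Data.Void (Void)

-- Constructive negation: a proof of A would lead to absurdity.
type Not a = a -> Void

-- Either a (Not a), the excluded middle itself, is not constructively provable in general.
-- Its double negation is, which is the form in which classical reasoning gets embedded:
doubleNegatedLEM :: Not (Not (Either a (Not a)))
doubleNegatedLEM k = k (Right (\a -> k (Left a)))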

Since the mid-20th century there has been an immense development of higher-order concepts in formal domains, including mathematical foundations; the theory of programming languages; and the implementation of theorem-proving software. Higher-order formalisms offer a huge improvement in expressive power. (As a hand-waving analogy, imagine how hard it would be to do physics with only first-order equations.)

Type theory, proof theory, and the theory of programming languages are kinds of formalism that put inference before pre-given representations. Category theory seems to take an even-handed approach.

Although I noted some interest in Brandom on the part of people working in a higher-order constructive context, Brandom himself seems much more interested in things that would be described by paraconsistent logics, such as processes of belief revision, the evolution of case law or common law, or normativity writ large. (In the past, he engaged significantly with Michael Dummett’s work, while to my knowledge remaining silent on Dummett’s arguments in favor of the philosophical value of constructive logic.)

Paraconsistency is a property of some consequence relations: in the absence of an explicit assumption that anything follows from a contradiction, not everything can in fact be proven to follow from a given contradiction, so the consequence relation does not “explode” (collapse into triviality).
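
The classical argument for explosion that such consequence relations block can be laid out in a few lines (the schematic layout is mine); the argument fails in paraconsistent systems because at least one of the last two steps is rejected or restricted, most often the disjunctive-syllogism step:

A (premise)
not-A (premise)
A or B (from the first premise, by disjunction introduction)
B (from not-A and “A or B”, by disjunctive syllogism)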

In view of the vast proliferation of alternative formalisms of all sorts since the mid-20th century, it may very well be inappropriate to presume that we will ever get back to one formalism to rule them all. I do expect that homotopy type theory or something like it will eventually come to dominate work on mathematical foundations and related aspects of computer science (and everything else that falls under Hegelian Understanding, taken as a positive moment in the larger process); but as hugely important as I think these are, I am also sympathetic to Brandom’s Kantian/Hegelian idea that considerations of normativity form an outer frame around everything else, as well as to the Aristotelian view that considerations of normativity tend to resist formalization.

On the formal side, it seems it is not possible to synchronically reconcile HoTT with paraconsistency, which would seem to be a problem. (At the opposite, simple end of the scale, my other favorite logical mechanism — Aristotelian syllogism interpreted as function composition — apparently can be shown to have a paraconsistency property, since it syntactically constrains conclusions to be semantically relevant to the premises.)

Diachronically, though, perhaps we could paraconsistently evolve from one synchronically non-evil, HoTT-expressible view of the world to a dialectically better one, while the synchronic/diachronic distinction could save us from a conflict of requirements between the respective logics.

I think the same logical structure needed to wrap a paraconsistent recollective genealogy around a formal development would also account for iterative development of HoTT-expressible formal specifications, where each iteration would be internally consistent, but assumptions or requirements may change between iterations.