Classical logic with Mendler induction

We investigate (co-) induction in classical logic under the propositions-as-types paradigm, considering propositional, second-order and (co-) inductive types. Specifically, we introduce an extension of the Dual Calculus with a Mendler-style (co-) iterator and show that it is strongly normalizing. We prove this using a reducibility argument.

termination. Logically, this entails the consistency of a classical system that goes further than the usual Boolean and second-order propositions. There is no a priori reason to assume this might be the case: languages based on classical logic have been shown to be quite misbehaved if not handled properly [12] and certain forms of Mendler induction have been shown to break strong normalization at higher-ranked types [2].
Here we show that these two constructions are, in fact, compatible. In summary, we:
• develop a second-order Dual Calculus with functional types (Section 2),
• prove its strong normalization (Section 3) via a reducibility argument,
• review Mendler induction in a functional setting (Section 4),
• extend the Dual Calculus with Mendler (co-) inductive types (Section 5) and
• adapt the aforementioned reducibility argument to prove that the extension is also strongly normalizing (Section 6).
Duality. At every stage, borrowing from one of LK's design principles, we consider concomitantly the duals of every type we introduce, viz. subtraction [5] and co-induction. Similarly to LK, this entails little more than 'flipping' the actions on the left and on the right. The choice to do so was not merely aesthetic: having subtractions in our system affords us a much more natural definition of Mendler induction than if we only had implication at our disposal. This is comparable to the use of existential types as a basis for modeling ad hoc polymorphism in functional languages [18], as opposed to the more elaborate encoding by means of universal types, and stresses the point that duality brings forth gains in expressiveness at little cost for the designer.
formation rules are restricted in what phrases they expect-e.g. pairs should combine values, while projections pass the components of a pair to some other continuation. This distinction also forces the existence of two kinds of variables: variables for terms and co-variables for co-terms; we assume that they belong to some disjoint and countably infinite sets denoted by Var and Covar, respectively.
Cuts and abstractions. The third and final kind of phrase in the Dual Calculus is the cut. Recall the famous dictum of computer science: Data-structures + Algorithms = Programs.
In DC, where terms represent the creation of information and co-terms consume it, we find that cuts, the combination of a term with a continuation, are analogous to programs: Terms + Co-terms = Cuts; they are the entities that are capable of being executed. Given a cut, one can consider the execution that would ensue if given data for a variable or co-variable. The calculus provides a mechanism to express such situations by means of abstractions x.(c) and of co-abstractions α.(c) on any cut c.
Abstractions are continuations-they expect values in order to proceed with some execution-and, dually, co-abstractions are computations.
Subtraction. One novelty of this paper is the central rôle given to subtractive types, A − B [5]. Subtraction is the dual connective to implication; it is to continuations what implication is to terms: it allows one to abstract co-variables in co-terms, and thereby compose continuations. Given a continuation k where a co-variable α might appear free, the subtractive abstraction (or catch, due to its connection with exception handling) is defined by the binding operator μα.(k); the idea being that applying (read, cutting) a continuation k′ and a value t to it, packed together as (t # k′), yields a cut of the form t • k[k′/α].
Typing judgments. We present the types and the typing rules in Table 2; we omit the structural rules here but they can be found in the aforementioned paper by Wadler [22]. We have three forms of typing judgments that go hand-in-hand with the three kinds of phrases: Γ ⊢ t : A | Δ for terms, Γ | k : A ⊣ Δ for co-terms and Γ ⊢ c ⊣ Δ for cuts. The entailment symbol(s) always point to the phrase under judgment, and appear in the same position as the entailment symbol in the logically corresponding sequent of LK. Typing contexts Γ are finite assignments of variables to their assumed types; dually, typing co-contexts Δ assign co-variables to their types. Tacitly, we assume that they always include the free (co-) variables of the phrase under consideration. Type-schemes F(X) are types in which a distinguished type variable X may appear free; the instantiation of such a type-scheme at a particular type T is simply the (capture-avoiding) substitution of the distinguished X by T, and is denoted F(T).

Example: witness the lack of witness. We can apply the rules in Table 2 to prove valid formulas of second-order classical logic. One such example at the second-order level is ¬∀X. T → ∃X. ¬T. Note how the existential does not construct witnesses but simply diverts the flow of execution (by use of a co-abstraction).
Head reduction. The final ingredient of the calculus is the set of reduction rules. Head reduction rules (Table 3) encode the operational behavior that one would expect from the constructs of the language. They apply only to cuts and only at the outermost level. Head reduction is non-deterministic, as a cut of a co-abstraction against an abstraction can reduce by either one of the abstraction rules, and non-confluent [22, p. 195]. Confluence can be re-established by prioritizing the reduction of one kind of abstraction over the other; this gives rise to two confluent reduction disciplines that we term abstraction prioritizing and co-abstraction prioritizing. In any case, reduction of well-typed cuts yields well-typed cuts.

Parallel reduction. Since the phrases of DC are defined by mutual induction, we can generalize head reduction to cuts that occur inside any term, co-term or, indeed, other cuts (Table 4). Because, in general, several rules can be applied in parallel to any given phrase, we call this parallel reduction.
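The abstraction rules and their critical pair can be made concrete in a small executable sketch. The following Python encoding (tagged tuples; entirely our own, hypothetical rendering, not the paper's notation) implements just the (co-) abstraction fragment of head reduction:

```python
# Phrases as tagged tuples -- a hypothetical encoding of the abstraction
# fragment of DC:
#   terms:    ("var", x)      | ("coabs", a, c)   -- co-abstraction a.(c)
#   co-terms: ("covar", a)    | ("abs", x, c)     -- abstraction x.(c)
#   cuts:     ("cut", t, k)                       -- t cut against k

def subst(phrase, name, repl):
    """Substitute `repl` for the free (co-) variable `name` (capture-naive:
    adequate here because we only ever use distinct names)."""
    tag = phrase[0]
    if tag in ("var", "covar"):
        return repl if phrase[1] == name else phrase
    if tag in ("coabs", "abs"):
        if phrase[1] == name:          # binder shadows the name: stop
            return phrase
        return (tag, phrase[1], subst(phrase[2], name, repl))
    return ("cut", subst(phrase[1], name, repl), subst(phrase[2], name, repl))

def head_reduce(cut, prioritize="abs"):
    """One head-reduction step on a cut; `prioritize` resolves the
    non-deterministic case where both abstraction rules apply."""
    _, t, k = cut
    abs_applies = k[0] == "abs"        # t • x.(c)  ~>  c[t/x]
    coabs_applies = t[0] == "coabs"    # a.(c) • k  ~>  c[k/a]
    if abs_applies and (prioritize == "abs" or not coabs_applies):
        return subst(k[2], k[1], t)
    if coabs_applies:
        return subst(t[2], t[1], k)
    return None                        # no head rule applies

# The critical pair: a co-abstraction cut against an abstraction.  The two
# prioritizing disciplines reduce it to genuinely different cuts.
c1 = ("cut", ("var", "v"), ("covar", "d"))    # a does not occur: k is discarded
c2 = ("cut", ("var", "w"), ("covar", "e"))    # x does not occur: t is discarded
critical = ("cut", ("coabs", "a", c1), ("abs", "x", c2))
```

Running `head_reduce(critical, "abs")` yields c2 while `head_reduce(critical, "coabs")` yields c1, which is exactly the non-confluence discussed above.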

Strong normalization of the second-order Dual Calculus
The proof of strong normalization. Having surveyed the syntax, types and reduction rules of DC, we will now give a proof of its strong normalization, i.e. that all reduction sequences of well-typed phrases terminate in a finite number of steps, for the given non-deterministic parallel reduction rules. It follows that all manner of reduction sub-strategies, such as head reduction and the deterministic co-abstraction- or abstraction-prioritizing strategies, are also strongly normalizing.
The proof rests on a reducibility argument. Similar approaches for the propositional fragment can be found in the literature [10,20]; however, the biggest influence on our proof was the one by Parigot for the second-order extension of the Symmetric Lambda-Calculus [19].
Our main innovation here is the identification of a complete lattice structure with fix-points suitable for the interpretation of (co-) inductive types. We will, in fact, need to consider two lattices: OP and ONP. Because types have structure in the form of terms and co-terms, each element of said lattices is a pair of sets, with terms in one component and co-terms in the other. These two must be orthogonal, i.e. all cuts formed with those terms and co-terms must be strongly normalizing. The difference between the two is that in OP we find terms/co-terms of arbitrary form, while the components of the lattice ONP are restricted to terms/co-terms that are introductions/eliminations. Between these two domains, we have type-induced actions from OP to ONP and a completion operator from ONP to OP that generates all terms/co-terms compatible with the given introductions/eliminations (diagram (1)). In this setting, we give (two) mutually induced interpretations for types (one in ONP and the other in OP) and establish an adequacy result (Theorem 3.21) from which strong normalization follows as a corollary.

Operations on sets of syntax
Sets of syntax. The set of all terms formed using the rules in Table 1 will be denoted by T; similarly, co-terms will be K and cuts C. We will also need three special subsets of those sets: IT for those terms whose outer syntactic form is an introduction, EK, dually, for the co-terms whose outer syntactic form is an eliminator, and SN for the set of strongly normalizing cuts. Since the proof refers to the parallel reduction strategy, we also need the set of strongly normalizing terms, SN T, and the set of strongly normalizing co-terms, SN K.
Saturation. The sets above all have in common the property that they are closed under reduction; they are said to be saturated. For example, a strongly normalizing term must reduce to strongly normalizing terms, and cuts reduce to cuts. Saturation can be expressed in terms of the image, denoted [−], of the (parallel) reduction relation: in symbols, a set of phrases P is saturated precisely when [P] ⊆ P.

Syntactic actions on sets. The syntactic constructors give rise to obvious actions on sets of terms, co-terms and cuts, e.g. the pairing of sets of terms is ⟨T, U⟩ ≡ { ⟨t, u⟩ | t ∈ T, u ∈ U }. By abuse of notation, these operators shall be denoted as their syntactic counterparts. Apart from the cut action, which may introduce head reductions, they all preserve saturation.

LEMMA 3.1 (Saturation for (co-) term syntactic operators). Assume T and U, K and L and C are saturated sets of terms, co-terms and cuts, respectively. Then the sets constructed from them using the introduction, elimination and structural-abstraction syntactic operators are all saturated.

Substitution and its restriction. The (capture-avoiding) substitution operation lifts point-wise to the level of sets as a monotone function (−)[(=)/φ] : P(U) × P(V) → P(U), for V the set of terms (resp. co-terms), φ a variable (resp. co-variable) and U either the set of terms, co-terms or cuts. We will often need to answer the question of which strongly normalizing phrases land in some set of 'good phrases' after simultaneously substituting for some variables. This is handled by the following operations of restriction under (simultaneous, capture-avoiding) substitution: given χ a finite family of co-/variables and P a family of sets, with each P_i in T or K as appropriate for the respective χ_i, the restriction to T ∈ P(T) under the substitution P/χ is T↾P/χ ≡ { t ∈ SN T | t[P/χ] ⊆ T }; the restrictions −↾P/χ : P(K) → P(SN K) and −↾P/χ : P(C) → P(SN) are defined similarly.
They are clearly antitone in P and monotone in the set being restricted to. Furthermore, they witness an adjoint situation between sets of terms/co-terms/cuts and their strongly normalizing counterparts.

PROPOSITION 3.2 Let χ be a finite family of variables and co-variables and P a(n equally indexed) family of sets of phrases, each according with the kind of the respective χ_i; then, for any set Q of phrases and any set S of strongly normalizing phrases of the same kind, S[P/χ] ⊆ Q if and only if S ⊆ Q↾P/χ.

Substitutivity. The preservation of saturation by this restriction for one (co-) variable requires the concept of substitutivity of reduction: substitution commutes with (parallel) reduction in the following two ways.

LEMMA 3.3 (I) For any phrases (i.e. terms/co-terms/cuts) p₀ and p₁ with p₀ ⇒ p₁, we have p₀[u/φ] ⇒ p₁[u/φ] for any substituted phrase u. (II) For any phrase p, if u₀ ⇒ u₁ then p[u₀/φ] ⇒ p[u₁/φ].

The first property can take a more algebraic flavour: let P stand for a set of terms, or of co-terms or cuts, U be a set of terms and L a set of co-terms. Substitutivity I is equivalent to [P][U/x] ⊆ [P[U/x]] and [P][L/α] ⊆ [P[L/α]].

Restriction and saturation.
And all this algebraic scaffolding affords us a very straightforward proof of
THEOREM 3.5 Let P be a saturated set of either terms, co-terms or cuts; then, for any T ⊆ T and any K ⊆ K, the sets P↾T/x and P↾K/α are saturated.
PROOF. Note that P↾T/x ⊆ SN, SN T or SN K (as appropriate), so, also, [P↾T/x] ⊆ SN, SN T or SN K (likewise for P↾K/α). Then, for either case, a simple algebraic derivation, using the adjointness of the restrictions, monotonicity of the image of a relation, substitutivity and saturation, suffices.

Orthogonal pairs
Orthogonal pairs. Whenever a term t and a co-term k (necessarily strongly normalizing) form a strongly normalizing cut t • k, we say that they are orthogonal. Similarly, for sets T of terms and K of co-terms, we say that they are orthogonal if T • K ⊆ SN. We use the name orthogonal pairs for pairs of orthogonal sets that are saturated, and denote the set of all such pairs by OP. For any orthogonal pair P ∈ OP, its set of terms is denoted (P)T and its set of co-terms (P)K. Note that no type restriction is in play in the definition of orthogonal pairs; e.g. a cut of an injection with a projection is by definition orthogonal, as no reduction rule applies.

Lattices.
Recall that a lattice L is a partially ordered set such that every non-empty finite subset S of the carrier of L has a least upper bound (or join, or lub) and a greatest lower bound (or meet, or glb), respectively denoted by ⋁S and ⋀S. By abuse of notation, we conflate lattices with their carrier sets unless otherwise noted. If the bounds exist for any subset of L, one says that the lattice is complete; in particular, this entails the existence of a bottom and a top element for the partial order. The powerset P(S) of a set S ordered by inclusion, together with set-union and set-intersection, is a complete lattice with bounds given by S and the empty set. The dual L^op of a (complete) lattice L (where we take the opposite order and invert the bounds) is a (complete) lattice, as is the point-wise product of any two (complete) lattices.
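For concreteness, the powerset example can be spelled out executably. This Python sketch (function names ours) computes arbitrary joins and meets in P(S), with the empty join and meet giving the bottom and top elements:

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

def join(family, S):
    """Least upper bound of a family of subsets of S: its union.
    The empty join is the bottom element, the empty set."""
    out = frozenset()
    for x in family:
        out |= x
    return out

def meet(family, S):
    """Greatest lower bound: intersection; the empty meet is the top, S.
    In the dual lattice P(S)^op the two operations simply swap roles."""
    out = frozenset(S)
    for x in family:
        out &= x
    return out
```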
The join and meet of arbitrary subsets S ⊆ OP are given component-wise, ⋁S = (⋃{ (P)T : P ∈ S }, ⋂{ (P)K : P ∈ S }) and ⋀S = (⋂{ (P)T : P ∈ S }, ⋃{ (P)K : P ∈ S }); explicitly, the empty join and meet are (∅, SN K) and (SN T, ∅).

Orthogonal normal pairs. The other lattice we are interested in is the lattice ONP of what we call orthogonal normal pairs. These are orthogonal pairs whose components are made up, at the outermost level, of either introductions or eliminators. Logically speaking, they correspond to those proofs whose last derivation is a left or right operational rule; computationally, they intuitively correspond to the '(co-) normal forms' of a type. Orthogonal normal pairs inherit the lattice structure of OP except for the empty lub and glb, which become ⊥ ≡ (∅, EK ∩ SN K) and ⊤ ≡ (IT ∩ SN T, ∅).

PROPOSITION 3.7 (Lattice structure of ONP).
The set ONP can be turned into a sub-lattice of OP. It is complete, with the extrema ⊥ and ⊤ given above.

LEMMA 3.8 (Saturation and orthogonality).
Let T ⊆ SN T and K ⊆ SN K be saturated sets of terms and co-terms, respectively, satisfying the condition that every head reduct of a cut of a term of T with a co-term of K is strongly normalizing; then T • K ⊆ SN.

PROOF. We prove the result indirectly by proving instead that for any finite subsets T′ ⊆ T and K′ ⊆ K one has T′ • K′ ⊆ SN. Given that both T and K are equal to the union of their finite subsets (and these unions are preserved by the syntactic operation of cutting, − • −), the result will follow. The proof is by induction on the sum of the depths of all possible reduction paths of all phrases in the sets T′ and K′. As the sets are finite and the (co-) terms terminating (by virtue of being in SN T or SN K, as appropriate), we are guaranteed that this measure is either zero, in which case only head reductions apply, or strictly decreasing when we execute a single step of parallel reduction. For the zero case, the hypothesis on head reducts immediately yields T′ • K′ ⊆ SN. The induction step is proved with measured aid of the induction hypothesis. Assume the sum of the depths of all reduction paths for terms and co-terms in T′ and K′ is non-zero; it therefore follows that, after a step of parallel reduction, either both components have a smaller sum than before, or the sum of depths of reduction for one of them is zero and for the other it is smaller than before. Saturation now takes on a critical rôle; because of it we know that the reducts of T′ and K′ are still subsets of T and K, and, since they are finite, the induction hypothesis can be applied to them. In any of the cases above, be it by the induction hypothesis or by virtue of not having any reducible terms, we conclude that T′ • K′ ⊆ SN.

PROOF (of Lemma 3.9, the preservation of orthogonality by the type actions). Apart from the implicative and subtractive cases, by Lemma 3.1, all the sets parametrizing the cuts are saturated, and clearly strongly normalizing as well. For the other two cases, we appeal first to Theorem 3.5 to show that the relevant restrictions under substitution of the saturated sets T and K are also saturated, and, by definition, strongly normalizing. By Lemma 3.8, it then suffices to show that, after head reduction, the resulting cuts are strongly normalizing, e.g. using adjointness (Proposition 3.2). The remaining cases are even easier.
Type actions. Pairing together the actions of the introductions and eliminations of a given type allows us to construct elements of ON P whenever we apply them to orthogonal pairs. These type actions are defined in Table 5.
PROOF. Using the lattice properties of ONP (Proposition 3.7), the proof reduces to repeated use of Lemma 3.9 to establish the orthogonality of the given sets. For that, we need the orthogonality and the saturation properties of the components of OP. It also follows from the latter that the constructed sets are saturated (Lemma 3.1). By construction, they are all made, at the outermost level, out of constructors/eliminators.
Orthogonal completion. Now that we have interpretations for the actions that construct values/co-values of a type in ONP, we need to go the other way (as per diagram (1), above) to OP, so that we also include (co-) variables and (co-) abstractions in our interpretations. So, for saturated orthogonal sets of values T and of co-values K, the term and co-term completions of T and K are respectively defined. Due to the non-determinism associated with the reduction of abstractions, we need to guarantee that all added (co-) abstractions are compatible not only with the initial set of values, but also with any (co-) abstractions that have been added in the process, and vice-versa. In other words, we need to iterate this process by taking the least fix-point in the complete lattice of subsets of terms, and then, from it, obtain the continuations. (In fact, as has been remarked elsewhere [3,19], all one needs is a fix-point.) That the fix-point exists is a consequence of the next proposition; that it always yields an orthogonal pair when applied to elements N ∈ ONP is proven in Proposition 3.14, and in this case we term it the structural completion of N.
PROOF. The set SN is saturated; hence, by Theorem 3.5, so is SN↾L/α for any co-term set L, and, by Lemma 3.1, α.(SN↾L/α) is therefore also saturated. The set of variables Var is, trivially, saturated, as is T (by assumption); the union of these sets, for any choice of co-variable in the abstraction, is also saturated. (Likewise for the co-term closure.)

For an arbitrary set of terms T and an arbitrary set of co-terms K, let P = (T, K); a series of equalities relating the completion of P to its components holds, and from these it follows easily that:

PROPOSITION 3.14 Let N ∈ ONP be an orthogonal normal pair. Its structural completion is an orthogonal pair.

Orthogonal interpretations
Interpretations. Given a type T and a finite mapping γ from its free type variables, ftv(T), to ONP, called the interpretation context, we define (Table 6) two interpretations of T, one in orthogonal pairs and a normal one in orthogonal normal pairs, by mutual induction on the structure of T (bound variables are, as usual, taken fresh). They both satisfy the weakening and substitution properties. The extension of an interpretation context γ so that a type-variable X is mapped to a given orthogonal normal pair is again an interpretation context.

THEOREM 3.15 (Well-definedness). For any DC type T and for any suitable interpretation context γ (i.e. finite dom(γ) and ftv(T) ⊆ dom(γ)), the two interpretations of T under γ are, respectively, an orthogonal pair and an orthogonal normal pair.

LEMMA 3.16 The two interpretations can be easily related: the interpretation in orthogonal pairs is the structural completion of the normal interpretation.

Second-order properties. In addition to being orthogonal, the two interpretations of Table 6 also satisfy the two (standard) properties of any second-order system: weakening and substitution. The first is paramount in the generalization rules (the right rule for universal quantification, the left rule for existential quantification); the second conflates type instantiation (substitution) at the syntactic level with instantiation of the interpretation at the semantic level.
Syntactic lifting. The final results that we need relate the structure of the calculus, its operators, with the interpretation of types. Concretely, the former are 'morphisms' between the latter.
LEMMA 3.20 (Conservation for abstractions). The abstraction operation takes the interpretation for terms into a subset of the interpretation for co-terms-and vice-versa for co-abstractions:

Adequacy
Substitutions. The result of a (head) reduction is not necessarily composed of sub-phrases of the original cut; for abstractions, it depends on the result of a substitution. The proof of strong normalization cannot, therefore, simply rest on a direct induction on the property of being strongly normalizing. We must additionally show that said property is invariant whenever we make a substitution that respects the interpretations of the given types. Given typing (co-) contexts Γ and Δ, and an interpretation context γ containing all the free type variables in those contexts, we define a (bi-) substitution σ for Γ, Δ, γ as a finite mapping sending each (co-) variable to an element of the (parametrized) interpretation of its type in the contexts.
The application of a substitution σ to a phrase p is denoted p[σ]. With this in hand, we can state the adequacy theorem, from which strong normalization follows.
THEOREM 3.21 (Adequacy). Let t, k and c stand for terms, co-terms and cuts of the dual calculus. For any typing contexts and co-contexts Γ and Δ, s.t.
for any (suitable) interpretation context γ for Γ, Δ and T, and correspondingly suitable substitution σ, we have that t[σ], k[σ] and c[σ] belong to the respective interpretations of their types (and are, in particular, strongly normalizing).

PROOF. Formally, by rule induction on the typing trees. We show the cases for subtraction. Terms are handled straightforwardly using the induction hypothesis. On the continuation side, we assume that the co-variable α is chosen fresh everywhere. As the typing context in the premiss of the typing rule has an extra α : B, we have that, for any substitution σ for the conclusion and any k in the co-term component of the interpretation of B under γ, the substitution σ[k/α] is in the conditions of the theorem, and therefore the induction hypothesis applies; and also, because Covar is contained in the co-term component of that interpretation, by definition of restriction, we conclude the desired membership.

COROLLARY 3.22 (Strong normalization).
Every well-typed phrase of DC is strongly normalizing.

Mendler induction
Having covered the first theme of the paper, classical logic in its Dual Calculus guise, let us focus in this section on the second theme we are exploring: Mendler induction. As the concept may be rather foreign, it is best to review it informally in the familiar functional setting.

Inductive definitions.
Roughly speaking, an inductive definition of a function is one in which the function being defined can be used in its own definition, provided that it is applied only to values of strictly smaller character than the input. The general fix-point operator associated to the inductive type μX.F(X) arising from a type scheme F(X) clearly violates this discipline and indeed breaks strong normalization: one can feed it the identity function to yield a looping term. One may naively attempt to tame this behavior by considering a modified fix-point operator in which, for the introduction in : F(μX.F(X)) → μX.F(X), one may regard x as being of strictly smaller character than in(x). Of course, this is still unsatisfactory as, for instance, we have the looping term fix (λf. f ∘ in). The problem here is that the functional λf. f ∘ in : (μX.F(X) → A) → (F(μX.F(X)) → A), of which we are taking the fix-point, takes advantage of the concrete type F(μX.F(X)) of the x used in the recursive call.

Mendler induction.
The ingenuity of Mendler induction is to ban such perversities by restricting the type of the functionals that the iterator can be applied to: these should not rely on the inductive type but rather be abstract, in other words, be represented by a fresh type variable X as in the typing below. Note that if the type scheme F(X) is endowed with a polymorphic mapping operation map F : (X → Y) → F(X) → F(Y), the conventional iterator can be recovered from the Mendler-style one.
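Operationally, the functional-setting Mendler iterator can be sketched in untyped Python; the tagged-tuple encoding of `in` and the helper names (`make_in`, `mitr`, `to_int`) are our own illustrative choices, and the rank-2 typing that makes the scheme safe is, of course, not enforced by Python:

```python
# Minimal sketch of the Mendler-style iterator.  The step function receives
# the recursive call as an *abstract* function `rec`; in the typed setting it
# has type  forall X. (X -> A) -> F(X) -> A,  so it cannot exploit the
# concrete type F(mu X.F(X)) of the payload -- which is what rules out the
# looping terms built from the naive fix-point operator.

def make_in(fx):
    """in : F(mu X.F(X)) -> mu X.F(X); here just a tagged wrapper."""
    return ("in", fx)

def mitr(step, v):
    """Mendler iterator: unwrap one `in` layer and hand the payload, together
    with the recursive call, to the step."""
    tag, fx = v
    assert tag == "in"
    return step(lambda x: mitr(step, x), fx)

# Naturals as mu X. 1 + X: payload None encodes zero, a wrapped natural
# encodes a successor.
zero = make_in(None)

def succ(n):
    return make_in(n)

def to_int(n):
    # Step: on zero return 0; on a successor, recurse through the abstract
    # call `rec` only -- the payload is never inspected directly.
    return mitr(lambda rec, fx: 0 if fx is None else 1 + rec(fx), n)
```

The step function of `to_int` never looks past what the functor 1 + X exposes; it can only hand the payload back to `rec`, which is precisely the restriction the Mendler typing imposes.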

Dual calculus with Mendler induction
Mendler induction. We shall now formalize Mendler induction in the classical calculus of Section 2, and we shall also introduce its dual, Mendler co-induction. This requires type constructors, syntactic operations corresponding to the introductions and eliminations, and their typing and reduction rules; these are summarized in Table 7. First, we take a type scheme F(X) and represent its inductive type by μX.F(X); dually, we represent the associated co-inductive type by νX.F(X).
Syntax. As usual, the inductive introduction, min −, witnesses that the values of the unfolding F(μX.F(X)) of the inductive type are injected into the inductive type μX.F(X). It is in performing induction that we consume values of inductive type; hence the induction operator (or iterator, or inductor), mitr ρ,α [k, l], corresponds to an elimination. It comprises an iteration step k, an output continuation l and two distinct induction co-variables, ρ and α. We postpone the explanation of their significance to the section on reduction below, but note now that the iterator binds ρ and α in the iteration continuation but not in the output continuation.
The co-inductive operators, mcoitr r,x (t, u) and mout [k], are obtained via dualization. In particular, the co-inductive eliminator, mout [k], witnesses that the co-values k of type F(νX.F(X)) determine the 'proper' (i.e. those that are not abstractions) co-values of νX.F(X).

Reduction.
To reduce an inductive cut min t • mitr ρ,α [k, l], we start by passing the unwrapped inductive value t to the induction step k. However, in the spirit of Mendler induction, the induction step must be instantiated with the induction itself and, because we are in a classical calculus, with the output continuation; this is where the parameter co-variables come into play. The first co-variable, ρ, receives the induction; the induction step may call this co-variable (using a cut) arbitrarily, and it must also be able to capture the output of those calls, in other words, to compose this continuation with other continuations; therefore, one passes μα.(mitr ρ,α [k, α]), the induction with the output continuation (subtractively) abstracted. The other co-variable, α, represents in k the output of the induction, which for a call mitr ρ,α [k, l] is l. For co-induction, we dualize; in particular, the co-inductive call expects the lambda-abstraction of the co-inductive step.
Typing. Lastly, we have the typing rules that force induction to be well founded. Recall that this was achieved in the functional setting by forcing the inductive step to accept arguments at arbitrary instances of the type scheme F(X). Here we do the same: in typing mitr ρ,α [k, l] for μX.F(X), we require k to have type F(X), where X is a type variable that appears nowhere in the derivation except in the (input) type of the co-variable ρ.

For any continuation k on N, the successor 'function' is defined as the following continuation for N.
Example: addition. The above primitives are all we need to define addition of these naturals. The inductive step, 'return m for zero, or induct and then add one', is encoded as Step.

THEOREM 5.1 Let n and m stand for the encodings of two natural numbers and let the encoding of their sum be (by abuse of notation) n + m. Under the abstraction prioritizing reduction discipline, cutting the encoding of n against the iterator for addition by m yields n + m.

PROOF. By induction on n: for n + 1 = min i₂ n, the cut min i₂ n • mitr ρ,α [Step m ρ,α, l] unfolds one inductive step and the induction hypothesis applies. Notice how the inductive operator works by accumulating future actions in the continuation parameter as it consumes the value; this is characteristic of this style of programming.
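The continuation-accumulating behavior of the additive iterator can be replayed in a loose Python transcription (closures standing in for DC continuations; all names are our own, and no claim is made that this mirrors the DC term syntax exactly):

```python
# Naturals as tagged tuples: ("z",) is zero, ("s", n) is the successor.
def zero():
    return ("z",)

def succ(n):
    return ("s", n)

def add(n, m, out):
    """Cut n against the additive step; `out` is the output continuation
    that finally receives the encoded sum."""
    if n[0] == "z":
        out(m)                                   # zero case: return m
    else:
        # Successor case: induct on the predecessor while accumulating a
        # 'then add one' action on the output continuation -- the analogue
        # of threading the output co-variable alpha through the iterator.
        add(n[1], m, lambda r: out(succ(r)))

def to_int(n):
    """Decode an encoded natural (for checking only)."""
    return 0 if n[0] == "z" else 1 + to_int(n[1])
```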
Splitting the naturals. Let us consider a slight generalization of the naturals, μX. A ∨ X, where 1 is replaced by some fixed type A (not containing X free), so that zero can be parametrized by a term of that type instead of the fixed witness ∗. Another well-known operation on the naturals is the witness split : N → N ∨ N of the partition of the naturals into evens and odds. This is an inductive function that tells of a natural number not only whether it is even or odd but also which nth even or odd it is. Since it consumes a natural, it will be a continuation; to simulate returning, we must also parametrize it with a continuation on N ∨ N, with the relevant component being called correspondingly.
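The specification of split can be checked against a direct-style Python sketch, with ordinary integers standing in for the inductive naturals and a tagged pair in place of the continuation on N ∨ N (a simplification of ours for readability):

```python
def split(n):
    """Return ("even", k) if n is the k-th even number and ("odd", k) if it
    is the k-th odd number (both zero-based)."""
    if n == 0:
        return ("even", 0)          # 0 is the 0th even number
    parity, k = split(n - 1)
    if parity == "even":
        return ("odd", k)           # successor of the k-th even: the k-th odd
    return ("even", k + 1)          # successor of the k-th odd: the (k+1)-th even
```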

Co-induction 'via' induction.
The example above is more than simply an arithmetic curiosity. If we straightforwardly take the syntactic dual, replacing induction with co-induction, disjunctions with conjunctions and inputs with outputs, we arrive at definitions that, as their names suggest, form the basis of the merging of co-inductive streams of some type A [14]. Their associated co-inductive type is νX. A ∧ X. Dualizing Proposition 5.2 as well, we can see this operator satisfies the specification we would expect of a merging operator: if we try to take out (read, pass to a continuation k) the element at position 2n (zero based) of the merged stream, we get element n of the first stream; and if we try to take out the element at position 2n + 1, we get the element at position n of the second stream. It may seem odd that we can encode infinite streams in a language that is, as we shall shortly see, strongly normalizing. The strict duality of the Dual Calculus makes it possible to re-frame any co-inductive problem into a more familiar inductive one. In this case, induction is strongly normalizing because we only ever apply it to finite values; conversely, co-inductive values may be 'infinite' but can only ever be analyzed using finite sequences of mout[−] operations.
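The stated specification of the merge can be tested on a Python sketch in which a stream is a thunk producing one observation, i.e. one head element together with the remaining stream, so that only finitely many mout-like steps are ever forced (the helpers `from_fn` and `at` are our own):

```python
def from_fn(f, i=0):
    """The stream f(i), f(i+1), ... as a thunk of (head, tail)."""
    return lambda: (f(i), from_fn(f, i + 1))

def merge(s, t):
    """One observation of merge(s, t) emits the head of s and swaps the
    roles of the two streams thereafter."""
    def step():
        head, s_tail = s()
        return (head, merge(t, s_tail))
    return step

def at(s, n):
    """Element at position n: n tail observations followed by one head --
    a finite sequence of mout-like steps."""
    head, tail = s()
    return head if n == 0 else at(tail, n - 1)
```

Position 2n of the merged stream indeed recovers element n of the first stream, and position 2n + 1 element n of the second.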

Strong normalization for Mendler induction
We now come to the main contribution of the paper: the extension of the orthogonal pairs interpretation of the second-order Dual Calculus (Section 3) to Mendler induction-and the proof, thereby, that the extension is also strongly normalizing.

Sets of syntax
Set and lattice structure. The extension begins with the reformulation of the sets T, K and C, SN T, SN K and SN, IT and EK so that they accommodate the (co-) inductive operators. Modulo these changes, the definitions of OP and ONP remain the same, and so do the actions for propositional and second-order types and the orthogonal completion. All that remains is to give suitable definitions for the (co-) inductive actions and the interpretations of (co-) inductive types. As before, we lift syntactic operators to the level of sets by taking the image of their actions on phrases, confusing the notation for both. These operators preserve saturation.

LEMMA 6.1 Let T and U be saturated sets of terms, and K and L be saturated sets of co-terms. The sets of inductive and co-inductive terms and co-terms built out of them by the inductive and co-inductive introductions and eliminations are saturated.

Inductive restrictions. The reduction rule for Mendler induction is unlike any other of the calculus. When performing an inductive step for mitr ρ,α [k, l], the bound co-variable ρ will only ever be substituted by one specific term, namely μα.(mitr ρ,α [k, α]). One needs a different kind of restriction to encode this invariant: take K and L to be sets of co-terms (intuitively, where the inductive step and output continuation live) and define the inductive restriction by

K/ρα L ≡ { k ∈ SN K | for all l ∈ L, k[μα.(mitr ρ,α [k, α])/ρ][l/α] ∈ K },

and dually for co-induction, for sets of terms T and U.

LEMMA 6.2 (Saturation for Mendler restrictions).
Let T be a set of terms and K a set of co-terms, both of them saturated, and let U and L be any set of terms and any set of co-terms, respectively. For any (distinct) variables r and x, and co-variables ρ and α, the sets K/ρα L and T/rx U are saturated.

LEMMA 6.3 (Preservation of (head) orthogonality). Take T, U ⊆ SN T to be saturated sets of terms, K, L ⊆ SN K to be saturated sets of co-terms and assume that T • K ⊆ SN; it then follows that all head reducts of the corresponding (co-) inductive cuts are strongly normalizing.

LEMMA 6.4 (Preservation of orthogonality).
Take T, U ⊆ SN T to be saturated sets of strongly normalizing terms, K, L ⊆ SN K to be saturated sets of strongly normalizing co-terms and assume that T • K ⊆ SN; it then follows that the corresponding (co-) inductive cuts are themselves contained in SN.

Orthogonal pairs
Mendler pairing. Combining the inductive restriction with the inductive introduction/elimination set operations, we can easily create orthogonal normal pairs, much as we did for the propositional actions, from two given orthogonal pairs: one intuitively standing for the interpretation of F(μX.F(X)) and the other for the output type. However, the interpretation of the inductive type should not depend on a specific choice of output type but should accept all instantiations of the output, as well as all possible induction co-variables; model-wise this corresponds to taking a meet over all possible choices for the parameters, and similarly for its dual.

Monotonization. The typing constraints on Mendler induction correspond, model-wise, to a monotonization step. This turns out to be exactly what we need to guarantee that an inductive type can be modeled by a least fix-point; without this step, the interpretation of a type scheme would be a function on complete lattices that is not necessarily monotone. There are two universal ways to induce monotone endo-functions from a given endo-function f on a complete lattice: the first, which we call the monotone extension of f, we use for inductive types; the other, the monotone restriction of f, will be useful for co-inductive types. They are, respectively, the least monotone function above f and the greatest monotone function below f. By Tarski's fix-point theorem, both have least and greatest fix-points; in particular, the monotone extension has a least fix-point and the monotone restriction a greatest one.
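To illustrate, both monotonizations can be computed exhaustively on a small finite lattice. This is a sketch under our own naming (mono_ext, mono_restr) over the powerset of {0, 1, 2}, not the interpretation lattice of the paper.

```python
from itertools import chain, combinations

ELEMS = frozenset({0, 1, 2})
# The complete lattice: all subsets of ELEMS, ordered by inclusion.
LATTICE = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(ELEMS), r) for r in range(len(ELEMS) + 1))]

def mono_ext(f):
    """Least monotone function above f: join of f(y) over all y <= x."""
    def g(x):
        out = frozenset()
        for y in LATTICE:
            if y <= x:
                out |= f(y)
        return out
    return g

def mono_restr(f):
    """Greatest monotone function below f: meet of f(y) over all y >= x."""
    def g(x):
        out = ELEMS
        for y in LATTICE:
            if y >= x:
                out &= f(y)
        return out
    return g

# A non-monotone example function on the lattice.
def f(x):
    return frozenset({0}) if len(x) % 2 == 0 else frozenset({0, 1})

g = mono_ext(f)
# g is monotone by construction, even though f is not.
assert all(g(y) <= g(x) for x in LATTICE for y in LATTICE if y <= x)
```

Here mono_ext(f) dominates f pointwise while mono_restr(f) lies below it, matching the characterizations as the least monotone function above f and the greatest monotone function below it.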

Orthogonal interpretations
Interpretations. The normal interpretations for (co-) inductive types associated to a type scheme F(X), given a (suitable) interpretation context γ, are as follows, while the respective orthogonal interpretations are as before. These interpretations also satisfy the weakening and substitution properties.

LEMMA 6.9
The two interpretations are still related (Lemma 3.16).

Note, however, that the interpretation of subtractive types requires that for any l ∈ A(γ)_K we already know mitr_ρ,α [k[σ], l] ∈ μX.F(X)(γ)_K, a circularity!
For ω-complete posets there is an alternative characterization of the least fix-point of a continuous function as the least upper bound of a countable chain. The completion operation used in the definition of the OP interpretation, however, is not continuous. Classically, though, the least fix-point of any monotone function f on a complete lattice exists and lies somewhere in the transfinite chain [7] given by d_0 = ⊥, d_{α+1} = f(d_α) and d_λ = ⋁_{α<λ} d_α (for limit ordinals λ), and dually for co-induction.
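On a finite lattice the chain stabilizes after finitely many steps, so the construction can be checked directly. The sketch below (our own example function f, not the completion operator of the paper) computes the least fix-point both as the limit of the chain from ⊥ and as Tarski's meet of pre-fixed points.

```python
from itertools import chain, combinations

ELEMS = frozenset({0, 1, 2})
LATTICE = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(ELEMS), r) for r in range(len(ELEMS) + 1))]

def chain_lfp(f):
    """d_0 = bottom, d_{a+1} = f(d_a); stabilizes on a finite lattice."""
    d = frozenset()
    while f(d) != d:
        d = f(d)
    return d

def tarski_lfp(f):
    """Meet of all pre-fixed points {x | f(x) <= x}."""
    out = ELEMS
    for x in LATTICE:
        if f(x) <= x:
            out &= x
    return out

# A monotone example function: below {0,...,2}-sets containing 2 it
# adds 0; elsewhere it is constant at {0}.
def f(x):
    return (x | frozenset({0})) if 2 in x else frozenset({0})

# Both constructions agree on the least fix-point.
assert chain_lfp(f) == tarski_lfp(f)
```

For genuinely non-continuous monotone functions on infinite lattices the chain may need to run past ω, which is where the transfinite indexing in the text becomes essential.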
Admissibility. This least fix-point comes with a useful induction principle. Because we work with least upper and greatest lower bounds, we need to think in terms of their preservation. We say of a proposition P (seen as a set) that it is admissible iff it satisfies:
1. Lub preservation: S ⊆ P ⇒ P(⋁S);
2. Downward closure: a ≤ b and P(b) ⇒ P(a).
Dually, such a property holds for the greatest fix-point of the monotone restriction. PROOF. Apply Theorem 6.12 to L = M^op.
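The induction principle can likewise be checked exhaustively on a finite lattice: if P is admissible and preserved by a monotone f, then P holds of the least fix-point. The lattice, the proposition P and the function f below are our own stand-ins for illustration.

```python
from itertools import chain, combinations

ELEMS = frozenset({0, 1, 2})
LATTICE = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(ELEMS), r) for r in range(len(ELEMS) + 1))]

# An admissible proposition P, read as a set of lattice elements.
P = [x for x in LATTICE if 2 not in x]

def f(x):
    """A monotone function that preserves P."""
    return x | frozenset({0})

# 1. Lub preservation: every subset S of P has its join in P.
for S in chain.from_iterable(combinations(P, r) for r in range(len(P) + 1)):
    assert frozenset().union(*S) in P

# 2. Downward closure: anything below an element of P is in P.
assert all(a in P for b in P for a in LATTICE if a <= b)

# Preservation: f maps P into P.
assert all(f(x) in P for x in P)

# Conclusion of the induction principle: P holds at the least fix-point.
lfp = frozenset()
while f(lfp) != lfp:
    lfp = f(lfp)
assert lfp in P
```

The empty subset of P is covered by lub preservation (its join is ⊥), which is why the base case of the induction comes for free.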

Adequacy
As in Section 3, we establish strong normalization via conservation and adequacy results for the Mendler-inductive extension.

LEMMA 6.14 (Conservation).
For any type scheme F, type A and interpretation context γ suitable for both, the following holds, where X is fresh for type A.
PROOF. We focus here solely on the eliminator for inductive types. The challenge we are faced with is to recast the statement of conservation as a proposition within the confines of our induction principle (Theorem 6.12). Using the by now familiar fact that the terms (and co-terms) in the normal interpretation are included in the orthogonal one, our goal can be re-framed with the ON P interpretation of the inductive type being the least fixed point; take then P to be the resulting proposition. To use the induction principle (Theorem 6.12), we need to prove that P is admissible (downward and least-upper-bound closed) and that it is preserved by MuP.

Let M ≤ N ∈ ON P. For downward closure, by contra-variance of the order on the continuation side, it follows that (N)_K ⊆ (M)_K; whence, if P(N) holds, then so does P(M), as needed. For the least upper bound property, for S ⊆ P we consider the empty and non-empty cases separately: the empty case holds trivially, and the non-empty case follows by considering any N ∈ S.

The core of the proof lies in showing preservation of P by MuP • F(X)(γ[X → −]). Assume, to this end, that N ∈ ON P is such that P(N) holds. This puts us in a position to prove that the inductive calls that comprise the left-hand side of the inclusion satisfy the inductive restriction. Combining these observations with the definition of the substitution restriction yields the claim.

PROOF (of adequacy). By extending the argument of Theorem 3.21. For the iterator, the proof almost exactly boils down to proving that the induction hypothesis on k[σ] (and l[σ]) implies the relevant pre-conditions of conservation (Lemma 6.14). A slight complication arises from the fact that the free type variables in A need not appear in the conclusion; therefore, a context γ which satisfies the adequacy conditions for the conclusion need not be suitable for the antecedents.
To guarantee that this is the case, we extend any such γ with those type variables that appear in A but were not covered by γ. Denoting the set of these by C = ftv(A) − dom(γ), we get a new context by assigning ⊥ to those variables. We shall also need to consider a further extension that accounts for the extra (fresh) type variable X; by the weakening property, this freshness can be used to assign an arbitrary N ∈ ON P to it. The set C is necessarily finite, as it is bounded by the free type variables that appear in the (finite) type A. By repeated applications of weakening, any substitution σ in the adequacy conditions for the conclusion w.r.t. the context γ is also valid for the extended context. For the return continuation l, the required condition follows immediately from the induction hypothesis and weakening, using ftv(μX.F(X)) ⊆ dom(γ). The (co-) injection cases are simple; the co-inductor is dealt with similarly to the above.

COROLLARY 6.16 (Strong normalization).
Every well-typed phrase of DC with Mendler induction is strongly normalizing.
We have investigated classical logic with Mendler induction, presenting a classical calculus with very general (co-) inductive types. Our work borrows from and generalizes systems based on Gentzen's LK under the Curry-Howard correspondence. Despite its generality, and as established by means of a reducibility argument, our Dual Calculus with Mendler induction is well behaved in that its well-typed cuts are guaranteed to terminate. We expect, but have yet to fully confirm, that other models fit within our framework for interpreting Mendler induction; our prime example is based on inflationary fix-points like those used in complexity theory [8], which also apply to non-monotone interpretations.
It is known that LK-based calculi can encode various other calculi [6,22]. Our calculus supports map operations for all positive (co-) inductive types. This may be used to encode Kimura and Tatsuta's [15] extension of the Dual Calculus with positive (co-) inductive types [9, ch. 5].
One avenue of research that remains unexplored is how one may extract proofs from within our system. In previous work, Berardi et al. [4] showed how, embracing the non-determinism of reduction inherent in the Symmetric Lambda-Calculus (and also present in DC), one could express proof witnesses that behave like processes for a logic based on Peano arithmetic. A further direction would take these investigations into the realm of linear logic, where the connection with processes may be more salient.