Probabilities of first order sentences on sparse random relational structures: An application to definability on random CNF formulas

We extend the convergence law for sparse random graphs proven by Lynch to arbitrary relational languages. We consider a finite relational vocabulary $\sigma$ and a first order theory $T$ for $\sigma$ composed of symmetry and anti-reflexivity axioms. We define a binomial random model of finite $\sigma$-structures that satisfy $T$ and show that first order properties have well defined asymptotic probabilities when the expected number of tuples satisfying each relation in $\sigma$ is linear. It is also shown that these limit probabilities are well-behaved with respect to several parameters that represent the density of tuples in each relation $R$ in the vocabulary $\sigma$. An application of these results to the problem of random Boolean satisfiability is presented. We show that in a random $k$-CNF formula on $n$ variables, where each possible clause occurs independently with probability $\sim c/n^{k-1}$, any first order property of $k$-CNF formulas that implies unsatisfiability almost surely does not hold as $n$ tends to infinity.


Introduction
We say that a sequence of random structures $\{G_n\}_n$ satisfies a limit law with respect to some logical language L if for every property P expressible in L the probability that $G_n$ satisfies P tends to some limit as $n \to \infty$. If that limit takes only the values zero and one then we say that $\{G_n\}_n$ satisfies a zero-one law with respect to L.
Convergence and zero-one laws have been extensively studied on the binomial graph G(n, p). The seminal theorem on this topic, due to Fagin [7] and Glebskii et al. [9] independently, concerns general relational structures. When applied to graphs it states that if p is fixed, then G(n, p) satisfies a zero-one law with respect to the first order (FO) language of graphs.
This zero-one law was later extended by Shelah and Spencer in [12]. There it is proven, among other results, that if $p := p(n)$ is a decreasing function of the form $n^{-\alpha}$ and $\alpha > 0$ is irrational, then $G(n, p(n))$ obeys a zero-one law with respect to FO logic. Moreover, it is also proven that if $\alpha \in (0, 1)$ is rational then $G(n, p(n))$ does not obey a convergence law.
This was further studied by Lynch in [10], where it is shown that in the case where the expected number of edges is linear, i.e. when p(n) ∼ β/n for some β > 0, then G(n, p(n)) satisfies a limit law with respect to FO logic. The following is a restatement of the main result in that article.
Theorem (Lynch, 1992). Let $p(n) \sim \beta/n$. For every FO sentence $\varphi$, the function $F_\varphi : (0, \infty) \to [0, 1]$ given by $F_\varphi(\beta) := \lim_{n\to\infty} \Pr\big( G(n, p(n)) \models \varphi \big)$ is well defined and is given by an expression with parameter $\beta$ built using rational constants, addition, multiplication and exponentiation with base $e$.
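As a concrete instance (our illustration, not taken from [10]): the number of triangles in $G(n, \beta/n)$ converges in distribution to a Poisson variable with mean $\beta^3/6$, so for the sentence asserting the existence of a triangle the limit probability has exactly the promised shape:

```latex
\varphi := \exists x\,\exists y\,\exists z\;
\big(E(x,y)\wedge E(y,z)\wedge E(x,z)\big),
\qquad
F_{\varphi}(\beta) = 1 - e^{-\beta^{3}/6}.
```

Note that this expression is built from rational constants, products and exponentials with base $e$, and is analytic in $\beta$.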
A relevant aspect of this result is that the limit probability of any FO property in G(n, p(n)) when p(n) ∼ β/n varies analytically with β. A consequence of this is that FO logic cannot "capture" sudden changes in the structure of G(n, p(n)).
It was left open at the end of [10] whether the convergence law obeyed by $G(n, p(n))$ in the range $p(n) \sim \beta/n$ could be generalized to other random models of relational structures that contain relations of arity greater than 2. A result in this direction, among other zero-one and convergence laws, was obtained in [11]. The authors consider the random model of $d$-uniform hypergraphs $G^d(n, p)$ where each $d$-edge is added to a set of $n$ labeled vertices independently with probability $p$. It is shown that when $p(n) \sim \beta/n^{d-1}$, i.e. when the expected number of edges is linear, $G^d(n, p(n))$ obeys a convergence law with respect to the FO language of $d$-uniform hypergraphs. With little additional work it can be shown that under these conditions the limit probability of any FO property of $G^d(n, p(n))$ varies analytically with $\beta$. We extend this result to arbitrary relational structures on whose relations we can impose symmetry and anti-reflexivity constraints (Theorem 1.3).

This generalization is motivated by an application to the problem of random SAT. We continue the study, started by Atserias in [1], of the definability in first order logic of certificates for unsatisfiability that hold for typical unsatisfiable formulas. A random model for 3-CNF formulas where each possible clause over $n$ variables is added independently with probability $p$ is considered there. In this model the expected number of clauses $m$ is $\Theta(n^3 p)$ as $n$ grows. The main result of that article states the following: (1) if $m = \Theta(n^{2-\alpha})$ for an irrational number $\alpha > 0$, then no FO property of 3-CNF formulas that implies unsatisfiability holds asymptotically almost surely (a.a.s.) for unsatisfiable formulas, and (2) if $m = \Theta(n^{2+\alpha})$ for $\alpha > 0$, then there exists some FO property that implies unsatisfiability and holds a.a.s. for unsatisfiable formulas.
The second part of the statement is the simpler one to prove: it can be shown that when $m = \Theta(n^{2+\alpha})$ for some $\alpha > 0$ the random 3-CNF formula a.a.s. contains some fixed unsatisfiable subformula (which depends on the choice of $\alpha$). This is clearly expressible in FO logic, so (2) follows. The proof of (1) is more involved and, in fact, shows something stronger: if $m = \Theta(n^{2-\alpha})$ for $\alpha > 0$ irrational, then all FO properties that imply unsatisfiability a.a.s. do not hold. This proof employs techniques based on those used by Shelah and Spencer in [12] to prove that $G(n, p)$ satisfies a zero-one law with respect to FO logic when $p$ is an irrational power of $n$.
Since the techniques used to prove (1) rely on the fact that $\alpha$ is irrational, the study of the range $m = \Theta(n)$ (that is, $m = \Theta(n^{2-\alpha})$ with $\alpha = 1$) was left open. This range is of special interest because it is where the phase transition from almost sure satisfiability to almost sure unsatisfiability takes place. It was shown in [3] that a random $k$-CNF formula with $m$ clauses over $n$ variables satisfying $m \sim cn$ is a.a.s. satisfiable for all sufficiently small values of $c$ and is a.a.s. unsatisfiable for all sufficiently large values of $c$.
The possibility of studying FO definability of certificates for unsatisfiability in random $l$-CNF formulas with a linear expected number of clauses using a generalization of Lynch's theorem was suggested by Atserias. This application is discussed in Section 5; we give a brief overview of it here. Let $F(l, n, p)$ be a random model of $l$-CNF formulas where each $l$-clause over $n$ variables is chosen independently with probability $p$. Let $F^l_n(\beta)$ denote a random formula in $F(l, n, p)$ where $p := p(n) \sim \beta/n^{l-1}$. Suppose that every FO property of $l$-CNF formulas has a well defined asymptotic probability in $F^l_n(\beta)$ for any $\beta > 0$. Further suppose that these asymptotic probabilities vary analytically with $\beta$. Then any FO property that implies unsatisfiability a.a.s. does not hold in $F^l_n(\beta)$ for any $\beta > 0$. Indeed, let $P$ be one such FO property. One can find a value $\beta_0 > 0$ such that a.a.s. $F^l_n(\beta)$ is satisfiable when $0 < \beta < \beta_0$. As a consequence $P$ a.a.s. does not hold in $F^l_n(\beta)$ when $0 < \beta < \beta_0$. Since the asymptotic probability of $P$ varies analytically with $\beta$ and it vanishes on the non-empty interval $(0, \beta_0)$, by the principle of analytic continuation it must be true that $P$ a.a.s. does not hold in $F^l_n(\beta)$ for all $\beta > 0$.

General notation
Given a positive natural number $n$, we write $[n]$ to denote the set $\{1, 2, \dots, n\}$. Given numbers $n, m \in \mathbb{N}$ with $m \le n$ we denote by $(n)_m$ the $m$-th falling factorial of $n$. Given a set $S$ and a natural number $k \in \mathbb{N}$ we use $\binom{S}{k}$ to denote the set of subsets of $S$ of size $k$. Given a set $S$ and $n \le |S|$, we define $(S)_n$ as the subset of $S^n$ consisting of the $n$-tuples whose coordinates are all different. We also define $S^* := \bigcup_{n=0}^{\infty} S^n$ and $(S)^* := \bigcup_{n \le |S|} (S)_n$. We use the convention that over-lined variables, like $\overline{x}$, denote ordered tuples of arbitrary length. Given an ordered tuple $\overline{x}$ we define $\mathrm{len}(\overline{x})$ as its length. Given a tuple $\overline{x}$ and an element $x$ the expression $x \in \overline{x}$ means that $x$ appears as some coordinate of $\overline{x}$. Given a map $f : X \to Y$ and an ordered tuple $\overline{x} := (x_1, \dots, x_a) \in X^*$ we define $f(\overline{x}) \in Y^*$ as the tuple $(f(x_1), \dots, f(x_a))$. Given two tuples $\overline{x}, \overline{y}$ we write $\overline{x}\,\overline{y}$ to denote their concatenation. Given a set $S$ and elements $x_s$ for each $s \in S$ we write $\{x_s\}_{s \in S}$, or just $\{x_s\}_s$ when $S$ is understood, to denote the tuple indexed by $S$ which contains the element $x_s$ at the position given by $s$.
Let $S$ be a set, $a$ a positive natural number, and $\Phi$ a group of permutations of $[a]$. Then $\Phi$ acts naturally on $S^a$ in the following way: given $g \in \Phi$ and $\overline{x} := (x_1, \dots, x_a) \in S^a$ let $g\overline{x} := (x_{g(1)}, \dots, x_{g(a)})$. We denote by $S^a/\Phi$ the quotient of $S^a$ by this action. Given $\overline{x} := (x_1, \dots, x_a) \in S^a$ we denote its equivalence class in $S^a/\Phi$ by $[x_1, \dots, x_a]$ or $[\overline{x}]$. Thus, $[\overline{x}] = \{\, g\overline{x} \mid g \in \Phi \,\}$. The notations $\overline{x}$ and $(x_1, \dots, x_a)$ represent ordered tuples while $[\overline{x}]$ and $[x_1, \dots, x_a]$ denote ordered tuples modulo the action of some arbitrary group of permutations. Which group it is will depend on the ambient set where $[x_1, \dots, x_a]$ belongs and it should either be clear from context or not be relevant.
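To make the group action concrete, here is a small illustrative sketch of ours (the helper names `orbit` and `canonical` are hypothetical, and positions are 0-indexed) that computes the orbit of a tuple under a permutation group and picks a canonical representative of the class $[\overline{x}]$:

```python
from itertools import permutations

def orbit(x, perms):
    """Orbit of the tuple x under a set of permutations g of positions:
    g acts by (gx)_i = x_{g(i)} (0-indexed here)."""
    return {tuple(x[g[i]] for i in range(len(x))) for g in perms}

def canonical(x, perms):
    """A canonical representative of [x]: the lexicographic minimum of the orbit."""
    return min(orbit(x, perms))

# Full symmetric group on 3 positions: [x1, x2, x3] behaves like an unordered multiset.
S3 = list(permutations(range(3)))
assert canonical((2, 1, 2), S3) == (1, 2, 2)
```

Quotienting by a subgroup of the symmetric group (as in $S^a/\Phi$) simply shrinks the orbits accordingly.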
Given real functions over the natural numbers $f, g : \mathbb{N} \to \mathbb{R}$ the expressions $f = O(g)$, $f = o(g)$ and $f = \Theta(g)$ have their usual meaning. If $g(n) \neq 0$ for $n$ large enough we write $f \sim g$ if $\lim_{n\to\infty} f(n)/g(n) = 1$.

Probabilistic preliminaries
We assume familiarity with basic probability theory. We denote by $\mathrm{Poiss}_\lambda(n)$ the discrete probability mass function of a Poisson random variable with mean $\lambda$. That is, $\mathrm{Poiss}_\lambda(n) = e^{-\lambda} \lambda^n / n!$. We define $\mathrm{Poiss}_\lambda(\ge n) = 1 - \sum_{i=0}^{n-1} \mathrm{Poiss}_\lambda(i)$. Given some sequence of events $\{A_n\}_n$ we say that $A_n$ is satisfied asymptotically almost surely (a.a.s.) if $\Pr(A_n)$ tends to 1 as $n \to \infty$. Given a sequence of random variables $\{X_n\}_n$, the first moment method is an application of Markov's inequality that establishes that if $E[X_n]$ tends to zero as $n \to \infty$ then a.a.s. $X_n = 0$.
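For concreteness, the two quantities just defined can be sketched in a few lines of Python (an illustration of ours; `poiss` and `poiss_tail` are hypothetical names):

```python
import math

def poiss(lam: float, n: int) -> float:
    """Probability mass Poiss_lambda(n) = e^{-lambda} * lambda^n / n!."""
    return math.exp(-lam) * lam ** n / math.factorial(n)

def poiss_tail(lam: float, n: int) -> float:
    """Tail probability Poiss_lambda(>= n) = 1 - sum_{i < n} Poiss_lambda(i)."""
    return 1.0 - sum(poiss(lam, i) for i in range(n))

# First moment method in one line: Markov's inequality gives
# Pr(X_n >= 1) <= E[X_n], so E[X_n] -> 0 forces Pr(X_n = 0) -> 1.
```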
If $A, B$ are events we may write the conditional probability $\Pr(A \mid B)$ as $\Pr_B(A)$ to shorten some expressions. In this situation, given a random variable $X$ we write $E_B[X]$ to denote the conditional expectation of $X$ given the event $B$.
Our main tool for proving the convergence in distribution to Poisson variables is the next result, which can be found in [2, Theorem 1.23].
We use the following observation in order to compute the binomial moments of our random variables. Observation 1.1. Let $X_1, \dots, X_l$ be non-negative integer-valued random variables over the same probability space and let $r_1, \dots, r_l \in \mathbb{N}$. Suppose each $X_i$ is a sum of indicator random variables (i.e. variables that only take the values 0 and 1). Let the elements $\omega \in \Omega$ represent all the possible unordered choices of $r_i$ indicator variables $Y_{i,j}$ for each $i \in [l]$. Then $E\big[\prod_{i \in [l]} \binom{X_i}{r_i}\big] = \sum_{\omega \in \Omega} \Pr\big(\bigwedge_{Y \in \omega} Y = 1\big)$.

Logical preliminaries
We assume familiarity with first order logic (FO). We follow the convention that first order logic contains the equality symbol. Given a vocabulary $\sigma$ we denote by $FO[\sigma]$ the set of first order formulas of vocabulary $\sigma$. Given a relation symbol $R \in \sigma$ we denote by $\mathrm{ar}(R)$ the arity of $R$. Given a formula $\varphi \in FO[\sigma]$ we use the notation $\varphi(\overline{y})$ to express that $\overline{y}$ is a tuple of (different) variables which contains all free variables of $\varphi$ and none of its bound variables, but may contain variables which do not appear in $\varphi$. Formulas with no free variables are called sentences and formulas with no quantifiers are called open formulas. The quantifier rank of a formula $\varphi$, written $\mathrm{qr}(\varphi)$, is the maximum number of nested quantifiers in $\varphi$. We call any consistent open formula that contains no occurrence of the equality symbol '=' an edge sentence.

Structures as multi-hypergraphs
For the rest of the article consider fixed: • A relational vocabulary σ such that all the relations R ∈ σ satisfy ar(R) ≥ 2.
• Groups $\{\Phi_R\}_{R\in\sigma}$ such that each $\Phi_R$ consists of permutations of $[\mathrm{ar}(R)]$ with the usual composition as its operation.
• Sets $\{P_R\}_{R\in\sigma}$ with $P_R \subseteq \binom{[\mathrm{ar}(R)]}{2}$ for all $R \in \sigma$.
We define $\mathcal{C}$ as the class of $\sigma$-structures that satisfy the following axioms: • Symmetry axioms: for each $R \in \sigma$ and $g \in \Phi_R$: $\forall \overline{x}\, \big( R(\overline{x}) \leftrightarrow R(g\overline{x}) \big)$. • Anti-reflexivity axioms: for each $R \in \sigma$ and $\{i, j\} \in P_R$: $\forall \overline{x}\, \big( R(\overline{x}) \rightarrow x_i \neq x_j \big)$. Structures in $\mathcal{C}$ generalize the usual notion of a hypergraph in the sense that they contain multiple "adjacency" relations with arbitrary symmetry and anti-reflexivity axioms.
We use the usual graph theory nomenclature and notation with some minor changes. Within the scope of this article, hypergraphs are structures in $\mathcal{C}$. Given a hypergraph $G$, its vertex set $V(G)$ is its universe.
In order to define the edge sets of $G$ we need the following auxiliary definition. Definition 1.1. Let $V$ be a set, and let $R \in \sigma$. We define the set of possible edges over $V$ given by $R$ as $E_R[V] := \big\{\, [\overline{v}] \in V^{\mathrm{ar}(R)}/\Phi_R \;\big|\; v_i \neq v_j \text{ for all } \{i,j\} \in P_R \,\big\}$. We call the elements of $E_R[V]$ edges, and we say that the sort of an edge $[\overline{v}] \in E_R[V]$ is $R$. That is, $E_R[V]$ contains all the "$\mathrm{ar}(R)$-tuples of elements of $V$ modulo the permutations in $\Phi_R$" excluding those that contain some repetition of elements in the positions given by $P_R$.
Let $G$ be a hypergraph with vertex set $V$ and let $R \in \sigma$ be a relation. We define the edge set of $G$ given by $R$, denoted by $E_R(G)$, as the set of edges $[\overline{v}] \in E_R[V]$ such that $\overline{v} \in R^G$. We define the total edge set of $G$ as $E(G) := \bigcup_{R\in\sigma} E_R(G)$. Given an edge $e \in E(G)$ we denote by $V(e)$ the set of all vertices that participate in $e$.
Clearly a hypergraph G is completely given by its vertex set V (G) and its edge set E(G). Notice that edges e ∈ E(G) are sorted according to the relation they represent. The size of G, written as |G|, is its number of vertices.
Given two hypergraphs $H$ and $G$ we say that $H$ is a sub-hypergraph of $G$, written $H \subseteq G$, if $V(H) \subseteq V(G)$ and $E_R(H) \subseteq E_R(G)$ for each $R \in \sigma$. The excess of a hypergraph $G$ is defined as $\mathrm{ex}(G) := \sum_{e \in E(G)} \big( |V(e)| - 1 \big) - |V(G)|$. That is, the excess of $G$ is the "weighted number of edges" minus its number of vertices. A hypergraph $G$ is connected if for any two vertices $v, u \in V(G)$ there is a sequence of edges $e_1, \dots, e_m \in E(G)$ such that $v \in V(e_1)$, $u \in V(e_m)$ and for each $i \in [m-1]$, $V(e_i) \cap V(e_{i+1}) \neq \emptyset$. It holds that $\mathrm{ex}(G) \ge -1$ for any connected hypergraph $G$.
Given a hypergraph $G$ we define the following metric $d$ over $V(G)$: for $v, u \in V(G)$, $d_G(v, u)$ is the minimum number $m$ of edges $e_1, \dots, e_m \in E(G)$ with $v \in V(e_1)$, $u \in V(e_m)$ and $V(e_i) \cap V(e_{i+1}) \neq \emptyset$ for each $i \in [m-1]$. That is, the distance between $v$ and $u$ is the minimum number of edges necessary to connect $v$ and $u$. If no such number exists we define $d_G(u, v) := \infty$. When $G$ is understood or not relevant we simply write $d$ instead of $d_G$. Equivalently, the distance $d$ coincides with the usual graph distance over the Gaifman graph of the structure $G$. The diameter of a hypergraph is the maximum distance between any pair of its vertices. We extend the distance $d$ to sets and tuples of vertices in the natural way. Given a vertex/set/tuple $X$ and a number $r \in \mathbb{N}$ we define the neighborhood $N_G(X; r)$, or simply $N(X; r)$ when $G$ is not relevant, as the set of vertices $v$ such that $d_G(X, v) \le r$.
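The distance just described can be computed by a breadth-first search on the Gaifman graph. The following sketch of ours (edges are simply given as sets of vertices, ignoring sorts and multiplicities) is one way to realize it:

```python
from collections import deque

def gaifman_distance(edges, source, target):
    """d_G(source, target): minimum number of edges in a chain connecting
    them, where consecutive edges must share a vertex. `edges` is a list
    of sets of vertices (the sets V(e))."""
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for e in edges:
            if v in e:  # one step: move to any vertex sharing an edge with v
                for u in e:
                    if u not in dist:
                        dist[u] = dist[v] + 1
                        if u == target:
                            return dist[u]
                        queue.append(u)
    return float("inf")  # source and target lie in different components
```

Scanning all edges per vertex is quadratic but suffices for illustration; an index from vertices to incident edges would make it linear-time.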
A connected hypergraph $G$ is a path between two of its vertices $v, u \in V(G)$ if $G$ does not contain any connected proper sub-hypergraph containing both $v$ and $u$. A connected hypergraph $G$ is a tree if $\mathrm{ex}(G) = -1$ and dense if $\mathrm{ex}(G) > 0$. A hypergraph is called $r$-sparse if it does not contain any dense sub-hypergraph $H$ such that $\mathrm{diam}(H) \le r$. A connected hypergraph $G$ with $\mathrm{ex}(G) \ge 0$ is called saturated if for any non-empty proper sub-hypergraph $H \subset G$ it holds that $\mathrm{ex}(H) < \mathrm{ex}(G)$. A connected hypergraph $G$ with $\mathrm{ex}(G) = 0$ is called a unicycle. A saturated unicycle is called a cycle. We say that an edge $e := [\overline{v}]$ contains a loop if some vertex appears in $\overline{v}$ more than once.
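These excess-based classes can be checked mechanically. A minimal sketch of ours, assuming the standard weighting in which an edge $e$ contributes $|V(e)| - 1$, so that every tree has excess $-1$:

```python
def excess(vertices, edges):
    """ex(G): the weighted number of edges minus the number of vertices,
    where an edge e is weighted by |V(e)| - 1 (assumed standard weighting)."""
    return sum(len(e) - 1 for e in edges) - len(vertices)

def classify_connected(vertices, edges):
    """Classify a *connected* hypergraph by its excess:
    -1 -> tree, 0 -> unicycle, > 0 -> dense."""
    ex = excess(vertices, edges)
    if ex == -1:
        return "tree"
    if ex == 0:
        return "unicycle"
    return "dense"

# A single 3-edge is already a tree; a graph triangle is a unicycle.
```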
A rooted tree (T, v) is a tree T with a distinguished vertex v ∈ V (T ) called its root. We usually omit the root when it is not relevant and write just T instead of (T, v). The initial edges of a rooted tree (T, v) are the edges in T that contain v. We define the radius of a rooted tree as the maximum distance between its root and any other vertex.
Let Σ be a set. A Σ-hypergraph is a pair (H, χ) where H is a hypergraph and χ : V (H) → Σ is a map called a Σ-coloring of H.
Isomorphisms between hypergraphs are defined as isomorphisms between relational structures. Isomorphisms between Σ-hypergraphs are just isomorphisms between the underlying hypergraphs that also preserve their colorings. In both cases we denote the isomorphism relation by . Given a hypergraph H, resp. a Σ-hypergraph (H, χ), an automorphism of H, resp. (H, χ), is an isomorphism from H, resp. (H, χ), to itself. We denote by aut(H), resp. aut(H, χ), the number of such automorphisms.
Let $H$ be a hypergraph and let $V$ be a set. We define the set of copies of $H$ over $V$ as the set of hypergraphs $H'$ with $V(H') \subseteq V$ such that $H' \simeq H$.

Ehrenfeucht-Fraïssé games
We assume familiarity with Ehrenfeucht-Fraïssé (EF) games. An introduction to the subject can be found for instance in [5, Section 2]. Given hypergraphs $H_1$ and $H_2$ we denote the $k$-round EF game played on $H_1$ and $H_2$ by $\mathrm{Ehr}_k(H_1; H_2)$. The following is satisfied: Duplicator wins $\mathrm{Ehr}_k(H_1; H_2)$ if and only if $H_1$ and $H_2$ satisfy the same FO sentences of quantifier rank at most $k$. Given lists $\overline{v} \in V(H_1)^*$ and $\overline{u} \in V(H_2)^*$ of the same length, we denote the $k$-round Ehrenfeucht-Fraïssé game on $H_1$ and $H_2$ with initial position given by $\overline{v}$ and $\overline{u}$ by $\mathrm{Ehr}_k(H_1, \overline{v}; H_2, \overline{u})$.
We also define the $k$-round distance Ehrenfeucht-Fraïssé game on $H_1$ and $H_2$, denoted by $\mathrm{dEhr}_k(H_1; H_2)$, the same way as $\mathrm{Ehr}_k(H_1; H_2)$, but now in order for Duplicator to win the game the following additional condition has to be satisfied at the end: for any rounds $s$ and $t$, $d_{H_1}(v_s, v_t) = d_{H_2}(u_s, u_t)$, where $v_s$ and $u_s$ denote the vertices played on $H_1$, resp. $H_2$, in the $s$-th round of the game. Given lists of vertices $\overline{v} \in V(H_1)^*$ and $\overline{u} \in V(H_2)^*$ of the same length, we define the game $\mathrm{dEhr}_k(H_1, \overline{v}; H_2, \overline{u})$ analogously to $\mathrm{Ehr}_k(H_1, \overline{v}; H_2, \overline{u})$.
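As a toy illustration of EF games on ordinary graphs (binary edges only, and without the distance condition), the following brute-force sketch of ours decides whether Duplicator wins the $k$-round game; it is exponential in $k$ and only meant for very small structures:

```python
from itertools import product

def partial_iso(G1, G2, vs, us):
    """The assignment vs -> us is a partial isomorphism: it preserves
    equality and (directed) edges in both directions."""
    for (a, x), (b, y) in product(zip(vs, us), repeat=2):
        if (a == b) != (x == y):
            return False
        if ((a, b) in G1["edges"]) != ((x, y) in G2["edges"]):
            return False
    return True

def duplicator_wins(G1, G2, k, vs=(), us=()):
    """Duplicator wins the k-round EF game Ehr_k(G1, vs; G2, us)."""
    if not partial_iso(G1, G2, vs, us):
        return False
    if k == 0:
        return True
    # Spoiler may move in either structure; Duplicator needs a reply.
    for v in G1["V"]:
        if not any(duplicator_wins(G1, G2, k - 1, vs + (v,), us + (u,))
                   for u in G2["V"]):
            return False
    for u in G2["V"]:
        if not any(duplicator_wins(G1, G2, k - 1, vs + (v,), us + (u,))
                   for v in G1["V"]):
            return False
    return True

# Classic example: the complete graphs K2 and K3 agree on all sentences
# of quantifier rank 2, but rank 3 can count three distinct vertices.
K2 = {"V": (0, 1), "edges": {(0, 1), (1, 0)}}
K3 = {"V": (0, 1, 2),
      "edges": {(a, b) for a in range(3) for b in range(3) if a != b}}
```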

The random model
For each $R \in \sigma$ let $p_R$ be a real number between zero and one. The random model $G_{\mathcal{C}}\big(n, \{p_R\}_{R\in\sigma}\big)$ is the discrete probability space that assigns to each hypergraph $G$ whose vertex set $V(G)$ is $[n]$ the probability $\Pr(G) = \prod_{R\in\sigma} p_R^{|E_R(G)|} (1 - p_R)^{|E_R[n]| - |E_R(G)|}$. Equivalently, this is the probability space obtained by adding each edge $e \in E_R[n]$ with probability $p_R$, independently for each $R \in \sigma$.
As in the case of Lynch's theorem, we are interested in the "sparse regime" of $G_{\mathcal{C}}\big(n, \{p_R\}_{R\in\sigma}\big)$, where the expected number of edges of each sort is linear. This is achieved when for each $R \in \sigma$ we have $p_R := p_R(n) \sim \beta_R / n^{\mathrm{ar}(R)-1}$ for some parameter $\beta_R > 0$. In this case we write $G_n\big(\{\beta_R\}_{R\in\sigma}\big)$, or simply $G_n$, for the resulting random hypergraph.
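As an illustration of the sparse regime, here is a sketch of ours for the simplest setting: a single relation of arity $d$ that is fully symmetric and fully anti-reflexive, so that the possible edges are exactly the $d$-subsets of $[n]$:

```python
import random
from itertools import combinations

def sample_sparse(n, beta, d=3, seed=0):
    """Sample one sparse random d-uniform hypergraph: each d-subset of
    {0, ..., n-1} is kept independently with probability
    p = beta / n^(d-1), so the expected number of edges is linear in n."""
    rng = random.Random(seed)
    p = beta / n ** (d - 1)
    return [e for e in combinations(range(n), d) if rng.random() < p]

edges = sample_sparse(40, beta=2.0)
# Expected number of edges: C(40, 3) * 2 / 40^2 = 9880 * 0.00125 ~= 12.35,
# i.e. roughly (beta / d!) * n, linear in n as intended.
```

Relations with only partial symmetry or anti-reflexivity would instead enumerate tuples modulo $\Phi_R$ subject to the constraints in $P_R$.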

Main definitions
Our main definitions follow closely the ones in [10], adapted to the context of hypergraphs. When $\overline{v}$ is empty we simply write $\mathrm{Center}(H)$. Definition 1.3. Let $H$ be a hypergraph, $\overline{v} \in V(H)^*$ and $r \in \mathbb{N}$. Let $X$ be the set of vertices $v \in V(H)$ that either belong to $\overline{v}$ or belong to some saturated sub-hypergraph of $H$ with diameter at most $2r + 1$. We define $\mathrm{Core}(H, \overline{v}; r)$ as $N(X; r)$. If $\overline{v}$ is empty we write $\mathrm{Core}(H; r)$. We say that $H$ is $r$-simple if all connected components of $\mathrm{Core}(H; r)$ are unicycles.
Then we define $\mathrm{Tr}(H, \overline{v}; v)$ as the tree $H[X]$ with $v$ as its root. That is, $\mathrm{Tr}(H, \overline{v}; v)$ is the tree formed by all vertices whose only path to $\mathrm{Center}(H, \overline{v})$ contains $v$. One can easily check that $H[X]$ is indeed a tree: if it were not then it would contain some saturated sub-hypergraph, leading to a contradiction. Given $r \in \mathbb{N}$ we define $\mathrm{Tr}(H, \overline{v}; v; r)$ as $\mathrm{Tr}(\mathrm{Core}(H, \overline{v}; r), \overline{v}; v)$. In the case that $\overline{v}$ is the empty list we simply write $\mathrm{Tr}(H; v)$ or $\mathrm{Tr}(H; v; r)$.
For any $k \in \mathbb{N}$ we define an equivalence relation over rooted trees which generalizes both the relation of "$k$-morphism" as defined in [10] and the notion of "$(k, r)$-values" defined in [11]. Definition 1.5. Fix a natural number $k$. We define the $k$-equivalence relation over rooted trees, written $\sim_k$, by induction over their radii as follows: • Any two trees with radius zero are $k$-equivalent. Notice that those trees consist of only one vertex: their respective roots.
• Let $r > 0$. Suppose the $k$-equivalence relation has been defined for rooted trees with radius at most $r - 1$. Let $\Sigma_{k,r-1}$ be the set consisting of the $\sim_k$ classes of trees with radius at most $r - 1$. Let $\rho$ be a special symbol called the root symbol. Set $\overline{\Sigma}_{k,r-1} := \Sigma_{k,r-1} \cup \{\rho\}$. Then a $(k, r)$-pattern is an isomorphism class of $\overline{\Sigma}_{k,r-1}$-hypergraphs $(e, \tau)$ that consist of only one edge with no loops and no isolated vertices, and satisfy $\tau(v) = \rho$ for exactly one vertex $v \in V(e)$. We denote by $P(k, r)$ the set of $(k, r)$-patterns.
Given a rooted tree $(T, v)$ of radius $r$ we define its canonical $k$-coloring as the map $\tau^k_{(T,v)}$ that assigns the root symbol $\rho$ to $v$ and assigns to every other vertex $u$ of an initial edge of $T$ the $\sim_k$ class of the rooted tree $(\mathrm{Tr}(T, v; u), u)$. Then two rooted trees $(T_1, v_1)$ and $(T_2, v_2)$ of radius $r$ are $k$-equivalent if for every $(k, r)$-pattern $\varpi$ the "quantity of initial edges $e_1 \in E(T_1)$ such that $(e_1, \tau^k_{(T_1,v_1)}) \in \varpi$" and the "quantity of initial edges $e_2 \in E(T_2)$ such that $(e_2, \tau^k_{(T_2,v_2)}) \in \varpi$" are equal or are both greater than $k - 1$.
The following is a way of characterizing ∼ k classes of rooted trees with radii at most r that will be useful later.
Observation 1.2. Let $\mathcal{T}$ be a $\sim_k$ class of rooted trees with radii at most $r$. Then there are a partition $E^1_{\mathcal{T}}, E^2_{\mathcal{T}}$ of $P(k, r)$ and natural numbers $a_\varpi < k$ for each $\varpi \in E^2_{\mathcal{T}}$, depending only on $\mathcal{T}$, such that a rooted tree $(T, v)$ belongs to $\mathcal{T}$ if and only if the following hold: (1) for any pattern $\varpi \in E^1_{\mathcal{T}}$ there are at least $k$ initial edges $e \in E(T)$ such that $(e, \tau^k_{(T,v)}) \in \varpi$, and (2) for any pattern $\varpi \in E^2_{\mathcal{T}}$ there are exactly $a_\varpi$ initial edges $e \in E(T)$ such that $(e, \tau^k_{(T,v)}) \in \varpi$.
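The truncated counting in this characterization can be mirrored for ordinary rooted trees (binary edges only). The sketch below (ours; `k_signature` is a hypothetical name) computes a capped-multiplicity signature: counts of identical child classes are truncated at $k$, in the spirit of the "equal or both at least $k$" clause of $\sim_k$:

```python
def k_signature(tree, root, k):
    """Capped-multiplicity signature of a rooted tree given as an adjacency
    dict {vertex: [neighbors]}. Multiplicities of identical child classes
    are truncated at k, so trees whose child-class counts agree up to the
    threshold k receive the same signature."""
    def sig(v, parent):
        child_sigs = [sig(u, v) for u in tree[v] if u != parent]
        counts = {}
        for s in child_sigs:
            counts[s] = min(k, counts.get(s, 0) + 1)
        return tuple(sorted(counts.items()))
    return sig(root, None)

def star(m):
    """A star: root 0 joined to m leaves (used below as an example)."""
    return {0: list(range(1, m + 1)), **{i: [0] for i in range(1, m + 1)}}
```

For instance, with $k = 2$ a star with 2 leaves and a star with 5 leaves get the same signature (both have "at least 2" leaves), while a star with a single leaf does not.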
From this characterization of the $\sim_k$ relation it follows, by induction over $r$, that the number of $\sim_k$ classes of trees with radii at most $r$ is finite, for any $r \in \mathbb{N}$. Definition 1.6. Let $k \in \mathbb{N}$. Given a non-tree connected hypergraph $H$, we define its canonical $k$-coloring $\tau^k_H$ as the one that assigns to each vertex $v \in V(H)$ the $\sim_k$ class of the tree $\mathrm{Tr}(H, v)$. Let $H_1$ and $H_2$ be connected hypergraphs which are not trees. Set $H_1' := \mathrm{Center}(H_1)$ and $H_2' := \mathrm{Center}(H_2)$. We say that $H_1$ and $H_2$ are $k$-equivalent, written $H_1 \sim_k H_2$, if $(H_1', \tau^k_{H_1})$ and $(H_2', \tau^k_{H_2})$ are isomorphic. Definition 1.7. Let $k, r \in \mathbb{N}$ and let $H_1$ and $H_2$ be hypergraphs. Let $H_1' := \mathrm{Core}(H_1; r)$ and $H_2' := \mathrm{Core}(H_2; r)$. We say that $H_1$ and $H_2$ are $(k, r)$-agreeable, written $H_1 \approx_{k,r} H_2$, if for any $\sim_k$ class $\mathcal{H}$ "the number of connected components of $H_1'$ that belong to $\mathcal{H}$" and "the number of connected components of $H_2'$ that belong to $\mathcal{H}$" are the same or are both greater than $k - 1$.
Definition 1.8. Let $k, r \in \mathbb{N}$ and let $\Sigma_{(k,r)}$ be the set of $\sim_k$ classes of rooted trees with radii at most $r$. Then a $(k, r)$-cycle is an isomorphism class of $\Sigma_{(k,r)}$-hypergraphs $(H, \tau)$ that are cycles of diameter at most $2r + 1$. We denote by $C(k, r)$ the set of $(k, r)$-cycles. Analogously to Observation 1.2, each $\approx_{k,r}$ class $O$ of $r$-simple hypergraphs is characterized by a partition $U^1_O, U^2_O$ of $C(k, r)$ together with numbers $a_\omega < k$ for $\omega \in U^2_O$: an $r$-simple hypergraph $G$ belongs to $O$ if and only if (1) for any $\omega \in U^1_O$ there are at least $k$ connected components $H \subseteq \mathrm{Core}(G; r)$ whose cycle $H' = \mathrm{Center}(H)$ satisfies that $(H', \tau^k_H) \in \omega$, and (2) for any $\omega \in U^2_O$ there are exactly $a_\omega$ connected components $H \subseteq \mathrm{Core}(G; r)$ whose cycle $H' = \mathrm{Center}(H)$ satisfies that $(H', \tau^k_H) \in \omega$. Definition 1.9. Let $H$ be a hypergraph and let $k, r \in \mathbb{N}$. Let $X \subseteq V(H)$ be the set of vertices of $H$ belonging to some saturated sub-hypergraph of diameter at most $2r + 1$. We say that $H$ is $(k, r)$-rich if for any $r' \le r$, vertices $v_1, \dots, v_k$ and $\sim_k$ class $\mathcal{T}$ of trees with radius at most $r'$ there exists a vertex $v \in V(H)$, not in $X$ and different from $v_1, \dots, v_k$, such that the rooted tree $\mathrm{Tr}(H; v; r')$ belongs to $\mathcal{T}$.

Main result and outline of the proof
Our goal is to prove the following theorem: for every FO sentence $\varphi$ the limit $\lim_{n\to\infty} \Pr\big( G_n(\{\beta_R\}_{R\in\sigma}) \models \varphi \big)$, as a function of the parameters $\{\beta_R\}_{R\in\sigma}$, is well defined and analytic.
In fact we prove something stronger. We show that the limit in the last theorem is given by an expression with parameters $\{\beta_R\}_R$ built using rational constants, sums, products and exponentiation with base $e$. We do so by giving a family of expressions which contains the ones that define limit probabilities of FO properties in $G_n(\{\beta_R\}_R)$.
The main arguments are similar to the ones in the proof of [10, Theorem 2.1], adapted to fit our context. As in that article the proof is divided into two parts: a model theoretic part and a probabilistic part. The main result of the first part is Theorem 2.1 below. With regards to the second part, the "landscape" of $G_n$ can be described similarly to that of $G(n, c/n)$ as in [13]: a.a.s., for any fixed radius $r$, all neighborhoods $N(v; r)$ in $G_n$ are trees or unicycles, so cycles in $G_n$ are far apart. One can find arbitrarily many copies of any fixed tree, while the expected number of copies of any fixed cycle is finite. The main probabilistic results are stated in Section 3.

Model theoretic results

Winning strategies for Duplicator
During this section $H_1$ and $H_2$ stand for hypergraphs, and we set $V_1 := V(H_1)$ and $V_2 := V(H_2)$. Given sets $X \subseteq V_1$, $Y \subseteq V_2$ and tuples $\overline{v} \in V_1^*$, $\overline{u} \in V_2^*$, we say the corresponding relation holds if $X$ and $Y$ can be ordered to form lists $\overline{w}$, resp. $\overline{z}$, such that $(H_1, \overline{w}\,\overline{v}) \equiv_{k,r} (H_2, \overline{z}\,\overline{u})$.
and "the number of Y i such that (H δ , Z) k,r (H 2 , Y i )" are both equal or are both greater than k − 1.
The main theorem of this section, which is a strengthening of [14, Theorem 2.6.7], is the following.
In order to prove this theorem we need to make two observations and prove a previous lemma.
Observation 2.2. Let $v \in V_1$ and $u \in V_2$ be the vertices played in the first round of an instance of the game $\mathrm{dEhr}_k(H_1, \overline{v}; H_2, \overline{u})$ where Duplicator is following a winning strategy. Then Duplicator also wins $\mathrm{dEhr}_{k-1}(H_1, \overline{v}_2; H_2, \overline{u}_2)$, where $\overline{v}_2 := \overline{v}\,v$ and $\overline{u}_2 := \overline{u}\,u$.
Let v ∈ V 1 and u ∈ V 2 be vertices played in the first round of an instance of where Duplicator is following a winning strategy. Further suppose that d(v, v) ≤ 2r + 1 (and in consequence d(u, u) ≤ 2r + 1 as well). Let v 2 := v v and u 2 := u u. Then Proof. Using Observation 2.2 we get that Duplicator wins Analogously we obtain N H 2 (u 2 ; r) = N H 2 (u 2 ; r), as we wanted.
Proof of Theorem 2.1. Let $X_1, \dots, X_a$ and $Y_1, \dots, Y_b$ be partitions of $X$ and $Y$ respectively as in the definition of $\cong_{k,r}$. Let $r_0 := (3^k - 1)/2$ and $r_i := (r_{i-1} - 1)/3$ for each $1 \le i \le k$. Let $v^1_i$ and $v^2_i$ be the vertices played in $H_1$ and $H_2$ respectively during the $i$-th round of $\mathrm{Ehr}_k(H_1; H_2)$. We show a winning strategy for Duplicator in $\mathrm{Ehr}_k(H_1; H_2)$. For each $0 \le i \le k$, Duplicator will keep track of some marked sets of vertices $S \subseteq V_1$, $T \subseteq V_2$. For $\delta = 1, 2$, each marked set $T \subseteq V_\delta$ will have an associated tuple of vertices $\overline{v}(T) \in V_\delta^*$ consisting of the vertices played in $H_\delta$ so far that were "appropriately close" to $T$ when chosen, ordered according to the rounds they were played in. The game will start with no sets of vertices marked, and at the end of the $i$-th round Duplicator will perform one of the two following operations: • Given two sets $S \subseteq V_1$, $T \subseteq V_2$ that were previously marked during the same round, append $v^1_i$ and $v^2_i$ to $\overline{v}(S)$ and $\overline{v}(T)$ respectively. • Mark two new sets $S \subseteq V_1$, $T \subseteq V_2$ (one in each structure) and set $\overline{v}(S) := v^1_i$ and $\overline{v}(T) := v^2_i$.
We show that Duplicator can play in such a way that at the end of each round the following are satisfied: (i) For $\delta = 1, 2$, each vertex $v^\delta_j \in V_\delta$ played so far belongs to $\overline{v}(S)$ for a unique marked set $S \subseteq V_\delta$.
(ii) Let $S \subseteq V_1$ and $T \subseteq V_2$ be sets marked during the same round. Then any previously played vertex $v^1_j$ occupies a position in $\overline{v}(S)$ if and only if $v^2_j$ occupies the same position in $\overline{v}(T)$.

(iii)
-Let $S \subseteq V_1$ be a marked set. Then for any different marked set $S' \subseteq V_1$ or any different set $S'$ among $X_1, \dots, X_a$ it holds that $d(S, S') > 2r_i + 1$.
-Let $T \subseteq V_2$ be a marked set. Then for any different marked set $T' \subseteq V_2$ or any different set $T'$ among $Y_1, \dots, Y_b$ it holds that $d(T, T') > 2r_i + 1$.
(iv) Let $S \subseteq V_1$, $T \subseteq V_2$ be sets marked during the same round. Then In particular, if conditions (i) to (iv) are satisfied this means that if $\overline{v}^1 := (v^1_1, \dots, v^1_i)$ and $\overline{v}^2 := (v^2_1, \dots, v^2_i)$ are the vertices played so far then Duplicator wins And at the end of the $k$-th round Duplicator will have won $\mathrm{Ehr}_k(H_1; H_2)$. The game $\mathrm{dEhr}_k(H_1; H_2)$ proceeds as follows. Clearly properties (i) to (iv) hold at the beginning of the game. Suppose that Duplicator can play in such a way that properties (i) to (iv) hold until the beginning of the $i$-th round. Suppose during the $i$-th round Spoiler chooses $v^1_i \in V_1$ (the case where they play in $V_2$ is symmetric). There are three possible cases: • For some unique previously marked set $S \subseteq V_1$ we have $d(S \cup \overline{v}(S), v^1_i) \le 2r_i + 1$. In this case let $T \subseteq V_2$ be the set in $H_2$ marked in the same round as $S$. By hypothesis Then, by definition, for some orderings $\overline{w}, \overline{z}$ of the vertices in $S$ and $T$ respectively it holds that Duplicator wins Thus Duplicator can choose $v^2_i \in V_2$ according to the winning strategy in that game. After this Duplicator appends $v^1_i$ to $\overline{v}(S)$ and $v^2_i$ to $\overline{v}(T)$. • For all marked sets $S \subseteq V_1$ it holds that $d(S \cup \overline{v}(S), v^1_i) > 2r_i + 1$, but there is a unique set $S$ among $X_1, \dots, X_a$ such that $d(S, v^1_i) \le 2r_i + 1$. In this case it follows from condition (1) of the statement that there is some non-marked set $T$ among $Y_1, \dots, Y_b$ such that Thus, by definition, for some orderings $\overline{w}, \overline{z}$ of the vertices in $S$ and $T$ respectively, Duplicator wins $\mathrm{dEhr}_{k-i+1}\big( N(\overline{w}; 3r_i + 1), \overline{w};\; N(\overline{z}; 3r_i + 1), \overline{z} \big)$.
Then Duplicator can choose $v^2_i \in V_2$ according to a winning strategy for this game. After this Duplicator marks both $S$ and $T$ and sets $\overline{v}(S) := v^1_i$ and $\overline{v}(T) := v^2_i$. Notice that because of Lemma 2.1 now $(H_1, (S, \overline{v}(S))) \equiv_{k-i,\, r_i} (H_2, (T, \overline{v}(T)))$.
• For all marked sets $S \subseteq V_1$ we have $d(S \cup \overline{v}(S), v^1_i) > 2r_i + 1$, and for all sets $S$ among $X_1, \dots, X_a$ it also holds that $d(S, v^1_i) > 2r_i + 1$. In this case it follows from condition (2) of the statement that Duplicator can choose $v^2_i$ accordingly. The fact that conditions (i) to (iv) still hold at the end of the round follows from comparing $r_{i-1}$ and $r_i$ as well as applying Observation 2.1 and Observation 2.2.

k-Equivalent trees
We want to prove the following. Theorem 2.2. Let $(T_1, v_1)$ and $(T_2, v_2)$ be rooted trees such that $(T_1, v_1) \sim_k (T_2, v_2)$. Then Duplicator wins $\mathrm{dEhr}_k(T_1, v_1; T_2, v_2)$.
Before proceeding with the proof we need an auxiliary result. Let $(T, v)$ be a rooted tree and $e$ an initial edge of $T$. We define $\mathrm{Tr}(T, v; e)$ as the induced tree $T[X]$ on the set $X := \{v\} \cup \{\, u \in V(T) \mid d(v, u) = 1 + d(e, u) \,\}$, with $v$ as the root. In other words, $\mathrm{Tr}(T, v; e)$ is the tree consisting of $v$ and all the vertices of $T$ whose only path to $v$ contains $e$. Lemma 2.2. Let $k \in \mathbb{N}$ and fix $r > 0$. Suppose Theorem 2.2 holds for rooted trees with radii at most $r$. Let $(T_1, v_1)$ and $(T_2, v_2)$ be rooted trees with radius $r + 1$. Let $\tau^k_{(T_1,v_1)}$ and $\tau^k_{(T_2,v_2)}$ be colorings over $T_1$ and $T_2$ as in Definition 1.5. Let $e_1$ and $e_2$ be initial edges of $T_1$ and $T_2$ respectively satisfying $(e_1, \tau^k_{(T_1,v_1)}) \simeq (e_2, \tau^k_{(T_2,v_2)})$. Name $T_1' := \mathrm{Tr}(T_1, v_1; e_1)$ and $T_2' := \mathrm{Tr}(T_2, v_2; e_2)$. Then Duplicator wins $\mathrm{dEhr}_k(T_1', v_1; T_2', v_2)$.
Proof. We show a winning strategy for Duplicator. At the beginning of the game fix an isomorphism $f : V(e_1) \to V(e_2)$ between $(e_1, \tau^k_{(T_1,v_1)})$ and $(e_2, \tau^k_{(T_2,v_2)})$. Suppose in the $i$-th round of the game Spoiler plays on $T_1'$; the other case is symmetric. If Spoiler plays $v_1$ then Duplicator chooses $v_2$. Otherwise, Spoiler plays a vertex $v$ that belongs to $\mathrm{Tr}(T_1, v_1; u)$ for a unique $u \in V(e_1)$ different from the root $v_1$. Set $T_1'' := \mathrm{Tr}(T_1, v_1; u)$ and $T_2'' := \mathrm{Tr}(T_2, v_2; f(u))$. Then, as $\tau^k_{(T_1,v_1)}(u) = \tau^k_{(T_2,v_2)}(f(u))$, we obtain $(T_1'', u) \sim_k (T_2'', f(u))$. As both these trees have radii at most $r$, by assumption Duplicator has a winning strategy in $\mathrm{dEhr}_k(T_1'', u; T_2'', f(u))$ and they can follow it, taking into account the previous plays in $T_1''$ and $T_2''$.

Proof of Theorem 2.2.
Notice that, as $(T_1, v_1) \sim_k (T_2, v_2)$, both $T_1$ and $T_2$ have the same radius $r$. We prove the result by induction on $r$. If $r = 0$ then both $T_1$ and $T_2$ consist of only one vertex and we are done. Now let $r > 0$ and assume that the statement is true for all smaller values of $r$. Let $\tau^k_{(T_1,v_1)}$ and $\tau^k_{(T_2,v_2)}$ be the colorings over $T_1$ and $T_2$ as in Definition 1.5. We show that there is a winning strategy for Duplicator in $\mathrm{dEhr}_k(T_1, v_1; T_2, v_2)$. At the start of the game, set all the initial edges of $T_1$ and $T_2$ as non-marked. Suppose in the $i$-th round Spoiler plays in $T_1$; the other case is symmetric. If Spoiler plays $v_1$ then Duplicator plays $v_2$. Otherwise, the vertex played by Spoiler belongs to $\mathrm{Tr}(T_1, v_1; e_1)$ for a unique initial edge $e_1$ of $T_1$. There are two possibilities: • If $e_1$ is not marked yet, mark it. In this case, there is a non-marked initial edge $e_2$ of $T_2$ satisfying $(e_1, \tau^k_{(T_1,v_1)}) \simeq (e_2, \tau^k_{(T_2,v_2)})$. Mark $e_2$ as well. Set $T_1' := \mathrm{Tr}(T_1, v_1; e_1)$ and $T_2' := \mathrm{Tr}(T_2, v_2; e_2)$. Because of Lemma 2.2, Duplicator has a winning strategy in $\mathrm{dEhr}_k(T_1', v_1; T_2', v_2)$ and can play according to it.
• If $e_1$ is already marked then there is a unique initial edge $e_2$ of $T_2$ that was marked during the same round as $e_1$, and it satisfies $(e_1, \tau^k_{(T_1,v_1)}) \simeq (e_2, \tau^k_{(T_2,v_2)})$. Again, because of Lemma 2.2, Duplicator has a winning strategy in $\mathrm{dEhr}_k(T_1', v_1; T_2', v_2)$, where $T_1' := \mathrm{Tr}(T_1, v_1; e_1)$ and $T_2' := \mathrm{Tr}(T_2, v_2; e_2)$, and can continue playing according to it taking into account the plays made previously in $T_1'$ and $T_2'$. Proof. The winning strategy for Duplicator is as follows. Suppose at the beginning of the $i$-th round Spoiler plays in $H_1$ (the case where they play in $H_2$ is symmetric). Then Spoiler has chosen a vertex that belongs to $\mathrm{Tr}(H_1; u)$ for a unique $u \in H_1'$. Set $T_1 := \mathrm{Tr}(H_1; u)$ and $T_2 := \mathrm{Tr}(H_2; f(u))$. By hypothesis $(T_1, u) \sim_k (T_2, f(u))$. Then because of Theorem 2.2 Duplicator has a winning strategy in $\mathrm{dEhr}_k(T_1, u; T_2, f(u))$, and they can follow it taking into account the previous moves made in $T_1$ and $T_2$, if any. In particular, if Spoiler has chosen $u$ then Duplicator will necessarily choose $f(u)$. One can easily check that distances are preserved by this strategy. Lemma 2.3. Let $k, r \in \mathbb{N}$ and let $H_1, H_2$ be hypergraphs such that $H_1 \approx_{k,r} H_2$. Let $X$ and $Y$ be the sets of vertices of $H_1$, resp. $H_2$, that belong to a saturated sub-hypergraph of diameter at most $2r + 1$. Then $(H_1, X) \cong_{k,r} (H_2, Y)$ in the sense of Definition 2.2.

Main result
Proof. Let X 1 , . . . , X a and Y 1 , . . . , Y b be partitions of X and Y such that each N (X i ; r) and N (Y i ; r) is a connected component of Core (H 1 ; r), resp. Core (H 2 ; r). Because of The-  (H 1 , H 2 ).
Proof. Because of the previous lemma we can apply Theorem 2.1 with X ⊂ V (H 1 ) and Y ⊂ V (H 2 ) defined as before. The hypothesis of (k, r)-richness on both H 1 and H 2 ensures that condition (2)
In this case we successively cut edges e from G such that d(e, H) is the maximum possible (notice that this always yields a connected hypergraph) until we obtain a hypergraph G′ with ex(G′) < ex(G). Let e be the edge that was cut last.

Substituting in the first equation we get
, and let P 1 , P 2 be paths of length at most r that join H with v 1 and v 2 respectively in G′. Then the hypergraph H′ := H ∪ e ∪ P 1 ∪ P 2 satisfies the conditions in the statement.
Proof. Apply the previous lemma twice starting with G and taking as H a sub-hypergraph of G consisting of a single vertex and no edges.
In particular, if we define l := max R∈σ ar(R), the last lemma implies that if G is a dense hypergraph whose diameter is at most r, then G contains a dense sub-hypergraph H with |H| ≤ l(4r + 2).
Theorem 3.1. Let r ∈ N. Then a.a.s G n is r-sparse.
Proof. Because of the last lemma there is a constant R such that "G does not contain dense hypergraphs of size bounded by R" implies that "G is r-sparse". Thus, lim n→∞ Pr (G n is r-sparse) ≥ lim n→∞ Pr (G n does not contain dense hypergraphs of size ≤ R) .
Because of Lemma 3.2, given a fixed dense hypergraph, the probability that G n contains no copies of it tends to 1 as n goes to infinity. Using that there is a finite number of ∼ classes of dense hypergraphs whose size is bounded by R, we deduce that the RHS of the last inequality tends to 1.
As a corollary we obtain the needed result.
Proof. If some connected component of Core(G n ; r) is not a cycle then either G n contains a dense hypergraph of diameter at most 4r + 1, or G n contains two cycles of diameter at most 2r + 1 that are at distance at most 2r + 1. In the second case, considering the two cycles and the path joining them, G n contains a dense hypergraph of diameter bounded by 6r + 3. Hence the fact that G n is (6r + 3)-sparse implies that G n is r-simple. Because of the previous theorem G n is a.a.s (6r + 3)-sparse and the result follows.
Proof. An application of the first moment method together with Lemma 3.3 and the fact that there is a finite number of classes of paths whose length is at most 2r + 1 implies that a.a.s the N (v; r) are disjoint. Also, because of Theorem 3.1 a.a.s the N (v; r) are either trees or unicycles. But if any of the N (v; r) were a unicycle then in G n there would exist a path P of length at most 2r + 1 joining some vertex v ∈ v with a cycle C of diameter at most 2r + 1. Using Lemma 3.3 again, as well as the fact that there is a finite number of possible classes for P ∪ C, we obtain that a.a.s no such P and C exist. In consequence all the N (v; r) are disjoint trees, as we wanted to prove.
Lemma 3.7. Let v ⊂ N * be a finite set of fixed vertices and let π(x) be an edge sentence such that len(x) = len(v). Define G′ n = G n \ E[v] (i.e. G n minus all the edges induced on v). Fix r ∈ N. Then a.a.s for all vertices v ∈ v the neighborhoods N G′ n (v; r) are disjoint trees.
Proof. Let A n be the event that the N G′ n (v; r) are disjoint trees. Notice that A n does not concern the possible edges induced over v. Because edges are independent in our random model, we have that Pr (A n | π(v)) = Pr(A n ). Now the result follows from Lemma 3.6 using that G′ n ⊂ G n .
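In the special case of ordinary graphs (σ consisting of a single binary symmetric anti-reflexive relation, so that G n is the classical binomial graph G(n, c/n)), the tree-like local structure asserted by Lemma 3.6 is easy to observe empirically. The following sketch is only an illustration: the parameters and function names are our own choices, not part of the text.

```python
import random
from collections import deque, defaultdict

def sparse_random_graph(n, c, rng):
    """Binomial random graph G(n, p) with p = c/n: each of the
    n*(n-1)/2 possible edges is present independently with probability p."""
    adj = defaultdict(set)
    p = c / n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def ball(adj, v, r):
    """Vertex set of the neighborhood N(v; r), via breadth-first search."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] < r:
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
    return set(dist)

def induces_tree(adj, verts):
    """A connected induced subgraph is a tree iff it has |verts| - 1 edges."""
    edges = sum(1 for u in verts for w in adj[u] if w in verts) // 2
    return edges == len(verts) - 1

rng = random.Random(0)
n, c, r = 1000, 1.0, 2
adj = sparse_random_graph(n, c, rng)
tree_fraction = sum(induces_tree(adj, ball(adj, v, r)) for v in range(n)) / n
print(tree_fraction)  # close to 1 in the sparse regime
```

With n = 1000 and c = 1 the fraction of vertices whose 2-neighborhood is a tree is typically well above 0.9, consistent with the a.a.s statements above; the rare exceptions come from the O(1) short cycles present in a sparse random graph.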
(2) Let u ∈ (N) * , and let π(x) ∈ F O[σ] be a consistent edge sentence such that len(x) = len(u). Let v ∈ (N) * be vertices contained in u. For each v ∈ v let T v be a k-equivalence class of trees with radii at most r. Then
We devote the rest of this section to proving this theorem. The proof is by induction on r. Recall that all trees with radius zero are k-equivalent. Thus, the limits appearing in conditions (1) and (2) are both equal to 1 in the case r = 0.
Definition 3.3. Let k ∈ N and r > 0. Suppose that Theorem 3.3 holds for r − 1. Given a (k, r)-pattern we define the expressions λ r, and µ r, as follows. Let (e, τ ) be a representative of whose root is v. Then for all vertices u ∈ V (e) such that u ≠ v it holds that τ (u) is a ∼ k class of trees with radius at most r and we can set λ r, = ∏ u≠v Pr [ r − 1, τ (u) ] , and µ r, = (β R(e) / aut( )) λ r, .
Clearly the definitions of λ r, and µ r, are independent of the chosen representative. By hypothesis, µ r, is positive for all values of {β R } R∈σ ∈ (0, ∞) |σ| and it is an expression belonging to M .
Lemma 3.9. Let k ∈ N, r > 0 and u ∈ (N) * . Let π(x) ∈ F O[σ] be a consistent edge sentence such that len(x) = len(u). Let v ∈ (N) * be vertices contained in u. For each v ∈ v set T n,v := Tr(G n , u; v; r). Given a pattern ∈ P (k, r) and v ∈ v we define the random variable X n,v, as the number of initial edges e ∈ E(T n,v ) such that (e, τ k (T n,v ,v) ) ∈ . Suppose that Theorem 3.3 holds for r − 1. Then the conditional distributions of the variables X n,v, given π(u) converge to independent Poisson distributions whose respective mean values are given by the µ r, .
Proof. To avoid excessively complex notation we prove only the case where v consists of a single vertex v; the general case is proven using the same arguments. Set T n := T n,v and X n, := X n,v, for all ∈ P (k, r). By Theorem 1.1, in order to prove the result it is enough to show that for any choice of natural numbers {b } ∈P (k,r) it holds that
Consider the numbers {b } ∈P (k,r) fixed. For each n ∈ N define
Informally, elements of Ω n represent choices of b possible initial edges of T n whose k-pattern is , for all (k, r)-patterns . Using Observation 1.1 we obtain
We say that a choice {E } ∈ Ω n is disjoint if the edges (e, τ ) ∈ ∪ ∈P (k,r) E satisfy that no vertex w ∈ u other than v belongs to any of those edges and each vertex w ∈ [n] \ {v} belongs to at most one of those edges. For each n ∈ N let Ω′ n ⊂ Ω n be the set of disjoint elements in Ω n and set Ω N = ∪ n∈N Ω′ n . If for some {E } ∈ Ω n we have that e ∈ E(T n ) for all (e, τ ) ∈ ∪ ∈P (k,r) E then {E } is necessarily disjoint. This is because T n is a tree and the only vertex in u that belongs to T n is v by definition. Thus, in the last sum it suffices to consider only the disjoint {E } . Because of the symmetry of the random model the probabilities in that sum are the same for all disjoint choices of {E } . Hence, if we fix {E } ∈ Ω N we obtain (2)
Set N := ∑ ∈P (k,r) (| | − 1)b . Counting vertices and automorphisms we get that (
Let w ∈ (N) * be a list containing exactly the vertices u ∈ V (e) for all e ∈ ∪ ∈P (k,r) E . Clearly, the event that e ∈ E(G n ) for all (e, τ ) ∈ ∪ ∈P (k,r) E can be described via an edge sentence whose variables are interpreted as vertices in w.
Let ψ(x) be one such edge sentence. This event is independent of π(u) because edges are independent in G n . Thus, a simple computation yields
Because of Lemma 3.7, a.a.s if e ∈ E(G n ) and v ∈ V (e), then e ∈ E(T n ). Thus,
The trees Tr(T n ; u) in the last probability coincide with Tr(G n , u w; u; r − 1) for all u. As a consequence, using the hypothesis that Theorem 3.3 holds for r − 1, we obtain
Combining this with Equations (2), (3) and (4) we obtain
This proves Equation (11) and the statement.
The next lemma completes the proof of Theorem 3.3.
Lemma 3.10. Let r > 0. Suppose that Theorem 3.3 holds for r − 1. Then it also holds for r.
Proof. Fix k ∈ N. We start by showing condition (1) of Theorem 3.3. Fix a ∼ k class T of trees with radius at most r, and fix a vertex v ∈ N as well. Set T n := Tr(G n , v; v; r). For each ∈ P (k, r) let X n, be the random variable that counts the number of initial edges in T n whose pattern is . Let
Using the previous lemma we obtain that the last limit equals the following expression:
Using the definition of the µ r, we obtain that the last expression belongs to Λ, as we wanted to prove. Furthermore, as the µ r, are positive, this expression is also positive for all values of {β R } R∈σ ∈ (0, ∞) |σ| . Now we proceed to prove condition (2). Let u, v, {T v } v∈v and π(x) be as in the statement of (2). Using the previous lemma we obtain that the events Tr(G n , u; v; r) ∈ T v for all v ∈ v are asymptotically independent and are also independent of π(u). Then the desired result follows from condition (1).

Almost all graphs are (k,r)-rich
Theorem 3.4. Let k, r ∈ N. Then a.a.s G n is (k, r)-rich.
Proof. Let Σ be the set of all ∼ k classes of rooted trees with radii at most r. Let m > k.
For each T ∈ Σ let v(T) ∈ (N) m be tuples satisfying that all the v(T) are disjoint. Let w ∈ (N) * be a concatenation of all the v(T). For each T ∈ Σ define X n,T as the number of vertices v ∈ v(T) such that Tr(G n , w; v; r) ∈ T. Because of Theorem 3.3 the ∼ k types of the trees Tr(G n , w; v; r) for all v ∈ w are asymptotically independent, and given any v ∈ w and T it holds that Pr(Tr(G n , w; v; r) ∈ T) tends to Pr[r, T] as n goes to infinity. Hence, the variables X n,T converge in distribution to independent binomial variables whose respective parameters are m and Pr[r, T]. That is, given natural numbers 0 ≤ l T ≤ m for all T ∈ Σ,
Also, for m large enough we have
Suppose that m is large enough for both Equations (5) and (6) to hold. Then lim n→∞ Pr (X n,T < k) ≤ ε for all T ∈ Σ.
We define A n as the event that for any v ∈ w we have N (v; r) ∩ Core(G n ; r) = ∅ (in particular this implies that N (v; r) is a tree), and for any two v 1 , v 2 ∈ w it is satisfied that d Gn (v 1 , v 2 ) > 2r + 1. If A n holds then for all v ∈ w we have that N (v; r) = Tr(G n , w; v; r) and the N (v; r) are disjoint trees. Thus, if both A n holds and X n,T ≥ k for all T then G n is (k, r)-rich. Because of Lemma 3.6 a.a.s A n holds, and we obtain lim n→∞ Pr ( G n is not (k, r)-rich ) ≤ lim n→∞ Pr ( ¬A n ) + ∑ T∈Σ lim n→∞ Pr ( X n,T < k ) .
As ε can be arbitrarily small given a suitable choice of m, we obtain that necessarily a.a.s G n is (k, r)-rich, as was to be proved.

Probabilities of cycles
Definition 3.4. We define Γ and Υ as the minimal families of expressions with arguments {β R } R∈σ that satisfy the following conditions: (1) given natural numbers a R for each R ∈ σ, a positive number b ∈ N and a λ ∈ Λ, the expression λ b ∏ R∈σ β R a R belongs to Γ, (2) given a γ ∈ Γ and an a ∈ N, the expressions Poiss γ (a) and Poiss γ (≥ a) both belong to Υ, and (3) if υ 1 , υ 2 ∈ Υ then υ 1 υ 2 ∈ Υ as well.
For each n ∈ N we define
Set N := ∑ O∈C(k,r) |O| b O . We have that
Let v ∈ (N) * be a list that contains exactly the vertices in ∪ O∈C(k,r) F O . Then the event can be written as an edge sentence concerning the vertices in v. Let ϕ(x) be one such sentence. We have that
Because of Theorem 3.2, a.a.s if some cycle H of diameter at most 2r + 1 satisfies H ⊂ G n then H G n . Hence,
As all the vertices v ∈ v belong to Core(G n ; r), the trees Tr(G n ; v; r) in the last probability coincide with Tr(G n , v; v; r). By Theorem 3.3 we have that
Combining this with Equations (8) to (10) we obtain
This proves Equation (7) and the statement.
Proof. For each O ∈ C(k, r) let X n,O be as in the previous lemma. Let O be as in Observation 1.3. Let A n be the event that G n is r-simple. Then
Because of Theorem 3.2, a.a.s A n holds. Thus, using the last lemma the previous limit equals the following expression:
As all the γ r,O belong to Γ, this last expression belongs to Υ and the theorem is proven.
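For orientation, in the graph case the Poisson parameters arising for cycle counts can be computed in closed form: G(n, c/n) contains n!/((n − k)! · 2k) potential k-cycles, each present with probability (c/n)^k, so the expected number of k-cycles tends to c^k/(2k). The following numerical check is our own illustration and not part of the proof; the function names are ours.

```python
import math

def expected_k_cycles(n, k, c):
    """Exact expectation of the number of k-cycles in G(n, c/n):
    there are n!/((n-k)! * 2k) labelled k-cycles on n vertices,
    and each one is present with probability (c/n)^k."""
    return math.perm(n, k) / (2 * k) * (c / n) ** k

def poisson_limit(k, c):
    """Limiting mean c^k / (2k) as n tends to infinity."""
    return c ** k / (2 * k)

for k in (3, 4, 5):
    print(k, expected_k_cycles(10**6, k, 2.0), poisson_limit(k, 2.0))
```

Already at n = 10^6 the exact expectations agree with the limiting means to roughly five decimal places, which is the (2k)-fold symmetry of a k-cycle showing up in the automorphism factor of the corresponding Γ-expression.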
is well defined and it is given by a finite sum of expressions in Υ.
Proof. Let k be the quantifier rank of φ and let r = 3 k . Let G n := G n ({β R } R∈σ ) and let Σ be the set of (k, r)-agreeability classes of r-simple hypergraphs. Because of Theorem 3.2 a.a.s G n is r-simple. Thus
Because the set Σ is finite, we can exchange the summation and the limit. By Theorem 3.4 a.a.s G n is (k, r)-rich. This together with Theorem 2.4 implies that for any O ∈ Σ, lim n→∞ Pr ( G n |= φ | G n ∈ O ) = 0 or 1.
Let Σ′ ⊂ Σ be the set of classes O for which the last limit equals 1. Then
Because of Theorem 3.5 we know that each of the limits inside the last sum exists and is given by an expression that belongs to Υ. As a consequence the theorem follows.

Application to random SAT
We define a binomial model of random CNF formulas in analogy with the one in [3]; the generality of Theorem 1.3, however, allows for many variants.
Definition 5.1. Given a variable x, both expressions x and ¬x are called literals. A clause is a set of literals. A clause C is called non-tautological if no variable x satisfies that both x and ¬x belong to C. An assignment over a set of variables X is a map f that assigns 0 or 1 to each variable of X. A clause C is satisfied by an assignment f if either there is some variable x such that x ∈ C and f (x) = 1 or there is some variable x such that ¬x ∈ C and f (x) = 0. Given l ∈ N, an l-CNF formula is a set of non-tautological clauses that contain exactly l literals. We say that a formula F on the variables x 1 , . . . , x n is satisfiable if there is an assignment f : {x 1 , . . . , x n } → {0, 1} that satisfies all clauses in F .
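The notions in Definition 5.1 can be made concrete with a small brute-force checker. This is only an illustration of the semantics: the encoding of a literal as a pair (variable, sign) and all function names are our own.

```python
from itertools import product

def satisfies(assignment, clause):
    """A clause (a set of literals (x, sign)) is satisfied by an assignment
    when some literal evaluates to true: sign 1 stands for x, sign 0 for ¬x."""
    return any(assignment[x] == sign for (x, sign) in clause)

def is_tautological(clause):
    """A clause is tautological if it contains both x and ¬x for some x."""
    return any((x, 1 - sign) in clause for (x, sign) in clause)

def is_satisfiable(formula, variables):
    """Brute-force search over all 2^|variables| assignments."""
    for bits in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(satisfies(assignment, clause) for clause in formula):
            return True
    return False

# (x ∨ ¬y) ∧ (¬x ∨ y) is satisfiable; x ∧ ¬x is not.
f1 = [frozenset({("x", 1), ("y", 0)}), frozenset({("x", 0), ("y", 1)})]
f2 = [frozenset({("x", 1)}), frozenset({("x", 0)})]
print(is_satisfiable(f1, ["x", "y"]), is_satisfiable(f2, ["x"]))  # True False
```

Since clauses are sets of literals over distinct variables, non-tautology is simply the absence of a complementary pair, as `is_tautological` checks.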
Given n, l ∈ N and a real number 0 ≤ p ≤ 1 we define the random model F (l, n, p) as the discrete probability space that assigns to each l-CNF formula F on the variables {x i } i∈[n] the probability $p^{|F|}(1-p)^{2^l\binom{n}{l}-|F|}$, where |F | is the number of clauses in F . Equivalently, a random formula in F (l, n, p) is obtained by choosing each of the $2^l\binom{n}{l}$ non-tautological clauses of size l on the variables {x i } i with probability p independently. When p is a function of n satisfying p(n) ∼ β/n^{l−1} we denote by F l n (β) a random sample of F (l, n, p(n)).
We consider l-CNF formulas, as defined above, as relational structures with a language σ consisting of l + 1 relation symbols R 0 , . . . , R l of arity l. We do that in such a way that the expression R j (x i 1 , . . . , x i l ) means that our formula contains the clause consisting of ¬x i 1 , . . . , ¬x i j and x i j+1 , . . . , x i l . The relations R 0 , . . . , R l satisfy the following axioms: (1) given 0 ≤ j ≤ l and variables y 1 , . . . , y l , the fact that R j (y 1 , . . . , y l ) holds is invariant under any permutation of the variables y 1 , . . . , y j or y j+1 , . . . , y l , and (2) for any 0 ≤ j ≤ l and any variables y 1 , . . . , y l it holds that R j (y 1 , . . . , y l ) only if all the y i are different. Call C the family of σ-structures satisfying the last two axioms. The language σ and the family C satisfy the conditions in Section 1.4. The random model F (l, n, p) coincides with the model G(n, {p R } R ) of random C-hypergraphs described in Section 1.6 when all the p R are equal. As a particular case of Theorem 1.3 we obtain the following result.
The following is a well known result regarding random CNF formulas.
Theorem 5.2. Let l ≥ 2 be a natural number, and let c ∈ (0, ∞) be an arbitrary real number. Let m : N → N be such that m(n) ∼ cn. For each n let C n,1 , . . . , C n,m(n) be clauses chosen uniformly at random independently among the $2^l\binom{n}{l}$ non-tautological clauses of size l over the variables x 1 , . . . , x n . For each n, let UNSAT n denote the event that there is no assignment of the variables x 1 , . . . , x n that satisfies all the clauses C n,1 , . . . , C n,m(n) . Then there are two real constants 0 < c 1 < c 2 such that a.a.s UNSAT n does not hold if c < c 1 , and a.a.s UNSAT n holds if c > c 2 .
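The first moment computation behind the existence of c 2 in Theorem 5.2 can be spelled out: a fixed assignment violates a uniformly random non-tautological l-clause with probability 2^{−l}, so the expected number of satisfying assignments after m = cn independent clauses is 2^n (1 − 2^{−l})^{cn}, which tends to 0 exactly when c > ln 2/(− ln(1 − 2^{−l})). A sketch (the function name is ours):

```python
import math

def first_moment_threshold(l):
    """Smallest c with 2^n * (1 - 2^-l)^(c*n) -> 0: the exponent
    n * (ln 2 + c * ln(1 - 2^-l)) becomes negative for c above this value."""
    return math.log(2) / -math.log(1 - 2.0 ** -l)

for l in (2, 3, 4):
    print(l, round(first_moment_threshold(l), 3))
```

For l = 3 this gives c 2 ≈ 5.19; the bound roughly doubles with each increment of l, since a clause is violated half as often.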
The existence of c 1 is proven in [3, Theorem 1]. The fact that c 2 exists follows from a direct application of the first moment method and is also shown, for instance, in [3,8,4]. We want to show that an analogous "phase transition" also happens in F (l, n, p) when p ∼ β/n^{l−1} . We start by showing the following.
Corollary 5.1. Let l ≥ 2 be a natural number. Let c ∈ (0, ∞) be an arbitrary real number and let m : N → N satisfy m(n) ∼ cn. For each n ∈ N let F n,m(n) be a random formula chosen uniformly at random among all sets of m(n) non-tautological clauses of size l over the variables x 1 , . . . , x n . Then there are two real positive constants 0 < c 1 < c 2 such that a.a.s F n,m(n) is satisfiable if c < c 1 , and a.a.s F n,m(n) is unsatisfiable if c > c 2 .
Proof. For each n ∈ N let C n,1 , . . . , C n,m(n) and UNSAT n be as in the previous theorem. One can consider F n,m(n) to be the result of selecting clauses C n,1 , . . . , C n,m(n) uniformly at random independently among all possible clauses, conditioned on the fact that no two clauses C n,i , C n,j are equal. Hence, Pr ( F n,m(n) is unsatisfiable ) = Pr ( UNSAT n | all the C n,i are different ) .
An application of the first moment method yields that for l ≥ 3 a.a.s the number of unordered pairs {i, j} such that C n,i = C n,j is equal to zero. In the case of l = 2, an application of Theorem 1.1 proves that the number of such pairs {i, j} converges in distribution to a Poisson variable. In either case all the C n,i are different with positive asymptotic probability. Thus the constants c 1 and c 2 from the previous theorem satisfy our statement.
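The collision estimate in the proof above is a birthday-problem computation: with m = cn clauses drawn uniformly and independently from the N = $2^l\binom{n}{l}$ possible clauses, the expected number of coinciding unordered pairs is $\binom{m}{2}/N$, which vanishes for l ≥ 3 and tends to the constant c²/4 for l = 2. A numerical sketch (the function name is ours):

```python
import math

def expected_equal_pairs(l, n, c):
    """E[# of unordered pairs {i, j} with C_i = C_j] when m = c*n clauses
    are drawn uniformly and independently from the N = 2^l * C(n, l)
    non-tautological l-clauses on n variables."""
    m = int(c * n)
    N = 2 ** l * math.comb(n, l)
    return math.comb(m, 2) / N

print(expected_equal_pairs(2, 10**5, 2.0))  # tends to c^2 / 4 = 1
print(expected_equal_pairs(3, 10**5, 2.0))  # vanishes as n grows
```

This is why the conditioning in the proof is harmless: for l ≥ 3 collisions a.a.s do not occur at all, and for l = 2 their number is asymptotically Poisson with constant mean, so the conditioning event has positive limiting probability.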
Let F n,m(n) be as in the last result. Note that because of the symmetry in the random model F (l, n, p(n)) one can consider F n,m(n) to be a random sample of the space F (l, n, p(n)) conditioned on the number of clauses being m(n). Using this observation we can prove the following.
Theorem 5.3. Let l > 1. Then there are real positive values β 1 < β 2 such that a.a.s F l n (β) is satisfiable for 0 < β < β 1 and a.a.s F l n (β) is unsatisfiable for β > β 2 .
Proof. For each n ∈ N let X n (β) be the random variable equal to the number of clauses in F l n (β). We have that E[X n (β)] ∼ (β 2^l /l!) n. Let c 1 , c 2 be as in the last corollary. Define β 1 := c 1 l!/2^l and β 2 := c 2 l!/2^l . Fix β ∈ R satisfying 0 < β < β 1 . Let ε > 0 be a real number such that (β + ε) 2^l /l! < c 1 . Denote by dp n the probability density function of the variable X n (β). That is, dp n (m) = Pr(X n (β) = m). Then, because of the previous equation, Pr ( F l n (β) is unsatisfiable ) = ∑ m Pr ( F l n (β) is unsatisfiable | X n (β) = m ) dp n (m).
Combining the previous equations we obtain that for any β < β 1 a.a.s F l n (β) is satisfiable, as was to be proven. Showing that for any β > β 2 a.a.s F l n (β) is unsatisfiable is analogous.
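The asymptotics E[X n (β)] ∼ (β 2^l /l!) n used in the proof follows from multiplying the number of non-tautological clauses, $2^l\binom{n}{l}$, by p = β/n^{l−1} and using $\binom{n}{l} \sim n^l/l!$. A quick numerical check (the function names are ours):

```python
import math

def exact_mean(l, n, beta):
    """Exact E[X_n(beta)]: number of non-tautological l-clauses times p."""
    return 2 ** l * math.comb(n, l) * beta / n ** (l - 1)

def asymptotic_mean(l, n, beta):
    """The asymptotic form (beta * 2^l / l!) * n from the proof."""
    return beta * 2 ** l / math.factorial(l) * n

ratio = exact_mean(3, 10**6, 0.5) / asymptotic_mean(3, 10**6, 0.5)
print(ratio)  # close to 1
```

The relative error is of order l²/n, so the exact and asymptotic means are already indistinguishable for moderate n; this is the translation factor β = c l!/2^l between the clause density c of Corollary 5.1 and the edge-probability parameter β.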
A direct consequence of the last theorem, due to A. Atserias (personal communication, July 2019), is the following.
Theorem 5.4. Let l > 1 be a natural number. Let Φ ∈ F O[σ] be a first order sentence that implies unsatisfiability. Then for all β > 0 a.a.s F l n (β) does not satisfy Φ.
By Theorem 5.1, the last limit varies analytically with β. Since it vanishes in the proper interval (0, β 1 ], by the principle of analytic continuation it has to vanish in the whole of (0, ∞), and the result holds.