Equidistribution results for self-similar measures

A well known theorem due to Koksma states that for Lebesgue almost every $x>1$ the sequence $(x^n)_{n=1}^{\infty}$ is uniformly distributed modulo one. In this paper we give sufficient conditions for an analogue of this theorem to hold for self-similar measures. Our approach applies more generally to sequences of the form $(f_{n}(x))_{n=1}^{\infty}$ where $(f_n)_{n=1}^{\infty}$ is a sequence of sufficiently smooth real valued functions satisfying a nonlinearity assumption. As a corollary of our main result, we show that if $C$ is equal to the middle third Cantor set and $t\geq 1$, then with respect to the Cantor-Lebesgue measure on $C+t$ the sequence $(x^n)_{n=1}^{\infty}$ is uniformly distributed for almost every $x$.


Introduction
A sequence $(x_n)_{n=1}^{\infty}$ of real numbers is said to be uniformly distributed modulo one if for every pair of real numbers $u, v$ with $0 \leq u < v \leq 1$ we have
$$\lim_{N \to \infty} \frac{\#\{1 \leq n \leq N : x_n \bmod 1 \in [u, v]\}}{N} = v - u.$$
The study of uniformly distributed sequences has its origins in the pioneering work of Weyl [28] from the early 20th century. From these beginnings this topic has developed into an important area of mathematics, with many deep connections to Ergodic Theory, Number Theory, and Probability Theory. It is a challenging problem to determine whether a given sequence of real numbers is uniformly distributed. Often the sequences one considers are of dynamical or number theoretic origins. For an overview of this topic we refer the reader to [5,22] and the references therein.
One of the most well known results from uniform distribution theory states that for any integer $b \geq 2$, for Lebesgue almost every $x \in \mathbb{R}$ the sequence $(b^n x)_{n=1}^{\infty}$ is uniformly distributed modulo one. This result, due to Borel, is commonly referred to as Borel's normal number theorem [4]. In what follows we say that $x$ is $b$-normal if $(b^n x)_{n=1}^{\infty}$ is uniformly distributed modulo one. For an arbitrary Borel probability measure $\mu$ supported on $\mathbb{R}$ which is defined independently from the dynamical system $x \mapsto bx \bmod 1$, it is natural to wonder whether $x$ is $b$-normal for $\mu$ almost every $x$. The following metaconjecture encapsulates many important results in this direction.
Metaconjecture 1.1. Suppose $\mu$ is a Borel probability measure that is "independent" from the dynamical system $x \mapsto bx \bmod 1$. Then $\mu$ almost every $x$ is $b$-normal.
Of course the important detail in this metaconjecture is what exactly it means for a Borel probability measure to be independent from the dynamical system $x \mapsto bx \bmod 1$. The first instances of this metaconjecture being verified are found in the papers of Cassels [8] and Schmidt [26]. These authors were motivated by a question of Steinhaus as to whether there exists an $x$ that is $b$-normal for infinitely many $b$ but not all $b$. They answered this question in the affirmative by proving that with respect to the Cantor-Lebesgue measure on the middle third Cantor set, almost every $x$ is $b$-normal if $b$ is not a power of three. The underlying independence here comes from the middle third Cantor set being defined by similarities with contraction ratios equal to $1/3$, and $b$ having a prime factor not equal to $3$. The current state of the art in this area is given by the following two theorems due to Hochman and Shmerkin [16], and Dayan, Ganguly, and Weiss [9].
Theorem 1.2. [16] Let $\{\phi_i(x) = r_i x + t_i\}_{i \in A}$ be an iterated function system satisfying the open set condition. Suppose $b \geq 2$ is such that $\frac{\log |r_i|}{\log b} \notin \mathbb{Q}$ for some $i \in A$. Then for every fully supported non-atomic self-similar measure $\mu$, $\mu$ almost every $x$ is $b$-normal.
Theorem 1.3. [9] … $t_i - t_j \notin \mathbb{Q}$ for some $i, j \in A$, then for every fully supported non-atomic self-similar measure $\mu$, $\mu$ almost every $x$ is $b$-normal.
The independence in Theorem 1.2 comes from the existence of a contraction ratio satisfying
$$\frac{\log |r_i|}{\log b} \notin \mathbb{Q},$$
whereas in Theorem 1.3 the independence comes from the existence of translation parameters $t_i, t_j$ satisfying $t_i - t_j \notin \mathbb{Q}$.
Other important contributions in this area include the papers by Kaufman [19], and Queffélec and Ramaré [25], who constructed Borel probability measures supported on subsets of the badly approximable numbers whose Fourier transforms converge to zero polynomially fast. Importantly, if the Fourier transform of a Borel probability measure converges to zero polynomially fast, then it can be shown that almost every point with respect to this measure is $b$-normal for any $b \geq 2$. Kaufman later went on to show that such measures also exist for the set of $\alpha$-well approximable numbers [20]. The results of Kaufman [19], and Queffélec and Ramaré [25], were later extended by Jordan and Sahlsten to a general class of measures [18]. In a more recent paper, Simmons and Weiss [27] proved that if $X$ is a self-similar set satisfying the open set condition, then with respect to the natural measure on $X$, the orbit under the Gauss map ($x \mapsto 1/x \bmod 1$) of almost every $x$ equidistributes with respect to the Gauss measure. Here the important point is that the natural measure on $X$ is defined independently from the dynamics of the Gauss map.
After Borel's normal number theorem, one of the next uniform distribution results one likely encounters is a theorem due to Koksma [21]. This theorem states that for Lebesgue almost every $x > 1$ the sequence $(x^n)_{n=1}^{\infty}$ is uniformly distributed modulo one. The motivation behind this paper comes from this theorem and the results stated above. More specifically, given a Borel probability measure $\mu$ supported on $[1, \infty)$ that is defined "independently" from the family of maps $\{f_n(x) = x^n\}_{n=1}^{\infty}$, we are interested in determining whether for $\mu$ almost every $x$ the sequence $(x^n)_{n=1}^{\infty}$ is uniformly distributed modulo one.
The study of the distribution of the sequence $(x^n)_{n=1}^{\infty}$ modulo one dates back to the work of Hardy [14] and Pisot [23,24]. It is a challenging problem to describe the distribution of $(x^n)_{n=1}^{\infty}$ modulo one for specific values of $x$. It is still unknown whether there exists a transcendental $x > 1$ such that $d(x^n, \mathbb{N}) \to 0$ as $n \to \infty$. For some recent results on the distribution of the sequence $(x^n)_{n=1}^{\infty}$ we refer the reader to [1,2,3,5,6,7,11] and the references therein. We remark that any self-similar set is defined by a finite collection of affine maps, whereas for any $n \geq 2$ the map $f_n(x) = x^n$ is certainly not affine. One could view this observation as some measure of independence. With the above results in mind, the following conjecture seems plausible.
Conjecture 1.4. Let $\mu$ be a non-atomic self-similar measure with support contained in $[1, \infty)$. Then for $\mu$ almost every $x$ the sequence $(x^n)_{n=1}^{\infty}$ is uniformly distributed modulo one.
One of the challenges faced when addressing this conjecture is that, at least to the best of the author's knowledge, there is no dynamical system which effectively captures the distribution of $(x^n)_{n=1}^{\infty}$ modulo one. As such one cannot rely upon techniques from Ergodic Theory to prove this conjecture. Techniques from Ergodic Theory were previously applied with great success in the proofs of Theorem 1.2 and Theorem 1.3.
In this paper we do not prove the full Conjecture 1.4. We instead prove the following general statement which lends significant weight to its validity.
Theorem 1.5. Let $\{\phi_i(x) = rx + t_i\}_{i \in A}$ be an equicontractive iterated function system satisfying the convex strong separation condition with self-similar set $X$ contained in $[1, \infty)$. Moreover let $(p_i)_{i \in A}$ be a probability vector satisfying
$$\frac{-\sum_{i \in A} p_i \log p_i}{-\log r} > \frac{1}{2}.$$
Then with respect to the self-similar measure $\mu$ corresponding to $(p_i)_{i \in A}$, for $\mu$ almost every $x$ the sequence $(x^n)_{n=1}^{\infty}$ is uniformly distributed modulo one.
We define what it means for an iterated function system to be equicontractive and to satisfy the convex strong separation condition in the next section. Importantly both of these conditions are satisfied by the iterated function system
$$\Big\{\phi_1(x) = \frac{x + 2t}{3},\ \phi_2(x) = \frac{x + 2 + 2t}{3}\Big\}$$
for any $t \in \mathbb{R}$. The self-similar set for this iterated function system is $C + t$ where $C$ is the middle third Cantor set. Taking $(p_i)_{i=1}^{2} = (1/2, 1/2)$ to be our probability vector, and using the fact that the self-similar measure for this choice of $(p_i)_{i=1}^{2}$ coincides with the Cantor-Lebesgue measure on $C + t$, we see that Theorem 1.5 immediately implies the following corollary.
Corollary 1.6. Let $C$ be the middle third Cantor set. Then for any $t \geq 1$, with respect to the Cantor-Lebesgue measure on $C + t$, for almost every $x$ the sequence $(x^n)_{n=1}^{\infty}$ is uniformly distributed modulo one.
Theorem 1.5 is implied by the following more general theorem which applies to a wider class of functions.
Theorem 1.7. Let $\{\phi_i(x) = rx + t_i\}_{i \in A}$ be an equicontractive iterated function system satisfying the convex strong separation condition with self-similar set $X$ contained in $[1, \infty)$.
Let $(f_n)_{n=1}^{\infty}$ be a sequence of functions satisfying the following properties:
1. $f_n \in C^3(\mathrm{conv}(X), \mathbb{R})$ for each $n$.
2. There exist $C_1, C_2 > 0$ such that for any $m, n$ with $m < n$ we have: for all $x \in \mathrm{conv}(X)$.
3. There exists $C_3 > 0$ such that for all $n$ sufficiently large, for any $m < n$ we have: for all $x \in \mathrm{conv}(X)$.
4. For any $m, n$ with $m < n$ we have either
Moreover let $(p_i)_{i \in A}$ be a probability vector satisfying
$$\frac{-\sum_{i \in A} p_i \log p_i}{-\log r} > \frac{1}{2}.$$
Then with respect to the self-similar measure $\mu$ corresponding to $(p_i)_{i \in A}$, for $\mu$ almost every $x$ the sequence $(f_n(x))_{n=1}^{\infty}$ is uniformly distributed modulo one.
The hypotheses of Theorem 1.7 are satisfied by many sequences of functions. For instance we could take $f_n(x) = x^n + x^{n-1} + \cdots + x + 1$ for all $n$. Alternatively we could fix a polynomial $g$ with strictly positive coefficients and let $f_n(x) = g(x)x^n$ for all $n$, or $f_n(x) = g(n)x^n$ for all $n$. Each of these sequences of functions satisfies the hypotheses of Theorem 1.7.
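The example sequences above can be explored numerically. The following is a minimal sketch, not part of the paper's argument: it computes the fractional parts of $f_n(x)$ exactly using rational arithmetic and reports the proportion that land in an interval $[u, v]$. The function names (`fractional_parts`, `power`, `geometric_sum`, `empirical_frequency`) and the test point $x = 3/2$ are illustrative assumptions. No result in this paper covers a specific rational $x$, so the output is only an experiment.

```python
from fractions import Fraction

def fractional_parts(f, x, n_max):
    """Exact fractional parts of f(n, x) for n = 1, ..., n_max (x must be a Fraction)."""
    return [f(n, x) % 1 for n in range(1, n_max + 1)]

def power(n, x):
    """f_n(x) = x^n."""
    return x ** n

def geometric_sum(n, x):
    """f_n(x) = x^n + x^(n-1) + ... + x + 1."""
    return sum(x ** k for k in range(n + 1))

def empirical_frequency(parts, u, v):
    """Proportion of the given fractional parts lying in [u, v]."""
    return Fraction(sum(1 for y in parts if u <= y <= v), len(parts))

# Hypothetical experiment: x = 3/2 is chosen only because it is easy to handle exactly.
x = Fraction(3, 2)
parts = fractional_parts(power, x, 200)
print(float(empirical_frequency(parts, Fraction(0), Fraction(1, 2))))
```

Exact rationals are used because $x^n$ grows exponentially, so floating point arithmetic loses the fractional part after a few dozen terms.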
The rest of the paper is organised as follows. In Section 2 we recall the necessary preliminaries from Fractal Geometry and the theory of uniform distribution. In Section 3 we prove Theorem 1.7.

Fractal Geometry
We call a map $\phi : \mathbb{R} \to \mathbb{R}$ a similarity if it is of the form $\phi(x) = rx + t$ for some $r \in (-1, 0) \cup (0, 1)$ and $t \in \mathbb{R}$. We call a finite collection of similarities $\{\phi_i\}_{i \in A}$ an iterated function system, or IFS for short. Here and throughout $A$ denotes an arbitrary finite set. Given an IFS $\{\phi_i(x) = r_i x + t_i\}_{i \in A}$, we say that it is equicontractive if there exists $r \in (-1, 0) \cup (0, 1)$ such that $r_i = r$ for all $i \in A$. Throughout this paper we will assume that if $\{\phi_i\}_{i \in A}$ is an equicontractive IFS then $r \in (0, 1)$. For each of our theorems there is no loss of generality in making this assumption. This is due to the fact that if $\{\phi_i\}_{i \in A}$ is an equicontractive IFS with contraction ratio $r$, then $\{\phi_i \circ \phi_j\}_{(i,j) \in A^2}$ is an equicontractive IFS with contraction ratio $r^2 \in (0, 1)$.
An important result due to Hutchinson [17] states that for any IFS $\{\phi_i\}_{i \in A}$, there exists a unique non-empty compact set $X$ satisfying
$$X = \bigcup_{i \in A} \phi_i(X).$$
$X$ is called the self-similar set of $\{\phi_i\}_{i \in A}$. The middle third Cantor set and the von-Koch snowflake are well known examples of self-similar sets. Given a finite word $a = (a_1, \ldots, a_M) \in \bigcup_{k=1}^{\infty} A^k$ we let $\phi_a := \phi_{a_1} \circ \cdots \circ \phi_{a_M}$ and $X_a := \phi_a(X)$.
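The composed maps $\phi_a$ and their images can be computed directly. The sketch below is an illustration under stated assumptions: maps are stored as hypothetical `(ratio, translation)` pairs, all ratios are positive, the alphabet is indexed by `0, 1, ...`, and the helper names `compose` and `cylinders` are not from the paper. It lists the images $\phi_a([0,1])$ for the standard Cantor IFS $\{x/3,\ x/3 + 2/3\}$ at level two.

```python
from fractions import Fraction
from itertools import product

def compose(word, maps):
    """Return (ratio, translation) of phi_a = phi_{a_1} o ... o phi_{a_M}."""
    ratio, trans = Fraction(1), Fraction(0)
    for i in reversed(word):          # the innermost map phi_{a_M} is applied first
        r_i, t_i = maps[i]
        ratio, trans = r_i * ratio, r_i * trans + t_i
    return ratio, trans

def cylinders(maps, hull, level):
    """Images phi_a(hull) for all words a of the given length (positive ratios assumed)."""
    lo, hi = hull
    out = {}
    for word in product(range(len(maps)), repeat=level):
        ratio, trans = compose(word, maps)
        out[word] = (ratio * lo + trans, ratio * hi + trans)
    return out

# Standard middle third Cantor IFS, with convex hull [0, 1].
cantor = [(Fraction(1, 3), Fraction(0)), (Fraction(1, 3), Fraction(2, 3))]
for word, (left, right) in cylinders(cantor, (Fraction(0), Fraction(1)), 2).items():
    print(word, left, right)
```

For an equicontractive IFS every word of length $M$ has ratio $r^M$, so all level-$M$ images have the same length.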
For distinct $a, b \in A^M$ we let
Given an IFS $\{\phi_i\}_{i \in A}$ and a probability vector $p := (p_i)_{i \in A}$, there exists a unique Borel probability measure $\mu_p$ satisfying
$$\mu_p = \sum_{i \in A} p_i \cdot \mu_p \circ \phi_i^{-1}. \qquad (2.1)$$
We call $\mu_p$ the self-similar measure corresponding to $\{\phi_i\}_{i \in A}$ and $p$. For our purposes it is important that the relation (2.1) can be iterated, and for any $M \in \mathbb{N}$ we in fact have
$$\mu_p = \sum_{a \in A^M} p_a \cdot \mu_p \circ \phi_a^{-1}, \qquad (2.2)$$
where $p_a = \prod_{k=1}^{M} p_{a_k}$ for $a = (a_1, \ldots, a_M)$. Given a probability vector $p$ we define the entropy of $p$ to equal
$$h(p) := -\sum_{i \in A} p_i \log p_i.$$
We emphasise that this quantity appears in the hypothesis of Theorem 1.5 and Theorem 1.7.
When the choice of p is implicit we simply denote µ p by µ.
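As a quick sanity check on the entropy hypothesis, the following sketch evaluates $h(p)$ and tests the condition in the form $e^{-2h(p)}/r < 1$, equivalently $h(p)/(-\log r) > 1/2$. Reading the hypothesis this way is an assumption based on the equivalent form quoted in the proof of Theorem 1.7. The function names are illustrative.

```python
import math

def entropy(p):
    """Entropy h(p) = -sum_i p_i log p_i, using the natural logarithm."""
    return -sum(q * math.log(q) for q in p if q > 0)

def entropy_ratio(p, r):
    """The quotient h(p) / (-log r) for contraction ratio r in (0, 1)."""
    return entropy(p) / (-math.log(r))

def satisfies_entropy_condition(p, r):
    """Assumed form of the hypothesis: exp(-2 h(p)) / r < 1."""
    return math.exp(-2 * entropy(p)) / r < 1

# Cantor-Lebesgue measure: p = (1/2, 1/2), r = 1/3, ratio = log 2 / log 3.
print(entropy_ratio((0.5, 0.5), 1 / 3))
print(satisfies_entropy_condition((0.5, 0.5), 1 / 3))
# A strongly biased vector on the same IFS fails the condition.
print(satisfies_entropy_condition((0.9, 0.1), 1 / 3))
```

For the Cantor-Lebesgue measure the quotient equals $\log 2 / \log 3$, which exceeds $1/2$, consistent with Corollary 1.6.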
Many results in the study of self-similar sets require additional separation conditions on the IFS. Often one restricts to the case when the IFS satisfies the strong separation condition or the open set condition (see [12,13]). In this paper we will require a slightly stronger separation condition that is still satisfied by many well known self-similar sets. Given an IFS $\{\phi_i\}_{i \in A}$, we say that $\{\phi_i\}_{i \in A}$ satisfies the convex strong separation condition if the convex hull of $X$ satisfies the following:
$$\phi_i(\mathrm{conv}(X)) \subseteq \mathrm{conv}(X) \ \ \forall i \in A \quad \text{and} \quad \phi_i(\mathrm{conv}(X)) \cap \phi_j(\mathrm{conv}(X)) = \emptyset \ \ \forall i \neq j.$$
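For an equicontractive IFS with $r \in (0, 1)$ the condition can be tested mechanically. The sketch below is an illustration, not the paper's method. It relies on the elementary fact that for positive $r$ the convex hull of $X$ is the interval spanned by the fixed points $t_i/(1-r)$, in which case the containment $\phi_i(\mathrm{conv}(X)) \subseteq \mathrm{conv}(X)$ holds automatically and only disjointness needs checking. The function names are hypothetical.

```python
from fractions import Fraction

def convex_hull(r, translations):
    """conv(X) for the IFS {r*x + t}: the interval spanned by the fixed points t / (1 - r)."""
    fixed_points = [t / (1 - r) for t in translations]
    return min(fixed_points), max(fixed_points)

def satisfies_cssc(r, translations):
    """Check that the images of conv(X) under the maps are pairwise disjoint (0 < r < 1)."""
    lo, hi = convex_hull(r, translations)
    images = sorted((r * lo + t, r * hi + t) for t in translations)
    # After sorting, it is enough to compare each image with its right neighbour.
    return all(left[1] < right[0] for left, right in zip(images, images[1:]))

third = Fraction(1, 3)
# IFS for C + 1: phi_1(x) = (x + 2)/3 and phi_2(x) = (x + 4)/3.
print(convex_hull(third, [Fraction(2, 3), Fraction(4, 3)]))
print(satisfies_cssc(third, [Fraction(2, 3), Fraction(4, 3)]))
# Three maps of ratio 1/3 tiling [0, 1]: the images touch, so the condition fails.
print(satisfies_cssc(third, [Fraction(0), Fraction(1, 3), Fraction(2, 3)]))
```

Exact rational arithmetic avoids spurious failures when images share an endpoint.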
To help with our exposition we state here an identity that will be used numerous times in our proof of Theorem 1.7. Suppose $\{\phi_i\}_{i \in A}$ is an equicontractive IFS and $f \in C^1(\mathrm{conv}(X), \mathbb{R})$. Then for any $a \in A^M$, it follows from the chain rule that
$$(f \circ \phi_a)'(x) = r^M f'(\phi_a(x)).$$

Uniform distribution
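To prove Theorem 1.7 we make use of the well known criterion due to Weyl for uniform distribution in terms of exponential sums, and a result due to Davenport, Erdős, and LeVeque (see [5,10,28]). Combining these results we may deduce the following statement.
Proposition 2.1. Let $\mu$ be a Borel probability measure on $\mathbb{R}$ and let $(f_n)_{n=1}^{\infty}$ be a sequence of continuous real valued functions. If for any $l \in \mathbb{Z} \setminus \{0\}$ the series
$$\sum_{N=1}^{\infty} \frac{1}{N} \int \Big| \frac{1}{N} \sum_{n=1}^{N} e^{2\pi i l f_n(x)} \Big|^2 d\mu(x)$$
converges, then for $\mu$ almost every $x$ the sequence $(f_n(x))_{n=1}^{\infty}$ is uniformly distributed modulo one.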
Proposition 2.1 is the tool that enables us to prove Theorem 1.7. We will also rely on the following technical lemma due to van der Corput, for a proof of this lemma see [
Notation. Throughout this paper we will use $\exp(x)$ to denote $e^{2\pi i x}$. Given two complex valued functions $f$ and $g$, we write $f = O(g)$ if there exists $C > 0$ such that $|f(x)| \leq C|g(x)|$ for all $x$. If the underlying constant depends upon some parameter $s$ and we want to emphasise this dependence we write $f = O_s(g)$. Given an interval $I$ we let $|I|$ denote the Lebesgue measure of $I$.
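Weyl's criterion states that a sequence $(x_n)$ is uniformly distributed modulo one if and only if $\frac{1}{N}\sum_{n=1}^{N} e^{2\pi i l x_n} \to 0$ for every non-zero integer $l$. The following sketch evaluates the modulus of this normalised sum for a finite sequence. The function name `weyl_sum` and the test sequence $n\sqrt{2}$ are illustrative choices, and floating point is adequate here only because the terms stay small.

```python
import cmath
import math

def weyl_sum(seq, l):
    """Modulus of (1/N) * sum_n exp(2*pi*i*l*x_n) for a finite sequence of reals."""
    total = sum(cmath.exp(2j * cmath.pi * l * x) for x in seq)
    return abs(total) / len(seq)

# The sequence n*sqrt(2) is a classical uniformly distributed example:
# its normalised sums shrink as N grows.
for N in (10, 100, 1000):
    seq = [n * math.sqrt(2) for n in range(1, N + 1)]
    print(N, weyl_sum(seq, 1))

# A constant sequence is not uniformly distributed: the normalised sum has modulus 1.
print(weyl_sum([0.25] * 5, 3))
```

Proposition 2.1 replaces pointwise control of these sums by a summability condition on their mean squares with respect to $\mu$, which is what the proof of Theorem 1.7 verifies.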

Proof of Theorem 1.7
Let us now fix an IFS {ϕ i } i∈A , a probability vector p, and a sequence of functions (f n ) ∞ n=1 so that the hypothesis of Theorem 1.7 is satisfied. In what follows we let I := conv(X).
Moreover, given a word $a \in \bigcup_{k=1}^{\infty} A^k$ we let $I_a := \phi_a(I)$. For technical reasons it is useful to restrict our arguments to subsets of the self-similar set that are a uniform distance away from $1$. With this in mind we introduce the parameter $\kappa > 0$ to be any small number such that $1 + \kappa \notin X$. It follows from the convex strong separation condition that $\kappa$ exists and can be taken to be arbitrarily small. Given such a $\kappa > 0$, we fix $\delta_\kappa > 0$ to be any sufficiently small real number so that if we let e −h(p)+δκ r δκ log 1+κ −2 log r , then $\Gamma_\kappa < 1$. Such a $\delta_\kappa > 0$ exists because of our underlying assumption
$$\frac{-\sum_{i \in A} p_i \log p_i}{-\log r} > \frac{1}{2},$$
which is equivalent to $\frac{e^{-2h(p)}}{r} < 1$.
Moreover, given such a $\kappa$ and $\delta_\kappa$, we fix $N_\kappa$ to be any sufficiently large natural number so that
$$\max_{a \in A^{N_\kappa}} \sup_{x, y \in I_a} \frac{x}{y} < 1 + \delta_\kappa,$$
and for any $a \in A^{N_\kappa}$ we have either $\sup I_a < 1 + \kappa$ or $\inf I_a > 1 + \kappa$.
Such an $N_\kappa$ exists since $1 + \kappa \notin X$ and $X$ is compact. Given a word $c \in \bigcup_{k=1}^{\infty} A^k$ we let
$$\tilde{\mu}_c := \frac{\mu|_{X_c}}{\mu(X_c)}.$$
It is a consequence of the convex strong separation condition that $\tilde{\mu}_c = \mu \circ \phi_c^{-1}$. We will use this equality during our proof of Theorem 1.7.
The following proposition provides the necessary estimates for us to successfully apply Proposition 2.1 in our proof of Theorem 1.7.
Proof of Theorem 1.7. It will be shown below that Proposition 3.1 implies that for any $\kappa > 0$ such that $1 + \kappa \notin X$, if $c \in A^{N_\kappa}$ is such that $\inf I_c > 1 + \kappa$ then for $\tilde{\mu}_c$ almost every $x$ the sequence $(f_n(x))_{n=1}^{\infty}$ is uniformly distributed. It then follows from the definition of $N_\kappa$ and the self-similarity of $\mu$, i.e. (2.2), that this statement implies that for $\mu$ almost every $x > 1 + \kappa$ the sequence $(f_n(x))_{n=1}^{\infty}$ is uniformly distributed. Since there exists arbitrarily small $\kappa > 0$ satisfying $1 + \kappa \notin X$ it is clear that Theorem 1.7 follows. It therefore suffices to show that our initial statement is true.
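Let us now fix $\kappa > 0$ such that $1 + \kappa \notin X$ and $c \in A^{N_\kappa}$ such that $\inf I_c > 1 + \kappa$. By Proposition 2.1, to prove that for $\tilde{\mu}_c$ almost every $x$ the sequence $(f_n(x))_{n=1}^{\infty}$ is uniformly distributed, it suffices to show that for any $l \in \mathbb{Z} \setminus \{0\}$ we have
$$\sum_{N=1}^{\infty} \frac{1}{N} \int \Big| \frac{1}{N} \sum_{n=1}^{N} \exp(l f_n(x)) \Big|^2 d\tilde{\mu}_c < \infty. \qquad (3.1)$$
Expanding this expression we obtain
$$\sum_{N=1}^{\infty} \frac{1}{N^2} + \sum_{N=1}^{\infty} \frac{1}{N^3} \sum_{\substack{1 \leq m, n \leq N \\ m \neq n}} \int \exp(l(f_n(x) - f_m(x)))\, d\tilde{\mu}_c. \qquad (3.2)$$
The $1/N^2$ term appearing in (3.2) does not affect the convergence properties of this series. As such it suffices to consider the remaining terms, which we can rewrite as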

Substituting the bound provided by Proposition 3.1 into (3.3) we obtain
Therefore (3.1) holds for any l ∈ Z \ {0} and our proof is complete.

Proof of Proposition 3.1
Throughout this section the parameter κ is fixed. We also assume that δ k and N κ have been chosen so that the properties stated at the start of this section are satisfied. We also fix a word c ∈ A Nκ satisfying inf X c > 1 + κ. We define x 0 and x 1 to be such that Recall that by the definition of N κ we have Given l ∈ Z \ {0} and n ∈ N we define M = M (c, l, κ, n) := 1 + log 2πC 1 |l||I| + C 2 log n + (n − 1) log x 1 −2 log r + δ k n.
Importantly $M$ has the property that
Given $k \in \mathbb{N}$ we let
$$B(k) := \{a \in A^k : p_a \geq e^{k(-h(p) + \delta_\kappa)}\}.$$
It follows from a well known large deviation result due to Hoeffding [15] that for any k ∈ N there exists η := η(κ, p) > 0 such that For M as above we define It follows from (3.6) and properties of geometric series that Given an m < n we also define the function The proof of the following lemma is inspired by the proof of Lemma 6.1 from [18]. This lemma essentially allows us to bound from above the integral appearing in Proposition 3.1 by the L 2 norm of W M multiplied by a constant that grows exponentially with n.
Lemma 3.2. Let m < n and l ∈ Z \ {0}. For M as defined above we have Proof. Using first of all the relationμ c = µ • ϕ −1 c , then the self-similarity of µ (2.2), we can rewrite our integral as follows: Therefore it suffices to show that the latter integral satisfies the required bounds. By (3.7) we see that Note that if a ′ ∈ R M , then by the mean value theorem, (3.5), and our assumptions on the sequence of functions (f n ) ∞ n=1 , for all x ∈ I a ′ we have: We have shown that |W M (x)| ≥ r δκn for all x ∈ I a ′ for any a ′ ∈ R M . Applying this bound we obtain In the last line we used that for distinct a, b ∈ A M the intervals I a and I b are disjoint. It follows that Substituting this bound into (3.8) we obtain as required.
To complete our proof of Proposition 3.1 it is necessary to obtain good upper bounds for $\int_I |W_M(x)|^2\, dx$. These bounds are provided by the following lemma.
Proof. We start by expanding I |W M (x)| 2 dx: (3.9) To bound the integral appearing in the summation in (3.9) we will use Lemma 2.2. Before doing this we demonstrate below that the hypotheses of this lemma are satisfied.
Applying the mean value theorem to the function h n,m , we see that there exists z ∈ [ϕ ca (x), ϕ cb (x)] such that It follows from the convex strong separation condition that there exists c 0 > 0 such that for all x ∈ I. Using our assumptions on the sequence (f n ) ∞ n=1 , and the fact z ∈ I c so z ≥ x 0 , it follows that Substituting (3.11) and (3.12) into (3.10), we see that for all x ∈ I we have Applying the mean value theorem as above, this time to the function f ′′

By our assumptions on
. What is more, it follows from the convex strong separation condition that the sign of ϕ ca (x) − ϕ cb (x) is independent of x and depends solely upon a and b. Therefore we must have φ ′′ (x) ≤ 0 for all x ∈ I or φ ′′ ≥ 0 for all x ∈ I. In either case φ ′ is monotonic and we have shown that the monotonicity condition of Lemma 2.2 is satisfied.