Maximal polarization for periodic configurations on the real line

We prove that among all 1-periodic configurations $\Gamma$ of points on the real line $\mathbb{R}$ the quantities $$ \min_{x \in \mathbb{R}} \sum_{\gamma \in \Gamma} e^{- \pi \alpha (x - \gamma)^2} \quad \text{and} \quad \max_{x \in \mathbb{R}} \sum_{\gamma \in \Gamma} e^{- \pi \alpha (x - \gamma)^2}$$ are maximized and minimized, respectively, if and only if the points are equispaced and whenever the number of points $n$ per period is sufficiently large (depending on $\alpha$). This solves the polarization problem for periodic configurations with a Gaussian weight on $\mathbb{R}$ for large $n$. The first result is shown using Fourier series. The second result follows from work of Cohn and Kumar on universal optimality and holds for all $n$ (independent of $\alpha$).


Introduction and main result
We study the following question: for fixed $\alpha > 0$, among all periodic configurations of points $\Gamma$ with given density on the real line, for which one is the function
$$ p_{\alpha}(x) = \sum_{\gamma \in \Gamma} e^{-\pi \alpha (x - \gamma)^2} \qquad (1.1)$$
as close to constant as possible? Factoring out scales, periodicity and symmetries, this is equivalent to the problem of placing $n$ points $\{x_1, \dots, x_n\}$ on $\mathbb{T} \cong S^1$ so that
$$ f_{\alpha}(x) = \sum_{j=1}^{n} \sum_{k \in \mathbb{Z}} e^{-\pi \alpha k^2}\, e^{2\pi i k (x - x_j)} \qquad (1.2)$$
is as close to constant as possible. The equivalence of the two problems arises from the duality between (1.1) and (1.2) caused by the Poisson Summation Formula, which we explain in detail in §3. We note that (1.2) can be expressed by means of the Jacobi theta function $\theta(x; \alpha)$ (details are given in §3). The problem arises naturally in a variety of settings, see §2. Such problems are often related to optimal sphere packing/covering. Since sphere packing in one dimension is trivial, one would expect equispaced points to be optimal. Indeed, Cohn and Kumar [19] showed that equispaced points on the line are universally optimal. Their result can be applied in our setting.
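The duality between (1.1) and (1.2) can be illustrated numerically. The sketch below (a hypothetical three-point 1-periodic configuration; none of the numbers come from the paper) evaluates the periodized Gaussian sum on the space side and its Fourier-side counterpart obtained from the Poisson Summation Formula:

```python
import numpy as np

alpha = 2.0
pts = np.array([0.0, 0.17, 0.55])  # hypothetical 1-periodic configuration

def p_space(x, K=30):
    # space side: sum of Gaussians over the 1-periodic configuration
    k = np.arange(-K, K + 1)
    return sum(np.exp(-np.pi * alpha * (x - xj + k) ** 2).sum() for xj in pts)

def p_freq(x, K=30):
    # frequency side, from the Poisson Summation Formula:
    # alpha^{-1/2} sum_k exp(-pi k^2/alpha) exp(2 pi i k (x - x_j))
    k = np.arange(-K, K + 1)
    c = alpha ** -0.5 * np.exp(-np.pi * k ** 2 / alpha)
    return sum((c * np.exp(2j * np.pi * k * (x - xj))).real.sum() for xj in pts)

duality_err = max(abs(p_space(x) - p_freq(x)) for x in np.linspace(0, 1, 101))
```

The two evaluations agree to machine precision, which is exactly the duality exploited throughout the paper.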
Proposition 1.1 (Application of Cohn and Kumar [19]). Among all periodic configurations $\Gamma \subset \mathbb{R}$ of the form $\Gamma = \bigcup_{j=1}^{n} (\delta \mathbb{Z} + x_j)$ of fixed density $\rho = n/\delta > 0$ and for any fixed parameter $\alpha > 0$, the quantity
$$ \max_{x \in \mathbb{R}} \sum_{\gamma \in \Gamma} e^{-\pi \alpha (x - \gamma)^2} $$
is minimized if and only if the points of $\Gamma$ are equispaced. This result is not surprising; it is exactly what one would expect. However, to the best of our knowledge no "easy" proof of the theorem of Cohn and Kumar is known. As a consequence, since our proof of Proposition 1.1 makes use of the result of Cohn and Kumar, we do not currently have an "elementary" proof. We refer to §2.1 for an in-depth discussion of this result and give the proof in §4.
Proposition 1.1 is concerned with minimizing the maximum. The main result of our paper is the dual, maximizing the minimum, which we prove in the regime where the number of points is sufficiently large, where "sufficiently large" depends only on the width $\alpha$ of the Gaussian. The proof is given in §5.
Theorem (Main Result). For every $\alpha > 0$ there exists $N(\alpha)$ such that, for all $n \geq N(\alpha)$, among all 1-periodic configurations $\Gamma \subset \mathbb{R}$ with $n$ points per period the quantity $\min_{x \in \mathbb{R}} p_{\alpha}(x)$ is maximized if and only if the points are equispaced; equivalently, $\min_{x} f_{\alpha}(x)$ is maximized if and only if the points $\{x_1, \dots, x_n\}$ are equispaced.

Just as in Proposition 1.1, the two statements are dual by the Poisson Summation Formula. We remark that the parameter $\alpha$ in the result for $p_\alpha$ corresponds to $1/\alpha$ in the statement for $f_\alpha$ (see §3). Note that the results are invariant under global shifts $z$ as the sets $\{x_1, \dots, x_n\}$ and $\{x_1 + z, \dots, x_n + z\}$ both yield the same energy and polarization (see §2). Also, equispaced is always understood periodically. The argument is structurally completely different from the Cohn and Kumar framework [19] of universal optimality; the proof invokes very different tools. The main obstacles when establishing our results are: (a) the location of the minimum depends on the $x_j$ in a complicated way and (b) for equispaced points the difference between minimum and mean is superexponentially small in $n$, which forces an analysis on very small scales. The proof of the main result is completely Fourier-analytic, which makes it somewhat robust and applicable to a wider range of functions than just the Gaussian: if one has, generally, a function of the type $g(x) = \sum_{k \in \mathbb{Z}} \hat{g}(|k|) e^{2\pi i k x}$, with $\hat{g}(|k|)$ decaying sufficiently fast (say, faster than exponential), then much (but not all) of the argument carries over verbatim. For simplicity of exposition, the remainder of the paper only deals with the Gaussian case, which is arguably the most natural. The proof is explicit enough that bounds on $N(\alpha)$ could be obtained; however, since one would naturally assume that the result is true for all $n \geq 1$, independently of the value of $\alpha$, we will not track this dependency. The condition $n \geq N(\alpha)$ is necessary in many different steps of our argument and it appears that an unconditional argument for all $n \geq 1$ would require some new ideas. Of course the case $n = 1$ is trivial and we provide a proof valid for all $\alpha > 0$ when $n = 2$ in §6. It appears that already the case $n = 3$ poses some nontrivial difficulties.

Related results
2.1. Energy minimization. Energy minimization problems have received much attention in recent years. A seminal result due to Cohn, Kumar, Miller, Radchenko, and Viazovska [21] states that the $E_8$-lattice and the Leech lattice are universally optimal in their respective dimensions, meaning that they uniquely minimize the energy $E_g(\Gamma)$ among periodic configurations $\Gamma$ and for a large class of (radial) potential functions $g$. A periodic configuration in $\mathbb{R}^d$ is the union of finitely many shifted copies of a lattice $\Lambda$. We recall that a lattice is a discrete co-compact subgroup of $\mathbb{R}^d$ and that its density is $1/\mathrm{vol}(\mathbb{R}^d/\Lambda)$. The energy of a periodic configuration $\Gamma = \bigcup_{j=1}^{n} (\Lambda + x_j)$ is given by
$$ E_g(\Gamma) = \frac{1}{n} \sum_{j,k=1}^{n} \; \sum_{\lambda \in \Lambda \setminus \{x_k - x_j\}} g(x_j - x_k + \lambda). \qquad (2.1)$$
So, it is the pairwise interaction of the points under the potential $g$ excluding self-interactions (as the potential may be singular at the origin). We refer to [19, 21] for details on the energy minimization problem, to the textbook of Conway and Sloane [22] for an introduction to lattices, packing problems and covering problems, as well as to the article of Schürmann and Vallentin [40]. In [19] Cohn and Kumar showed that on the real line $\mathbb{R}$ (and at all scales) the scaled integer lattice is universally optimal. They obtained their result by constructing a "magic function" (using a version of the classical sampling theorem) which proved that the linear programming bounds for the problem (obtained in the same work) are indeed sharp for the scaled integer lattice. An alternative proof, also given in [19], is via spherical designs. Numerically, the hexagonal lattice also meets the linear programming bound for the energy minimization problem in dimension 2.
However, a proof of its universal optimality is still missing. The results are linked to optimal sphere packings and the linear programming bounds for the sphere packing problem obtained by Cohn and Elkies [18]. In seminal work, the sphere packing problem in dimension 8 was solved by Viazovska [46] and in dimension 24 by Cohn, Kumar, Miller, Radchenko, and Viazovska [20]. The problem of energy minimization has also been treated on the sphere $S^{d-1} \subset \mathbb{R}^d$, which in the case of $d = 2$ is a problem of distributing points on the circle $S^1 \cong \mathbb{T}$. Often, for general $d \geq 2$, a connection to spherical $t$-designs is given when distributing points on a sphere. We refer to the review by Brauchart and Grabner [17] and to Hardin and Saff [27] for the classical problem of Riesz energy minimization. More recent results on energy minimizing point distributions on spheres were obtained by Beltrán and Etayo [5] or Bilyk, Glazyrin, Matzke, Park, and Vlasiuk [10]. For spherical $t$-designs we refer to the breakthrough of Bondarenko, Radchenko, and Viazovska [12] and to work of the second author [42] for upper bounds.

Polarization problems.
The polarization problem asks how to place light sources such that the darkest point has maximal illumination. Often such problems are considered for compact manifolds, such as the sphere. We refer, e.g., to articles, published in different constellations, by Borodachov, Boyvalenkov, Hardin, Reznikov, Saff, and Stoyanova [13, 14, 15, 16]. For more numerical investigations and algorithms we refer to the work by Rolfes, Schüler and Zimmermann [38]. The problem of polarization for Riesz potentials and lattices in $\mathbb{R}^d$ was asked by Saff (cf. Problem 1.06 in the collection curated by the American Institute of Mathematics for the workshop Discrete Geometry and Automorphic Forms [49]). We note that many physically important potentials, such as the Riesz potential, can be written as a Laplace transform of a non-negative measure $\mu$. More precisely, any completely monotone function $f : \mathbb{R}_+ \to \mathbb{R}_+$, meaning $(-1)^k f^{(k)}(x) \geq 0$ for all $k \geq 1$, is the Laplace transform of a non-negative Borel measure as a consequence of the Bernstein-Widder theorem [6, 48] (see also the textbook of Schilling, Song, and Vondracek [39, Chap. 1]). Some results on polarization on $S^1$ for sufficiently fast decaying and convex potentials have been obtained in [15, Chap. 14.3]. We remark that the Gaussian potential does not fall into the class of completely monotone functions as it is not convex. However, by adjusting distance to squared distance, we get completely monotone functions of squared distance, i.e., $r \mapsto g(r^2)$ where $g$ is completely monotone (compare [21]). As remarked in [21, Sec. 1.2], it may seem more natural to take completely monotone functions of distance, rather than squared distance, but using squared distance allows for the use of the Gaussian function. In fact, one can check that any completely monotone function of distance is also a completely monotone function of squared distance. We refer to [21, Sec.
1.2] for this fact and more details. As an example we name the Riesz potentials, also known as inverse power laws, which are obtained as (compare again, e.g., [21])
$$ \frac{1}{r^s} = \frac{1}{\Gamma(s/2)} \int_0^{\infty} e^{-t r^2}\, t^{s/2 - 1}\, dt. $$
If our result were to hold for all $\alpha > 0$ (when $n$ is fixed), one would immediately have a corresponding result for Riesz potentials as well as for the whole class of completely monotone functions of squared distance (given sufficiently fast decay). Despite the seminal work of Cohn, Kumar, Miller, Radchenko and Viazovska [21] and overwhelming numerical evidence, the universal optimality of the hexagonal lattice, also known as the $A_2$ root lattice or sometimes the triangular lattice, is still open to date. The best available result is due to Montgomery [33] and states that the hexagonal lattice is optimal among lattices at all scales. More recently, the polarization problem among 2-dimensional lattices has been solved by the authors in joint work with Bétermin [8]. Local optimality of the hexagonal lattice for lattice polarization and certain potential functions has been derived by the authors in [23]. In [7], Bétermin and the first author showed that the hexagonal lattice maximizes Madelung-like lattice energies (lattice points have alternating signs). This result is somewhat in-between the result of Montgomery [33] and the joint result of the authors with Bétermin [8] as it neither clearly relates to sphere packing nor to covering. Related results concerning the Lennard-Jones potential (see Bétermin and Zhang [9]), which is $r \mapsto r^{-12} - 2 r^{-6}$ and neither non-negative nor monotonic nor convex, show that for different densities different geometrical arrangements can be optimal. This phenomenon is widely called phase transition. Some physically relevant consequences of the conjectured universal optimality of the hexagonal lattice (and the proven optimality of the $E_8$ and Leech lattices) are discussed by Petrache and Serfaty [37]. A general survey is given by Lewin and Blanc [30].
2.4. Heat Equation Sampling. Our result solves the following problem on $S^1$ as a byproduct. The problem was originally discussed by Pausinger and the second author [35] on $\mathbb{T}^2$. Suppose there is an unknown distribution of heat $f \in L^1(S^1)$ and we are interested in estimating the total heat $\int_{S^1} f(x)\,dx$. If the function $f$ is only in $L^1$ then no effective sampling strategies are possible. If we now assume, however, that some time $t > 0$ has passed, then the solution of the heat equation $e^{t\Delta} f$ with $f$ as initial condition satisfies
$$ \int_{S^1} (e^{t\Delta} f)(x)\,dx = \int_{S^1} f(x)\,dx $$
and is also a more regular function for which sampling strategies should be possible.
Corollary 2.4.1. For any $t > 0$ and all $n \geq N(t)$ sufficiently large (depending only on $t$) the worst case sampling error
$$ \sup_{\|f\|_{L^1(S^1)} \leq 1} \left| \int_{S^1} f(x)\,dx - \frac{1}{n} \sum_{j=1}^{n} (e^{t\Delta} f)(x_j) \right| $$
is minimized if and only if the sampling points $\{x_1, \dots, x_n\}$ are equispaced.
Proof. Interpreting the solution of the heat equation as a Fourier multiplier, we have
$$ (e^{t\Delta} f)(x) = \sum_{k \in \mathbb{Z}} \hat{f}(k)\, e^{-4\pi^2 k^2 t}\, e^{2\pi i k x}. $$
The solution of the heat equation started with a Dirac delta is the Jacobi $\theta$-function and thus $(e^{t\Delta} f)(x_j) = \langle f, \theta(\cdot - x_j; 4\pi t) \rangle$. Our results show that the maximum is minimized and the minimum is maximized if and only if the points are equispaced. This implies the statement.
Remark. It was pointed out to us by one of the referees that we can drop the condition $n \geq N(t)$, as Proposition 1.1 is sufficient in order to prove the above corollary for any $t$ and all $n$. The argument goes along the same lines as above, but then continues in the following way: after a trivial estimate and an application of Proposition 1.1, the claim reduces to a remarkable property of the theta function (cf. [22, Chap. 4, eq. (22)]), namely that its maximum is attained at the lattice points and its minimum exactly midway between them. The details of this argument are provided in §3 and §4. The crucial property is that, in the equispaced case, the minimum is achieved exactly midway between the points and the maximum at the points themselves.
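The properties of the theta function invoked here, maximum at the lattice points, minimum midway between them, and mean value 1, can be checked numerically from its Fourier series (the parameter $\alpha = 1/2$ is an arbitrary test value):

```python
import numpy as np

def theta(x, alpha, K=40):
    # Jacobi theta: theta(x; alpha) = sum_k exp(-pi k^2 alpha) exp(2 pi i k x)
    k = np.arange(-K, K + 1)
    return np.exp(-np.pi * alpha * k ** 2) @ np.cos(2 * np.pi * k * x)

alpha = 0.5
xs = np.linspace(0, 1, 2001)
vals = np.array([theta(x, alpha) for x in xs])
x_max = xs[vals.argmax()]   # maximum at the lattice point 0
x_min = xs[vals.argmin()]   # minimum midway between lattice points
mean = vals[:-1].mean()     # mean value over one period is 1
```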
2.5. Shift invariant systems. A shift invariant system $V^2(g)$ on $\mathbb{R}$ with a generator $g \in L^2(\mathbb{R})$ is a space of functions of the form
$$ f = \sum_{k \in \mathbb{Z}} c_k\, g(\cdot - k), \qquad (c_k)_{k \in \mathbb{Z}} \in \ell^2(\mathbb{Z}). $$
An example is the classical Paley-Wiener space $PW(\mathbb{R})$ of band-limited functions, i.e., $\mathrm{supp}(\hat{f}) \subset [-1/2, 1/2]$, which is generated by $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$. For a set $\Gamma \subset \mathbb{R}$, we say that it is a set of sampling for $V^2(g)$ if and only if there exist positive constants $0 < A \leq B < \infty$, depending on $g$ and $\Gamma$, such that
$$ A \|f\|_{L^2}^2 \leq \sum_{\gamma \in \Gamma} |f(\gamma)|^2 \leq B \|f\|_{L^2}^2 \quad \text{for all } f \in V^2(g). $$
For the motivation of (non-uniform) sampling in $V^2$ we refer to the article by Aldroubi and Gröchenig [2]. Characterizing sampling sets for a given generator $g$ is a very difficult problem. A necessary condition is that the (lower Beurling) density of the set is at least 1. The case of density 1 is referred to as critical sampling. For a large class of functions, including the Gaussian function $x \mapsto e^{-\alpha x^2}$, $\alpha > 0$, the problem was solved by Gröchenig, Romero, and Stöckler [26]. The case of critical sampling with Gaussian generator is treated by Baranov, Belov, and Gröchenig [4].
Our results suggest that for the space $V^2(\phi_\alpha)$, where $\phi_\alpha$ is a Gaussian, the bound $B$ is minimal and $A$ is maximal for equispaced sampling. Lastly, we mention the relatively new area of dynamical sampling introduced by Aldroubi, Cabrelli, Molter, and Tang [1]. This combines the sampling problem with dynamical systems. In particular, we find connections between the heat equation and the sampling problem, as described by Aldroubi, Gröchenig, Huang, Jaming, Krishtal, and Romero [3]. Ulanovskii and Zlotnikov [45] described sampling sets for $PW(\mathbb{R})$ so that $f$ can be reconstructed from samples of $f * \varphi_t$, where $\varphi_t$ is a convolution kernel of a dynamical process. It would be interesting to see how our results connect to this area.

Notation and remarks
3.1. Basic notation. To clarify normalization, we note that we use the following version of the Fourier transform of a suitable function $f$ on the real line:
$$ \hat{f}(\xi) = \int_{\mathbb{R}} f(x)\, e^{-2\pi i x \xi}\, dx. $$
Thus, the Poisson Summation Formula reads (see, e.g., Gröchenig [25, Chap. 1.4])
$$ \sum_{k \in \mathbb{Z}} f(x + k) = \sum_{k \in \mathbb{Z}} \hat{f}(k)\, e^{2\pi i k x}. $$
The Fourier transform of a Gaussian is another Gaussian, differently scaled (see, e.g., Folland [24, App. A]): for $\phi_\alpha(x) = e^{-\pi \alpha x^2}$ we have $\widehat{\phi_\alpha} = \frac{1}{\sqrt{\alpha}}\, \phi_{1/\alpha}$. The periodization of $\phi_\alpha$ will be called a periodic Gaussian: $\sum_{k \in \mathbb{Z}} \phi_\alpha(x + k)$. A periodic configuration $\Gamma \subset \mathbb{R}$ with period $\delta$ is a set of points of the following form:
$$ \Gamma = \bigcup_{j=1}^{n} (\Lambda + x_j), \qquad \text{where } \Lambda = \delta \mathbb{Z}, \ \delta > 0, \ \{x_1, \dots, x_n\} \subset [0, \delta). $$
The density $\rho$ of a configuration $\Gamma$ is the number $n$ of points per period: $\rho = n/\delta$.
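A quick numerical sanity check of this normalization, that $\widehat{\phi_\alpha} = \alpha^{-1/2} \phi_{1/\alpha}$ under the transform above (the grid parameters are arbitrary):

```python
import numpy as np

alpha = 3.0

def phi(x, a):
    # Gaussian in the paper's normalization
    return np.exp(-np.pi * a * x ** 2)

def fourier_transform(xi, L=8.0, N=4000):
    # Riemann-sum approximation of \int phi_alpha(x) e^{-2 pi i x xi} dx
    x = np.linspace(-L, L, N)
    dx = x[1] - x[0]
    return ((phi(x, alpha) * np.exp(-2j * np.pi * x * xi)).sum() * dx).real

xi = 1.3
ft_err = abs(fourier_transform(xi) - alpha ** -0.5 * phi(xi, 1 / alpha))
```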

Polarization on the real line.
We are now interested in the following polarization problem: which periodic configuration of fixed density $\rho$ maximizes
$$ \min_{x \in \mathbb{R}} \sum_{\gamma \in \Gamma} e^{-\pi \alpha (x - \gamma)^2}? $$
We quickly note that, fixing the number of points per period, a minimizer always exists by compactness. We call the above quantity the polarization of $\Gamma$ and seek to find the maximal polarization. In general, the minimum depends on $\Gamma$ and its density $\rho$ as well as on $\alpha$. For equidistributed points, however, the minimum is always achieved midway between successive points (as we will prove as part of the proof of the main result). The polarization may more explicitly be written as
$$ \min_{x \in \mathbb{R}} \sum_{j=1}^{n} \sum_{k \in \mathbb{Z}} e^{-\pi \alpha (x - x_j - k\delta)^2} = \min_{x \in \mathbb{R}} \frac{1}{\sqrt{\alpha}\, \delta} \sum_{j=1}^{n} \sum_{k \in \mathbb{Z}} e^{-\pi k^2/(\alpha \delta^2)}\, e^{2\pi i k (x - x_j)/\delta}, \qquad (3.1)$$
where the second equality is due to the Poisson Summation Formula. Note that in this explicit formula $\{x_1, \dots, x_n\} \subset [0, \delta)$. By identification of a configuration $\Gamma$ with $(x_1, \dots, x_n) \in (\delta \mathbb{T})^n$ we see that a maximizing configuration must exist by compactness. Clearly, neither the factor $1/\sqrt{\alpha}$ nor the factor $1/\delta$ is of relevance for the minimization process or the determination of the maximizing configuration. We will next show that for (3.1) and any fixed $n$, $\alpha > 0$, and $\delta > 0$ there is always an equivalent problem with the same $n$, $\delta = 1$ and different $\alpha$.
Indeed, rescaling the argument gives
$$ \sum_{j=1}^{n} \sum_{k \in \mathbb{Z}} e^{-\pi \alpha (x - x_j - k\delta)^2} = \sum_{j=1}^{n} \sum_{k \in \mathbb{Z}} e^{-\pi \tilde{\alpha} (\tilde{x} - \tilde{x}_j - k)^2}, \qquad (3.2)$$
where $\tilde{x} = x/\delta \in [0, 1)$, $\tilde{x}_j = x_j/\delta \in [0, 1)$ and $\tilde{\alpha} = \alpha \delta^2$. We see that we may thus assume that the points $\{x_1, \dots, x_n\}$ are distributed in $[0, 1)$ and that $\Gamma$ is 1-periodic (and of density $n$). Using the Poisson Summation Formula we see that finding the optimal configuration for (3.2) is the same as maximizing
$$ \min_{x \in \mathbb{R}} \sum_{j=1}^{n} \sum_{k \in \mathbb{Z}} e^{-\pi k^2/\tilde{\alpha}}\, e^{2\pi i k (x - \tilde{x}_j)}. $$
This is (up to flipping the argument) exactly the quantity $f_\alpha(x)$ from (1.2) considered in our main result; note that the correspondence $\tilde{\alpha} \leftrightarrow 1/\tilde{\alpha}$ comes from the Poisson Summation Formula.

3.3. Theta functions. The problem can be written as a variational problem for a finite superposition of real-valued theta functions. For a parameter $\tau \in \mathbb{H}$ (complex upper half-plane) and argument $z \in \mathbb{C}$ the classical theta function is
$$ \vartheta(z; \tau) = \sum_{k \in \mathbb{Z}} e^{\pi i k^2 \tau}\, e^{2\pi i k z}. $$
This function is holomorphic in $\tau$ and entire in $z$. For $\tau = i\alpha$, $\alpha > 0$, and $z = x \in \mathbb{R}$ the function becomes real-valued and we use the notation
$$ \theta(x; \alpha) = \vartheta(x; i\alpha) = \sum_{k \in \mathbb{Z}} e^{-\pi k^2 \alpha}\, e^{2\pi i k x} = 1 + 2 \sum_{k \geq 1} e^{-\pi k^2 \alpha} \cos(2\pi k x). $$
Note that the function $\theta(x; \alpha)$ is the heat kernel on the flat torus $\mathbb{R}/\mathbb{Z}$. As such it has mean value 1, which is easily verified by a small computation:
$$ \int_0^1 \theta(x; \alpha)\, dx = \sum_{k \in \mathbb{Z}} e^{-\pi k^2 \alpha} \int_0^1 e^{2\pi i k x}\, dx = \sum_{k \in \mathbb{Z}} e^{-\pi k^2 \alpha}\, \delta_{k,0} = 1, $$
where $\delta_{k,0}$ is the Kronecker delta. The function $\vartheta(z; \tau)$ and, hence, $\theta(x; \alpha)$ can be expressed as an infinite product known as the Jacobi triple product, which is a special case of the Macdonald identities for affine root systems [31]:
$$ \vartheta(z; \tau) = \prod_{k \geq 1} \left(1 - e^{2k\pi i \tau}\right)\left(1 + e^{(2k-1)\pi i \tau} e^{2\pi i z}\right)\left(1 + e^{(2k-1)\pi i \tau} e^{-2\pi i z}\right) = \prod_{k \geq 1} \left(1 - e^{2k\pi i \tau}\right)\left(1 + 2\cos(2\pi z)\, e^{(2k-1)\pi i \tau} + e^{2(2k-1)\pi i \tau}\right). $$
We refer to the textbooks of Mumford [34], Stein and Shakarchi [41], or Whittaker and Watson [47] for more details on elliptic functions.
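The triple product for $\tau = i\alpha$ can be verified numerically: writing $q = e^{-\pi\alpha}$, the Fourier series of $\theta(x; \alpha)$ should agree with the product over $k \geq 1$ of $(1 - q^{2k})(1 + 2\cos(2\pi x)\, q^{2k-1} + q^{2(2k-1)})$. The values of $\alpha$ and $x$ below are arbitrary test choices:

```python
import numpy as np

alpha, x = 0.8, 0.3
q = np.exp(-np.pi * alpha)

# series side: theta(x; alpha) = 1 + 2 sum_{k>=1} q^{k^2} cos(2 pi k x)
series = 1 + 2 * sum(q ** (k * k) * np.cos(2 * np.pi * k * x) for k in range(1, 40))

# product side: Jacobi triple product with tau = i*alpha
product = 1.0
for k in range(1, 40):
    product *= (1 - q ** (2 * k)) * (
        1 + 2 * np.cos(2 * np.pi * x) * q ** (2 * k - 1) + q ** (2 * (2 * k - 1))
    )

triple_product_gap = abs(series - product)
```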

Proof of Proposition 1.1
Proposition 1.1 follows relatively easily from the work of Cohn and Kumar [19] and the Poisson Summation Formula. The heart of the argument has three ingredients: (1) first, universal optimality shows that, for any fixed $\alpha > 0$, the interaction energy $E_\Phi(\Gamma)$ is minimized for equispaced points.
(2) The second ingredient is a trivial estimate that arises from replacing an average (arithmetic mean) of values by its maximum:
$$ \frac{1}{n} \sum_{j=1}^{n} F(x_j) \leq \max_{x} F(x). $$
(3) The third ingredient is that (2) is sharp whenever the points are equispaced (which, simultaneously, by universal optimality, minimizes the lower bound in (2) just above). There is a magic ingredient where, for equispaced points, the maximum of $\sum_{k=1}^{n} \theta(x - x_k; \alpha)$ is attained at the points $x_j$ themselves. We remark that the counterpart to (1) is false for the minimization problem. Likewise, regarding (3), the location of the minimum depends in a highly nonlinear fashion on the location of the points. Understanding the minimum and the considered polarization problem thus requires a different approach.
Proof. (1) We note that the energy for the potential $\Phi = \frac{1}{\sqrt{\alpha}}\, \phi_{1/\alpha}$ is given by
$$ E_\Phi(\Gamma) = \frac{1}{n} \sum_{j,k=1}^{n} \sum_{\lambda \in \mathbb{Z}} \Phi(x_j - x_k + \lambda) = \frac{1}{n} \sum_{j,k=1}^{n} \theta(x_j - x_k; \alpha), $$
where the second equality comes from the Poisson Summation Formula. The potentials are sitting on the periodic configuration $\Gamma$. However, not only their sum is considered but all their pairwise interactions and the sum over all of them. The condition $\lambda \in \Lambda \setminus \{x_k - x_j\}$ in (2.1) excludes self-interaction as the potential function $g$ is allowed to be singular at $0$ (this is also of physical relevance). For the Gaussian, we may allow self-interaction (which adds a fixed additive constant determined by normalization, but independent of $\Gamma$) and we do not need to exclude it. The universal optimality of the (scaled) integers due to Cohn and Kumar [19] states that, for all $\alpha > 0$,
$$ E_\Phi(\Gamma) \geq E_\Phi(\Gamma_0), $$
with equality if and only if $\Gamma$ is equispaced, where $\Gamma_0$ denotes the equispaced configuration of the same density. Note that the result in [19] as well as ours also holds for arbitrary scaling.
(2) is a trivial observation and does not require any more details.
(3) For $\Gamma_0$ the maxima of $p_\alpha$ (or likewise $f_\alpha$) are attained at the equispaced points $\{0, 1/n, \dots, (n-1)/n\}$ (compare Proposition 5.1.1). This follows by a simple application of the Poisson Summation Formula and the triangle inequality: the special structure of equispaced points allows for various additional tools to be used; in particular, it allows for a lossless application of the triangle inequality. We give the proof for the integers $\mathbb{Z}$ but the proof can easily be adjusted to the scaled integers $\delta\mathbb{Z}$ (replace $k$ by $k/\delta$ and adjust the Poisson Summation Formula accordingly). We perform the following small computation:
$$ \sum_{j=0}^{n-1} \theta\!\left(x - \frac{j}{n}; \alpha\right) = \sum_{k \in \mathbb{Z}} e^{-\pi \alpha k^2}\, e^{2\pi i k x} \sum_{j=0}^{n-1} e^{-2\pi i k j/n} = n \sum_{m \in \mathbb{Z}} e^{-\pi \alpha n^2 m^2}\, e^{2\pi i m n x} \leq n \sum_{m \in \mathbb{Z}} e^{-\pi \alpha n^2 m^2}, \qquad (4.4)$$
with equality if and only if $nx \in \mathbb{Z}$. So, the maximum is attained at $0$ and, by periodicity, at all points of $\Gamma_0$ (similarly for the scaled configuration). Note that $E_\Phi(\Gamma)$ builds the average of all values taken on $\Gamma$ (see Figure 1). Now recall that (4.4) tells us that for the equispaced configuration $\Gamma_0$ the maximum is attained exactly on $\Gamma_0$. It readily follows from (4.1), (4.2) and (4.3) that
$$ \max_{x \in \mathbb{R}} \sum_{\gamma \in \Gamma} e^{-\pi \alpha (x - \gamma)^2} $$
is minimal if and only if $\Gamma$ is equispaced.
This gives Proposition 1.1 as a simple consequence of the result in [19].
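The collapse of the equispaced sum and the location of its maxima, the two facts used in step (3), can be checked numerically ($n$ and $\alpha$ below are arbitrary test values):

```python
import numpy as np

def theta(x, alpha, K=30):
    k = np.arange(-K, K + 1)
    return np.exp(-np.pi * alpha * k ** 2) @ np.cos(2 * np.pi * k * x)

n, alpha = 4, 0.2

def f(x):
    # sum of theta functions over the equispaced points j/n
    return sum(theta(x - j / n, alpha) for j in range(n))

xs = np.linspace(0, 1, 4001)
vals = np.array([f(x) for x in xs])
x_max = xs[vals.argmax()]
dist_to_points = min(abs(x_max - j / n) for j in range(n + 1))
# collapse identity: f(x) = n * theta(n x; n^2 alpha)
collapse_err = max(abs(f(x) - n * theta(n * x, n * n * alpha)) for x in xs[::40])
```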

Proof of the Main Result
We start with an overall overview of the argument. It is fairly modular and the subsections reflect its overall structure. We also emphasize that, due to the fast decay of the Fourier coefficients, the argument is somewhat forgiving when it comes to polynomial estimates in the number of points. As a consequence, some of the subsequent proofs are given in their simplest rather than their optimal form. The main argument comes in two parts: the first part shows that optimizing configurations have to be exponentially close (in $n$) to the equispaced distribution. The structure of the first part is as follows.
(1) §5.1 uses some basic facts about theta functions. We show that if the points are equispaced, then the minimum is attained exactly at the midpoints between the equispaced points. This then allows us to deduce
$$ \min_{x \in \mathbb{R}} \sum_{j=1}^{n} \theta\!\left(x - \frac{j}{n}; \alpha\right) = n \sum_{m \in \mathbb{Z}} (-1)^m e^{-\pi \alpha n^2 m^2} = n \left(1 - 2 e^{-\pi \alpha n^2} + \dots\right), $$
which already shows some of the difficulty: the difference between the average and the minimum can be super-exponentially small in $n$.
(2) §5.2 introduces a trivial $L^1$-estimate (essentially pigeonholing) and a nontrivial estimate: the McGehee-Pigno-Smith inequality [32], independently discovered by Konyagin [29]. It was pointed out to us by an anonymous referee that the McGehee-Pigno-Smith inequality can be avoided and we present this more elementary argument as well.
(3) §5.3 combines these ingredients to prove that if $\{x_1, \dots, x_n\} \subset [0, 1)$ is an optimal configuration (meaning one maximizing the minimum), then the first $n-1$ Fourier coefficients of the measure $\mu = \sum_{j=1}^{n} \delta_{x_j}$ are exponentially small in $n$. (4) Since the resulting bound (5.3) is extremely small, exponentially small in $n$, we get that any optimal configuration has to be exponentially close to equispaced.
The second part of the proof shows that the only configuration that is exponentially close (in $n$) to the equispaced distribution and has maximal polarization is the equispaced distribution: this part can be understood as a detailed analysis of the perturbative regime. The main idea lies in making the ansatz $x_j = j/n + \varepsilon_j$ together with the explicit Fourier series representation
$$ \sum_{j=1}^{n} \theta(x - x_j; \alpha) = \sum_{k \in \mathbb{Z}} e^{-\pi \alpha k^2} \left( \sum_{j=1}^{n} e^{-2\pi i k (j/n + \varepsilon_j)} \right) e^{2\pi i k x}. $$
Since the problem is invariant under shifts, we can (and have to) assume that $\varepsilon_1 + \dots + \varepsilon_n = 0$ to eliminate the invariance of the problem under translation. The argument is then structured as follows.
(5) In §5.5 we show that the frequencies where $k$ is a multiple of $n$ are exactly the terms that contribute when the points are equispaced: among these frequencies only $k \in \{-n, 0, n\}$ have a sizeable contribution, the rest is small. The equispaced points yield $n$ local minima and our goal is to show that at least one of these minima decreases further unless $\varepsilon_j = 0$ for all $1 \leq j \leq n$ (meaning the points are equispaced again). (6) We consider the trigonometric polynomial $g_1(x)$ which is the restriction to the first $(n-1)/2$ frequencies. By a modified Poincaré inequality, we will prove in §5.6 that any such trigonometric polynomial assumes a small negative value at at least one of the points of the form $(k + 1/2)/n$, for $0 \leq k \leq n-1$. This negative contribution is going to make at least one of the minima much smaller. It remains to make sure that this cannot be counteracted by contributions coming from the other frequencies. (7) There are two remaining parts to analyze: $g_2(x)$, defined by restricting summation to the frequencies $n/2 \leq |k| \leq n-1$, and $h(x)$ for all the remaining frequencies. We will prove in §5.7 that their contributions are negligible; indeed, these terms are many orders of magnitude smaller. (8) The main ingredient for showing the last step is a surprising appearance of the Discrete Fourier Transform (see §5.8) hidden in the Fourier coefficients: since the sum of the perturbations $\varepsilon_1 + \dots + \varepsilon_n = 0$, we can approximate the Fourier coefficients whenever $k$ is not a multiple of $n$ as
$$ \sum_{j=1}^{n} e^{-2\pi i k x_j} \approx -2\pi i k \sum_{j=1}^{n} e^{-2\pi i k j/n}\, \varepsilon_j, $$
where the sum is merely a Discrete Fourier Transform of the $\varepsilon_1, \dots, \varepsilon_n$. This allows us to deduce a certain type of symmetry (because the $\varepsilon_j$ are real-valued) which will be used to prove $\|g_2\|_{L^\infty} \ll \|g_1\|_{L^2}$. It also guarantees that not all Fourier coefficients are small (via a Plancherel identity). (9) The final inequality, established in §5.9, assuming the perturbations $\varepsilon_j$ are exponentially close to $0$, shows that the minimum strictly decreases under any nontrivial perturbation, which then forces all the perturbations to vanish.
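The first-order DFT approximation appearing in step (8) can be illustrated numerically; the sketch below uses a hypothetical mean-zero perturbation of size $10^{-6}$ (not from the paper) and compares the exact Fourier coefficient with its first-order approximation:

```python
import numpy as np

n = 8
j = np.arange(n)
eps = 1e-6 * np.cos(2 * np.pi * j / n)   # a perturbation with sum zero
x = j / n + eps

k = 1                                     # any k that is not a multiple of n
exact = np.exp(-2j * np.pi * k * x).sum()
# the surviving first-order term is a DFT of the eps_j
dft = (np.exp(-2j * np.pi * k * j / n) * eps).sum()
approx = -2j * np.pi * k * dft
dft_rel_err = abs(exact - approx) / abs(approx)
```

Because the roots of unity sum to zero, the zeroth-order term cancels and the coefficient is governed entirely by the DFT of the perturbation.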
Part 1 of the proof

5.1. Minimizer for equidistributed points. We first prove that for equispaced points the minimum is attained exactly midway between two subsequent points. It is somewhat remarkable, and indicative of the difficulty of the problem, that even this very intuitive statement does not appear to have a very simple proof.
Proposition 5.1.1. Let $x_j = (j-1)/n$, $1 \leq j \leq n$, be equispaced points and let $\alpha > 0$. Then $\sum_{j=1}^{n} \theta(x - x_j; \alpha)$ attains its minimum exactly at the points $x \in \frac{1}{n}\mathbb{Z} + \frac{1}{2n}$.

Proof. Suppose $\{x_1, \dots, x_n\} \subset [0, 1)$ are equispaced points, $x_j = (j-1)/n$. Then, summing roots of unity,
$$ \sum_{j=1}^{n} \theta(x - x_j; \alpha) = n \sum_{m \in \mathbb{Z}} e^{-\pi \alpha n^2 m^2}\, e^{2\pi i m n x} = n\, \theta(nx; n^2 \alpha). \qquad (5.1)$$
Only now it is easy to find the minimum: in the product formula of $\theta$ each factor is minimized if and only if the argument lies in $\mathbb{Z} + 1/2$, as the cosine term is decisive and assumes its minimum there. The following inequality is an immediate consequence:
$$ \theta(nx; n^2 \alpha) \geq \theta(1/2; n^2 \alpha), $$
where equality holds if and only if $x \in \frac{1}{n}\mathbb{Z} + \frac{1}{2n}$. The result follows from (5.1).

This fact will be used frequently since it allows for the natural point of comparison (see Figure 2). The next step consists in computing the actual size of the minimum.

Figure 2. For the sum of equispaced periodic Gaussians the minimum is achieved midway between successive shifts. For sums of shifts by a general periodic configuration it is rather difficult to grasp the minimum. For the plot we have normalized the sum to oscillate around 1, i.e., the integral over one period is 1.
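A numerical check of the midpoint minimizer and of the leading-order size of the gap between the mean value $n$ and the minimum, which behaves like $2n e^{-\pi \alpha n^2}$ (the parameters below are arbitrary test values):

```python
import numpy as np

def theta(x, alpha, K=20):
    k = np.arange(-K, K + 1)
    return np.exp(-np.pi * alpha * k ** 2) @ np.cos(2 * np.pi * k * x)

alpha = 0.5
ratios = []
for n in (2, 3):
    f_min = n * theta(0.5, n * n * alpha)   # value midway between points
    gap = n - f_min                         # deviation from the mean value n
    ratios.append(gap / (2 * n * np.exp(-np.pi * alpha * n * n)))

# location of the minimum for n = 3 on a grid
n = 3
xs = np.linspace(0, 1, 3001)
vals = n * np.array([theta(n * x, n * n * alpha) for x in xs])
x_min = xs[vals.argmin()]
mid_dist = min(abs(x_min - (k + 0.5) / n) for k in range(n))
```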
Since we know from Proposition 5.1.1 that the minimum is attained exactly in the middle between two subsequent points, we have the explicit representation
$$ \min_{x \in \mathbb{R}} \sum_{j=1}^{n} \theta(x - x_j; \alpha) = n\, \theta(1/2; n^2 \alpha) = n \sum_{m \in \mathbb{Z}} (-1)^m e^{-\pi \alpha n^2 m^2}. $$

5.2. $L^1$-estimates. We continue with a basic $L^1$-estimate and a not so basic $L^1$-estimate. The reason why $L^1$ is a natural space to bound the deviation from the mean is given by the following elementary pigeonhole argument.
Lemma 5.2.1. Suppose $g : [0, 1] \to \mathbb{R}$ is a periodic, continuous function with mean value 0. Then
$$ \min_{x \in [0,1]} g(x) \leq -\frac{1}{2}\, \|g\|_{L^1}. $$
Proof. Since $g$ has mean value 0, we have $\int_0^1 g_+(x)\,dx = \int_0^1 g_-(x)\,dx$, where $g_\pm$ denote the positive and negative parts of $g$, and thus $\|g\|_{L^1} = 2 \int_0^1 g_-(x)\,dx$. The argument then follows from $\int_0^1 g_-(x)\,dx \leq \max_x g_-(x) = -\min_x g(x)$.

We also use an inequality discovered independently by McGehee, Pigno, and Smith [32] and Konyagin [29]. It arose in their solutions of the Littlewood conjecture.
Theorem (McGehee, Pigno, Smith [32]). For any set of integers $n_1 < n_2 < \dots < n_N$ and any coefficients $a_1, \dots, a_N \in \mathbb{C}$,
$$ \int_0^1 \left| \sum_{k=1}^{N} a_k\, e^{2\pi i n_k x} \right| dx \geq \frac{1}{200} \sum_{k=1}^{N} \frac{|a_k|}{k}. $$
We note that Konyagin [29] did not explicitly provide the constant. McGehee, Pigno, and Smith work over the interval $[0, 2\pi]$ and show that the inequality holds with constant $c = 1/30$, which leads to $1/(60\pi) \geq 1/200$ being an admissible constant when working over the interval $[0, 1]$. Stegeman [44] showed that one can take $c = 4/\pi^3$ on $[0, 2\pi]$, which would lead to a constant of $1/50$ being admissible after rescaling to $[0, 1]$. In any case, the precise value of the constant will not be of importance for the subsequent argument. We will use the McGehee-Pigno-Smith inequality to derive a lower bound on the $L^1$-norm of the deviation of the sum of Jacobi $\theta$-functions from their mean. We note that if the lower bound is large, then the $L^1$-norm is large and, as a consequence, the minimal value attained by the function has to be quite a bit smaller than its average. Since we want to avoid this, this will implicitly force the first few Fourier coefficients to be small. It has been pointed out by an anonymous referee that, for the purposes of our argument, the McGehee-Pigno-Smith inequality can be avoided as follows: we have, for any $1 \leq k \leq n$, the trivial bound $\|g\|_{L^1} \geq |\hat{g}(k)|$, i.e., the $L^1$-norm dominates every individual Fourier coefficient. This estimate is indeed sufficient for the remainder of the argument. This is partially due to the fact that the multipliers in the Fourier series decay extremely rapidly (i.e. like a Gaussian). Using the McGehee-Pigno-Smith inequality instead of the more elementary inequality might prove advantageous when trying to establish an analogous result with a kernel whose Fourier transform decays more slowly. Using the McGehee-Pigno-Smith or the more elementary inequality gives the following.
Lemma 5.2.2. We have, for all $\{x_1, \dots, x_n\} \subset [0, 1)$, a lower bound on the deviation $\| \sum_{j=1}^{n} \theta(\cdot - x_j; \alpha) - n \|_{L^1}$ in terms of the Fourier coefficients $\sum_{j=1}^{n} e^{-2\pi i k x_j}$, $1 \leq |k| \leq n$.

Proof. The function $\sum_{j=1}^{n} \theta(x - x_j; \alpha) - n$ is not quite of the required form since it is not a trigonometric polynomial. However, a simple application of the triangle inequality allows us to truncate the Fourier series at frequency $n$. We apply the McGehee-Pigno-Smith inequality to the resulting trigonometric polynomial; combined with the truncation error, this leads to the desired lower bound. We know the explicit value of the minimum for equispaced points. Therefore, if we now assume that $\{x_1, \dots, x_n\} \subset [0, 1)$ is a configuration maximizing the minimum, its minimum is at least as large as that of the equispaced configuration. This implies that, for $1 \leq |k| \leq n$ and $n$ sufficiently large (depending only on $\alpha$), the corresponding Fourier coefficients are small. This allows us to conclude that the first $n-1$ Fourier coefficients of the measure given by the sum of the $n$ Dirac measures in $x_1, \dots, x_n$ are exponentially small in $n$. (5.3)

Remark. We note that the proof actually shows quite a bit more, since the last step of the argument is only sharp when $k = n - 1$; this stronger inequality will not strictly be required in the remainder of the argument.

Suppose now that the Fourier coefficients of $\mu = \sum_{j=1}^{n} \delta_{x_j}$ are at most $\varepsilon$ in absolute value for $1 \leq |k| \leq n-1$. Then, for $\varepsilon > 0$ sufficiently small (say $\varepsilon \leq 1/(1000 n^4)$), there exists a permutation $\pi$ of $\{1, \dots, n\}$ and a global shift $z \in [0, 1]$ such that the points $x_{\pi(j)}$ are close to the shifted equispaced points $j/n + z$.

Proof. We use the Fejér kernel
$$ F_n(x) = \sum_{|k| \leq n-1} \left(1 - \frac{|k|}{n}\right) e^{2\pi i k x} = \frac{1}{n} \left( \frac{\sin(\pi n x)}{\sin(\pi x)} \right)^2. $$
Hence, applying the assumption that the first $n-1$ non-zero Fourier coefficients are small, we control $\sum_{i,j} F_n(x_i - x_j)$. From this calculation we also conclude that, for any index $i \neq j$, the value $F_n(x_i - x_j)$ is small. This inequality, by itself, is not tremendously powerful: we bound a term by a sum containing $\sim n^2$ similar terms. However, we have the luxury that we will only apply the Lemma in a regime where $\varepsilon$ is already exponentially small in $n$, which allows for losses at a polynomial scale. The roots of $F_n$ on $[0, 1)$ are exactly the points of the form $k/n$ for $1 \leq k \leq n-1$. Expanding $F_n$ around these roots, the expression simplifies at points of the form $x = k/n$, and therefore, for $y$ sufficiently close to $0$, a quadratic bound on $F_n(k/n + y)$ holds. A similar argument can be used to give an upper bound on the third derivative. The Taylor formula with remainder shows that the inequality is valid for $y$ in a region around $0$ that shrinks polynomially in $n$, and from this we deduce the validity of the inequality for $\varepsilon$ sufficiently small. The previous inequality implies that $x_i - x_j$ has to be of the form $x_i - x_j = k/n + \delta$ with some $|\delta| \leq \varepsilon$. Moreover, since $F_n(0) = n$, we can also deduce that $|x_i - x_j| > 1/(2n)$ (provided $\varepsilon$ is sufficiently small), which then forces the existence of a global permutation.
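The Fejér-kernel facts used in this argument, roots exactly at $k/n$, the value $F_n(0) = n$, and the classical closed form, are easy to confirm numerically:

```python
import numpy as np

def fejer(x, n):
    # F_n(x) = sum_{|k| <= n-1} (1 - |k|/n) e^{2 pi i k x}
    k = np.arange(-(n - 1), n)
    return ((1 - np.abs(k) / n) * np.exp(2j * np.pi * k * x)).real.sum()

n = 7
root_residual = max(abs(fejer(k / n, n)) for k in range(1, n))
peak = fejer(0.0, n)
# closed form (1/n)(sin(pi n x)/sin(pi x))^2, away from integer x
closed_form_err = max(
    abs(fejer(x, n) - (np.sin(np.pi * n * x) / np.sin(np.pi * x)) ** 2 / n)
    for x in np.linspace(0.013, 0.987, 41)
)
```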
Part 2 of the proof

5.5. The Main Contribution. We quickly recall what we already know from the first part of the proof. We know that any optimal configuration $\{x_1, \dots, x_n\}$ has to be close to the case of equispaced points. More precisely, it has to be of the form $x_j = j/n + \varepsilon_j + z$, where $\max_j |\varepsilon_j|$ is exponentially small in $n$ and $z \in \mathbb{R}$ is an arbitrary shift. By translation symmetry, we can assume that $z = 0$ and $\sum_j \varepsilon_j = 0$ and will do so in all subsequent arguments. We can rewrite the sum over $\theta$-functions as a Fourier series
$$ \sum_{j=1}^{n} \theta(x - x_j; \alpha) = \sum_{k \in \mathbb{Z}} e^{-\pi \alpha k^2} \left( \sum_{j=1}^{n} e^{-2\pi i k x_j} \right) e^{2\pi i k x}. $$
We remark that, as already noted above, when all the $\varepsilon_j = 0$, then the sum collapses to $n\, \theta(nx; n^2 \alpha)$. In that case, the minimal value is very close to the mean value $n$. It remains to show that small perturbations decrease the minimal value. Using the Taylor formula with the remainder term we note that the frequencies $k = \pm n$ contribute, up to a small error term, the same quantity as in the unperturbed case $\varepsilon_j = 0$. Recall that, in the unperturbed case, the minima are attained at $(k + 1/2)/n$, $0 \leq k \leq n-1$. We will show that a small perturbation necessarily makes one of the minima smaller and argue by contradiction: if there were a small perturbation of the points that increases the minimum, then, in particular, the size of the perturbation would have to be positive at all points of the form $(k + 1/2)/n$, $0 \leq k \leq n-1$ (since that is where the minima are attained in the unperturbed case). The remainder of the argument is dedicated to showing that this cannot happen.
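The phenomenon the remainder of the proof establishes, that a perturbation pushes at least one of the minima further down, can be observed numerically. The perturbation below is an arbitrary mean-zero test choice, and $n = 5$ may well lie below the regime $n \geq N(\alpha)$ covered by the theorem, so this is only an illustration:

```python
import numpy as np

def theta(x, alpha, K=30):
    k = np.arange(-K, K + 1)
    return np.exp(-np.pi * alpha * k ** 2) @ np.cos(2 * np.pi * k * x)

n, alpha = 5, 0.2
xs = np.linspace(0, 1, 5000, endpoint=False)

def minimum(points):
    vals = np.array([sum(theta(x - p, alpha) for p in points) for x in xs])
    return vals.min()

equi = np.arange(n) / n
eps = 1e-3 * np.array([1.0, -1.0, 0.5, -0.25, -0.25])   # sums to zero
min_equi = minimum(equi)
min_pert = minimum(equi + eps)
```

The perturbed configuration has a strictly smaller minimum, in line with the main result.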
5.6. A Trigonometric Lemma. This section proves a self-contained lemma which shows that a trigonometric polynomial of degree at most (n − 1)/2 assumes negative values at at least one of the points (k + 1/2)/n, for 0 ≤ k ≤ n − 1. The obtained bound is likely far from optimal but suffices for our purpose. Indeed, the rapid decay of the Gaussian weight ensures that any type of polynomial bound would suffice for the remainder of the argument.
We note that the restriction on the frequencies, |j| ≤ (n − 1)/2, is tight. Suppose n is even and consider the trigonometric polynomial Before stating the proof of Lemma 5.6.1, we establish one of the two main ingredients: a modified Poincaré inequality for functions that do not quite vanish on the boundary. Needless to say, the tools and arguments used to establish this inequality are completely standard and we do not claim the inequality to be novel in any sense. Many similar inequalities are known in the general context of trace inequalities and embedding results for Sobolev spaces.
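The tightness of the threshold can be illustrated numerically. The polynomial below is a hypothetical candidate (the specific example intended above is not reproduced here): for even n, cos(πnx) has frequencies ±n/2 and no constant term, yet it vanishes at every point (k + 1/2)/n, so negativity at these points is no longer forced once the degree reaches n/2.

```python
import numpy as np

n = 8                               # any even n
pts = (np.arange(n) + 0.5) / n      # the sample points (k + 1/2)/n
vals = np.cos(np.pi * n * pts)      # degree-n/2 trigonometric polynomial
print(np.max(np.abs(vals)))         # ~0: never strictly negative at these points
```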
Proof. The following makes sense in the more general Sobolev space H^1 (as opposed to the smaller space C^1), but this will not be relevant here. We first note that replacing f(x) by |f(x)| does not change ‖f‖_{L^2} and leaves ‖f′‖_{L^2} invariant. It thus suffices to prove the inequality for non-negative f(x). We proceed with basic facts: the first is the standard Poincaré inequality, implying that if g: This one-dimensional inequality is sometimes known as the Wirtinger inequality (for example in Blaschke's 1916 book Kreis und Kugel [11]). However, we note that it seems to have been discovered many times: for example, Hurwitz [28] already used it in his 1901 proof of the isoperimetric inequality. We refer to Payne and Weinberger [36] or work of the second author [43] for more on Poincaré inequalities. This inequality then implies an estimate which we can square out and write as The first integral on the right-hand side can be bounded with Cauchy–Schwarz, which leads to the estimate, abbreviating The left-hand side can be factored as and thus We also have the trivial estimate Adding the last estimate to the square of the penultimate estimate and using
Proof of Lemma 5.6.1. The minimum is necessarily ≤ 0 since the average of the values Σ_j a_j e^{2πij/(2n)} e^{2πijk/n} over 0 ≤ k ≤ n − 1 vanishes (the roots of unity sum to 0). Let us now assume that the minimum is negative but very close to 0. The fact that roots of unity sum to 0 then shows, just as above, that from which we deduce max Using this in combination with the modified Poincaré inequality with M = n|X|, we deduce Summing over all the intervals (interpreted periodically), we get As for the remaining sum, we use the Cauchy–Schwarz inequality to bound Altogether, this implies As f is a trigonometric polynomial of degree at most (n − 1)/2, we have as well as Plugging this in, we get For an arbitrary parameter γ > 0, the inequality which, by solving the quadratic equation, can be seen to imply that
Remark. Much of the difficulty comes from the fact that we only evaluate the trigonometric polynomial at equispaced points. If one
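The Wirtinger inequality invoked in the proof, ∫₀¹ f² ≤ π^{−2} ∫₀¹ (f′)² for f vanishing at the endpoints, can be sanity-checked on the extremal function f(x) = sin(πx), for which equality holds. A minimal sketch using a Riemann sum (grid size is an arbitrary choice):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200001)
dx = x[1] - x[0]
f = np.sin(np.pi * x)                    # vanishes at 0 and 1
fp = np.pi * np.cos(np.pi * x)           # derivative

lhs = np.sum(f ** 2) * dx                # ~ int_0^1 f^2        = 1/2
rhs = np.sum(fp ** 2) * dx / np.pi ** 2  # ~ pi^{-2} int (f')^2 = 1/2
print(lhs, rhs)
```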
was just interested in the minimum being small somewhere, there is a very elementary argument which we include for the sake of context. Lemma 5.6.3. Let f(x) = Σ_{1≤|j|≤n−1} a_j e^{2πijx} be a real-valued trigonometric polynomial. Then
Proof. We also have the trivial estimate Appealing to Lemma 5.2.1, we deduce We have, via Plancherel, that and, via the triangle inequality and the Cauchy–Schwarz inequality, that From this and 2√2 ≤ 3 the result follows.

5.7. Outline of the remaining argument. In this section we outline how the argument will be concluded. We first recall that We will choose to sum over even more terms (even though they are rather small), namely over k ∈ nZ, so as to allow for a comparison with the minimal value attained by equidistributed points. For this purpose we set where the simplification comes from the fact that these exponential expressions are all 1 when k is a multiple of n. In particular, all the Fourier coefficients are reasonably close to n. More precisely, using again that the sum over all displacements ε_j equals 0, we get from which we deduce that It is our goal to show that a perturbation with ε_j ≠ 0 has to decrease the value of at least one of the minima. To this end, we split the function as f = A + g_1 + g_2 + h, where A sums over all multiples of n, g_1 sums over the first (n − 1)/2 frequencies, g_2 sums over frequencies between (n − 1)/2 and n − 1, and h sums over the rest, i.e., frequencies larger than n that are not divisible by n. The remaining argument proceeds as follows:
(1) We show, in the next section, that ‖g_1‖_{L^2} is not too small (in terms of Σ_{j=1}^n ε_j²); the Discrete Fourier Transform naturally arises in the process.
(2) Lemma 5.6.1 then implies that the minimum over 0 ≤ k ≤ n − 1 is fairly negative.
(3) We show ‖g_2‖_{L^∞} ≪ ‖g_1‖_{L^2} (which follows again from properties of the Discrete Fourier Transform) and that the same is true for h.
(4) Thus the sum of the three terms is fairly negative in at least one of the points of the form (k + 1/2)/n, and this then implies the result.
We also note that the ε_j are fairly small: (5.3) together with the proof of Lemma 5.4.1 gives a bound on max_j |ε_j|, where the implicit constant depends on α. As it turns out, since these are exponentially small in n, the basic Taylor expansion is highly accurate and we deduce, as long as k is not a multiple of n, that We observe that this is, up to various types of rescaling, simply a Discrete Fourier Transform of (ε_1, . . ., ε_n). Since the ε_j are all real-valued, we have the symmetry where we omit the k = 0 term because This immediately implies that at least one Fourier coefficient is large and, in particular, many orders of magnitude larger than the error terms (recall that the error terms are exponentially small in n).
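Steps (1) and the pigeonhole conclusion can be sketched with NumPy's FFT, which uses the same e^{−2πikj/n} convention as above: the constraint Σ_j ε_j = 0 kills the k = 0 coefficient, and Plancherel then forces at least one of the remaining n − 1 coefficients to be large. The scale 10^{−6} below is an arbitrary stand-in for "exponentially small":

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16
eps = rng.normal(size=n) * 1e-6
eps -= eps.mean()                 # enforce sum_j eps_j = 0

X = np.fft.fft(eps)               # X_k = sum_j eps_j e^{-2 pi i k j / n}
# X_0 = 0 by the mean-zero constraint; Plancherel gives
# sum_k |X_k|^2 = n * sum_j eps_j^2, so the n - 1 remaining
# coefficients cannot all be small:
bound = np.sqrt(n / (n - 1) * np.sum(eps ** 2))
print(np.max(np.abs(X)), bound)   # the largest coefficient exceeds the bound
```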
5.9. The final estimates. This also implies, using the Plancherel identity, that g_1 is large in L^2, since
The worst case is when most of the Fourier energy is localized at high frequencies, and thus we can remove the smallest weight and deduce It is rather easy to show that g_2 is many orders of magnitude smaller than g_1, as the Fourier coefficients are very nearly the same. Since the discrete Fourier transform has the symmetry This is exponentially smaller than g_1(x) because e^{−πα(n/2)²} is exponentially smaller than e^{−πα((n−1)/2)²}.
We will require pointwise estimates for what follows. However, the decay is sufficiently strong so that we can simply use the triangle inequality. Using again the cancellation of the sum of roots of unity together with the fact that, for k ≤ n, we have k²ε_j² ≪ |kε_j|, we get, for sufficiently large n, We deduce that, again for n sufficiently large, We can now conclude the argument. Recalling that we deduce that the minimal value of f(x) is maximal if and only if all ε_j = 0. Since for equidistributed points the minimum is attained exactly midway between successive points, we obtain equality in the last estimate and hence derive our main result.
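As a crude numerical illustration of the conclusion (not a proof; α = 1, n = 3, the perturbation size and all grid sizes are arbitrary choices), one can compare the minimum of the periodized Gaussian sum for equispaced points against random perturbations:

```python
import numpy as np

def theta(x, alpha=1.0, M=20):
    # periodized Gaussian: sum_m e^{-pi alpha (x - m)^2}
    m = np.arange(-M, M + 1)
    return np.exp(-np.pi * alpha * (np.asarray(x)[..., None] - m) ** 2).sum(axis=-1)

def min_value(points, alpha=1.0):
    x = np.linspace(0.0, 1.0, 4001)
    return float(sum(theta(x - p, alpha) for p in points).min())

n = 3
equi = np.arange(n) / n
best = min_value(equi)

rng = np.random.default_rng(1)
perturbed = [np.sort((equi + rng.uniform(-0.05, 0.05, n)) % 1.0) for _ in range(50)]
# equispaced points should do at least as well, up to discretization error
print(all(min_value(p) <= best + 1e-4 for p in perturbed))
```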

Small n and shifting one point
The case when n is small needs, as mentioned in §1, new ideas. We have not tried to find solutions for, say, n = 3, 4, 5, and we believe it is a hard problem. However, at least the case n = 2, i.e., Γ = Z ∪ (Z + c), is fairly easy: the fact that x = 1/2 is the minimizer of θ_α(x) suggests that we should place the second point exactly midway between the integers. It follows from Proposition 5.1.1 that we then have minima at 1/4 and 3/4 (in between the maxima at 0, 1/2 and 1). Taking these as reference points, it is not hard to show that the equispaced distribution is optimal. In fact, this idea leads to the following generalization.
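The n = 2 picture can be probed numerically: sweeping the shift c and maximizing the minimum of θ_α(x) + θ_α(x − c) recovers c = 1/2 (here α = 2, the truncation level, and the grids are arbitrary choices):

```python
import numpy as np

def theta(x, alpha, M=20):
    # periodized Gaussian theta_alpha(x) = sum_m e^{-pi alpha (x - m)^2}
    m = np.arange(-M, M + 1)
    return np.exp(-np.pi * alpha * (np.asarray(x)[..., None] - m) ** 2).sum(axis=-1)

def min_val(c, alpha=2.0):
    # minimum of theta(x) + theta(x - c) over one period
    x = np.linspace(0.0, 1.0, 2001)
    return float((theta(x, alpha) + theta(x - c, alpha)).min())

cs = np.linspace(0.05, 0.95, 91)
best_c = cs[int(np.argmax([min_val(c) for c in cs]))]
print(best_c)
```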
Proof. It is seen from the product formula (5.2) that θ_α(x) is even in x and a decreasing function on (0, 1/2) (see also [33]). Hence, we have The inequality holds true when shifting by y ∈ (−1 + 1/n, 0) (i.e., periodically to the right) and picking ℓ = n − 1, by symmetry.
if the points are equispaced. Moreover, among all sets of n points on the torus T ≅ S^1 and for any fixed parameter α > 0, the quantity max_{x∈T} f_α(x) = max_{x∈T} Σ_{j=1}^n θ_α(x − x_j) is minimized if and only if the points are equispaced.

Figure 1. Illustration of the result of Cohn and Kumar [19]. Taking the average of p_α(x) at the points {x_1, . . ., x_n} (in this case n = 3) for a periodic, non-equispaced configuration always yields a larger value than for the equispaced points. As we sum n times the maximum in the equispaced case, it follows that the maximum of p_α(x) is minimal only for the equispaced configuration.

|Σ_{j=1}^n e^{−2πikx_j}| ≤ 2000 · n² · e^{−πα(2n−1)}. We note that for equispaced points the first n − 1 Fourier coefficients all vanish. §5.4 proves a basic estimate, invoking the classical Fejér kernel, showing that if the first n − 1 Fourier coefficients of µ are close to 0, then the n points are (quantitatively) close to n equispaced points. Since the estimate from (

Figure 2. For the sum of equispaced periodic Gaussians the minimum is achieved midway between successive shifts. For sums of shifts by a general periodic configuration it is rather difficult to pin down the minimum. For the plot we have normalized the sum to oscillate around 1, i.e., the integral over one period is 1.

|a_k| = | ∫₀¹ e^{−2πiλ_k t} Σ_{j=1}^n a_j e^{2πiλ_j t} dt | = | Σ_{j=1}^n a_j ∫₀¹ e^{2πi(λ_j − λ_k)t} dt |

5.4. The gaps are regular. If we have n equispaced points, then the first n − 1 Fourier coefficients vanish. We prove a stability version of this statement: if the first n − 1 Fourier coefficients are small, the points are almost equispaced.
Lemma 5.4.1. Suppose {x_1, . . ., x_n} ⊂ [0, 1) has the property that max contribution arises for k = −n. Thus the three terms B = Σ_{k∈{−n,0,n}} e^{−παk²}

| Σ_{j=1}^n ε_j e^{−2πikj/n} |²
The Discrete Fourier Transform preserves the ℓ²-norm and therefore n Σ_{j=1}^n ε_j² = Σ_{k=1}^n | Σ_{j=1}^n ε_j e^{−2πikj/n} |².
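The Plancherel identity for the unnormalized DFT used here, Σ_k |Σ_j ε_j e^{−2πikj/n}|² = n Σ_j ε_j², is a one-liner to verify numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 12
eps = rng.normal(size=n)
X = np.fft.fft(eps)  # X_k = sum_j eps_j e^{-2 pi i k j / n}
print(np.sum(np.abs(X) ** 2), n * np.sum(eps ** 2))  # equal up to rounding
```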

At this point, we can invoke a Taylor expansion and argue that ‖g_1(x)‖_{L^2} ≥ e^{−πα( Now, the argument from the previous section comes into play: we do not have information about any individual Fourier coefficient, but we know that at least one of them is large: max for n sufficiently large, ‖g_1(x)‖_{L^2} ≥ e^{−πα(

‖g_2‖_{L^∞} ≪ ‖g_1‖_{L^2}. We deduce, since k > (n − 1)/2 and thus k ≥ n/2, that for n sufficiently large, ‖g_2‖_{L^∞} ≤ ‖g_1‖_{L^2}/n^{100}. A similar argument can be applied to h. We argue that ‖h(x)‖_{L^∞} =