Flavor Moonshine

The flavor moonshine hypothesis supposes that all particle masses (leptons, quarks, Higgs and gauge particles; more precisely, their mass ratios) are expressed as coefficients in the Fourier expansion of some modular forms, just as, in mathematics, dimensions of representations of a certain group are expressed as coefficients in the Fourier expansion of some modular forms. The mysterious hierarchical structure of the quark and lepton masses is thus attributed to that of the Fourier coefficient matrices of certain modular forms. Our intention here is not to prove this hypothesis starting from some physical assumptions but rather to demonstrate that this hypothesis is experimentally verified and, assuming that string theory correctly describes the laws of nature, to calculate the geometry (Kähler potential and metric) of the moduli space of the Calabi-Yau manifold, thus providing a way to calculate the metric of the Calabi-Yau manifold itself directly from the experimental data.


I. INTRODUCTION
Some researchers, including one of the authors of this work (H. S.), have been working on flavor physics under the assumption that some discrete symmetry ($S_3$, $S_4$, $A_4$, etc.) plays an important role in its understanding [1]. But the outcome is very limited, and so far we have no clear understanding of flavor physics. The topological definition of the Higgs Yukawa coupling also has not led to any useful prediction for flavor physics to date [2].
On the mathematical side, a dramatic phenomenon called "moonshine" has been described [3], in which a discrete symmetry (specifically, the dimensions of representations of the monster group) is manifested in a modular form in a rather unexpected manner. We may use this fact for the discrete symmetry in flavor physics: we start by assuming that the symmetry of flavor physics is manifested in a certain modular form, and we assume such a modular form for each flavor. The modular forms must contain all the information about flavor physics, with the understanding that all this information is contained in the Higgs coupling to leptons and quarks.
More precisely, we assume that the particle masses (the mass ratios), rather than dimensions of a representation of a discrete group, are directly written in the Fourier coefficients of these modular forms: this is the flavor moonshine hypothesis. The mass ratios are scale-independent quantities [4] and do not vary with energy scale. We observe that, at least in lowest-order perturbation theory, the logarithmic scale dependence cancels out completely both in QCD and in the electroweak (EW) theory, although this does not exclude renormalization effects proportional to terms such as $\log(m_1/m_2)$. We refer to reference [4] for the non-perturbative calculations. Therefore, the mass ratio is an appropriate quantity for discussing physics even at the highest energy scale. The gauge particle masses must also be written in terms of some modular forms, but we will not discuss that matter in this work.
In pure mathematics, we anticipate a generalization of conventional "moonshine" from single-variable modular forms to multi-variable modular forms. A certain mathematical "object," perhaps the representation matrix of a certain group rather than the dimension of a representation of the monster group, must be written in the Fourier expansion of the multi-variable modular forms.
We will identify the mass matrix with this "object." The question arises: What are those modular forms that manifest the discrete symmetry appearing in flavor physics? For the time being, we postpone the question of justifying our assignment of a certain modular form to each flavor on the basis of a general formalism such as string theory. Instead we proceed backward and investigate what the experimentally acceptable modular forms are. We then determine what kind of geometry can yield such a modular form when we consider the compactification of string theory.
We define the flavor modular form in the following way. Suppose we have a two-variable modular form for each flavor. Then it can be Fourier expanded as in equation (1), with coefficients $g_{ij}$ for $i \ge 0$ and with $g_{i,-j} = g_{ij}$ for the symmetric modular form [5]. The $g_{ij}$ is supposed to correspond to the Higgs coupling of the $i$-th and $j$-th quarks or of the corresponding leptons. By solving equation (1) backwards, we obtain $g_{ij}$ as an integral of the modular form over the modular variables. Here, $q = e^{2\pi i\tau}$ and $r = e^{2\pi i\sigma}$. The integration is done along the circle $C$ of radius 1 centered at the origin. It is important that we integrate over the modular variables to obtain the coefficient.
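As a numerical illustration of obtaining coefficients by integrating over the unit circles, the following minimal sketch assumes that equation (1) has the usual double power-series form $J(q, r) = \sum_{i,j} g_{ij}\, q^i r^j$, so that $g_{ij} = (2\pi i)^{-2} \oint_C \oint_C J(q, r)\, q^{-i-1} r^{-j-1}\, dq\, dr$. The names `fourier_coefficients` and `toy_form` are hypothetical, and the test polynomial is not a modular form from [5]. With equally spaced sample points, the double contour integral reduces to a two-dimensional discrete Fourier transform.

```python
import numpy as np

def fourier_coefficients(J, n=16):
    """Approximate g[i, j] by sampling J on the torus |q| = |r| = 1.

    Assumes J(q, r) = sum_{i,j} g_ij q^i r^j with finitely many nonzero
    terms of degree below n/2 (an assumption of this sketch).
    A negative index j is stored at position n + j, as in any DFT.
    """
    theta = 2 * np.pi * np.arange(n) / n
    q = np.exp(1j * theta)[:, None]   # sample points on |q| = 1
    r = np.exp(1j * theta)[None, :]   # sample points on |r| = 1
    samples = J(q, r)
    # (1/n^2) * sum_{a,b} J(q_a, r_b) q_a^{-i} r_b^{-j} equals fft2 / n^2
    return np.fft.fft2(samples) / n**2

def toy_form(q, r):
    # Hypothetical test polynomial, symmetric under r -> 1/r (g_{i,-j} = g_{ij})
    return 1 + 2 * q * r + 2 * q / r + 5 * q**2

g = fourier_coefficients(toy_form)
print(np.round(g[1, 1].real, 6), np.round(g[1, -1].real, 6), np.round(g[2, 0].real, 6))
```

The entry `g[1, -1]` holds the coefficient of $q\, r^{-1}$, which makes the symmetry $g_{i,-j} = g_{ij}$ easy to check numerically.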
If the modular form is based on the ring of integers, the forms are numerous and it is hard to pinpoint the appropriate form. Fortunately, if we generalize the integer ring appropriately to constrain the possible forms, then in the case we are considering, where $g_{i,-j} = g_{ij}$ (the symmetric modular form), it is known that all the modular forms can be constructed rather easily [5].
Specifically, as the simplest generalization, we use $SL(2, \mathbb{Z}(\sqrt{2}))$ to define the flavor modular group of the two-variable modular form rather than $SL(2, \mathbb{Z})$ (see footnote 1). We put Then the condition for the modularity is the transformation property:
The three generator modular forms found by Cohn and Deutsch [5] are given by $G_2$, $G_4$, $G_6$ with $k = 1, 2, 3$. What we use are the coefficients in the Fourier expansion of these modular forms. We may also choose different combinations $H_2$, $H_4$, $H_6$ that are given by We have only one modular form for $k = 1$: $G_2$; two forms for $k = 2$: $G_4$ and $G_2^2$; and three forms for $k = 3$: $G_6$, $G_2^3$ and $G_2 G_4$. A linear combination of forms of the same level $2k$ is again a modular form. Therefore all modular forms up to level 6 ($k = 3$) are given by equations (7), (8) and (9), where $a_4$, $a_6$ and $b_6$ are complex numbers.
Footnote 1: When we first thought of flavor moonshine, it was clear that the relevant modular form must have more than one variable. It also seemed that $SL(2, \mathbb{Z})$ is insufficiently constrained, allowing too many choices for the forms. Therefore, we looked for work enlarging $SL(2, \mathbb{Z})$ so that the choice becomes manageable, and we encountered a paper by H. Cohn and J. Deutsch [5], where we learned that there are only three generators for the entire $SL(2, \mathbb{Z}(\sqrt{2}))$, which is the simplest kind of $SL(2, \mathbb{Z})$ extension. To our great surprise, we found that its modular form at the lowest level ($k = 1$) describes the charged lepton mass ratios, and Cohn and Deutsch [5] show that there are only three generator modular forms in this case.
In order to write down the Higgs coupling of quarks and leptons, we define the following: First we define, for the Higgs coupling of a certain flavor, where $H$ is the Higgs field and $\psi_L$, $\psi_R$ are quark or lepton fields. The Yukawa coupling is given by where $U_L$, $U_R$ are unitary matrices and $\lambda$ denotes the elements of the diagonalized $g_{ij}$ matrix, i.e., for $i, j = 0, 1, \ldots, G - 1$. Then we have where $\chi_{Lk} = U^\dagger_{Lik} \psi_{Li}$ and $\chi_{Rk} = \psi_{Rj} U^\dagger_{Rjk}$. To maintain the modular invariance of the Yukawa coupling, we assume the transformation property: under the modular transformation (5). The level $-2k+2$ is to take care of the transformation property of $dq\,dr/qr$: If the original $\tau$, $\sigma$ are real, so are the transformed $\tau$, $\sigma$. Therefore, the unit circle goes to the unit circle and the modular invariance is maintained.
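The diagonalization of $g_{ij}$ by two unitary matrices described above can be illustrated with a singular value decomposition. This is only a sketch under stated assumptions: it takes the factorization in the form $g = U_L\, \mathrm{diag}(\lambda)\, U_R^\dagger$, which may differ from the source's convention in the placement of daggers, and the matrix `g_toy` is a hypothetical placeholder rather than a table of coefficients from [5].

```python
import numpy as np

def biunitary_diagonalize(g):
    """Return (U_L, lam, U_R) such that g = U_L @ diag(lam) @ U_R^dagger.

    lam holds the non-negative singular values in descending order.
    """
    g = np.asarray(g, dtype=complex)
    u_left, lam, vh = np.linalg.svd(g)
    u_right = vh.conj().T
    return u_left, lam, u_right

# Hypothetical 3x3 coupling matrix, for illustration only
g_toy = np.array([[1.0, 2.0, 0.0],
                  [2.0, 10.0, 3.0j],
                  [0.0, -3.0j, 100.0]])

U_L, lam, U_R = biunitary_diagonalize(g_toy)
reconstructed = U_L @ np.diag(lam) @ U_R.conj().T
print(np.allclose(reconstructed, g_toy))
```

The singular values `lam` play the role of the diagonal elements $\lambda$, and the columns of `U_L` and `U_R` define the rotated left-handed and right-handed fields.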
Some remarks are in order:
1. This construction suggests the definition of the fields: We do not need to assume any specific transformation property of the individual fields under the modular transformation, while the bilinear form expressed in equation (10) must transform covariantly under the modular transformation. We also note that the transformation property (10) is consistent only when the number of generations $G$ is infinite. Finite $G$ violates the modular invariance of the Yukawa coupling.
2. We treat here, just for simplicity, a pristine Higgs field $H$. However, in section IV, we will define and use the modular form corresponding to the Higgs field: corresponding to $J(q, r)$. We can also define the field with the Higgs field $H = H_0$.
3. Our modular variables $q$, $r$ eventually become the moduli of the Calabi-Yau manifold, as will be shown later in section IV. The usual treatment of these variables is to regard them as scalar fields in four-dimensional space-time and to try to find a way to stabilize them. We regard them as variables that distinguish different vacua, and we integrate over them as in equation (13) to obtain the Yukawa coupling. This roughly corresponds to superposing all possible equivalent vacua. The Yukawa interaction resolves this degeneracy, so that each value of the generation number $G$ corresponds to a different vacuum. We have $G = 3$ in this work, as it concerns the low-energy experimental data. It may happen that a phase transition occurs at high energy, in which case the particle masses would change suddenly at that energy scale.
4. Our definition of the "generation" is not the same as the usual one in string theory. It corresponds to the expansion coefficient of the modulus-dependent fields defined in equations (16) and (18).

II. NUMERICAL RESULTS
Equation (12) shows that $g_{ij}$ is a mass matrix, and equation (1) shows that it is just the Fourier coefficient of the modular form $J(q, r)$. In this section we consider the cases of equations (7), (8) and (9) separately.

A. Case of k = 1
The modular form is $J(q, r) = G_2$ in this case. From a table given by Cohn and Deutsch [5], we have From now on we restrict ourselves to the $G = 3$ case: The squared mass matrix is given by $g g^\dagger$ and it will be diagonalized as with sum over the indices $k$. We diagonalize the squared mass matrix $M_3^2$ and find its square root is By normalizing the lowest mass to be the electron mass of 0.5110 MeV, we obtain This shows that the modular form $G_2$ embodies the charged lepton masses in its Fourier coefficients. There is no free parameter in this case except for the overall normalization, which is of course scale dependent, unlike the mass ratios [4].

B. Case of k = 2
In this case, we have the modular form For the time being we ignore the second term (i.e., put $a_4 = 0$). Then we have, for the three-generation case, The normalized and diagonalized mass matrix becomes Here we used the top quark mass of 173 GeV as the input mass. Then the charm quark mass is obtained as 0.964 GeV, which is smaller than the actual mass of 1.27 GeV by 24.1%.
The up quark mass turns out to be 0.163 MeV, which is too small compared with the QCD calculations. We have one complex parameter $a_4$ in this case and we must work out its effect: the detailed fit to the quark masses and also to the CKM matrix will be given in appendix A. This discussion supports the view that the modular form of $k = 2$ (level 4) encodes the charge +2/3 quark masses in its Fourier coefficients.

C. Case of k = 3
In this case, we have Suppose for the sake of argument we take then we find We regard the modular form of $k = 3$ as an expression of the charge −1/3 quark masses.
With the QCD-calculated bottom quark mass of 4.18 GeV as an input mass, we obtain The down quark mass is zero and the strange/bottom mass ratio is off by a factor of 1.6. Of course we have two complex parameters $a_6$, $b_6$ to be fixed in this case, and we must adjust these parameters to get a more precise fit to the experimental data.
As shown above, in the case of $k = 1$, where there is no adjustable parameter, the fit is almost perfect. The other two cases require refinement, but it is amazing that the values obtained in these cases also are not that distant from the experimental data. Now we need to choose appropriate values for $a_4$, $a_6$ and $b_6$. In fact, these complex parameters are needed to fit the CKM matrix, which contains a phase factor to explain the CP violation. See appendix A for the concrete calculation.
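The procedure used in the cases above (form $g g^\dagger$, diagonalize, take square roots, and fix the overall scale with one input mass) can be sketched as follows. The matrix `g_toy` is a hypothetical placeholder, not the table of Fourier coefficients from [5], and `mass_spectrum` is a name introduced here for illustration only.

```python
import numpy as np

def mass_spectrum(g, anchor_mass, anchor="lightest"):
    """Masses from a coefficient matrix g, rescaled to one input mass.

    The squared mass matrix is g g^dagger; its eigenvalues are real and
    non-negative, and their square roots give the masses up to one
    overall normalization.
    """
    g = np.asarray(g, dtype=complex)
    m_squared = g @ g.conj().T                 # Hermitian matrix
    eigvals = np.linalg.eigvalsh(m_squared)    # real, ascending order
    masses = np.sqrt(np.clip(eigvals, 0.0, None))
    reference = masses[0] if anchor == "lightest" else masses[-1]
    return masses * (anchor_mass / reference)

# Hypothetical 3x3 matrix standing in for the G = 3 block of coefficients
g_toy = np.array([[1.0, 2.0, 0.0],
                  [2.0, 10.0, 3.0],
                  [0.0, 3.0, 100.0]])

leptons_like = mass_spectrum(g_toy, 0.5110)                   # lightest fixed, in MeV
quarks_like = mass_spectrum(g_toy, 173.0, anchor="heaviest")  # heaviest fixed, in GeV
print(leptons_like, quarks_like)
```

Only the ratios of the returned masses are determined by the matrix; the single input mass sets the overall normalization, in line with the remark that the normalization is the only scale-dependent quantity.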

D. Case of k = 4
In this case, we assume the modular form describes the neutrino masses. The neutrino has two possibilities: 1. Dirac neutrino and 2. Majorana neutrino.
In the case of a pure Dirac neutrino, the mass matrix becomes where In the case of a Majorana neutrino with the seesaw approximation, the mass matrix is given by Although the right-handed Majorana mass $M_R$ has the same form as in equation (31), it turns out that it has the following unique form since it must be a symmetric matrix, In this work we discuss these two limiting cases: one is the pure Dirac case corresponding to Majorana mass = 0 and the other is the seesaw case where the Majorana mass is much larger than the Dirac mass. The actual data fitting is done in appendix A.
For $k = 5$, for example, we have the modular forms New flavor particles of this sort presumably have neither electric charge nor color charge, but they may have some weak interactions in addition to gravitational interactions. Therefore, they may be good candidates for dark matter.

A. Lagrangian
We may write down the kinetic energy part of the Lagrangian using the fields defined in equation (16). We have where the indices $a$, $b$ indicate flavor type and $\alpha$, $\beta$ are indices for the gauge group representation. The right and left modes can belong to different representations. The covariant derivative includes the gauge field $A_\mu$: Then the kinetic part of the Lagrangian density is given by To maintain the modular invariance we must impose the modular transformation: which means the kinetic term is a single-variable modular form of level 2, in contrast to the Yukawa coupling.

B. Supersymmetrization
We may trivially write the Lagrangian in a supersymmetric form. Corresponding to equation (10), we define Corresponding to equation (13), we obtain where $\Phi_{Rj}$ and $\Phi_{Li}$ are the chiral fields corresponding to a certain flavor. Then we have for $i, j = 0, 1, \ldots, G - 1$. Using a standard form for the chiral field Then the Yukawa coupling (41) can be written as where The kinetic energy part is given by With these conceptual modifications, our Yukawa coupling before the modular variable integration may be interpreted as coming from the compactification of superstring theory.
First, we assume that the following formula, first derived by Strominger and Witten [2], is correct in spite of the above conceptual modifications: where $K$ is a certain Calabi-Yau manifold and $\Omega$ is a holomorphic 3-form. The $a$, $b$, $c$ originate from gauge fields (principal or vector bundle) in the compactified Calabi-Yau space and are interpreted as harmonic (massless) (0, 1)-forms. If we restrict to the case of moduli corresponding to the complex structure deformation, rather than the Kähler structure deformation, the (0, 1)-forms $a$, $b$, $c$ must originate in the (2, 1)-form. The gauge group $A$ is the maximal subgroup such that, for example, We restrict ourselves to this case, and then it is shown by Candelas and de la Ossa [7] that the rightmost side of equation (48) can be written as Here the moduli variables $z^\alpha$ ($\alpha = 1, 2, \ldots$, Betti number $b_{2,1}$) are chosen to be the periods themselves: where $A^\alpha$ is an appropriate homology basis.
By identifying our modular variables with the complex structure variables $z^\alpha$ [7], we can explicitly calculate $G$ and, therefore, the Kähler potential $K$ is and the Kähler metric of the moduli space of the Calabi-Yau manifold is The precise relation between our modular variables $q$, $r$ and $w$ and the period $z^\alpha$ must respect the scaling behavior under $z \to \lambda z$: whereas the scaling behavior of a modular form depends on its level. Here we consider the $SL(2, \mathbb{Z}(\sqrt{2}))$ transformation (5) with $\beta, \gamma, \beta', \gamma' = 0$: With $q = e^{2\pi i\tau}$, $r = e^{2\pi i\sigma}$ and $w = e^{2\pi i\rho}$, we have since $\alpha$ must be equal to $\alpha'$ so that $\tau$ and $\sigma$ have the same scaling factor. This means that the $\sqrt{2}$ term in $\alpha = a + b\sqrt{2}$ must be zero and the scaling is guaranteed only for integers. If one allows this, then we obtain and where $h$ is the level of the Higgs modular form. Therefore, we can put and This shows that the period variables $z^\alpha$ are given by our modular variables: There are four of these combinations corresponding to charged leptons ($k = 1$), charge +2/3 quarks ($k = 2$), charge −1/3 quarks ($k = 3$), and neutrinos ($k = 4$): Although $\rho$ corresponds to the Higgs field, each combination has a different relation between $\rho$ and $z^\alpha$ as in equation (62) because each combination has its own value of $k$. This means that there are multiple modular variables corresponding to the Higgs particle, which is acceptable because these variables turn out just to be integration variables. We obtain Therefore, Then the metric of the moduli space of the Calabi-Yau manifold is given by For example, We remark that the other derivatives such as can correspond to some Yukawa couplings, but all these seem not to appear in physics because of the gauge symmetry of the theory. For example, where $g_{\mu\nu}$ is the Calabi-Yau metric, $B_{\mu\nu}$ is a 2-form related to $g_{\mu\nu}$ by supersymmetry, and $V$ is the volume of the Calabi-Yau manifold.
For example, If we restrict to the minimum Calabi-Yau manifold, meaning that all of its moduli are directly determined by the experiments as above, we may be able to determine its metric $g_{\mu\nu}$ by solving equation (68) together with the Ricci-flat and Kähler constraints for $g_{\mu\nu}$. We would like to come back to this issue in a future publication.

V. CONCLUDING REMARKS
1. As we have shown above, the hypothesis of flavor moonshine is at least correctly realized experimentally to some extent. We need to use multi-variable modular forms for this purpose. These forms are well studied in mathematics as a branch of number theory, and they constitute a part of the more general class of forms called Hilbert modular forms [8].
2. We use only the Fourier coefficients of these forms to define the Yukawa coupling and the modular invariance of the total Lagrangian is assumed [9]. As such, it corresponds to the procedure of integrating over the modular variables which are identified as Calabi-Yau moduli if we combine our model with string theory. We do not regard these moduli as scalar fields to be stabilized. Insofar as we can see, there seems to be no justification for regarding them as scalar fields. Therefore, our treatment of them as moduli to be integrated out when we define the low energy action seems to be a natural process.
3. Of course, there are many mysteries to be solved. Why does nature seem to choose a very specific form, such as the one we used, that is based on $SL(2, \mathbb{Z}(\sqrt{2}))$? Why $k = 1$ for charged leptons, $k = 2$ for charge +2/3 quarks, $k = 3$ for charge −1/3 quarks, and $k = 4$ for neutrinos?
There remains a lot of work to be done: How good or bad are the other modular groups like $SL(2, \mathbb{Z}(\sqrt{N}))$, $SL(2, \mathbb{Z}(i))$, etc.? Can we extend the modular form to more than two variables? What exactly is the mathematical moonshine for the modular form of two variables? If we understand the mathematical implication of the matrices which appear in the Fourier coefficients of two-variable modular forms, we will be able to prove the flavor moonshine by understanding the physical principle that identifies mass matrices with these matrices.
4. Probably a more urgent task from the string theory standpoint is to find the specific Calabi-Yau metric by solving equation (68) and to elucidate its other physical consequences. Further questions arise, such as: Do we have a grand unified scale? Do we have a phase transition from $G = 3$ to $G \ge 4$ at some higher energy?
5. Experimentally, we need to explore the properties of the Higgs particle in more detail, especially its coupling to low-mass particles such as $u$, $d$, $e$, $\mu$ and even neutrinos. Construction of the ILC, therefore, is urgent. A good neutrino facility is also highly desirable. The Higgs particle is indeed the "God particle," the term coined by Leon Lederman [10], in the sense that its Yukawa couplings determine the highest energy physics without the need to perform the highest energy experiments.
6. It is possible that the whole idea of flavor moonshine is just nonsense [11], although the agreement with the experimental data seems to us too good to be just an accident.

Appendix A: Numerical fitting for experimental data
We calculated the CKM and PMNS matrices numerically and fit them to the experimental data. In the former case we have three complex parameters $a_4$, $a_6$ and $b_6$, as shown in equations (8) and (9). For the PMNS matrix we have two choices: a pure Dirac neutrino or a Majorana neutrino (with the seesaw approximation). Either way, we again have three complex parameters $a_8$, $b_8$ and $c_8$, shown in equation (31). Since the parameter $d_8$ in equation (34) is an overall factor, we need not consider it in our discussion.
Let us briefly explain how we obtain the CKM matrix; the procedure for the PMNS matrix is parallel. We have the mass matrix $M_3$ for the $u$, $c$, $t$ quarks, as in section II B, with the complex parameters.
First we calculate the squared mass matrix as in equation (21): Here we have two choices that give the same eigenvalues but different eigenvectors. To obtain its eigenvalues and eigenvectors, we compute where $U$ is a unitary matrix and $D$ is a diagonal matrix. The masses of the $u$, $c$, $t$ quarks are given by the square roots of the eigenvalues (equation (A2)). Here we have swapped the columns of $U$ and $D$ so that $m_u < m_c < m_t$. Then the eigenvectors are regarded as the quark mass states where we set the quark current states as We repeat similar calculations for the $d$, $s$, $b$ quarks (see section II C) and obtain where $V$ is a unitary matrix including the eigenvectors of the squared mass matrix for the $d$, $s$, $b$ quarks. Note that, by definition, the current quarks should satisfy Therefore, the CKM matrix can be calculated as For the calculation of the PMNS matrix, we use the mass matrix $M_3$ for charged leptons in section II A and $M_3 = M_D$ or $M_M$ for neutrinos in section II D.
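The steps just described can be sketched numerically. The sketch below assumes that the mixing matrix is formed as $U^\dagger V$ from the two eigenvector matrices, which is the standard construction but is not confirmed by the text available here. The flag `left` selects between the two possible squared mass matrices ($M M^\dagger$ or $M^\dagger M$), and the input matrices are hypothetical placeholders, not the fitted mass matrices.

```python
import numpy as np

def mass_eigenbasis(m, left=True):
    """Masses (ascending) and eigenvectors of a squared mass matrix.

    left=True uses M M^dagger, left=False uses M^dagger M; both have the
    same eigenvalues but, in general, different eigenvectors.
    """
    m = np.asarray(m, dtype=complex)
    m2 = m @ m.conj().T if left else m.conj().T @ m
    eigvals, vecs = np.linalg.eigh(m2)   # ascending eigenvalues, eigenvectors as columns
    return np.sqrt(np.clip(eigvals, 0.0, None)), vecs

def mixing_matrix(m_up, m_down, left=True):
    """Assumed form U^dagger V; eigenvector phases are arbitrary, so
    only the absolute values of the entries are convention independent."""
    _, u = mass_eigenbasis(m_up, left)
    _, v = mass_eigenbasis(m_down, left)
    return u.conj().T @ v

# Hypothetical placeholder matrices for the up-type and down-type sectors
m_up_toy = np.array([[1.0, 2.0, 0.0], [2.0, 10.0, 3.0], [0.0, 3.0, 100.0]])
m_down_toy = np.array([[1.0, 1.0j, 0.0], [-1.0j, 8.0, 2.0], [0.0, 2.0, 60.0]])

ckm_like = mixing_matrix(m_up_toy, m_down_toy)
print(np.round(np.abs(ckm_like), 4))
```

Because `numpy.linalg.eigh` already returns eigenvalues in ascending order, no explicit swapping of columns is needed to obtain the ordering $m_u < m_c < m_t$.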

Methods
Our goal is to find a set of complex parameters that best fit the experimental results.
The experimental results we use here are
• the absolute values of the elements of the mixing (CKM or PMNS) matrix $\zeta_{ij}$
• the ratios of masses $\xi_k$.
The mixing matrices in both cases have $3 \times 3 = 9$ elements. Note that the CP violation phases are not used in our fits. For quark masses, we choose the parameters $\xi_k = (m_t/m_c, m_b/m_s)$. This means we do not fit the $u$ and $d$ quark masses: in all the results we obtained they are much smaller than the experimental values, just as we saw in section II B.
For lepton masses, we choose $\xi_k = \Delta m^2_{21}/\Delta m^2_{32}$, i.e., a ratio of differences of squared neutrino masses. Since the masses of $e$, $\mu$ and $\tau$ are already fixed, as in section II A, we have no parameters to fit them.
Then we define the loss function to measure a "difference" between our results and the experimental results: where $\zeta^{\rm exp}_{ij}$ and $\xi^{\rm exp}_k$ are the experimental results, while $\zeta^{\rm cal}_{ij}$ and $\xi^{\rm cal}_k$ are the results of our numerical calculations (which depend on the three complex parameters). The factor of 2 in the second term ensures that the contribution from this term is not much smaller than that from the first term: the ratios of masses have only 2 (or 1) parameters in the quark (or lepton) case, while the mixing matrix has 9 parameters. Now let us search for the complex parameters that minimize the loss function (A8).
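A loss of this general shape can be sketched as below. The functional form is an assumption: the sketch uses squared relative deviations for both terms, which the text available here does not confirm, and only the weight of 2 on the mass-ratio term is taken from the description above. All numbers in the usage lines are placeholders, not experimental data.

```python
import numpy as np

def loss(zeta_cal, zeta_exp, xi_cal, xi_exp, mass_weight=2.0):
    """Assumed loss: squared relative deviations of the mixing-matrix
    moduli plus mass_weight times those of the mass ratios."""
    zeta_cal = np.abs(np.asarray(zeta_cal))        # only absolute values are fitted
    zeta_exp = np.asarray(zeta_exp, dtype=float)
    xi_cal = np.atleast_1d(np.asarray(xi_cal, dtype=float))
    xi_exp = np.atleast_1d(np.asarray(xi_exp, dtype=float))
    mixing_term = np.sum(((zeta_cal - zeta_exp) / zeta_exp) ** 2)
    mass_term = np.sum(((xi_cal - xi_exp) / xi_exp) ** 2)
    return float(mixing_term + mass_weight * mass_term)

# Placeholder inputs (not experimental values)
zeta_placeholder = np.full((3, 3), 0.5)
perfect = loss(zeta_placeholder, zeta_placeholder, [100.0, 50.0], [100.0, 50.0])
off_by_ten_percent = loss(zeta_placeholder, zeta_placeholder, [110.0, 50.0], [100.0, 50.0])
print(perfect, off_by_ten_percent)
```

In the lepton case a single ratio can be passed as a scalar for `xi_cal` and `xi_exp`, since the inputs are promoted to one-dimensional arrays.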
First we divide the 3 complex parameters into 6 real parameters $x_i$. Since the following procedure involves calculating eigenvectors of matrices, iterative approximation with gradient descent is not well suited. Instead, we choose 11 lattice points for each real parameter where for simplicity the lattice spacing $\delta x$ is the same for all $i$, and at first we set $x_i^0 = 0$ for all $i$. Then we have $11^6$ lattice sites in total.
After calculating the loss function (A8) at all the lattice sites, we find the set of parameters $x_i^{\rm min}$ with the minimum loss among them. Next we set $x_i^0 = x_i^{\rm min}$ and $\delta x \to \delta x/6$, and repeat this procedure six times. Finally the lattice spacing becomes $\delta x/6^6$.
We tried several cases satisfying $10^{-3} \le \delta x/6^6 \le 10^{-2}$, and calculated both cases of the squared mass matrix (A1). Then we obtain a certain set of parameters with the minimum loss among all the results we obtained. In our discussion we regard it as the best fit to the experimental results.
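The iterative lattice search can be sketched generically as follows. Two readings are assumed here because the defining equation is not available: the 11 points per parameter are taken to be centered on the current best point (five on each side), and the schedule is taken to be one initial sweep followed by six refinements, so that the last lattice evaluated has spacing $\delta x/6^6$. The function `lattice_refine` and the quadratic `toy_loss` are hypothetical, and the usage example is two-dimensional to keep the run time small (the six-parameter case visits $11^6 = 1{,}771{,}561$ sites per sweep).

```python
import itertools
import numpy as np

def lattice_refine(loss, dim, dx, points=11, shrink=6.0, refinements=6):
    """Repeated exhaustive search on a shrinking lattice.

    Each sweep evaluates loss on points**dim sites centered on the
    current best point, then recenters and divides the spacing by shrink.
    """
    center = np.zeros(dim)                     # x_i^0 = 0 at the start
    half = points // 2
    offsets = np.arange(-half, half + 1)       # -5 .. 5 for 11 points
    best_loss = np.inf
    for _ in range(refinements + 1):           # initial sweep + refinements
        best_x, best_loss = center, np.inf
        for steps in itertools.product(offsets, repeat=dim):
            x = center + dx * np.array(steps)
            value = loss(x)
            if value < best_loss:
                best_x, best_loss = x, value
        center = best_x
        dx /= shrink
    return center, best_loss

def toy_loss(x):
    # Hypothetical smooth loss with its minimum at (0.3, -0.7)
    return (x[0] - 0.3) ** 2 + (x[1] + 0.7) ** 2

x_best, l_best = lattice_refine(toy_loss, dim=2, dx=0.5)
print(x_best, l_best)
```

The search needs no gradients, which is the reason given above for preferring it when the loss depends on eigenvectors. Since each sweep includes the previous best point, the minimum loss never increases from one sweep to the next.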

CKM matrix
The best fit we obtained for the CKM matrix is Here we input $m_t$ and $m_b$ for normalization. The CKM matrix can be expressed in terms of Wolfenstein parameters and we obtain The experimental values for these are [12] Note that, again, we look only at the central values of the experimental data.
Some comments are in order for these results:
1. The agreement is generally excellent.
2. Masses of u, d, s quarks come out to be rather small. This is due to large hierarchical property of the mass matrices. Lattice QCD mass is somewhat different from the Higgs coupling, especially its renormalization corrections, but it is not clear at this time whether this fact can account for the difference.
3. The CKM matrix also has renormalization corrections [13]. The fact that our result is not far from the experimental value may indicate that our theory is indeed a low-energy theory rather than a very-short-distance theory.

PMNS matrix
Our best fit for the PMNS matrix is obtained as follows. We discuss the two cases of pure Here $\Delta m^2_{21}$ is our input for normalization, which is the same for all the fittings below.
We see that