Lattice quantum chromodynamics at the physical point and beyond

the work of the PACS-CS to lattice


Introduction
Lattice quantum chromodynamics (QCD) has close to four decades of history [1], and numerical simulation of lattice QCD has now been pursued for over 30 years [2][3][4][5][6]. Despite the promise that it would provide a first-principles method for extracting physical predictions from the basic Lagrangian of QCD, lattice QCD simulation in practice suffered from restrictions due to the need for various approximations and extrapolations. Efforts over the past three decades have removed these restrictions one by one, and numerical lattice QCD has now reached the point where calculations are possible directly at the physical point with light up, down, and strange quark masses taking physical values. We believe that this is a turning point in lattice QCD research. The purpose of the present article is to review our effort to achieve this goal, to present physics results obtained along the way [7][8][9][10][11][12][13][14][15][16][17][18], and to discuss future directions.
In order to place our work in a historical perspective, let us discuss the situation of lattice QCD simulation leading up to the middle of the 2000s when the work to be reviewed here was started.
There have been five sources of systematic errors which have plagued the numerical simulation of lattice QCD. These are (i) the quenched approximation, which ignores quark vacuum polarization effects, (ii) the effects of finite lattice spacing, (iii) the effects of finite lattice volume, (iv) the difficulty of maintaining chiral symmetry on the lattice, and (v) the need to make an extrapolation from heavy quark masses to their physical values.
Algorithmically, the quenched approximation was overcome by the development of dynamical fermion algorithms, notably the hybrid Monte Carlo method [19] in the middle of the 1980s. Together with more recent extensions, including the polynomial [20,21] and rational [22] hybrid Monte Carlo methods, simulations with dynamical up, down, and strange quarks had become routine by around 2000.
Much work has been devoted over the years to improving the lattice action and operators in such a way that lattice spacing dependent errors in physical observables are reduced in magnitude. This procedure is known as improvement [23][24][25], and the results, implemented non-perturbatively, have been widely employed.
Finite-volume effects occupy an interesting position in this discussion. In general we need large enough volumes to suppress finite-volume corrections, and no methods other than increasing the computer power are viable for overcoming this obstacle. However, there are situations where finitevolume effects measured under a judicious choice of lattice volume can be exploited to explore physics in infinite volume [26,27]. An important case is the property of resonance such as ρ, K * , and . The long-standing issue of CP violation in K meson decays to two pions is also connected to this issue [28].
Concerning chiral symmetry, the well-known Nielsen-Ninomiya theorem [29,30] states that, under reasonable assumptions, lattice fermion action cannot maintain chiral symmetry without species doubling. It was realized in the 1990s that this conclusion could be avoided by modifying chiral symmetry suitable for lattice regularization; both domain wall fermion [31,32] and overlap [33,34] fermion formalisms achieve chiral symmetry in this way. Numerical simulations using these formalisms are much more time consuming, but the obstacle as a matter of principle has been removed.
The final obstacle was an extrapolation from the region of heavy quark masses to much lighter values corresponding to the physical point. The problem stems from the need to invert the lattice Dirac operator D(U ) for quarks on a background gluon field U . The condition number of D(U ) that controls the convergence of iterative algorithms for carrying out inversion is proportional to the inverse of the quark mass, 1/m q . This number is large for physical up and down quarks whose masses of a From a computing point of view, PACS-CS was built using commodity technology. An Intel Xeon processor was used for the CPU and an Ethernet with commercially available switches was adopted for the network. The network throughput was relatively weak for running the standard hybrid Monte Carlo code for lattice QCD. However, the domain decomposition significantly lessened the demand for communication between nodes, and was hence quite well suited to the architecture of PACS-CS. A great advantage of using commodity components was a much shorter time for designing and building the computer compared to developing a custom-made CPU and network.
Let us now list the physics issues we explored in lattice QCD using the domain-decomposed HMC algorithm on the PACS-CS cluster computer. Our study was carried out for the Wilson quark action with a non-perturbatively improved clover term [53] and a renormalization-group improved gluon action developed by Iwasaki [54]. The discretization error from the action is hence O(a 2 ). Three flavors of quarks, corresponding to a degenerate pair of up and down quarks, and a strange quark with a separate mass, are incorporated in the calculation. We choose a lattice size of 32 3 × 64 and the inverse bare coupling constant of β = 1.90, which corresponds to a lattice spacing of a ≈ 0.1 fm.
Our first set of simulations [7,8] explored the quark mass dependence of light hadron masses. With a significant effort to optimize the algorithm, which included a number of techniques that we shall describe below, we were able to lower the up and down quark masses down to the value corresponding to the pion mass of m π ≈ 150 MeV. This was quite close to the physical pion mass of 135 MeV. The first physics result from this work is the light hadron mass spectrum for the ground state mesons and baryons. While this was a calculation at a finite lattice spacing of a ≈ 0.1 fm, the result was consistent with experiment within a few %. Another important result is the implication for the range of validity of chiral perturbation theory. The SU(3) chiral perturbation theory formula treating all three quarks as light could not fit our data for pseudoscalar mesons, strongly implying that the physical strange quark mass is too heavy to be treated by chiral perturbation theory.
The smallest pion mass of 150 MeV, albeit close to the physical value, is still 15% away from it. The question naturally arises whether it is possible to carry out a simulation exactly on the physical point. Running a number of simulations successively adjusting the bare quark mass parameters until the outcome matches the physical values is an unattractive option since it is too time consuming. The answer turned out to be the reweighting technique [10]. This is an application of an old idea [55] in which one adjusts the quark mass parameter through a calculation of the ratio of the quark determinant for a mass parameter pair, one for the original value and the other for the shifted value. In general the ratio fluctuates exponentially with an exponent proportional to volume, and hence one may not expect the reweighting to work in practice. The reason why this is a viable option now is that the original quark mass is so close to the physical value that the fluctuation of the reweighting factor can be reasonably brought under control.
The value of the strong coupling constant and the physical quark masses are fundamental constants of nature. The finite-volume Schrödinger functional approach [56], which was used for determining the non-perturbative value of the clover coefficient for our O(a) improved action, can be applied to calculate the non-perturbative evolution of the relevant factors: the step scaling function for the coupling constant and the renormalization factor for quark mass. The results are described in Refs. [9] and [11]. 4 There are a number of physical applications which can benefit from lattice QCD calculations carried out either near or at the physical point. We have investigated three classic examples: (i) the calculation of the physical properties of the ρ meson as a resonance [15] making use of the relation between the two-particle phase shift and two-particle energy on a finite volume [26], (ii) the electromagnetic form factor and charge radius of a pion [13], whose interest stems from the singular behavior expected toward vanishing quark mass, and (iii) charmed meson spectroscopy [14] explored on a sea of physically light quarks.
A new arena which is opened by the ability to control quark masses around the physical values is the exploration of isospin breaking and electromagnetic difference of hadron masses. The experimental mass differences are so tiny that extrapolations from the region of heavy quark masses cannot determine them in a precise manner. The physical interest in these tiny mass differences stems from the fact that they have implications beyond particle physics, e.g., the proton neutron mass difference is an important parameter in nucleosynthesis after the Big Bang. We have attempted a simulation fully incorporating QED effects and splitting the up and down quark masses through the reweighting technique. Our results are reported in Ref. [17,18].
Since the single hadron properties have been brought under control, one can contemplate the possibility of exploring the world of multi-hadrons, or nuclear physics, directly from lattice QCD. We have explored this area using deuteron and helium, i.e., nuclei with mass numbers 2, 3, and 4 as our initial playground [12,16].
In this article we discuss the topics listed above. We conclude with a brief discussion on the future perspective of lattice QCD.

Formalism
We briefly review the formalism to fix our notations. We employ the renormalization-group improved Iwasaki gluon action [54] and the non-perturbatively O(a)-improved Wilson-clover [53] quark action. The former is composed of a plaquette and a 1 × 2 rectangle loop: with c 1 = −0.331 and c 0 = 1 − 8c 1 = 3.648, and g 2 the bare coupling constant. The latter is expressed as where D q (U ) in the first line represents the lattice Wilson-clover operator for flavor q whose form is spelled out in the second line with κ q the hopping parameter. The field strength F μν in the clover 5 term is given by and The improvement coefficient c SW for O(a) improvement was determined nonperturbatively in Ref. [57].

Acceleration by domain decomposition
In order to discuss the salient features of the domain-decomposed hybrid Monte Carlo algorithm, let us consider a simplified problem given by the partition function where φ is a bosonic field with Dirac and color indices. The hybrid Monte Carlo (HMC) algorithm is based on the rewriting of (3.2) as a canonical partition function for a four-dimensional system with a fictitious Hamiltonian given by where P nμ is an su(3) Lie algebra valued momentum conjugate to U nμ and obeys the Gaussian distribution indicated by the kinetic term. In order to generate gluon configurations distributed according to the weight exp(−H )/Z , one uses the Hamilton equation with respect to the fictitious HMC time τ given by For computer implementation one discretizes τ in steps of δτ . Employing a symplectic integrator such as the leapfrog, one integrates the Hamilton equation from an initial configuration {P nμ (0), U nμ (0)} with randomly generated initial Gaussian momenta over a time interval τ and generates a trial configuration {P nμ (τ ), U nμ (τ )}. This is followed by a Metropolis accept/reject step with the probability P = min{1, exp(− H )}, where H is the change of Hamiltonian from the initial to final configuration, to correct for the bias introduced by the violation of Hamiltonian conservation due to a finite step size δτ . In short the HMC algorithm is a Markov chain Monte Carlo method that consists of molecular dynamics evolution and the Metropolis test.
Since the computational cost is proportional to the number of steps τ/δτ per trajectory, one likes to make the step size as large as possible. On the other hand, the product of the step size and force δτ · F nμ has to be kept bounded since beyond a certain value the trajectory generated by a discrete integrator changes nature from elliptic to hyperbolic, subsequently losing the required properties of high acceptance and reversibility for finite precision arithmetic. In practice the hyperbolic behavior tends to be signaled by a sudden and large jump in the Hamiltonian.
The gluon contribution to the force is bounded since ∂ S gluon (U )/∂U nμ is local and consists of terms made from unitary SU (3) matrices. On the other hand, the quark contribution is non-local. It involves the inverse of the lattice Wilson-clover operator whose minimum eigenvalue decreases with the quark mass m q . Thus one expects the quark contribution to the force to grow as the quark mass decreases, and hence the step size has to be made proportionately smaller.
Calculating the inverse of the Wilson-clover operator by an iterative solver such as the conjugate gradient method requires computations growing as the condition number of the matrix, which is proportional to 1/m q . Including the constraint δτ ∝ V −1/4 for keeping acceptance of the Metropolis step at a reasonable value, and empirical information on the auto-correlation time between successive configurations also growing as 1/m q , the experience on the cost of hybrid Monte Carlo algorithm for full QCD with two flavors of Wilson-clover quark action was summarized as [36] cost[Tflops · years] = C #conf 1000 · 0.6 m π /m ρ with C ≈ 2.8. The rapid rise proportional to m −6 π ∝ m −3 ud represented a serious obstacle in reaching the physical point where m π /m ρ = 0.18 (m π = 135 MeV, m ρ = 785 MeV). Since the estimate (3.6) was reported in a panel discussion [35] at the 2001 International Lattice Field Theory Symposium held in Berlin, the rapid rise has since been called the Berlin Wall.
The crucial point which led to the resolution of the Berlin Wall is the observation by Lüscher [48] that the magnitude of the force is hierarchically ordered. Let us decompose the whole lattice into a set of sub-lattices i , i = 1, 2, . . . , n, and rewrite the Wilson-clover operator in terms of those defined on each of the sub-lattice D UV i and the remainder D IR (this is called Schwarz domain decomposition). One can then decompose the quark contribution to the force into the ultraviolet part F UV nμ arising from the sub-lattice operator D UV i and the infrared part F IR nμ coming from the remainder D IR . Adding the gluon contribution F g nμ to the list, and comparing the magnitude in actual simulations, it was empirically found that the three terms satisfy where the norm is defined by M 2 = −2 nμ tr(M 2 nμ )/4V . Since the magnitudes are so separated, one can employ different step sizes for the three terms which are inversely reordered in magnitude, i.e., Symplectic integrators using more than one step size for various pieces of the force have been known for a long time [49].
In the standard HMC algorithm with a single and common step size, that step size has to be the smallest one corresponding to the gluon force δτ = δτ g . Hence the quark force, which involves the time consuming inversion of the Wilson-clover operator, has to be calculated at every time step. However, with multi-time step integration, the number of inversions of D UV i containing the ultraviolet modes is reduced to every δτ UV /δτ g 1 steps, and that of D IR for the infrared modes is further reduced to every δτ IR /δτ g δτ UV /δτ g 1 steps. Since it is the inversion of D IR containing the 8/31 Downloaded from https://academic.oup.com/ptep/article-abstract/2012/1/01A102/1555639 by guest on 28 July 2018 infrared modes that requires the most computations, one essentially accelerates the HMC algorithm by a factor δτ IR /δτ g 1. In practice one controls the step sizes by three integers N 0,1,2 and writes δτ g = τ/N 0 N 1 N 2 , δτ UV = τ/N 1 N 2 , δτ IR = τ/N 2 with τ the trajectory length. We call this algorithm DDHMC, and implement it for the degenerate up-down quark sector of our full QCD code.

Further acceleration techniques
In our actual simulations on a 32 3 × 64 lattice at the lattice spacing a −1 ≈ 2 GeV, we found that the DDHMC algorithm using a sub-lattice size of 8 4 worked well down to the pion mass m π ≈ 300 MeV. For lighter pion masses, however, the trajectories generated by the algorithm tended to exhibit large fluctuations of F IR with large spikes of H signaling unstable molecular dynamics evolution. We employed the method of mass preconditioning [51] to tame this problem. Let D IR (U ; κ) be the infrared part of the Wilson-clover operator with hopping parameter κ. Let κ = ρκ be a slightly smaller hopping parameter with ρ slightly less than unity. Then one can separate the infrared modes further by decomposing The IR force F IR nμ is then split into F IR nμ andF IR nμ . We call this algorithm MPDDHMC. Employing this algorithm with ρ = 0.9995, we could further reduce the pion mass to m π ≈ 150 MeV [7] where the various terms of the force took the values given by  One can further extend the algorithm in which two mass preconditioners are used to split the infrared force to three terms F IR nμ , F IR nμ , andF IR nμ by setting κ = ρ 1 κ and κ = ρ 2 κ . This version is then called MP 2 DDHMC.
Even with the added mass preconditioning, we found calculations at around the physical pion mass of m π ≈ 150 MeV to be extremely demanding. We implemented the following improvements to stabilize and speed up the molecular dynamics evolution: (i) a chronological inverter [58] which uses the last 16 solutions to construct the initial solution vector of D −1 on the full lattice, (ii) a nested BiCGStab solver, which consists of an inner solver accelerated with single precision arithmetic, and an outer solver with a stringent tolerance of 10 −14 operated with double precision. The approximate solution obtained by the inner solver works as a preconditioner for the outer solver, (iii) a deflation technique when the inner BiCGStab solver becomes stagnant during the inversion of D. The inner solver is then automatically replaced by the GCRO-DR (Generalized Conjugate Residual with implicit inner Orthogonalization and Deflated Restarting) algorithm [59] which automatically constructs and deflates the low modes of D, the source of the stagnant behavior, during the solver iteration.
For the strange quark, we employ the polynomial hybrid Monte Carlo algorithm [20,21] in which the inverse of the Wilson-clover operator is approximated by a polynomial, with the bias corrected by a global Metropolis test. This algorithm works for odd number of flavors, and we improve it [60] with the technique of UV filtering [61]. The domain decomposition is not used. Since we find F s ≈ F IR , the step size is chosen as δτ s = δτ IR .  Full QCD simulation with a degenerate pair of up and down quarks and a strange quark with a separate mass is often called N f = 2 + 1 QCD. In Fig. 1 we compare the computational cost of N f = 2 + 1 QCD for the standard HMC algorithm estimated from (3.6) with the actual cost obtained in our domain-decomposed HMC simulation at m π ≈ 300 MeV and 150 MeV. The figure demonstrates in a dramatic fashion that simulations of full lattice QCD at the physical point for light up, down, and strange quarks are now reality.
The details of our mass-preconditioned domain-decomposed hybrid Monte Carlo algorithm are spelled out in the Appendix of Ref. [7].

Chiral behavior of hadron masses toward the physical point
We generated a set of configurations using the domain-decomposed HMC algorithm on the PACS-CS massively parallel cluster computer. The up and down quarks are assumed degenerate with a common hopping parameter κ ud and the strange quark hopping parameter κ s is separate. The lattice size is chosen to be 32 3 × 64 and the inverse bare gluon coupling constant β = 6/g 2 = 1.90. This value corresponds to the lattice spacing a = 0.0907(13) fm. In Table 1 we list the run parameters.
We note that the run at (κ ud , κ s ) = (0.137 00, 0.136 40) coincides with the lightest pion mass with our previous attempt [46,47] based on the standard hybrid Monte Carlo algorithm. Thus we have come down in the pion mass from m π ≈ 700 MeV to 150 MeV, which corresponds to a factor (700/150) 2 ≈ 22 reduction in the up-down quark mass.
In Fig. 2 we show m 2 π /m AWI ud (for a definition of AWI quark mass, see Section 6), and the ratio of pseudoscalar decay constants f K / f π . The almost linear behavior observed for heavier quark masses (filled points) [46,47] is quite deceptive. The ratio m 2 π /m AWI ud bends upwards when the up-down quark mass approaches the physical value shown by the vertical line. The ratio f K / f π , while not exhibiting such dramatic behavior, nonetheless shows a clear upward deviation toward the experimental point.
Of course this is precisely the behavior expected from chiral perturbation theory toward vanishing quark mass. It is clear, however, that without data close to the physical point, it would be, and in fact it has been, very difficult to reliably pin down the value at the physical point from data in the heavy quark mass region alone.
Given that calculations at the physical point are now possible, one should turn the tables and ask if chiral perturbation theory correctly describes our data. We carry out such an analysis for the octet Table 1. Simulation parameters of our PACS-CS runs on a 32 3 × 64 lattice. Pion and kaon masses are measured values multiplied by a −1 = 2.176 GeV as estimated in Ref. [7]. The third row lists the integers (N 0 , N 1 , N 2 , N 3 , N 4 ), specifying the multi-time steps. MD time is the number of trajectories multiplied by the trajectory length τ . CPU time for unit τ using 256 nodes of PACS-CS is also listed. (left) and f K / f π (right) as a function of m AWI ud reproduced from Ref. [7]. Vertical line denotes the physical point and star symbol represents the experimental value. of pseudoscalar mesons [7], and for the octet and decuplet of baryons [8]. Since we use the Wilsonclover quark action with explicit chiral symmetry breaking, we have to employ chiral perturbation theory incorporating such breaking effects [62]. It turns out that these effects can be absorbed into the leading order low energy constants so that the continuum expressions can be applied. If one assumes that up, down, and strange quarks are light, then one should employ SU(3) chiral perturbation theory. Taking the pion as an example, the well-known formula reads [63] where 3) S. Aoki et al. Fig. 3. SU(2) chiral perturbation theory fit for m 2 π /m AWI ud and f K reproduced from Ref. [7]. FSE in legends means including finite-size corrections. @ph means fit prediction at the physical point. We simultaneously fit data to the formula such as the above for m 2 π /(2m ud ), m 2 K /(m ud + m s ), f π , and f K . We find that the fit exhibits a large χ 2 /dof, and the dependence on the strange quark mass is not reproduced well. Furthermore, the next-to-leading order contribution coming from the kaon loop is uncomfortably large in the decay constants.
A baryon spectrum analysis is tried using heavy baryon SU(3) chiral perturbation theory [64]. Incorporating the chiral symmetry breaking effects of the Wilson-clover quark action is left for future work. The leading order formula yields a reasonable fit of the data. Including next-to-leading order corrections, however, the flavor SU(3) coupling constants are found to take very small values quite different from existing phenomenological estimates. It is our conclusion that the strange quark mass is too large to be treated by chiral perturbation theory.
This situation leads us to make a reanalysis treating only up and down quarks as light. For this purpose we use the SU(2) chiral perturbation theory formula, and make a linear extrapolation or interpolation for the strange quark mass since our simulation points are close to its physical value. We find good fits for pion masses below m π ≈ 400 MeV as shown in Fig. 3.
In Fig. 4 the light hadron spectrum extrapolated to the physical point using SU(2) chiral perturbation theory is compared with experimental values. Finite-size corrections are taken into account for completeness, but they do not make large contributions. For the vector mesons and the baryons we use a simple linear formula m had = α h + β h m AWI ud + γ h m AWI s . Data in the range m π ≤ 400 MeV are used for these analyses. 12  We need three physical inputs to determine the up-down and strange quark masses and the lattice cutoff. We choose m π , m K , and m . The choice of m has both theoretical and practical advantages: the baryon is stable in the strong interaction and its mass, being composed of three strange quarks, is determined with good precision with small finite-size effects.
Numerical values are listed in Table 2. The largest discrepancy between our results and the experimental values is at most 3%, although errors are still not small for the ρ meson, the nucleon, and the baryon. The results are clearly encouraging, but we have to note that we still need to remove errors of O((a QCD ) 2 ).

The reweighting technique and lattice QCD calculations at the physical point
Since reaching the physical point for light up, down, and strange quark masses has been achieved, we may ask if it is possible to carry out calculations in lattice QCD precisely at the physical point. This would be beautiful since, aside from effects of finite lattice spacing, one would be directly exploring the physics of strong interactions as they take place in nature.
A priori we do not know the precise values of bare quark masses corresponding to the physical point. Repeating simulations until one hits the physical point is clearly unattractive. The question, therefore, is whether one can readjust the parameters to the physical point, given a set of configurations generated with parameters close to the physical point. The reweighting technique [55] turned out to meet this need.
Let us consider evaluating O[U ](κ ud , κ s ) (κ ud ,κ s ) , which is the expectation value of a physical observable O at the target hopping parameters (κ ud , κ s ), using the configuration samples generated at the original hopping parameters (κ ud , κ s ). We assume that ρ ud ≡ κ ud /κ ud 1 and ρ s ≡ κ s /κ s 1.
The key is the following identity:  where The reweighting factors can be evaluated by a stochastic method. For the up and down quark sector, for example, introducing a complex bosonic field η with color and spinor indices, the determinant of W is expressed as where · · · η means the expectation value with respect to η.
To reduce the fluctuations in the stochastic evaluation of R ud [U ] and R s [U ] we employ the determinant breakup technique [66]. The interval between κ q and κ q is divided into N B subintervals: κ q ,κ q + q , . . . , κ q + (N B − 1) q , κ q with q = (κ q − κ q )/N B , and the determinant ratios are evaluated with the different set of η and multiplied together.
We implement the reweighting procedure in the following way. Since the lightest pion mass m π ≈ 150 MeV and the lightest kaon mass m K ≈ 550 MeV reached in the first round of simulations [7] are still heavier than the physical values by about 10%, we make an additional run at (κ ud , κ s ) = (0.137 785, 0.136 60). With preliminary trial runs, we then estimate (κ ud , κ s ) =  under control. In addition, we observe a clear positive correlation between the reweighting factors and the plaquette value as expected.
In Fig. 6 we plot the hadron masses calculated at the target point normalized by that of in comparison with experiment. Matching to the physical point is very good. The deviation for the ρ meson may be ascribed to its resonance nature, and those for baryons probably reflect finite-size effects; the box size of 3 fm for our 32 3 × 64 lattice is probably too small for baryons.

Strong coupling constant and quark masses-fundamental constants of QCD
The values of the strong coupling constant α s = g 2 /4π and the quark masses m q , q = ud, s are fundamental constants of nature. Since they both vary under scale change, we need to understand their renormalization, and specify the scale μ and the scheme to quote their values. We employ the finite-volume Schrödinger functional method for this purpose [67,68].

Strong coupling constant
The Schrödinger functional is the partition function defined on a lattice of a finite extent L 3 × T with certain boundary conditions imposed on the gluon and quark fields. The coupling constantḡ 2 (L) in the Schrödinger functional scheme is defined through a variation of the effective action under a certain change of the gluon boundary condition.
The fundamental quantity in this method is the step scaling function σ (u). It represents the change of the coupling constant under a scale change by some factor s, typically taken to be s = 2; g 2 (s L) = σ (u) with u =ḡ 2 (L). One numerically calculates the lattice value of the step scaling function (u, a/L) for a set of lattice sizes and the bare coupling constant, and takes the limit a/L → 0 to find the continuum limit σ (u) = lim a/L→0 (u, a/L). This is done for a set of values of u covering a sufficient range from weak to strong coupling. Once one has σ (u), one can calculate the β function β( Fig. 7 we plot our results for the step scaling function (left-hand panel) and the beta function (right-hand panel) for N f = 3 QCD in the Schrödinger functional scheme [9].
The strong coupling constant α s (μ) is usually quoted in the MS scheme at a high energy scale μ = M Z . Finding this value requires evaluating the renormalization-group invariant scale for N f = 3 QCD at low energy, and then running up to M Z , crossing the charm and bottom thresholds on the way. Since we have the step scaling function, the first step can be carried out if we have the value of the coupling constantḡ 2 (L max ) at some scale L max . This in turn requires the value of  L max in physical units, which we supply from the determination of the lattice spacing through hadron mass spectrum calculations.
In order to control the continuum limit, we may employ the scale determination from the previous N f = 2 + 1 QCD simulation at three lattice spacings [46,47] carried out by the joint CP-PACS-JLQCD Collaboration using the same gluon and quark actions. The result is given by [9] α s (M Z ) = 0.12047(81)(48)( +0 −173 ), (6.1) where the first error is statistical and the second represents the systematic error of perturbative matching across the charm and bottom thresholds. The last parenthesis shows an estimate of the systematic error from a constant and a linear continuum extrapolation. For comparison, the world average excluding lattice results is α s (M Z ) = 0.1186 (11). [70] We have to note that the above results do not include systematic errors due to chiral extrapolation. If one uses the scale from the current N f = 2 + 1 runs using the domain-decomposed HMC algorithm down to m π ≈ 150 MeV [7], one finds α s (M Z ) = 0.1225(14)(5) and

Light quark masses
We define the bare quark mass through the axial vector Ward-Takahashi identity (AWI) by taking the ratio of matrix elements of the pseudoscalar density P =qγ 5 q and the fourth component of the axial vector current A 4 :m where |PS denotes the pseudoscalar meson state at rest and f and g ( f, g = u, d, s) label the flavors of the valence quarks. We employ the non-perturbatively O(a)-improved axial vector current A imp 4 = A 4 + c A∇4 P with∇ 4 the symmetric lattice derivative, and c A = −0.03876106 as determined in Ref. [71]. The O(m q a) corrections are removed by defining  where the improvement coefficients b A,P are perturbatively evaluated up to the one-loop level [72] with tadpole improvement. The renormalization of quark mass as defined above is given by those of A μ and P. We therefore have to determine the step scaling function of the pseudoscalar density σ P (u), and the renormalization constant Z A for the axial vector current which is scale independent due to partial conservation.
In Fig. 8 we show our results for the step scaling function for the pseudoscalar density and the renormalization constant for the axial vector current. We combine them with the bare quark masses determined in hadron mass spectrum calculations to calculate the quark masses m q (μ) at a scale μ in physical units. With the present PACS-CS data [7], we find m MS ud (μ = 2 GeV) = 2.78 (27) MeV for the average up and down quark, and m MS s (μ = 2 GeV) = 86.7(2.3) MeV for the strange quark [11]. We have to wait until simulations at weaker couplings are made to see if these values stay toward the continuum limit.

Resonance properties through finite-volume two-particle analyses
Masses of resonances such as ρ, K * and cannot be correctly extracted from the exponential decay of two-point functions. Instead, one can use the relation between the energies of two-particle states on a finite box and the elastic phase shift to calculate the latter, from which one can extract the pole position and the resonance width. While the theoretical foundations of this topic were laid out quite early [26,27] and have been explored in simulations for many years [73,74], it is only recently, with the development of full QCD simulations near the physical point, that a realistic calculation of the resonance properties has become feasible.
The basis is the formula which connects the energy E of a two-particle state on a finite box with the elastic phase shift δ(k) in the same channel. When the two particles are degenerate in mass, k is defined by E = 2 √ m 2 + k 2 with m the mass of each particle. For zero center of mass momentum and a spatial lattice of a size L 3 , the connection is given by with q = k L/(2π). The extension to a non-zero center of mass is also known [75]. 17 (1, 0) for the irreducible representations considered in Ref. [15], ignoring hadron interactions. P is the total momentum, g is the rotational group on the lattice in each momentum frame and is the irreducible representation of the group. The vectors in parentheses after (ππ) and ρ refer to the momenta of the two pions and the ρ meson in units of 2π/L. We use the notation (ππ)(p 1 )(p 2 ) = π + (p 1 )π − (p 2 ) − π − (p 1 )π + (p 2 ) for two-pion states. Index j for the T − 1 representation takes j = 1, 2, 3 and k for E − takes k = 1, 2.
We apply this formalism to the two-pion channel to extract the I = 1 P-wave phase shift, with which we calculate the resonance properties of the ρ meson [15]. Calculations are carried out for the two configuration sets corresponding to m π = 410 MeV and 300 MeV on a 32 3 × 64 lattice listed in Table 1.
In order to calculate the phase shift at various energies from configurations with a single size, we consider three choices for the total momentum: the center of mass frame with total momentum P = (0, 0, 0) (CMF), the non-zero momentum frame with P = (2π/L)(0, 0, 1) (MF1), and P = (2π/L)(1, 1, 0) (MF2). The ground and first excited states for these momenta with isospin (I, I z ) = (1, 0), ignoring the hadron interactions, are shown in Table 3 where g is the rotational group in each momentum frame on the lattice and is the irreducible representation of the rotational group.
For the T − 1 and E − representations the invariant mass √ s = E 2 − P 2 for the two-pion state is sufficiently heavier than the ρ mass so that we only calculate the ground state energy from the ρ meson propagator in the standard way. For the A − 2 and B − 1 representations for which the invariant mass is close to or lighter than the ρ mass, we calculate a 2 × 2 matrix of correlation functions in the ρ and ππ channels given by We extract the ground and first excited state energies by a single exponential fit of the two eigenvalues λ 1 (t) and λ 2 (t) of the matrix M(t) = G(t)G −1 (t R ) with some reference time t R , assuming that the lower two states dominate the time correlation function. In Fig. 9 we plot the results for the phase shift in terms of k 3 /( √ s tan δ(k)). For an effective ρππ coupling of form L eff = g ρππ μabc abc (k 1 − k 2 ) μ ρ a μ ( p)π b (k 1 )π c (k 2 ), we expect a linear relation given by A χ 2 fit then yields an estimate of the resonance mass m ρ and the ρππ coupling g ρππ . We list the results in Table 4.
In terms of g ρππ , the physical ρ meson width is given by where k = m 2 ρ /4 − m 2 π and physical values should be used for m ρ and m π . Substituting the experimental width ρ = 146.2(7) MeV yields g ρππ = 5.874 (14). Our results at the two pion masses 18/31

Pion electromagnetic form factor
The electromagnetic form factor of a pion is defined by where Q 2 = −q 2 = −( p − p) 2 is the four-momentum transfer, and J μ is the electromagnetic current given by An interesting feature is that the slope of the form factor at the origin, i.e., the electromagnetic charge radius, defined by is expected to diverge logarithmically as m π becomes small. To extract the form factor, we use the ratio defined by is the pion-to-current three-point function. If one chooses the momenta of the two pions satisfying p = p but | p | = | p|, the ratio converges for large τ and t f as  In contrast with more conventional choices of momenta such as p = 0 and p = 0, our ratio does not involve the two-point function of the pion, and this leads to much better signals as the pion mass is decreased [13]. On a 32 3 × 64 lattice with a 2 GeV inverse lattice spacing, the minimum non-zero quark momentum for the periodic boundary condition 2π/La is about 0.4 GeV. In order to study the form factor for smaller momentum transfer, we use the partially twisted boundary condition technique [76]. If one imposes the following boundary condition on valence quark fields: the spatial momentum of that quark is quantized according to where L denotes the spatial lattice size, e j the unit vector in the spatial jth direction, and θ j a real parameter. In this way one can explore arbitrarily small momenta on the lattice by adjusting the value of twist θ j . We apply the Z (2) ⊗ Z (2) random noisy source to improve on statistical errors, as introduced in Ref. [77].
In the left-hand panel of Fig. 10, we plot our result [13] for the form factor at m π ≈ 300 MeV and 410 MeV as a function of the momentum transfer Q 2 . The dashed and solid lines are data fits of the next to next leading order (NNLO) formula of the form factor to SU (2) chiral perturbation theory, combining data for m π = 300 MeV and 410 MeV.
In the right-hand panel we shown the charge radius calculated from the NNLO chiral perturbation theory fit (open circles) as compared to experiment at the physical point (red filled circle). Blue filled circles are the results of the monopole fit of form We see that chiral perturbation theory is reasonably convergent over the range m π ≈ 300 − 410 MeV and Q 2 ≤ 0.1 GeV 2 explored in this calculation. The charge radius extrapolated to the physical point is in agreement with experiment, albeit with a 10% error. The PACS-CS gauge configurations have a set corresponding to m π ≈ 150 MeV. We tried to calculate the form factor on this set, and found that the pion two-and three-point functions exhibited 20 very large fluctuations, to the extent that taking a meaningful statistical average was difficult. This trend became more pronounced as the twist carried by quarks was made larger. Since Lm π ≈ 2.3 at this pion mass for L = 32, we suspect that this phenomenon is caused by the small size of the lattice relative to the pion mass, and the consequent increase in large fluctuations. It is our conclusion that calculating the pion form factor directly at the physical point requires larger lattices such as 64 4 .

Heavy quark physics on a physical light quark sea
We have seen in Section 5 that the reweighting technique allows us to adjust the light up, down, and strange quark masses to the physical values. We can therefore explore heavy quark physics on a physical light quark sea background. This has the advantage that systematic errors associated with chiral extrapolation in the light quark masses, which otherwise plague physical predictions, are absent. Errors due to charm and bottom quenching are expected to be small. We apply this method to the charm quark for the set of configurations at (κ ud , κ s ) = (0.137 785, 0.136 60) reweighted to (0.137 796 25, 0.136 633 75).
Since the lattice spacing on our set of configurations is a −1 ≈ 2 GeV, the charm quark can be treated directly, i.e., without resorting to heavy quark effective theory. We employ the relativistic heavy quark formalism [78] to deal with the discretization errors of O(m charm a).
The relativistic heavy quark formalism is designed to reduce the cutoff errors of O((m Q a) n ) with arbitrary order n to O( f (m Q a)(a QCD ) 2 ), where f (m Q a) is an analytic function around the massless point m Q a = 0, once all of the parameters in the relativistic heavy quark action have been determined non-perturbatively. The action is given by where κ Q is the hopping parameter for the heavy quark. In the present calculation we adjust the parameters r t , r s , c B , c E and ν as follows. We are allowed to choose r t = 1, and we employ the one-loop perturbative value for r s [79]. For the clover coefficients c B and c E we include the non-perturbative contribution in the massless limit c NP SW for three-flavor dynamical QCD [57], and calculate the heavy quark mass dependent contribution perturbatively to one-loop order [79] according to The parameter ν is determined non-perturbatively to reproduce the relativistic dispersion relation for the spin-averaged 1S states of charmonium. Writing  Fig. 11. Left: charmonium mass spectrum in N f = 2 + 1 lattice QCD normalized by the experimental values reproduced from Ref. [14]. Right: hyperfine splitting of charmonium calculated with different numbers of dynamical light flavors, Ref. [14] for N f = 2 + 1 and Ref. [14] for N f = 2, 0.
of f (m Q a)(a QCD ) 2 , due to the use of one-loop perturbative values in part for the parameters of our heavy quark action. We tune the heavy quark hopping parameter by the mass for the spin-averaged 1S states of charmonium, experimentally given by [65] M(1S) = (M η c + 3M J/ψ )/4 = 3.0678(6) GeV. where the first error is statistical, the second is systematic from the scale determination, and the third from uncertainties in the renormalization factors. We show in the left-hand panel in Fig. 11 the charmonium spectrum normalized by experiment. The agreement is satisfactory within one standard deviation. The right-hand panel demonstrates that light dynamical sea quark effects are crucial for understanding charmonium hyperfine splitting. The filled circle shows the result from PACS-CS [14]. Results with dynamical up and down quarks (N f = 2) and with the quenched approximation (N f = 0) are from the CP-PACS Collaboration [80] using the same heavy quark formalism. A small discrepancy that is still visible between the filled circle and experiment should go away as we approach the continuum limit.
The decay constants f D and f D s of charmed D and D s mesons are important for determining the V cd and V cs elements of the Cabibbo-Kobayashi-Maskawa matrix since they are directly related to the leptonic decay width  Table 5. Results for the decay constants of the D and D s mesons from Ref. [14]. The first error is statistical, the second is a systematic error from the scale determination, and the third is from the renormalization factor. Experimental data are also listed [65].
Lattice Experiment (6) Our results are compared with experiment in Table 5. While the agreement for f D s is satisfactory, the result for f D differs by about two standard deviations. It is necessary to take the continuum limit for a definite comparison.

QCD + QED simulation and electromagnetic mass splitting of hadron masses
Up to now we have assumed isospin symmetry in the quark mass, m u = m d . However, in nature, isospin symmetry is broken by the quark mass difference, m u = m d , and by the difference of the quark electric charges, e u = e d . Isospin symmetry breaking causes mass splittings in isospin multiplets of light hadrons, e.g., m K 0 − m K ± , m n − m p . The magnitude of the splittings is not large, but is important since, e.g., it is this difference which guarantees the stability of the proton. Thus, our next target should be a QCD + QED lattice calculation at the physical point incorporating quark mass differences.
A pioneering QCD + QED lattice simulation was carried out a long time ago [83,84] using the quenched approximation both for QCD and QED. The possibility of including dynamical QED effects was first investigated in Ref. [85]. Recently, results for the low-energy constants in chiral perturbation theory including dynamical quark effects both in QCD and QED were reported [86]. We present our results for a full QCD + full QED simulation at the physical point using the reweighting technique [17,18].
Let D(e, κ) be the Wilson-clover operator for electric charge e and hopping parameter κ on a background gluon field U nμ and U (1) electromagnetic field U em nμ . The reweighting factor we have to calculate is given by where e u,d,s and κ * u,d,s are the physical charge and hopping parameter of up, down, and strange quarks. Controlling the fluctuations of the reweighting factor turned out to be more difficult than the pure QCD case treated in Section 5. First, the reweighting factor for each flavor has a significant fluctuating O(e) part. This problem was tamed by estimating the reweighting factor for the three flavors together, the reason being the fact that the total charge vanishes as e u + e d + e s = 0 [86]. Secondly, for the Wilson-clover quark action, the additive quark mass renormalization due to short-distance 23 Table 6. Comparison of masses calculated in QCD + QED and experiment from Ref. [17,18]. Lattice spacing a −1 = 2.235 (11) GeV is fixed by the − baryon mass. QED effects turned out to be quite large. This difficulty was treated by generating the U (1) electromagnetic field on a lattice twice as fine as the original QCD lattice and blocking over 2 4 sublattices to smooth out short distance fluctuations. We also applied a large number of breakups. In Table 6 we use the − baryon mass to fix the lattice spacing a −1 and compare the calculated π + , K + , K 0 masses with experimental values; this demonstrates that we have successfully tuned the hopping parameter taking into account full QCD and QED effects. Figure 12 shows the ratio of K 0 to K + propagators whose time dependence is expected to follow where we assume (m K 0 − m K + )t 1. The negative slope indicates m K 0 > m K + . The fit result with [t min , t max ] = [13,30] gives 4.54(1.09) MeV, which is consistent with the experimental value 3.937(28) MeV [65] within error.
In Table 7 we list the quark masses in the MS scheme at scale μ = 2 GeV. Non-perturbative renormalization factors employed for the pure QCD case are also used here [11]. The QED corrections to the renormalization factor, whose contribution would be a level of 1% or less, are ignored. The first errors are statistical and the second ones are associated with computation of the non-perturbative normalization factors. Finite-size corrections due to QED, if estimated with chiral perturbation theory [87], would be −13.50%, +2.48%, and −0.07% for the up, down, and strange quark masses, respectively. 24

Going beyond hadrons to nuclei
The strong interaction dynamically generates a hierarchical structure: three quarks are bound to form a nucleon with an energy of 1 GeV, and nucleons are bound to form atomic nuclei with a binding energy of about 10 MeV per nucleon. Since all nuclei are ultimately made of quarks and gluons, lattice QCD should help us quantitatively understand the structure and property of known nuclei based on its first principles. This direct approach will be more important and indispensable if we are to extract reliable predictions for experimentally unknown nuclei in the neutron-rich region of the nuclear chart. Two approaches are being pursued in this direction; one can extract nuclear potentials from lattice QCD calculations and use them as input in nuclear physics calculations [88], or one can calculate nuclear observables directly in lattice QCD [12,16]. We describe here two attempts of the latter category.

Deuteron
The simplest nucleus is the two-nucleon system which includes deuteron. The dynamics for this system, however, is already subtle; the binding in the spin triplet deuteron channel is quite shallow and the spin singlet state is not bound. Lattice studies of the two-nucleon system were started quite some time ago [89,90], and have been addressed extensively, including recent simulations in full QCD with dynamical quarks. Nonetheless, no clear evidence of bound state formation has been reported so far. Indeed, all studies analyzed data based on the assumption that bound states do not form for heavy pions beyond m π 0.3 GeV, and tried to extract the scattering lengths using Lüscher's formula [26,27] relating two-particle energies to elastic scattering data.
Clarifying this first issue requires a systematic finite-size study of the energy levels of the twonucleon system. Our strategy is: (i) resort back to quenched QCD to access a large range of spatial lattice sizes from L = 3 fm to 12 fm with high statistics, (ii) initially work with a heavy quark mass, and (iii) in addition to the standard single-state analysis looking for the lowest energy for the two-particle ground state, carry out a two-state analysis for the ground and first excited states diagonalizing the 2 × 2 matrix of the nuclear Green's function. The last point is important to confirm the bound state nature since, as is well known from quantum mechanics, a shallow bound state is accompanied by a scattering state just above the two-particle threshold in a finite volume. In this analysis we employ the Helmholtz kernel for the sink operator given by where q 2 is a parameter, r is the relative coordinate between two nucleons, and the overall factor C q 2 is determined from the normalization condition |W q 2 ( r max )| = 1.   13. Left: ground state energy in the 3 S 1 (deuteron) channel relative to the two-nucleon threshold in GeV units from reproduced Ref. [16]. Several types of operators and ensembles are employed. The blue triangle comes from two-state diagonalization analysis. Results extrapolated to the infinite spatial volume limit (filled circle) and experimental values (star) are also presented. Right: the same for the 1 S 0 channel.
In Fig. 13 we plot the ground state energy relative to the two-nucleon threshold obtained in our analysis as a function of spatial lattice volume. The results from two-state diagonalization analysis (blue triangles) and standard single-state analysis (other symbols) are mutually consistent, and together point strongly towards a non-vanishing negative value in the infinite spatial volume limit (filled circle) both for the triplet 3 S 1 and the singlet 1 S 0 channel at the heavy pion mass m π = 0.800 GeV.
The bound state nature of the ground states is corroborated by the results in Fig. 14 for the first excited states. For both channels, the energy levels are above the two-nucleon threshold, and behave as E − 2m N ∝ 1/L 3 , which is consistent with scattering states just above the threshold.
We emphasize that the quenched approximation used in this work should be replaced with full QCD with dynamical up, down, and strange quarks, and their masses have to be brought down to the physical values. Clarification of how the bound state in the singlet channel becomes a scattering state while the triplet state remains bounded as one approaches the physical point is an interesting issue.

Helium nuclei
Helium is the first non-trivial nucleus, and we carry out our study [12] in the same spirit as for deuteron, using the quenched approximation and a heavy quark mass to obtain spatial lattice sizes and statistics. 26  nuclei obtained with two types of sources S 1 , S 2 reproduced from Ref. [12]. Results extrapolated to the infinite spatial volume limit (filled circles) and experimental values (stars) are also presented.
The helium calculation exposes the computational difficulty of the multi-nucleon system: that the number of Wick contractions for quarks for constructing nuclear Green's functions grows factorially fast with the mass number of the nuclei. A naive counting gives (2N p + N n )!(2N n + N p )! for a nucleus composed of N p protons and N n neutrons, which quickly becomes prohibitively large beyond the three-nucleon system, e.g., 2880 for 3 He and 518 400 for 4 He. This is a challenging issue for nuclear calculations based on lattice QCD for which a general solution is not yet available. For the helium nuclei at hand, we reduce this number by working out equivalent contractions (i) under the permutation symmetry in terms of the protons or the neutrons, (ii) under isospin symmetry for 4 He consisting of the same number of protons and neutrons, and (iii) a computer scrutiny of the remaining equivalent contractions. We find that only 1107(93) contractions are required for the 4 He ( 3 He) nucleus correlation function.
Another technique to save computational cost is a modular construction of the nucleus correlation functions. We first make a block of three quark propagators where a nucleon operator with zero spatial momentum is constructed in the sink time slice. We also prepare several combinations of the two blocks which are useful for the construction of the nucleus correlators.
The actual computations are made at the same lattice spacing and quark mass as in the two-nucleon study in Section 11.1, and employ L 3 = 24 3 , 48 3 , and 96 3 spatial sizes. In Fig. 15 we show the energy difference − E L = M − N N m N in GeV units between the helium ground state M relative to the same number of free nucleons as a function of spatial volume for two choices of smearing at the quark source. The dependence on volume is consistent with an almost constant behavior, indicating a negative value and hence a bound state for the infinite-volume limit. This applies to both 4 He and 3 He channels.
Clearly, extending the line of work presented here to lighter quark masses in full QCD should be very interesting to pursue. Another important issue is the development of a strategy to cope with the factorial growth of Wick contractions to open a way to calculate nuclei with larger atomic numbers.

Conclusions
In this article, we have presented a review of work carried out by the PACS-CS Collaboration using a massively parallel cluster computer, PACS-CS, developed at the Center for Computational Sciences at the University of Tsukuba. The computer was commissioned in June 2006 and was operated for over five years until it was shut down in September 2011. The PACS-CS Collaboration for lattice QCD was formed in August 2005 to follow up on the CP-PACS Collaboration, and continued till March 2011.
The major physics goal of the Collaboration was to realize the physical point simulation of lattice QCD with light dynamical up, down, and strange quarks, taking advantage of the algorithmic progress offered by the domain-decomposition idea [48]. We have shown in this article how this goal was achieved.
In lattice QCD, chiral perturbation theory was often employed to reach the physical point from data obtained at quark masses away from the physical point. Our results confirm clearly the characteristic features of the chiral logarithm in the ratios m 2 π /m ud and f K / f π . We find, however, that SU(3) chiral perturbation theory does not describe the data well due to bad convergence of the strange quark contributions. The SU(2) chiral perturbation theory for up and down quark masses does seem to converge for the pion masses below m π ≈ 300-400 MeV, but since we can actually calculate on the physical point, the need to rely on chiral perturbation theory is very much weakened.
It is in general not possible to choose the bare action parameters so that the simulation is done exactly at the physical point. However, we have shown how the reweighting technique applied to the quark determinant allows us to adjust the bare parameters to the physical point.
These developments have resolved the long-standing issue of chiral extrapolation in lattice QCD. Physical predictions still require a large enough volume to control the finite-size effects and continuum extrapolation to remove lattice spacing errors, but we have already had a great deal of experience in dealing with them. Indeed, for the ground state hadron spectrum and light quark masses, the BMW Collaboration has carried out a continuum extrapolation based on simulations with small pion mass and large volumes [91,92].
Progress in lattice QCD depends on the triad of physics understanding, algorithms for computation, and computing power. The recent progress discussed in this article is a good example in which the triad worked together extremely well. Since computing power is rapidly growing into the peta-scale and beyond, lattice QCD can take advantage of the progress to move out into new directions.
One direction will be to explore the physics of strong interactions directly at the physical point with large volumes and high precision. Some of the problems discussed in this article, such as the ρ meson as a resonance and hadron form factors, belong to this category. The calculation of electroweak matrix elements of hadrons to precisely pin down the CKM matrix elements and CP violation also occupy a very important role in the physics of the Standard Model and beyond.
This follows the traditional approach of lattice QCD, however. A somewhat different direction might be to step beyond verifying lattice QCD, and start asking what happens if the parameters of QCD are different from what they are in the present universe. For example, the helium abundance in primordial nucleosynthesis sensitively depends on the neutron proton mass difference. Hence, understanding how this mass difference varies as a function of up and down quark masses will deepen our understanding of the present universe. Another example is deuteron. Model calculations tend to imply that deuteron becomes unbound if up and down quark masses are slightly larger. The whole area of nuclear physics is abundant in such issues, and hence interesting beyond pushing calculations directly based on lattice QCD.