Worldvolume approach to the tempered Lefschetz thimble method

As a solution towards the numerical sign problem, we propose a novel Hybrid Monte Carlo algorithm, in which molecular dynamics is performed on a continuum set of integration surfaces foliated by the antiholomorphic gradient flow ("the worldvolume of an integration surface"). This is an extension of the tempered Lefschetz thimble method (TLTM), and solves the sign and multimodal problems simultaneously as the original TLTM does. Furthermore, in this new algorithm, one no longer needs to compute the Jacobian of the gradient flow in generating a configuration, and only needs to evaluate its phase upon measurement. To demonstrate that this algorithm works correctly, we apply the algorithm to a chiral random matrix model, for which the complex Langevin method is known not to work.


Introduction
The sign problem is one of the major obstacles to numerical computation in various areas of physics, including finite density QCD [1], quantum Monte Carlo simulations of statistical systems [2], and the numerical simulations of real-time quantum field theories.
There have been proposed many Monte Carlo algorithms towards solving the sign problem, such as those based on the complex Langevin equation [3,4,5,6,7,8,9] and those on Lefschetz thimbles [10,11,12,13,14,15,16,17,18,19,20], each of which has its own advantage and disadvantage.The advantage to use the Complex Langevin equation is its cheap computational cost, but such algorithms are known to suffer from a notorious problem called the "wrong convergence problem" (giving incorrect results with small statistical errors) for physically important ranges of parameters [6,7,9].On the other hand, although computationally expensive, the algorithms based on Lefschetz thimbles are basically free from the wrong convergence.However, this is the case when and only when a single Lefschetz thimble is relevant for the estimation of observables, because otherwise there can appear another problem of multimodality due to infinitely high potential barriers between different thimbles.
The tempered Lefschetz thimble method (TLTM) [16] is a Lefschetz thimble method that avoids the sign and multimodal problems simultaneously, by introducing a discrete set of integration surfaces (replicas of integration surface) and exchanging configurations between replicas.The TLTM has proved effective and versatile when applied to various models, including (0 + 1)-dimensional massive Thirring model [16], the two-dimensional Hubbard model away from half filling.The disadvantage of the original TLTM is its computational cost, the cost coming from the computation of the Jacobian and from the additional cost due to the introduction of replicas.
In this paper, as an extension of the TLTM, we propose a novel Hybrid Monte Carlo (HMC) algorithm, where molecular dynamics is performed on a continuum set of replicas, not on each replica as was done in [20] (see also [13,21]).This algorithm no longer requires the computation of the Jacobian in generating a configuration, which is expensive for large systems.To overview this new algorithm, we first review the basics of the sign problem, and then introduce our algorithm.
Let us consider a system of an N-dimensional dynamical variable x ∈ R N with an action S(x).Our aim is to evaluate the expectation value of an operator O(x), where dx ≡ dx 1 • • • dx N .When the action takes complex values, the Boltzmann weight e −S(x) /Z can no longer be interpreted as a probability distribution, which invalidates a direct application of the Markov chain Monte Carlo (MCMC) method.The simplest workaround is the so-called reweighting method, where a positive weight is constructed from the real part of the action, e −Re S(x) / R N dx e −Re S(x) , and O(x) is estimated by a ratio of reweighted averages, O(x) = e −i Im S(x) O(x) rewt e −i Im S(x) rewt , ( where f (x) rewt ≡ R N dx e −Re S(x) f (x) R N dx e −Re S(x) . (1.3) For large degrees of freedom (DOF), however, the phase factor e −i Im S(x) in reweighted averages makes the integrals highly oscillatory, so that (1.2) becomes a ratio of exponentially small quantities of e −O(N ) even when the ratio should give a quantity of O (1).Since the reweighting is a mathematically equivalent rewriting, it should not give any problems if one can obtain the values of the reweighted averages precisely both in the numerator and the denominator.However, in the Monte Carlo calculations, they are evaluated separately as sample averages, which should be accompanied by statistical errors of O(1/ √ N conf ), where N conf is the size of the sample.Thus, for the naive reweighting method, the expectation value is estimated in the form which means that we need an exponentially large sample size, N conf = e O(N ) , in order to make the statistical errors relatively small compared to the exponentially small mean values.This enormous computational time makes the MCMC computations impractical.This is the sign problem.
Lefschetz thimble methods are a class of algorithms towards solving the sign problem.In these methods, we complexify the integration variables, x ∈ R N → z ∈ C N , with the assumption that e −S(z) and e −S(z) O(z) are entire functions over C N .Then, from Cauchy's theorem, the integrals do not change under continuous deformations of the integration surface from R N to Σ ⊂ C N , as long as the boundary at infinity (|x| → ∞) is kept fixed under the deformations: where dz ≡ dz 1 • • • dz N .We expect that the sign problem is reduced if we can find an integration surface Σ on which e −i Im S(z) is almost constant.
Such surfaces are obtained from the following antiholomorphic gradient flow z t (x) at large flow times: In fact, this flow defines a map from the original integration surface Σ and the flowed surface Σ t approaches in the limit t → ∞ a union of Lefschetz thimbles, on each of which Im S(z) is constant. 1We thus expect that the sign problem is substantially remedied on Σ t for sufficiently large t.The expectation value is then expressed in the form 2 (zt) . (1.9) Here, |dz t | is the invariant volume element of Σ t , and can be expressed with the Jacobian of the flow, J t (x) ≡ ∂z t (x)/∂x, as The phase factor e iϕ(z) in (1.8) is then given by The Jacobian J t (x) can be computed by solving the second flow equation: where H(z) ≡ (∂ i ∂ j S(z)).
When Σ t approaches more than one Lefschetz thimble, Σ t gets decomposed into separate regions as t increases, each region being surrounded by infinitely high potential barriers.This causes a multimodal problem in MCMC calculations. 4The tempered Lefschetz thimble method (TLTM) was proposed in [16] to solve this multimodality by implementing the 2 The original reweighting [eqs.(1.2) and (1. 3)] corresponds to the t = 0 case.When only a single Lefschetz thimble is relevant, one can argue that the exponentially small part in the estimation (1.4) increases as e −e −λt O(N ) , where λ is the minimum singular value of H(z) = (∂ i ∂ j S(z)) at the critical point.We thus expect that the sign problem is removed for the flow time t O(ln N ). 3 This can be shown as 4 In the following discussions, we assume that there is no multimodal problem on the original integration surface Σ t=0 = R N .If this is not the case, we implement an extra algorithm to resolve the multimodality (such as the tempering with respect to the overall coefficient of the action) or make a shift of the starting integration surface from R N .parallel tempering algorithm [22,23,24] with the flow time t used as the tempering parameter.Namely, we prepare a finite set of flow times, {t α }, and introduce copies (replicas) of the corresponding configuration spaces, {Σ tα }. 5 The set {t α } is chosen so as to include large enough flow times to resolve the sign problem, as well as small enough flow times to resolve the multimodality. 6Then, in addition to Monte Carlo updates on each Σ t , we swap configurations between adjacent replicas, which enables easy communications between configurations around different modes, and thus accelerates the relaxation to global equilibrium.Thus, the TLTM is an algorithm that solves the sign and multimodal problems simultaneously, and has proved effective for various models [16,19,20], as mentioned before.However, in this original TLTM, we need to increase the number of replicas as we increase DOF in order to keep sufficient acceptance rates in the swapping process.Furthermore, we have to compute the Jacobian J t (x) every time in the swapping process, which is expensive because the second flow equation (1.12) involves a matrix multiplication, whose cost is of O(N 3 ).
In this paper, we propose a novel Hybrid Monte Carlo (HMC) algorithm, where molecular dynamics is performed on a continuum set of integration surfaces, t Σ t .This algorithm solves the multimodal problem without preparing replicas.Furthermore, the Jacobian of the gradient flow no longer needs to be computed in generating a configuration, and only its phase needs to be evaluated upon measurement.This algorithm is based on the observation that, since integrals on Σ t do not depend on t due to Cauchy's theorem, the values do not change even when we average them over t in a range [T 0 , T 1 ] with an arbitrary weight e −W (t) : Σt dz t e −S(zt) O(z t ) zt)   . (1.13) We denote the new integration region by R (see Fig. 1): which we regard as the worldvolume of an integration surface moving in the "target space" C N = R 2N .We abbreviate the TLTM based on (1.13) as the WV-TLTM.Although the weight function e −W (t) can be chosen arbitrarily, a practically good choice is the one which gives an almost uniform distribution with respect to t (see subsection 3.4 for details).
The expectation value is now expressed as a ratio of reweighted averages over R: Here, the reweighted average is defined with respect to the (real-valued) invariant volume element Dz on the (N + 1)dimensional region R and to the new weight7 e −V (z) ≡ e −Re S(z)−W (t(z)) . (1.17) The accompanying reweighting factor A(z) is then given by A(z) ≡ e −S(z)−W (t(z)) dt dz t e −V (z) Dz = e −i Im S(z) dt dz t Dz . (1.18) The aim of this paper is to establish a HMC algorithm for the reweighted average (1.16) on the worldvolume R. To demonstrate that this algorithm works correctly, we apply the algorithm to a chiral random matrix model (the Stephanov model) [28,29], for which the complex Langevin method is known not to work [30].
This paper is organized as follows.In section 2, we review the basics of the HMC algorithm on a general constrained space R. In section 3, we deepen the argument for the case where the constrained surface R is the worldvolume of an integration surface.We first study the geometry of the worldvolume R by using the Arnowitt-Deser-Misner parametrization of the metric.We then construct molecular dynamics on R, with a prescription to determine the weight e −W (t) .After summarizing the HMC algorithm on the worldvolume R, we give an algorithm to estimate observables.In section 4, we apply this algorithm to the Stephanov model, and show that our algorithm correctly reproduces exact results, solving both sign and multimodal problems.Section 5 is devoted to conclusion and outlook.

HMC algorithm on a constrained space (review)
In this section, we briefly review the basics of the RATTLE algorithm [31,32], which is a HMC algorithm on a constrained space such as our worldvolume R. A detailed discussion is given in appendix A with more geometrical terms.

Stochastic process on a constrained space
Let R be an m-dimensional manifold embedded in the flat space R M = {z = (z I )} (I = 1, . . ., M).We assume that R is characterized by M − m independent constraint equations φ r (z) = 0 (r = 1, . . ., M − m).When R is the worldvolume of an integration surface, we set M = 2N and m = N + 1, treating C N as a real space R 2N .
Denoting the coordinates on R by ξ = (ξ µ ) (µ = 1, . . ., m), the embedding is expressed by functions z I = z I (ξ), and the induced metric on R is given by which defines the invariant volume element as where The probability distribution p(z) on R is defined with respect to Dz, and thus is normalized as R Dz p(z) = 1.The transition matrix is also defined for Dz, so that a transition from a probability distribution p(z) to p ′ (z) (z ∈ R) is expressed with a transition matrix (2.3) For the equilibrium distribution on R with respect to a potential V (z), the detailed balance condition is given by Throughout this paper, we denote a function on R by f (z) and f (ξ), interchangeably, with the understanding that z = z(ξ).The transition matrix on R is also written as P (z ′ |z) and P (ξ ′ |ξ) for z = z(ξ), z ′ = z(ξ ′ ) ∈ R.

HMC on a constrained space
Denoting by π = (π I ) the conjugate momentum to z = (z I ) ∈ R, we consider the Hamiltonian dynamics on R with the Hamiltonian which can be expressed as a set of first-order differential equations in time s with Lagrange multipliers λ r : φ r (z) = 0, (2.9) Equations (2.7)-(2.10)can be discretized such that the symplecticity and the reversibility still hold after the discretization (below ∆s is the step size) [31,32]: ) where λ r and λ ′ r are determined, respectively, so that the following constraints are satisfied: One can easily show that the map Φ ∆s : (z, π) → (z ′ , π ′ ) actually satisfies the symplecticity and the reversibility (with λ r and λ ′ r interchanged): ) (2.17) 8 We regard the tangent bundle T R = z T z R (not the cotangent bundle T * R) as the phase space for motions on R [31,32].See appendix A for details. 9 ω(z, π) ≡ dπ I ∧ dz I T R is the induced symplectic form for the embedding of the phase space T R into T R M (see appendix A).
The HMC algorithm on R then consists of the following three steps for a given initial configuration z ∈ R: Step 1. Generate a vector π = (π and project it onto T z R to obtain an initial momentum π = (π I ) ∈ T z R.
Step 3. Update the configuration z to z ′ with a probability min 1, e −H(z ′ ,π ′ )+H(z,π) . (2.19) The above process defines a stochastic process on R. One can show that its transition matrix P (z ′ |z) satisfies the detailed balance condition (see appendix A):

HMC on the worldvolume
In this section, we apply the general formalism in the previous section to the case where R is the worldvolume of an integration surface.We first clarify the geometry of the worldvolume R and then construct the HMC algorithm on R.

Geometry of the worldvolume R
Recall that our worldvolume R is an N +1-dimensional submanifold embedded in As in the previous section, we again assume that R ⊂ R 2N is characterized by a set of independent constraint equations: ) in R, we can use ξ as coordinates of R. The flow equation (1.6) then takes the form Similarly, we write an N-dimensional The vectors E µ = (E I µ = ∂z I /∂ξ µ ) form a basis of T z R (see Fig. 2), from which the induced metric g µν on R is given by Since our worldvolume R = t Σ t is foliated by the antiholomorphic gradient flow, its intrinsic geometry should be best described (at least for physicists) by the Arnowitt-Deser-Misner (ADM) parametrization, for which the metric is expressed in the form Here, γ ab is the induced metric on Σ t (with its inverse matrix γ ab ), β a is the shift vector, and α is the lapse function, The inverse matrix of (g µν ) can be easily calculated to be (3.9) The geometrical meaning of the ADM parametrization is explained in Fig. 3.
Figure 3: Geometrical meaning of the ADM parametrization.α dt is the geodesic distance from a point ξ = (t, x a ) on Σ t to the surface Σ t+dt .β a dt shows that the time axis t is tilted with respect to the normal of Σ t by this amount in x-coordinates.γ ab is the induced metric on Σ t .The geodesic distance ds between two points ξ = (t, x a ) and ξ +dξ = (t+dt, x a +dx a ) is then obtained from Pythagorean theorem as in (3.5).
Since the flowed surfaces Σ t are determined by the flow equation, we can write down the explicit form of the basis E µ = (E I µ = ∂z I /∂ξ µ ) of T z R as where we have defined J(ξ) = J(t, x) ≡ J t (x).Thus, the induced metric γ ab can be directly expressed in terms of the Jacobian as 12γ ab = Re (J † J) ab = (J † J) ab . (3.11) The lapse function α can be expressed as the length of the normal component of E 0 : Here, the decomposition of E 0 to the tangential and normal components is given by where Π Σt is the projector onto T z Σ t :13 Note that the following relation holds: Then, by using (3.9), (3.12), (3.15) and (3.16), we see that the projector from In the ADM parametrization, the volume element of R is given by (see Fig. 4) Since the complex measure on Σ t is given by dz t = det J dx, we find that the reweighting factor (1.18) takes the form Note that the inverse lapse, α −1 (z), plays the role of the radius of A(z).
In appendix B, in order to show the typical behaviors of various geometrical quantities near critical points at large flow times, we give explicit expressions of these quantities for the Gaussian case with the action There, we find that α −1 (z) increases exponentially in flow time t as z k = z k (t, x) approaches the Lefschetz thimble at Im z k = 1.

Molecular dynamics on the worldvolume
We first rewrite the Lagrange multiplier term λ r ∂φ r in (2.8) in a more convenient form.Note that λ r ∂φ r is normal to R, and thus it satisfies Since the vectors span the normal vector space N z Σ t at z ∈ Σ t , 14 the second equation in (3.21) means that λ r ∂φ r can be written as a linear combination of F a with new Lagrange multipliers λ a ∈ R (a = 1, . . ., N): 15 The first equation in (3.21) is then treated as a constraint on λ a F a : 16 From the above argument, we find that the RATTLE algorithm (2.11)-(2.15)can be 14 F a form a basis of the normal space N z Σ t because E a • F b = −Im (J † J) ab = 0 (see footnote 12).They can be written as F a = iE a as complex vectors. 15We here put the summation symbols to stress the summation ranges for r and a. 16 It is possible to solve the constraint (3.24) as follows.We first take a subset {F r } (r = 1, . . ., N − 1) of {F a }, whose elements are not parallel to E 0 .Then, we construct a basis of written as where λ a and λ ′ a are determined, respectively, so that the following constraints are satisfied: The gradient of the potential, ∂V (z), now takes the form In order to define the gradient ∂t(z) at z ∈ R, we regard (φ r ) as coordinates in the extra dimensions and construct 2N coordinates (ξ µ , φ r ) in the vicinity of R in R 2N .Then, one can show (see appendix C) that the gradient ∂t(z) is given by Since the last term is a linear combination of the gradients ∂φ r (z) and thus can be absorbed into the Lagrange multiplier terms in (3.25) and (3.27), we can (and will) set ∂t(z) to the form (3.32)

Solving the constraints in molecular dynamics
In this subsection, we present numerical algorithms to solve the constraints (3.28) and (3.29).

Solving (3.28)
The condition z ′ ∈ R for a given z = z(ξ) ∈ R is equivalent to the existence of an N + 1-dimensional vector ε = (ε µ ) = (h, u a ) such that z ′ = z(ξ + ε) = z t+h (x + u).Thus, (3.28) can be solved by finding a solution to 2N + 1 equations f P (w) = 0 (P = 0, 1, . . ., 2N) (3.33) for 2N + 1 unknowns w = (w P ) = (ε µ , λ a ) ∈ R 2N +1 (see Fig. 5), where Adopting Newton's method to find a solution, we iteratively update the vector w = (w P ) = (ε µ , λ a ) as w → w + ∆w, where ∆w is a solution of the linear equation The matrix elements ∂f P /∂w Q are easily found to be When the DOF (= N) is small, (3.37) can be solved by a direct method such as the LU decomposition of (3.38), for which we integrate the second flow equation (1.12) to know the value of J.When the DOF is large, the computation of J becomes expensive, and we can instead use an iterative method such as GMRES [33] or BiCGStab [34], for which we do not need to compute the matrix elements of J as in [18].To see this, we first note that the right hand side of (3.37) can be written in terms of complex vectors as The left hand side of (3.37) can also be written as We thus see that in the above equations, J appears only in the form J i a (ξ)v a or J i a (ξ + ε)v a with a real vector v = (v a ) ∈ R N .The former can be evaluated from the solution to the flow equation (1.6) and the following equation [see (1.12)]: by setting The latter is obtained in a similar way, by replacing t → t + h and x a → x a + u a .We thus find that the both hand sides of (3.37) can be calculated without computing the matrix elements of J.

Solving (3.29)
We first note that solving the constraint (3.29) is equivalent to projecting the vector Here, E ⊥ 0 (z ′ ) can be computed as a complex vector to be The second term in (3.45) can also be computed as The expressions (3.46) and (3.47) can again be evaluated either by a direct method with the computation of J(z ′ ), or by an iterative method without computing J(z ′ ) as in [18].When the iterative method is used, the inversion J −1 (z ′ ) c is obtained for a given complex where J(z ′ ) a and J(z ′ ) b are evaluated by integrating the flow equation (3.43) with the initial condition v t=0 = a and v t=0 = b, respectively.The multiplication J(z ′ ) v are again calculated by integrating (3.43).Therefore, the projection (3.45) can be performed without computing J(z ′ ).

Construction of W (t)
In this subsection, we present a prescription to construct a weight function e −W (t) in (1.13) [or in (1.17)] so that it gives an almost uniform distribution with respect to t.The key is that, for a given weight e −W (t) , the probability to find a configuration at t is proportional to Thus, when a weight e −W (t) does not give a uniform distribution of t, the desired weight can be obtained by (see e.g., [35]) because then Z(t; W (new) ) becomes constant in t: Of course, the above procedure is possible only when we know the values of Z(t; W ) explicitly, which is usually not the case.However, we can estimate Z(t; W ) from the histogram of flow times {t}.To be more specific, we first divide the interval [T 0 , T 1 ] into p + 1 bins, I ℓ ≡ [ℓh, (ℓ + 1)h] (ℓ = 0, . . ., p) with h ≡ (T 1 − T 0 )/(p + 1), and generate a certain number (≡ n tune ) of configurations by using V (z) = Re S(z) + W (t(z)) as the potential.The numbers h ℓ of configurations inside the bins I ℓ give a rough estimate of the functional form of Z(t; W ) up to a normalization factor (see Fig. 6).Then, from the histogram h ℓ (ℓ = 0, . . ., p), we calculate and construct a function W (new) (t) to be approximated by a polynomial satisfying with a ℓ ≡ (ℓ + 1/2)h.
In general, the minimum-order polynomial that have values b ℓ at a ℓ = (ℓ + 1/2)h is given by the Lagrange interpolation of the form where, for an array {v 0 , v 1 , v 2 , . ..}, we define ∆v ℓ ≡ v ℓ+1 − v ℓ , so that Using this polynomial, we define 17 Since the estimate of Z(t; W ) from the histogram {h ℓ } includes statistical errors, we use an iterative algorithm to update {W ℓ } until an almost uniform distribution is obtained: • Generate n tune configurations with the potential V (z) = Re S(z) + W (k) (t(z)), and record the numbers h of configurations in the intervals I ℓ .
• Update {W ℓ } as Here, ǫ c is a cutoff to avoid the divergence arising when h (k) ℓ = 0.

Summary of the HMC algorithm on the worldvolume
We summarize the HMC algorithm for a given initial configuration z ∈ R.
In the course of molecular dynamics (Step 2), it sometimes happens that the equation (z ′ , π ′ ) = Φ ∆s (z, π) does not have a solution because z + ∆z passes over the boundary of R (see Fig. 7).Here, the boundary in the t direction is given by T 0 , T 1 , while that in the x direction by zeros of e −S(z) .When a solution is not found near a boundary, we replace the operator Φ ∆s by the momentum reflection T : where for π which is expanded with E µ ≡ g µν E ν as π = η 0 E 0 +η a E a , the reflected momentum π ′ is defined by This preserves the reversibility and the phase-space volume element because the induced symplectic form is given by ω = dη µ ∧dξ µ = dη 0 ∧dξ 0 + dη a ∧dξ a (see appendix A).However, this can change the value of the Hamiltonian.The change comes only from the difference between the norms of momenta π and π ′ , and its effect is absorbed in the probability at the Metropolis test in Step 3 above, so that the detailed balance condition (2.20) still holds.If the change is larger than a prescribed value (e.g., if e −|∆H| = e −|π ′2 −π 2 |/2 < 0.8), we instead use the momentum flip Ψ [20]: Since the replacement of Φ ∆s by T or Ψ preserves the phase-space volume element and the reversibility, the detailed balance condition (2.20) still holds. 18

Estimation of observables
We first recall that the boundary flow times (T 0 and T 1 ) can be chosen arbitrarily due to Cauchy's theorem.In practice, T 0 must be set sufficiently small in order to keep in R a region that is free from multimodality (to be set to T 0 = 0 when the multimodal problem is absent there).On the other hand, T 1 must be taken sufficiently large in order to keep a region where the sign problem is resolved, but, at the same time, T 1 should not be set too large in order to avoid introducing an unnecessarily large computational time.
When estimating observables, we take a subinterval in [T 0 , T 1 ] (to be denoted by [ T0 , T1 ]).Namely, for a sample of configurations that are generated in the range [T 0 , T 1 ] (with sample size N conf ), we construct a subsample {z (k) } (k = 1, . . ., Nconf ) taking configurations from the interval [ T0 , T1 ] (T 0 ≤ T0 < T1 ≤ T 1 ), and take a ratio of sample averages over this subsample, 19 (3.61) T0 now should be set sufficiently large in order to exclude a region that is contaminated by the sign problem.Note that T0 and T1 should be set enough apart in order to maintain a sufficient size for the subsample.Then, if the original range [T 0 , T 1 ] is properly chosen as above, and if the system is well close to global equilibrium, there must be a region in twoparameter space ( T0 , T1 ) such that the estimations Ō( T0 , T1 ) are stable against the variation of the estimation ranges (i.e., estimates change only within statistical errors).
The whole process of the WV-TLTM thus proceeds as follows: 1. Choose a sufficiently small T 0 and a sufficiently large T 1 to tame both sign and multimodal problems.
2. Construct a weight function e −W (t) such that the distribution of t becomes almost uniform (see subsection 3.4 for more details).
3. Use the HMC algorithm in subsection 3.5 to generate configurations in the range 4. For the obtained full sample, vary the estimation range [ T0 , T1 ], looking for a stable region (plateau) in the two-parameter space ( T0 , T1 ) that give the same estimate Ō( T0 , T1 ) within statistical errors.
5. Choose a point ( T0 , T1 ) from the plateau and take the corresponding Ō( T0 , T1 ) as the estimate of O(x) .The error of estimation is read from the statistical error for the chosen subsample.

Application to a chiral random matrix model
In this section, to confirm that the WV-TLTM works correctly, we apply the WV-TLTM to a chiral random matrix model, the Stephanov model [28,29].We show that the algorithm correctly reproduces exact results, solving both sign and multimodal problems.

The model
The Stephanov model is a large N matrix model that approximates QCD at finite density.
For N f quarks with equal mass m, the partition function is given by the following integral over n × n complex matrices X = (X ij = x ij + i y ij ): Here, D + m represents the Dirac operator in the chiral representation and takes the form where The parameters µ and τ correspond to the chemical potential and the temperature, respectively [28,29].The number of DOF is N = 2n 2 , which may be compared with the DOF of the SU(N c ) gauge field on the lattice of linear size L as N = 4(N 2 c − 1)L 4 .For the case N f = 1, the partition function at finite n can be written as an integral over a single variable: where I k (x) (k = 0, 1, 2, • • • ) are the modified Bessel functions of the first kind.Then, the chiral condensate is expressed as Similarly, the number density is expressed as We apply the WV-TLTM to this model, by complexifying the real and imaginary parts (x ij and y ij ) separately, and by considering the antiholomorphic gradient flow with respect to the action given in (4.1).We estimate the chiral condensate and the number density using the formulas It is convenient to introduce the matrices with which D + m and (D + m) −1 are expressed as20 The flow equation is then written only with A, B, K, and the expectation values (4.7) and (4.8) are estimated from the expressions

Setup in the simulation
We summarize the setup in the simulation.We set n = 10, m = 0.004, τ = 0, and estimate the chiral condensate ψψ and the number density ψ † ψ as functions of µ = 0.4, . . ., 0.8. 21e set T 0 = 0 while we choose T 1 depending on µ as in Table 1.Other simulation parameters are also given in Table 1.There, N init is the number of initial configurations, and N conf is the number of configurations in the simulation range [T 0 , T 1 ], while N conf ( T0 , T1 ) is that in the estimation range [ T0 , T1 ], corresponding to the point ( T0 , T1 ) chosen from a plateau.Note that T0 and T 1 depend on the choice of observables.We set the initial configuration to the final configuration in the test run determining W (t). The tuning of W (t) turns out to take two iterations to realize the condition (3.56).In Fig. 8, we show the final form of W (t) at µ = 0.625 and the resulting histogram of t.It sometimes happens that R = t Σ t is not well explored for large t because of the complicated geometrical structure there.To facilitate transitions, at every start of the HMC algorithm we change the step size ∆s and the step number n by randomly taking them from a set C = {(∆ c , n c } (c = 1, . . ., c max ). 22In the calculation below, we set c max = 3 and choose C as in Table 2. Table 2: HMC parameters.
We comment that, if we use the original TLTM based on the parallel tempering, we need about 70 replicas for n = 10.We list in Table 3 the numbers of replicas at µ = 0.6 for various n.Here, we first determine the maximum flow time T so that the sign problem is  well resolved there, and then determine the number of replicas so that the acceptance rate of the swapping is in the range 0.2-0.5.

Results and analysis
Figure 9 shows the average reweighting factors from the naive reweighting method (blue) and from the WV-TLTM (orange), the former exhibiting the existence of the sign problem around µ = 0.6.Figure 10 gives the estimates of ψψ from the WV-TLTM at µ = 0.625 rewt from the naive reweighting method (blue) and the average reweighting factors A = A(z) R from the WV-TLTM (orange).The estimation range [ T0 , T1 ] in the WV-TLTM is set to that for the chiral condensate (see Table 1).
with various estimation ranges [ T0 , T1 ].We see a plateau with a value close to the exact one

Conclusion and outlook
In this paper, we proposed a HMC algorithm on the worldvolume R of an integration surface Σ t , where the flow time t changes in the course of molecular dynamics, and thus the multimodal problem is resolved without introducing replicas.Furthermore, the computation of the Jacobian is not necessary in generating a configuration.We applied this algorithm to a chiral random matrix model (the Stephanov model) and confirmed that it reproduces correct results, solving both sign and multimodal problems simultaneously.
The validity of this algorithm should be further investigated by applying it to other systems that also have the sign problem, including finite density QCD, strongly correlated electron systems and real-time quantum field theories as well as frustrated spin systems like the antiferromagnetic Heisenberg model on the triangular lattice and the Kitaev model on the honeycomb lattice.
It is important to keep developing the algorithm as well in order to perform large scale calculations for such systems listed above with less computational cost.For example, it should be nice if we can find a more efficient algorithm to tune W (t). 23 At the same time, it is worth developing an algorithm where the weight e −W (t) needs not be introduced, as what happens when one switches from the simulated tempering [36] to the parallel tempering [22,23,24].It is also desirable if one can construct an algorithm to evaluate the phase e iϕ = det J/| det J| without computing the matrix elements of J explicitly.Furthermore, in order to make a more accurate statistical analysis, it is important to develop a systematic method to estimate numerical errors that are necessarily introduced in integrating the antiholomorphic gradient flow and in solving Newton's method iteratively (Step 2 in subsection 3.5).
The modification of the flow equation (1.6) should also deserve intensive investigation for various reasons.To see this, note that (1.6) is not the only possible equation deforming the original integration surface R N so as to approach a union of Lefschetz thimbles.For example, it can be modified with a positive hermitian matrix G ij (z, z * ) to the form without changing the structure of thimbles.However, this modification changes flows of configurations off the thimbles, and can be designed so that flowed configurations approach zeros of e −S(z) only very slowly (see, e.g., [37]).We have investigated this type of modification, proposing to take G ij of the following simple form [38]: (5.2) This actually removes zeros from R for finite flow times, and sometimes is helpful in iteratively solving the constraint (3.28).However, it seems that the obtained gain does not exceed the increased complexity of the algorithm, and also that the functional form of G ij needs to be fine-tuned, depending on parameters on each model.This is the reason why we did not pursue this possibility in this paper.However, it will be essentially important when one develops a Metropolis-Hastings algorithm described in appendix D, because the configuration space R = {ξ = (t, x a ) | T 0 ≤ t ≤ T 1 } comes to have a simple structure if points to be mapped to zeros do not exist in the region.
Another possible application of modifying the flow equation is to provide a mechanism to solve the so-called global sign problem (cancellation among contributions from different thimbles).In fact, since α −1 (z) increases exponentially in the vicinity of a Lefschetz thimble [see a comment below (B.8)], a change of flows caused by the modification may significantly shift the distribution of A(z) and distort the balanced contributions from different thimbles, that was the origin of the global sign problem.
A study along these lines is now in progress and will be reported elsewhere.
Step 2. Calculate Φ ∆s (z, π) from (A.27)-(A.31).We repeat this step n times to obtain (z ′ , π ′ ) = Φ n ∆s (z, π) Step 3. Update the configuration z to z ′ with a probability min 1, e −H(z ′ ,π ′ )+H(z,π) .(A.36) The above process defines a stochastic process on R with the following transition matrix for z ′ = z: The diagonal (z ′ = z) components are determined from the probability conservation.P (z ′ |z) can be shown to satisfy the detailed balance condition as follows:

B. Analytical expressions for the Gaussian case
We present the analytical expressions for some geometrical quantities defined in subsection 3.1 for the action This has a single critical point at z σ = (z k σ = i), and the corresponding Lefschetz thimble is given by J σ = {z = (z k ) ∈ C k | Im z k = 1 (∀k)}.Complex vectors will be used throughout this appendix.
The solution of the antiholomorphic flow equation (1.6) takes the form z k (t, x) = x k e βt + i(1 − e −βt ). (B.2) The Jacobian J(t, x) is thus given by J k a (t, x) = ∂z k (t, x) ∂x a = e βt δ k a . (B. 3) The tangent vectors E 0 and E a are We see that the inverse lapse is given by α −1 = e βt /(β √ N) and increases exponentially in flow time t as z(t, x) approaches the Lefschetz thimble.
The ideal weight function e −W (t) giving a uniform distribution of t is given by [see (3.49We have ignored t-independent constants.We see that the weight factor also increases exponentially, e −W (t) ≃ e βt , at large flow times.
C. Proof of eq.(3.31)In this appendix, we prove the equality (3.31).The weight e −W (t) is determined so that the function Z(t; W ) ≡ e −W (t) R N dx e −Re S(z(t,x)) (D.6) is almost independent of t, as in subsection 3.4.
The distribution e − Ṽ (ξ) /Z R (Z R = R dξ e − Ṽ (ξ) ) can be obtained from a Markov chain without evaluating J explicitly, if one uses the Metropolis-Hastings algorithm to update a configuration. 29 x ) N/2 e −(x ′ −x) 2 /2σ 2 x , (D.7) where we have treated t and x anisotropically.We then accept ξ ′ with a probability In a generic case, Re S(z(t, x)) changes rapidly at large flow times t, and thus we should better change the proposal depending on ξ = (t, x a ).One way is to change (σ 2 t , σ 2 x ) by randomly taking them from a set C = {(σ 2 t,c , σ 2 x,c )} (c = 1, . . ., c max ) as in subsection 4.2.Another way is to use an asymmetric proposal Prop(ξ ′ |ξ) by making σ 2 t and σ 2 x t-dependent functions: e −(t ′ −t) 2 /2σ 2 t (t) 1 (2πσ 2 x (t)) N/2 e −(x ′ −x) 2 /2σ 2 x (t) .(D.11) In the latter case, the functional form of σ 2 t (t) and σ 2 x (t) are fixed manually or adaptively from test runs.After a sample is obtained for the region [T 0 , T 1 ], we consider subsamples for various estimation ranges [ T0 , T1 ], and estimate an observable by looking at a plateau in the twodimensional parameter space {( T0 , T1 )}, as in subsection 3.6. 29One can also use the HMC algorithm in principle, but this requires the computation of the Jacobian J = (J i a (ξ)) because Hamilton's equation in molecular dynamics involves the gradient ∂ x a S(z(ξ)) = ∂ z i S(z) J i a (ξ). 30The diagonal components are determined by the probability conservation, dξ ′ P (ξ ′ |ξ) = 1.
and replace λ a F a in (3.24) by µ r Fr with new Lagrange multipliers (µ r ) ∈ R N −1 .

Figure 7 :
Figure 7: A transition passing over the boundary at T 1 .

Table 3 :
Maximum flow time T and the number of replicas at µ = 0.6.

Figure 9 :
Figure 9: Average phase factors A = e −i Im S(x)rewt from the naive reweighting method (blue) and the average reweighting factors A = A(z) R from the WV-TLTM (orange).The estimation range [ T0 , T1 ] in the WV-TLTM is set to that for the chiral condensate (see Table1).

Figure 11 :
Figure 11: (Left) The chiral condensate.(Right) The Number density.The top panels are the results from the reweighting method, and the bottom panels are from the WV-TLTM and the complex Langevin method.
A.39)    where we have used (A.34) n times to get the second line, and have made the change of integration variables, π → −π and π ′ → −π ′ , with the relation H(z, −π) = H(z, π) to obtain the third line.
Namely, from a configuration ξ we first propose a new configuration ξ ′ with a probability