The Curious Early History of CKM Matrix - miracles happen! -

The 1973 Kobayashi-Maskawa paper proposed a compelling link between Cabibbo's flavor-mixing scheme and CP violation but, since it required the existence of six quarks at a time when the physics community was happy with only three, it received zero attention. However, two years after the paper appeared (at which time it had received a grand total of two citations), the charmed quark was discovered and it finally got some notice and acceptance. After this stumbling start, it emerged as the focal point of an enormous amount of experimental and theoretical research activity. In an invited talk at a KEK symposium to celebrate the 50th anniversary of the KM paper, I reviewed some of the less well-known circumstances that occurred in the years preceding and following the paper's appearance. Some spoilers:


Introduction
The challenge of reviewing a subject that is fifty years old to a community of experts is to find something to say that isn't already well known to everyone in the audience. However, this obvious truth didn't occur to me when I was invited by the organizers to speak at the KEK special symposium to celebrate the fiftieth anniversary of the Kobayashi-Maskawa six-quark model, an invitation that, in a reckless capitulation to my vanity, I immediately accepted. Upon subsequent reflection, I realized my dilemma: there was precious little that I could say about the hundreds of CKM-related published Belle results (which I expect the organizers had in mind when they offered this invitation) that wasn't already very familiar to the symposium participants. So, instead, I decided to exploit the one advantage I might have over most other participants, namely that I would be the oldest, or at least one of the oldest, persons in attendance, and reminisce about the early days of the KM era, including some of its pre-history. So, with the forewarning that all historical accounts suffer from mistakes and oversimplifications, and are varnished to match the preconceptions and prejudices of the chronicler, here goes:

Prehistory: Cabibbo flavor-mixing and the discovery of CP violation
The prehistory started sixty years ago during the 1963-64 academic year, when there were three major discoveries that all played major roles in the Kobayashi-Maskawa story: flavor-mixing, quarks, and the observation of CP violation in $K_L\to\pi^+\pi^-$ decays.

Cabibbo flavor mixing
In their classic paper that identified the $V-A$ coupling of the weak interaction [1], Feynman and Gell-Mann proposed that the weak interaction was a current-current interaction where the hadron current (eqn. 1) is a sum of strangeness-conserving and strangeness-changing terms with relative weights $\alpha$ and $\beta$. Here $g$ is a coupling constant, $V^{\Delta S=0}_\mu$ and $A^{\Delta S=0}_\mu$ are the vector and axial-vector currents for strangeness-conserving processes, and $V^{\Delta S=1}_\mu$ and $A^{\Delta S=1}_\mu$ are the corresponding currents for strangeness-changing processes. They made two conjectures. One was universality: the hadron and lepton currents all have a common coupling strength, i.e., $g = g_W$ and $\alpha = \beta = 1$ in eqn. 1, where $g_W$ is related to the square root of the Fermi constant $G_F$. The other one was the so-called Conserved Vector Current (CVC) hypothesis, which says that the hadronic matrix elements for the vector component of the weak interaction current are the same as those for the electromagnetic interactions. This has the consequence that vector form factors for weak decays of hadrons at zero squared momentum transfer are unity, $f_V(q^2=0) = 1$. These two conjectures translated into a prediction that the coupling strength extracted from the vector-mediated semileptonic process $K^+\to\pi^0 e^+\nu_e$, i.e., $g^{\Delta S=1}_V$ (shown in Fig. 1a), should be the same as $g_W$ in $\mu^+\to e^+\nu_e\bar\nu_\mu$.
In a paper that appeared in June 1963 [2], Cabibbo pointed out that the Feynman-Gell-Mann universality conjecture failed miserably. His comparison of experimental measurements of the partial width for the $\Delta S = 1$ vector weak-interaction process $K^+\to\pi^0\ell^+\nu$ [3] to the well-known width for muon decay found a value that was about a factor of four below expectations. He also found a similar deviation from universality in the ratio of the axial-vector-mediated partial decay widths $\Gamma(K^+\to\mu^+\nu)/\Gamma(\pi^+\to\mu^+\nu)$. (Although the axial-vector currents are not "protected" by CVC, corrections to them were expected to be small [4], and certainly not large enough to account for a factor of four.)

Cabibbo proposed modifying the Feynman-Gell-Mann $\alpha = \beta = 1$ conjecture to $\alpha^2+\beta^2 = 1$, in which case $\beta \approx 0.25$ could accommodate the above-mentioned experimental results. In his paper, Cabibbo proposed his eponymous angle $\theta_C$, which he estimated to be $\theta_C \approx 14.9^\circ$, as a convenient way to express two parameters $\alpha$ and $\beta$ that were subject to the constraint $\alpha^2+\beta^2=1$, and he didn't mention anything about rotations. The earliest experiments that addressed Cabibbo's hypothesis [5] were focused on testing the validity of Cabibbo's relation. The notion that this might represent a rotation didn't become apparent until the 1970 GIM paper [6] that proposed the c-quark as a way to suppress flavor-changing neutral currents. If one accepts the existence of two quark doublets, the Cabibbo d-s mixed quark state $d' = d\cos\theta_C + s\sin\theta_C$ is produced by the application of a 2×2 unitary rotation matrix,

$$\begin{pmatrix} d' \\ s' \end{pmatrix} = \begin{pmatrix} \cos\theta_C & \sin\theta_C \\ -\sin\theta_C & \cos\theta_C \end{pmatrix} \begin{pmatrix} d \\ s \end{pmatrix}, \qquad (8)$$

and has an orthogonal partner, $s' = -d\sin\theta_C + s\cos\theta_C$. In this formulation, it is apparent that Cabibbo's form of weak universality is the same as Feynman-Gell-Mann universality applied to the rotated $d'$ and $s'$ quarks.
In addition to suppressing $\Delta S = \pm 1$ weak interaction couplings relative to that for muon decay by a factor of $\sin\theta_C = 0.2245$, Cabibbo's weak universality predicts that $\Delta S = 0$ couplings are suppressed by a factor of $\cos\theta_C = 0.974$. In fact, nuclear physicists had known since 1955 that the half-life for $^{14}{\rm O}\to{}^{14}{\rm N}\,\beta^+\nu$, a vector-mediated $0^+\to 0^+$ nuclear $\beta^+$-decay transition, was ∼3% longer than the value that was predicted using the $g_W$ value determined from muon decay [7,8]. In 1960, three years before Cabibbo's paper, this discrepancy was noted in the introductory remarks of a Nuovo Cimento article on the axial-vector current by Gell-Mann and Levy [4], together with a footnote that suggested that this might be because the unitarity condition is, in fact, $\alpha^2+\beta^2 = 1$, and not the $\alpha = \beta = 1$ condition that was conjectured in the Feynman-Gell-Mann $V-A$ paper. The footnote includes an estimate of the mixing that translates into $\theta \approx 14^\circ$, consistent with, and three years before, Cabibbo's estimate for $\theta_C$ based on $\Delta S = 1$ transitions. This may explain why K and M, but not C, were awarded the Nobel prize in 2008.
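The rotation picture described above can be checked numerically. The following is a minimal sketch, not anything taken from the original papers: it builds the 2×2 Cabibbo rotation from the quoted value $\sin\theta_C = 0.2245$ and confirms that the $\Delta S = 0$ and $\Delta S = 1$ coupling factors satisfy $\alpha^2+\beta^2 = 1$ and that the rotation is orthogonal. The function names are illustrative choices.

```python
import math

def cabibbo_rotation(theta_c):
    """Return the 2x2 rotation that maps (d, s) onto (d', s')."""
    c, s = math.cos(theta_c), math.sin(theta_c)
    return [[c, s], [-s, c]]

def times_transpose(m):
    """Compute M M^T for a real 2x2 matrix (equal to M M^dagger when M is real)."""
    return [[sum(m[i][k] * m[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

SIN_THETA_C = 0.2245                 # value quoted in the text
theta_c = math.asin(SIN_THETA_C)
V = cabibbo_rotation(theta_c)

alpha, beta = V[0][0], V[0][1]       # Delta S = 0 and Delta S = 1 coupling factors
product = times_transpose(V)         # should be the identity for a rotation

print(f"cos(theta_C) = {alpha:.4f}, sin(theta_C) = {beta:.4f}")
print(f"alpha^2 + beta^2 = {alpha**2 + beta**2:.12f}")
print("V V^T =", product)
```

With this input the cosine comes out close to the 0.974 quoted above, and the product $VV^T$ is the identity to within rounding.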

Gell-Mann Zweig quarks
During this same year, Gell-Mann [9] and Zweig [10] proposed the quark model, in which hadrons are composed of fractionally charged fermionic constituents (Zweig called them "aces"). Gell-Mann's paper was published in January 1964; Zweig's paper was never published. With rotated quarks, the short-distance weak interaction hadronic currents are the same as those for leptons (eqn. 9), where $V_{ij}$ is the eqn. 8 quark mixing matrix. The long-distance quark-to-hadron processes are described by form factors.

Discovery of CP violation
The Christenson, Cronin, Fitch and Turlay discovery of the CP-violating $K_L\to\pi^+\pi^-$ decay mode was reported in the summer of 1964 [11]. This was a relatively low-priority experiment that was not aimed at investigating CP violation but, instead, was designed to investigate some anomalies in coherent $K_2\to K_1$ regeneration measurements that had been reported during the previous year [12]. It failed to qualify for a spot in the main experimental hall of the then almost new AGS synchrotron, which was occupied by spectrometers specialized for total cross section determinations and $\pi$-, $K$-, $p$- and $\mu$-proton elastic scattering measurements. Instead, the experimental apparatus was located in a relatively inaccessible area inside the AGS magnet ring that the laboratory technical staff referred to as "Inner Mongolia," in a neutral particle line that was essentially a hole in the AGS shielding wall that was pointed at a target located in the accelerator's vacuum chamber, as illustrated in Fig. 2a. The high flux of $\gamma$-rays emerging from the target was attenuated by a 3.8 cm-thick lead block followed by a collimator and a bending magnet that swept charged particles out of the beam aperture. A double-arm spectrometer consisting of tracking spark chambers before and after two vertically bending magnets measured the directions and momenta of charged particles that were produced by $K_L$ meson decays that occurred in a 2 m-long decay volume that was a plastic bag filled with atmospheric-pressure helium (a low-budget approximation of a vacuum chamber), as shown in Fig. 2b.

Most of the detected events were due to CP-allowed $K_L\to\pi^+\pi^-\pi^0$ decays and $K_L\to\pi^\pm\ell^\mp\nu$ ($\ell = e, \mu$) semileptonic decays. In these decays, the $\pi^0$ or $\nu$ was not detected and, as a result, the invariant mass of the two detected charged particles was not, in general, equal to the $K_L$-meson mass ($m_{K_L} = 498$ MeV). For $K_L\to\pi^+\pi^-\pi^0$ decays where the $\pi^0$ is undetected, the $\pi^+\pi^-$ invariant mass is always below 363 MeV; for $K_L\to\pi^\pm\ell^\mp\nu$, where the $\nu$ is missed and the $\ell^\mp$ track is assigned a pion mass, the invariant mass distribution of the two charged tracks ranges from 280 to 546 MeV, with no peak near $m_{K_L}$. Although the energies of the decaying $K_L$ mesons were not known, their three-momentum directions were confined to be within an rms spread of ±3.4 mrad (±0.2°) around the beamline. A consequence of the missed particle in the three-body decay channels was that the vector sum of the two charged tracks' momentum vectors did not usually point along the well-defined $K_L$ beamline.
Thus, the experimental signature for $K_L\to\pi^+\pi^-$ decays was a pair of oppositely charged tracks that, when assigned pion masses, had an invariant mass that was within ±5 MeV of $m_{K_L}$ and a summed vector momentum that was directed along the $K_L$ beam direction. Results for these two quantities are shown in Fig. 2c, where the horizontal axis is the cosine of the angle between the $\vec{p}_{\pi^+} + \vec{p}_{\pi^-}$ direction and the $K_L$ beamline, and the upper, central and lower panels show the experimental distributions for $M(\pi^+\pi^-)$ below, centered on, and above $m_{K_L}$, respectively. In the central panel there is a pronounced peak totally contained within $\cos\theta > 0.99996$ ($\theta < 9$ mrad), a feature that is absent in the distributions for $M(\pi^+\pi^-)$ below or above $m_{K_L}$ shown in the upper and lower panels. The ratio of the $K_L\to\pi^+\pi^-$ branching fraction to that for the sum of all (CP-conserving) decays to charged particles was $(2.0 \pm 0.4)\times 10^{-3}$.
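Some of the kinematic numbers quoted above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the particle masses are rounded reference values that are assumed here rather than taken from the text, and the function names are hypothetical. It evaluates the allowed $\pi^+\pi^-$ mass range in $K_L\to\pi^+\pi^-\pi^0$, converts the $\cos\theta$ cut into an angle, and converts the beam divergence from mrad to degrees.

```python
import math

# Assumed masses in MeV (rounded reference values, not taken from the text).
M_KAON = 497.6
M_PI_CHARGED = 139.6
M_PI_NEUTRAL = 135.0

def dipion_mass_range(m_parent, m_pion, m_unseen):
    """Allowed pi+ pi- invariant-mass range when a third particle goes unseen."""
    lowest = 2.0 * m_pion            # the two pions at rest relative to each other
    highest = m_parent - m_unseen    # the unseen particle at rest in the parent frame
    return lowest, highest

def angle_mrad_from_cosine(cos_theta):
    """Convert a cosine cut into the corresponding angle in milliradians."""
    return 1.0e3 * math.acos(cos_theta)

low, high = dipion_mass_range(M_KAON, M_PI_CHARGED, M_PI_NEUTRAL)
cut_angle = angle_mrad_from_cosine(0.99996)
beam_spread_deg = math.degrees(3.4e-3)

print(f"pi+ pi- mass range in K_L -> pi+ pi- pi0: {low:.1f} to {high:.1f} MeV")
print(f"cos(theta) = 0.99996 corresponds to theta = {cut_angle:.2f} mrad")
print(f"3.4 mrad = {beam_spread_deg:.2f} degrees")
```

The upper edge of the $\pi^+\pi^-$ mass range lands just below 363 MeV, well separated from the ±5 MeV signal window around $m_{K_L}$, which is why the three-pion mode does not populate the central panel of Fig. 2c.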
The signal peak in the central panel of Fig. 2c contained an excess of 45 ± 10 events, but these were not all $K_L\to\pi^+\pi^-$ events. About ten of them were due to the coherent $K_L\to K_S\to\pi^+\pi^-$ regeneration process on the helium nuclei in the gas-bag decay region, and were indistinguishable from the $K_L\to\pi^+\pi^-$ signal events. Nature was kind. If the branching fraction had been much smaller or the regeneration cross section had been higher, the interpretation of the observed signal peak would have been ambiguous. As mentioned above, this was a low-priority experiment. If it had ended up simply setting an upper limit on the $K_L\to\pi^+\pi^-$ branching fraction, who knows when, if ever, a follow-up experiment with higher sensitivity would have occurred.

The Kobayashi-Maskawa paper
The subject of the famous Kobayashi-Maskawa paper [13] is CP violation. Simply put, CP violation means that the amplitude $\mathcal{M}$ for a process in which initial-state particles convert to final-state particles and the amplitude $\overline{\mathcal{M}}$ for its antiparticle equivalent are not the same, i.e., $\mathcal{M} \neq \overline{\mathcal{M}}$. But the hermiticity of the Hamiltonian requires that the squares of the amplitudes are equal, $|\mathcal{M}|^2 = |\overline{\mathcal{M}}|^2$, and the only way these two conditions can be satisfied is if $\mathcal{M}$ and $\overline{\mathcal{M}}$ differ by a phase. So, to incorporate CP violation into the Standard Model, all you have to do is find a way to insert a complex phase in it somewhere, which, at first glance, wouldn't seem to be so difficult. However, this CP-violating phase is special, unlike all other phases that show up in quantum theories, and including it in the theory is not at all trivial.
There are countless phases that occur in quantum mechanics; both the Schrödinger and Dirac equations have an imaginary coefficient and their solutions are complex wave functions. But all of these phases, with, so far, only one exception, have the same sign for particles and antiparticles.
Only a CP-violating phase has opposite signs for particles and antiparticles.
The KM paper examines various possible ways that a complex CP phase might be incorporated into the Standard Model. In the following I discuss the first five pages and the last page separately.

The KM paper: pages 1→5
In the first five pages, various possibilities were examined and the authors concluded that "no realistic models of CP-violation exist in the quartet scheme without introducing any other new fields." Here, by the "quartet scheme" they meant the four-quark model that included the charmed quark.
Note that this was written in 1971, nearly three years before the c-quark discovery in the "November 1974 revolution." This, and their page-five conclusion that no realistic model for CP violation exists with four quarks, raise two questions:
i) Why were Kobayashi and Maskawa so sure of the existence of the c-quark at such an early date?
ii) Why can't a CP-violating phase be introduced together with the Cabibbo angle into the eqn. 8 2×2 quark-flavor mixing matrix?

The discovery of charm: Japanese version
In 1970, a small team of experimenters in Japan led by Kiyoshi Niu exposed a stack of photographic emulsions to cosmic rays in a high-altitude commercial cargo airliner [14]. Upon subsequent inspection they found a remarkable event, shown in Fig. 3, in which an ultra-high-energy (multi-TeV) cosmic ray particle interaction produced four charged tracks and two very high energy, closely spaced $\gamma$-rays that, when attributed to a $\pi^0\to\gamma\gamma$ decay, had a total energy of 3.2 ± 0.4 TeV. Two of the charged tracks, labeled B & C in the figure, have kinks within ∼5 cm of the production point that are quite distinct in both the X and Y projections shown in Figs. 3a & b, indicating that they decayed to charged daughters (tracks B' & C'). When the event is viewed along the flight direction of track B (Fig. 3c), its daughter charged track (B') and the high-energy $\pi^0$ are very close to being back-to-back. The transverse momentum of the $\pi^0$ relative to the direction of track B was 627 ± 90 MeV, much higher than was possible for the decay of any known particle at that time. With the $\pi^0$ setting the energy scale and assumptions of two-body decays at each kink ($\pi^\pm\pi^0$ for B→B' and $\pi^0 p$ for C→C', where the secondary $\pi^0$ is missed), transverse momentum balance was used to estimate the masses and lifetimes of B and C:

(Table: assumed decay mode, mass (GeV), and lifetime (sec) for tracks B and C.)

The estimated B mass and the proper time intervals are consistent with the GIM estimates of ∼2 GeV for the charmed-quark mass (and in reasonable agreement with the now very well determined $D^-$ mass (1.869 GeV) and $\Lambda_c$ mass (2.286 GeV)). The lifetimes were much shorter than that of any known weakly decaying particle as well as the $O(10^{-13}$ s) estimate that was given in the GIM paper. But the latter fact is perhaps not too surprising since emulsion measurements are biased towards shorter lifetimes. For these reasons, Nagoya theorist Shuzo Ogawa interpreted Niu's event as being the associated production of an anticharmed-meson charmed-baryon pair and their subsequent decays. Although whether or not Niu's event and Ogawa's interpretation amounted to a Nobel-prize-worthy claim of a discovery might be a subject of dispute, what matters for our story here is that many people in the Japanese theoretical physics community, especially those in Nagoya that included Kobayashi and Maskawa, were convinced that the charmed quark had been discovered, and that four quarks existed in nature, a scenario that they called the "quartet model."

Figure 3: a) and b) The X and Y projections of Niu's event. Here the tracks labeled B and C have kinks at depths of 1.38 cm and 5.14 cm, respectively, that are evident in both views. c) The same event viewed along track B, where the direction of a high-energy $\pi^0$, inferred from two detected $\gamma$-rays, is very nearly opposite the direction of B', the daughter track that emerges from the track B kink. (The figures are taken from ref. [14].)

A CP phase in the four-quark mixing matrix?
In general, a 2×2 matrix like that in eqn. 8 has four complex elements that correspond to eight distinct real numbers. In the four-quark model, flavor mixing is completely described by the single real number $\theta_C$. Why can't one of the other seven numbers be used to specify $\delta_{CP}$, a CP-violating phase?
The flavor-mixing matrix describes a rotation and, thus, has to conserve probability. This means it should be unitary, i.e., $VV^\dagger = I$, where $I$ is the identity matrix. This corresponds to four relations that reduce the number of independent parameters from eight to four.
In the weak interaction quark currents (eqn. 9), the total number of quarks is conserved: $q_j$, which annihilates a $q_j$-quark, is always accompanied by $\bar{q}_i$, which creates a $q_i$-quark. The theory has a subtle property: if each quark field is multiplied by an arbitrary phase factor, and the interactions are modified by the same phases, there is no net effect on the $J^q_\mu$ current. This process is called rephasing, and the four phases can be expressed as three independent phase differences plus one overall phase that has no effect on the physics. Thus, of the eight real numbers we started with, four are removed by the unitarity constraint, and three can have any value with no net effect. Thus there is only one number left to define the matrix, and that is needed for the rotation angle $\theta_C$. There is no freedom to add a CP phase, and this is what led Kobayashi and Maskawa to conclude that there was no way to incorporate CP violation into the four-quark model.

The KM paper: page 6
In his 2008 Nobel prize lecture [15], Maskawa recalled that he and Kobayashi had completed the work that was covered in pages 1-5 of their paper and were disappointed with their failure to find any way to incorporate a CP phase into the four-quark model, and were reconciled to the unhappy likelihood that they would have to publish the negative result. Then, one evening, while (as is customary in Japan) he was taking his after-dinner bath, he mentally went through the calculation described in the previous paragraph, except this time for a six-quark scenario with a 3×3 flavor-mixing rotation matrix in three dimensions. In this case, there are 9 complex elements that are described by 18 real numbers. Of these, 9 are constrained by the unitarity requirement, 5 are taken up by rephasing, and 3 are needed to specify the 3-dimensional rotation, and that left one number that remained available.
Eureka! With six quarks there is room for a CP-violating phase!
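The counting argument used for the four-quark and six-quark cases generalizes to any number of quark generations. The sketch below simply automates that bookkeeping; the function name and the returned dictionary keys are illustrative choices, not notation from the KM paper.

```python
def mixing_parameter_count(n_generations):
    """Count the parameters of an n x n quark-mixing matrix, following the text's steps."""
    n = n_generations
    real_numbers = 2 * n * n           # n^2 complex elements
    unitarity = n * n                  # relations imposed by V V^dagger = I
    rephasing = 2 * n - 1              # 2n quark phases minus one overall phase
    angles = n * (n - 1) // 2          # rotation angles in n dimensions
    cp_phases = real_numbers - unitarity - rephasing - angles
    return {
        "real_numbers": real_numbers,
        "unitarity": unitarity,
        "rephasing": rephasing,
        "angles": angles,
        "cp_phases": cp_phases,
    }

for n in (2, 3):
    print(n, "generations:", mixing_parameter_count(n))
```

For two generations the count leaves no room for a phase (8 - 4 - 3 - 1 = 0), while for three generations exactly one phase survives (18 - 9 - 5 - 3 = 1), as described above.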
A sixth page was added to the manuscript that included the remarks: "Next we consider a 6-plet model, another interesting model of CP violation,... with a 3×3 instead of 2×2 unitary matrix. In this case we cannot absorb all phases of matrix elements into the phase convention and can take, for example, the following expression:"

The first proposal for (and the naming of) charm
The distinction between electron- and muon-neutrinos had been established in 1962 [16], and, in what was at that time the beginnings of the Standard Model, the electron and muon and their neutrinos occupied two weak-isospin = 1/2 spinors. In a 1964 paper, Bjorken and Glashow [17] discussed an expansion of SU(3) to SU(4) with the addition of another strangeness-like flavor quantum number. They formulated their model in terms of the Sakata model, the predecessor of the quark model that used the proton-neutron isospin doublet and isoscalar Lambda as basic constituents, with the Lambda replaced by a doublet that had a fourth baryonic constituent with a non-zero value of the new quantum number. When reformulated in the context of the quark model, which had just emerged at that time with three quarks, this was equivalent to adding a fourth quark with matching patterns for the leptons and quarks. While their proposal was not very different from schemes that other authors had proposed around that time (see, for example, refs. [18][19][20]), Bjorken and Glashow gave the new flavor the charismatic name "charm," and that's the one that stuck; the fourth quark was known as the "charmed quark" (and not the grammatically awkward "charm quark") even before it was discovered. In retrospect, this four-quark scheme seems like a pretty sound, almost obvious, idea; how could anyone dismiss such a simple and sensible suggestion? Nevertheless, for the next six years this idea didn't go very far. By the time the GIM paper [6] appeared in 1970, the Bjorken-Glashow paper had received a grand total of six citations. For some reason, there seems to have been a special affinity in the physics community for the number three and an aversion to the number four.
Moreover, even the GIM paper that proposed a four-quark scenario as a compelling explanation for the suppression of flavor-changing neutral currents, an important theoretical issue at that time, didn't experience a boom in citations until after the $J/\psi$ discovery, which finally convinced the worldwide physics community that there were (at least) four quarks.
In 1975, soon after the $J/\psi$ discovery, Maiani (the "M" in GIM), who, at that time, was unaware of the KM result, wrote an interesting paper [21] that contained all of the KM model and then some. But, thanks to their three-year-long head start, it was Kobayashi and Maskawa, and not Maiani, who got to go to Sweden in December 2008. The six-quark era started in earnest in 1977, after the discovery of the $\tau$-lepton [22] and the b-quark [23], and the KM model was elevated to the role of being the most likely mechanism for CP violation. Although the b-quark's charge = 2/3 partner, the t-quark, wasn't discovered until 1995 [24,25], there was very little doubt about its existence. The only question was its mass; based on the mass pattern of the known quarks, the general consensus was that it was probably in the ∼25-35 GeV range [30].

"Discovery" of the KM paper
As mentioned above, for over two years the KM paper remained completely unnoticed outside of Japan (and only received limited attention inside Japan). It was finally brought to the attention of the worldwide physics community in a curious set of circumstances that are recounted here. But first it should be noted that the version of the quark-mixing matrix that appeared in the KM paper (eqn. 19) contains a pretty obvious typographical (or transcription) error. Since the KM matrix nominally describes a rotation, if all three mixing angles and the CP phase are set to zero, it should revert to the identity matrix, i.e., $V\to{\rm diag}(1,1,1)$. However, in the matrix that appears in the KM paper, the zero-angle limit for the lower-right diagonal element $V_{tb}$, which should be $V_{tb}\to 1$, is incorrect.

The first paper to establish that the KM model could account for all that was known about CP violation at that time was by Pakvasa and Sugawara, published in the July 1976 issue of Physical Review D [31]. In their paper, they pointed out that in the six-quark model, a non-zero value of $\varepsilon$, the neutral $K$-meson mass-matrix CP violation parameter, of the correct magnitude would be produced by the interference between the virtual c- and t-quark contributions to the $K^0$-$\bar{K}^0$ mixing box diagram shown in Fig. 4a. They also pointed out that in the KM picture, the penguin diagram shown in Fig. 4b, which would mediate direct-CP-violating $K_2\to\pi^+\pi^-$ decays, i.e., the $\varepsilon'$ parameter, would be small and consistent with the then existing experimental limits.
Since PRD is a widely distributed physics journal, this paper provided the first awareness of the KM paper to the international particle physics community.

Figure 4: a) The $K^0$-$\bar{K}^0$ mixing box diagram. b) The penguin diagram that would mediate direct-CP-violating $K_2\to\pi^+\pi^-$ decays.

In their paper, Pakvasa and Sugawara made no mention of the typo in the KM paper and included an expression for the matrix that they attributed to KM but that, in fact, was different. Their paper also had a typo that mistakenly identified Toshihide Maskawa as K. Maskawa in the citation to the KM paper. Interestingly, many of the papers that immediately followed the Pakvasa-Sugawara paper used the Pakvasa-Sugawara version of the CKM matrix with no mention of the mistake, and with citations to the KM paper that had T. Maskawa incorrectly listed as K. Maskawa (see, e.g., refs. [32][33][34][35]). This recurrence of the typo in the citations provided pretty clear evidence that the Pakvasa-Sugawara PRD article was the source that researchers used for both the matrix and the citation, and that the PTP paper itself was probably not very widely read.

(But why didn't Pakvasa and Sugawara alert their readers about the problem with the matrix in the KM paper? Sugawara's recollection is that when he first learned about the KM paper at a University of Tokyo physics seminar, he was impressed by their six-quark scheme and "reconstructed it in [his] own way, without reading their paper carefully." When he and Pakvasa subsequently did their analysis and wrote up their results, they included Sugawara's version of the matrix, which was the KM version without the error, in their paper. In fact, Pakvasa and Sugawara remained unaware of the KM paper's typo. According to Sugawara, "I never realized that the paper had this typo until it was recently pointed out to me.")

But, in addition to the typo in their citation to the KM paper, the Pakvasa-Sugawara version of the KM matrix had a problem of its own. In their paper, the KM expression for $V_{tb}$ (eqn. 22) was replaced by one that doesn't go to zero in the limit of zero mixing angles. But neither does it go to 1 and, for $\delta = 0$, the matrix has a negative determinant. Nevertheless, this version of the matrix is unitary, which is the only essential requirement for a quark-flavor mixing matrix, and this form was used in much (but not all) of the literature until 1984, when the currently widely accepted Chau-Keung parameterization was proposed and soon thereafter endorsed by the Particle Data Group.

Reparameterizing the CKM matrix
A rotation in three dimensions can be accomplished by three successive rotations: first by an angle $\theta_1$ around the $z$ axis that mixes $x$ and $y$ [$(x, y)\to(x', y')$], then by $\theta_2$ about the new $x'$ axis that mixes $y'$ and $z$ [$(y', z)\to(y'', z')$] and, finally, by $\theta_3$ around the $y''$ axis that mixes $x'$ and $z'$. This is just one of many ways that can be used to specify a given rotation.
For example, the order of the three rotations can be changed, and there is freedom in the choice of the axes that are used to define the rotations. For these different choices, the values of the individual rotation angles are different, as are the expressions for each matrix element in terms of these rotation angles. Ultimately, however, the numerical value of the magnitude of each matrix element for any of these choices has to be the same, and independent of the choice of the individual rotations. In addition, in the case of the CKM matrix, which is complex, rephasing invariance provides five independent arbitrary phase parameters that can be attached to the various matrix elements to establish whatever phase convention may seem convenient. The physics content is independent of these parameterization choices.
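The statement that rephasing leaves the magnitudes of the matrix elements untouched can be illustrated numerically. The following is a minimal sketch under stated assumptions: the matrix is built as a product of three successive rotations with the complex phase attached to the middle one (one common ordering, chosen here only for illustration), and the angle and phase values are arbitrary test inputs rather than measured quantities. All function names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def dagger(m):
    """Hermitian conjugate of a 3x3 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(3)] for i in range(3)]

def mixing_matrix(t12, t23, t13, delta):
    """Three successive rotations; the phase delta rides on the 1-3 rotation."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    r23 = [[1, 0, 0], [0, c23, s23], [0, -s23, c23]]
    u13 = [[c13, 0, s13 * cmath.exp(-1j * delta)],
           [0, 1, 0],
           [-s13 * cmath.exp(1j * delta), 0, c13]]
    r12 = [[c12, s12, 0], [-s12, c12, 0], [0, 0, 1]]
    return matmul(r23, matmul(u13, r12))

def rephase(v, row_phases, col_phases):
    """Attach arbitrary phases to the up-type (rows) and down-type (columns) quarks.
    Only the five independent differences between them matter."""
    return [[cmath.exp(1j * (row_phases[i] - col_phases[j])) * v[i][j]
             for j in range(3)] for i in range(3)]

def unitarity_deviation(v):
    """Largest deviation of V V^dagger from the identity matrix."""
    p = matmul(v, dagger(v))
    return max(abs(p[i][j] - (1 if i == j else 0))
               for i in range(3) for j in range(3))

# Arbitrary illustrative inputs (radians), not fitted values.
V = mixing_matrix(0.227, 0.042, 0.0037, 1.2)
V_rephased = rephase(V, [0.3, -1.1, 2.0], [0.7, 0.1, -0.4])
magnitude_shift = max(abs(abs(V_rephased[i][j]) - abs(V[i][j]))
                      for i in range(3) for j in range(3))

print("unitarity deviation (original): ", unitarity_deviation(V))
print("unitarity deviation (rephased): ", unitarity_deviation(V_rephased))
print("largest change in |V_ij|:       ", magnitude_shift)
```

Both matrices come out unitary to within rounding and every $|V_{ij}|$ is unchanged, even though the individual complex entries differ between the two phase conventions.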
On what basis should a parameterization be selected? In answer to this, Haim Harari suggested some criteria for what he would consider to be a "good" parameterization. These included [36]:
• There should be a simple relation between the most directly measurable matrix elements $V_{ij}$ and the quark mixing angles.
• The matrix elements above the diagonal, which correspond to kinematically allowed decay processes that are directly measurable, should have the simplest possible expressions.
• If possible, the CP-violating phase should be linked to only one angle, and preferably the sine of that angle.
During the years immediately following the wide recognition of the KM paper, there was considerable effort aimed at finding a suitable parameterization. This was aided by concurrent experimental measurements of the relative magnitudes of the $V_{cb}$ and $V_{ub}$ matrix elements using $B$ mesons that were produced via $e^+e^-$ annihilations at two colliders that existed at that time: PEP at SLAC and CESR at Cornell.

Experimental information about quark transitions
The PEP project was initially conceived as a two-ring proton-electron-positron collider, with an electron-positron ring that could support $E_{\rm cm} = 30$ GeV $e^+e^-$ collisions, primarily to search for the top quark (if its mass was less than 15 GeV), and a second 150 GeV proton ring that could support $e^\pm p$ collisions with $E_{\rm cm} \approx 100$ GeV for measurements of deep inelastic scattering at high energies and $Q^2$ values. The CESR collider was conceived during 1974 as a follow-up to the Cornell fixed-target $e^-p$ scattering program, and proposed to the U.S. National Science Foundation in 1975, soon after the $J/\psi$ discovery, as an $E_{\rm cm} \approx 16$ GeV $e^+e^-$ collider primarily aimed at studies of charmed particles. The PEP and CESR projects were both well underway when the b-quark was discovered in 1977. Fortuitously, the initial CESR design could comfortably operate in the $E_{\rm cm} = 9-11$ GeV range, and cover the three narrow $\Upsilon(1S, 2S, 3S)$ resonances and the threshold region for $e^+e^-\to B\bar{B}$ meson pair production, which was expected to be around $E_{\rm cm} = 10.5$ GeV.

The two projects started running in 1979 and soon thereafter provided convincing experimental evidence for a strong hierarchy among the weak interaction mixing angles for the three quark generations. It was already well established that transitions within the first quark generation (u→d) and within the second generation (c→s) were strongly favored over transitions between the first and second generations (s→u and c→d), i.e., the Cabibbo angle. Experiments at PEP found that transitions between the second and third generations (c→b) are more suppressed than those between the first and second generations, and the CESR experiment determined that transitions between the first and third generations (u→b) are the least favored of all.

2nd ↔ 3rd generation: The MAC experiment at PEP used b-flavored hadrons (mainly $B^\pm$ and $B^0$ mesons) produced via the $e^+e^-\to b\bar{b}$ annihilation process (about 1/10th of the total annihilation cross section) to determine the b-quark lifetime by measuring the impact parameters of charged leptons from $B\to X\mu\nu$ and $Xe\nu$ inclusive semileptonic decays, where $X$ is a hadronic system [37]. (The definition of the impact parameter is indicated in Fig. 5a.) This was a difficult measurement: the parent $B$ meson's energy and direction were not precisely known on an event-by-event basis; there was contamination from semileptonic decays of charmed mesons; and the mean value of the impact parameter that was eventually determined (≈170 µm) was substantially smaller than the experimental resolution (∼600 µm) as well as the horizontal size of the beam-beam interaction region (∼400 µm). Nonetheless, the measured impact parameter distributions for muons and electrons shown in the upper and lower panels of Fig. 5b, respectively, both had excesses at positive values, and these translated into a b-quark lifetime of $\tau_b = 1.8 \pm 0.7$ ps. Since this was about four times longer than the lifetime of the much lighter c-quark, it was a big surprise at that time. As discussed below, b→c transitions are the b-quark's dominant decay mechanism, and the measured lifetime could be used to determine a value for $|V_{cb}|$. The resulting suppression is similar to (but not exactly the same as) the factor-of-five Cabibbo suppression between $V_{us}$ and $V_{ud}$. The MAC results were confirmed by the Mark II and DELCO experiments at PEP [39,40] and the TASSO experiment [41] at PETRA, an $E_{\rm cm} = 40$ GeV $e^+e^-$ collider at the DESY laboratory in Hamburg.
1 st ↔ 3 rd generation: The CLEO experiment studied semileptonic B→Xℓν decays of B mesons that were produced at E cm = 10.58 GeV, the peak 14 of the Υ(4S)→B B̄ resonance [42]. Since this energy is only 20 MeV above the B B̄ threshold, the B mesons are produced very nearly at rest (the boost factor is γβ = 0.062) and the energy of a decay lepton (ℓ = e, µ) in the laboratory frame is very nearly equal to what it is in the B meson rest frame. In b→cℓν-mediated decays, the minimum mass of the hadronic system is M min X = m D = 1.86 GeV, and this translates into a maximum lepton momentum of p max cℓν = 2.31 GeV/c; in b→uℓν-mediated decays, the mass of the hadronic system can be as light as M min X = m π , and the end-point momentum is p max uℓν = 2.60 GeV/c. Thus, measurements of the lepton momentum spectra in the end-point region can be used to determine the relative strengths of the b→c and b→u transitions and extract the value of |V ub | 2 /|V cb | 2 . Figure 6 shows the measured momentum distributions for electrons (upper) and muons (lower) together with expectations for b→cℓν (dashed curves) and b→uℓν (dotted curves). There are no events in the 2.31<p lepton <2.60 GeV/c range that could be unambiguously attributed to b→uℓν decays and the shapes of the spectra are consistent with expectations for ∼100% b→cℓν with no significant contribution from b→uℓν. From these data, the CLEO group established a 90% CL upper limit 15 of |V ub |/|V cb |<0.14.

Fig. 5 (caption, continued): to the e + e − beamline, together with indications of the size of the beam-beam interaction region, and the definition of the impact parameter, i.e., the distance of closest approach of the charged lepton track to the e + e − interaction point. The unknown B meson's direction was assigned with reasonably good accuracy to be along the event's thrust axis. b) Impact parameter distributions for muons (upper) and electrons (lower) (from the MAC experiment [37]). Here each entry is weighted by the inverse square of its measurement error. c) A quark-line diagram for b-quark decays. The subscript α indicates the quark color.

The Chau-Keung and Wolfenstein parameterizations
Since the early parameterizations fail on all three of the Harari criteria, a better one was needed.

Chau-Keung parameterization
The parameterization proposed by Chau and Keung was specifically motivated by the occurrence of the CP violating phase in the V tb = cos θ 1 cos θ 2 cos θ 3 − cos θ 2 cos θ 3 e iδ term in the Pakvasa-Sugawara version of the original KM parameterization, which seemed to suggest that there is a large CP violation that is confined to the (t, b) quark sector. This was in sharp contrast to the results of detailed calculations of measurable CP violating effects, which invariably resulted in small numbers that involved factors of sin θ 2 sin θ 3 e iδ . 1986 [46] and, according to recent measurements [38],

Wolfenstein parameterization
The Wolfenstein parameterization is an approximation that employs a polynomial expansion in terms of λ ≡ sin θ C = 0.2265 that reflects the hierarchical character of the CKM matrix. With an accuracy up to O(λ 3 ) it has the form:

$$ V \simeq \begin{pmatrix} 1-\lambda^{2}/2 & \lambda & A\lambda^{3}(\rho - i\eta) \\ -\lambda & 1-\lambda^{2}/2 & A\lambda^{2} \\ A\lambda^{3}(1-\rho-i\eta) & -A\lambda^{2} & 1 \end{pmatrix} \qquad (28) $$

where the parameter A ≈ 0.8 accounts for the fact that the Cabibbo-like suppression between V cb and V cd is about 20% more severe than that between V cd and V ud . In this parameterization, all of the CP violation resides in the single parameter η that is confined to the V td and V ub corners of the matrix, where it is multiplied by λ 3 A, a small number.
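As a rough numerical illustration of what "accurate up to O(λ 3 )" means, the sketch below builds the standard leading-order Wolfenstein matrix and measures how far it is from exact unitarity. The function names are hypothetical, and the parameter values are illustrative inputs chosen to be close to commonly quoted numbers rather than a fit.

```python
def wolfenstein_ckm(lam, A, rho, eta):
    """CKM matrix in the Wolfenstein form, truncated at O(lambda**3)."""
    return [
        [1 - lam**2 / 2, lam, A * lam**3 * complex(rho, -eta)],
        [-lam, 1 - lam**2 / 2, A * lam**2],
        [A * lam**3 * complex(1 - rho, -eta), -A * lam**2, 1.0],
    ]


def unitarity_defect(V):
    """Largest |(V V^dagger - 1)_ij|; zero for an exactly unitary matrix."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(V[i][k] * complex(V[j][k]).conjugate() for k in range(3))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(entry - target))
    return worst


# Illustrative inputs (assumed values, not a fit)
LAM, A, RHO, ETA = 0.2265, 0.79, 0.141, 0.357

V = wolfenstein_ckm(LAM, A, RHO, ETA)
defect = unitarity_defect(V)
print(f"|V_ub| = {abs(V[0][2]):.4f}, unitarity defect = {defect:.1e}, lambda^4 = {LAM**4:.1e}")
```

With these inputs the departure from unitarity comes out at the level of λ 4 , as expected for a truncated expansion, and the only complex entries are the two corner elements.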
This parameterization is very convenient and is widely used, but in a somewhat modified form that was suggested by Buras and colleagues [47]. In the Buras version, the Wolfenstein λ, A, ρ and η parameters are redefined in terms of the Chau-Keung angles to be exactly

$$ \sin\theta_{12} \equiv \lambda, \qquad \sin\theta_{23} \equiv A\lambda^{2}, \qquad \sin\theta_{13}\,e^{-i\delta} \equiv A\lambda^{3}(\rho - i\eta). $$

With these parameter definitions, V ub is the same as in the Chau and Keung parameterization, and the higher-order corrections to V us and V cb start at O(λ 7 ) and O(λ 8 ), respectively. In addition Buras et al. defined new parameters ρ̄ and η̄ as

$$ \bar\rho \equiv \rho\,(1-\lambda^{2}/2), \qquad \bar\eta \equiv \eta\,(1-\lambda^{2}/2). $$

Although the distinction between Wolfenstein's ρ, η and (the nearly equal) ρ̄, η̄ parameters may seem to be confusing and unnecessary, the latter parameters are preferred and are more generally used (for reasons that are discussed in detail in ref. [48]). The PDG review [38] only provides values for the four (redefined) "Wolfenstein parameters" that, in 2020, were

$$ \lambda = 0.22650 \pm 0.00048, \quad A = 0.790^{+0.017}_{-0.012}, \quad \bar\rho = 0.141^{+0.016}_{-0.017}, \quad \bar\eta = 0.357 \pm 0.011. \qquad (35) $$
The eqn. 28 form of the matrix is carefully tuned for CP violation in the b-quark sector, which is produced by the imaginary parts of V ub and V td . Since all the matrix elements in the second row and column, i.e., the ones that involve the strange and charmed quarks, are real, it is not applicable to descriptions of CP violation in the c- and s-quark sectors. For this, Wolfenstein provided a version of the matrix (eqn. 36) that is expanded to include CP-violating terms of O(λ 4 ) in V ts and O(λ 5 ) in V cd . An expression involving the same parameters that is exactly unitary is given by Kobayashi in ref. [49].

5 CP violation in b-quark decays?
In the KM model, CP violation in neutral K-meson decays, other than that described by the ε mass-matrix parameter, is mainly produced by complex phases in the upper-left 2×2 corner of the KM matrix 16 that, in Wolfenstein's O(λ 5 ) parameterization (eqn. 36), are confined to the phase of V cd , which is tiny. In contrast, the CKM matrix element for charmless decays of B-mesons that proceed via b → u transitions is V ub , with a CP phase δ that (we now know) is δ ≈ 70°. If the kaon's direct-CP parameter ε ′ , caused by a tiny, fraction-of-a-degree phase, can be measured in the 20th century, the observation of CP violations produced by a 70° phase in B-meson decay in the 21st century should be easy.
Wrong! There are a number of important differences between the neutral kaon and B-meson systems that make the types of measurements that were used to discover and elucidate the properties of CP violations in the kaon system inapplicable to the B-meson system.

B-mesons have a huge number of different decay channels.
In contrast to the K-meson system, where 99.98% of K S decays are to either π + π − or π 0 π 0 final states, and 99.7% of K L decays are to either πℓν or πππ final states, B-mesons have hundreds of different decay modes, almost all of which have, at best, fraction-of-a-percent-level branching ratios.

The B 0 -B̄ 0 mass eigenstates have very short, and nearly equal, lifetimes.
In the K-meson system, the K S and K L mass eigenstates have large lifetime differences (0.1 ns vs. 52 ns, respectively), and an essentially pure K L beam can be achieved by simply making a neutral beam line that is longer than several K S proper decay lengths. In comparison, the equivalent B H and B L mass eigenstates have very nearly equal lifetimes of a mere 1.5 ps (cτ B = 0.45 mm), and there is no possibility for making a beamline of CP-tagged B-mesons, much less one that distinguishes between the two different CP values.

B-meson decays to final states that are eigenstates of CP are infrequent.
K S mesons decay almost exclusively to CP-even π + π − and π 0 π 0 eigenstates; K L decays to CP-odd π + π − π 0 and π 0 π 0 π 0 eigenstates occur with branching fractions of 19.5% and 12.5%, respectively. In contrast, the most prominent CP-eigenstate decay mode for neutral B-mesons is B 0 →J/ψK S , with a meager 0.045% branching fraction.
Moreover, as noted above, V cb , which has no CP-violating phase, has a magnitude that is an order of magnitude larger than V ub and, thus, branching fractions for b → u mediated "non-charmed" decays of B mesons are strongly suppressed.As a result the prospects for finding and studying CP violations in the B-meson system looked pretty hopeless.

Prospects for testing the KM CP mechanism: pre 1980
Sometime around 1979-80, Abraham Pais, who along with Murray Gell-Mann was responsible for many of the fundamental theoretical discoveries in the early days of flavor physics, discussed the prospects for CP measurements with charmed and beauty mesons in a seminar at Rockefeller University in New York City, where he had this to say about CP violations with heavy quarks [50]: "There is good news and bad news. The good news is that CP violation in a heavy meson system is quite similar to that of the K-meson system. The bad news is that there is little distinction like the K S -K L mass eigenstates. For heavy meson systems, both lifetimes are short." In the audience was a young theorist, Ichiro (Tony) Sanda, who recalls thinking at that time [50]: "'CP violation in a heavy meson system is quite similar to that of the K meson system?' How could anything as interesting as CP violation be so uninteresting." He resolved to find a way to prove that Pais was wrong.

Tony Sanda's great idea
At this same time, I was one of the founding members of the CLEO experiment that was located on the Cornell University campus in upstate New York, about a two-hour drive from New York City. 17 The CESR e + e − collider was in its infancy and had a maximum instantaneous luminosity of L ∼ 5 × 10 30 cm −2 s −1 . We had just discovered the Υ(4S) resonance [51] and, while running at E cm = 10.58 GeV, its peak energy, we could collect about 30 Υ(4S)→B B̄ events/day (see Fig. 7a).
This was, at that time, the most prolific source of B mesons in the world and we were anxious to make good use of them.To this end, my Cornell colleagues invited Sanda for some seminars.
During his first seminar, the best strategy that Sanda had to offer was a vague plan to search for a ℓ + ℓ + vs. ℓ − ℓ − asymmetry in same-sign dilepton events, with the faint hope that somehow a measurable CP violation would show up. But this was very much like the frequently performed K 0 (τ )→π − ℓ + ν vs. K̄ 0 (τ )→π + ℓ − ν asymmetry measurements, but without any of the above-listed advantages that make the neutral kaon system so special.
The experimenters in the audience, who were all hyped up to do great and wondrous things with the B mesons that they had worked so hard to produce, were noticeably disappointed.When Sanda got back to New York City, he felt under strong pressure to come up with something that was new and unique to B mesons.
A few months later, in his second seminar at Cornell, he did just that. He proposed a scheme that he developed in collaboration with Ashton Carter [52,53] for using interference between the B 0 →K S J/ψ and B 0 → B̄ 0 →K S J/ψ decay amplitudes. This scheme eventually became the primary motivation for the BaBar and Belle asymmetric B-factory experiments and was the basis for Kobayashi and Maskawa's Nobel prize.
The idea, which is illustrated in Fig. 7b, was very elegant. You start with a flavor-tagged B 0 (or B̄ 0 ; here I use a tagged B 0 to illustrate the idea) that can directly decay via the B 0 →K 0 J/ψ diagram shown in the top panel, or decay via the indirect route where it first mixes into a B̄ 0 that then decays via B̄ 0 → K̄ 0 J/ψ. But experiments don't distinguish between K 0 and K̄ 0 decays; instead they measure K S and K L decays. Thus, in events where a K S is detected, the direct and indirect decay routes access identical final states and interfere. (Likewise for events where a K L is detected, except here the interference term has an opposite sign.) The direct amplitude has no CP phase (at least not at leading order), but the indirect amplitude has an extra factor of V 2 td (not |V td | 2 !) and so, a CP phase of −2β. For tagged B̄ 0 decays, the CKM factor is V * 2 td and the CP phase is +2β. Thus, the B̄ 0 (τ )→f CP vs. B 0 (τ )→f CP time-dependent asymmetry, where f CP is any CP eigenstate, is given by eqn. 39, where ξ CP (=−1 for K S J/ψ and +1 for K L J/ψ) is the CP eigenvalue of f CP , ∆M B is the mass difference between the neutral B H and B L mass eigenstates (i.e., the B 0 -B̄ 0 mixing frequency), and τ is the proper time between the B 0 →KJ/ψ (B CP ) decay and the flavor-specific decay of the accompanying B̄ 0 meson (B tag ), whose decay products are used to tag the B 0 meson's flavor. Note that in e + e − colliders, τ can be positive (when the B CP decay occurs after the B tag decay), or negative (when the B CP decay occurs first), and the time-integrated asymmetry is zero.
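The statement that the time-integrated asymmetry vanishes can be checked with a few lines of code. The sketch below assumes the leading-order form A(τ) = −ξ CP sin 2β sin(∆M B τ) for the asymmetry (the overall sign is a convention and is an assumption here), measures τ in units of the B lifetime, and uses an illustrative mixing parameter x = ∆M B τ B of 0.77 and sin 2β = 0.7; none of these numbers are taken from the original measurements.

```python
import math


def cp_asymmetry(tau, sin2beta, xi_cp, x_mix):
    """Assumed leading-order asymmetry; tau is in units of the B lifetime
    and x_mix = Delta M_B * tau_B. The overall sign is a convention."""
    return -xi_cp * sin2beta * math.sin(x_mix * tau)


def time_integrated_asymmetry(sin2beta, xi_cp, x_mix, t_max=10.0, n=4000):
    """Average of the asymmetry weighted by the decay rate exp(-|tau|),
    integrated over both signs of tau with a midpoint rule."""
    step = 2.0 * t_max / n
    num = den = 0.0
    for i in range(n):
        tau = -t_max + (i + 0.5) * step
        weight = math.exp(-abs(tau))
        num += weight * cp_asymmetry(tau, sin2beta, xi_cp, x_mix)
        den += weight
    return num / den


SIN2BETA, X_MIX = 0.7, 0.77  # illustrative, assumed values

a_ks = cp_asymmetry(1.0, SIN2BETA, -1, X_MIX)   # K_S J/psi after one lifetime
a_kl = cp_asymmetry(1.0, SIN2BETA, +1, X_MIX)   # K_L J/psi, opposite sign
integrated = time_integrated_asymmetry(SIN2BETA, -1, X_MIX)
print(f"A(K_S, tau=1) = {a_ks:+.3f}, A(K_L, tau=1) = {a_kl:+.3f}, integrated = {integrated:.1e}")
```

Because the asymmetry is odd in τ while the decay-rate weight is even, the two halves cancel, which is why the sign of τ has to be measured event by event.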

Great idea! but is it practical?
The idea was new, and the mechanism was unique to B mesons, but there were many pieces that had to fall into just the right places for Sanda's proposal to have any chance of being practical.
Since at that time there was no experimental information about the b-quark-related CKM elements or B 0 -B0 mixing, there was no way to form any opinion about the prospects for their favorability.
Tens of millions of tagged B→f CP decays would be required: In 1980, the world's best source of B-mesons was CESR, with a production rate of ∼30 B B̄ events/day, of which only half were the desired B 0 B̄ 0 pairs. Sanda's golden mode was B 0 ( B̄ 0 )→K S J/ψ, which he estimated to have an O(10 −3 ) branching fraction, and this implied that the fractional probability of usable events would be tiny. Here ϵ trk is the efficiency for charged track detection that, even in a nearly perfect detector, cannot be much higher than ϵ trk ≈ 0.85, and ϵ tag eff ≈ 0.3 is an estimate of the maximum possible effective B-flavor tagging efficiency. Thus, an A CP KJ/ψ measurement with even modest precision would require ∼30 M B B̄ events (and a million days of operation at 1980 state-of-the-art collider and detector performance levels).

B 0 -B̄ 0 mixing had to be substantial, |V cb | had to be small, and |V ub | even smaller: An essential part of the Sanda-Carter scheme is that the fraction of B 0 mesons that oscillate into a B̄ 0 before they decay has to be reasonably large. This meant that ∆M B would have to be similar in magnitude to Γ B = 1/τ B , which was then known to be Γ B ≈ 4.4 × 10 −10 MeV. In the 1980s, when the t-quark mass was (almost universally) expected to be m t ∼ 35 GeV, calculations [54] found ∆M B ≈ 1.2 × 10 −10 MeV. If this were the case, only ∼6% of the tagged B 0 -mesons would oscillate into a B̄ 0 before decaying. (After three lifetimes, the sin ∆M B τ factor in eqn. 39 would have barely reached 0.5.) In addition the B lifetime had to be relatively long: i.e., |V cb |<0.1, and |V ub |<|V cb |.

The time sequence between the tag- and f CP -decay has to be distinguished: The asymmetry in eqn. 39 has opposite signs for negative and positive values of τ , which makes it essential to distinguish events in which the B CP decay occurred first from those in which it occurred last. The B mesons that are produced in Υ(4S)→B B̄ decays have c.m. momenta p B = 327 MeV/c, corresponding to γβ = 0.062, and have a mean decay distance of βγcτ B = 28 µm, which is unmeasurably small in a c.m. e + e − collider environment. For a collider operating at the Υ(4S), it would be impossible to distinguish the time sequence of the B CP and B tag decays.
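The boost factor and decay distance quoted above follow from two one-line relations, βγ = p/m and mean decay length = βγcτ. The sketch below reproduces them; the B-meson mass of 5279 MeV is an assumed input, while the momentum and cτ B values are the ones given in the text.

```python
B_MASS_MEV = 5279.0   # assumed B-meson mass
C_TAU_B_UM = 450.0    # c * tau_B = 0.45 mm, expressed in micrometers


def boost_factor(momentum_mev, mass_mev):
    """beta * gamma = p / m for a particle of the given momentum and mass."""
    return momentum_mev / mass_mev


def mean_decay_length_um(beta_gamma, c_tau_um):
    """Mean laboratory decay distance, beta * gamma * c * tau."""
    return beta_gamma * c_tau_um


bg_symmetric = boost_factor(327.0, B_MASS_MEV)
length_symmetric = mean_decay_length_um(bg_symmetric, C_TAU_B_UM)
print(f"beta*gamma = {bg_symmetric:.3f}, mean decay length = {length_symmetric:.0f} um")
```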
Since the existence of six quarks was pretty well established, the KM mechanism provided a compelling and almost obvious mechanism for explaining the existence of CP violation. However, the prospects for a conclusive experimental test of this idea seemed hopeless. The fortuitous set of circumstances that made studies of CP violation in the neutral kaon system possible seemed unlikely to be repeated.

Three miracles
Nevertheless, in spite of these obstacles, Sanda maintained a nearly mystical belief that "Mother Nature has gone out of Her way to show us CP violation, and She will also show us the way to the fundamental theory" [50], and forcefully advocated an aggressive program of experimental investigations of CP violation in the decays of B mesons. However, for the reasons itemized above, Sanda's advocacy was initially met with considerable skepticism from his colleagues in both the theoretical and experimental physics communities.
And then three miracles occurred:

Miracle 1: B 0 -B̄ 0 mixing was discovered at DESY: The most exciting event in flavor physics during the 1980s was the 1987 discovery of a large signal for B 0 -B̄ 0 mixing by the ARGUS experiment at DESY [55]. The strength of the mixing was clear evidence that the top-quark mass, now known to be 173 GeV, was nearly an order of magnitude larger than expected, which was shocking news to almost everyone. This discovery, coupled with the 1.5 ps B-meson lifetime measurements from PEP and PETRA that translated into |V cb | ≈ 0.04, and the suppression of b→u relative to b→c transitions that meant that |V ub | was about a factor of ten smaller than |V cb |, confirmed Sanda's strong belief that Mother Nature would indeed help us "find the way to the fundamental theory."

Miracle 2: Three-order-of-magnitude improvement in e + e − collider luminosity: Advances in the understanding and modeling of beam dynamics, the use of separate magnet rings that enabled multibunch collisions, and major advances in RF feedback systems provided the huge increases in the e + e − →Υ(4S) production rate that were required by the experiment [56,57].

Miracle 3: The invention of asymmetric e + e − colliders and innovations in detector technology: Pierre Oddone realized that an e + e − collider operating at the Υ(4S) resonance with a modest (i.e., factor of ∼2) difference between the e + and e − beam energies would produce boosted B-mesons with O(100 µm) separation distances between the B tag and B CP vertices [58]. This idea, coupled with the concurrent development of high resolution silicon-strip vertex detectors that could measure such small displacements in a collider environment [59][60][61][62], offered a realistic solution to the decay-time-sequence determination problem. Parallel improvements in detector performance in the areas of particle identification [63,64] and γ-ray & K L detection [65][66][67] advanced the state-of-the-art levels of detection efficiencies and B-flavor-tagging quality.
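The effect of unequal beam energies on the B-meson flight distance can be estimated with simple kinematics. The sketch below assumes head-on collisions of ultra-relativistic beams, so that the c.m. energy is 2√(E high E low ) and the boost is βγ = (E high − E low )/√s; the function names are hypothetical, the factor-of-two energy ratio is the illustrative case mentioned above, and the small motion of the B mesons in the Υ(4S) rest frame is ignored.

```python
import math

C_TAU_B_UM = 450.0  # c * tau_B = 0.45 mm, expressed in micrometers


def beam_energies(sqrt_s_gev, ratio):
    """Beam energies with E_high = ratio * E_low that give the requested
    c.m. energy, for head-on ultra-relativistic beams (assumption)."""
    e_low = sqrt_s_gev / (2.0 * math.sqrt(ratio))
    return ratio * e_low, e_low


def asymmetric_boost(e_high_gev, e_low_gev):
    """beta * gamma of the c.m. system in the laboratory frame."""
    sqrt_s = 2.0 * math.sqrt(e_high_gev * e_low_gev)
    return (e_high_gev - e_low_gev) / sqrt_s


e_high, e_low = beam_energies(10.58, 2.0)
bg_asymmetric = asymmetric_boost(e_high, e_low)
length_asymmetric = bg_asymmetric * C_TAU_B_UM
print(f"E = {e_high:.2f} GeV on {e_low:.2f} GeV: beta*gamma = {bg_asymmetric:.3f}, "
      f"mean flight distance = {length_asymmetric:.0f} um")
```

Under these assumptions a factor-of-two energy asymmetry raises the typical flight distance from tens of micrometers to well over a hundred, which is the regime that silicon vertex detectors can resolve.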
6 First measurements of the KM angle β

At leading order, measurements of the eqn. 39 Carter-Sanda asymmetry determine sin 2β, where β = tan −1 [η̄/(1 − ρ̄)] and η̄ & ρ̄ are the modified Wolfenstein parameters described in Section 4.2.2. Prior to the summer of 2001, the best measurements of sin 2β were from CDF [68] (0.79 ± 0.44), BaBar [69] (0.34 ± 0.21) and Belle [70] (0.58 ± 0.34). The BaBar result was based on a sample of 23 M Υ(4S)→B B̄ events and Belle, which was struggling with electron cloud effects in the KEKB positron ring [71], had a smaller data sample of 11 M Υ(4S)→B B̄ events. Each of the three measurements was about 1.5σ from zero, and their weighted average, 0.46 ± 0.17, indicated a non-zero CP violation at the ∼2.5σ level. The situation was tantalizing, but not conclusive. This changed in August 2001 when, in back-to-back articles in Physical Review Letters, BaBar, now with a sample of 32 M B B̄ pairs, reported [72]
sin 2β = 0.59 ± 0.14 ± 0.05 (BaBar 2001),
and Belle, with a 31 M B B̄ pair data sample, reported [73]
sin 2β = 0.99 ± 0.14 ± 0.06 (Belle 2001).
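The combination of the three early results and the significance of the 2001 measurements can be reproduced with a short calculation. The sketch below assumes an inverse-variance weighted average of uncorrelated measurements and combines statistical and systematic errors in quadrature; both are simplifying assumptions and not necessarily the procedures used by the experiments. It also evaluates sin 2β from illustrative ρ̄ and η̄ inputs.

```python
import math


def weighted_average(measurements):
    """Inverse-variance weighted mean and its error for (value, error) pairs."""
    weights = [1.0 / err**2 for _, err in measurements]
    mean = sum(w * val for w, (val, _) in zip(weights, measurements)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


def significance(value, stat, syst=0.0):
    """Distance from zero in units of the quadrature-summed error."""
    return value / math.hypot(stat, syst)


def sin_2beta(rho_bar, eta_bar):
    """sin(2*beta) with beta = atan(eta_bar / (1 - rho_bar))."""
    return math.sin(2.0 * math.atan2(eta_bar, 1.0 - rho_bar))


early = [(0.79, 0.44), (0.34, 0.21), (0.58, 0.34)]  # CDF, BaBar, Belle
mean, err = weighted_average(early)
babar_2001 = significance(0.59, 0.14, 0.05)
belle_2001 = significance(0.99, 0.14, 0.06)
expected = sin_2beta(0.141, 0.357)  # illustrative rho_bar, eta_bar inputs

print(f"average = {mean:.2f} +- {err:.2f}")
print(f"BaBar 2001: {babar_2001:.1f} sigma, Belle 2001: {belle_2001:.1f} sigma")
print(f"sin(2 beta) from rho_bar, eta_bar: {expected:.2f}")
```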
The BaBar data sample contained 803 B CP event candidates with a signal purity of about 80%. The top three panels in Fig. 8a show the BaBar experiment's measured time distributions for ξ CP =−1 B CP decays. The uppermost plot shows the number of B 0 tags, where it is evident that there are more events with τ >0 than with τ <0, while the B̄ 0 tags in the panel beneath it display an opposite pattern. The third panel shows the bin-by-bin asymmetry, where there is a clear indication of the sine-like behavior that is expected for a CP violation as given in eqn. 39. The three lowest panels show the corresponding results for B→K L J/ψ decays with ξ CP =+1, where the τ -dependent asymmetries have opposite signs, again as expected. During the two decades following the Belle and BaBar reports, there have been many hundreds of measurements of non-zero CP-violating asymmetries in B meson decays, mostly by BaBar and Belle, which continued operating until 2008 and 2010, respectively. This program is being continued by LHCb [74], an experiment specialized for heavy flavor physics at the CERN Large Hadron Collider that began operating in 2010, and Belle II [75], an upgraded version of Belle at KEK that began operating in 2018. As is discussed in more detail in other contributions to this symposium, all of these hundreds of measured CP violations are well explained as being due to the effects of the single KM CP-phase angle δ.
The KM model has been a remarkable success.

A few comments on mixing and CP violation in the neutrino sector
As mentioned above, the two-doublet nature of the leptons was identified in 1961 [16], a decade before it was established for quarks. The notion of neutrino mixing was first proposed four years earlier by Pontecorvo [76] in 1957, and the PMNS neutrino mixing matrix was suggested by Maki, Nakagawa and Sakata [77] in 1962, a year before Cabibbo's paper appeared. When the τ-lepton was discovered in 1975, the six-lepton picture was established and the PMNS matrix for neutrinos expanded to the same 3×3 structure as the CKM matrix for quarks. If neutrinos are Dirac particles, the mathematics of the neutrino-flavor mixing matrix is the same as that of the KM matrix, with three mixing angles θ ij and one CP violating phase δ CP , the so-called "Dirac" phase. If neutrinos are Majorana particles, lepton number is not conserved, the matrix's number of degrees of freedom increases by two, and there are two additional CP-violating "Majorana" phases [78] that have no measurable effects on neutrino mixing experiments (which makes them hard to measure). The commonly used parameterization of the PMNS matrix that doesn't include the Majorana phases uses the same mixing angle definitions as the (eqn. 26) Chau-Keung version of the CKM matrix, but with the sequence of rotations reversed,
where ν 1 , ν 2 , ν 3 denote the three neutrino mass eigenstates and θ 12 , θ 23 , and θ 13 are the "solar," "atmospheric," and "reactor" neutrino mixing angles. The explicit form of the matrix is: 18 and the PDG 2020 [38] world averages for the three rotation angles are:

Although mixing in the quark and lepton sectors has the same mathematical structure, there are important differences in their practical applications, including:

Strikingly different hierarchies: The three PMNS mixing angles listed above differ in value by at most a factor of five, in contrast with the CKM matrix, where the corresponding mixing angles are smaller and span a two-order-of-magnitude range in magnitude: Differences between the two matrices are illustrated in Fig. 9a, where the areas of the squares are proportional to the magnitudes of the matrix elements. Here Nature has been helpful again; if the PMNS matrix had a hierarchy that was similar to that of the CKM matrix, neutrino oscillations and neutrino masses would likely not have been discovered.
They operate in opposite "directions:" In the CKM case, the quark mass-eigenstates are well known and the matrix is used to determine the flavor states. For the PMNS case, the neutrino flavor states are well known and the matrix is used to determine the mass eigenstates. Oscillation experiments have determined the mass-difference hierarchies shown in Fig. 9b, where [38]
atmos: ∆m 2 32 = ±(2.44 ± 0.03) × 10 −3 (eV) 2 and solar: ∆m 2 21 = (7.53 ± 0.18) × 10 −5 (eV) 2 .
The still unknown sign of ∆m 2 32 (= m 2 3 − m 2 2 ) leaves two possible hierarchies, as shown in Fig. 9b.
Quarks decay, neutrinos (probably) don't: CKM-related measurements are almost entirely based on measurements of decay processes, and the formalism for oscillations and CP violations is done in the hadron's rest frame. In contrast, PMNS-related measurements always involve production processes that produce pure flavor states and detection experiments located at some baseline distance that determine how the neutrino flavor content changed during propagation to the detector. Since the neutrino's rest frame is ill defined, 19 the formalism is usually done in the laboratory frame and expressed in terms of E ν - and ∆m 2 -dependent oscillation lengths, i.e.,

$$ l_{\rm osc} = \frac{E_\nu}{1.27\,\Delta m^{2}}, $$

where the factor of 1.27 is specific to (l, E ν , ∆m 2 ) units of (m, MeV, eV 2 ), or (km, GeV, eV 2 ). For the atmospheric and solar mass differences given above, these are

$$ l_{\rm atm}/E_\nu \approx 320~{\rm km/GeV}, \qquad l_{\rm sol}/E_\nu \approx 10.5\times10^{3}~{\rm m/MeV}, $$

where the units km/GeV and m/MeV are equivalent (and interchangeable), with km/GeV usually attached to atmospheric and m/MeV to solar for historical reasons.
The 320 km atmospheric length is long, but not impossibly long. It is nearly the same as the 295 km distance between J-PARC and Super Kamiokande (and the soon-to-be-completed HyperK detector), and corresponds to a 90° phase-change-induced oscillation maximum for 600 MeV muon-neutrinos that can be copiously produced by the J-PARC synchrotron. The 10.5 km solar length corresponds to an oscillation maximum for E ν = 3 MeV reactor anti-electron-neutrinos at a baseline of 50 km, which is the baseline for the JUNO reactor neutrino experiment that will soon be operating in China. The HyperK [79] and JUNO [80] experiments will certainly not be easy, but they will be done. If the neutrino oscillation lengths were factors of two or more longer, both experiments would be much more difficult, if not impossible.
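These baselines can be cross-checked numerically. The sketch below assumes that the oscillation length per unit energy is 1/(1.27 ∆m 2 ), which reproduces the 320 km and 10.5 km figures, and that the first oscillation maximum occurs where the phase 1.27 ∆m 2 L/E reaches 90°; the function names are hypothetical.

```python
import math


def oscillation_length_per_energy(dm2_ev2):
    """Assumed L/E scale 1 / (1.27 * dm2), in km/GeV (equivalently m/MeV)."""
    return 1.0 / (1.27 * abs(dm2_ev2))


def first_maximum_baseline_km(dm2_ev2, energy_gev):
    """Baseline at which the phase 1.27 * dm2 * L / E reaches 90 degrees."""
    return 0.5 * math.pi * oscillation_length_per_energy(dm2_ev2) * energy_gev


DM2_ATM = 2.44e-3   # eV^2, magnitude of the atmospheric mass difference
DM2_SOL = 7.53e-5   # eV^2, solar mass difference

atm_scale = oscillation_length_per_energy(DM2_ATM)    # km/GeV
sol_scale = oscillation_length_per_energy(DM2_SOL)    # m/MeV
baseline_atm = first_maximum_baseline_km(DM2_ATM, 0.6)    # 600 MeV muon neutrinos
baseline_sol = first_maximum_baseline_km(DM2_SOL, 0.003)  # 3 MeV reactor antineutrinos

print(f"atmospheric: {atm_scale:.0f} km/GeV, first maximum at {baseline_atm:.0f} km")
print(f"solar: {sol_scale / 1000:.1f} km/MeV, first maximum at {baseline_sol:.0f} km")
```

With these inputs the first maxima fall close to the 295 km and 50 km baselines mentioned above, which illustrates how well matched the existing facilities are to the measured mass differences.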

Conclusions
The related subjects of CP-violation and quark & neutrino flavor mixing provide peeks into some of Nature's most intimate secrets.When Fitch and collaborators measured the ∼0.2% branching fraction for K L →π + π − in 1963, they were seeing the influence of the t-quark that wasn't discovered until thirty years later (see Fig. 4).Thanks to Kobayashi and Maskawa, the existence and most of the properties of the t-quark (other than its mass) were pretty well understood more than twenty years before it was discovered.

Nature has been kind
(Einstein famously said "The Lord God is subtle, but not malicious.") A common thread that characterizes this entire story is that we have been able to probe these subjects at considerable depth, even though it didn't a priori have to be that way. The 0.2% K L →π + π − branching fraction is as large as it is because of the phase-space suppression of the partial decay width for the CP-allowed K L → 3π mode that has a Q-value of only 83 MeV. This is a consequence of the relative masses of the K- and π-mesons: if the kaon mass were higher and/or the pion mass were lighter, the K L → 3π partial width would be larger, and the K L →π + π − branching fraction would be reduced. It wouldn't take very large changes in these masses to push the π + π − branching fraction down to a value that was below the Fitch experiment's sensitivity level. Unlike parity violation, which is a huge effect that is difficult to miss, the particle physics consequences of CP violations in processes other than K L decays have only been seen in elaborate, highly focused experiments that, absent the K L measurements, very likely wouldn't have occurred.
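The Q-value argument is simple arithmetic on the meson masses, sketched below with rounded mass values (assumed inputs in MeV). It compares the energy released in the π + π − π 0 mode with that of the CP-violating π + π − mode.

```python
KAON_MASS_MEV = 497.61    # neutral kaon mass (rounded, assumed input)
PI_CHARGED_MEV = 139.57   # charged pion mass
PI_NEUTRAL_MEV = 134.98   # neutral pion mass


def q_value(parent_mass, daughter_masses):
    """Kinetic energy released in a decay: parent mass minus daughter masses."""
    return parent_mass - sum(daughter_masses)


q_three_pi = q_value(KAON_MASS_MEV, [PI_CHARGED_MEV, PI_CHARGED_MEV, PI_NEUTRAL_MEV])
q_two_pi = q_value(KAON_MASS_MEV, [PI_CHARGED_MEV, PI_CHARGED_MEV])
print(f"Q(3 pi) = {q_three_pi:.1f} MeV, Q(2 pi) = {q_two_pi:.1f} MeV")
```

The three-pion mode has well under half the energy release of the two-pion mode, which is the origin of the phase-space suppression described above.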
As described above, the KM phase was only measurable because Nature's parameters are aligned in a way that meets the stringent requirements that were first identified in the Carter-Sanda papers, including a strong |V us |>|V cb |>|V ub | hierarchy and a large enough t-quark mass to produce a B-B̄ mixing rate that is close to the B-meson lifetime, but not so large that the mixing rate is much faster than the lifetime. As a result, the phase could be measured, as could all three of the CKM matrix's mixing angles.
Remarkably, the same thread extends into the neutrino sector in which the PMNS matrix has a very different, almost flat hierarchy that facilitated the discovery of neutrino oscillations.This hierarchy has also enabled precise measurements of all three of its mixing angles and will likely allow for a determination of its Dirac phase in the not-so-distant future.Moreover, and as mentioned above, Nature's choices of the differences between the ν 1 , ν 2 , ν 3 eigenstate masses have made these measurements realizable.

What's next?... CPT ?
Although CP violations only show up as small, subtle effects in particle physics experiments, the influence of CP violations on the evolution of the Universe is glaringly obvious [81]. However, the CP violations that we measure in K-, B- and (recently [82]) D-meson decays, which are produced by the KM mechanism for quarks and, likely, leptons, cannot nearly account for the matter-antimatter asymmetry of the Universe (see, e.g., ref. [83]). There must be other sources of CP violation that have yet to be discovered, and these are the primary motivations for the LHCb and Belle II experiments. With the KM phase, Nature has given us a glimpse of CP violation, but not the whole story.
But what about CPT? The CPT theorem [84] states that any quantum field theory that is Lorentz invariant, has local point-like interaction vertices, and is Hermitian (i.e., conserves probability) is invariant under the combined operations of C, P and T . Since the three QFTs that make up the Standard Model (QED, QCD, and Electroweak theory) all satisfy these criteria, CPT symmetry has been elevated to a kind of hallowed status in particle physics. But the non-renormalizability of quantum gravity calls into question the validity of the locality assumption [85], and suggests that at some scale, CPT has to be violated. Strictly speaking, this violation only has to occur at impossibly high energies near the O(10 19 GeV) Planck scale, but maybe the same thread of Nature that gives us a taste of CP violations at energy scales well below the scale needed to explain the baryon asymmetry of the Universe will give us a hint of CPT violation at a scale below the one that's needed to rescue quantum gravity. In any case, since it is a fundamental feature of the Standard Model, CPT invariance should be routinely challenged at the highest feasible experimental sensitivities.
To date the most stringent test of the CPT prediction that particle-antiparticle masses are equal comes from kaon physics 20 [86,87] and sets a 90% C.L. limit on the K 0 -K̄ 0 mass difference that is seven orders-of-magnitude more stringent than that for m ē − m e and nine orders-of-magnitude more stringent than the m p̄ − m p limit. This high sensitivity is because of the magic of the virtual processes in the Fig. 4a box diagram and the unique properties of the neutral kaons.
The kaon result is based on experiments done nearly thirty years ago with data samples containing tens of millions of K→π + π − decays. One of the reasons these measurements have not been updated since then is that technologies for improved sources of flavor-tagged neutral kaons have not been pursued. However, if the above-described three-order-of-magnitude improvements in e + e − collider luminosity that were developed for the B-factories were applied to a dedicated collider operating at the J/ψ mass peak, multi-billion-event/year samples of flavor-tagged neutral kaons, produced via J/ψ→K ∓ π ± K 0 ( K̄ 0 ) decays, would be available to support CPT tests with more than an order-of-magnitude improved sensitivity. More modest improvements in CPT sensitivity would be provided by new colliders in the τ -charm energy range that are being proposed in China [88] and Russia [89], if they spend sufficient time operating at the J/ψ peak.
Maybe during the next sixty years, CPT violation studies will prove to be as interesting and provocative as CP studies have been during the past sixty years.

Fig. 2: a) The neutral K L -beamline at the AGS that was used for the K L →π + π − search experiment. b) The two-arm π + π − spectrometer consisted of a helium-filled decay volume followed by optical tracking spark chambers before and after momentum analyzing magnets. c) The distribution of events versus the cosine of the angle between the direction of the two-track momentum sum and the beamline. The upper, middle and lower panels are for events with two-track invariant masses that are below, centered on, and above m K L , respectively.

Fig. 3: The a) X and b) Y projections of the Niu event. Here the tracks labeled B and C have kinks at depth of

Fig. 4: a) The W -exchange Standard Model box diagram for K 0 -K̄ 0 mixing (not shown is the one with heavy quark exchange). In the KM six-quark model, the kaon's mass-matrix CP-violating parameter ε is produced by interference between the c- and t-quark contributions. b) The penguin diagram for direct-CP-violating K 2 →ππ decays.

1 st ↔ 2 nd generation: The suppression of strangeness-changing decays that was noted by Cabibbo in 1963 led to the realization that the relative strengths of the u→d and the strangeness-changing u→s transitions are modulated by factors of cos θ C and sin θ C , respectively, where θ C = 13° is the Cabibbo angle. Thus the diagonal element V ud = 0.974 is nearly unity and almost five times larger than its adjacent entry V us = 0.225.
2 nd ↔ 3 rd generation: In 1983, the MAC experiment at the PEP collider used b-flavored hadrons which is the (textbook) expression for the muon lifetime tailored to the b-quark mass, modified to include the effect of |V cb | on the coupling strength, and multiplied by the number of accessible final states: two quark flavors, three quark colors, and three types of leptons, as illustrated in Fig. 5c. The result was |V cb | ≈ 0.04, about a factor of five smaller than |V us |,

Fig. 5: a) A sketch of the projections of tracks from the semileptonic decay of a B meson onto a plane perpendicular

Figure 8b shows the results from the Belle experiment [73] that used 747 ξ CP =−1 B-decay candidates (mostly B→K S J/ψ) with 92% purity and 569 ξ CP =+1 B→K L J/ψ decay candidates with 61% purity. The top plot on the left side of the panel shows the proper-time distribution for the ξ CP =−1 modes minus that for ξ CP =+1 modes. The second and third panels from the top show the ξ CP =−1 and ξ CP =+1 modes separately, where the different CP modes have opposite-sign asymmetries as expected. The curves show the results of fits to the data. The bottom panel shows the asymmetry for a large sample of self-tagged (non-CP eigenstate) B decays (B 0 →D ( * )− π + , D * − ρ + , K * 0 (K + π − )J/ψ, and D * − ℓ + ν), where a non-zero asymmetry would have to be due to instrumental effects; the fit to this sample returned an asymmetry amplitude of 0.05 ± 0.04. The open circles in the plot on the right side of Fig. 8b show the time distribution for B0 -tags (q=1) in ξ CP =−1 B CP decays plus B 0 -tags (q=-1) in ξ CP =+1 decays (i.e., qξ CP =−1), with the fit results shown as a dashed curve. The black dots and solid curve show the sum of the opposite combinations (qξ CP =+1) and their fit result. The BaBar and Belle results excluded a zero value for sin 2β by 4σ and 6σ, respectively, and their combined significance established conclusively that CP symmetry is violated in the B-B
26) Chau-Keung version of the CKM matrix, but with the sequence of rotations reversed, i.e.,

Fig. 9: a) The area of the squares illustrates the magnitudes of the CKM (left) and PMNS (right) matrix elements. b) A not-to-scale illustration of the normal (left) and inverted (right) neutrino mass hierarchy. The flavor content of each of the ν 1 , ν 2 , ν 3 mass eigenstates is indicated by different shades of gray.
was written in mid-1972, and published in the February 1973 issue of the Japanese journal Progress of Theoretical Physics, where it was basically ignored; during the following two and a half years, it received all of two citations.The paper's title is CP Violation in the Renormalizable Theory of Weak Interaction, where the Renormalizable Theory of Weak Interaction is the term they used for what we now call the Standard Model.