Progress in ENSO prediction and predictability study

ENSO is the strongest interannual signal in the global climate system with worldwide climatic, ecological, and societal impacts. Over the past decades, research on ENSO prediction and predictability has attracted broad attention. With the development of coupled models, the improvement in initialization schemes


Introduction
As a critical component of global change, climate variability is potentially one of the most serious environmental issues we face today. A prominent example is the El Niño phenomenon, the strongest source of interannual climate variability [1]. El Niño events are characterized by anomalous warming in the eastern or central equatorial Pacific Ocean, which occurs irregularly every 2-7 years. The phenomenon involves changes both in the sea surface temperature (SST) and in sea level pressure across the equatorial Pacific Ocean, so it is also referred to as the El Niño-Southern Oscillation (ENSO). Although ENSO originates in the tropical Pacific, its effects are not confined to regional climate; it induces large weather and climate anomalies worldwide. It is believed that modest predictability of global climate anomalies can be gained owing to the significant impacts of the predictable ENSO [2]. Skillful ENSO prediction offers decision makers an opportunity to take anticipated climate anomalies into account, potentially reducing the societal and economic impacts of this natural phenomenon and assisting in the management of natural resources and the environment.
Since the 1980s, ENSO has been a focus of oceanic and atmospheric science research. With a series of international collaboration programs and initiatives such as the Tropical Oceans and Global Atmosphere (TOGA) Program and the Climate Variability and Predictability (CLIVAR) program, significant progress in ENSO prediction has been made over the past four decades [2][3][4][5]. At present, more than 20 models issue real-time ENSO forecasts at lead times of half a year to one year (see http://iri.columbia.edu/climate/ENSO/currentinfo/update.html). Overall, the current forecast models can provide effective predictions of ENSO warm and cold events 6-12 months ahead [5]. ENSO is currently regarded as the most predictable target at the time scales of seasonal climate prediction. This progress can be attributed to several factors. First, the observing systems have been greatly developed, with a series of meteorological and oceanic satellites launched and international observing and research programs initiated, such as TOGA, GOOS (Global Ocean Observing System), and Argo, offering an excellent opportunity to initialize ENSO predictions with various oceanic and atmospheric data. Theoretically, ENSO prediction is an initial value problem. Under the perfect model assumption, ENSO prediction skill depends on the initialization scheme used in the prediction models. The rapid development of data assimilation methods, from simple optimal interpolation in the 1970s-1980s, to 3-dimensional variational (3DVar) and 4-dimensional variational (4DVar) methods in the 1980s-1990s, to the Ensemble Kalman Filter (EnKF) and other ensemble-based methods in the 1990s, has greatly advanced the study of initialization, making the initial conditions of predictions as accurate as possible. Second, a hierarchy of ENSO prediction models has been developed over the last decades to investigate the physical mechanisms of ENSO and to improve its predictability. These studies have greatly improved model performance by improving physical parameterizations, refining spatial and temporal resolutions, and enhancing the understanding of the tropical oceanic and atmospheric processes underlying the ENSO phenomenon. Third, theoretical predictability studies, including the dynamics of prediction error growth and the theory for estimating the intrinsic limit of predictability, have been well developed and applied to ENSO prediction. Predictability studies have greatly promoted the development of ensemble prediction systems. In recent years, ENSO probabilistic prediction, based on ensemble prediction, has attracted broad attention and is routinely issued by some research and operational centers.
Downloaded from https://academic.oup.com/nsr/advance-article-abstract/doi/10.1093/nsr/nwy105/5123734 by guest on 11 October 2018
However, ENSO prediction still involves a great deal of uncertainty [6]. A typical case is the latest El Niño event from 2014 to 2016, which posed a severe challenge to forecasting models based on classic ENSO theory. For example, an El Niño event did not occur in 2014 as anticipated by most models, whereas the 2015 El Niño, one of the strongest events on record, was missed by almost all models at one-year lead time. Moreover, the 2015 El Niño differed significantly from the extreme events of 1997/98 and 1982/83 in its formation and warming pattern [7]. This challenge has stimulated a new round of ENSO research worldwide.
The motivation of this paper is to review the recent progress in ENSO prediction and predictability studies. The predictability of the earth's climate can be categorized into two types: practical predictability and theoretical predictability. The former relates to the prediction skill actually achieved by models, which can be raised as far as our abilities allow through model improvement, prediction initialization, ensemble construction and so on, whereas the latter assesses the theoretical upper limit of ENSO prediction skill, which is also called the intrinsic predictability or potential predictability. The intrinsic predictability is an inherent property of a physical system rather than a measure of our ability to make skillful predictions in practice. Intrinsic predictability studies can answer such challenging questions as whether ENSO prediction skill can be further enhanced by improving the prediction system and, if so, how much room there is for improvement [8,9]. The remainder of this paper is organized as follows. Section 2 reviews the development of ENSO prediction models, with an emphasis on the development of operational prediction systems in China. Progress in data assimilation, initialization schemes and ensemble prediction is reviewed in Section 3. Section 4 discusses optimal prediction error growth, the Spring Predictability Barrier (SPB) and the estimation of the Intrinsic Predictability Limit (IPL). Section 5 focuses on the probabilistic prediction of ENSO. A brief summary and discussion of the major challenges follows in Section 6.

Physical basis for ENSO prediction
Since Bjerknes' seminal work [10], the key processes responsible for interannual variability and predictability associated with ENSO have been identified. In particular, the Bjerknes feedback, which involves interactions among the SST, surface wind and thermocline, is regarded as a major process responsible for ENSO development. It has been recognized that subsurface thermal anomalies play important roles in ENSO-related variability and predictability in the tropical Pacific. Equatorial ocean wave dynamics and the thermal structure of the tropical Pacific allow thermal perturbations to persist and to propagate around the basin, exerting remote influences basin-wide. These slowly evolving, basin-scale thermal anomalies in the subsurface ocean provide seasonal-to-interannual memory by which the coupled ocean-atmosphere system carries past information into the future. It is the existence of these long-lasting thermal memories that provides a physical basis for ENSO prediction. Thus, oceanic information observed at the subsurface is critically important to ENSO prediction and needs to be adequately incorporated into models through data assimilation, a technique used in prediction initialization that will be discussed in the next section. One of the key factors affecting ENSO prediction skill is the model's representation of the climatological state, which is crucial for capturing the correct annual cycle and which characterizes ENSO asymmetry [11]. The poor simulation of ENSO asymmetry in coupled models is probably a main reason for their failure to capture the occurrence of super El Niño events, since the asymmetry can have a rectification effect on the time-mean state [12,13]. It is also worth noting that ENSO can be significantly modulated by other forcing and feedback processes within and/or outside the tropical Pacific (see also Wang et al. 2018 in this issue).

Current status of ENSO prediction models
Methodologically, there are two kinds of models used for ENSO prediction. The first are the statistical models, which use historical data to model the evolution of ENSO, typically represented by the Niño3 (90-150°W, 5°N-5°S) or Niño3.4 (120-170°W, 5°N-5°S) SST anomaly index. The statistical models include linear and nonlinear models. The former are constructed using linear methods such as multiple linear regression, canonical correlation analysis, and Markov chains [14,15], whereas the latter are constructed mainly using neural networks and other machine learning methods [16]. Both have been successful in ENSO prediction, and some are still run operationally. However, the development and application of statistical models have declined significantly because they lack both a physical basis and room for skill improvement.
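As a minimal sketch of how such an index can be formed from gridded data, the following Python function averages SST anomalies over a latitude-longitude box using cosine-of-latitude weights. The function name, the toy grid and the anomaly values are hypothetical and serve only as illustration. Longitudes are assumed to be given in degrees east (0-360), so the Niño3.4 box 170°W-120°W becomes 190°E-240°E.

```python
import math

def box_mean_anomaly(sst_anom, lats, lons,
                     lat_bounds=(-5.0, 5.0), lon_bounds=(190.0, 240.0)):
    """Area-weighted mean of SST anomalies inside a lat/lon box.

    sst_anom[i][j] is the anomaly at lats[i], lons[j].
    The default bounds correspond to the Nino3.4 region (5S-5N, 170W-120W).
    """
    total, weight_sum = 0.0, 0.0
    for i, lat in enumerate(lats):
        if not (lat_bounds[0] <= lat <= lat_bounds[1]):
            continue
        # Grid-cell area on a regular lat/lon grid scales with cos(latitude).
        w = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            if lon_bounds[0] <= lon <= lon_bounds[1]:
                total += w * sst_anom[i][j]
                weight_sum += w
    if weight_sum == 0.0:
        raise ValueError("no grid points fall inside the box")
    return total / weight_sum

# Toy 3x4 grid with hypothetical anomalies in degrees C.
# Columns at 180E and 250E lie outside the Nino3.4 box and are ignored.
lats = [-4.0, 0.0, 4.0]
lons = [180.0, 200.0, 220.0, 250.0]
sst_anom = [
    [9.0, 1.0, 1.0, 9.0],
    [9.0, 2.0, 2.0, 9.0],
    [9.0, 1.0, 1.0, 9.0],
]
nino34 = box_mean_anomaly(sst_anom, lats, lons)
print(round(nino34, 3))
```

Passing `lon_bounds=(210.0, 270.0)` would give the corresponding Niño3 average. A real index would additionally require removing a monthly climatology to obtain the anomalies, which is assumed here to have been done beforehand.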
The second kind of models are the coupled models, which have become the main tools for studying ENSO mechanisms, simulation and prediction. Since the first coupled ENSO model was developed [17,18], various types of coupled models have been designed and used for ENSO simulation and prediction. These include simple models [19], intermediate coupled models [18], hybrid coupled models [20, 21], and fully coupled general circulation models (GCMs) [2,4,5]. Currently, more than 20 models with different degrees of complexity are routinely used to make real-time forecasts of ENSO. Skillful ENSO predictions can now be made six months or more ahead [5]. There are several excellent review papers on the progress and current status of ENSO coupled models [2][3][4][5], to which readers are referred for further information.

ENSO Prediction in China
In China, there are two national operational systems for real-time ENSO prediction. One is the ENSO monitoring, analysis and prediction system called SEMAP2, developed at the National Climate Center (viz., Beijing Climate Center) of the China Meteorological Administration (BCC/CMA) [22]. This system is based on the operational seasonal forecasting model (BCC_CSM1.1m) and on physics-based statistical prediction following the dynamical mechanisms of the two types of ENSO [23].
This system is composed of five sub-systems: real-time monitoring of the tropical atmosphere-ocean, dynamical diagnosis, physics-based statistical prediction, model ensemble forecasting, and analogue-based correction of the model prediction [24]. It has markedly improved the operational capability of ENSO monitoring and prediction at BCC/CMA. A 20-yr independent hindcast shows good prediction skill, with the correlation skill of the Niño3.4 index reaching 0.8 at 6-month lead [25]. It also successfully predicted the 2015-16 super El Niño event at a lead of about six months [26]. The other operational prediction system, run at the National Marine Environmental Forecasting Center of the State Oceanic Administration (NMEFC/SOA), is based on the Community Earth System Model (CESM) developed by the US National Center for Atmospheric Research (NCAR). A nudging assimilation system for multiple oceanic data, including subsurface ocean temperature, is used to initialize predictions. This prediction system also performs well in predicting ENSO, reaching a correlation skill of 0.7 at 6-month lead [27].
Several other ENSO prediction systems are also run in China. One, called the IOCAS ICM, is implemented at the Institute of Oceanology, Chinese Academy of Sciences (IOCAS) [28]. This is an intermediate anomaly coupled model (ICM), consisting of an intermediate ocean model and an empirical wind stress model. One crucial component of the ICM is the way in which the temperature of subsurface water entrained into the surface mixed layer is explicitly parameterized in terms of the thermocline variability [29]. The model is one of the coupled models that made a good prediction of the cold SST conditions in the tropical Pacific in 2010-12 [30]. Another prediction system was developed at the Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences. It adopts an earlier version of the IOCAS ICM but includes a newly developed coupled assimilation system for atmospheric and oceanic data [31] and an ensemble construction system [32]. A 20-year retrospective hindcast experiment showed that the system can achieve good forecast skill up to one year, comparable with some of the best ENSO prediction models. The above four operational systems routinely issue ENSO predictions each month. In addition, the predictions of the IOCAS ICM and SEMAP2.1 are collected in the IRI/CPC ENSO plume product (http://iri.columbia.edu/climate/ENSO/currentinfo/update.html).
Recently, a super-ensemble multi-model ENSO forecast system has been under development at the Second Institute of Oceanography, State Oceanic Administration of China. The system is composed of an ensemble-based coupled data assimilation system and a super-ensemble of multiple ENSO forecast models, including the LDEO5 (Lamont-Doherty Earth Observatory version 5) model, a hybrid coupled model [21], and two fully coupled GCMs (GFDL-CM2.1 and NCAR CESM). The system will provide both deterministic and probabilistic predictions. At present, the LDEO5 ensemble prediction system, which includes a weakly coupled assimilation system for oceanic and atmospheric observations and an ensemble construction system based on stochastically optimal perturbations [33], has started running. A long-term hindcast experiment covering 1856-2016, shown in Fig. 1, indicates that the system can capture almost all warm and cold events at 6-month lead. Its correlation skill is comparable with the current best level, and better than that of the latest version of LDEO5 [34].

The 2015-16 El Niño event as an example
Here, the recent strong 2015-16 El Niño event is taken as an example to illustrate the current status of ENSO predictions using state-of-the-art coupled models. Figure 2 shows the models' real-time performance in predicting the 2015-16 El Niño event, as collected by IRI. As observed, one striking feature of this event was the slow evolution of warm SST anomalies in the western tropical Pacific through 2014 and early 2015 [7]; subsequently, the related ocean-atmosphere anomalies became coupled and amplified in spring 2015 and developed rapidly into a warm event in late spring 2015; this warm event then evolved into an extreme El Niño in the following months. In the predictions of the 2015 El Niño by these coupled models (Fig. 2), the observed SST evolution from summer through winter 2015 was adequately depicted when predictions were initialized from August 2015 (Fig. 2a). However, the predictions of these coupled models carried large uncertainties. For example, the predicted intensity exhibits a wide spread across the models in summer and fall 2015. In particular, almost all models missed the strongest warming when the predictions were initialized in early 2015 (Fig. 2b), probably due to the SPB, westerly wind bursts (WWBs) or other factors [7]. The uncertainty is also very

Initialization and data assimilation
Since the memory of ENSO resides mainly in the ocean, oceanic data assimilation has played a vital role in the development of ENSO prediction. Oceanic initialization for ENSO prediction has come a long way, from simple to complex algorithms, from surface to subsurface observations, and from single to multiple data sources. Initially, attempts were made to generate consistent initial conditions using SST observations [35,36]. Kirtman et al. [37] used an iterative ocean initialization procedure that modifies the zonal wind stress anomalies based on the errors of the simulated SST anomalies. With a 3DVar assimilation scheme, Tang et al. [38] assimilated SST into an OGCM with a statistically derived correction scheme for the subsurface temperature increment. Keenlyside et al. [39] nudged the model toward observed SST to initialize ENSO predictions for 1969-2001 and achieved good prediction skill. Recently, Merryfield et al. [40] initialized ocean states by nudging SST for the Canadian seasonal to interannual prediction system and achieved prediction skill comparable with the current best level.
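The nudging (Newtonian relaxation) approach mentioned above can be sketched for a scalar toy model: an extra tendency term proportional to the observation-minus-model difference pulls the model state toward the observations. The damped toy dynamics, the parameter names (`damping`, `tau`) and all values below are assumptions made for illustration; they are not the configuration of any of the cited systems.

```python
def nudge_run(t0, obs, dt=1.0, damping=0.1, tau=5.0):
    """Integrate dT/dt = -damping*T + (obs - T)/tau with forward Euler.

    t0      : initial model temperature anomaly
    obs     : sequence of observed anomalies, one per time step
    tau     : relaxation time scale; smaller tau means stronger nudging
    Returns the list of model states after each step.
    """
    t = t0
    history = []
    for o in obs:
        tendency = -damping * t      # toy model physics (assumed)
        tendency += (o - t) / tau    # nudging term pulls the state toward obs
        t = t + dt * tendency
        history.append(t)
    return history

# Nudge toward a constant observed anomaly of 1.0 for 100 steps.
obs = [1.0] * 100
hist = nudge_run(0.0, obs)
print(hist[-1])
```

With constant observations the state settles where the model tendency and the nudging term balance, that is at (1/tau)/(damping + 1/tau) times the observed value rather than at the observation itself. This illustrates why the relaxation time scale is a tuning choice: weak nudging leaves the model drifting toward its own equilibrium, while very strong nudging overrides the model dynamics.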
Many studies have recognized the importance of subsurface processes (e.g., entrainment and mixing) in controlling interannual SST variability in the equatorial Pacific [1]. It is well known that sea level variations in the tropical Pacific are dominated by subsurface dynamics and thermodynamics [41]. Thus, many efforts have also been devoted to assimilating sea level data from tide gauges and altimeters [42], in situ profiles [43], and even salinity observations [44]. All these studies show that including these new observational data in the assimilation system can significantly improve the estimation of subsurface ocean states, and therefore yield more reliable ENSO forecasts.
More recently, the assimilation of atmospheric observations to initialize ENSO predictions was introduced in the more general context of coupled model assimilation [6,31]. As demonstrated in Chen et al. [6], ocean-only initialization is not necessarily optimal for skillful forecasts because of the "initial shock" at the transition from the uncoupled to the coupled run. The more natural method, coupled assimilation, allows both atmosphere and ocean states to be adjusted by assimilating either atmospheric or oceanic observations. The concept of coupled assimilation is attractive because it can potentially deliver a balanced and dynamically consistent estimate of the coupled atmosphere-ocean state. The added value of coupled assimilation over ocean-only assimilation has been demonstrated in ENSO predictions [32,45].

Ensemble prediction
Owing to the strong nonlinearity and stochastic forcing of the atmosphere and ocean systems, the evolution of their future states contains large uncertainties. An important way to account for prediction uncertainty is ensemble forecasting, in which a group of predictions is generated by adding small perturbations ("errors") to the initial conditions or model parameters [6], and the weighted (or arithmetic) average or the probability distribution of the members is taken as the forecast. Compared with a single forecast, the ensemble forecast removes some unpredictable noise through averaging and provides a practical tool for estimating the possible uncertainties through additional information, such as the probability distribution function (PDF) of the forecast.
The development of ensemble forecasting has largely been the development of perturbation methods. In addition to random perturbation methods, optimal perturbation methods based on dynamical constraints, such as the Breeding Vector (BV), the Singular Vector (SV), the Stochastic Optimal (SO), and the Conditional Nonlinear Optimal Perturbation (CNOP), have been developed and applied. The perturbation methods derived from data assimilation include the EnKF, the ensemble transform Kalman filter (ETKF) and other ensemble-based filters [46].
Significant progress has been made in using these optimal perturbations to construct ensemble predictions and to study ENSO predictability.
Generally, the strategies used to produce optimal perturbations for ensemble predictions include perturbing the initial conditions and perturbing the model parameters to account for errors in the physical/dynamical parameterizations. Optimal perturbations of the initial conditions have often been constructed by SV, BV, climatologically relevant singular vector (CSV) analysis and CNOP [33,47,48,49]. The impact of model parameter uncertainties on predictions has been considered in many studies. For example, Zheng et al. [50] developed an ensemble perturbation method that adds random terms to the right-hand side of the model equations of an intermediate coupled model; these random terms, or process noise sources, were explicitly defined as the model errors. For noise-free models, stochastic noise is often considered in the framework of stochastic optimal theory, which allows spatially and temporally coherent noise to be constructed [33,51]. In addition, multiple models and multiple methods are also used to construct super-ensemble predictions, which can improve ENSO prediction skill in both deterministic and probabilistic measures [25,52].

The optimal prediction error growth
An important aspect of potential predictability research is the analysis and diagnosis of the optimal growth of forecast error, which explores the dynamics of forecast error and explains the physical mechanisms of forecast uncertainty. It is fundamental to the construction of ensemble predictions. The current optimal error analysis methods, as mentioned above, include SV, BV and CNOP as well as the Nonlinear Local Lyapunov Vector (NLLV) [53]. These optimal error growth and optimal perturbation methods were introduced to the field of ENSO prediction in the 1990s and have since been used in ENSO predictability studies [54][55][56][57].
There are a number of studies of ENSO predictability using SV and BV, as mentioned above. Although these studies used different intermediate complexity dynamical models, they obtained broadly consistent results: the first singular vector is dominated by a west-east dipole spanning the equatorial Pacific, with one center located in the east and the other in the central Pacific, and the fastest error growth rate (the leading singular value) is only weakly sensitive to initial conditions and optimization time. However, the error growth is seasonally dependent, with the largest growth during the spring. The ENSO state can also influence the error growth, with the largest error growth occurring at the onset of El Niño and the smallest during La Niña [47,56,58].
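For a linearized forecast model with propagator M over the optimization time, the leading singular vector is the unit-norm initial perturbation that grows the most, and the leading singular value is its growth factor. A minimal sketch using power iteration on M^T M is given below. The 2x2 propagator is a hypothetical non-normal matrix chosen for illustration and is not derived from any ENSO model; the Euclidean norm is assumed for both initial and final perturbations.

```python
import math

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def leading_singular_vector(m, iters=500):
    """Power iteration on M^T M; returns (sigma_1, v_1) with |v_1| = 1."""
    mt = transpose(m)
    v = [1.0] * len(m[0])
    for _ in range(iters):
        w = matvec(mt, matvec(m, v))            # apply M^T M
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]               # renormalize
    grown = matvec(m, v)
    sigma = math.sqrt(sum(x * x for x in grown))  # growth factor |M v| / |v|
    return sigma, v

# Hypothetical propagator: both eigenvalues (0.9, 0.8) are below one, so every
# perturbation eventually decays, yet the off-diagonal term allows transient growth.
M = [[0.9, 2.0],
     [0.0, 0.8]]
sigma, v1 = leading_singular_vector(M)
print(sigma, v1)
```

The example shows why singular vectors rather than eigenvectors are used to study error growth over a finite time: the leading singular value exceeds one even though the system is asymptotically stable. In practice M is not available explicitly for a coupled model and is approximated, for example with tangent linear and adjoint models or ensembles of perturbed runs.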
While these studies all used simplified or intermediate complexity coupled models, Kleeman et al. [59] proposed a method to calculate the climatologically relevant singular vector (CSV) for coupled GCMs, in which the atmospheric noise is filtered out by running a large ensemble of integrations. Tang et al. [48] first applied the CSV to a fully coupled GCM to investigate the error growth associated with ENSO forecasts. The results show that the singular vectors share many of the properties already seen in the simpler models. However, this particular CGCM also displays some differences from the simpler models; in particular, the optimal subsurface temperature patterns are strongly sensitive to the phase of the ENSO cycle, and at times the east-west dipole in the eastern tropical Pacific basin [60].
Besides the study of initial errors, increasing efforts have been dedicated to exploring the effect of model errors on forecasts. In the linear regime, the SO [33] and the forcing singular vector (FSV) [61] were employed to represent the influence of stochastic noise on prediction error growth. Duan and Zhou [62] extended the FSV approach to nonlinear systems and proposed the concept of the nonlinear forcing singular vector (NFSV). It was found that NFSV-related model errors have the largest adverse effect on ENSO prediction uncertainty [63].

Spring Predictability Barrier
One particularly important source of uncertainty in ENSO prediction is the SPB, which refers to a rapid decrease in prediction skill during the boreal spring [64]. From the perspective of error growth, the SPB refers to the phenomenon that ENSO forecasts have large prediction errors, and in particular prominent error growth, during the spring when the prediction is started before spring [65]. Quite a few studies, using the linear SV (LSV) and the nonlinear CNOP, found that the SPB arises from the growth of initial errors [54,55,65]. A comparison between the CNOP and the LSV showed that the SPB-related initial errors determined by the CNOP, although presenting spatial patterns similar to those revealed by the LSV, cover a broader region and result in a more significant SPB [65]. It was therefore concluded that initial errors with the CNOP structure are most likely to cause the SPB.
The concept of two types of El Niño was proposed in the early 2000s [66]. One type, named the "central Pacific El Niño", has its warming center in the equatorial central Pacific, while the other, named the "eastern Pacific El Niño", has its warming center in the equatorial eastern Pacific. Some studies found that the frequent occurrence of central Pacific El Niño events since the 2000s has increased the uncertainty of ENSO forecasting [67]. Tian and Duan [68] showed that the central Pacific El Niño also has an SPB and possesses an SPB-related initial error pattern similar to that of the eastern Pacific El Niño illustrated in Yu et al. [65]. An SPB can occur in forecasts of both eastern and central Pacific El Niño events, but the latter are less likely to encounter one [69]. Ren et al. [70] also revealed that the central Pacific ENSO has a much weaker persistence barrier, which is closely related to the SPB, than the eastern Pacific ENSO. To filter out the SPB-related initial errors, a promising strategy is targeted observation, which aims to determine the optimal observing region for minimizing the prediction uncertainty [71]. Duan et al. [69] proposed an optimal observational array, shown in Fig. 3, in which additional observations can be deployed to deal with the challenge that the diversity of El Niño events and the related SPB pose to ENSO prediction.

Measure of ENSO IPL
Predictability is the extent to which events can be predicted [72], and includes the actual predictability and the potential predictability. The actual predictability quantifies the accuracy of model predictions against observations, measured with either deterministic or probabilistic scores. The potential predictability is often referred to as the upper limit of skill, i.e., the IPL. Based on ensemble prediction, several metrics can be used to quantify the IPL. Among them are variance-based and information-based measures, which quantify the predictability or prediction uncertainty from different angles.
The signal-to-noise ratio (SNR) is widely employed in quantifying potential predictability [73]. The amplitudes of the signal and the noise in an ENSO ensemble prediction can be approximately quantified by the variance of the ensemble mean and by the ensemble spread averaged over all initial conditions, respectively [74]. Prediction skill measures, such as the anomaly correlation [75] or the ranked probability skill score [76], are functions of the SNR. Higher values of the SNR indicate less contamination of the signal by unpredictable random effects, and hence higher potential forecasting capability and higher prediction skill [77]. The amplitude of the signal is much larger than that of the noise in ENSO prediction, suggesting that the predictability of ENSO is dominated by the ENSO signal [74,78]. Several information-based measures have been applied to quantify the IPL, including the relative entropy, predictive information, predictive power, and mutual information (MI). The essence of these information-based measures is that the difference between the prediction distribution and the climatological distribution quantifies the extra information provided by the forecast. Specifically, the relative entropy is closely related to strong ENSO events and decreases as the lead time increases. It has a better relationship with the correlation-based prediction skill than the predictive information and predictive power because the signal component dominates the relative entropy [74,79]. The MI is an indicator of the overall potential predictability of a dynamical system such as ENSO, obtained by averaging the relative entropy or predictive information over all predictions (Fig. 4a).
Theoretically, the MI has a strong relationship with actual prediction skill as shown in Fig. 4b.
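The variance-based SNR described in this section can be estimated directly from a set of ensemble hindcasts. The sketch below takes the signal as the variance of the ensemble means across initial conditions and the noise as the within-ensemble variance averaged over initial conditions. The function names and the toy numbers are hypothetical. The conversion to a potential correlation, sqrt(SNR/(1+SNR)), is the standard perfect-model relation for a large ensemble and is stated here as an assumption rather than as the formula used in the cited studies.

```python
from statistics import fmean, pvariance

def signal_to_noise(ens):
    """ens[i][k] is member k of the ensemble started from initial condition i."""
    means = [fmean(members) for members in ens]
    signal = pvariance(means)                               # variance of ensemble means
    noise = fmean([pvariance(members) for members in ens])  # mean within-ensemble variance
    return signal / noise

def potential_correlation(snr):
    """Perfect-model correlation between ensemble mean and one member (large-ensemble limit)."""
    return (snr / (1.0 + snr)) ** 0.5

# Two initial conditions with two members each (hypothetical Nino3.4 forecasts, degrees C).
ens = [[1.0, 3.0],
       [5.0, 7.0]]
snr = signal_to_noise(ens)
print(snr, potential_correlation(snr))
```

Population variances are used for simplicity. With small ensembles the variance of the ensemble means also contains a contribution from the noise, so a real estimate would normally correct for the finite ensemble size.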
Note that the MI-based potential predictability measures the statistical dependence, linear or nonlinear, between the ensemble mean prediction and a hypothetical observation (an arbitrary ensemble member), whereas the SNR-based potential skill measures only their linear correlation and therefore underestimates the nonlinear statistical dependence. In other words, the information-based potential predictability measures should be better than the SNR-based measures at characterizing the 'true' potential predictability. When the climatological and prediction distributions are both Gaussian and the prediction variances are constant, the information-based measure is equivalent to the SNR-based potential measure [80]. Note that neither of the above two kinds of measures involves observations; both therefore measure the potential predictability of the model forecast system.
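For univariate Gaussian distributions the relative entropy has a closed form, which makes the Gaussian equivalence noted above easy to check numerically. The sketch below evaluates the relative entropy of a prediction distribution with respect to climatology and averages it over predictions to obtain the MI. With constant prediction variance N and signal variance S, the climatological variance is S + N and the average reduces to 0.5 ln(1 + S/N), a function of the SNR alone. The function name and numbers are hypothetical, and the two-valued signal is used only to verify the algebra, not as a realistic climatology.

```python
import math

def relative_entropy_gaussian(mu_p, var_p, mu_c, var_c):
    """Relative entropy (in nats) of prediction N(mu_p, var_p) w.r.t. climatology N(mu_c, var_c)."""
    dispersion = math.log(var_c / var_p) + var_p / var_c - 1.0  # spread (dispersion) part
    signal = (mu_p - mu_c) ** 2 / var_c                         # shift of the mean (signal part)
    return 0.5 * (dispersion + signal)

# Predictions with ensemble means +2 and -2 and constant variance N = 1.
# Signal variance S = 4, so the climatological variance is S + N = 5.
var_p, var_c = 1.0, 5.0
re_values = [relative_entropy_gaussian(mu, var_p, 0.0, var_c) for mu in (2.0, -2.0)]
mi = sum(re_values) / len(re_values)   # MI as the average relative entropy
snr = 4.0 / 1.0
print(mi, 0.5 * math.log(1.0 + snr))
```

The split into a dispersion part and a signal part also illustrates the statement that the signal component can dominate the relative entropy: a large shift of the predicted mean away from climatology, as in a strong ENSO event, raises the signal term quadratically.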
In addition, the IPL of a chaotic system can also be quantitatively determined by the NLLV using observational data. Li and Ding [81] employed the NLLV approach to explore the spatiotemporal distribution of the tropical SST IPL. They found that the annual mean predictability limit is very large in the tropical central-eastern Pacific (>8 months), exceeding 10 months in the Niño3.4 region.

Probabilistic Prediction of ENSO
Probabilistic forecasts, which aim to predict the probability distribution of the future state of a variable, can express forecast uncertainty. They have been argued to be more informative and valuable than deterministic forecasts [82]. In practice, rather than a continuous probability distribution, usually only the probabilities of some discrete categorical events of particular interest are predicted. The predicted probability of an event is usually estimated as the fraction of ensemble members that forecast this event.
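The fraction-of-members estimate can be sketched as follows for three categories (below-normal, near-normal, above-normal). The thresholds of -0.5 and +0.5 °C on the Niño3.4 anomaly and the ensemble values are assumptions chosen for illustration; operational centers may define the categories differently, for example by climatological terciles.

```python
def event_probabilities(members, lower=-0.5, upper=0.5):
    """Estimate category probabilities as fractions of ensemble members.

    members : forecast anomalies from each ensemble member (degrees C)
    lower, upper : category thresholds (assumed values)
    """
    n = len(members)
    n_below = sum(1 for x in members if x < lower)
    n_above = sum(1 for x in members if x > upper)
    n_near = n - n_below - n_above
    return {"below": n_below / n, "near": n_near / n, "above": n_above / n}

# Hypothetical 10-member ensemble forecast of the Nino3.4 anomaly.
members = [1.2, 0.8, 0.3, -0.1, 0.6, 0.9, -0.7, 1.5, 0.4, 0.7]
probs = event_probabilities(members)
print(probs)
```

With a small ensemble the probabilities can only take a few discrete values (multiples of 1/n), which is one reason why larger ensembles or fitted distributions are often preferred in practice.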
Kirtman [83] was the first to demonstrate the importance of presenting ENSO predictions in probabilistic format and of verifying ENSO ensemble predictions from the probabilistic perspective. Even when the deterministic prediction skill is insignificant, skillful probabilistic prediction is still possible for above- and below-normal events. It was argued that probabilistic verification is an important complement to deterministic verification. Following the methodology of Kirtman [83], several subsequent studies [6,50] also investigated the ensemble prediction skill of ENSO from the probabilistic perspective. The receiver operating characteristic (ROC)-based diagnostics in these studies confirmed the main findings of Kirtman [83].
However, the ROC-based analysis cannot explore the full scope of the probabilistic prediction skill, since it reflects only the resolution aspect. As argued in Yang et al. [84], while the resolution is strongly affected by the intrinsic predictability of the real world, there should be, in principle, no physical barrier to improving the reliability. As such, compared with the resolution, the reliability is more sensitive to changes in the ensemble prediction system itself and is therefore a more indicative criterion for testing ensemble construction strategies.
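One common way to separate the two attributes discussed above is the Murphy decomposition of the Brier score into reliability, resolution, and uncertainty terms. The sketch below is illustrative only and is not necessarily the diagnostic used in the cited studies; the function name and the toy forecasts are assumptions. Forecasts are grouped by their exact issued probability, which suits ensemble-fraction probabilities because they take only a few discrete values.

```python
from collections import defaultdict


def brier_decomposition(probs, outcomes):
    """Return (brier, reliability, resolution, uncertainty) for a binary event.

    probs: forecast probabilities of the event, one per forecast case.
    outcomes: 1 if the event occurred in that case, otherwise 0.
    The identity brier = reliability - resolution + uncertainty holds exactly
    when cases are grouped by identical forecast probability, as done here.
    """
    n = len(probs)
    if n == 0 or n != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    base_rate = sum(outcomes) / n

    # Collect the observed outcomes for each distinct forecast probability.
    groups = defaultdict(list)
    for p, o in zip(probs, outcomes):
        groups[p].append(o)

    reliability = 0.0
    resolution = 0.0
    for p, obs in groups.items():
        freq = sum(obs) / len(obs)  # observed frequency given this forecast
        reliability += len(obs) * (p - freq) ** 2
        resolution += len(obs) * (freq - base_rate) ** 2
    reliability /= n
    resolution /= n

    uncertainty = base_rate * (1.0 - base_rate)
    brier = sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / n
    return brier, reliability, resolution, uncertainty


if __name__ == "__main__":
    # Toy overconfident system: it issues 0.9 and 0.1, but the event
    # occurs only 75% and 25% of the time in those two groups.
    probs = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
    outcomes = [1, 1, 0, 1, 0, 0, 1, 0]
    print(brier_decomposition(probs, outcomes))
```

A smaller reliability term indicates better calibrated probabilities, whereas a larger resolution term indicates that the forecasts discriminate better between situations with different outcome frequencies.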
More recently, along this line of discussion, some studies have given prominence to examining the reliability of ENSO ensemble probabilistic predictions [33, 52]. Using the LDEO5 model, Cheng et al. [33] produced retrospective probabilistic predictions of ENSO with different ensemble perturbation strategies and compared their probabilistic prediction skills in terms of reliability and resolution. The results indicated that the reliability is very sensitive to whether the uncertainties in initialization and in model errors (noise) are sampled, whereas the resolution is not. Specifically, when either of the two uncertainties, especially the atmospheric noise uncertainty, is not sampled, the reliability suffers from a severe "overconfidence" problem. When both uncertainties are sampled, this "overconfidence" is significantly reduced. The amplification of errors in ocean initial conditions can be a large source of the uncertainties [85,86]. For example, a so-called Multiple-ocean Analysis Ensemble (MAE) [87] initialization scheme, which samples the structural uncertainty in ocean initial conditions, can effectively improve the reliability of ENSO predictions [88]. The "overconfidence" is not merely associated with the intermediate-complexity model used; it was also observed in comprehensive coupled GCMs (CGCMs) [52]. This reliability deficiency in CGCM-based ensemble prediction systems was mainly attributed to the lack of sampling of the uncertainty associated with model errors in the forecast models. It was further found that, as a very pragmatic approach to sampling model uncertainties, the multimodel ensemble (MME) can alleviate this reliability defect of ENSO ensemble probabilistic predictions [52].
The benefit of the MME approach is not limited to the reliability. Kirtman and Min [52] also found a benefit of the MME in the resolution, based on an analysis of the ROC score. Tippett and Barnston [89] provided more comprehensive evidence for the MME's effectiveness in improving ENSO probability forecasts. The MME advantage was found to be greater than what would be expected from an increase in ensemble size alone. Typically, probability forecasts for ENSO are considered for only three categories (El Niño, La Niña, or neutral), which cannot reflect more detailed information regarding ENSO phase and intensity. Very recently, Tippett et al. [90] examined the ENSO probabilistic forecast skill using the North American Multi-Model Ensemble products, with more categories being tested. The results indicated that current MME predictions are capable of providing detailed probabilistic forecasts of ENSO phase and amplitude.
Compared with probabilistic weather forecasting, the study of ENSO probabilistic prediction and verification is still at an early stage. Existing studies have demonstrated that current ENSO probabilistic forecasts tend to be "overconfident" and that the MME is a pragmatic yet effective approach to improving ENSO probabilistic prediction. Further studies are required to comprehensively investigate the current status of ENSO probabilistic prediction skill in terms of reliability and resolution. Moreover, the ensemble generation techniques adopted in previous studies to sample the uncertainties of prediction systems are mostly simple ones, usually based on random perturbations or the lagged ensemble approach. It is therefore necessary to use more refined ensemble generation techniques, for example, the SV, SO, NLLV or CNOP, to further improve the reliability and ultimately the overall skill of ENSO ensemble probabilistic prediction. While these methods have been widely applied to probabilistic weather forecasting, and even to ENSO predictability studies, they have seldom been applied to ENSO ensemble (and especially probabilistic) prediction.

Summary and Discussion
ENSO is widely recognized as the most predictable interannual signal in the climate system, providing the basis for global short-term climate prediction.
Real-time prediction information is critically important to mitigation and adaptation activities in response to ENSO events and related natural disasters. Over the past few decades, a comprehensive understanding of ENSO processes and their predictability has been achieved, which provides the physical basis for ENSO prediction. So far, various coupled ocean-atmosphere models of hierarchical complexity have been developed for ENSO simulation and prediction. The predictive skill has improved to a level where successful real-time ENSO prediction is possible at lead times of seasons and longer.
In this paper, we reviewed recent progress in ENSO prediction and predictability studies. We discussed several main factors responsible for the improvement in ENSO prediction skill and the advances in predictability study. First, the rapid development of the observing systems has made high-quality oceanic and atmospheric data available, and advances in data assimilation methods and algorithms have ensured that these observations can be used successfully to initialize predictions. Second, a hierarchy of ENSO prediction models has been employed, with improved physical parameterizations and model resolutions. State-of-the-art models have been widely used for operational ENSO predictions. Third, studies of the optimal growth of forecast error provide various linear and nonlinear optimal perturbations, yielding multiple approaches to constructing ensemble prediction systems. Probabilistic prediction of ENSO has been run operationally in many countries, offering important information for stakeholders' decision making.
Moreover, the newly developed information theory-based framework of statistical predictability promotes the theoretical study of ENSO predictability and offers effective metrics to quantify the intrinsic predictability limit of ENSO.
There is no doubt that notable progress has been made in the study of ENSO prediction and predictability. However, several specific challenges remain in improving ENSO prediction skill and understanding ENSO predictability, particularly in light of recent research on and predictions of the 2014-16 super El Niño event [91]. Among them, model systematic error is probably the most challenging issue. The tropical Pacific Ocean, the birth region of ENSO, remains an area in which pronounced biases exist in model simulations compared with observations. For example, obvious discrepancies exist in ocean GCM (OGCM) simulations; for instance, the simulated thermocline is too diffuse, with too weak a vertical temperature gradient [92]. Furthermore, OGCM-based coupled simulations commonly have an unrealistic structure of interannual SST variability, with SST anomalies being underestimated over the eastern equatorial Pacific but overestimated in the central equatorial Pacific. Additionally, the real climate system in the tropical Pacific is characterized by interannual oscillations with a main period band of 4-5 years, but some coupled GCMs favor a quasi-biennial oscillation accompanied by predominantly westward propagation of simulated SST anomalies over the eastern and central equatorial Pacific. These discrepancies between model simulations and observations are attributed to processes that are still not well represented in models. For example, vertical mixing and diffusion are important processes that determine how SSTs are affected by subsurface thermal conditions [93]. In ocean models, the vertical mixing and diffusion processes are generally parameterized using large-scale oceanic fields, for example with the K-profile parameterization (KPP) scheme [94,95]. It is difficult to determine these effects accurately in models, and large uncertainties are inevitable in representing the related subsurface effects on SSTs, which can cause large biases in SST simulations.
In addition, many processes that can affect model performance in ENSO simulations are still missing or are not adequately represented in current low-resolution climate models, including mesoscale processes (e.g., Tropical Instability Waves, TIWs), multisphere processes (e.g., ocean biology), the Madden-Julian Oscillation (MJO), and so on [96]. Deficiencies in representing physical processes in coupled models cause large errors in ENSO simulations. Zhu et al. [97] demonstrated that bias in representing thermodynamical processes is one of the factors contributing to the 2014 El Niño false alarm. Another well-known example is the double Intertropical Convergence Zone (ITCZ) problem in the tropics, which may be solved with a better simulation of the mean state. To achieve this goal, the importance of the atmospheric convection scheme in ENSO prediction should not be ignored [98]. These biases in simulations inevitably lead to errors in ENSO predictions, including the SPB phenomenon.
There is a clear need to improve the understanding of variability and predictability associated with ENSO. ENSO variability exhibits diversity and asymmetry [99]. What is not yet clear is whether the predictability also varies with the different types of El Niño, which should be specifically studied. Mathematically, prediction systems themselves need to be improved not only through model parameterizations, but also through initialization methods, ensemble-based prediction procedures, post-processing, etc. For example, model components (the atmosphere, the ocean, and their interactions) need to be improved further so that relevant processes can be accurately represented in the coupled models, including various feedbacks and coupling processes that can modulate ENSO. More care needs to be taken with the parameterizations of unresolved or missing processes in oceanic and atmospheric models. Also, model resolution needs to be enhanced so that processes that are missing in low-resolution models can be adequately represented, including convection, TIWs, the MJO, and so on. Moreover, the representation of ENSO periodicity in climate models also needs to be quantitatively evaluated and specifically improved, so that predictions of the different timescales in ENSO diversity become better.
In addition to issues related to model processes, ENSO prediction skill can also be enhanced by various techniques, including initialization and prediction procedures.
For example, because of their important role in predictions, the subsurface thermal conditions in the ocean need to be coherently incorporated into prediction models through initialization procedures using data assimilation techniques. The more natural initialization method, coupled assimilation, should be developed and applied to provide initial conditions for the predictions. Also, stochastic nondeterministic processes are prominent in the tropical Pacific, so it is necessary to treat ENSO as a stochastic system. It is still a challenge to realistically characterize some important stochastic processes, for example the WWB, in prediction models through adequate parameterization. It is expected that realistic representations of these stochastic processes can improve prediction skill. Since ENSO predictions contain uncertainties, adequate ensemble construction strategies that can efficiently represent the uncertainties associated with the initial and model errors should be well configured when developing ENSO prediction systems. On the one hand, the prediction uncertainties can be measured and quantified by probabilistic prediction. Efforts to develop probabilistic prediction and verification should have practical value and guiding significance for society and the economy. The probabilistic prediction study of ENSO should be greatly promoted in the future. On the other hand, due to model deficiencies, the post-calibration of ensemble mean is still needed to further improve apparent for ENSO prediction in 2014 (figure not shown). Although warm SST anomalies were observed in the western equatorial Pacific in early 2014, they weakened in mid-2014 and did not develop into an El Niño event in late 2014. However, many coupled models predicted a strong El Niño in 2014, a false alarm that embarrassed the ENSO scientific community. The prediction cases in 2014 and 2015 clearly indicate that the real-time prediction of ENSO remains challenging and problematic, even when state-of-the-art coupled models are used. Further studies on understanding predictability and improving real-time predictions using the coupled models are clearly needed.