Short title (50 characters and spaces max): Spectral phenotyping of maize drought stress Spectral phenotyping of physiological and anatomical leaf traits related with maize water status

One sentence summary: Hyperspectral phenotyping has the potential to act as a surrogate for standard reference measurements of physiological and anatomical leaf traits related with water stress in maize. L.C., M.V.M., L.C. R.P. performed most of the L.C., M.V.M., and J.J.C. analyzed the L.C. Abstract (250 Advancements in phenotyping techniques capable of rapidly and non-destructively detecting impacts of drought on crops are necessary to meet the twenty-first century challenge of food security. Here, we describe the use of hyperspectral reflectance to predict variation in physiological and anatomical leaf traits related with water status under varying water availability in six maize ( Zea mays L.) hybrids that differ in yield stability under drought. We also assessed relationships among traits and collections of traits with yield stability. Measurements were collected in both greenhouse and field environments, with plants exposed to different levels of water stress or to natural water availability, respectively. Leaf spectral measurements were paired with a number of physiological and anatomical reference measurements, and predictive spectral models were constructed using a partial least squares regression (PLSR) approach. All traits were relatively well predicted by spectroscopic models, with external validation (i.e., by applying PLSR-coefficients on a dataset distinct from the one used for calibration) goodness-of-fit ( R 2 ) ranging from 0.37 to 0.89 and normalized error ranging from 12–21%. Correlations between reference and predicted data were statistically similar for both greenhouse and field data. Our findings highlight the capability of vegetation spectroscopy to rapidly and non-destructively identify a number of foliar functional traits affected by drought that can be used as indicators of plant water status. Although we did not detect trait coordination with yield stability in the hybrids used in the current study, expanding the range of functional traits estimated by hyperspectral data can help improve trait-based breeding approaches.


INTRODUCTION 1
One of the more influential, yet unpredictable, factors expected to change in future environments over the 2 next 50 years will be the intensity and frequency of precipitation events (IPCC, 2014). These changes will 3 likely have a dramatic effect on plant metabolism, growth and production (Pryor et al., 2013;Miao et al., 4 2017). Over a comparable time period, global food demand is predicted to double, requiring intensification 5 of agricultural production systems (Tilman et al., 2011). The rapid development and adoption of climate-6 resilient crop genotypes capable of maintaining comparable yield production under favorable and stress 7 conditions (i.e., with high yield stability) is important to overcome this twenty-first century challenge 8 (Mickelbart et al., 2015). Advancements in phenotyping techniques capable of rapidly assessing the effects 9 of drought on plant functional responses from a wide genetic range is necessary to understand plant traits 10 required to maintain yield stability under predicted future environmental conditions 11 Cotrozzi et al., 2017a; Miao et al., 2017). 12 Maize (Zea mays L.) represents an excellent species for developing phenotyping approaches. It is one of the 13 most important crops globally for humans and other animals (Miao et al., 2017), and while yield has 14 increased in absolute value in the last decades, mainly due to genetic gain and improved agronomic 15 practices, the negative impact of drought on maize productivity has also increased due to increased 16 frequency and intensity of drought events Lobell et al., 2014). Maize also uses more 17 water at specific developmental stages (e.g., during flowering), making the optimal water management of 18 this species challenging . Moreover, the functional responses that are associated with 19 increased yield stability and targeted by maize breeders, such as the ability to maintain photosynthesis under 20 stressfull conditions, limit transpiration, increase water-use efficiency, and regulate osmotic adjustment 21 (Ribaut et al., 2009 ) are difficult, time-consuming, and mostly destructive to measure by standard 22 approaches (e.g., those performed by classic infrared gas analysers, Scholander-type pressure chambers and 23 vapor pressure osmometers). All of these aspects make monitoring of a large number of individual plants 24 logistically impractical. that these models are validated using independent samples, after which models can be used to predict the 35 variable of interest in unknown samples on the basis of their spectral reflectance alone (Couture and 36 Lindroth, 2012; Schweiger, 2020). Importantly, leaf reflectance measurements are rapid, taking only 37 seconds, non-destructive, and relatively inexpensive. Although the purchase of a spectrometer can range 38 from hundreds to tens of thousands of US dollars, the expenses to run such instruments are minimal 39 compared with the cost and manitaince of other analytical instrumentation (e.g., HPLC). A spectral approach 40 also provides the potential to assess considerably more individual plant traits in situ and in vivo over multiple 41 time periods than standard reference measurements alone (e.g., those performed with classic 42 ecophysiolgoical techniques or wet chemistry). In addition, this approach can help to monitor plant function 43 over large geographic regions if scaled to remote sensing collections from air-or spaceborne platforms 44 (Cotrozzi et al., 2017b). 45 Although the use of hyperspectral reflectance (by both imaging or non-imaging sensors) is broadly regarded 46 as a promising phenotyping approach in agriculture (Weber et  number of traits, focusing on the maximum rates of ribulose bisphosphate (RuBP) carboxylation (V cmax ) and 52 electron transport (J max ). Similarly, while plant water status has been the focus of a number of studies 53 utilizing foliar optical properties in the last decades (e.g., Hunt et al., 1987;Peñuelas et al., 1993;Gao, 1996 In this study, we tested the capability of reflectance spectroscopy to characterize physiological and 68 anatomical responses of six maize hybrids with different sensitivities to drought, determined by previously 69 established differences in yield stability. We collected spectral leaf signatures from maize hybrids growing 70 under both controlled greenhouse conditions and field conditions. Our objectives in this study were to (i) 71 develop statistical models to estimate leaf gas-exchange, greenness, water status, thickness, and stomatal 72 density, and (ii) compare the performance of models built using only greenhouse samples with models 73 including also field samples. Further, we tested relationships among physiological and morphological 74 responses of maize hybrids and their yield stability under drought conditions. 75

Prediction of leaf traits 77
To optimize model performance, we initially examined numerous models containing different wavelength 78 ranges, characterized by specific absorption features in known literature that are directly or indirectly related 79 to specific traits (Supplemental Table S1). Final models for gas-exchange traits utilized the wavelength range 80 500-900 nm for net CO 2 assimilation (A), transpiration (E), stomatal conductance (g s ), intercellular CO 2 81 concentration (C i ), instantaneous water-use efficiency (WUE i ), intrinsic water-use efficiency (WUE in ) and 82 leaf temperature (T l ) (Supplemental Table S1). Cross validated models accurately characterized A, E, g s and 83 T l (Table 1; Fig. 1a,b,c,g). Performance metrics of cross validated modes of C i , WUE i and WUE in models 84 were slightly lower than those of A, E, g s and T l (Table 1; Fig. 1d,e,f). Standardized PLSR coefficients and 85 the variable important to the projection (VIP) measures exhibited similar profiles for both sets of models. 86 Specifically, VIP highlighted important wavelengths around 550 nm and from 650 to 750 nm ( Fig. 2a-g). 87 While statistically significant (r = 0.52, P >0.001, n = 61), the correlation between WUE in calcuated using 88 either observed or predicted values of A and g s was lower and the bias higher than estimation accuracy of 89 modelling WUE in directly (Fig. 2f). 90 The most optimal leaf chlorophyll concentration (Chl SPAD ) PLSR model used the 600-900 nm spectral 91 region, and cross-validated models estimated chlorophyll concentration well (Supplemental Table S1; Table  92 4 1; Figure 1h). Similar to gas-exchange traits, Chl SPAD standardized coefficients and VIP values were most 93 pronounced in the 650-750 nm spectral region (Figure 2h). 94 The best model performance for leaf water potential (Ψ w ) was found using the spectral full-range (400-2400 95 nm), whereas best model performance for leaf osmotic potential (Ψ π ) and leaf osmotic potential at full turgor 96 (Ψ π100 ) utilized the wavelength range 1400-2400 nm. Cross-validated models predicting Ψ w , Ψ π and Ψ π100 97 preformed reasonably (Table 1; Fig. 1i,j,k). Standardized coefficients and VIP profiles for Ψ w were relatively 98 flat, showing some peaks around 600, 1500 and 1900 nm, whereas Ψ π and Ψ π100 were more variable, with 99 important wavelengths around 1900 nm (Fig. 2i,j,k). Final models for relative water content (RWC) and 100 succulence (Suc) utilized the 950-2400 nm region (Supplemental Table S1). Predictive models accurately 101 characterized RWC and Suc (Table 1; Fig. 1l,m). Both RWC and Suc standardized coefficients and VIP 102 values were most pronounced in the 1350-1500 nm and 1900-2100 nm spectral regions (Fig. 2l,m). There 103 was no relationship between the the estimation of Ψ π100 calculated using either observed or predicted values 104 of Ψ π and RWC (r = 0.11, P = 0.40, n = 61). No field measurements were collected for Ψ π100 , RWC or Suc; 105 therefore, models were developed only from greenhouse-collected data. 106 Specific leaf area (SLA) was accurately predicted with a PLSR model that utilized wavelengths from 1400 to 107 2400 nm (Supplemental Table S1). Cross-validated models estimated SLA very accurately (Table 1; Fig.  108 1n). SLA standardized coefficients and VIP values highlighted important wavelengths around 1500 nm, from 109 1850 to 2050 nm, and around 2280 nm (Fig. 2n). 110 The best performance for the total (sum of abaxial and adaxial surfaces) stomatal density (TSD) model was 111 found using the wavelength range 1400-1800 nm (Supplemental Table S1) and cross-validated models 112 yielded sub-optimal estimations (Table 1, Figure 1o). TSD standardized coefficients and VIP values were 113 highly variable throughout the whole spectral-range used ( Figure 2o). No field measurements were collected 114 for TSD; therefore, models consisted only of data from the greenhouse. Best models for abaxial and adaxial 115 stomatal density were less accurate than TSD (average R 2 : 0.17 and 0.10, respectively, using the 1400-1800 116 nm range). 117 Averaged fit statistics (R 2 , RMSE, bias, NRMSE) for external validations were similar to those registered for 118 cross-validation for A, E, g s , C i , WUE i , WUE in , T l , Suc, SLA and TSD, whereas they were slightly lower for 119 Chl SPAD (R 2 : 0.61 vs 0.44 for cross-and external validation, respectively), Ψ w (R 2 : 0.63 vs 0.40), Ψ π (R 2 : 0.60 120 vs 0.34), Ψ π100 (R 2 : 0.53 vs 0.40) and RWC (R 2 : 0.90 vs 0.65, Table 2). 121 Using A, Ψ w , Ψ π and SLA as testing traits, the performances of models built that included field and 122 greenhouse data or only greenhouse data were comparable for calibration and cross-validation (Supplemental 123  Table S1). For external validation, the performance of the models built using both greenhouse and field 124 samples was similar whether validations were performed using a dataset containing both greenhouse and 125 field samples (as reported above, For all leaf traits, modelling performance of Lasso using the full spectral range (400-2400 nm) was lower or 131 comparable to PLSR. Using A, g s , C i and Ψ π as testing traits, the Lasso modelling performance decreased 132 when using the smaller spectral regions used in the final PLSR modelling (Supplemental Table S2). 133

Correlations among leaf traits and leaf traits with spectral indices 134
Similar correlations were found either using only greenhouse samples or both greenhouse and field samples 135 together (Supplemental Tables 3 and 4). All significant and strong correlations (P < 0.05, r > 0.4) among leaf 136 traits of greenhouse samples using reference values were confirmed by using the predicted values, except 137 5 RWC with T l and Ψ π100 , where the relationships between reference and predicted data did not hold. We found 138 significant and strong correlations among reference and predicted field data, with the exception of some 139 correlations related with Ψ w and Ψ π (Ψ w with T l and Ψ π ; Ψ π with SLA). Overall, we found more and stronger 140 correlations using predicted data than when using reference measurements (Supplemental Tables S3 and S4). 141 Using the reference values only, few significant correlations found in the greenhouse were confirmed in the 142 field (A with E and g s , E with g s , C i with WUE in , WUE i with T l , and Ψ w and Ψ π , Supplemental Table S3). 143 Using reference values, greenhouse measurements of A were positively related with the normalized 144 differential vegetation index (NDVI) and a scaled photochemical reflection index (sPRI, r of 0.41 and 0.52, 145 respectively, P < 0.05, n = 155), but not the normalized differential water index (NDWI, r of 0.24, P > 0.05, 146 n = 155). Using reference measurements collected in the field, however, A was not related with NDVI , 147 sPRI, or NDWI (r of 0.18, -0.25, and 0.05, P > 0.05 n = 36). Greenhouse reference values of RWC, Suc and 148 Ψ w were positively related to NDWI; although statistically significant, the relationships for all three variables 149 with NDWI were weak (r of 0.22, 0.26, and 0.17, respectively, P < 0.05, n = 173 for RWC and Succ and n = 150 169 for Ψ w ). 151

Leaf trait responses under different drought conditions 152
A three-way ANOVA, including yield stability, water treatment, and stage (  Figure S1). We found significant water treatment effects for all leaf traits 155 except WUE in , T l and TSD (Table 3). In general, A, E, g s , WUE i , Chl SPAD , Ψ w and RWC decreased under 156 mild drought (MD) (-75%, -75%, -79%, -22%, -5%, -45% and -10%, respectively) and even more under 157 severe drought (SD) (-86%, -85%, -88%, -32%, -12%, -64%, -20%, respectively). Overall, Ψ π and Suc 158 decreased approximately 12% and 6%, respectively, and C i increased ca. 65% under MD and SD. Ψ π100 and 159 SLA increased only under SD (+11% and +9%, respectively, Supplemental Figure S1). We also found 160 significant stage effects for all leaf traits except Ψ w (Table 3), with A, E, g s , WUE i , WUE in , Chl SPAD , Ψ π , 161 RWC, Suc and SLA generally decreasing (-39%, -30%, -35%, -38%, -25%, -2%, -5%, -12%, -24% and -162 16%, respectively) and C i , T l , Ψ π100 and TSD increasing (+85%, +13%, +9% and +14%, respectively) at the 163 later V10 developmental stage (Supplemental Figure S1). The magnitude of response, however, varied 164 across water treatments and was different for different genotypes. 165 We found no significant yield stability × water treatment or yield stability × stage interactions, whereas we 166 found significant water treatment × stage interactions for A, E, g s , C i , WUE i , WUE in , Ψ w , Ψ π100 and RWC 167 (Table 3). Early in development (V6), A, E and g s were decreased more under SD than under MD, whereas 168 at the later (V10) stage, the effects were similar under MD and SD. A similar behavior was observed for Ψ w , 169 which had the lowest values at the first developmental stage under SD. C i and WUE i decreased under water 170 deprivation only at the V10, and the degree of effect of the stress was similar in both MD and SD treatments. 171 WUE in was increased in MD and SD plants at the first developmental stage but decreased in both treatments 172 at the second developmental stage. Ψ π100 increased in V10 plants under SD. The decrease of RWC was 173 greater under SD than MD only at the V10 stage (Supplemental Figure S1). Only Chl SPAD showed a 174 significant yield stability × water treatment × stage interaction. Genotypes with high yield stability 175 maintained high Chl SPAD values under MD at the V6, but not V10, stage (Table 3, Supplemental Figure S1). 176 A second three-way ANOVA, including genotype, water treatment, and stage (  Figure S1, Supplemental Table S5). We also found 184 significant genotype × water treatment interactions for A, E, Chl SPAD , Ψ w and Ψ π100 (Table 4). Under MD, 185 PI6011361 and PI543842 had higher values of A, E and Chl SPAD than the other genotypes, whereas no 186 genotypic differences for these traits were observed under SD. PI543842 and PI601438 had a greater 187 decrease in Ψ w under SD, whereas no genotypic differences were found under MD. Although genotypic 188 differences of Ψ π100 were found for controls with higher values in PI559936 and PI6011361 and lower values 189 in PI601438 and Ames27193, no differences among genotypes were found under either MD or SD 190 (Supplemental Figure S1). Significant genotype × stage interactions were found for Chl SPAD , Ψ w , SLA and 191 TSD (Table 4). PI6011361 showed numerically the highest Chl SPAD values at both stages of analysis, but it 192 was significantly different from only Ames27193 at V6, while it was significantly different from PI55935,193 PI559936 and PI601438 at V10. At V6, Ames27193 and PI6011361 demonstrated higher Ψ w values than 194 PI601438, whereas no genotypic differences in Ψ w were observed at V10. SLA was higher in PI559935 than 195 PI559936, PI601438 and Ames27193, whereas no genotypic differences were observed at V10. No 196 genotypic differences were observed at V6 in terms of TSD, while at V10 only PI601438 and Ames 27193 197 remained at the levels reported at V6 as all the other genotypes increased (Supplemental Figure S1). A 198 significant genotype × water treatment × stage interaction was found only for WUE in and Chl SPAD (Table 4, 199 Supplemental Figure S1). PI6011361 (drought-sensitive) and PI559936 (drought-sensitive) were the only 200 genotypes where WUE in remained high at V10, and specifically so under MD and SD. PI6011361 (drought-201 sensitive) showed an ability to maintain optimal Chl SPAD values at V6 under water stress, whereas PI543842 202 (drought-tolerant) showed this response at V10. The greatest reductions in Chl SPAD under MD were observed 203 in Ames27193 (drought-tolerant) at V6 and in PI601438 (drought-sensitive) at V10. 204

DISCUSSION 205
We describe a non-destructive approach by which plant responses to water availability can be monitored 206 using reflectance spectroscopy. By combining reflectance measurements, standard physiological and 207 anatomical measurements and statistical modelling, we demonstrate the potential of using spectral data to 208 estimate maize leaf traits that often change under drought stress. This documents the potential of 209 spectroscopy to estimate a large number of drought-related leaf features in maize and also highlights the 210 possibility to estimate water potential traits in C4 plants. 211 Photosynthesis is a key plant process that is affected by drought, primarly through stomatal and mesophyll 212 limitations (Pinheiro and Chaves, 2010). Standard measurements of light-and CO 2 -saturated photosynthesis, 213 however, are logistically challenging, potentially taking 20-30 minutes per leaf as the leaf has to acclimate to 214 the cuvette conditions. Hyperspectral approaches present a rapid alternative to standard reference Here we also show that the range of traits estimable (e.g., gas-exchenge) can be expanded using 229 spectroscopy to better understand specific physiological responses to water stress. 230 7 We found that the most important spectral region for prediction of SPAD-based chlorophyll estimates was 231 600-900 nm. Wavelengths between 700 and 750 showed the highest VIP values reported for predicting 232 SPAD-based chlorophyll, highlighting the stronger relationship between SPAD values and the red-edge, 233 compared with other traits. 234 A, E, and g s were well predicted, followed by WUE i and Chl SPAD , and then by C i and WUE in . These 235 outcomes suggest that physiologically processes derived from calculations of other measurements (i.e., 236 predictions were used in calculation of processes) lost prediction accuracy, compared with processes that 237 were directly predicted (e.g., A, E, g s ). The performance of the model predicting A was confirmed by the 238 strong and positive relation with NDVI and sPRI, two spectral indices related to photosynthetic activity 239 (Gamon et al., 1995(Gamon et al., , 1997. Using predicted values to calculate WUE in (WUE incalc ), compared with 240 predicting WUE in directly, as a test case (WUE incalc vs WUE in ), the preditction accuracy further decreased 241 when using predicted values to calculate WUE incalc (i.e., predicted A and g s ) opposed to directly predicting 242 WUE incalc using spectral data alone. This outcome is likely due to the propogation of error from the 243 individual predictions in the calculations. 244 Interestingly, we found excellent prediction performance for T l (R 2 : 0.89, for cross-validation). The 245 capability of reflectance spectroscopy to detect T l changes has been previously reported across a number of 246 C3 species exposed to variable air temperatures (e.g., Serbin 2012). The wavelength range utilized by Serbin 247 (2012) included a much larger range (400-2400 nm), contrasted with the narrower range (500-900 nm) we 248 utilized. The differences between the two outcomes is likely a consequence of the different treatments 249 utilized in the two studies: temperature opposed to water stress. The ability of photosynthetic and accessory 250 pigments to safely dissipate excess light energy through the xanthophyll cycle (Demmig-Adams and Adams, 251 1996) likely contributes to the ability of spectral data to accurately predict temperature in the current study.

252
A further explanation in the discrepancy between wavelength regions from the current study and those from 253 Serbin (2012) is likely due to the different photosynthetic pathways, foliar anatomical structure, and 254 experimental treatments used. Moreover, the SWIR region is strongly sensitive to water content, given the 255 prominent water absorption features in this spectral region (Curan 1989), and the inclusion of different 256 watering treatments in the current study may have altered the influence of this spectral region on predicting 257 leaf temperature, shifting focus on excess energy dissapition via shifts in pigment profiles. 258 Measurements of water status can be broadly divided into either the amount of leaf water or leaf energy 259 status (Jones, 2007). Reflectance spectroscopy likely utilizes information from both approaches because 260 vegetation reflectance is influenced by the amount of water as well as by the composition and concentration 261 of osmolytes that affect variation in Ψ π , and ultimately in Ψ w (Cortozzi et al., 2017a). We found that three 262 spectral regions best predicted the investigated water-related traits. Unexpectedly, Ψ w , which is currently the 263 most widely used parameter to estimate the water status of plants exposed to drought (González-Fernández 264 et al., 2015), was best predicted using the full spectral range (400-2400 nm). This is in contrast with region best predicts Ψ w . However, Ψ w is the sum of a number of components, including dissolved solutes 267 (Ψ π ), the pressure potential (Ψ p , equal to the hydrostatic pressure) and the gravitational potential (Ψ g , 268 ignorable except in tall trees), and is also influenced by photosynthestic regulation (Jones, 2007). Thus, Ψ w is 269 the outcome of the coordination of a number of underlying factors, each of which can potentially interact 270 with spectral regions differently. We feel the inconsistency among other studies and ours in determining the 271 best spectral region for predicting Ψ w is likely due to the anatomical differences between C3 and C4 species, 272 photosynthetic pathways, and severity and range of stress invesitgated. 273 For other modeled water status-related traits, the best predicting models were derived from the spectral 274 region dominated by water content and outside of wavelengths commonly associated with pigments. As we . While we also found statistically 281 significant relationships between RWC, Suc and Ψ w with NDWI, a spectral index widely used to estimate 282 plant water content (Gao, 1996), the relationships were weak and had substantially less explaniatory power 283 than the PLSR predicted estimates we generated, likely due to the limited inclusion of spectral data with 284 strong water absorption features. We found that the wavelength region that best predicted Ψ π and Ψ π100 was 285 primarily the SWIR region and excluded the minor water absorption features centered at 970 nm and 1200 286 nm. By eliminating the NIR portion of the spectrum, we potentially amplified the contribution of osmolytes 287 to the prediction of Ψ π and Ψ π100 in maize leaves. Multiple studies ( wavelength regions important for predicting non-structural carbohydrates and other foliar osmolytes using 290 spectroscopy align or closely overlap with the more important wavelengths we found for optimal model 291 prediction of Ψ π and Ψ π100 . This finding suggests that when SWIR data are available, they can provide a 292 more accurate representation of foliar water content than indices alone. Scaling this approach to airborne 293 data, however, may be challenging because of the noise generated by atmospheric water in the spectral 294 region of strong foliar water absorption. 295 In the current study, estimation of Ψ w was weak for severely drought-stressed plants (Ψ w <-2.0 MPa), 296 converse to similar data in live oak reported by Cotrozzi et al. (2017a). Given the sub-optimal model 297 performance, we encourage caution when interpreting results from a narrow range of values of Ψ w , Ψ π and 298 Ψ π100 . Regardless of this limitation, and considering that measurements of these traits with standard methods 299 (pressure chamber and vapor pressure osmometer) have several constraints in that they are destructive, user- spectroscopy as a rapid and non-destructive approach to unbiasely estimate water, increasing data collection 302 efforts over larger spatial and temporal scales. using spectral data collected from diverse inbred and hybrid lines grown at ambient and elevated ozone 308 concentrations in the field. Standardized coefficients and VIP metrics for SLA highlighted the importance of 309 the wavelengths related to the water content of leaves. As we expected, the weakest predictive ability among 310 the models presented in this study was for stomatal density. For the nature of the trait, we found it difficult to 311 estimate stomatal density from leaf reflectance, as suggested by the variability in the model coefficient and 312 VIP weighting. Stomata are localized only on the epidermal surfaces of leaves, and while the interaction of 313 light with the epidermis contributes to overall reflection profiles, reflectance measurements are also affected 314 by the leaf tissue as a whole, including the palisade and cuticular layers, and this relationship likely impeded 315 stomatal density prediction from spectral data. 316 An emergent outcome of this study is the advancement of conceptual approaches of chemometric modelling 317 by examining the influence of multiple modelling approaches and different reference measurement 318 collections on model outcomes. Comparing different modelling approaches, we found that performance of 319 PLSR was always higher or comparable with the performance of the Lasso approach. The reason for the 320 higher performance of PLSR than Lasso likely has to do with the trait response being modelled. Responses 321 that contain singular relationships with spectra (e.g., predictors can converge to a few highly important 322 wavelengths, such as pigments) can be modelled after removing a large portion of the spectrum, as done by 323 Lasso through penalizing intercorrelated variables. We also found that Lasso lost prediction accuracy when 324 smaller spectral ranges are used, opposed to starting with full spectrum data. Physiological processes and 325 other phytochemical compounds may need larger portions of the spectrum for successful modelling because 326 the necessary absorption features are contained within numerous and different spectral regions associated 327 with the components parts of these processes. The contributions of smaller coefficients, which are included 328 in PLSR but removed in Lasso, likely improve prediction accuracies in PLSR because they provide related 329 information to the coefficient of absorption features that individually are sub-optimal, but cumulatively 330 contribute more than individual wavelengths alone. 331 Our measurements of spectral and reference measurements from both controlled and field environments 332 revealed three ouctomes. First, greenhouse-based models allowed us to capture a greater proportion of trait 333 variation than using only field-grown plants because plants were exposed to a greater and wider range of 334 stress conditions. Second, the prediction accuracy of models built using both greenhouse and very few field 335 measurements was similar with predictions made using spectra on either greenhouse or field samples, as long 336 as the trait values in field samples were within the prediction range of models built using greenhouse 337 collections. Third, the prediction of field responses was dramatically compromised if using predictive 338 models created using spectral data collected from only greenhouse collections. These outcomes demonstrate 339 that field collections are a necessary compliment to controlled environment studies if the intent is to use 340 models for phenotyping field plant stress responses. 341 We found that few of the physiological and anatomical traits measured in this study were correlated with 342 yield stability for the specific maize hybrids used in the present study. Only the regulation of chlorophyll 343 content, determined as SPAD, varied between genotypes with high or low yield stability conditions. Indeed, 344 we commonly found that a genotype within each drought-sensitivity class (i.e., tolerant or sensitive) 345 exhibited inconsistent physiological responses to water stress. Both drought-sensitive and drought-tolerant 346 genotypes exhibited higher photosynthesis, transpiration rates and Chl SPAD under moderate drought stress, 347 and some drought sensitive genotypes had the lowest Ψ w values under water stress. We found a significant 348 three-way interaction (yield stability class by water stress by ascession) for only Chl SPAD and intrinsic water 349 use efficiency with two drought-sensitive genotypes able to maintain this status at optimal levels under water 350 stress at later developmental stages. The lack of a relationship between the traits contributing to a statistically 351 significant response and yield stability is likely due to the speed of the onset of stress and differences in 352 water availability in the containers, compared to field conditions where yield stability was assessed. 353 However, we found that the physiological and anatomical responses measured in the maize hybrids in 354 response to water stress were well predicted using hyperspectral data, as almost all the statistical outputs for 355 the correlations obtained with reference measurements were confirmed in both greenhouse and field samples. 356 This is in agreement with those of others for spectral predictions of maize traits under differential Here we demonstrate that reflectance spectroscopy provides a rapid, non-destructive approach to accurately 361 quantify physiological and anatomical functional traits associated with water relations in maize using a 362 single spectral measurement. Importantly, this suggests that the predictions from spectral data can be used in 363 place of standard reference collections. Moreover, this approach can dramatically increase data collection 364 from a larger number of individual genotypes and plants, in both controlled and field conditions, than 365 reference measurements alone. The increase in data volume collected using hyperspectral data, coupled with 366 similarity in the outcomes of statistical analyses used to measure responses across treatments, should 367 promote the implementation of more complex experimental designs that can provide greater insight into the 368 genetic, environmental, and gene-by-environment interactions that enhance and repress agricultural 369 production. While logisitical challenges (e.g., solar angle or atmospheric interference) exist, leaf-level 370 spectroscopy can be effectively used as ground reference or training input for airborne-based platforms and 371 scaled to field and landscape levels ( tenth leaf (i.e. the youngest fully-expanded leaves at the two stages of analysis) of each plant, we collected 399 measurements in the following order: gas-exchange, leaf greenness, reflectance and water potential, then leaf 400 portions were collected for the determination of other water status traits and thickness, as well as imprinted 401 slides for stomatal density (for a total of 180 leaf samples; Supplemental Figure S2). These specific traits 402 were selected because they are commonly investigated to evaluate the effects of drought on plants. The collection of data in both greenhouse and field environments was performed to achieve the second main 426 goal of the present study (i.e., compare the performance of models built using only greenhouse samples with 427 models including also field samples) as schematized in Supplemental Figure S3. Further details are reported 428 below. 429

Gas exchange and chlorophyll content 430
Net CO 2 assimilation (A), transpiration (E), stomatal conductance (g s ), leaf intercellular CO 2 concentration 431 (C i ) and leaf temperature (T l ) were determined using a LI-6400XT portable photosynthesis system equipped 432 with a 6400-02B LED light source (Li-Cor, Inc., Lincoln, NE, USA), operating at 400 ppm CO 2 433 concentration and saturating light conditions (1700 µmol m -2 s -1 PAR). Instantaneous (WUE i ) and intrinsic 434 (WUE in ) water-use efficiency were calculated as A/E and A/g s , respectively. A SPAD 502 meter (Minolta, 435 Osaka, Japan) was used to determine leaf greenness (Chl SPAD ). Three measurements per leaf were made, and 436 the mean of these measurements was recorded. 437

Collection of leaf spectra 438
Full range (350−2500 nm) reflectance profiles of maize leaves were collected using a SVC-1024i 439 spectroradiometer (Spectral Vista Corporation, Poughkeepsie, NY, USA) using a leaf-clip with an internal 440 halogen light source attached to a plant probe. Integration time (i.e., the length of time that the detectors are 441 allowed to collect photons before passing the accumulated charge to the A/D converter for processing) was 442 set at two seconds. Measurements were made on three and five areas of the leaf adaxial surface for each 443 greenhouse and field leaf, respectively, with one measurement per area, and all measurements were 444 combined to produce an average leaf spectrum. The relative reflectance of each leaf was determined from the 445 measurement of leaf radiance divided by the radiance of a white reference panel internal to the leaf clip, 446 measured every 12 spectral collections. 447 TW are fresh, dry and turgid weights, respectively. Leaf Suc was calculated as FW/DW. The Ψ π100 was 457 calculated as Ψ π ×RWC. 458

Water status and leaf thickness 448
After determination of TW, the leaf portion used for determining RWC was scanned prior to drying, and its 459 leaf area (LA) was determined using the ImageJ×1.38 software (National Institutes of Health, Bethesda, MD, 460 USA). Specific leaf area (SLA) was calculated as LA/DW. 461

Stomatal density 462
A leaf surface imprint method was used to determine stomatal density (SD; Weng et al., 2012). Abaxial (Ab) 463 and adaxial (Ad) epidermal cell outlines of leaves were imprinted onto cyanoacrylate droplets that were 464 placed onto glass slides. Images were taken under 10× magnification using a Nikon-OptiPhot2 microscope. 465 Stomata were counted in an area of 0.474 mm 2 . For each slide, four independent leaf areas were counted and 466 averaged for each biological replicate. TSD was calculated as Ab SD+Ad SD. 467

Model calibration and validations 468
Using both greenhouse and field samples, we generated models to predict leaf traits from untransformed 469 reflectance profiles using PLSR (Wold et al., 2001). When predictor variables are highly correlated, as is the 470 case with hyperspectral data, classical regression techniques can produce unreliable coefficients and error 471 estimates (Grossman et al., 1996). In contrast to standard regression techniques, PLSR reduces a large combined into a linear model predicting leaf traits based on leaf spectral profiles. 478 Model performance was evaluated by conducting 500 randomized permutations of the data sets using 80% of 479 the data for calibration (i.e., training) and the remaining 20% for cross-validation (i.e., testing) with random 480 resampling for each permutation. For each permutation, we tracked the R 2 , the overall error rate (RMSE), the 481 percentage of error relative to the data range (NRMSE) and bias to assess model performance when applied 482 to the validation data set. These randomized analyses generated a distribution of fit statistics allowing for the 483 assessment of model stability as well as uncertainty in model predictions. We further determined the strength 484 contribution of PLSR loadings by individual wavelengths using the VIP selection statistic. The VIP statistic 485 evaluates the importance of individual wavelengths in explaining the variation in both the response and 486 predictor variables, where larger weightings confer greater value to contribution of individual wavelengths to 487 the predictive model (Wold et al., 2001;Chong and Jun, 2005). Using A, Ψ w , Ψ π and SLA as testing traits, 488 we also compared the performance of models built with both greenhouse and field samples and models 489 including only greenhouse data (Supplemental Figure S3b). 490 We additionally performed external validation by applying PLSR-coefficients on a dataset distinct from the 491 one used for calibration and cross-validation, including again both greenhouse and field samples (ca. 40% of 492 the full dataset). Five-hundred randomized permutations were conducted, and relations between predicted 493 and observed values were tested by regression analysis. Fit statistics (R 2 , RMSE, bias, NRMSE) were again 494 used to assess model estimation accuracy. Using A, Ψ w , Ψ π and SLA as testing traits, we also compared the 495 estimation efficacy for these traits in field samples between the models built with both greenhouse and field 496 samples and models including only greenhouse data (Supplemental Figure S3b). 497 Before building the final models, we developed preliminary models to identify wavelength regions 498 associated with the trait of interest and to identify poorly predicted outliers on either the reference or target 499 measurements. Final wavelength regions included individual wavelengths at 1 nm intervals, thus maintaining 500 high-density spectral information in regions associated with traits, but excluding wavelengths that contribute 501 error to the predictions likely because of lack of absorption features. Prediction residuals were used to 502 identify potential outliers. Spectral data of outliers were further examined for errors, detectable from elevated 503 reflectance in the VIS wavelengths or spectral jumps in the NIR region that occur when the leaf clip is not 504 fully closed. Reference data of outliers were also examined for extremes in the data distribution (Couture et 505 al., 2016(Couture et 505 al., , 2013Cotrozzi et al., 2017b;Marchica et al., 2019). The same approach was followed to remove 506 outliers from the data set used for external validations. Outliers removed accounted for 12-13% of the initial 507 data. 508 To test another multivariate statistical approach, we also analyzed the same data sets using least absolute 509 shrinkage and selection operator (Lasso, Tibishirani, 1996. The aim of the Lasso approach is to 510 improve prediction accuracies by minimizing, or shrinking, coefficients to to be less than a fixed, upper-511 bounded value and removes coefficients of intercorrelated variables by setting them to zero, providing 512 variable selection. The same as our PLSR approach, Lasso model performance was evaluated by conducting 513 500 randomized permutations of the data sets using 80% of the data for calibration and the remaining 20% 514 for cross-validation; and for each permutation, we tracked R 2 , RMSE, NRMSE and bias statistics. The 515 modelling approach and data analyses were performed using the 'pls' (Mevik et al., 2016) and 'glment' 516 (Friedman et al., 2010) packages in R (www.r-project.org) for PLSR and Lasso, respectively. 517

Spectral indices 518
Some widely used spectral indices thought to be related with plant health and water status were also 519 calculated: photochemical reflectance index (PRI), an indicator of photosynthetic radiation use efficiency, 520 ( (R 857 -R 1241 )/(R 857 +R 1241 ) (Gao, 1996). R x indicates reflectance at x nm wavelength. 525

Statistical analyses 526
We first determined the effects of yield stability under drought, water treatment, stage and their interactions 527 on leaf traits of greenhouse samples, by using a three-way analysis of variance (ANOVA) following the 528 model 529 Y ijk =µ+Ys i +W j +S k +YsW ij +YsS ik +WS jk +YsWS ijk +e ijk . 530 Then, we similarly determined the effects of genotype, water treatment, stage and their interaction on leaf 531 traits of greenhouse samples, using a three-way ANOVA following the model 532 In these models, µ represents the mean, Ys is yield stability under drought level i, G is genotype level i, W is 534 water treatment level j, S is stage level k, and e ijk represents the error term. 535 Relationships among predicted leaf traits from greenhouse, field or greenhouse and field samples were 536 evaluated using Pearson's correlations. Pearson's correlations were also used to test the relations among A 537 with NDVI, sPRI and NDWI, as well as Ψ w , RWC and Suc with NDWI. For relationships among A and 538 spectral indexes, data were separated into either greenhouse or field data; for relationships among Ψ w , RWC, 539 Suc, and NDWI only greenhouse data were available. Statistical analyses (included regression analyses for 540 external validations of models) were performed in JMP 13.2 (SAS Institute Inc., Cary, NC, USA). The 541 normality of data was preliminary tested by the Shapiro-Wilk W test. Effects with P<0.05 were considered 542 statistically significant. 543