A photometric stereo-based 3D imaging system using computer vision and deep learning for tracking plant growth

Abstract

Background: Tracking and predicting the growth performance of plants in different environments is critical for predicting the impact of global climate change. Automated approaches for image capture and analysis have allowed for substantial increases in the throughput of quantitative growth trait measurements compared with manual assessments. Recent work has focused on adopting computer vision and machine learning approaches to improve the accuracy of automated plant phenotyping. Here we present PS-Plant, a low-cost and portable 3D plant phenotyping platform based on an imaging technique novel to plant phenotyping called photometric stereo (PS).

Results: We calibrated PS-Plant to track the model plant Arabidopsis thaliana throughout the day-night (diel) cycle and investigated growth architecture under a variety of conditions to illustrate the dramatic effect of the environment on plant phenotype. We developed bespoke computer vision algorithms and assessed available deep neural network architectures to automate the segmentation of rosettes and individual leaves, and to extract basic and more advanced traits from PS-derived data, including the tracking of 3D plant growth and diel leaf hyponastic movement. Furthermore, we have produced the first PS training data set, which includes 221 manually annotated Arabidopsis rosettes that were used for training and data analysis (1,768 images in total). A full protocol is provided, including all software components and an additional test data set.

Conclusions: PS-Plant is a powerful new phenotyping tool for plant research that provides robust data at high temporal and spatial resolutions. The system is well suited for small- and large-scale research and will help to accelerate bridging of the phenotype-to-genotype gap.

Reviewer #2: Well done. As a note, please inspect your calculations for curvature. In Figure S4 there are locations where gradients exist in the X and Y components of the normal. As curvature is a change in the normal vector along the surface, it seems reasonable that these regions have a change in the angle of the normal and therefore a higher curvature. However, these regions are not highlighted in Figure S5. For example, the topmost vertical leaf has an isobar-ridge running vertically along the X and Z components (gradient running along the horizontal direction). I would expect the regions with the greatest gradient of the normal component(s) to be highlighted in Figure S5. Perhaps, however, this is a matter of both scale and color mapping. Again, well done.
- We thank the reviewer for checking this (to be specific, Fig. 4 and Fig. 5 in Supplementary Information S1). We have double-checked the calculations and found them to be correct. The reviewer is correct to notice the change in surface normal direction for the top leaf; however, this does not appear in the curvedness image. This can be explained as follows: the change in surface normal direction is quite gradual, despite being large when compared to the leaf veins of other leaves. The investigated surface curvedness is a local parameter; thus, the software enables the user to define the size of this area. As specified in the figure legend (Fig. 5), a kernel size of 15 was used in our case; however, it was too small to clearly distinguish the leaf vein of the top leaf, while it was adequate for the other parts of the rosette.

[…] phenotyping both in fundamental research and agriculture [7][8][9][10]. Reflecting its considerable promise, effort has been directed toward automated ground vehicles (AGVs) [11,12], satellite [13], drone [14] and gantry-style platform imaging of field plants [15], and automated […]

[…] objects were placed on laser-cut wedges to allow imaging at a range of known angles (Fig. 2A).
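To illustrate the behaviour discussed above, the following is a minimal sketch of how a local curvedness map could be derived from a surface-normal (SN) image. The function name and the exact formulation are our illustrative assumptions, not the paper's implementation: it measures the local rate of change of the normal, averaged over a user-defined kernel window (e.g. 15 px, as in Fig. 5 of Supplementary Information S1), so a gradual large-scale bend scores low while sharp features such as leaf veins score high.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_curvedness(normals, kernel=15):
    """Estimate a per-pixel curvedness proxy from a surface-normal map.

    normals : (H, W, 3) array of unit surface normals.
    kernel  : side length of the local averaging window in pixels
              (hypothetical parameter mirroring the kernel size of 15
              mentioned in the figure legend).
    """
    # Spatial derivatives of each normal component approximate the
    # change of the normal vector along the surface.
    dndy, dndx = np.gradient(normals, axis=(0, 1))
    # Magnitude of the normal's rate of change at each pixel.
    change = np.sqrt((dndx ** 2 + dndy ** 2).sum(axis=-1))
    # Averaging over the kernel makes this a *local* parameter:
    # a slow, large bend (the top leaf) yields low values, while
    # sharp veins yield high values, matching the response above.
    return uniform_filter(change, size=kernel)
```

A perfectly flat region (constant normals) yields zero curvedness everywhere, whatever the kernel size.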

The projected areas were estimated using 2D and 3D data obtained from PS-Plant. The 3D data enabled us to estimate the object inclination angles, which were compared to the ground truth (Fig. 2B). Using 3D data, the area was estimated accurately up to 45° with a Mean Relative […]

Arabidopsis leaf blades typically have a convex shape when observed from above. Therefore, when the leaves were not inclined (i.e. at 0°), the estimated angles were still higher than zero, as they were calculated from the varying SN values across each leaf blade surface.

Interestingly, our data showed that leaf rhythmicity appears to be anticipatory up to 16 DAG, after which it was strictly diurnal. As older plants have a higher proportion of mature leaves that are no longer elongating, our data suggest that these leaves still exhibit rhythmic movements, but that they are driven by the daily light-dark cycle rather than the circadian oscillator.

PS-Plant produces a range of different data: from grayscale images to SN maps (e.g. Fig. 3). We trained the RNN and R-CNN architectures from initial random weights, while R-CNN was […] The type of PS data used did not significantly influence SBD or FBD scores, suggesting that the accuracy of RGB-based models was not affected by the different types of PS-based data. The most accurate leaf segmentation results were achieved with models based on the R-CNN architecture using pre-trained weights (Fig. 6; Supplementary Information S4).

The best results were achieved with a particle filter based on leaf instance centroid location and velocity across the time-series images (Fig. 7). Leaf overlap remained a limitation, as an occluding leaf was sometimes assigned the label of an occluded leaf. However, erroneous labelling was found to be infrequent and […] (Fig. S4).
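The 2D-versus-3D area comparison above can be sketched as follows. This is our illustrative assumption of the standard foreshortening correction, not necessarily the paper's exact method: each pixel's projected footprint is divided by the cosine of its inclination (derived from the SN map), so a blade tilted at 45° recovers its full area rather than the underestimated 2D projection.

```python
import numpy as np

def corrected_area(normals, mask, pixel_area=1.0):
    """Estimate 3D surface area from a surface-normal map.

    normals    : (H, W, 3) unit surface normals from photometric stereo.
    mask       : (H, W) boolean foreground mask (e.g. one leaf blade).
    pixel_area : real-world area of one pixel's footprint
                 (hypothetical calibration constant).

    The per-pixel inclination is the angle between the normal and the
    vertical, so cos(inclination) = |n_z|; dividing by it undoes the
    foreshortening of tilted surface patches.
    """
    nz = np.clip(np.abs(normals[..., 2][mask]), 1e-6, 1.0)
    return float(np.sum(pixel_area / nz))
```

For a plate tilted at 45°, the 2D projection underestimates the area by a factor of cos 45° ≈ 0.71; the correction above recovers the true value, consistent with the accuracy up to 45° reported in the text.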
However, we chose to use the latter (ii), as the PB was not always visible due to leaf occlusions or the petiole being too small to be distinguished (e.g. maturing leaves or leaves grown at low temperature).

[…] to MT and LT plants (Fig. 8B). However, leaves that emerged after 11 DAG (i.e. leaf 4) had an even more dramatic growth response to increased temperatures. For example, the blade area for leaves 1 and 4 at 17 DAG was 40% and 130% higher in HT compared to LT, respectively.

Similarly, the mean surface inclination of leaf blades was higher in HT (Fig. 8C). The latter result was also consistent with our findings for whole rosette surface inclination at higher temperatures (Fig. 3; Fig. 5; Supplementary Fig. S3).
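The mean surface inclination used here follows directly from the SN map: the inclination of each pixel is the angle between its normal and the vertical axis. A minimal sketch, with an assumed function name:

```python
import numpy as np

def mean_inclination_deg(normals, mask):
    """Mean surface inclination (degrees) over a masked region.

    normals : (H, W, 3) unit surface normals.
    mask    : (H, W) boolean mask of the leaf blade (or whole rosette).

    0 deg corresponds to a flat, horizontal blade; 90 deg to a
    vertical one.
    """
    nz = np.clip(np.abs(normals[..., 2][mask]), 0.0, 1.0)
    return float(np.degrees(np.arccos(nz)).mean())
```

Because the mean is taken over all pixels, a convex blade returns a non-zero value even when the blade as a whole is not inclined, which is the effect noted earlier for leaves at 0°.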

We then calculated parameters associated with diurnal movement for individual leaf blades (Fig. 8D). We targeted immature leaf blades, as their movement patterns were clearer and more consistent compared to maturing leaf blades. Period or phase measurements from immature leaf blades were generally similar between growth conditions and comparable to values for whole rosettes (Fig. 5). In contrast, measurements of immature leaf blade amplitude were significantly enhanced at MT and HT and generally higher than values for whole rosettes. This […] with whole rosette data (Fig. 5D; Supplementary Fig. S3B).

In this paper, we have introduced an adaptable and low-maintenance platform for affordable […]

[…] sheet (44 × 44 cm) and positioned at a height of 40 cm above the imaged plants (Fig. 1B, C).
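As a rough illustration of the amplitude measurement discussed above, the sketch below estimates the diel amplitude of an inclination time series. This is a deliberately simplified stand-in, not the analysis pipeline used for the results: growth-related drift is removed with a linear detrend, and the amplitude is taken as half the mean daily peak-to-trough range.

```python
import numpy as np

def diel_amplitude(angles, samples_per_day):
    """Peak-to-trough amplitude of a diel (24 h) inclination rhythm.

    angles          : 1-D array of mean inclination angles over time.
    samples_per_day : number of samples per 24 h period.
    """
    t = np.arange(angles.size)
    # Remove the slow growth-related trend before measuring the rhythm.
    detrended = angles - np.polyval(np.polyfit(t, angles, 1), t)
    # Reshape into whole days and average the daily peak-to-trough range.
    n_days = angles.size // samples_per_day
    days = detrended[:n_days * samples_per_day].reshape(n_days, samples_per_day)
    return float(np.mean(days.max(axis=1) - days.min(axis=1)) / 2.0)
```

Applied to a sinusoidal rhythm of amplitude 10° superimposed on a linear growth trend, the function recovers approximately 10°.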

The camera was positioned centrally in the sheet and the LEDs were positioned around the camera at 45° angle increments. The LEDs were tilted at a 30° angle to illuminate the area under the camera's field of view (Fig. 1B). The base of the rig was painted matt black to limit […]

[…] (AsusTek Computer Inc., Taiwan) was used to control the LED illumination, and to acquire, store and process images using GUI software written in Python. Details on rig assembly and the LED controller design are outlined in Supplementary Information S2.
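With LEDs at known positions like those described above, per-pixel surface normals can be recovered by classical photometric stereo. The sketch below is a textbook Lambertian least-squares formulation under our own assumptions (it is not the PS-Plant implementation; in particular, we assume the 30° tilt is measured from the vertical and that the lights are distant enough to treat their directions as constant across the field of view).

```python
import numpy as np

def light_directions(n_lights=8, tilt_deg=30.0):
    """Unit light directions for LEDs at 45-degree azimuth increments,
    tilted tilt_deg from the vertical (assumed rig geometry)."""
    az = np.deg2rad(np.arange(n_lights) * 360.0 / n_lights)
    el = np.deg2rad(90.0 - tilt_deg)          # elevation above the plane
    return np.stack([np.cos(el) * np.cos(az),
                     np.cos(el) * np.sin(az),
                     np.full(n_lights, np.sin(el))], axis=1)

def photometric_stereo(images, light_dirs):
    """Recover surface normals and albedo from one image per light.

    images     : (K, H, W) grayscale images, one per light source.
    light_dirs : (K, 3) unit vectors pointing toward each light.

    Lambertian model: I_k = light_dirs[k] . (albedo * n), so the
    albedo-scaled normal g solves the least-squares system L @ g = I.
    """
    K, H, W = images.shape
    I = images.reshape(K, -1)                              # (K, H*W)
    g, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)     # (3, H*W)
    albedo = np.linalg.norm(g, axis=0)
    normals = (g / np.clip(albedo, 1e-9, None)).reshape(3, H, W)
    return normals.transpose(1, 2, 0), albedo.reshape(H, W)
```

Three or more non-coplanar light directions suffice in principle; eight lights over-determine the system and make the estimate more robust to noise and shadowed pixels.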

Leaf movement rhythm analysis

The leaf movement rhythm analysis was performed using the mean inclination angles (whole rosette or individual leaf blade) as input for BioDare2 beta (https://biodare2.ed.ac.uk/). […]

[…] including span (the velocity over 'span + 1' recent frames), search radius (the furthest distance (in pixels) an object may travel between frames), frame memory (the maximum number of frames a seen/tracked object that is absent will be remembered) and filter (the minimum number of frames an object must be seen/tracked to be included). The following particle filter […]
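The tracking parameters listed above can be illustrated with a minimal nearest-neighbour linker. This is a simplified sketch under our own assumptions, not the paper's particle filter: it uses only centroid position (omitting the velocity/span term), but it honours the search radius, frame memory, and minimum-length filter described in the text.

```python
import numpy as np

def link_centroids(frames, search_radius=20.0, memory=3, min_frames=5):
    """Link per-frame leaf centroids into tracks (simplified sketch).

    frames        : list of (N_t, 2) arrays of leaf centroids per frame.
    search_radius : furthest distance (px) a leaf may move between frames.
    memory        : frames a lost leaf is remembered before its track ends.
    min_frames    : minimum track length to keep (the 'filter' parameter).

    Returns a list of tracks, each a list of (frame_index, centroid).
    """
    tracks = []                 # each track: list of (frame_index, centroid)
    active = []                 # each entry: [track_index, last_pos, missed]
    for t, pts in enumerate(frames):
        unmatched = list(range(len(pts)))
        for entry in active:
            match = None
            if unmatched:
                dists = [np.linalg.norm(pts[i] - entry[1]) for i in unmatched]
                k = int(np.argmin(dists))
                if dists[k] <= search_radius:
                    match = unmatched.pop(k)
            if match is not None:
                tracks[entry[0]].append((t, pts[match]))
                entry[1], entry[2] = pts[match], 0
            else:
                entry[2] += 1   # leaf not seen in this frame
        # Forget tracks that have been absent longer than 'memory' frames.
        active = [e for e in active if e[2] <= memory]
        # Unclaimed detections start new tracks (e.g. newly emerged leaves).
        for i in unmatched:
            tracks.append([(t, pts[i])])
            active.append([len(tracks) - 1, pts[i], 0])
    return [tr for tr in tracks if len(tr) >= min_frames]
```

With two steadily moving leaves and one spurious single-frame detection, the linker returns two full-length tracks and discards the spurious one via the minimum-length filter.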