Unlocking the potential of orphan legumes

Orphan, or underutilized, legumes are domesticated legumes with useful properties but with less importance than major world crops due to use and supply constraints. They play a significant role in many developing countries, providing food security and nutrition to consumers as well as income to resource-poor farmers. They have been largely neglected, however, by both researchers and industry due to their limited economic importance in the global market. Orphan legumes are better adapted than the major legume crops to extreme soil and climatic conditions, with high tolerance to abiotic environmental stresses such as drought. They may also produce, as a stress response, compounds with pharmaceutical value. Orphan legumes are therefore a likely source of important traits for introduction into major crops to aid in combating the stresses associated with global climate change. Modern large-scale genomics techniques are starting to be applied to many of these previously understudied crops, with the first successes reported in the genomics area. However, greater investment of resources and manpower is necessary if the potential of orphan legumes is to be unlocked and applied in the future.


Introduction
Orphan crops are a diverse set of minor crops that tend to be regionally important but not traded around the world and have received little attention by research networks. Developing countries however rely on these crops to alleviate protein and micronutrient deficiencies associated with the predominance of dietary calories from rice and wheat, which are researched heavily by private corporations (IPGRI 2002). Orphan or underutilized crops, including orphan legume crops, are staple food crops in many developing countries, but their economic importance in global markets is limited (Naylor et al., 2004). They are primarily grown by traditional farmers, especially by women, to provide families with food security of high nutritional value (Nelson, 2016). These species may be widely distributed beyond their centers of origin but tend to occupy special niches in the local production and consumption systems. They are important for the subsistence of local communities, yet remain poorly documented and neglected by the mainstream research and development activities. Although it is difficult to precisely define which attributes make a crop either underutilized or an orphan, they often display a linkage with the cultural heritage of their places of origin, are poorly documented as to their cultivation and use and are adapted to specific agro-ecological niches and marginal land with weak or no formal seed supply systems. They can also be used as animal feed and in other agricultural applications generating income to resource-poor farmers (Tadele, 2009). However, due to the lack of economic importance, they have been neglected by both the international scientific community and the industry when compared to commodities such as rice, corn and wheat (Foyer et al., 2016). This neglect has also resulted in a lack of genetic improvement resulting in inferior yield in terms of both quality and quantity. 
They have been investigated as a future food crop predominantly by scientists in developing countries with the help of international organizations, for example the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT).
The Kirkhouse Trust, which focuses on crop improvement research using modern molecular methods to improve locally important legume crops, also lists as important orphan legumes with the potential to be developed into crops horse gram (Macrotyloma uniflorum), moth bean (Vigna aconitifolia), Dolichos (Lablab purpureus), and marama bean (Tylosema esculentum) (http://www.kirkhousetrust.org/orphanlegumes.html#.V0AvnfmLTIU). Cowpea is an example of an orphan legume crop with a wide spectrum of importance for both human and animal health, and many cowpea-based products have also found their way into the cosmetic, food, textile and pharmaceutical industries due to the legume's therapeutic properties (Singh and Basu, 2012). Cowpea and other orphan legumes, such as "Dolichos" bean, which is an ancient legume crop (Murphy and Colucci, 1999), are also valuable to humans with limited access to animal protein. Cowpea seeds have a high protein content (25% of dry weight) and the protein content of cowpea leaves annually consumed in Africa and Asia is equivalent to 5 million tons of dry cowpea seeds (Steele et al., 1985). Cowpeas and other orphan legume crops are also a rich source of vitamins and minerals (Singh et al., 2003), and the high lysine content of cowpea supplements cereal-based diets low in lysine (Lambot, 2002). The importance of legumes in human nutrition will likely grow because the increasing world population will create a substantial demand for additional protein.
Orphan crops are generally also better adapted than the major world food crops to the extreme soil and climatic conditions existing in many parts of the world. Orphan legumes therefore provide valuable information on traits useful for survival under harsh environmental conditions such as drought. However, despite this potential, drawbacks for crop development and wider cultivation of these orphan legumes frequently include low seed set and the lack of efficient harvesting techniques. Marama bean, for example, has a limited seed yield and low propagation rates and is a ground creeper covering large areas (Figure 1), making it difficult to harvest (Keegan and Van Staden, 1981). To our knowledge, no review has yet addressed why certain characteristics render orphan legumes exceptional and whether such characteristics can be transferred to major legume crops. We focus here on characteristics of marama bean and grass pea, as two representatives of orphan legumes that survive under low water availability and may also produce pharmaceutically interesting compounds as a stress response. In addition, for orphan legumes "omics" data are already, or will soon be, available. In this review, the existing knowledge and recent developments in "omics" analysis will be highlighted. More extensive research in these "omics" areas will possibly allow orphan legumes to more rapidly join the world's league of major food crops.

Tolerating drought stress
Higher thermal and drought tolerance allows orphan legumes to grow in hot and dry climates under rain-fed conditions on marginal land. Such growth under drought and on marginal land is either barely achievable, or not achievable at all, with major food crop plants (Muehlbauer et al., 1985;Tilahun and Schubert, 2003;Thomas et al., 2004;Munoz-Perea et al., 2006).
There is an interest in investigating the basis of their high tolerance to drought and heat with a view to possibly transferring specific genes coding for unique characteristics to major food crop species. A plant can respond to drought by modifying its morphology and physiology as an adaptive mechanism. These drought responses include reduction of leaf expansion, closing of stomata, shifting the ratio of shoot to root biomass and changing root characteristics (Bultynck et al., 2004; Franks, 2013; Koevoets et al., 2016). Legume performance under drought stress is also closely related to root system development as well as root system distribution (Newman and Mosser, 1988; Gaur et al., 2008). Greater root depth and root biomass allow a more efficient extraction of any available soil moisture (Blum et al., 2011; Fenta et al., 2014), and root growth in legumes increases during the vegetative growth period but slows down when seed filling begins (Gregory, 1988; Abdelhamid, 2010). Root hydraulic conductivity, which depends on the diameter and the distribution of the meta-xylem vessels, has also been found to provide better drought tolerance in legumes. For example, the root length density is much lower in the grain legume chickpea than in barley, but better hydraulic conductivity allows the legume to absorb water more efficiently (Thomas et al., 1995).
Certain tepary bean lines also have deeper roots with the greatest root mass in the deepest soil profile. Such characteristics result in improved adaptation for more efficient water uptake.
The bean also has small leaves to reduce water use and lower stomatal conductance as adaptive characteristics for water conservation (Mohameda et al., 2002).
However, the question that still needs to be answered is what is the basis of the ability to withstand severe environmental stress conditions? The existing literature provides information about measurement of drought tolerance by applying standard morphological and physiological tools to identify more drought-tolerant orphan legumes but without identifying any specific unique traits. Although orphan legumes respond in many ways like other crop plants to drought, the two examples considered here, the marama bean and grass pea, illustrate why orphan legumes might be exceptional in their ability to survive in stressful environments.
The marama bean, in which we have particular interest, has the potential to become a future crop well adapted to arid conditions in Africa, since the bean can survive under unfavorable conditions such as those in the Kalahari desert (Powell 1987; Nepolo et al., 2009). Marama bean protein and oil content is comparable to that of soybean (Glycine max) and groundnut (Arachis hypogaea) (Bower et al., 1988). Marama grows rapidly under non-stress conditions with runners extending along the ground (Mitchell et al., 2005), is highly branched and produces a great number of leaves (Figure 1). The marama bean does not possess nodules for nitrogen fixation. This possibly avoids nitrogen restriction given the known sensitivity of nodules and nitrogen fixation to drought conditions (Serraj et al., 1999). The marama bean is not particularly drought-tolerant, despite growing in the desert, since its vegetative growth is affected by drought. However, marama has a tuber serving as a water reservoir. The plant, as a possible drought avoider, can therefore survive on both stored water and assimilates in the tuber (Figure 1). This allows rapid vegetative re-growth under more favorable conditions (Travlos and Karamanos, 2008). There is also evidence that marama is endowed with several drought avoidance mechanisms, which, in parallel with osmotic adjustment, enable it to survive under very harsh conditions (Karamanos and Travlos 2012).
Grass pea is cultivated worldwide and is one of the cheapest sources of dietary protein in the developing world (Enneking, 2011) and is an excellent example of a drought-adapted orphan legume with high water use efficiency (Sekhon et al., 2010). No legume other than grass pea has, in fact, ever served as a staple food. The grass pea is among the priority crops in Kew's Millennium Seed Bank and the Global Crop Diversity Trust project "Adapting Agriculture to Climate Change". A primary objective of this "Adapting Agriculture to Climate Change" project is to collect and protect the genetic diversity of a portfolio of plants with traits required for adapting the world's most important food crops to climate change.
Such adaptation is considered a key component in securing the world's future food production (Dempewolf et al. 2014). For more than 100 million people in Asia and Africa, the grass pea is a traditional crop, which is easy to cultivate under stressful growth conditions (Vaz Patto and Rubiales, 2014). The grass pea is an 'insurance crop' still producing reliable yields when all other crops fail due to drought. In contrast to marama bean, grass pea possesses a combination of traits that allow it both to escape drought and to avoid dehydration. The plant first matures early, a characteristic that allows it to escape terminal drought.
Dehydration is avoided through both a reduction in the green leaf area, to keep a higher water status for maintaining photosynthesis, and a reduction in stomatal conductance. Grass pea reduces flower and pod production under stress and concentrates its resources in a small number of surviving pods (Gusmao, 2010). Another important feature to note is that the threshold soil water content at which seed set is reduced coincides with that at which the leaf stomatal conductance and photosynthetic rate start to decrease (Kong et al., 2015). However, any physiological responses due to drought only start when that certain soil water threshold has been passed (Zaman-Allah et al., 2011). Further, the decrease in photosynthesis as well as the cessation of flower and pod production is more prominent in grass pea than in the grain legume chickpea (Kong et al., 2015). Thus, despite chickpea maintaining a better flower and pod production than grass pea under drought, such drought-exposed chickpea pods do not produce any seeds under such conditions, while grass pea does produce seed yield.

Producing useful compounds
Plants, in order to survive stressful environments, can synthesize a range of secondary metabolites and proteinaceous inhibitors in response to an environmental stress (Bora, 2014). Marama bean is an example of an orphan legume producing, as a unique characteristic, a serine protease inhibitor, and 10.5% of total protein can consist of this inhibitor (Nadaraja et al., 2010). A possible function of the inhibitor is as an anti-feedant. However, the possibility that the inhibitor also functions in controlling proteolytic events in marama bean, associated with growth in an arid area with little water availability, has not yet been studied (Kunert et al., 2015). Such protective compounds can also significantly contribute to human and animal health. The marama serine protease inhibitor strongly inhibits elastase activity. As part of the chymotrypsin-like clan, human elastase is involved in various inflammatory disorders including pulmonary emphysema, sepsis, arthritis, nephritis and certain skin diseases (Boonart et al., 2010). Marama serine protease inhibitor therefore has potential not only as a therapeutic for these diseases but also as a skin cream additive. Overall, these non-food properties add value to the marama bean and provide strong support for further developing marama into an important food and cash crop in arid areas. The marama tuber also has interesting starch characteristics. The starch content of the tuber can be up to 10% on a fresh weight basis, compared with, for example, 15% in potato tubers (Nepolo et al., 2015).
Evaluation of the physical properties of marama starch by rapid visco-analysis indicated a high viscosity with a potential to form strong gels, a property which might be of interest to the food industry.
Grass pea is highly resistant not only to pests and diseases but also to abiotic environmental stresses. Grass pea produces the non-protein amino acid β-N-oxalyl-L-α,β-diaminopropionic acid, or ODAP, in addition to phenolic and flavonoid compounds and antioxidants (Rao et al., 1964; Enneking, 2011; Talukdar, 2012). The ODAP content increases in response to stress and the compound can scavenge hydroxyl radicals produced by oxidative stress (Gongkea et al., 2001). Despite this potential, ODAP has a major drawback when grass pea is consumed by humans, since ODAP is a neurotoxin and causes a paralysis of the lower body known as "lathyrism" (Singh and Rao, 2013). Lathyrism develops when grass pea seeds are consumed as the primary protein source for a prolonged time, whereas no lathyrism occurs when grass pea is consumed as part of a normal diet rather than as its only component (Singh and Rao, 2013). There is evidence that ODAP can be detoxified in humans but not in animals, but the disease still exists in Eastern Africa (Eritrea and Ethiopia). The development of low-toxin varieties would make a significant contribution to increasing the impact of grass pea on the food security of millions of people. Unfortunately, conventional plant breeding techniques for the development of such low-ODAP seeds have still not been successful (Girma and Korbu, 2012). This lack of breeding success might, however, be overcome by applying the newly developed gene editing capabilities of CRISPR/Cas systems to specifically modify the ODAP pathway once it has been identified (Cong et al., 2013).
The negative perception of grass pea has recently also changed, because grass pea is the only known dietary source of L-homoarginine (Singh and Rao, 2013) and because of the properties of L-ODAP. L-homoarginine provides benefits for the treatment of cardiovascular diseases and for overcoming hypoxia associated with cancer tumor development (Singh and Rao, 2013). L-homoarginine acts as a substrate for endogenous nitric oxide (NO) production. NO, a signaling molecule, has an important role in cardiovascular diseases and hypertension control (Jyothi and Rao, 1999). Since L-homoarginine is also a poor substrate for arginase, the compound persists longer in circulation than arginine. L-ODAP might also stabilize the hypoxia inducible factor-1 (HIF-1), which changes the transcriptional response of tumors under hypoxia (Ziello et al., 2007). In vitro studies with human neuroblastoma cells have already demonstrated that L-ODAP can stabilize (nuclear translocate) HIF-1 and prevent the consequences of hypoxia (Ke and Costa, 2006; Jammulamadaka et al., 2011). L-ODAP also activates protein kinase C. Although known kinase C activators can be neurotrophic with neuroprotective properties (Nelson and Alkon, 2009), most kinase C activators are apoptotic. Therefore, there is a constant search for non-apoptotic kinase C activators. If L-ODAP does have non-apoptotic characteristics, the compound might have potential for application in neurological disorders such as Alzheimer's disease.

Omics-enabled improvement of orphan legumes
The relationship between the genotype and phenotype is complex, which makes the prediction of the phenotype based on the genotype challenging. The level of information needed to implement genomic-assisted breeding has steadily increased as the cost of acquiring nucleic acid sequence information has rapidly decreased. The idea of a few years ago that a genome sequence would be an enabling first step (Varshney et al., 2009 and 2012b; Young and Bharati, 2012) has, however, been replaced by the realization that a single genomic sequence is only a snapshot of part of the diversity within a species. This relationship is further complicated when the experimental material is composed of land races, which most orphan legumes are, rather than well-documented inbred lines. These land races have different combinations of useful genes (or alleles), making it more difficult to identify those to target for selection and incorporation into new varieties. In addition, a robust phenotyping platform is essential to ensure that the genotype and environment interaction can be adequately evaluated and that the same phenotype is actually being studied.
One of the orphan legumes with the most applied molecular data is cowpea. The genomic information for cowpea has been used in the evaluation of quantitative trait loci (QTL) for a number of traits and for identifying diversity of land races. The development of an integrated genetic map using molecular markers was based on recombinant inbred lines (Timko and Singh, 2008). These molecular markers, and their subsequent extension, have been the basis of QTL identification and subsequent marker-assisted selection for a number of traits. An important resource in the identification of QTL, especially using SSR markers, has been the development of a series of recombinant inbred lines. This plant material can be phenotyped and DNA isolated from the individual phenotyped plants.
Any of the traits that are segregating from the initial parents can then be followed and be associated with particular regions of the chromosomes. The traits that have been characterized to date include those for flowering time (Andargie et al., 2013), domestication (Andargie et al., 2011), and floral scent compounds (Andargie et al., 2014). These studies were only made possible by the previous work in developing the recombinant inbred lines that were then phenotyped, but this particular population only captured a small amount of the natural variation in cowpea. For a number of crop species, association mapping populations are being developed that capture the maximum diversity such as the multi-parent advanced generation inter-cross (MAGIC) populations in rice (Bandillo et al., 2013). Four multi-parent populations were created containing desirable traits for biotic and abiotic stress tolerance, yield, and grain quality in order to fine map QTLs for multiple traits and to use the highly recombined lines in breeding programs.
In sunflower, a similar association mapping population consisting of 288 lines has been described by Mandel et al. (2011). In this case a diverse collection of cultivated sunflower lines was genotyped with simple-sequence repeat (SSR) markers distributed across the complete genome and the data were then used to identify subsets of lines that captured maximum diversity present in the initial population. These types of association mapping populations need to be constructed and characterized for the important orphan legumes incorporating as much of the germplasm diversity as possible. Such populations will then enable the identification of loci underlying important stress tolerances within these crops.
Ultimately, they will permit the isolation of the genes underlying these traits, for their possible incorporation into other crop plants. However, the development of such association mapping populations for the orphan legumes will require a long, dedicated, funded international collaborative effort, first to develop the appropriate marker set. Next, the genotyping of large populations of land races needs to be done to reveal the extent of genetic diversity followed by the inter-crossing of the selected lines and the extraction of the inbred material for phenotyping. As noted in the next section, the use of SSR and/or single nucleotide polymorphisms for diversity analysis and association mapping may be superseded by genotyping by sequencing with the falling costs of next generation sequencing data.
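The idea of choosing a subset of genotyped lines that captures as much of the allelic diversity as possible can be illustrated with a minimal sketch. It is not the procedure used by Mandel et al. (2011) or in any of the studies cited here; it assumes a simple greedy rule (repeatedly add the line contributing the most alleles not yet represented), and the land race names, SSR marker names and allele sizes are hypothetical toy values.

```python
from typing import Dict, List, Set, Tuple

# An allele is identified by (marker name, allele size in bp).
Allele = Tuple[str, int]
Genotypes = Dict[str, Set[Allele]]

# Hypothetical toy data: four land races scored at three SSR markers.
landraces: Genotypes = {
    "LR1": {("ssr1", 150), ("ssr2", 200), ("ssr3", 98)},
    "LR2": {("ssr1", 150), ("ssr2", 204), ("ssr3", 98)},
    "LR3": {("ssr1", 156), ("ssr2", 200), ("ssr3", 102)},
    "LR4": {("ssr1", 150), ("ssr2", 200), ("ssr3", 102)},
}


def select_core_subset(genotypes: Genotypes, size: int) -> List[str]:
    """Greedily pick up to `size` lines, each time adding the line that
    contributes the most alleles not yet represented in the subset."""
    chosen: List[str] = []
    covered: Set[Allele] = set()
    remaining = dict(genotypes)
    while remaining and len(chosen) < size:
        # Iterate over sorted names so that ties are broken alphabetically.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no remaining line adds a new allele
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


def allele_coverage(genotypes: Genotypes, subset: List[str]) -> float:
    """Fraction of all alleles in the collection that the subset captures."""
    all_alleles: Set[Allele] = set().union(*genotypes.values())
    if not all_alleles:
        return 0.0
    captured: Set[Allele] = set().union(*(genotypes[name] for name in subset))
    return len(captured) / len(all_alleles)


if __name__ == "__main__":
    core = select_core_subset(landraces, size=3)
    print(core, allele_coverage(landraces, core))
```

A greedy rule is only a heuristic and does not guarantee the smallest possible subset. It is used here because it is easy to follow and makes explicit the trade-off between the number of lines carried forward and the fraction of alleles retained.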
Genomics contributions to the improvement of orphan legumes have been made possible by the steadily decreasing cost of DNA sequencing. The combination of high-throughput DNA sequence data from the Illumina platform with long-read information from PacBio and new linking technologies, such as those from Dovetail Genomics (https://dovetailgenomics.com), has placed high-quality genomic assemblies with long contiguous segments within reach for most plant species (Table 1). Once the basic genome has been assembled, the characterization of the genome space can be extended at relatively low cost through reduced representation libraries, such as those used in the genotyping by sequencing approach (Elshire et al., 2011), although the genotyping by sequencing approach is not contingent on a reference genome being available.
This approach increases the precision of identifying differences across the whole genome more rapidly and at a lower cost than earlier DNA marker technologies. For species that are essentially outcrossing, which is the case for marama bean, the levels of heterozygosity will further complicate the identification of useful alleles in developing genomic selection schemes. An alternative method of reducing the complexity of the genome is to focus on the expressed portion of the genome. RNA sequencing has been a route for identifying new genetic markers for some orphan legumes such as Bituminaria bituminosa (Pazos-Navarro et al., 2011). The availability of new tools for both the de novo assembly of transcriptomes and the alignment to newly assembled genomes will allow the identification of differential gene expression in response to stress. However, since the transcriptome is developmentally controlled, the anatomical site of any stress tolerance needs to be identified so that the appropriate tissues can be sampled. Interpretation of such RNA-Seq studies is still dependent on an understanding of the genetic underpinnings of the stress tolerance in order to distinguish causative from consequential transcriptional variations. Finally, the transcriptome represents the expressed portion of the genome while the proteome is a reflection of that subset of the transcriptome that is translated into proteins. Proteomics in orphan legumes is still rare, with grass pea being one of the few with such studies (Rathi et al., 2015).
Differential proteomics to identify stress-responsive molecules would require a significant investment in time and resources, so that the development of genetic stocks that would be most informative for such studies is a pre-requisite.

Learning from orphan legumes and where to go
Orphan legumes are an interesting source for the study of stress tolerance, such as drought tolerance, since drought is one of the consequences of the predicted climate change. However, drought-tolerant orphan legumes probably do not have unique traits, but rather have specific combinations of common traits, with some more prominent than others. They can also possess traits unusual for a legume, such as producing a tuber. In their sum all these traits contribute to the final greater level of drought tolerance. Developing an understanding of both the specific traits and the importance of combination of traits might ultimately help us to improve the selection of, or even the engineering of, plants to recreate the most efficient combinations of such traits in many species. Major food crops simply have lost some of these traits or combinations of traits, due to selection for best performing cultivars under a given growth condition, resulting in a loss in both genetic diversity and valuable traits (Li et al., 2013;Valliyodan et al., 2016). Therefore, identifying a single gene for providing tolerance against a stress like drought affecting many plant processes is rather unlikely. The majority of already genetically engineered plants carrying a single gene to provide better drought tolerance exhibited simply a delayed onset of drought stress effects (Lawlor, 2013).
Orphan legumes are still an untouched treasure trove for compounds not only providing possible stress tolerance to the plant but also capable of contributing to human health. As nutraceuticals they might serve in functional food to manage diseases and disorders associated with a changed life-style, or as a basis for new drug development in areas of cardiovascular physiology, hypoxia as well as cancer (Rao, 2011;Singh and Rao, 2013;Van Wyk et al., 2016). Grass pea is an excellent example with potential for a "functional food" by being the only known source for L-homoarginine. This compound might prevent hypertension, a major risk factor for heart attacks and strokes. However, rather limited research has been so far conducted to identify and characterize other potent stress response compounds produced in orphan legumes.
Finally, the characterization of orphan legumes on the "omics" level is still in its infancy. They remain largely unexplored on the genomic, transcriptomic and proteomic levels, although efforts such as the African Orphan Crops Initiative (http://africanorphancrops.org) are starting to fill the genomics information gap. In spite of the rapid development of new "omics" capabilities, enhanced investment in the molecular characterization of these orphan legumes will still be necessary to explore their full potential. A first good example of how "omics" can contribute is represented by pigeon pea. This orphan legume crop is mainly grown by poor farmers and is known as the "poor people's meat" because of its high protein content. Pigeon pea is still the exception regarding "omics" due to the recent completion of its genome sequence, and it therefore has the potential to soon enter the "world's league" of major food crops (Varshney et al., 2011). Pigeon pea is also among the most drought-tolerant and nutritious orphan legume crops (for an overview see: Odeny, 2007). This legume withstands drought due to its deep roots and the osmotic adjustment in the leaves. With its unique polycarpic flowering habit, the plant also sheds reproductive structures under stress (Mligo and Craufurd, 2005). An advanced understanding of the pigeon pea genome might have a significant impact on improving crop productivity by making it easier to identify the genes associated with the specific characteristics that render pigeon pea a highly drought-tolerant plant. So far, over 40,000 pigeon pea genes have been identified and several hundred of these genes are associated with drought tolerance.
Many orphan legume crops will soon have a genome sequence (www.Africanorphancrops.org). They might then no longer be described as crops that have been bypassed by the molecular genetic revolution, but they might still be classified as orphan crops due to the paucity of resources assigned to their development. By focusing resources on these orphan legumes, important genes for both increasing stress tolerance and for human health can be uncovered and subsequently implemented in many new areas. Extending this initial surge of information into practical applications will, however, require human capacity building in the areas of molecular technologies and bioinformatics, as well as the infrastructure to continue to build on the preliminary information. For marama bean, the authors are currently addressing some of these needs by incorporating relevant research projects within the undergraduate curriculum at Case Western Reserve University to advance the characterization and improvement of this orphan legume into a useful crop. Within this framework, a whole genome sequence has been assembled from Illumina reads, and the assembly is sufficiently robust to identify genes of interest (Cullis, unpublished).
We have already used the initial assembled contigs to identify marama cysteine proteases and cystatins with unique amino acid compositions based on the conserved regions of these proteins (Cullis and Kunert, unpublished). These novel proteins can already be expressed and their properties evaluated to determine whether they have potential for use in other crop improvement, in human health applications or both. We are also investigating the presence of genes necessary for nodulation and for mycorrhizal associations, to understand why this legume does not nodulate but is still able to thrive in poor soils. The next steps in the project will include a survey of the diversity of marama bean across southern Africa, in order to construct and analyze multi-parent advanced generation inter-cross (MAGIC) populations for marama bean that can serve as an important resource for the isolation and transfer of genes for stress resilience and so facilitate crop improvement (Pandey et al., 2016). This population will be grown, phenotyped and genotyped to identify the possible set of lines for undertaking association mapping for important domestication, yield and stress tolerance traits.
As with all plant breeding projects, the time interval between initiation and the development of the experimental material requires patience and continual funding. However, since we are using this project to involve a succession of undergraduate and graduate students in a course-based research experience, the expectation is that the effort will be successful in the long term. The germplasm improvement and release to resource-poor farmers in the region will be the ultimate demonstration of the success of this approach. Such an approach could benefit all the orphan legume crops as an improvement strategy, and the recruitment of students in the developed countries would help alleviate, in the short term, the lack of human capital to analyze the data arising from such genomics projects.