U.S. Legume Crops Genomics Workshop
30-31 July 2001
Hunt Valley, Maryland
White Paper
Convened by:
H. Roger Boerma, Distinguished Research Professor, University of Georgia, 111 Riverbend Road, Athens, GA 30602 (phone, 706-542-0927; email, rboerma@uga.edu)
Judy St. John, Associate Deputy Administrator, Agricultural Research Service, Crop Production, Product Value and Safety, Room 4-2204, George Washington Carver Building, 5601 Sunnyside Avenue, Beltsville, MD, 20705 (phone, 301-504-6252; email, jsj@ars.usda.gov)
Jennifer Yezak Molen, consultant, AgSource, Inc., 600 Pennsylvania Avenue, SE, Suite 320, Washington, DC 20003 (phone, 202-969-8902; email, jyezak.molen@gordley.com)
On 30-31 July 2001 twenty-six legume scientists with knowledge of structural and functional genomics, DNA markers, transformation, bioinformatics, and legume crop improvement participated in a workshop hosted by the United Soybean Board, the National Peanut Foundation, the USA Dry Pea and Lentil Council, and the USDA-ARS. Over the course of two days, the scientists reached consensus on priority genomics research on U.S. legume crops. The following high-priority areas of critical research were identified:
Genome Sequencing of Strategic Legume Species
II. ROLE OF LEGUMES IN AGRICULTURE
Legumes, together with cereals, have been fundamental to the development of modern agriculture. Since the dawn of civilization, many legume species have been instrumental in supplying human food (e.g., soybean, common bean, pea, peanut, lentil, chickpea), edible oils (peanut, soybean), and animal fodder and forage (alfalfa, clover). Legumes are second only to grasses in importance for human and animal dietary needs. Worldwide, legumes are grown on about 15% of the arable land (270-300 million hectares). They provide 33% of humankind's nutritional nitrogen requirements. Last year in the U.S., soybean and alfalfa production was 72 and 84 million metric tons, respectively. Their total estimated direct value was $20 billion (soybean = $13 billion, alfalfa = $7 billion). An equally important added value for U.S. agriculture is that legumes in symbiosis with soil bacteria fixed some 17 million metric tons of atmospheric nitrogen worth about $8 billion. This unique ability of legumes reduces the dependence of farmers on expensive chemical fertilizer, reduces our dependence on petroleum products, and improves soil and water quality. Because crop legumes can fix nitrogen, they are critical for subsistence farms in developing countries that do not normally have access to nitrogen fertilizer. In these areas, legumes frequently provide 66% of the nutritional needs for humans and are especially important as a substitute for animal protein.
One of the driving forces behind sustainable agriculture and protection of the environment is effective management of nitrogen in farming systems. Intensive farming practiced in developed countries is predicated on using large amounts of nitrogen fertilizer. This practice has led to significant deterioration of water, soil, and air quality. In addition, it is estimated that production of 1 metric ton of ammonia requires approximately 1,185 cubic meters of natural gas. As the world's population approaches 10 billion within the next half century, nitrogen needs for increased crop production will exacerbate current environmental problems. Increased cultivation of legumes will be required to ameliorate environmental degradation, reduce the depletion of nonrenewable resources, and provide adequate nitrogen for the population. Cultivation of legume crops results in a significant reduction in the use of nitrogen fertilizer. It has been estimated that growing a nitrogen-fixing legume in rotation with some crops decreases the required application of nitrogen fertilizer by 40%. Moreover, nitrogen volatilization into the atmosphere and nitrogen leaching into groundwater are reduced by cropping legumes. In addition, the nitrogen fixed by legumes is equivalent to sequestering a further 800 million metric tons of CO2. Estimates indicate that in the U.S., simple rotation of a legume with corn could replace 12 to 15% of the nitrogen fertilizer needs of corn, resulting in an on-farm savings in excess of $500 million. Legumes clearly play a major role in protecting human health, sustaining farm profitability, and mitigating environmental problems.
III. SCIENTIFIC STATUS OF LEGUME GENOMICS
Since Mendel, crop legumes have been the focus of intensive genetic studies to improve yield, quality, and resistance to biotic and abiotic stress, and to extend geographic range. As a result, selected legume species have well-studied genetic systems characterized by classical biochemical and physical markers, cytogenetic analysis, chemically induced mutations, and DNA marker-based genome linkage maps. Yet in many of these well-defined systems, comprehensive genetic analysis is limited by the large size of the genomes of legume crops. Furthermore, few of the basic tools required for modern genome analysis, including polymerase chain reaction (PCR) based DNA markers, expressed sequence tag (EST) databases, and bacterial artificial chromosome (BAC) libraries, have been applied to most legume species. No crop legume has an integrated genetic, physical, and transcript map. In addition, efficient transformation has been developed only recently, and in only two legume species. It is of paramount importance that a concerted research initiative be directed towards the development of tools that will permit application of modern genome analysis and manipulation technology to the genetic improvement of crop legumes.
The genomes of most crop legumes are large and relatively complex. For example, soybean is an ancient polyploid and alfalfa an autotetraploid, with genome sizes of ~1200 megabases (Mb) and ~1600 Mb, respectively (for comparison, the human genome is about 3000 Mb). The genome size of cultivated peanut is 2800 Mb and that of pea is 4000 Mb. Such large sizes significantly complicate the development of ordered physical maps of the genome, as well as the identification and location of important genes. The large genome sizes make complete sequencing financially tenuous. Syntenic relationships within botanical families make it possible to use plant species with much smaller genomes to facilitate understanding of those with large genomes. For example, the recent complete sequencing of the smaller genomes of Arabidopsis thaliana (128 Mb) and rice (425 Mb) provides the platform for genome analysis of more complex species such as canola, broccoli, corn, and wheat. Information from the Arabidopsis and rice genomes is rapidly being translated across more complex species to enhance disease and pest resistance, yield, and compositional quality of the seed.
In order to expedite and simplify genome analysis of crop legumes, it has been proposed that parallel analysis of a legume with a smaller genome be considered. Recently, studies sponsored by the NSF Plant Genome Program have shown that the barrel medic, Medicago truncatula, would be an ideal candidate for parallel analysis with crop legumes. Barrel medic is a self-compatible diploid with a small genome (~450 Mb) and a rapid generation time, and it appears to have synteny with alfalfa and also, to some degree, with pea and soybean. Comparative analysis with other legume crops will provide additional advantages through complementation of genetic knowledge available in the different legume species. Soybean is a major crop, with significant prior study of genetics and crop and seed physiology. Common bean (Phaseolus) benefits from relatively well-developed genetic studies and ample polymorphism within the cultigen. Peanut possesses a unique reproductive physiology, which can contribute to a greater understanding of crop reproductive biology.
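To make the scale argument concrete, the following minimal Python sketch tabulates the genome sizes quoted above and expresses each crop genome as a multiple of the barrel medic genome. The sizes are the approximate figures given in the text; the dictionary and function names are illustrative assumptions, not part of any existing tool.

```python
# Approximate genome sizes in megabases (Mb), as quoted in the text above.
GENOME_SIZE_MB = {
    "Arabidopsis thaliana": 128,
    "rice": 425,
    "Medicago truncatula": 450,
    "soybean": 1200,
    "alfalfa": 1600,
    "peanut": 2800,
    "pea": 4000,
}


def fold_larger(species: str, reference: str = "Medicago truncatula") -> float:
    """Return how many times larger the genome of `species` is than `reference`."""
    return GENOME_SIZE_MB[species] / GENOME_SIZE_MB[reference]


if __name__ == "__main__":
    # Print each crop legume genome relative to the proposed model legume.
    for name in ("soybean", "alfalfa", "peanut", "pea"):
        print(f"{name}: {fold_larger(name):.1f}x the barrel medic genome")
```

Because the quoted sizes are approximate, the resulting ratios should be read only as rough indications of relative sequencing effort.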
The 26 scientists (referred to as the Legume Crop Working Group, or LCWG) reached consensus on the following six areas of critical genomics research. The order of listing is not intended to indicate order of importance or relative priority.
Over the past decade, biological research has been transformed by the ability to sequence entire genomes of living organisms. The application of this technology to human medicine is revolutionizing the pharmaceutical industry. A similar but largely untapped potential exists in the agricultural sciences. In the case of legumes, many important traits have been identified at the genetic level by breeders and geneticists. Whole genome sequencing will reveal the molecular blueprint that underlies these valuable characters.
The LCWG recommends that genome sequencing projects be initiated on a few strategically chosen legumes. The species recommended by the working group are intended to span the phylogenetic diversity of these crop species, and to provide a platform for the methodical exploration of legume genomics. We recommend (1) sequencing the gene-rich regions of three warm season grain legumes, Phaseolus vulgaris (common bean), Glycine max (soybean), and the phylogenetically more distant Arachis hypogaea (peanut) and (2) determining the complete genome sequence of Medicago truncatula (barrel medic) as a specific reference for the cool season legume species (e.g. pea, lentil, and alfalfa) and as a structural model for other crop legume genomes.
The impact of this research will be to integrate genetic and functional information across legume crops. The resulting knowledge would enable the more precise development of improved crop cultivars by classical and molecular breeding methods, and it would greatly accelerate the pace of research to determine the molecular basis of traits of biological and economic importance.
Physical maps for carefully selected representatives of the legume family such as soybean (Glycine spp.), medic (Medicago spp.), peanut (Arachis spp.), and common bean (Phaseolus spp.) are important primary research tools, and also key stepping-stones toward longer-term goals. A necessary prerequisite for taking advantage of physical maps is the development of BAC libraries in a broad range of legume species. Physical mapping is an economical means to plot the order of most genes in a genome, and to reveal gene-rich regions. A detailed physical map is central to identifying specific candidate genes that may account for an agriculturally important trait and to developing breeder-friendly markers for effective selection of such traits. Physical maps advance the integration of maps for different crops, leveraging investments in genomic research. In the short term, the physical map provides a framework to organize partial sequence information. In the long term, rigorous physical maps composed of several inter-related data types provide the robust framework needed to assemble a complete genomic sequence.
Transcript maps can be derived using physical maps that define the position of the known legume genes (ESTs) and their paralogs. Transcript maps will identify the gene-rich regions of different legume crops and allow their comparison. These transcript maps will result in the integration of genetic and functional information across legume crops and provide significant cost and efficiency benefits for the sequencing efforts.
Current and planned structural genomic efforts will create a wealth of legume gene sequences. Efforts must be made to move beyond this structural information to examine gene function. The LCWG recommends research on the application of functional genomic tools (e.g., DNA microarrays, proteomics, metabolomics) to the major legume crop species. These tools have tremendous potential to aid understanding of quantitative and value-added traits and biotic and abiotic stress resistances. These studies should utilize the biodiversity available within each legume species.
The majority of plant species, including legume species, lack efficient methods for gene transfer, which is critical for gene functional analysis. Although several promising efforts are underway to overcome this limitation in legumes, it is not clear at present which of these approaches is most efficient. Therefore, a series of studies to evaluate these methods and ultimately choose one or more for large-scale, concerted efforts should be initiated. In addition, for those species with a working transformation system, the LCWG recommends the commitment of funds for the development of methods for the evaluation of gene function such as gene knockouts or gene down-regulation systems.
DNA markers are some of the most powerful tools to come out of the genomics revolution. DNA markers form the foundation of genetic linkage mapping. They are the basis of marker-assisted breeding, enable genome comparisons between different species, and provide tools for assessing molecular variation within and between species. The LCWG recognizes the value of DNA markers to all crop legume species and the vital need to leverage knowledge about DNA markers from better-characterized crops to others that are less well studied. In certain instances when knowledge of better-characterized species does not provide adequate genome coverage of a specific legume crop, the development of species-specific markers will be required.
To take advantage of DNA marker technology, a core set of at least 1000 sequence-tagged sites (STSs) that are universal among all legume species should be developed. For legume crops with large chromosome numbers and low levels of DNA polymorphism, such as soybean and peanut, more than 1000 STSs will be needed to provide meaningful comparative maps. The core set will begin with strategically chosen PCR-based markers that have already been developed, especially in pea, soybean, and barrel medic. Eventually, the STS core set will grow by mining legume sequence data in order to find highly conserved sequences shared by all legumes.
The legume STS core set will immediately provide powerful tools for trait mapping and marker-assisted breeding in all legume species, including those with few marker resources available today. The STS core will also interconnect the genetic maps of different species, revealing cases where genome organization is highly conserved and where rearrangements have occurred. The STS core will also simplify the important task of fingerprinting germplasm collections and analyzing molecular evolution within the legume family.
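How a shared marker set can interconnect maps may be illustrated with a short Python sketch. It assumes, purely for illustration, that each species' map of one linkage group is reduced to an ordered list of STS identifiers; it then reports which neighboring marker pairs are adjacent in both maps (conserved organization) and which are not (candidate rearrangement breakpoints). The function names and the marker labels used in the demonstration are hypothetical placeholders, not real markers or map data.

```python
from typing import List, Tuple


def _shared_in_order(order: List[str], shared: set) -> List[str]:
    """Keep only markers present on both maps, preserving map order."""
    return [marker for marker in order if marker in shared]


def compare_marker_order(
    order_a: List[str], order_b: List[str]
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Split neighboring marker pairs of map A into conserved pairs and breakpoints.

    A pair is 'conserved' if the two markers are also neighbors on map B,
    in either orientation; otherwise it is reported as a breakpoint.
    """
    shared = set(order_a) & set(order_b)
    a = _shared_in_order(order_a, shared)
    b = _shared_in_order(order_b, shared)
    neighbors_b = {frozenset(pair) for pair in zip(b, b[1:])}

    conserved, breakpoints = [], []
    for pair in zip(a, a[1:]):
        (conserved if frozenset(pair) in neighbors_b else breakpoints).append(pair)
    return conserved, breakpoints


if __name__ == "__main__":
    # Hypothetical marker orders: map B carries an inversion of STS3-STS4.
    map_a = ["STS1", "STS2", "STS3", "STS4", "STS5"]
    map_b = ["STS1", "STS2", "STS4", "STS3", "STS5"]
    conserved, breakpoints = compare_marker_order(map_a, map_b)
    print("conserved neighbors:", conserved)
    print("breakpoints:", breakpoints)
```

Comparing neighbor pairs rather than absolute positions is a deliberate simplification: it ignores map distances and treats an inverted segment as conserved internally, flagging only its two ends.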
Biodiversity is the raw material for the genetic improvement of all crops. If we are to use the genetic diversity naturally present in germplasm, it is important to determine where the greatest diversity lies and how it can be applied in crop breeding. A number of crop species (pea, common bean, cowpea) have been identified as genetically diverse, whereas others (chickpea, soybean, peanut) possess a narrow genetic base. Sequencing of alleles in one or more of the genetically diverse crops may provide important diversity for crop improvement. Additional studies on phylogenetic relationships among legumes using a variety of genome markers should be pursued to refine and improve our understanding of legume phylogeny.
A comparison of the DNA sequences or their expression among closely related species or genotypes within a species will facilitate the identification of genes for important traits. This will require the identification of carefully selected genotypes and the application of genomic tools such as DNA sequencing and microchips for expression analysis. Opportunities for this research exist in many legume crops, but particularly in Phaseolus beans, for which a detailed phylogeny exists.
In pathogens, mutations occur in genes that allow them to cause disease in their crop host. Conversely, new variations in genes are generated in host plants that allow them to recognize invading pathogens and initiate defense responses against them. A joint analysis of host and pathogen diversity with genomic tools will determine how genes for resistance are generated, why some genes are more stable than others, and why some host plants have a broader spectrum of resistance. A similar research approach is recommended by the LCWG for legume crops and their symbionts.
Most agronomic traits are conditioned by families of related genes. These multigene families originated by duplication of an original gene and sequence divergence of the copies of the gene. During the divergence, the different copies can acquire different traits, such as adaptation to different environmental stresses, production of new compounds, and expression in different organs. Structural genomic and expression studies will tell us how legumes differ among themselves and how they generate new traits.
The USDA/ARS collects and maintains germplasm collections for major crop legume species. These collections likely contain important traits currently undiscovered or undefined. The LCWG recommends intensive evaluation of existing germplasm collections using the tools of structural and functional genomics to identify new genes conditioning economically important traits and to characterize the U.S. germplasm collections more completely.
Large-scale genomic efforts can be leveraged across species. Genomic data from the various legume species are currently maintained in a variety of species-centric databases. Tools must be developed to integrate, analyze, and deliver the data from many species.
Examples of the data types to be maintained in such an integrated resource include, but are not limited to, EST and genomic sequences, expression profiling information, map-based genetic traits, and breeding and biological diversity data. The integration of ongoing map/trait-based activities with sequence-centric databases will provide the research community with scalable, sustainable data handling and analytical abilities.
Because it is likely that data collection and curation will occur in "the field" and not at the site of the data integration, it is important that the centralized database be able to access the species-centric databases, acquire the required data interactively, and generate reports in a user-friendly, web-accessible, and graphical manner. Such an integrated resource will benefit all legume species by making genetic and genomic information developed in a wide range of species available, via a web-based interface, to researchers working on any other species. An integrated database provides a platform for integrated data analyses. Such a merged database would facilitate pan-legume data mining and a synergistic utilization of the shared data.
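The access pattern described above, in which a central resource pulls data from species-centric databases on request rather than holding its own copy, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record`, `in_memory_source`, and `IntegratedResource` names are hypothetical, each species database is modelled as a plain function, and no real database or web interface is implied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    """One entry from a species-centric database (hypothetical schema)."""
    species: str
    data_type: str   # e.g. "EST", "map", "expression"
    identifier: str


# A species-centric database is modelled as a function: data type -> records.
Source = Callable[[str], List[Record]]


def in_memory_source(records: List[Record]) -> Source:
    """Stand-in for a remote species database, backed by a local list."""
    def lookup(data_type: str) -> List[Record]:
        return [r for r in records if r.data_type == data_type]
    return lookup


class IntegratedResource:
    """Central resource that queries registered species sources on demand."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, species: str, source: Source) -> None:
        self._sources[species] = source

    def query(self, data_type: str) -> List[Record]:
        # Ask every species source in turn; data stay curated at their origin.
        results: List[Record] = []
        for species in sorted(self._sources):
            results.extend(self._sources[species](data_type))
        return results

    def report(self, data_type: str) -> Dict[str, int]:
        """Summarize how many records of one data type each species holds."""
        counts: Dict[str, int] = {}
        for record in self.query(data_type):
            counts[record.species] = counts.get(record.species, 0) + 1
        return counts
```

In use, each species group would call `register` once with its own source, after which `query("EST")` or `report("map")` spans all registered species. Keeping the sources behind a common function signature is one possible design choice; it lets curation remain with each species community while the integration layer stays small.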
The six areas of priority research that are described in this report represent common needs for the major legume crops grown in the U.S. Together, they will advance the status of legume research in a synergistic manner, complementing and building upon the specific strengths of each legume crop. The results of this collaborative effort will be a greater understanding of the genomes of each crop species and the development of the crop-specific genomic tools and technology needed to accelerate the rate of genetic gain for U.S. legume crops. Before departing the workshop, the scientists developed a "Plan of Action" to ensure that the results of the workshop were widely distributed and that an organization was created to enhance communications among legume researchers and promote opportunities to fund the consensus research (see attachment).
After the 26 scientists had developed their areas of consensus research, grower leadership representing the various legume crops and commodity association staff (representing soybean, peanut, pea, lentil, and common bean) reviewed the results of the workshop and discussed approaches to cooperate in seeking funds to accomplish the research. The grower leaders and association staff agreed their respective legume crops are in a stronger position to achieve support for a plan if it is presented and supported by each commodity organization in its entirety.
| NAME | ADDRESS | CITY, STATE, ZIP | PHONE | FAX | E-MAIL ADDRESS |
|---|---|---|---|---|---|
| Albert G. Abbot | Department of Genetics And Biochemistry 122 Long Hall Clemson University | Clemson, SC 29634 | 864-656-3060 | 864-656-6879 | aalbert@clemson.edu |
| William D. Beavis | National Center for Genome Resources 2935 Rodeo Park Drive East | Santa Fe, NM 87505 | 800-450-4854 | 505-995-4432 | wdb@ncgr.org |
| Charles Brummer | 1204 Agronomy Iowa State University | Ames, IA 50011 | 515-294-1415 | 515-294-6505 | brummer@iastate.edu |
| Mark Burow | Texas Agricultural Experiment Stn. Route 3, Box 219 | Lubbock, TX 79401 | 806-746-6101 | 806-746-6528 | mburow@tamu.edu |
| Tom Clemente | E324 Beschle Center University of Nebraska Lincoln | Lincoln, NE 68588-0665 | 402-472-1428 | 402-472-3139 | tclemente1@unl.edu |
| Douglas R. Cook | University of California One Shields Avenue 206 Robbins | Davis, CA 95616 | 530-754-6561 | 530-754-6617 | drcook@ucdavis.edu |
| Perry Cregan | USDA-ARS-BARC-West Soybean Genomics and Improvement Lab B006, Room 100 | Beltsville, MD 20705 | 301-504-5070 | 301-504-5728 | creganp@ba.ars.usda.gov |
| Leland Ellis | USDA/ARS | Beltsville, MD 20705 | 301-504-4788 | 301-504-4725 | lce@ars.usda.gov |
| Paul Gepts | Dept. of Agronomy and Range Science University of California | Davis, CA 95616 | 530-752-7323 | 530-752-4361 | plgepts@ucdavis.edu |
| David Grant | USDA-ARS G304 Agronomy Hall Iowa State University | Ames, IA 50011 | 515-294-1205 | 515-294-2299 | dgrant@iastate.edu |
| David A. Lightfoot | Dept. of Plant Soil and General Agriculture Southern Illinois University | Carbondale, IL 62901-4415 | 618-453-1797 | 618-453-7457 | ga4082@siu.edu |
| Greg May | The Noble Foundation Plant Biology Division P.O. Box 2180 | Ardmore, OK 73402 | 580-221-7391 | | gdmay@noble.org |
| Phillip Miklas | USDA-ARS 24106 N. Bunn Road | Prosser, WA 99350-9687 | 509-786-9258 | 509-786-9277 | pmiklas@tricity.wsu.edu |
| Henry T. Nguyen | Dept. of Plant and Soil Science Texas Tech University | Lubbock, TX 79409-2122 | 806-742-1622 | 806-742-2888 | henry.nguyen@ttu.edu |
| Wayne Parrott | 3111 Plant Sciences Bldg. Dept. of Crop & Soil Sciences University of Georgia | Athens, GA 30602 | 706-542-0928 | 706-542-0914 | wparrott@uga.edu |
| Andrew Paterson | 111 Riverbend Rd. University of Georgia | Athens, GA 30602 | 706-583-0162 | 706-583-0160 | paterson@uga.edu |
| Robert S. Reiter | Monsanto 3302 SE Convenience Blvd. | Ankeny, IA 50021-9424 | 515-963-4211 | 515-963-4242 | robert.s.reiter@monsanto.com |
| Ernest F. Retzel | 420 Delaware St. SE MMC 43 650 Childrens Rehabilitation Center University of Minnesota | Minneapolis, MN 55455 | 612-626-0495 | 612-626-6069 | ernest@ahc.umn.edu |
| Deborah Samac | 1991 Upper Buford Circle Room 495 | St. Paul, MN 55108 | 612-625-1243 | 651-649-5058 | debbys@puccini.cdl.umn.edu |
| Lynn Senior | Syngenta Biotechnology, Inc. 3054 Cornwallis Road | Raleigh Triangle Park, NC 27709 | 919-597-3041 | 919-541-8585 | lynn.senior@syngenta.com |
| Randy C. Shoemaker | G401 Agronomy Hall Iowa State University | Ames, IA 50011 | 515-294-6233 | 515-294-2299 | rcsshoe@iastate.edu |
| Gary Stacey | M409 Walters Life Science Bldg. University of Tennessee | Knoxville, TN 37996-0845 | 865-974-4041 | 865-974-4007 | gstacey@utk.edu |
| H. Thomas Stalker | Box 7620 Department of Crop Science NC State University | Raleigh, NC 27695 | 919-515-2647 | 919-515-7959 | hts@unity.ncsu.edu |
| Lila Vodkin | 384 ERML 1201 W. Gregory University of Illinois | Urbana, IL | 217-244-6147 | | l-vodkin@uiuc.edu |
| Norm Weeden | Montana State University Dept. of Plant Sciences and Plant Pathology ABS 303 | Bozeman, MT 59717 | 406-994-7622 | 406-994-7600 | nweeden@montana.edu |
| Nevin Dale Young | 495 Borlaug Hall University of Minnesota | St. Paul, MN 55108 | 612-625-2225 | 612-625-9728 | neviny@tc.umn.edu |
The conveners wish to thank Barbara Upston, Management Consulting Associates, for her highly effective workshop facilitation and Barbara Zapp, USDA-ARS, for her tireless and highly efficient technical support. The conveners were assisted in the management and organization of the workshop by four species coordinators: Charlie Brummer (alfalfa and clovers), Randy Shoemaker (soybean), Tom Stalker (peanut), and Norm Weeden (common bean, pea, dry bean, lentil).
Financial support for the workshop was provided by the United Soybean Board, Dry Pea and Lentil Council, National Peanut Foundation, and USDA-ARS.