Abstract
The integration of advanced technologies into breeding programs in the 21st century can result in a powerful step change in crop productivity when aligned with the components of genetic gain. Genetic gain depends upon four factors: accuracy, selection intensity, genetic variation, and time. This equation is a useful starting point, as it articulates the parameters that breeders manipulate as part of the crop improvement process. This review article compiles advanced breeding technologies, such as phenomics, genotyping and sequencing platforms, genome editing, and double haploid technology, that can be applied to each component of the genetic gain equation. In addition, it explains the strategies, opportunities, and limitations of these technologies in order to support breeders in making sound decisions about which technologies to adopt and thereby increase the efficiency of their breeding programs.
Key words: Genetic gain, Accuracy (A), Selection intensity (I), Genetic variation (G), Time (T)
INTRODUCTION
Crop breeding has long relied on cycles of phenotypic selection and crossing that develop agronomically superior genotypes through genetic recombination. In the last two decades, breeding technologies have dramatically evolved in several areas, and plant breeders now have numerous innovative technologies to apply in their quest for crop improvement (
Cobb et al. 2019;
Hickey et al. 2019).
For example, the development of automated high-throughput genotyping and phenotyping platforms has enabled evaluation of larger populations, thus increasing selection intensity and improving selection accuracy (
Rasheed et al. 2017). The rise of third-generation sequencing platforms allows for gene-level resolution of agronomic variation, which facilitates gene discovery, trait dissection, and predictive breeding technology (
Nepolean et al. 2018). With the ready availability of genomic data for breeders today, genomics plays an increasingly important role in all aspects of crop breeding, such as quantitative trait loci (QTL) mapping and genome-wide association studies (GWAS) (
Wallace et al. 2018). Current advances in genomics and bioinformatics provide opportunities for accelerating crop improvement (
Hu et al. 2018). When these technologies are combined with advanced statistics, whole-genome selection approaches such as genomic selection (GS) can further accelerate the breeding cycle and improve genetic gain in the breeding program (
Voss-Fels et al. 2019). Conventional crop breeding relies on naturally occurring genetic variation to select plants with improved traits. Genome editing techniques are now available for creating customized genotypes (
Kawall 2019). The long generation times of crops, which typically allow for only one or two generations per year and have served as a key limiting factor for plant breeding, have been alleviated by double haploid technology, which reduces the number of generations required (
Ren et al. 2017).
Genetic gain in a crop breeding program is described by the following equation: genetic gain = (A × I × G) / T, where A is the accuracy with which the observed phenotype reflects the true phenotype and genotype, I is the selection intensity, G is the genetic variation, and T is the length of the breeding cycle (
Lush 1937;
Eberhart 1970). Genetic gain is built from a strong genetic base and is not difficult to achieve if crop performance is measured and selection is practiced in a consistent and sustained manner. However, genetic gain cannot be achieved if A, I, or G is zero. Investment affects the efficiency of genetic gain, and breeders need to optimize the gain obtained per unit of investment. Genetic gain cannot be attributed to any single component; the components are closely related and influence one another (
Cobb et al. 2019).
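The equation above can be explored numerically. The following is a minimal sketch that evaluates (A × I × G) / T directly; the function name and all input values are hypothetical and chosen only for illustration, not taken from any breeding program.

```python
def genetic_gain(accuracy, intensity, genetic_variation, cycle_time):
    """Return (A x I x G) / T as written in the text."""
    if cycle_time <= 0:
        raise ValueError("cycle_time must be positive")
    return accuracy * intensity * genetic_variation / cycle_time


# Hypothetical values, for illustration only.
baseline = genetic_gain(accuracy=0.5, intensity=1.4,
                        genetic_variation=2.0, cycle_time=4.0)
# Same program with the breeding cycle cut in half.
shorter_cycle = genetic_gain(accuracy=0.5, intensity=1.4,
                             genetic_variation=2.0, cycle_time=2.0)
# Same program with no genetic variation among lines.
no_variation = genetic_gain(accuracy=0.5, intensity=1.4,
                            genetic_variation=0.0, cycle_time=4.0)

print(f"baseline gain per unit time: {baseline:.3f}")
print(f"gain with cycle time halved: {shorter_cycle:.3f}")
print(f"gain with G = 0:             {no_variation:.3f}")
```

Because the numerator is a product, setting any one of A, I, or G to zero removes all gain, whereas halving T doubles the gain per unit time when the other components are held constant.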
This review article presents a selected technology for each component as an example, to help breeders make smart investments by using the genetic gain equation as a guide.
Accuracy: how accurately does the observed phenotype relate to the true phenotype and genotype?
Accuracy indicates how well the measured phenotype for a particular trait relates to the line’s true phenotype for that trait and genotype. Accuracy will be lower if errors in measuring the trait are high, or if the environment has a large effect on the trait.
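The effect of measurement error on accuracy can be illustrated with a short calculation. The sketch below is not taken from the works cited here; it assumes the standard quantitative-genetics definition of entry-mean heritability, H2 = Vg / (Vg + Ve / r), and assumes that the accuracy of phenotypic selection is the square root of H2. The variance values and function names are hypothetical.

```python
import math


def entry_mean_heritability(genetic_var, error_var, n_reps):
    """H2 on an entry-mean basis: Vg / (Vg + Ve / r). Assumed model."""
    if n_reps < 1:
        raise ValueError("n_reps must be at least 1")
    return genetic_var / (genetic_var + error_var / n_reps)


def phenotypic_accuracy(genetic_var, error_var, n_reps):
    """Accuracy of phenotypic selection, taken here as sqrt(H2)."""
    return math.sqrt(entry_mean_heritability(genetic_var, error_var, n_reps))


# Hypothetical variance components, for illustration only.
noisy_one_rep = phenotypic_accuracy(genetic_var=1.0, error_var=4.0, n_reps=1)
noisy_four_reps = phenotypic_accuracy(genetic_var=1.0, error_var=4.0, n_reps=4)
precise_one_rep = phenotypic_accuracy(genetic_var=1.0, error_var=1.0, n_reps=1)

print(f"high error, 1 replicate:  {noisy_one_rep:.3f}")
print(f"high error, 4 replicates: {noisy_four_reps:.3f}")
print(f"low error, 1 replicate:   {precise_one_rep:.3f}")
```

Under these assumptions, reducing the error variance fourfold through better measurement has the same effect on accuracy as quadrupling the number of replicates, which is one way to reason about the value of a phenotyping investment.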
Reliable high-throughput phenotypic technologies are considered important tools for genetic gain in breeding programs. With the dramatic advancement over the last two decades in high-throughput phenotyping technologies, research in this area is entering a new era called phenomics (
Pieruschka and Schurr 2019;
Zhao et al. 2019). The outdoor environment, where most field crops such as corn, soybean, and rice are grown, is much more variable than the laboratory setting. Field-based phenotyping (FBP) is a critical tool for crop breeding because field phenotypes express the relative effects of genetic and environmental factors, as well as their interaction, on agronomic traits such as yield potential and tolerance to abiotic and biotic stresses (
Yang et al. 2020). Therefore, novel technologies should provide advantages with respect to throughput, applicability under field conditions, adoptability in the breeding process, and the heritability of the measured traits. Currently, the most common FBP platforms use ground-based handheld phenotyping or an unmanned aerial vehicle (UAV) combined with a wide range of cameras, sensors, and high-performance computing to capture deep phenotyping data and monitor crop performance in time (throughout the crop cycle) and space in field environments (
Fritsche-Neto and Borém 2015).
UAV-based phenotyping is becoming the most popular high-throughput tool for phenotyping in the field environment, as it meets the demands of spatial, spectral, and temporal resolutions (
Jang et al. 2020;
Yang et al. 2020). Remote sensing with UAVs has shown great potential for high-throughput phenotyping and will enhance work in crop functional genomics and crop breeding. The sensors carried by UAVs typically include infrared thermal imagers, light detection and ranging (LIDAR), multispectral cameras, and hyperspectral sensors (
Jang et al. 2020). Their applications include i) crop biomass estimation based on visible imaging, ii) monitoring of crop physiological status, such as chlorophyll fluorescence, based on visible near-infrared spectroscopy and high-resolution hyperspectral imaging, iii) plant water status detection based on thermal imaging, and iv) analysis of fine-scale crop geometric traits based on LIDAR point clouds (
Zhao et al. 2019).
Jang et al. (2020) reviewed the various UAV platforms in detail, such as aerial vehicle types, software for hardware calibration and image processing, and various sensor types depending on research purposes of image capture.
Fig. 1 describes the general end-to-end workflow of the high-throughput phenotyping methodology using a remote sensor-equipped UAV.
Some of the limitations related to the use of UAVs are also discussed: i) flight time and load capacity may be limited, ii) local flight laws and regulations must be strictly followed and flight safety carefully implemented, and iii) reasonable flight altitudes should be investigated to support fit-for-purpose image capture (
Jang et al. 2020).
Selection intensity: what percentage of lines are advanced?
Selection intensity depends on the proportion of lines that are selected relative to the total number of lines tested. When the number of lines advanced is fixed, anything that can be done to maximize the number of lines tested will increase genetic gain.
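The relationship between the proportion selected and selection intensity can be sketched numerically. The example below assumes truncation selection on a normally distributed trait, for which the standardized selection intensity is i = z / p (z is the height of the standard normal curve at the truncation point and p is the proportion selected). This model is a textbook assumption and is not taken from the works cited here; the population sizes are hypothetical.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Standardized selection intensity i = z / p under truncation selection.

    Assumes the trait is normally distributed (an assumption of this sketch).
    """
    if not 0.0 < proportion_selected < 1.0:
        raise ValueError("proportion_selected must be between 0 and 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    truncation_point = std_normal.inv_cdf(1.0 - proportion_selected)
    return std_normal.pdf(truncation_point) / proportion_selected


# Hypothetical scenario: always advance 20 lines, but test more lines.
lines_advanced = 20
intensities = {}
for lines_tested in (200, 1000, 2000):
    p = lines_advanced / lines_tested
    intensities[lines_tested] = selection_intensity(p)
    print(f"tested {lines_tested:5d}, selected {p:6.1%}, "
          f"intensity {intensities[lines_tested]:.3f}")
```

Under these assumptions, testing ten times as many lines raises the selection intensity substantially but far less than tenfold, which is why the cost per line tested matters when deciding how large a population to genotype or phenotype.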
The utilization of molecular markers in plant breeding can maximize selection efficiency via marker-assisted selection and genomic selection (GS) (
Wallace et al. 2018;
Voss-Fels et al. 2019). Genotyping platforms for molecular markers provide opportunities to accelerate the development of cultivars with desired yield potential, quality, and enhanced adaptation to the target environments. There is a rapidly rising trend in genome-wide, high-throughput genotyping platform development in over 25 crop species (
Rasheed et al. 2017).
Table 1 shows the most predominant genome-wide chips/arrays for corn, soybean, and rice breeding programs (
Ganal et al. 2011;
Song et al. 2013;
Chen et al. 2014;
Unterseer et al. 2014;
Lee et al. 2015;
Singh et al. 2015). These platforms provide robust SNP calling, with high call rates and cost-effectiveness per data point when genotyping large numbers of SNPs and samples. These benefits make them suitable for GS to maximize selection efficiency. However, there are a few known drawbacks: i) chips/arrays are not flexible, ii) the cost per data point is reduced but the overall cost of genotyping per sample is still high, and iii) prior genomics knowledge is required, which results in inherent ascertainment bias in the fixed SNP panel (
Rasheed et al. 2017).
The current genomics landscape has been revolutionized by next-generation sequencing (NGS) technologies. These provide abundant sequence information with great improvements in genome coverage, as well as reduced data generation time and costs. Within the last few years, genotyping by sequencing (GBS), which uses sequencing technology for genotyping, has been gaining attention from breeders (
Campa and Ferreira 2018). It generates low-pass whole genome sequencing data and is expected to minimize the ascertainment bias inherent in fixed SNP chips/arrays. GBS has a noticeable disadvantage, which derives from the low coverage of NGS reads: it results in a large percentage of missing data points and may lead to misidentification of heterozygotes as homozygotes. Therefore, an imputation step using a haplotype database is required for downstream applications, and reference genome(s) can often help improve the accuracy of genotype calling (
Hu et al. 2018).
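The consequences of low read coverage described above can be illustrated with a simple probability sketch. It assumes that both alleles at a heterozygous site are sampled with equal probability, that there are no sequencing errors, and that read depth per site follows a Poisson distribution; these are simplifying assumptions of this example, not properties reported by the cited studies, and the function names are hypothetical.

```python
import math


def prob_het_looks_homozygous(depth):
    """Chance that all reads at a heterozygous site show the same allele.

    With equal sampling of both alleles this is 2 * 0.5 ** depth.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return 2 * 0.5 ** depth


def prob_site_missing(mean_coverage):
    """Chance of zero reads at a site under Poisson-distributed depth."""
    if mean_coverage < 0:
        raise ValueError("mean_coverage must be non-negative")
    return math.exp(-mean_coverage)


for depth in (1, 2, 4, 8):
    print(f"depth {depth}: heterozygote miscalled with probability "
          f"{prob_het_looks_homozygous(depth):.4f}")

for coverage in (0.5, 1.0, 4.0):
    print(f"mean coverage {coverage}x: site missing with probability "
          f"{prob_site_missing(coverage):.4f}")
```

Under these assumptions, a site covered by a single read can never reveal a heterozygote, and a large share of sites receive no reads at all when mean coverage is near 1x, which is why imputation is treated as a required step for GBS data.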
Elbasyoni et al. (2018) compared marker-based genotyping (90K Illumina Infinium chip) and sequence-based genotyping to assess the accuracy of genomic prediction for GS in hexaploid wheat. The experiment was conducted using 299 plant materials evaluated for heading date, plant height, days to physiological maturity, and grain yield. The prediction accuracies of genomic estimated breeding values were estimated using three models: Bayes C, Bayesian Lasso, and regression-based linear unbiased prediction. The results show that GBS is comparable to or better than the high-density chip in prediction accuracy for all four traits. In addition, statistically significant differences in prediction accuracy for grain yield were detected. The authors inferred that these differences may derive from the ascertainment bias inherent in the high-density SNP chip, as the SNP panels were developed from genetic backgrounds different from those of the studied plant materials, whereas GBS is a genetic background-free methodology (
Elbasyoni et al. 2018).
The cost of sequencing is falling significantly, and GBS has become increasingly popular in crop genetics. However, several challenges still need to be tackled before GBS can be fully implemented in breeding operations at a large scale. The first is library construction, which is still labor intensive and has a cost that, unlike the cost of sequencing, is not getting lower (
Rasheed et al. 2017). Other challenges include the absence of perfect bioinformatics tools and models for data imputation, higher complexity in data analysis and storage, and the intensive computing power that is often required (
Hu et al. 2018). The future landscape of genotyping in breeding programs is debatable: either sequencing will eventually replace all marker-based genotyping platforms, or only a partial replacement will occur. We may see the answer in the next few years.
Genetic variation: how much true genetic variation exists among the lines?
The amount of genetic or true breeding value varies among lines for a particular trait. Breeders focus on useful genetic variation.
The major goal of crop breeding is to develop and improve crops by altering specific traits to increase their yield and quality. The foundation of the development of new cultivars in conventional breeding is genetic variation that either occurs naturally, through spontaneously emerging mutations and meiotic recombination, or is induced by mutagenesis using chemicals such as ethyl methanesulfonate (EMS) or radiation (
Hickey et al. 2019). These processes generate variation but are undirected and non-specific. This means that the outcome (i.e., the exact genomic region where a new variation is created) is neither predictable nor controllable.
Genome editing is a site-directed process that enables the editing of DNA sequences by using various technologies, such as clustered regularly interspaced short palindromic repeats (CRISPR), transcription activator-like effector nuclease (TALEN), or zinc finger nuclease (ZFN) (reviewed in
Kawall 2019). These targeted tools make the entire genome accessible for any desired alteration, enabling multiple beneficial traits to be edited into an elite background within one generation. Therefore, the tools can be used in breeding to develop varieties with new genetic combinations that were not possible to create before. In addition, direct improvement of elite varieties by genome editing does not introduce the agronomically deleterious alleles that are often carried along by natural crossing and recombination. Genome editing via TALEN and ZFN is a cloning-based methodology, which makes these tools expensive and time-consuming. In contrast, CRISPR with Cas9 nuclease (CRISPR/Cas9) is guided by a programmable RNA to the genomic sites of interest (
Zhang et al. 2018). The RNA-based sequence specificity could enable low-cost and fast implementation at various genomic sites and provide mutagenesis at high frequencies.
With the CRISPR/Cas9 tool, breeders are no longer dependent on unpredictably occurring genetic variations. In addition, these tools allow for highly flexible editing of the genome. Site-directed Cas9 nucleases together with individually synthesized guide RNA (gRNA) can replace alleles in low-recombinogenic regions of the chromosomes quickly, easily, and precisely, leading to combinations of genetic material that would hardly occur naturally (
Chen et al. 2019). In addition, many crop species are polyploid, such as wheat and canola (
Wolter et al. 2019). The gRNA can recognize all complementary target sequences present in the genome, regardless of how many gene copies are present. This means that it is possible to modify multiple or all gene copies carrying the same target sequence in an organism at the same time. These modifications cannot be achieved through conventional breeding techniques (
Chen et al. 2019;
Wolter et al. 2019).
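The idea that one gRNA can address every gene copy carrying the same target sequence can be sketched as a simple sequence search. The example below is a toy illustration only: the protospacer and the three gene copies are invented sequences, the search covers the forward strand only, it requires an exact match followed by an NGG motif (the PAM used by the commonly applied Cas9 from Streptococcus pyogenes), and it ignores the mismatch tolerance that real nucleases show.

```python
def find_target_sites(sequence, protospacer):
    """Return 0-based starts where protospacer is followed by an NGG PAM.

    Forward strand and exact matches only; a simplification for illustration.
    """
    sequence = sequence.upper()
    protospacer = protospacer.upper()
    hits = []
    start = sequence.find(protospacer)
    while start != -1:
        pam_start = start + len(protospacer)
        pam = sequence[pam_start:pam_start + 3]
        if len(pam) == 3 and pam[1:] == "GG":
            hits.append(start)
        start = sequence.find(protospacer, start + 1)
    return hits


# Hypothetical 20-nt target and hypothetical gene copies in a polyploid.
protospacer = "GATTACAGGCTTACGATCCA"
gene_copies = {
    "copy_A": "ATGC" + protospacer + "TGG" + "CCTA",
    "copy_B": "TTGA" + protospacer + "AGG" + "GGTA",
    # copy_D carries a single-base difference inside the target sequence.
    "copy_D": "ATGC" + protospacer[:10] + "A" + protospacer[11:] + "TGG" + "CCTA",
}

targeted = {name: find_target_sites(seq, protospacer)
            for name, seq in gene_copies.items()}
for name, positions in targeted.items():
    print(name, "target sites at", positions)
```

In this toy case the two copies that share the target sequence are both found, while the diverged copy is not, which mirrors the statement that simultaneous editing applies to copies carrying the same target sequence.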
Genetic bottlenecks have been imposed on our current crop varieties by the long breeding selection process. Consequently, most of the beneficial genetic diversity available for breeding has been removed, which makes further improvement of elite varieties by conventional breeding technologies a cumbersome process (
Hickey et al. 2019). Therefore, the CRISPR/Cas9 tool is highlighted as a promising means of creating genetic diversity for breeding in an unprecedented way and on a large scale.
Time: how quickly can breeders go from parent to offspring and then back to parent again?
The breeding cycle is the time it takes for offspring to become parents. Genetic gain is increased when cycle time is decreased.
In 1959, Professor Edward H. Coe (University of Missouri) discovered a line (Stock 6) that produces haploid plants, which contain just half the DNA of normal corn. Haploids become valuable when scientists double their genomes and use them to produce homozygous breeding lines (
Mcmahon 2017). These homozygous lines are 100% inbred lines that would otherwise have to be produced by repeated forced self-pollinations. The haploid method allows breeders to produce inbred lines within two generations, while conventional breeding takes six to eight generations of inbreeding by selfing (
Fig. 2). Today, all corn-breeding companies use haploids to shorten the time required to produce parent lines by several years (
Chang and Coe 2009). Reduced time and increased efficiencies for breeders to develop new hybrids have the potential to deliver high yielding varieties for farmers at a faster pace (
Kawall 2019). Double haploid (DH) technology requires two main steps to produce DH lines in corn: haploid induction and subsequent genome doubling (
Ren et al. 2017).
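The reason conventional inbreeding needs six to eight generations can be shown with a short calculation. The sketch assumes the standard Mendelian expectation that each generation of selfing halves the remaining heterozygosity, so that the expected fraction of initially heterozygous loci that have become homozygous after n generations is 1 - (1/2)^n. This formula is a textbook result rather than something stated in the cited sources, and the function name is hypothetical.

```python
def expected_homozygosity(generations_of_selfing):
    """Expected share of initially heterozygous loci fixed after n selfings."""
    if generations_of_selfing < 0:
        raise ValueError("generations_of_selfing must be non-negative")
    return 1.0 - 0.5 ** generations_of_selfing


for n in (1, 2, 4, 6, 8):
    print(f"after {n} generation(s) of selfing: "
          f"{expected_homozygosity(n):.2%} homozygous")

# A doubled haploid carries two identical genome copies by construction.
print("doubled haploid: 100.00% homozygous")
```

Under this expectation, selfing approaches but never reaches complete homozygosity, whereas a doubled haploid is fully homozygous as soon as its genome has been doubled.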
Fig. 2 describes the process of DH production in a simplified diagram compared to conventional breeding (
Mcmahon 2017).
Haploid production by intraspecific hybridization is the primary method in corn, with the haploid inducer Stock 6 producing 2-3% maternal haploids when outcrossed as a male (
Chang and Coe 2009).
Kelliher et al. (2017) found that haploid induction in corn is a postzygotic character attributed to a frame-shift mutation in MATRILINEAL (
MTL) (also called
ZmPLA1 and
NLD), which is a sperm-specific phospholipase. MTL is responsible for nearly 70% of the haploid-induction trait. The editing capability of CRISPR/Cas9 is a valuable property in this regard and could be combined with double-haploid (DH) production. A 4-bp (CGAG) insertion in the fourth exon of
ZmPLA1 in Stock 6 compared to the B73 reference genome was shown to be the cause of the haploid induction phenotype using CRISPR/Cas9 gene editing (
Liu et al. 2017). In the
ZmPLA1 knockout lines, the average haploid induction rate (HIR) is close to the HIR of Stock 6, which demonstrates that the
ZmPLA1 knockout method can be used to create haploid inducers. MTL is highly conserved in cereals, and these two findings may enable the development of intraspecific in vivo haploid inducer lines in crops efficiently (
Kelliher et al. 2017).
In order to double the genomes in large-scale DH production, artificial doubling using chemical doubling agents (such as colchicine) has been widely utilized. These agents duplicate the genome by binding to tubulin and inhibiting microtubule polymerization, resulting in doubling rates of 10-30% (
Kleiber et al. 2012). Due to the toxicity of these agents, natural doubling methods such as spontaneous genome doubling have been studied in several species. In corn, the frequency of spontaneous haploid genome doubling (SHGD) ranges from less than 5% to greater than 50%, depending on the genetic background. Several studies have reported on the genetics and mechanism of SHGD in corn using bi-parental populations, fine mapping, and genome-wide association mapping, as reviewed in
Boerman et al. (2020). Several significant sequence variations were associated with genes that have proposed functions related to meiosis, microtubule organization, and cell division. Among all candidate genes suggested so far,
qshgd1 on chromosome 5, which contributes a single large additive effect, seems promising for integration into DH technology (
Ren et al. 2017). However, the mechanisms of these candidates and their putative impact on SHGD need to be confirmed and validated carefully before full implementation.
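The induction and doubling rates quoted in this section can be combined into a rough estimate of how many DH lines a given number of induction-cross kernels might yield. The sketch below simply multiplies the two rates; it assumes that they are independent and ignores every other source of loss (haploid identification errors, seedling survival, seed set), so it should be read as an optimistic upper bound. The kernel count and function name are hypothetical.

```python
def expected_dh_lines(kernels, induction_rate, doubling_rate):
    """Kernels x haploid induction rate x genome doubling rate.

    Ignores all other losses; an optimistic illustration only.
    """
    for rate in (induction_rate, doubling_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be between 0 and 1")
    return kernels * induction_rate * doubling_rate


kernels = 10_000  # hypothetical number of kernels from induction crosses

# Ranges quoted in the text: 2-3% induction (Stock 6), 10-30% doubling.
low = expected_dh_lines(kernels, induction_rate=0.02, doubling_rate=0.10)
high = expected_dh_lines(kernels, induction_rate=0.03, doubling_rate=0.30)

print(f"expected DH lines from {kernels} kernels: "
      f"about {low:.0f} to {high:.0f}")
```

Because the two rates multiply, improving either step, for example through better inducer lines or through genotypes with high SHGD, raises the overall yield of DH lines proportionally.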
CONCLUSION
Advances in phenomics, high-throughput genotyping platforms, sequencing technology, genome editing tools, double haploid technology, and machine learning/deep learning in the artificial intelligence domain (not addressed in this review) could allow breeders to make efficient data-driven decisions in their breeding programs (
Cobb et al. 2019). In particular, an interdisciplinary approach with these technologies is essential to identify and resolve breeding challenges to accelerate genetic gain. The genetic gain equation could serve as a useful mental framework for considering investment priorities, as it refines theory down to the parameters that a breeding program aims to achieve. In addition, applying these technologies to increase the rate of genetic gain delivered to farmers' fields will require careful attention to their impact on the components of the genetic gain equation.
Fig. 1. Workflow of the high-throughput phenotyping platform using an unmanned aerial vehicle with sensor equipment (modified from Jang et al. 2020). DSM: digital surface model.
Fig. 2. Comparison of conventional breeding and double haploid (DH) breeding, and the benefit of DH breeding in corn (Courtesy of Syngenta Thrive magazine, 2017, issue 3).
Table 1. List of high-throughput chip and array genotyping platforms for breeding programs in major field crops.
References
- Boerman A, Frei U, Lübberstedt T. 2020. Impact of spontaneous haploid genome doubling in maize breeding. Plants. 9: 369.
- Campa A, Ferreira JJ. 2018. Genetic diversity assessed by genotyping by sequencing (GBS) and for phenological traits in blueberry cultivars. PLoS One. 13: e0206361.
- Chang MT, Coe EH. 2009. Doubled haploids. In: Kriz AL, Larkins BA, editors. Molecular genetics approaches to maize. Biotechnology in Agriculture and Forestry, Vol 63. Springer, Berlin, Heidelberg, Germany. pp. 127-142.
- Chen W, Gao Y, Xie W, Gong L, Lu K, Wang W, et al. 2014. Genome-wide association analyses provide genetic and biochemical insights into natural variation in rice metabolism. Nat. Genet. 46: 714-721.
- Chen K, Wang Y, Zhang R, Zhang H, Gao C. 2019. CRISPR/Cas genome editing and precision plant breeding in agriculture. Annu. Rev. Plant Biol. 70: 667-697.
- Cobb J, Juma R, Biswas P, Arbelaez J, Rutkoski J, Atlin G, et al. 2019. Enhancing the rate of genetic gain in public-sector plant breeding programs: lessons from the breeder's equation. Theor. Appl. Genet. 132: 627-645.
- Eberhart SA. 1970. Factors affecting efficiencies of breeding methods. Afr. Soils. 15: 655-680.
- Elbasyoni I, Lorenz A, Guttieri M, Frels K, Baenziger P, Poland J, et al. 2018. A comparison between genotyping-by-sequencing and array-based scoring of SNPs for genomic prediction accuracy in winter wheat. Plant Sci. 270: 123-130.
- Fritsche-Neto R, Borém A. 2015. Phenomics: how next-generation phenotyping is revolutionizing plant breeding. Springer Press, Switzerland.
- Ganal MW, Durstewitz G, Polley A, Bérard A, Buckler ES, Charcosset A, et al. 2011. A large maize (Zea mays L.) SNP genotyping array: development and germplasm genotyping, and genetic mapping to compare with the B73 reference genome. PLoS One. 6: e28334.
- Hickey L, Hafeez A, Robinson H, Jackson S, Leal-Bertioli S, Tester M, et al. 2019. Breeding crops to feed 10 billion. Nat. Biotechnol. 37: 744-754.
- Hu H, Scheben A, Edwards D. 2018. Advances in integrating genomics and bioinformatics in the plant breeding pipeline. Agriculture. 8: 1-20.
- Jang GJ, Kim J, Yu JK, Kim HJ, Kim Y, Kim DW, et al. 2020. Review: Cost-effective unmanned aerial vehicle (UAV) platform for field plant breeding application. Remote Sens. 12: 1-20.
- Kawall K. 2019. New possibilities on the horizon: genome editing makes the whole genome accessible for changes. Front. Plant Sci. 10: 1-10.
- Kelliher T, Starr D, Richbourg L, Chintamanani S, Delzer B, Nuccio ML, et al. 2017. MATRILINEAL, a sperm-specific phospholipase, triggers maize haploid induction. Nature. 542: 105-109.
- Kleiber D, Prigge V, Melchinger AE, Burkard F, San Vicente F, Palomino G, et al. 2012. Haploid fertility in temperate and tropical maize germplasm. Crop Sci. 52: 623-630.
- Lee YG, Jeong N, Kim JH, Lee K, Kim KH, Pirani A, et al. 2015. Development, validation and genetic analysis of a large soybean SNP genotyping array. Plant J. 81: 625-636.
- Liu C, Li X, Meng D, Zhong Y, Chen C, Dong X, et al. 2017. A 4-bp insertion at ZmPLA1 encoding a putative phospholipase A generates haploid induction in maize. Mol. Plant. 10: 520-522.
- Lush J. 1937. Animal breeding plans. Iowa State College Press, Ames, IA, U.S.A.
- Mcmahon K. 2017. Double haploid induction speeds up plant breeding process. Syngenta Thrive. 3: 10-12.
- Nepolean T, Kaul J, Mukri G, Mittal S. 2018. Genomics-enabled next generation breeding approaches for developing system-specific drought tolerant hybrids in maize. Front. Plant Sci. 9: 1-16.
- Pieruschka R, Schurr U. 2019. Plant phenotyping: past, present, and future. Plant Phenomics. 2019: 1-6.
- Rasheed A, Hao Y, Xia X, Khan A, Xu Y, Varshney R, et al. 2017. Crop breeding chips and genotyping platforms: progress, challenges, and perspectives. Mol. Plant. 10: 1047-1064.
- Ren J, Wu P, Trampe B, Tian X, Lübberstedt T, Chen S. 2017. Novel technologies in doubled haploid line development. Plant Biotechnol. J. 15: 1361-1370.
- Singh N, Jayaswal P, Panda K, Mandal P, Kumar V, Singh B, et al. 2015. Single-copy gene based 50 K SNP chip for genetic studies and molecular breeding in rice. Sci. Rep. 5: 1-9.
- Song Q, Hyten D, Jia G, Quigley C, Fickus E, Nelson R, et al. 2013. Development and evaluation of SoySNP50K, a high-density genotyping array for soybean. PLoS One. 8: e54985.
- Unterseer S, Bauer E, Haberer G, Seidel M, Knaak C, Ouzunova M, et al. 2014. A powerful tool for genome analysis in maize: development and evaluation of the high density 600 K SNP genotyping array. BMC Genomics. 15: 1-15.
- Voss-Fels K, Cooper M, Hayes B. 2019. Accelerating crop genetic gains with genomic selection. Theor. Appl. Genet. 132: 669-686.
- Wallace J, Rodgers-Melnik E, Buckler E. 2018. On the road to breeding 4.0: unraveling the good, the bad, and the boring of crop quantitative genomics. Annu. Rev. Genet. 52: 421-444.
- Wolter F, Schindele P, Puchta H. 2019. Plant breeding at the speed of light: the power of CRISPR/Cas to generate directed genetic diversity at multiple sites. BMC Plant Biol. 19: 1-8.
- Yang W, Feng H, Zhang X, Zhang J, Doonan J, Batchelor W, et al. 2020. Crop phenomics and high-throughput phenotyping: past decades, current challenges, and future perspectives. Mol. Plant. 13: 187-214.
- Zhang Y, Massel K, Godwin ID, Gao C. 2018. Applications and potential of genome editing in crop improvement. Genome Biol. 19: 1-11.
- Zhao C, Zhang Y, Du J, Guo X, Wen W, Gu S, et al. 2019. Crop phenomics: current status and perspectives. Front. Plant Sci. 10: 1-16.