search for




 

Positively Selected Orthologous Genes Identified in Sesame (Sesamum indicum) by Deep Resequencing
Plant Breeding and Biotechnology 2019;7:24-33
Published online March 30, 2019
© 2019 Korean Society of Breeding Science.

Jie Yu1,2, Myeong-Hyeon Min1, Sang-Ho Chu3, Kyu-Won Kim3,*, Yong-Jin Park1,3,*

1Department of Plant Resources, College of Industrial Science, Kongju National University, Yesan 32439, Korea, 2State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei 230036, China, 3Center of Crop Breeding on Omics and Artificial Intelligence, Kongju National University, Yesan 32439, Korea
Corresponding author: *Kyu-Won Kim, sh.kyuwon@gmail.com, Tel: +82-41-330-1213, Fax: +82-41-330-1209
Received January 21, 2019; Revised February 20, 2019; Accepted February 20, 2019.
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract

Sesame (Sesamum indicum L.) is the queen of oil seed crops and is cultivated widely in tropical and subtropical areas. The availability of the sesame genome sequence presents unprecedented opportunities for studying its genetics, genomics, and evolution. In this report, we conducted a genome resequencing-based identification of sesame orthologs; in total, 26,379 coding sequences (CDSs) were isolated. Using a reciprocal best hit, we ultimately identified a total of 639 orthologs sets after one-to-one orthologs extraction across seven Pentapetalae plant species. These orthologs were considered to be the most credible between the two species, and in sesame. Furthermore, we performed a branch model-based maximum likelihood estimation of dN/dS of the orthologs, resulting in the identification of 198 evolutionarily accelerated orthologs and 66 positively selected genes (P-value and FDR < 0.05). An enrichment analysis and protein interaction network suggested 19 genes with important functions of the orthologs specific in sesame development and domestication. The method we used here provides a case study for identifying orthologous genes between sesame and other plants species that are distributed in equilibrium phylogenetically, which can be used in other plants.

Keywords : Sesame, Orthologs, Resequencing, dN/dS, Evolution
INTRODUCTION

Sesame (Sesamum indicum L.) is one of the crop has been cultivated for the longest time by human and now wildly cultivated in tropical and subtropical zones (Bedigian 2003). It is a diploid species with 13 pairs of chromosomes (2n = 2x = 26) (Bremer et al. 2009), which belongs to the Sesamum genus of the family Pedaliaceae and the order Lamiales. Sesame is called “queen of the oil seeds” because of the highest oil content (~55% in the seeds) among oilseed crops, including peanut, soybean, and rapeseed (Anilakumar et al. 2010). Among the oilseed crops, sesame has a high oil content and a small diploid genome (~259.73 Mb) (Wang et al. 2014; Wang et al. 2016), making it an attractive species for genetic studies. Sesame is known to be the core lineage of Sesamum in Lamiales using phylogenetic analysis with chloroplast genome (Yi and Kim 2012). It was estimated to have diverged from the Lamiales species Utricularia gibba ~98 million years ago, and from the tomato potato lineage ~125 MYA (Wang et al. 2014).

Orthologs are homologous sequences that started to diverge through a speciation event (as with paralogs and duplication events). Consequently, the better we approximate the evolution of such sequences, the better ortholog predictions will be. In general, phylogenetic history can be parsed by examining nucleotide sequences, instead of amino acid sequences, and then separating non-synonymous substitutions (resulting in amino acid substitutions) from synonymous substitutions (silent or neutral substitutions). Selection pressure can be estimated by scaling the ratio of non-synonymous substitutions to synonymous substitutions to that expected with neutral divergence. This calculation is normalized as the number of non-synonymous substitutions per potential non-synonymous sites (dN) divided by the number of synonymous substitutions per potential synonymous sites (dS), referred to as dN/dS or Ka/Ks (Hurst 2002). Extant proteins have been selected for optimal function over millions of years, the dN value of their genes becomes depressed; thus, dN/dS < 1 indicates “purifying” selection. When dN/dS = 1, the gene may be experiencing neutral selection. While dN/dS > 1, the gene may be undergoing “positive selection” resulting adaptive evolution, where amino acid replacements are advantageous and synonymous substitutions become stabilized at a high rate (McPartland et al. 2007; Dalloul et al. 2010). Positive selection of Embryophytes plants occurred in a wide variety of gene families with different functions, including in the self in-compatibility locus, in defense against pathogens, in embryogenesis, in cold acclimation, and in electron-transport (Roth and Liberles 2006). In Brassica juncea, 13 GSL-metabolism-related genes and 22 lipid-metabolism-related genes were differentially selected between the vegetable- and oil-use subvarieties of B. juncea, which indicating that positive selection of some genes could be a major impact on plant domestication (Yang et al. 2016). An EID1 gene, responsible for a phase delay, has been under positive selection during tomato domestication or improvement (Muller et al. 2016), revealing the key position of positively selected genes during the domestication.

In this study, we developed a detailed strategy (Supplementary Fig. S1) using an evolutionary-based method to mine orthologs between different species from a genome resequencing project. Using this method, we identified orthologs in sesame, including five sesame accessions from a core set (RDA-Genebank, Republic of Korea) and several other plant species. Complemented with a branch model-based maximum likelihood estimation of dN/dS (Kim et al. 2013; Ahn et al. 2014) of the orthologs in these several species that were distributed in equilibrium phylogenetically, we identified evolutionarily accelerated genes in sesame. The methods in this study provide a case study for identifying orthologous genes in sesame and other plant species.

MATERIALS AND METHODS

Plant material and whole genome resequencing

Five sesame accessions (Table 1) from a core set stored in National Genebank of the Rural Development Administration, Republic of Korea (RDA-Genebank) were selected for current study (Park et al. 2014). We conducted a field experiment during the 2015 sesame growing season at Kongju National University experimental farm. Young leaves from a single plant were sampled and stored at −80°C prior to genomic DNA extraction using the DNeasy plant mini kit (Qiagen, Germany). At least 5 μg of genomic DNA was used for each accession to construct paired-end sequencing libraries with insert sizes of approximately 500 bp according to the manufacturer’s instructions (Illumina Inc., USA). We sequenced the genomes of the collected sesame varieties in deep coverage (~19.8×) on an Illumina HiSeq 2500 Sequencing System Platform (Illumina Inc., USA) with paired-end 2 × 100 bp read length.

Genotype calling and CDS extraction

Raw sequences were first trimmed using the SICKLE tool (https://github.com/najoshi/sickle) to remove low-quality reads. Then, high-quality reads were aligned against the sesame reference genome (ver. 2.0) (Wang et al. 2016) using the Burrows-Wheeler Aligner (BWA; ver. 0.7.5a) software (Li and Durbin 2009). Reads that did not meet BWA quality criteria or did not match the reference genome were removed. PCR duplicate reads were also removed using PICARD tool (ver. 1.88; http://broadinstitute.github.io/picard/). Applied with the Genome Analysis Toolkit (GATK; ver. 2.3.9 Lite) (McKenna et al. 2010), regional realignment and quality score recalibration were first carried out and then single nucleotide polymorphisms (SNPs) and insertion/deletion polymorphisms (indels) were called with ≥ 3× of read depth coverage. Coding sequences (CDSs) of the five sesame core accessions were extracted, based on the annotations of the sesame reference genome using the variant call format (VCF) file generated from genotype calling of the five accessions.

Ortholog identification in sesame

A branch model-based phylogenetic tree was constructed with the TimeTree database (Hedges et al. 2015), and several Pentapetalae plant species (Solanum tuberosum, Solanum lycopersicum, Populus trichocarpa, Brassica rapa, Arabidopsis thaliana, and Vitis vinifera) that distributed in equilibrium phylogenetically with sesame were selected for comparison with S. indicum. The coding sequences of Solanum tuberosum were downloaded from NCBI (http://www.ncbi.nlm.nih.gov) and the other six Pentapetalae species were downloaded from Ensembl Plants (release 30; http://plants.ensembl.org). Ortholog information for the other Pentapetalae plants was also downloaded from BioMart of Ensembl Plants, satisfying orthologs confidence = 1. Since orthologous relationship between S. tuberosum and other Pentapetalae species had not been established, we performed a reciprocal best hits test between S. indicum and S. tuberosum, the closest species to S. indicum among six Pentapetalae species, using the options, E-value < 10 −5, 50 < coverage (%) < 150, and identity (%) > 60 (Brudno et al. 2003; Schwede et al. 2003; Katoh et al. 2005).

Branch model-based evolutionarily accelerated gene mining

To identify rapidly evolving genes in sesame, we conducted a branch model-based maximum likelihood estimation of dN/dS values among seven Pentapetalae species, including S. indicum. The coding sequences of all the seven Pentapetalae species and the five accessions of S. indicum were aligned using PRANK (ver. 150803) software (Löytynoja and Goldman 2005), which is known for accurately align insertion and deletion (Fletcher and Yang 2010). Poor alignments were eliminated using Gblocks (ver. 0.91b) (Castresana 2000). Selected positive genes for S. indicum and each of the five accessions were tested using a maximum likelihood estimation of dN/dS ratio with the phylogenetic tree (Fig. 1) (Yang 2007). Maximum likelihood estimations of dN/dS ratios were determined with branch-site model A with codon frequency F3x4 using PAML (ver. 4.8a) software (Yang 2007). The log-likelihood ratio test was used for determining P-values, and multiple testing errors were controlled using the false discovery rate (FDR).

GO enrichment, pathway, and network analysis

All the identified evolutionarily accelerated genes were subjected to Gene Ontology (GO) enrichment, pathway, and network analyses. GO enrichment analyses were performed using AgriGO (http://bioinfo.cau.edu.cn/agriGO/) with Arabidopsis thaliana as the background species (Du et al. 2010). The P-values for the considered enrichment GO terms were < 0.05 and the minimum number of mapping entries was set to 5. Plant metabolic pathway enrichment analysis was carried out online using the Plant Gene Set Enrichment Analysis Toolkit (GSEA, http://structuralbiology.cau.edu.cn/PlantGSEA/analysis.php), based on the corresponding genes in A. thaliana (Yi et al. 2013). Network analyses were also conducted for the genes by integrating with the Biological General Repository for Interaction Datasets (BioGRID, http://thebiogrid.org/) (Chatr-Aryamontri et al. 2015), which can provide genetic and protein interaction data from model organisms. Sesame orthologs were mapped onto interactome data using cross-reference tables assembled from identifiers of A. thaliana and were displayed with Cytoscape software (Shannon et al. 2003).

RESULTS

Whole-genome resequencing and genotype calling

We conducted resequencing of five core set accessions in this study and generated a total of 297.8 million reads (~42 Gb) (Table 1). After removing low-quality reads, ~38 million clean reads were retained as high-quality data and were used for further analysis. About 80% of the clean reads of each accession were mapped to the sesame reference genome, resulting in an average genome coverage of ~19.8 (Table 1). We performed genotype calling of the five accessions and, in total, ~1.96 million SNP/indels were identified (Supplementary Table S1). Of them, ~86% (~1.69 million) were high-quality variations (sites without missing and heterozygotes). This suggested a large amount of high-quality data, resulting reliable results in the rest of the study. By complementing with the reference genome of sesame, and vice versa, we could see more genetic variations across the sesame genome, which would promote genetic and genomic studies of sesame to be increasingly active. For example, in this study, with the high-depth resequencing and genotype calling data, ~1.69 million high-quality variations were generated across the genome. However, an uneven distribution of these SNPs/indels was observed across some chromosomes (Supplementary Fig. S2). In contrast, several chromosomes with shorter lengths (chromosomes 1, 4, 5 and 13) had more SNPs/indels than the chromosomes with longer lengths (chromosomes 3, 6, 8, 9, and 10). This suggested that some genes in the chromosomes with high-density variations might play more important roles in the development and evolution of sesame.

Ortholog assignments

Based on the annotation information of the sesame reference genome, we identified 26,379 CDSs in each of the five accessions among a total of 27,148 protein-encoding genes in the sesame reference genome. These CDSs were mined by mapping to the sesame reference genome with the positions of each gene CDS. In this way, all the SNPs/indels could be included in the CDS specific to each accession if some SNPs/indels existed specifically in one accession. To analyze genes of S. indicum by comparing orthologs with other species in a monophyletic group, the six Pentapetalae species were carefully selected to be distributed in equilibrium phylogenetically with S. indicum.

However, the orthologs among S. indicum and the other six Pentapetalae species have not been identified, whereas the orthologs between the other six Pentapetale species were identified. Thus, we made entire orthologs including S. indicum and the other six Pentapetalae species by identifying orthologs between S. indicum and its closest neighbor among the other six Pentapetalae species, S. tuberosum. The orthologs sets between S. indicum and S. tuberosum were identified by reciprocal best hits (RBH). Candidate orthologs of sesame were then intersected with the orthologs of the other six species, downloaded from BioMart of Ensembl Plants. As a result, 8,281 pair wise orthologs sets were identified between S. indicum and S. tuberosum. Finally, 639 orthologs sets remained after extracting one-to-one (1:1) ortholog sets across all seven Pentapetalae species (Supplementary Table S2). In present study, only 1:1 orthologs were used for further dN/dS tests. One-to-many, many-to-many, and many-to-one orthologs were excluded to eliminate influences from genes that were duplicated one or more times in an ancestor of the current or other species after a speciation event. The 1:1 ortholog relationship is used in many comparative genomics studies, especially in the identification of functional orthologs (Bandyopadhyay et al. 2006); in evolutionary, comparative, and systematic studies in plants (Wu et al. 2006); and the mapping of biological pathways (Mao et al. 2006). We also used 1:1 orthologs which are suitable for comparative method based on proportion of functional changes (Sabeti et al. 2006).

Clearly, it would be easier to identify orthologs between two species if the duplication history of the genes in the two genomes was known (relative to their speciation event). Unfortunately, this evolutionary process is unknown in many cases. Thus, to find the most probable 1:1 orthologs assignment between two genomes should to be carried out based on sequence similarity. When predefined orthologs sets of a species group exist without target species, the strategy that identify pairwise orthologs between target species and its closest species in members of the predefined orthologs sets and construct whole orthologs sets had been used in genome project (Dalloul et al. 2010).

dN/dS test to identify specific, evolutionarily accelerated genes

Given the ortholog sets detected, we conducted a branch model-based maximum likelihood estimation of dN/dS in the seven species (total 12 accessions). Considering the insertion and deletion, PRANK was used to align the CDSs of all seven species. After removing poor alignments, 623 of 639 alignments remained. Under a phylogenetic model tree (Fig. 1), which indicated the tested branches for adapted genes (overall branch of S. indicum and each branch of each accession), we investigated the maximum likelihood estimation of dN/dS ratios with branch-site model A with codon frequency F3x4. In total, 198 evolutionarily accelerated orthologs were identified in S. indicum, including the five accessions, using a maximum likelihood estimation of the dN/dS ratio (Supplementary Table S3) with a value greater than 1. The difference in the maximum likelihood ratio was calculated as: LR = 2 (ln ML (dN/dS = 1)/ln ML (dN/dS)). LR was then compared against the χ2 distribution with one degree of freedom to test whether the dN/dS ratio was significantly > 1. The significance level was set at 0.05. Genes that passed this test are reported as positively selected. After adjusting the threshold with a P-value and FDR < 0.05, 66 positively selected genes were retained, and were used in further analyses.

Based on the sesame reference genome annotation, we mapped the 198 orthologs and 66 positively selected genes to the chromosomes of the sesame genome (Fig. 2). The accelerated orthologs showed a relatively even distribution across the chromosomes, with 15, 11, 19, 21, 12, 23, 11, 17, 20, 10, 13, 21, and 5 orthologs assigned to chromosomes 1 to 13, respectively. Except chromosome 13, on which only positively selected genes were located, the other chromosomes contained both positively selected orthologs and other accelerated orthologs. The distribution of these accelerated orthologs across the genome suggested that the branch model method we used here was reliable and these genes may play important roles in the development and domestication of sesame.

Enrichment and network analysis of accelerated genes

We performed a functional GO enrichment test for the accelerated genes and found 126 GO terms in three categories (Biological Process, Molecular Function, and Cellular Component) with 19 have P-values < 0.05. In these terms, the cytoplasmic part and cytoplasm in Cellular Components have highest numbers of genes (37 and 41, respectively), while the response to oxidative stress and DNA metabolic processes in Biological Processes only covered five genes (Fig. 3A). However, for the Molecular Function category, no significant GO term was found (Supplementary Table S4). Using the Plant Gene Set Enrichment Analysis web server, the genes were enriched to curated gene sets (including PlantCyc gene sets, Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets, Plant Ontology gene sets, and References collected gene sets). Except for six genes (SIN_1022276, SIN_1017317, SIN_1002603, SIN_1002195, SIN_1015149, and SIN_1004833) that did not map some gene sets, most of the other (~50) contributed to the Plant Ontology gene sets (Supplementary Fig. S3). The genes were related to the development of plant sperm cells, cytoplasmic parts, stems, carpels, leaves, roots, and seeds, indicating the important roles of these accelerated genes in sesame or other plants.

We then constructed a protein interaction network using the 66 positively selected genes as the nodes and protein-protein interaction data from the BioGRID database (http://thebiogrid.org/download.php) as the edges. The network contained 139 nodes and 400 edges, as visualized using Cytoscape with A. thaliana as the reference. Colors from green to red indicating dN/dS values from low to high in the network and the 19 positively selected genes with dN/dS values > 1 were indicated in Fig. 4. Detail information and annotation for these 19 genes are listed in Table 2. Many putative functions were characterized among these genes, including zinc ion binding, response to stress, membrane related, transferase activity, cytokinin-activated signaling pathway, fatty acid biosynthetic process, oxidation-reduction process, and abscisic acid-activated signaling pathway. It indicated that these positively selected genes could play important roles in the domestication of sesame for its specific characteristics like drought tolerance and high oil content.

In addition, we also mapped these genes to KEGG pathways (http://www.genome.jp/kegg/pathway.html). Apart from some regular pathways, we found one gene (SIN_1024502) that mapped to Arabidopsis orthologs in fatty acid metabolism and fatty acid elongation, which was also highlighted in the network analysis, and may contribute to the fatty acid biosynthetic process and oxidation-reduction process in sesame. The gene SIN_1022581(AT3G15150, AtMMS21), which encodes E3 SUMO-protein ligase, may regulates drought tolerance in Arabidopsis thaliana (Zhang et al. 2013) was also highlighted here. Since sesame usually has a relative high level of drought resistance, this gene could be contributed to drought in sesame and positively selected in the domestication. Another gene SIN_1026256, which is a ortholog of Arabidopsis SLAC1 that associated with stomatal closure, was also likely responsible for the difference of drought tolerance (Imai et al. 2015). The gene SIN_1002052 (AT5G08650), which is the homolog of translation factor GUF1, is critical for chloroplast protein synthesis under suboptimal conditions in Arabidopsis thaliana (Ji et al. 2012). Overall, these results suggested that the genes identified may play important roles in the development and domestication of sesame.

DISCUSSION

In this study, we used a branch model method to identify evolutionarily accelerated orthologs in sesame based on deep resequencing. In case of a shortage of gene annotation in sesame, we first assigned the 1:1 orthologs between sesame and six other Pentapetalae species that were distributed in equilibrium phylogenetically with S. indicum. In total, 639 1:1 orthologs were characterized, and 198 accelerated orthologs were identified after a branch model-based maximum likelihood estimation of dN/dS. Among these accelerated orthologs, 66 genes were positively selected under a significance level test. Functional enrichment of the positively selected genes using several methods suggested that these genes may play important roles in plant development during the sesame domestication. To our knowledge, this case study provides the example for orthologs identification in species that lack gene annotations.

Supplementary Information
ACKNOWLEDGEMENTS

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (No. NRF-2014R1A1A2057073).

Figures
Fig. 1. Branch model phylogenetic tree of 7 Pentapetalae plants and 5 Sesamum indicum accessions for the evolution analysis. Numbers in the brackets indicating the tested branches for adapted genes, (1) is for Sesamum indicum and (2)–(7) are for each core accessions. The grey color indicating the species tree.
Fig. 2. The location and distribution of all the accelerated orthologs and positively selected genes across the chromosomes of sesame. Each red or blue vertical bar represented one orthologs, and the red indicated the positively selected ones, and the blue indicated the other orthologs. LG01-LG13 represents the chromosomes, and the bold horizontal line indicates the length of each chromosome.
Fig. 3. Functional enrichment analysis of positively selected orthologs. Yellow bar is accelerated selected genes, blue bar is the Arabidopsis thaliana reference genome genes. GO enrichment analyses were performed using AgriGO () with Arabidopsis thaliana as the background species.
Fig. 4. Cytoscape protein-protein interaction network of the positively selected genes using Arabidopsis thaliana as reference (data from BioGRID). The color from green to red indicating the dN/dS value ranging from low to high. Total 19 positively selected genes that with dN/dS value more than 1 were marked and displayed in .
Tables

Summary of the accessions and resequencing status based on the sesame reference genome.

AccessionsKNUS124KNUS158KNUS160KNUS161KNUS162
NameHeukseonSADOVO 1NevadaAnsanMilsung
EcotypesS. indicum L.S. indicum L.S. indicum L.S. indicum L.S. indicum L.
OriginKoreaKoreaKoreaKoreaKorea
Flower colorLight redLight redPinkWhiteWhite
Total raw reads51,928,27668,070,22860,951,16053,119,46863,775,198
Clean reads42,036,75153,923,69949,859,36742,786,00750,639,996
Mapped reads33,967,71342,565,41440,064,70634,053,74240,022,382
Mapping depth17.29722.12920.87517.83320.82
Mapping rate (%)80.878.9480.3679.5979.03

Positively selected orthologs that were associated in the network analysis.

SesameA. thalianaP-valueFDRdN/dSAnnotation
SIN_1016274AT4G155450.0106020.0355797.05644Transmembrane transporter activity; integral component of membrane
SIN_1013043AT3G114600.003290.0173944.79731Zinc ion binding; mitochondrion; cytidine to uridine editing
SIN_1022318AT2G187600.0052250.02263313.07842DNA or RNA binding; response to stress; cellular processes
SIN_1006130AT2G242203.49E-113.45E-095.48865Purine nucleobase transport; integral component of membrane
SIN_1014290AT1G557401.53E-283.03E-261.12951Carbohydrate metabolic processes; cytoplasm; hydrolase activity
SIN_1023518AT4G359100.0009670.0087076.29376Transferase activity; lateral root formation; DNA or RNA binding; protein urmylation
SIN_1015526AT1G636600.0081830.0305443.61322GMP synthase (glutamine-hydrolyzing) activity; ATP binding
SIN_1024931AT1G021600.0006680.007786999Unknown molecular functions
SIN_1022581AT3G151501.62E-070.000008182.0115Cytokinin-activated signaling pathway; SUMO transferase activity; ligase activity
SIN_1025029AT2G443600.0008970.00845965.66277Unknown molecular functions
SIN_1013015AT2G363050.0008340.008256464.5285CAAX-box protein maturation; hydrolase activity
SIN_1023551AT3G137700.0027610.0173213.71444Mitochondrion; zinc ion binding
SIN_1024502AT3G457700.0033380.0173948.71415Fatty acid biosynthetic process; oxidation-reduction process
SIN_1022837AT2G024100.0137850.042647998.9973Unknown molecular functions
SIN_1009102AT1G098500.0028580.01732111.21159Extracellular; hydrolase activity
SIN_1026256AT1G124800.0014180.0108029.73025Abscisic acid-activated signaling pathway
SIN_1002052AT5G086500.0041580.020584159.029GTPase activity; nucleotide binding
SIN_1022276AT3G246600.0110710.03653619.87913Protein serine/threonine kinase activity
SIN_1002603AT3G557600.0030920.017394126.5731Chloroplast stroma

References
  1. Ahn H, Kim K-W, Kim HJ, Cho S, Kim H. 2014. Differential Evolution between Monotocous and Polytocous Species. Asian-australas. J Anim Sci. 27: 464-470.
  2. Anilakumar KR, Pal A, Khanum F, Bawa AS. 2010. Nutritional, medicinal and industrial uses of sesame (Sesamum indicum L.) seeds-an overview. Agriculturae Conspectus Scientificus (ACS). 75: 159-168.
  3. Bandyopadhyay S, Sharan R, Ideker T. 2006. Systematic identification of functional orthologs based on protein network comparison. Genome Res. 16: 428-435.
    Pubmed KoreaMed CrossRef
  4. Bedigian D. 2003. Evolution of sesame revisited: domestication, diversity and prospects. Genet Resour Crop Evol. 50: 779-787.
    CrossRef
  5. Bremer B, Bremer K, Chase M, Fay M, Reveal J, Soltis D, et al. 2009. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG III. Bot J Linn Soc. 161: 105-121.
    CrossRef
  6. Brudno M, Do CB, Cooper GM, Kim MF, Davydov E, et al., NISC Comparative Sequencing Program 2003. LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res. 13: 721-731.
    Pubmed KoreaMed CrossRef
  7. Castresana J. 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Molr Biol Evol. 17: 540-552.
    Pubmed CrossRef
  8. Chatr-Aryamontri A, Breitkreutz B-J, Oughtred R, Boucher L, Heinicke S, Chen D, et al. 2015. The BioGRID interaction database: 2015 update. Nucleic Acids Res. 43: D470-D478.
    Pubmed KoreaMed CrossRef
  9. Dalloul RA, Long JA, Zimin AV, Aslam L, Beal K, Blomberg LA, et al. 2010. Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis. PLoS Biol. 8: e1000475.
    Pubmed KoreaMed CrossRef
  10. Du Z, Zhou X, Ling Y, Zhang Z, Su Z. 2010. agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res. 38: W64-W70.
    Pubmed KoreaMed CrossRef
  11. Fletcher W, Yang Z. 2010. The effect of insertions, deletions, and alignment errors on the branch-site test of positive selection. Mol Biol Evol. 27: 2257-2267.
    Pubmed CrossRef
  12. Hedges SB, Marin J, Suleski M, Paymer M, Kumar S. 2015. Tree of life reveals clock-like speciation and diversification. Mol Biol Evol. 32: 835-845.
    Pubmed KoreaMed CrossRef
  13. Hurst LD. 2002. The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet. 18: 486.
    Pubmed CrossRef
  14. Imai H, Noda Y, Tamaoki M. 2015. Alteration of Arabidopsis SLAC1 promoter and its association with natural variation in drought tolerance. Plant Signal Behav. 10: e989761.
    Pubmed KoreaMed CrossRef
  15. Ji DL, Lin H, Chi W, Zhang LX. 2012. CpLEPA is critical for chloroplast protein synthesis under suboptimal conditions in Arabidopsis thaliana. PLoS One. 7: e49746.
    Pubmed KoreaMed CrossRef
  16. Katoh K, Kuma K, Toh H, Miyata T. 2005. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33: 511-518.
    Pubmed KoreaMed CrossRef
  17. Kim KW, Burt DW, Kim H, Cho S. 2013. Identification of differentially evolved genes: an alternative approach to detection of accelerated molecular evolution from genome-wide comparative data. Evol Bioinform Online. 9: 285-299.
    Pubmed KoreaMed CrossRef
  18. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 25: 1754-1760.
    Pubmed KoreaMed CrossRef
  19. Löytynoja A, Goldman N. 2005. An algorithm for progressive multiple alignment of sequences with insertions. Proc Natl Acad Sci USA. 102: 10557-10562.
    Pubmed KoreaMed CrossRef
  20. Mao F, Su Z, Olman V, Dam P, Liu Z, Xu Y. 2006. Mapping of orthologous genes in the context of biological pathways: An application of integer programming. Proc Natl Acad Sci USA. 103: 129-134.
    Pubmed KoreaMed CrossRef
  21. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20: 1297-1303.
    Pubmed KoreaMed CrossRef
  22. McPartland JM, Norris RW, Kilpatrick CW. 2007. Tempo and mode in the endocannaboinoid system. J Mol Evol. 65: 267-276.
    Pubmed CrossRef
  23. Muller NA, Wijnen CL, Srinivasan A, Ryngajllo M, Ofner I, Lin T, et al. 2016. Domestication selected for deceleration of the circadian clock in cultivated tomato. Nat Genet. 48: 89-93.
    Pubmed CrossRef
  24. Park JH, Suresh S, Cho GT, Choi NG, Baek HJ, Lee CW, et al. 2014. Assessment of molecular genetic diversity and population structure of sesame (Sesamum indicum L.) core collection accessions using simple sequence repeat markers. Plant Genet Resour. 12: 112-119.
    CrossRef
  25. Roth C, Liberles DA. 2006. A systematic search for positive selection in higher plants (Embryophytes). BMC Plant Biol. 6: 12.
    Pubmed KoreaMed CrossRef
  26. Sabeti PC, Schaffner SF, Fry B, Lohmueller J, Varilly P, Shamovsky O, et al. 2006. Positive natural selection in the human lineage. Science. 312: 1614-1620.
    Pubmed CrossRef
  27. Schwede T, Kopp J, Guex N, Peitsch MC. 2003. SWISS-MODEL: An automated protein homology-modeling server. Nucleic Acids Res. 31: 3381-3385.
    Pubmed KoreaMed CrossRef
  28. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13: 2498-2504.
    Pubmed KoreaMed CrossRef
  29. Wang L, Xia Q, Zhang Y, Zhu X, Zhu X, Li D, et al. 2016. Updated sesame genome assembly and fine mapping of plant height and seed coat color QTLs using a new high-density genetic map. BMC Genomics. 17: 31.
    Pubmed KoreaMed CrossRef
  30. Wang L, Yu S, Tong C, Zhao Y, Liu Y, Song C, et al. 2014. Genome sequencing of the high oil crop sesame provides insight into oil biosynthesis. Genome Biol. 15: R39.
    Pubmed KoreaMed CrossRef
  31. Wu F, Mueller LA, Crouzillat D, Pétiard V, Tanksley SD. 2006. Combining bioinformatics and phylogenetics to identify large sets of single-copy orthologous genes (COSII) for comparative, evolutionary and systematic studies: a test case in the euasterid plant clade. Genetics. 174: 1407-1420.
    Pubmed KoreaMed CrossRef
  32. Yang J, Liu D, Wang X, Ji C, Cheng F, Liu B, et al. 2016. The genome sequence of allopolyploid Brassica juncea and analysis of differential homoeolog gene expression influencing selection. Nat Genet. 48: 1225-1232.
    Pubmed CrossRef
  33. Yang Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24: 1586-1591.
    Pubmed CrossRef
  34. Yi DK, Kim KJ. 2012. Complete chloroplast genome sequences of important oilseed crop Sesamum indicum L. PLoS One. 7: e35872.
    Pubmed KoreaMed CrossRef
  35. Yi X, Du Z, Su Z. 2013. PlantGSEA: a gene set enrichment analysis toolkit for plant community. Nucleic Acids Res. 41: W98-W103.
    Pubmed KoreaMed CrossRef
  36. Zhang S, Qi Y, Liu M, Yang C. 2013. SUMO E3 ligase AtMMS21 regulates drought tolerance in Arabidopsis thaliana. J Integr Plant Biol. 55: 83-95.
    Pubmed CrossRef


June 2019, 7 (2)
Full Text(PDF) Free
Supplementary File

Cited By Articles
  • CrossRef (0)

Social Network Service
Services
  • Science Central