Flanking Sequence and Copy-Number Analysis of Transformation Events by Integrating Next-Generation Sequencing Technology with Southern Blot Hybridization
Plant Breeding and Biotechnology 2017;5:269-281
Published online December 1, 2017
© 2017 Korean Society of Breeding Science.

Yang Qin, Hee-Jong Woo, Kong-Sik Shin, Myung-Ho Lim, Hyun-Suk Cho, and Seong-Kon Lee*

National Institute of Agricultural Science, Rural Development Administration, Jeonju 54874, Korea
Correspondence to: Seong-Kon Lee, Tel: +82-63-238-4708, Fax: +82-63-238-4704
Received September 4, 2017; Revised October 19, 2017; Accepted October 23, 2017.
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

With the continual development of genetically modified (GM) crops, it has become necessary to develop detailed and effective molecular characterization methods to select candidate events from a large pool of transformation events. Relative to traditional molecular analysis methods such as the polymerase chain reaction (PCR) and Southern blot hybridization, next-generation sequencing (NGS) technology for whole-genome sequencing of complex crop genomes has proven comparatively useful for in-depth molecular characterization. In this study, four transformation events, including one in Bacillus thuringiensis (Bt)-resistant rice, one in resveratrol-producing rice, and two in beta-carotene-enhanced soybeans, were selected for molecular characterization. By merging NGS analysis and Southern blot hybridization results, we confirmed the transgene insertion sites, insertion structures, and insertion numbers of these four transformation events. In addition, the read-coverage depth assessed by NGS analysis for inserted genes might provide consistent results in terms of inserted T-DNA numbers in cases of complex insertion structures and highly duplicated donor genomes, where PCR-based methods can produce incorrect conclusions. Our combined method provides an effective and complete analytical approach for whole-genome visual inspection of transformation events that require biosafety assessment.

Keywords: T-DNA, flanking sequence, copy number, NGS, Southern blot hybridization, read coverage

The increasing world population and the degradation of ecological environments have created higher demands for crop production and food supplies. Since the first commercialized genetically modified (GM) crop, the Flavr Savr tomato, was released in 1994, an increasing number of GM crops have been developed and released (Redenbaugh et al. 1992), and 2 billion cumulative hectares of biotech crops have been successfully cultivated globally for foods, feeds, and other industrial supplies during the past two decades (James 2015). Desirable transgenic lines possess excellent agronomic performance and transgene expression. Subsequent molecular characterization of the transgene integration site, transgene structure, and transgene copy number determines whether a transgenic line can be considered for biosafety assessment (OECD 2010).

Traditional molecular methods for identifying T-DNA insertion sites and insertion numbers generally involve the polymerase chain reaction (PCR) and Southern blot hybridization. Inverse PCR (Ochman et al. 1988), thermal-asymmetric interlaced PCR (TAIL-PCR; Liu et al. 1995) and adapter-ligated PCR (O’Malley and Ecker 2010) are commonly used for detecting T-DNA flanking sequences. However, using these approaches, it is often difficult to obtain accurate information regarding the T-DNA integration site when confronted with T-DNA complex-inserted structures and clustering or duplication of the insertion-site sequence. In addition, Southern blot hybridization is a universally accepted technique to determine T-DNA copy numbers for transgenic events. To obtain regulatory approval of a candidate commercial transgenic event, biosafety assessment documentation based on Southern blot hybridization is necessary for each transgenic element (Codex Alimentarius Commission 2003; OECD 2010). Southern blot hybridization is not only very time-consuming and labor-intensive, but is also not sensitive enough to detect the occurrence of small sequence insertions or deletions (Yang et al. 2013).

The advent of next-generation sequencing (NGS) technologies opened a new era of genomics and molecular biology. In recent years, NGS-based molecular characterizations have been widely applied for studying many transgenic events (Daniela et al. 2013; Lepage et al. 2013; Park et al. 2015; Guo et al. 2016). In comparison with PCR, Sanger sequencing, and Southern blot hybridization, NGS has proven effective for detecting incomplete and multiple integration events (Zhang et al. 2012). Kovalic et al. (2012) successfully achieved full molecular characterization of a GM soybean by combining NGS and bioinformatics instead of employing a Southern blot-based method. An improved technology of targeted capture sequencing coupled with NGS was explored and shown to be a robust event-sorting tool (Zastrow-Hayes et al. 2015). In this study, NGS technology was applied for characterizing T-DNA insertion sites and copy numbers for four transgenic events in rice and soybean plants having one, two, or multiple copies of inserted T-DNA. By combining NGS and Southern blot analysis of events involving complex T-DNA insertion structures and different donor species, we could effectively determine T-DNA integration sites and structures, and predict T-DNA numbers. These findings suggest that the combined method may be useful for the molecular characterization of events requiring biosafety assessment.


Plant materials

Four transformation events, including one in the Bacillus thuringiensis (Bt)-resistant rice line C1-8-8, one in the resveratrol-producing rice line Iksan515, and two in the beta-carotene-enhanced soybean lines 9-1-2 and 10-19-1, were used as test materials in this study. The C1-8-8 marker-free Bt-resistant transgenic rice line was developed by T-DNA cassette insertion into the conventional Korean rice variety ‘Dongjin’ with a FLP/FRT-mediated spontaneous auto-excision system, as described by Woo et al. (2015), and the transformation vector after excision is represented schematically in Fig. 1A. The resveratrol-producing rice line Iksan515 has undergone one transformation event and was molecularly characterized by Qin et al. (2013) as the conventional Korean rice variety ‘Dongjin’ modified with an inverted-repeat insertion of two copies of a transformation vector, which we confirmed, as shown in Fig. 1B. Two beta-carotene-enhanced soybean lines (9-1-2 and 10-19-1) were developed by insertion of a T-DNA bicistronic system (phytoene synthase-2A-carotene desaturase) into the conventional Korean soybean variety ‘Kwangan’ (Qin et al. 2015). The transformation vector is shown schematically in Fig. 1C. The T-DNA cassettes of all four transgenic lines were inserted into the donor varieties by Agrobacterium-mediated transformation. The detailed plasmid constructions, total lengths, promoter directions, and enzyme sites were confirmed, as shown in Fig. 1. The known DNA sequences of each transformation vector also served as references for NGS analysis, as described below.

Genomic DNA isolation and quantification

Genomic DNA for NGS and Southern blot hybridization was extracted from fresh expanding leaves of all test materials, as well as the conventional donor rice and soybean controls. Total genomic DNA was isolated using a phenol/chloroform method, as described below. Approximately 0.5 g of ground leaf powder was mixed with 1.2 mL lysis buffer (50 mM Tris HCl [pH 8.0], 10 mM NaCl, 50 mM EDTA [pH 8.0]) and incubated with shaking for 1 hour at room temperature. The aqueous phase was extracted by centrifugation at 12,000 rpm for 15 minutes at 4°C, added to equal volumes of phenol: chloroform (1 : 1), and mixed by hand inversion for 3 minutes, followed by centrifugation at 12,000 rpm and 4°C for 5 minutes. The same step was repeated once by adding an equal volume of chloroform to the aqueous phase, with 3 minutes of inversion. The samples were centrifuged at 12,000 rpm for 5 minutes at 4°C and the aqueous phase transferred to a clean tube, followed by mixing with the same volume of isopropanol for 30 minutes at room temperature. After centrifugation at 12,000 rpm for 15 minutes at 4°C, the pellet was washed in 80% ethanol and incubated on ice for 10 minutes. The sample was centrifuged at 6,000 rpm for 5 minutes at 4°C, and the pellet was dried for 5 minutes. The DNA was dissolved in TEN buffer (5 mM Tris HCl, pH 8.0, 10 mM NaCl, 0.5 mM EDTA, pH 8.0) and treated with RNase at 37°C for 30 minutes. Two sequential extractions were performed by adding an equal volume of phenol or phenol: chloroform (1 : 1) to the aqueous phase, followed by mixing and centrifugation at 12,000 rpm for 5 minutes at 4°C. The DNA was pelleted by mixing the aqueous phase with 0.1 volume of 3 M sodium acetate and 2.5 volumes of 100% ethanol, followed by centrifugation, after which the pelleted DNA was washed in 70% ethanol. The DNA was air dried and re-dissolved in TE buffer (pH 8.0). 
DNA concentrations were quantified using a NanoDrop ND 1000 spectrophotometer (ThermoFisher Scientific, Inc., Wilmington, NC, USA), and the DNA was stored in a −20°C freezer.

NGS analysis and sequence assembly

Genomic DNA libraries were prepared using a TruSeq Library Kit (Illumina, Inc. USA), following the manufacturer’s recommended protocol. Six separate indexed libraries were produced for four test materials and two donor controls, and these six libraries were used to create test and control library pools before sequencing. The pooled sequencing library samples were sequenced using the Illumina HiSeq platform (Illumina, Inc. USA), following the manufacturer’s procedure (Kovalic et al. 2012). Sufficient numbers of sequence reads were obtained from each pooled sample library to give approximately 30× genome coverage. Raw FASTQ files were created for each test material, and reads were mapped to the references for each material using CLC Genomics Workbench software, version 9.0. For each transformation event, reads were mapped to two references: a whole-genome reference sequence and the transformation vector sequence. The complete genome sequence of japonica Nipponbare (Os-Nipponbare-Reference-IRGSP-1.0) was used as the mapping reference for the rice donor and its transformation events. The complete Glycine max reference genome sequence for Williams 82, provided by PlantGDB, was used as the mapping reference for soybean DNA and the associated transformation events.

Southern blot hybridization

Approximately 20 μg of digested rice genomic DNA (40 μg for soybeans) was separated on a 0.8% (w/v) agarose gel and then blotted onto a Hybond-N+ membrane (Amersham), according to the manufacturer’s instructions. Each transformation plasmid (approximately 50 ng) was used as template DNA for probe preparation using the PCR DIG Probe Synthesis Kit (Roche Diagnostics). Blotted membranes were pre-hybridized at 42°C for 2 hours. Afterwards, a denatured DIG-labeled probe was added and hybridization was performed overnight at 42°C. Signal detection by chemiluminescence was performed following the manufacturer’s instructions for the DIG Detection Starter Kit II (Roche Applied Science, Indianapolis, IN).


NGS sequencing and read mapping to references

Our NGS study was designed to produce 30× read coverage sequence data for the rice and soybean lines. Illumina sequencing yielded totals of 14,012 × 10^6 bp, 13,521 × 10^6 bp, and 10,816 × 10^6 bp for the genomes of the Dongjin rice variety and the transgenic lines C1-8-8 and Iksan515, respectively (Table 1). Reads mapping to the japonica rice Nipponbare reference genome accounted for 92.07%, 92.48%, and 89.65% of the sequencing data, respectively, giving effective sequencing coverages of 37.18×, 35.89×, and 28.70× for Dongjin, C1-8-8, and Iksan515. In addition, totals of 44,604 × 10^6 bp, 47,538 × 10^6 bp, and 33,735 × 10^6 bp of sequence data were generated for the Kwangan soybean variety and the transgenic soybean lines 10-19-1 and 9-1-2, respectively, with 85.87%, 85.91%, and 82.02% of the reads mapping to the G. max Williams 82 reference genome. The effective sequencing coverages were 45.30×, 47.28×, and 33.53×, respectively.
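As a rough cross-check on figures of this kind, effective coverage can be approximated as the number of mapped bases divided by the reference length. The sketch below assumes this simple definition; the mapping software may compute average coverage differently, so it is not expected to reproduce Table 1 exactly. The function name and the input values in the example are hypothetical round numbers, not data from this study.

```python
def effective_coverage(total_bases: float, mapped_fraction: float,
                       reference_length: float) -> float:
    """Approximate coverage as mapped bases per reference base.

    total_bases      -- total sequenced bases (bp)
    mapped_fraction  -- fraction of the data that mapped (0.0 to 1.0)
    reference_length -- length of the reference genome (bp)
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    if not 0.0 <= mapped_fraction <= 1.0:
        raise ValueError("mapped_fraction must be between 0 and 1")
    return total_bases * mapped_fraction / reference_length


# Hypothetical example: 12 Gb of sequence, 90% mapped, 360 Mb reference.
example = effective_coverage(12e9, 0.90, 360e6)
print(f"approximate coverage: {example:.1f}x")
```

Under these assumed inputs the estimate is close to the 30× design target mentioned above.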

Sequence reads for each transformation event were mapped to the transformation vector sequence, and sequence coverage is shown in Fig. 2. Most sequence reads were paired-end reads (marked in blue), whereas single reads were mainly concentrated at the left and right integration sites (red and green). For the three transformation vectors (pCMF-Cry-MF, PSB2220, and PAC-vector), with sequence lengths of 2,884 bp, 5,609 bp, and 5,435 bp, respectively, 566 reads were mapped for C1-8-8, 2,574 reads for Iksan515, and 1,592 reads for 10-19-1 and 4,694 reads for 9-1-2. The average read coverages of the transformation vectors were 29.00×, 45.89×, 40.34×, and 127.30× for C1-8-8, Iksan515, 10-19-1, and 9-1-2, respectively (Table 2).

Junction sequence analysis for transformation events

Junction sequences and transgene insertion sites in chromosomes were analyzed for the four transformation events. The Bt-resistant transgenic line C1-8-8 showed only one type of sequence read at each of the left border (LB) and right border (RB) junctions (Fig. 3). The left and right junction-sequence reads were individually aligned to nucleotide positions 24,020,183–24,020,142 and 24,020,082–24,020,042 of rice chromosome 8, based on the Nipponbare rice genome sequence. Likewise, two junction sequences were read at nucleotides 24,020,080–24,020,140 of chromosome 8, and alignment results indicated that they corresponded to bp 46–101 of the LB region and bp 46–62 of the RB region of the transformation vector, respectively, both in the reverse direction (Fig. 3C). These data indicated that one copy of T-DNA had been inserted in the reverse direction at nucleotides 24,020,082–24,020,142 of rice chromosome 8 (Table 3).
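The idea behind junction-sequence analysis, a read that is part genome and part vector, can be illustrated with a toy example. The sketch below is only an assumption-laden illustration, not the pipeline used in this study: it uses exact string matching instead of a real aligner, the function names are hypothetical, and the short genome and vector strings are invented for demonstration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_either_strand(fragment: str, reference: str):
    """Return (0-based start, strand) of an exact match, or None."""
    pos = reference.find(fragment)
    if pos != -1:
        return pos, "+"
    pos = reference.find(reverse_complement(fragment))
    if pos != -1:
        return pos, "-"
    return None


def split_junction_read(read: str, genome: str, vector: str, min_len: int = 8):
    """Find the first split where one part matches the genome and the
    other part matches the vector; each part needs at least min_len bases."""
    read = read.upper()
    for i in range(min_len, len(read) - min_len + 1):
        left, right = read[:i], read[i:]
        for first, second, order in ((genome, vector, "genome-vector"),
                                     (vector, genome, "vector-genome")):
            hit_left = find_either_strand(left, first)
            hit_right = find_either_strand(right, second)
            if hit_left is not None and hit_right is not None:
                return {"split": i, "order": order,
                        "left_hit": hit_left, "right_hit": hit_right}
    return None


# Invented toy sequences (not real rice, soybean, or vector sequence).
genome = "ACGTTGCAAGGCTTACCGATTGACCTGAAGTC"
vector = "GATCCGTAGCTAGGCATCAATCGTTAGGACTCA"
# A read that starts in the genome and continues into the vector,
# with the vector part in the reverse orientation.
read = genome[5:17] + reverse_complement(vector[3:15])
print(split_junction_read(read, genome, vector))
```

In this toy case the vector part is reported on the "-" strand, which corresponds to the "reverse direction" insertions described above. Real data would need an aligner that tolerates mismatches and reports clipped or split reads.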

For the resveratrol-producing transgenic rice line Iksan515, two sequence reads were observed at both the LB and RB junctions (Fig. 4). Two sequence reads at the LB junction were separately aligned to the 28,858,865–28,858,848 and 28,858,763–28,858,797 bp regions of chromosome 4 (Fig. 4A). At the RB junction region, BLAST searches revealed a sequence corresponding to bp 60–107 of the RB region of the transformation vector in the reverse direction (Fig. 4B). The genome region from 28,858,780–28,858,850 bp in rice chromosome 4 was inspected to confirm the transgene insertion site and structure. Our results indicated that two sequences were integrated into the genome and separately aligned to bp 17–61 and bp 37–78 of the LB region of the transformation vector, in the forward and reverse directions, respectively (Fig. 4C). Therefore, it appeared that two copies of T-DNA with tandem inverted repeats (tail-to-tail) were inserted at bp 28,858,797–28,858,848 of rice chromosome 4 (Table 3).

The sequence reads at the junction region in the beta-carotene-enhanced transgenic soybean line 10-19-1 were similar to those of the Bt-resistant rice line C1-8-8. T-DNA was inserted in the reverse direction at bp 23,918,809–23,918,840 of soybean chromosome 13, as shown in Fig. 5. However, the other soybean line (9-1-2) was found to have complex T-DNA insertions. We found three individual junction-sequence reads at each of the LB and RB regions of the transformation vector (Fig. 6). Based on BLAST searching, one read in the LB region mapped to bp 1,680,418–1,680,448 of soybean chromosome 11; however, the other two reads corresponded to bp 381–436 and 300–368 of the LB region of the transformation vector and were both in the reverse direction (Fig. 6A). For the RB junction sequence, one of the three reads was mapped by BLAST searching to bp 755,381–755,401 of soybean chromosome 5, and the other two were mapped to bp 12,709,949–12,709,967 and 12,709,991–12,710,006 of soybean chromosome 15 (Fig. 6B). Based on junction-sequence analysis, three insertions were identified on soybean chromosomes 11, 5, and 15, respectively. Furthermore, two reverse sequences were found in the LB junction areas, showing that both insertions had inverted tandem-repeat (head-to-head) structures.

In addition, two RB junction-sequence reads were located at adjacent positions on chromosome 15, indicating that a tandem inverted-repeat structure was probably located at bp 12,709,949–12,709,991 of chromosome 15. To further confirm T-DNA integration, three chromosome regions, namely bp 755,200–755,400 of chromosome 5, bp 1,680,400–1,680,560 of chromosome 11, and bp 12,709,940–12,710,040 of chromosome 15, were analyzed, as shown in Fig. 7. The sequence from bp 24–87 of the RB region of the transformation vector aligned to bp 755,200–755,400 of chromosome 5, suggesting that at least one copy of T-DNA had been inserted in the forward direction at this position (Fig. 7A). On chromosome 11, we found that the sequence from bp 368–400 of the LB region of the transformation vector was integrated into the soybean genome sequence in the forward direction (Fig. 7B). Sequencing by synthesis revealed that at least one T-DNA (associated with a 367-bp sequence deletion at the LB region) was inserted into bp 1,680,460–1,680,560 of chromosome 11 in the forward direction. In addition, two sequence reads spanning bp 27–62 of the RB region (in the reverse direction) and bp 24–62 of the RB region (in the forward direction) aligned with bp 12,709,950–12,710,020 of chromosome 15. The presence of these junction sequences suggested that two T-DNAs with an inverted (head-to-head) tandem-repeat structure were inserted into this chromosome region. In addition, we deduced that two inverted tandem-repeat insertions were present in the soybean genome of the 9-1-2 line (Fig. 6A). Because one of them was accounted for on chromosome 15, we presumed that the other inverted tandem-repeat T-DNA insertion was located on chromosome 5. In conclusion, the soybean line 9-1-2 had five or more copies at three insertion sites, based on the NGS analysis.

Southern blot hybridization

Using inserted genes as probes, Southern blot hybridizations were performed for the four transformation events. A 4–5 kb signal for the Cry1AC gene was observed in the rice line C1-8-8 (Fig. 8A) after digestion with EcoRI (Fig. 1A). For the soybean line 10-19-1, one signal each for the Bar and CrtI genes was detected following treatment with BamHI (Fig. 8D and 8E). These results clearly demonstrated the presence of one T-DNA insertion in the rice and soybean genomes, respectively, in accordance with the conclusions from NGS analysis. For the Iksan515 line, the Bar and RS genes encoded by the transformation vector (described in Fig. 1B) were separately used as probes to detect the T-DNA copy number. Results from Southern blot hybridizations showed two detectable signals using the Bar probe and one signal using the RS probe (Fig. 8B and 8C), which was consistent with the prediction that two copies of each gene were inserted in an inverted tandem-repeat structure at one locus of a rice chromosome. Southern blot hybridizations performed with the Bar and CrtI probes revealed the presence of three and five signals, respectively, in the 9-1-2 line (Fig. 8D and 8E). Therefore, we inferred that two T-DNAs with inverted tandem-repeat (tail-to-tail) structures were inserted at the 1,680,460–1,680,560-bp region in chromosome 11. Collectively, the NGS and Southern blot data showed that the 9-1-2 line has six T-DNA copies inserted into three locations, including one tail-to-tail inverted tandem repeat and two head-to-head inverted tandem repeats (Fig. 7 and Table 3).
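The copy-number reasoning for line 9-1-2 can be written out as a simple per-locus tally. The sketch below only restates the counts given in the text: from NGS alone, chromosome 11 carried at least one copy and chromosomes 5 and 15 each carried an inverted repeat (two copies), whereas the Southern blot results raised chromosome 11 to two copies. The dictionary and function names are hypothetical.

```python
# Minimum T-DNA copies per locus for line 9-1-2 inferred from NGS alone.
ngs_only = {"Chr.5": 2, "Chr.11": 1, "Chr.15": 2}

# After Southern blot hybridization, Chr.11 was inferred to carry a
# tail-to-tail inverted repeat, i.e. two copies instead of one.
ngs_plus_southern = dict(ngs_only)
ngs_plus_southern["Chr.11"] = 2


def total_copies(copies_per_locus: dict) -> int:
    """Sum the T-DNA copies over all insertion loci."""
    return sum(copies_per_locus.values())


print("NGS alone (lower bound):", total_copies(ngs_only))
print("NGS + Southern blot:", total_copies(ngs_plus_southern))
```

The first total is a lower bound ("five or more"), and the second matches the six copies at three loci reported in Table 3.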

Read coverage depth for inserted genes of transformants

The average read coverage of the inserted genes was calculated for the four transformants. The single-copy housekeeping genes sucrose phosphate synthase (SPS, U33175) of rice and lectin 1 (Le1, K00821.1) of soybean were selected as references for read-coverage statistics (Table 2). The read depth was calculated as the ratio of the average read coverage of an inserted gene to the average read coverage of the housekeeping gene. This quantitative value could provide a valuable reference for transgene insertion numbers in individual transformation events. Our calculations suggested that one transgene copy generally had a read depth in the range of 0.76–1.29 for rice and soybeans, as shown in Table 2. In addition, the transgenic rice line Iksan515, with a two-copy insertion, had a read depth ranging from 1.66 to 1.99. For the multi-copy transgenic soybean line 9-1-2, a read depth ranging from 4.18 to 4.70 was observed for the inserted genes, although six insertions were determined by NGS technology and Southern blot hybridization.
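The read-depth ratio defined above can be recomputed directly from the average coverage values listed in Table 2. The sketch below is a minimal illustration; the data structure and function names are hypothetical, and only the inserted genes (not the whole T-DNA vector rows) are included.

```python
# Average coverage values transcribed from Table 2.
TABLE2 = {
    "C1-8-8": {"housekeeping": 33.41,
               "inserted": {"Cry1AC": 28.13}},
    "Iksan515": {"housekeeping": 25.55,
                 "inserted": {"Bar": 42.49, "Resveratrol synthase": 50.79}},
    "10-19-1": {"housekeeping": 39.58,
                "inserted": {"Bar": 30.18, "Phytoene synthase": 50.96,
                             "Carotene desaturase": 38.24}},
    "9-1-2": {"housekeeping": 30.20,
              "inserted": {"Bar": 126.09, "Phytoene synthase": 142.03,
                           "Carotene desaturase": 135.46}},
}


def read_depth(inserted_coverage: float, housekeeping_coverage: float) -> float:
    """Read depth = inserted-gene coverage / housekeeping-gene coverage."""
    return inserted_coverage / housekeeping_coverage


def depth_range(line: str) -> tuple:
    """Return (min, max) read depth over the inserted genes of one line."""
    record = TABLE2[line]
    depths = [read_depth(cov, record["housekeeping"])
              for cov in record["inserted"].values()]
    return round(min(depths), 2), round(max(depths), 2)


for line in TABLE2:
    low, high = depth_range(line)
    print(f"{line}: read depth {low:.2f} to {high:.2f}")
```

Rounding the ratio to the nearest integer gives a plausible copy number for the one-copy and two-copy lines, but for 9-1-2 the ratio (about 4.2 to 4.7) falls short of the six copies established by the combined analysis, so the ratio is better treated as a reference value than as a direct count.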


In this study, NGS analysis combined with Southern blotting was performed to detect flanking sequences and copy numbers for two rice and two soybean transformation events. In recent years, NGS technologies have been recommended as a highly sensitive and cost- and labor-effective alternative to traditional Southern blot analysis for the molecular characterization of transformation events (Guttikonda et al. 2016). However, a literature search conducted to determine whether NGS can replace techniques commonly used for molecular characterization of transformation events indicated that NGS does not always provide more valuable information than Southern blot hybridization and Sanger sequencing (Pauwels et al. 2015). Data from the present study suggest that NGS technologies should be used for some transformation events with complex insertion structures and highly duplicated genomes.

Southern blot hybridization requires careful experimental design. During the molecular characterization of different kinds of transformation events in crops, clean DNA preparation, effective enzyme digestion, and specific probe labeling have a direct bearing on the interpretation of results. In this study, two kinds of crops (donor rice and soybeans) were separately used to develop two different transformation events. Compared to rice, soybean leaves or seeds require much more extensive purification with chloroform/isoamyl alcohol to acquire clean DNA because they contain more metabolites, lipids, and proteins. Insufficient DNA digestion or over-digestion by specific enzymes also impacts experimental results. Furthermore, transgenes may undergo unpredictable insertions, such as a partial insertion, gene rearrangement, or multiple insertions, suggesting that each transgenic element and the effects of several restriction enzymes should be studied. These problems require several labor- and time-consuming steps.

To select desirable events from a large pool of transformation events, we often assign a high priority to favorable phenotypes and a single-copy insertion with a clean flanking sequence. However, adaptor-ligated PCR or inverse PCR, which rely on the sequence information of transgenic elements, cannot provide correct and complete information in the case of tandem constructions and rearrangement modifications. In addition, analysis of the highly duplicated genomes of paleopolyploid species such as wheat, maize, and soybean by general PCR walking methods often fails to identify flanking sequences of inserted T-DNAs (Kim et al. 2009). In these cases, NGS analyses can replace procedures such as PCR and Sanger sequencing. The beta-carotene-enhanced soybean line 9-1-2 was incorrectly determined to contain a tandem structure in one locus by adaptor-ligated PCR and Southern blot hybridization, which led to continued experiments for this line in this study. Subsequently, the underestimated copy numbers were corrected by NGS analysis. Using junction sequence information, the flanking sequences and integrated structures were inferred at the whole-genome level. In addition, the read-coverage depth of inserted genes provides a reference for the T-DNA copy numbers of transformation events. In the present study, two crop species and four transformation events with different copy numbers provided a valuable read-depth reference, which was very useful for discerning T-DNA copy numbers.

A sequencing depth of 75× is often recommended for the molecular characterization of transformation events by NGS analysis (Kovalic et al. 2012). Several studies have employed a 20×-30× sequencing depth to detect flanking sequences and inserts when studying transformation events (Yang et al. 2013; Guo et al. 2016). In this study, we considered that increasing sequencing depth provides access to relatively complete sequence information, which seems beneficial for revealing T-DNA integrations and numbers; however, increased sequencing depths involve higher costs. For insertions with tandem-repeat structures at one genomic site, generally a 30× sequencing depth can provide most information needed for such molecular characterization. Integrating Southern blot hybridization and long-PCR technology could provide effective molecular characterization for candidate events.

The junction sequencing depth relative to whole-genome sequencing coverage is regarded as being very important for identifying T-DNA copy numbers and the integration structure. Target-capture sequencing of target regions was demonstrated to provide a remarkable increase in coverage at junction regions (Guttikonda et al. 2016). In this study, the average number of junction sequence reads was 16 each for the LB and RB regions of the transformation vector in the C1-8-8 and 10-19-1 lines, whereas 10 and 35 sequence reads were found for both junction regions in the Iksan515 and 9-1-2 lines, respectively. Regarding single-locus insertions, 10–16 junction reads could provide sufficient T-DNA insertion information; however, 35 sequence reads were not enough to obtain the entire junction information for 9-1-2, a six-copy T-DNA insertion line. Hence, a complementary measure is to re-confirm the junction sequence from the T-DNA insertion site in the chromosome, which is an advantage of whole-genome re-sequencing. In addition, the beta-carotene-enhanced soybean line 9-1-2 was unsuitable for detailed molecular characterization due to its multi-copy T-DNA insertion. However, extending the NGS analysis to complex T-DNA insertions revealed a correlation between read-coverage depths and copy numbers for transformation events. As mentioned above, an integrated analysis with both NGS and Southern blot hybridization is recommended for molecular characterization of candidate transformation events.


This work was supported by a grant (PJ010902) from the National Institute of Agricultural Science (Rural Development Administration, Republic of Korea).

Fig. 1. Three transformation vector constructs of four transgenic lines. (A) Vector construction used to produce Bt-resistant transgenic rice C1-8-8. (B) Vector construction used to produce resveratrol-producing transgenic rice Iksan515. (C) Vector construction used to produce beta-carotene-enhanced transgenic soybeans 10-19-1 and 9-1-2. The red star signifies the enzyme selected for Southern blot hybridization.
Fig. 2. Four transgenic lines and reference read-mapping coverages of their respective transformation vectors. (A) Bt-resistant transgenic rice C1-8-8. (B) Resveratrol-producing transgenic rice Iksan515. (C) Beta-carotene-enhanced transgenic soybean 10-19-1. (D) Beta-carotene-enhanced transgenic soybean 9-1-2. Single reads mapping in the forward and reverse directions are shown in green and red, respectively; Paired reads including both the forward and reverse directions are shown in blue; Non-specific matches are shown in yellow.
Fig. 3. Flanking sequence analysis of Bt-resistant transgenic rice C1-8-8 by NGS. (A) Left border flanking sequence. (B) Right border flanking sequence. (C) Transgene insertion site on rice chromosome 8.
Fig. 4. Flanking sequence analysis of resveratrol-producing rice Iksan515 by NGS. (A) Left border flanking sequence. (B) Right border flanking sequence. (C) Transgene insertion site on rice chromosome 4.
Fig. 5. Flanking sequence analysis of beta-carotene enhanced soybean 10-19-1 by NGS. (A) Left border flanking sequence. (B) Right border flanking sequence. (C) Transgene insertion site on soybean chromosome 13.
Fig. 6. Flanking sequence analysis of beta-carotene enhanced soybean 9-1-2 by NGS. (A) Left border flanking sequence. (B) Right border flanking sequence.
Fig. 7. Transgene insertion sites of beta-carotene-enhanced soybean 9-1-2 on soybean chromosomes 5, 11 and 15. (A) Transgene insertion site on chromosome 5. (B) Transgene insertion site on chromosome 11. (C) Transgene insertion site on chromosome 15.
Fig. 8. Transgene insertion numbers analysis by Southern blot hybridization for each transgenic line. (A) C1-8-8 rice line detected by the Cry1AC probe. (B, C) Iksan515 rice line detected by Bar and RS probes, respectively. (D, E) 9-1-2 and 10-19-1 soybean lines detected by Bar and CrtI probes, respectively.

Table 1. Reference assembly of four transgenic lines and their two donor varieties onto the Nipponbare genome (Oryza sativa Japonica) and Williams 82 genome (Glycine max).

| Donors/Transformants | Total reference length (bp) | Total read length (bp) | Average coverage | Total read count | Reads in aligned pairs | Mapping reads percentage (%) |
|---|---|---|---|---|---|---|

Table 2. Read depth of inserted genes versus each reference gene.

| Transformants | Features | Reference name | Reference length (bp) | Mapped reads | Average coverage | Read depth z) | Copy number |
|---|---|---|---|---|---|---|---|
| C1-8-8 (Rice) | Inserted gene | Cry1AC | 1,860 | 355 | 28.13 | 0.84 | 1 |
| | T-DNA vector | PCMF-Cry-MF | 2,884 | 566 | 29.00 | 0.87 | |
| | H.S. gene y) | SPS (U33175) | 7,150 | 1,610 | 33.41 | | |
| Iksan515 (Rice) | Inserted gene | Bar | 554 | 243 | 42.49 | 1.66 | 2 |
| | Inserted gene | Resveratrol synthase | 1,170 | 598 | 50.79 | 1.99 | |
| | T-DNA vector | PSB2220 | 5,609 | 2,574 | 45.89 | 1.80 | |
| | H.S. gene | SPS (U33175) | 7,150 | 1,829 | 25.55 | | |
| 10-19-1 (Soybean) | Inserted gene | Bar | 552 | 121 | 30.18 | 0.76 | 1 |
| | Inserted gene | Phytoene synthase | 1,257 | 434 | 50.96 | 1.29 | |
| | Inserted gene | Carotene desaturase | 1,479 | 384 | 38.24 | 0.97 | |
| | T-DNA vector | PAC-vector | 5,435 | 1,467 | 40.34 | 1.02 | |
| | H.S. gene | Le1 (K00821.1) | 2,152 | 580 | 39.58 | | |
| 9-1-2 (Soybean) | Inserted gene | Bar | 552 | 496 | 126.09 | 4.18 | 6 |
| | Inserted gene | Phytoene synthase | 1,257 | 1,221 | 142.03 | 4.70 | |
| | Inserted gene | Carotene desaturase | 1,479 | 1,375 | 135.46 | 4.49 | |
| | T-DNA vector | PAC-vector | 5,435 | 4,694 | 127.30 | 4.22 | |
| | H.S. gene | Le1 (K00821.1) | 2,152 | 450 | 30.20 | | |

z) Read depth = average coverage of inserted gene (transformation vector) / average coverage of housekeeping gene.

y) H.S. gene: housekeeping gene with one copy in plant genomes.

Table 3. T-DNA insertion sites and insertion structures of four transformants.

| Donor genome | Transformants | Insertion chr. z) | Insertion site | Insertion structure |
|---|---|---|---|---|
| Rice | C1-8-8 | Chr.8 | 24,020,082-24,020,142 | Reverse (RB-LB) |
| Rice | Iksan515 | Chr.4 | 28,858,797-28,858,848 | Inverted repeat (tail to tail) |
| Soybean | 10-19-1 | Chr.13 | 23,918,809-23,918,840 | Reverse (RB-LB) |
| Soybean | 9-1-2 | Chr.5 | 755,381-755,401 | Inverted repeat (head to head) |
| | | Chr.11 | 1,680,418-1,680,448 | Inverted repeat (tail to tail) |
| | | Chr.15 | 12,709,967-12,709,991 | Inverted repeat (head to head) |

z) T-DNA insertion chromosome of donor genome.

  1. Codex Alimentarius Commission (2003). Guideline for the conduct of food safety assessment of foods derived from recombinant-DNA plants. CAC/GL. 45,
  2. Daniela, W, Leif, S, Joachim, B, and Lutz, G (2013). Next-generation sequencing as a tool for detailed molecular characterization of genomic insertions and flanking regions in genetically modified plants: a pilot study using a rice event unauthorized in the EU. Food Anal Method. 6, 1718-1727.
  3. Guo, B, Guo, Y, Hong, H, and Qiu, LJ (2016). Identification of genomic insertion and flanking sequence of G2-EPSPS and GAT transgenes in soybean using whole genome sequencing method. Front Plant Sci. 7, 1009.
  4. Guttikonda, SK, Marri, P, Mammadov, J, Ye, L, Soe, K, and Richey, K (2016). Molecular characterization of transgenic events using next generation sequencing approach. PLoS ONE. 11, e0149515.
  5. James, C (2015). 20th Anniversary (1996 to 2015) of the global commercialization of biotech crops and biotech crop highlights in 2015. ISAAA Brief No. 51. Ithaca, NY: ISAAA
  6. Kim, KD, Shin, JH, Van, K, Kim, DH, and Lee, SH (2009). Dynamic rearrangements determine genome organization and useful traits in soybean. Plant Physiol. 151, 1066-1076.
  7. Kovalic, D, Garnaat, C, Guo, L, Yang, YP, Groat, J, and Silvanovich, A (2012). The use of next generation sequencing and junction sequence analysis bioinformatics to achieve molecular characterization of crops improved through modern biotechnology. Plant Genome. 5, 149-163.
  8. Lepage, E, Zampini, E, Boyle, B, and Brisson, N (2013). Time and cost-efficient identification of T-DNA insertion sites through targeted genomic sequencing. PLoS ONE. 8, e70912.
  9. Liu, YG, Mitsukawa, N, Oosumi, T, and Whittier, RF (1995). Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR. Plant J. 8, 457-463.
  10. Ochman, H, Gerber, AS, and Hartl, DL (1988). Genetic applications of an inverse polymerase chain reaction. Genetics. 120, 621-623.
  11. OECD (2010). Consensus document on molecular characterisation of plants derived from modern
  12. O’Malley, RC, and Ecker, JR (2010). Linking genotype to phenotype using the Arabidopsis unimutant collection. Plant J. 61, 928-940.
  13. Park, D, Kim, DG, Jang, G, Lim, JS, Shin, YJ, and Kin, J (2015). Efficiency to discovery transgenic loci in GM rice using next generation sequencing whole genome re-sequencing. Genomics Inform. 13, 81-85.
  14. Pauwels, K, De Keersmaecker, SC, De Schrijver, A, du Jardin, P, Roosens, NH, and Herman, P (2015). Next-generation sequencing as a tool for the molecular characterisation and risk assessment of genetically modified plants: added value or not?. Trends Food Sci Tech. 45, 319-326.
  15. Qin, Y, Ahn, HI, Kweon, SJ, Baek, SH, Shin, KS, and Woo, HJ (2013). Molecular characterization of transgenic rice producing resveratrol. Plant Breed Biotech. 1, 406-415.
  16. Qin, Y, Kweon, SJ, Chung, YS, Ha, SH, Shin, KS, and Lim, MH (2015). Selection of β-carotene enhanced transgenic soybean containing single-copy transgene and analysis of integration sites. Korean J Breed Sci. 47, 111-117.
  17. Redenbaugh, K, Hiatt, B, Martineau, B, Kramer, M, Sheehy, R, and Sanders, R (1992). Safety assessment of genetically engineered fruits and vegetables: A case study of the Flavr SavrTM Tomato. Boca Raton, FL: CRC Press
  18. Woo, HJ, Qin, Y, Park, SY, Park, SK, Cho, YG, and Shin, KS (2015). Development of selectable marker-free transgenic rice plants with enhanced seed tocopherol content through FLP/FRT-mediated spontaneous auto-excision. PLoS ONE. 10, e0132667.
  19. Yang, LT, Wang, CM, Jensen, AH, Morisset, D, Lin, YJ, and Zhang, DB (2013). Characterization of GM events by insert knowledge adapted re-sequencing approaches. Sci Rep. 3, 127-132.
  20. Zastrow-Hayes, GM, Lin, H, Sigmund, AL, Hoffman, JL, Alarcon, CM, and Hayes, KR (2015). Southern-by-sequencing: a robust screening approach for molecular characterization of genetically modified crops. Plant Genome. 8, 1-15.
  21. Zhang, R, Yin, Y, Zhang, Y, Li, K, Zhu, H, and Gong, Q (2012). Molecular characterization of transgene integration by next-generation sequencing in transgenic cattle. PLoS ONE. 7, e50348.

December 2017, 5 (4)