Background: Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent genome regions. De novo sequencing and assembly of M. truncatula genomes enables near-comprehensive discovery of structural variants (SVs), analysis of rapidly evolving gene families, and ultimately, construction of a pan-genome.
Results: Genome-wide synteny based on 15 de novo M. truncatula assemblies effectively detected different types of SVs, indicating that as much as 22% of the genome is involved in large structural changes, altogether affecting 28% of gene models. A total of 63 million base pairs (Mbp) of novel sequence was discovered, expanding the reference genome space for Medicago by 16%. Pan-genome analysis revealed that 42% (180 Mbp) of genomic sequence is missing in one or more accessions, while examination of de novo annotated genes identified 67% (50,700) of all ortholog groups as dispensable – estimates comparable to recent studies in rice, maize and soybean. Rapidly evolving gene families typically associated with biotic interactions and stress response were found to be enriched in the accession-specific gene pool. The nucleotide-binding site leucine-rich repeat (NBS-LRR) family, in particular, harbors the highest level of nucleotide diversity, large-effect single nucleotide changes, protein diversity, and presence/absence variation. However, the leucine-rich repeat (LRR) and heat shock gene families are disproportionately affected by large-effect single nucleotide changes and even higher levels of copy number variation.
Conclusions: Analysis of multiple M. truncatula genomes illustrates the value of de novo assemblies to discover and describe structural variation, something that is often underestimated when using read-mapping approaches. Comparisons among the de novo assemblies also indicate that different large gene families differ in the architecture of their structural variation.
Details
- Exploring Structural Variation and Gene Family Architecture With De Novo Assemblies of 15 Medicago Genomes
- Zhou, Peng (Author)
- Silverstein, Kevin A. T. (Author)
- Ramaraj, Thiruvarangan (Author)
- Guhlin, Joseph (Author)
- Denny, Roxanne (Author)
- Liu, Junqi (Author)
- Farmer, Andrew D. (Author)
- Steele, Kelly (Author)
- Stupar, Robert M. (Author)
- Miller, Jason R. (Author)
- Tiffin, Peter (Author)
- Mudge, Joann (Author)
- Young, Nevin D. (Author)
- New College of Interdisciplinary Arts and Sciences (Contributor)
- Digital object identifier: 10.1186/s12864-017-3654-1
- International standard serial number: 1471-2164
- The electronic version of this article is the complete one and can be found online at: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-017-3654-1
Citation and reuse
Zhou, P., Silverstein, K. A., Ramaraj, T., Guhlin, J., Denny, R., Liu, J., . . . Young, N. D. (2017). Exploring structural variation and gene family architecture with de novo assemblies of 15 Medicago genomes. BMC Genomics, 18(1). doi:10.1186/s12864-017-3654-1