Sequence-Based Taxonomic Framework for the Classification of Uncultured Single-Stranded DNA Viruses of the Family Genomoviridae

With the advent of metagenomics approaches, a large diversity of known and unknown viruses has been identified in various types of environmental, plant, and animal samples. One such widespread virus group is the recently established family Genomoviridae, which includes viruses with small (∼2–2.4 kb), circular ssDNA genomes encoding rolling-circle replication initiation proteins (Rep) and unique capsid proteins. Here, we propose a sequence-based taxonomic framework for the classification of 121 new virus genomes within this family. Genomoviruses display ∼47% sequence diversity, which is very similar to that within the well-established and extensively studied family Geminiviridae (46% diversity). Based on our analysis, we establish 78% genome-wide pairwise identity as the species demarcation threshold. Furthermore, using a Rep sequence phylogeny-based analysis coupled with the current knowledge on the classification of geminiviruses, we establish nine genera within the family Genomoviridae: Gemycircularvirus (n = 73), Gemyduguivirus (n = 1), Gemygorvirus (n = 9), Gemykibivirus (n = 29), Gemykolovirus (n = 3), Gemykrogvirus (n = 3), Gemykroznavirus (n = 1), Gemytondvirus (n = 1), and Gemyvongvirus (n = 1). The presented taxonomic framework offers a rational classification of genomoviruses based on sequence information alone and sets an example for the future classification of other groups of uncultured viruses discovered using metagenomics approaches.
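
To make the 78% species demarcation threshold concrete, the following minimal Python sketch groups genomes into putative species from precomputed genome-wide pairwise identities. It is an illustration only, not the procedure used in the study: the genome labels and identity values are hypothetical, the function name `group_into_species` is invented for this example, and the single-linkage rule (two genomes are placed in the same species if they are connected by any chain of pairs at or above the threshold) is an assumption, since the abstract does not state how borderline cases are resolved.

```python
from itertools import combinations

# Species demarcation threshold stated in the abstract (78% pairwise identity).
SPECIES_THRESHOLD = 0.78


def group_into_species(identities, genomes, threshold=SPECIES_THRESHOLD):
    """Group genomes by single linkage (an assumption of this sketch).

    identities maps an unordered pair of genome labels, given as a tuple in
    either order, to a genome-wide pairwise identity between 0 and 1.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the group representative, shortening the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        identity = identities.get((a, b), identities.get((b, a)))
        if identity is not None and identity >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical genomes and identity values, for illustration only.
example_genomes = ["A", "B", "C", "D"]
example_identities = {
    ("A", "B"): 0.85,  # above threshold: same species
    ("A", "C"): 0.60,
    ("A", "D"): 0.55,
    ("B", "C"): 0.62,
    ("B", "D"): 0.57,
    ("C", "D"): 0.79,  # just above threshold: same species
}

species = group_into_species(example_identities, example_genomes)
print(species)
```

With these hypothetical values, A and B fall into one putative species and C and D into another. A stricter rule, for example requiring every pair within a group to meet the threshold, would be an equally plausible design choice and could give different groupings for genomes near the 78% boundary.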