ABSTRACT

Determination of nucleotide sequences present in biological samples (termed “sequencing”) has become a key method in almost all fields of bioscience, including virology. Since the advent of high-throughput sequencing (“second-generation sequencing”), it has been possible to sequence millions of DNA fragments (“reads”) in parallel at very high accuracy, enabling the inference of single nucleotide polymorphisms (SNPs) between virus strains.

In this chapter, we describe how the long-read sequencing technologies developed in recent years (“third-generation sequencing”) have expanded the researcher's toolkit beyond the possibilities of short-read sequencing, with a focus on virus sequencing. With increased read lengths, it is possible to sequence full viral transcripts and genomes in single contiguous reads, enabling detailed studies of transcript isoforms, haplotypes, and viral quasispecies. Compared with short-read sequencing, long-read technologies generally have higher raw read error rates, but the long contiguous sequences either facilitate accurate assembly of transcripts and genomes or make assembly unnecessary. One of these technologies, nanopore sequencing, also uniquely allows direct RNA sequencing without the need to create or amplify complementary DNA (cDNA). This enables accurate capture of the RNA content of a sample “as is,” e.g., in cells infected by RNA viruses. The protocol also leaves RNA modifications intact, allowing them to be inferred during sequencing. Alternatively, nanopore sequencing can be implemented at low cost and with constant genome coverage using cDNA amplicon sequencing methods, e.g., for highly parallel screening during virus outbreaks.

Friedrich Schiller University Jena