ABSTRACT

DNA sequencing is a method to determine the exact sequence of nucleotides in a sample of DNA. The most popular method for DNA sequencing is the dideoxy method, or Sanger method, named after its inventor, Frederick Sanger, who shared the 1980 Nobel Prize in Chemistry for this achievement. The dideoxy method takes its name from the synthetic nucleotides at its core, which lack the -OH group at the 3’ carbon atom of the deoxyribose sugar moiety. When a dideoxynucleotide (ddNTP) is added to the growing DNA strand, chain elongation stops because there is no 3’ -OH to which the next nucleotide can be joined. For this reason, the dideoxy method is also known as the chain termination method. In the presence of all four normal nucleotides, chain elongation proceeds normally until the DNA polymerase incorporates a dideoxynucleotide in place of the normal deoxynucleotide. Because the ratio of normal nucleotides to their dideoxy versions is kept high in the sequencing reaction, termination occurs at different positions on different strands, and many strands incorporate a dideoxynucleotide only after several normal nucleotides have been added. After the sequencing reaction, the fragments are separated by length using a high-resolution separation method that resolves two fragments differing in length by even a single nucleotide. The original Sanger method relied on radioactive labeling and required four reactions, one for each dideoxynucleotide, carried out in four different tubes and separated in four different lanes. The use of fluorescent dyes, however, allows a single-tube reaction and a single-lane separation, enabling automation of the whole process. A standard DNA sequencing reaction yields a read of about 500 nucleotides at a time, which is negligible compared with large eukaryotic genomes that are millions or even billions of nucleotides long. Sequencing the whole genome of an organism therefore requires different strategies, even though they rely on the same basic DNA sequencing technology. The objective of this chapter is to provide an outline of whole genome sequencing, emphasizing new-generation technologies and the computational facilities they require.
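
The chain termination principle described above can be illustrated with a small simulation. The following Python sketch is a toy model and not a description of any real instrument or software: the function names, the per-position termination probability `p_terminate`, the number of molecules and the example template are all assumptions chosen for illustration. Each simulated strand is extended along the template until a ddNTP happens to be incorporated; the terminated fragments are then ordered by length, and the labeled terminal base of each length class is read off to recover the sequence of the newly synthesized strand.

```python
import random

# Watson-Crick pairing used to build the new strand from the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_fragments(template, n_molecules=5000, p_terminate=0.1, seed=0):
    """Simulate chain-terminated extension products (toy model).

    `template` is given in the order in which the polymerase copies it.
    At every position a ddNTP is incorporated with probability
    `p_terminate` (an illustrative value); otherwise a normal dNTP is
    added and elongation continues.
    Returns a list of (fragment_length, terminal_base) pairs.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    fragments = []
    for _ in range(n_molecules):
        for position, template_base in enumerate(template, start=1):
            new_base = COMPLEMENT[template_base]
            if rng.random() < p_terminate:
                # ddNTP incorporated: no 3' -OH, so this strand stops here.
                fragments.append((position, new_base))
                break
        # Strands that never incorporate a ddNTP carry no label and are ignored.
    return fragments


def read_sequence(fragments):
    """Order fragments by length and read the terminal base of each length."""
    terminal_base_by_length = {}
    for length, base in fragments:
        terminal_base_by_length[length] = base
    return "".join(
        terminal_base_by_length[length]
        for length in sorted(terminal_base_by_length)
    )


if __name__ == "__main__":
    template = "TACGGATCCGTA"  # hypothetical 12-nucleotide template
    print(read_sequence(synthesize_fragments(template)))
```

Because termination is a matter of chance, the sketch needs many molecules so that every fragment length is represented at least once, which mirrors the reason the ratio of normal nucleotides to dideoxynucleotides is kept high in the real reaction.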