ABSTRACT

The first complete coronavirus genome, that of mouse hepatitis virus (MHV), was sequenced more than 50 years after the virus was isolated. Before the SARS epidemic in 2003, complete genome sequences were available for fewer than 10 coronaviruses. These included two human coronaviruses (HCoV-229E and HCoV-OC43), four other mammalian coronaviruses [MHV, bovine coronavirus (BCoV), transmissible gastroenteritis virus (TGEV), porcine epidemic diarrhea virus (PEDV)], and one avian coronavirus (IBV). The SARS epidemic, which originated in southern China in 2003, boosted interest in all areas of coronavirus research, most notably coronavirus biodiversity and genomics [5-7]. Between the SARS epidemic and April 2010, 15 novel coronaviruses were discovered and their complete genomes sequenced. Among these 15 previously unrecognized coronaviruses were two globally distributed human coronaviruses, human coronavirus NL63 (HCoV-NL63) and human coronavirus HKU1 (HCoV-HKU1) [8-10]; 10 other mammalian coronaviruses, SARS-related Rhinolophus bat coronavirus (SARSr-Rh-BatCoV), Rhinolophus bat coronavirus HKU2 (Rh-BatCoV HKU2), Tylonycteris bat coronavirus HKU4 (Ty-BatCoV HKU4), Pipistrellus bat coronavirus HKU5 (Pi-BatCoV HKU5), Miniopterus bat coronavirus HKU8 (Mi-BatCoV HKU8), Rousettus bat coronavirus HKU9 (Ro-BatCoV HKU9), Scotophilus bat coronavirus 512 (Sc-BatCoV 512), Miniopterus bat coronavirus 1A/B (Mi-BatCoV 1A/B), equine coronavirus (ECoV) and beluga whale coronavirus SW1 [3,6,11-15]; and three avian coronaviruses, bulbul coronavirus HKU11 (BuCoV HKU11), thrush coronavirus HKU12 (ThCoV HKU12) and munia coronavirus HKU13 (MunCoV HKU13) [2]. Most of these genomes were sequenced using RNA extracted directly from clinical specimens, such as nasopharyngeal aspirate or stool, as the template, while the viruses themselves were still non-cultivable [2,3,6,11-15]. This allowed a more accurate analysis of the in situ viral genomes, avoiding the mutational bias that can arise during in vitro viral replication. These sequencing efforts have markedly increased the number of available coronavirus genomes and have given us an unprecedented opportunity to understand this family of viruses at the genomic and in silico levels. This understanding has also led to further hypotheses and laboratory experiments. In this article, we review our current understanding of the genomics and bioinformatics analysis of coronaviruses. Details of the bioinformatics tools will not be discussed.