ABSTRACT

The gap between our ability to generate next-generation sequencing (NGS) data and our ability to extract knowledge from these data is widening. Managing and processing the torrent of NGS data to gain a deep understanding of biological systems requires significant investment in computational infrastructure and analytical power. Gauging computing needs and building a system to meet them, however, poses serious challenges for small research groups and even for large research organizations. To meet this unprecedented challenge, the NGS field can borrow solutions from other “big data” fields such as high-energy particle physics, climatology, and social media. Biologists without much training in bioinformatics will still need expert help, but a good understanding of the various aspects of NGS data management and analysis will benefit them for years to come.