ABSTRACT

Research on clinical and medical data has proliferated with the advent of Big Data technologies such as Apache Hadoop. While traditional analytics were carried out on stand-alone machines, Big Data requires parallel processing capabilities because of ever-increasing data volumes. These parallel processing tools facilitate the aggregation and analysis of many varieties of data, thereby supporting the health care industry and medical informatics. Big Data analytics in the health care domain enables many useful tasks, such as clinical outcome prediction, decision support for physicians, and disease surveillance, and thus enhances health care systems. This chapter introduces the potential of Big Data in health care and bioinformatics. We give an overview of how structured and unstructured data, in the form of electronic health records, patient reports, clinical images, genomic data, etc., are managed and analyzed. We also outline the general architecture and capabilities of Big Data in health care. The chapter then discusses the application of the MapReduce framework in the health care domain and presents case studies that use data mining techniques for various clinical predictions on different types of data. Finally, we present the security and privacy issues that have loomed large as opportunities for Big Data analytics in health care have grown.