ABSTRACT

This chapter introduces data analysts to the challenges and techniques associated with using unstructured data in the healthcare domain, and introduces healthcare providers and medical informaticists to analytic advances that may help them derive more value from existing data sources. It provides examples of applications to common healthcare questions, with particular emphasis on how these communities can work together to develop techniques that are not only analytically robust but also relevant to the clinical goals for a given patient or population. The main objective of text analysis is to extract features of potential interest from documents and organize them as structured data that can serve as an informative representation of those documents. Text mining focuses on revealing the key structure and topics of the analyzed documents. Advanced text analysis systems combine three main types of analysis: linguistic, semantic, and statistical.
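
As a rough illustration of what organizing document features as structured data can look like, the following minimal Python sketch counts a few terms in free-text notes and emits one row per document. The notes, the feature list, and the function names (`extract_features`, `build_table`) are all hypothetical and invented for this example; this is only the simplest statistical step, not the chapter's method.

```python
import re

# Hypothetical free-text notes, invented for illustration (not real patient data).
NOTES = {
    "note_1": "Patient reports chest pain. Denies fever. Chest pain worse on exertion.",
    "note_2": "No fever. Mild cough and fatigue reported.",
}

# Assumed list of features of interest (single words or short phrases).
FEATURES = ["chest pain", "fever", "cough", "fatigue"]


def extract_features(text, features):
    """Count case-insensitive, whole-phrase occurrences of each feature."""
    lowered = text.lower()
    counts = {}
    for feature in features:
        pattern = r"\b" + re.escape(feature) + r"\b"
        counts[feature] = len(re.findall(pattern, lowered))
    return counts


def build_table(notes, features):
    """Return one structured row (a dict) per document."""
    rows = []
    for doc_id, text in notes.items():
        row = {"doc_id": doc_id}
        row.update(extract_features(text, features))
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in build_table(NOTES, FEATURES):
        print(row)
```

Note that simple counting treats "Denies fever" and "No fever" as mentions of fever. Handling negation and context of this kind is one reason purely statistical counts are usually combined with linguistic and semantic analysis.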