ABSTRACT

This introduction presents an overview of the key concepts discussed in the subsequent chapters of this book. The book introduces spreadsheets, which most journalists agree are the fundamental tool for starting out in computer-assisted analysis. It offers tips on searching the web for datasets, using email discussion groups, and downloading data. The chapter also discusses “dirty data”: incomplete or incorrect databases that need to be “cleaned,” that is, completed or corrected. A wave of computer programmers and coders has joined with journalists to tackle the problems of capturing data from the web, cleaning and organizing it, and creating interactive presentations that can be shared with the public and that encourage citizen participation and analysis. Journalists have also found that if they leave the work to a government employee whose job is only to process data and do basic analysis, the results may be incomplete or may hide the nuances and potential pitfalls of the data.