ABSTRACT

This quote from Ragnar Frisch’s Editor’s Note in the first issue of Econometrica in 1933 has never seemed more timely. The phrase “data rich, information poor” is often used to characterize the current state of our digitized world. Over recent decades, data storage and availability have been growing at an exponential rate, and data sets on the order of terabytes are now not uncommon. While a portion of this new data is in the form of numerical or categorical data in well-structured databases, the vast majority is in the form of unstructured textual data. These news stories, government reports, blog entries, e-mails, Web pages, and the like are the medium of information flow throughout the world. It is this unstructured data that most decision-makers turn to for information.