Extracting Structured Datasets from Textual Sources – Some Examples

doi:10.1201/9781003293644-8

ABSTRACT

We present examples of information extraction from textual sources such as news, company regulatory filings, and earnings call transcripts. For company filings, we refer to recent literature arguing that these documents contain unexploited information. We then describe three Brain datasets that provide several measures computed on various textual sources, each with well-defined timestamps, and that can serve as inputs to quantitative investment models.