ABSTRACT

Rather than a general introduction to non-financial data processing, this chapter describes methods that are driven by financial applications. The first step, as in any quantitative study, is to make sure the data is trustworthy, i.e., comes from a reliable provider. The second step is to look at summary statistics: ranges (minimum and maximum values), averages, and medians. Histograms and time-series plots carry more information but cannot be analyzed properly in high dimensions. Two variables, market capitalization and volatility, have a close-to-monotonic impact on future returns. Returns, on average, decrease with market capitalization. The reverse pattern holds for volatility but is less pronounced: the curve is rather flat for the first half of volatility scores and then increases progressively, especially over the last quintile of volatility values.
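
As a rough illustration of the second step and of the return patterns described above, the following Python sketch computes summary statistics for one variable and the average future return within buckets of a feature score. The function names, the five-bucket default, and the toy numbers are assumptions made for this example only; they are not the chapter's data or code.

```python
from statistics import mean, median


def summary_stats(values):
    """Return the range (min, max), average and median of one variable."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


def mean_return_by_bucket(scores, future_returns, n_buckets=5):
    """Average future return within each equal-count bucket of a score.

    Observations are sorted by score and cut into n_buckets groups of
    (nearly) equal size; with n_buckets=5 the groups are quintiles.
    Assumes at least n_buckets observations, so no bucket is empty.
    """
    pairs = sorted(zip(scores, future_returns), key=lambda p: p[0])
    n = len(pairs)
    averages = []
    for b in range(n_buckets):
        start = b * n // n_buckets
        stop = (b + 1) * n // n_buckets
        averages.append(mean([r for _, r in pairs[start:stop]]))
    return averages


# Hypothetical toy data: ten score/return pairs, not taken from the chapter.
toy_scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
toy_returns = [0.030, 0.028, 0.021, 0.019, 0.015, 0.014, 0.011, 0.010, 0.006, 0.004]

print(summary_stats(toy_returns))
print(mean_return_by_bucket(toy_scores, toy_returns))
```

Plotting the bucket averages against the bucket index gives the kind of curve discussed above: a near-monotonic curve indicates that the score orders future returns, while a flat segment indicates that the score carries little information over that range.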