ABSTRACT

One of the first things that many parents regularly do when their children are born is read stories to them. A compelling story based on data can likely be told in around ten-to-twenty pages. The best stories based on data tend to be multi-disciplinary. They take from whatever field they need to, but almost always draw on statistics, computer science, economics, and engineering (to name a few). The process of coming to understand the look and feel of a dataset is termed exploratory data analysis (EDA). There are different approximations of “plausibly measurable”. Hence, datasets are always the result of choices. Some countries have well-developed civil registration and vital statistics (CRVS), which collect data about every death. But many countries lack a CRVS, resulting in unrecorded deaths. The world of data is about considering possibilities and probabilities, and learning to make trade-offs between them.