CONTENTS

11.1 Introduction
    11.1.1 Computational Topics
11.2 Acquiring the Data
    11.2.1 Extracting Latitude and Longitude from a CSV File
11.3 Integrating Data from Different Sources
11.4 Preparing the Data for Plotting
    11.4.1 Redoing the Merge of the Factbook and Location Data
11.5 Plotting with Google Earth™
11.6 Extracting Demographic Information from the CIA XML File
11.7 Generating KML Directly
11.8 Additional Computational Tasks
    11.8.1 Creating Plotting Symbols
    11.8.2 Efficiency in Generating KML from Strings
    11.8.3 Extracting Latitude and Longitude from an HTML File
11.9 Exercises
Bibliography

11.1 Introduction

The tremendous increase in data that are freely available on the Web has created numerous possibilities for extracting data from different sources, putting them together, and creating exciting new types of visualizations. These visualizations, sometimes called “mashups,” are typically interactive and displayed on the Web. According to Wikipedia [13],

The term [mashup] implies easy, fast integration, frequently using open application programming interfaces (API) and data sources to produce enriched results that were not necessarily the original reason for producing the raw source data. ... The main characteristics of a mashup are combination, visualization, and aggregation.