ABSTRACT

The most widely used text format for tabular data is undoubtedly comma-separatedvalues (CSV). Each row of the table is a line in the file; the values within each row—i.e., the columns—are separated by commas. Tragically, CSV doesn’t require the first row to be a header, and CSV files usually don’t specify units or data types. One can guess that the values in the table above are integers, but it’s all too common to have a CSV file whose columns are labelled “height” and “weight” without any indication of whether the heights are in feet or meters or the weights in pounds or kilograms. Trace the execution of the utility program that creates a small sample of the original data, explaining what is passed into each of the chained methods calls.