ABSTRACT

Data analysis begins by reading data from a computer file into an analysis system such as R. With a single function call of the same form regardless of file type, the lessR function Read() reads data from various formats, such as Excel files, text files, or files from commercial systems, into a standard R data table called a data frame. Data analysis involves two types of variables, continuous and categorical. Examples of continuous variables include a person's age, a car's fuel mileage, and the mean number of hours until failure for a type of light bulb. Examples of categorical variables include gender, the manufacturer of a person's phone, and the state or province in which a person resides. When read into an analysis system, the data values for continuous variables are stored as integers or as numbers with decimal digits. The data values for categorical variables are stored as integers or character strings. The lessR enhancement of R also provides for variable labels, which make data visualizations and the text output of data analysis functions more interpretable. Data can be written out of R in various formats as easily as it is read in, so information is never locked within the R system.
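
As a minimal sketch of the ideas summarized above, the following R code reads a small text file into a data frame, shows how continuous and categorical variables are stored, and writes the data back out. The file contents and variable names are hypothetical, invented only for illustration. The sketch uses the base R functions read.csv() and write.csv() so that it runs without additional packages; with lessR loaded, the reading step would instead be a call to Read(), as noted in the comments.

```r
# Hypothetical toy data, written to a temporary CSV file for illustration
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Age,MPG,Gender,State",
  "34,27.5,Female,OR",
  "51,31.2,Male,WA",
  "29,24.8,Female,OR"
), path)

# Read the file into a data frame with base R.
# Assumption: with lessR loaded, the analogous call would be d <- Read(path)
d <- read.csv(path, stringsAsFactors = FALSE)

# Show how each variable is stored:
# continuous variables as integer or numeric, categorical here as character
storage <- sapply(d, class)
print(storage)

# Write the data frame back out to a new text file,
# then read it again to confirm nothing was lost
out <- tempfile(fileext = ".csv")
write.csv(d, out, row.names = FALSE)
d2 <- read.csv(out, stringsAsFactors = FALSE)
```

Here Age and MPG are continuous variables, stored as integers and as numbers with decimal digits, while Gender and State are categorical variables stored as character strings.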