ABSTRACT

One of the first tasks most users face when they receive a new data set is getting the data into the form they want. This may involve reading data sets in different formats, combining multiple data sets, summarizing the data, creating new variables from existing ones, and a range of additional tasks. As scholars have started leveraging less structured and more complex data, such as “big data” and text data, data management and manipulation have become even more important. InfoWorld identified this as the 80/20 dilemma, in which most data analysts spend 80% of their time on data management and manipulation and only 20% on actual analysis. This chapter covers the basics of managing data, from loading data into R to exporting it to other programs and reporting information about the data.