ABSTRACT

A dataset is a collection of values, usually either numbers or strings (text data). Values are organized in two ways: every value belongs to both a variable and an observation. A variable contains all values that measure the same underlying attribute across units, and an observation contains all values measured on the same unit across attributes. “Tidy” data is a standard way of mapping the meaning of a dataset to its structure. A dataset is messy or tidy depending on how its rows, columns and tables match up with observations, variables and types. Data can be neatly organized in a rectangular, spreadsheet-type format and still not follow the definition of “tidy” data. Converting such “wide” format data to “tidy” format often confuses new R users. Getting comfortable with the pivot_longer() function takes practice, ideally with many different datasets.
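
As a minimal sketch of the wide-to-tidy conversion described above, the R snippet below reshapes a small made-up table with pivot_longer() from the tidyr package. The data frame, its column names (country, 2019, 2020) and the output names year and cases are hypothetical, chosen only for illustration.

```r
library(tidyr)

# Hypothetical "wide" data: one row per country, one column per year.
# The year columns hold values of a single variable (cases), so the
# year itself is stored in the column names rather than in a column.
wide <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(12, 25),
  check.names = FALSE  # keep "2019"/"2020" as column names unchanged
)

# Gather the year columns into two new columns:
# "year" takes the old column names, "cases" takes the cell values.
tidy <- pivot_longer(
  wide,
  cols = c(`2019`, `2020`),
  names_to = "year",
  values_to = "cases"
)

print(tidy)
```

In the reshaped table each row is one country-year observation and each column is one variable. Note that year comes out as a character column here, because it was built from column names.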