Data Management

doi:10.1201/9781003010623-2

ABSTRACT

Tabular datasets are the quintessential form of storing information in the social sciences. Their power lies in the ability to record multiple dimensions of information for every observation of interest. For example, for every representative in Congress one can know their gender, age, percentage of attendance in the chamber, and number of bills presented. Tabulating one of the columns in the dataset, a common operation for categorical variables, is an easy task. This chapter looks at some basic dataset operations that, taken together, allow major edits to a dataset's structure and content. One of the most common operations is sorting a dataset by one variable, which can give clear hints about the observations. One can produce summaries of a dataset with the summarize function. Another common exercise in dataset management is generating variables (or editing existing ones) based on certain logical conditions.