ABSTRACT

This chapter introduces a new statistical data mining method, the symmetrizing ranked data method, and adds it to the paradigm of simplicity and desirability for good model-building practice. The method applies two basic statistical tools, symmetrizing and ranking, to variables, yielding reexpressed variables that are likely to have improved predictive power. The chapter describes Stevens's scales of measurement and defines an approximate interval scale that is an offspring of the new method. It then reviews the simplest elements of exploratory data analysis (EDA): the stem-and-leaf display and the box-and-whiskers plot. The stem-and-leaf display is a graphical presentation of quantitative data that helps in visualizing the density and shape of a distribution. The box-and-whiskers plot provides a detailed visual summary of various features of a distribution. The chapter illustrates the proposed method with examples, which give the data miner a starting point for further applications of this useful statistical data mining tool.