ABSTRACT

Neural networks are adaptive systems with ‘automatic’ learning properties: they adapt their internal parameters to satisfy constraints imposed by a training algorithm and by the input and output training data. To extract the maximum potential from the training algorithms, very careful consideration must be given to the form and characteristics of the data presented to the network at the input and output stages. In this chapter we discuss the requirements for data preparation and data representation. We consider the issue of feature extraction from the data sample to enhance the information content of the data used for training, and give examples of data preprocessing techniques. We consider the issue of data separability and discuss the mechanisms by which neural networks can partition and categorize data. We compare and contrast the different means by which real-world variables can be represented at the input and output of neural networks, looking in detail at the properties of local and distributed schemes and of discrete and continuous methods. Finally, we consider the representation of more complex or abstract properties such as time and symbolic information. The objective of this chapter is to highlight the fundamental role that data preparation plays in developing successful neural network systems, and to provide developers with the methods and understanding needed to approach this task.