Over the past few decades, Artificial Neural Networks (ANNs) have emerged as a powerful set of tools for pattern classification, time series analysis, signal processing, and dynamical system modeling and control. Their popularity can be attributed to the fact that these network models are frequently able to learn system behavior in cases where traditional modeling approaches are very difficult to apply or to generalize. Typically, a neural network consists of several computational nodes, called perceptrons, arranged in layers. The number of hidden nodes essentially determines the degrees of freedom of the non-parametric model. Too few hidden units may not be enough to capture a given system’s complex input-output mapping; conversely, too many hidden units may overfit the data and fail to generalize the behavior. It is also natural to ask, “How many hidden layers are required to model the input-output mapping?” A general answer to this question is provided by Kolmogorov’s theorem [13] (later modified by other researchers [14]), according to which any continuous function from an input subspace to an appropriate output subspace can be approximated by a two-layer neural network with a finite number of nodes (model centers).
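The trade-off in the number of hidden units can be illustrated with a minimal numerical sketch. It is not a method from this paper: the target function, the noise level, the tanh activation, and the names `hidden_features` and `hidden_unit_sweep` are all assumptions chosen for illustration. To keep the example short, the hidden-layer weights are drawn at random and only the output layer is fit by linear least squares, which is a simplification of ordinary network training.

```python
import numpy as np

def target(x):
    # Hypothetical continuous input-output mapping used only for illustration.
    return np.sin(2.0 * np.pi * x)

def hidden_features(x, a, b):
    # Hidden-layer outputs tanh(a_j * x + b_j), plus a constant bias column.
    return np.column_stack([np.tanh(np.outer(x, a) + b), np.ones_like(x)])

def rmse(y_pred, y_true):
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def hidden_unit_sweep(sizes=(2, 10, 40), n_train=40, noise=0.1, seed=0):
    """Return {n_hidden: (train_rmse, test_rmse)} for a two-layer network.

    Assumption: hidden weights are random and fixed; only the output
    weights are estimated (by least squares), not trained by backpropagation.
    """
    rng = np.random.default_rng(seed)
    x_train = rng.uniform(0.0, 1.0, n_train)
    y_train = target(x_train) + noise * rng.normal(size=n_train)
    x_test = np.linspace(0.0, 1.0, 200)
    y_test = target(x_test)  # noise-free reference for generalization error

    results = {}
    for n_hidden in sizes:
        a = rng.normal(scale=6.0, size=n_hidden)  # hidden-layer slopes
        b = rng.normal(scale=3.0, size=n_hidden)  # hidden-layer offsets
        H_train = hidden_features(x_train, a, b)
        w, *_ = np.linalg.lstsq(H_train, y_train, rcond=None)
        train_err = rmse(H_train @ w, y_train)
        test_err = rmse(hidden_features(x_test, a, b) @ w, y_test)
        results[n_hidden] = (train_err, test_err)
    return results

if __name__ == "__main__":
    for n_hidden, (train_err, test_err) in hidden_unit_sweep().items():
        print(f"hidden units = {n_hidden:3d}  "
              f"train RMSE = {train_err:.4f}  test RMSE = {test_err:.4f}")
```

Under these assumptions, the training error shrinks as hidden units are added, while the error against the noise-free reference need not shrink with it. A gap between the two is the usual symptom of overfitting described above.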