Nonlinear models with many parameters can be extremely flexible. This gives them the potential to overfit the time series: flexible models can extract features that are genuine to the process, but they can also fit noise features that are idiosyncrasies of the training data. In statistical jargon, this is the case of “low bias” of the model (since the model can fit almost anything, it is not biased much) and “high variance” of the parameters (since there are so many parameters, they are estimated with large errors). This bias-variance dilemma is central to weak modeling, i.e., the case when the modeler does not know first principles that might suggest strong equations. A clear discussion in the context of neural networks is given by Geman, Bienenstock, and Doursat (1992).
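The bias-variance dilemma can be made concrete with a small numerical sketch. The example below is purely illustrative (the smooth "true" process, the noise level, and the polynomial degrees are all hypothetical choices, not taken from the text): a flexible high-degree polynomial achieves a smaller training error than a stiff low-degree one, but its error on a fresh realization of the same process degrades more, because part of what it fitted was noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a smooth "true" process observed with additive noise.
t = np.linspace(0.0, 1.0, 40)
truth = np.sin(2 * np.pi * t)
train = truth + 0.3 * rng.standard_normal(t.size)  # training series
test = truth + 0.3 * rng.standard_normal(t.size)   # fresh noise, same process

def errors(degree):
    """Fit a polynomial of the given degree to the training series and
    return (training RMSE, RMSE on the fresh test realization)."""
    p = np.polynomial.Polynomial.fit(t, train, degree)
    fit = p(t)
    rmse = lambda y: float(np.sqrt(np.mean((fit - y) ** 2)))
    return rmse(train), rmse(test)

for deg in (3, 15):
    tr, te = errors(deg)
    print(f"degree {deg:2d}: train RMSE {tr:.3f}, test RMSE {te:.3f}")
```

The stiff degree-3 fit is "biased" (it cannot represent every wiggle, so its training error stays moderate) but its performance carries over to the new realization; the degree-15 fit drives the training error down by chasing noise, which is exactly the "high variance" half of the dilemma.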