It is generally accepted that the reliability of a model depends strongly on the adequacy of the calibration procedure employed. Automatic calibration is now widely used because it is much faster and more convenient than manual calibration. An automatic calibration procedure consists of three elements: an objective function, an optimization algorithm, and calibration data. It has been a main focus of hydrologists seeking better-calibrated hydrologic models, and parameter estimation by automatic calibration has attracted considerable attention as a way to obtain more realistic parameter estimates and more reliable forecasts. This approach has been discussed extensively in the literature (Sorooshian and Dracup 1980; Sorooshian 1981; Sorooshian et al. 1982, 1983; Kuczera 1983; Gupta et al. 1998). The success of any calibration process has been shown to depend strongly on the characteristics (quantity and quality) of the data used. It has often been suggested that the calibration data should be as representative as possible of the various phenomena experienced by the catchment. Research has shown that the information content of the data is far more important than the amount of data used for model calibration (Kuczera 1982; Sorooshian et al. 1983; Gupta and Sorooshian 1985; Yapo et al. 1996). In this study, information content is taken to mean how well the selected calibration catchments represent all the catchments in a region.
The selection of appropriate calibration data was addressed in a recent study by Liu and Han (2010) for rainfall-runoff modelling. They proposed indices for selecting calibration data of adequate length and appropriate duration by examining the spectral properties of the data sequences (i.e., the distribution of energy in the frequency domain) before calibration. With the validation data determined beforehand, a similarity assumption was applied to find a set of calibration data relevant to the validation data: the more similar the calibration data are to the validation data, the better the model performance is expected to be.
The similarity between the validation and calibration data was examined using the flow-duration curve, the Fourier transform, and wavelet analysis. Indices such as an information cost function and an entropy-like function were used to evaluate the results of the three methods. Liu and Han (2010) found that the information content of the calibration data was more important than the data length: a shorter data series may provide more useful information than a longer one.
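The similarity idea described above can be illustrated with a minimal sketch. It is not the set of indices defined by Liu and Han (2010); it only assumes that each candidate calibration series has the same length as the validation series, and that similarity can be approximated by comparing normalised energy distributions in the frequency domain. The function names (`flow_duration_curve`, `energy_distribution`, `spectral_entropy`, `spectral_distance`) and the synthetic series are hypothetical, and wavelet analysis is omitted.

```python
import cmath
import math


def flow_duration_curve(flows, exceedance_probs):
    """Return the flow equalled or exceeded at each probability (0 < p < 1)."""
    ordered = sorted(flows, reverse=True)  # largest flow first
    n = len(ordered)
    curve = []
    for p in exceedance_probs:
        # Weibull plotting position: rank m has exceedance probability m / (n + 1)
        rank = min(max(int(round(p * (n + 1))), 1), n)
        curve.append(ordered[rank - 1])
    return curve


def energy_distribution(series):
    """Normalised power spectrum of a series (naive DFT, mean removed)."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    power = []
    for k in range(1, n // 2 + 1):  # positive frequencies only, zero frequency skipped
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    return [p / total for p in power] if total > 0 else power


def spectral_entropy(distribution):
    """Shannon entropy of a normalised energy distribution (an entropy-like index)."""
    return -sum(p * math.log(p) for p in distribution if p > 0)


def spectral_distance(calibration, validation):
    """Sum of absolute differences between two energy distributions.

    Assumes both series have the same length, so the frequency bins line up.
    """
    cal = energy_distribution(calibration)
    val = energy_distribution(validation)
    return sum(abs(a - b) for a, b in zip(cal, val))


# Synthetic monthly-like series (illustrative only, not real catchment data).
validation = [10 + 5 * math.sin(2 * math.pi * t / 12) for t in range(48)]
candidate_a = [12 + 4 * math.sin(2 * math.pi * t / 12 + 0.5) for t in range(48)]  # same period
candidate_b = [10 + 5 * math.sin(2 * math.pi * t / 6) for t in range(48)]  # different period

for name, candidate in (("A", candidate_a), ("B", candidate_b)):
    print(name, round(spectral_distance(candidate, validation), 3))
```

Under these assumptions, candidate A shares the validation series' dominant period and so yields the smaller distance, even though its mean, amplitude, and phase differ. In practice a fast Fourier transform routine would replace the naive transform, which is used here only to keep the sketch dependency-free.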