Using the random forest method for classification and regression in hydrology

doi:10.1201/b18180-35

ABSTRACT

Random forest, an algorithm developed by Breiman and Cutler in 2001, has proven to be an effective data-driven model in many fields. One advantage of random forest is that it runs fast, which makes it a good choice for datasets with a large number of observations or variables. In this paper, two cases are used to show how random forest can be applied to hydrological prediction. The first case is a classification experiment on January precipitation in the Middle and Lower Yangtze River, and the second is a regression experiment on dry-season runoff of the Yangtze River. The results show that random forest has clear advantages in selecting predictors when the dimension of the data is very large, and that, given the satisfactory simulation and prediction results, it merits further research and application in hydrology.
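To make the regression and predictor-selection ideas concrete, the following is a minimal pure-Python sketch of a random forest regressor: bootstrap samples, shallow trees, and a random subset of candidate predictors at each split. It is an illustration under stated assumptions, not the implementation used in the paper. The data are synthetic (six hypothetical predictors, of which only the first and fourth influence the response), and all names (`fit_forest`, `n_try`, `importance`) are invented for this example. Predictor importance is approximated here by the total reduction in squared error attributed to each predictor, which is one common choice among several.

```python
import random

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, n_try, depth, gains):
    """Grow one regression tree; each split tries n_try random predictors."""
    if depth == 0 or len(y) < 4:
        return mean(y)  # leaf: predict the mean response
    best = None
    for j in rng.sample(range(len(X[0])), n_try):
        # Candidate thresholds: every distinct value except the smallest,
        # so both sides of the split are non-empty.
        for t in sorted({row[j] for row in X})[1:]:
            left = [i for i, row in enumerate(X) if row[j] < t]
            right = [i for i, row in enumerate(X) if row[j] >= t]
            gain = sse(y) - sse([y[i] for i in left]) - sse([y[i] for i in right])
            if best is None or gain > best[0]:
                best = (gain, j, t, left, right)
    if best is None or best[0] <= 0:
        return mean(y)
    gain, j, t, left, right = best
    gains[j] += gain  # credit the error reduction to predictor j
    return (
        j,
        t,
        build_tree([X[i] for i in left], [y[i] for i in left], rng, n_try, depth - 1, gains),
        build_tree([X[i] for i in right], [y[i] for i in right], rng, n_try, depth - 1, gains),
    )

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] < t else right
    return node

def fit_forest(X, y, n_trees=30, n_try=2, depth=3, seed=0):
    """Fit trees on bootstrap samples; return trees and normalized importances."""
    rng = random.Random(seed)
    gains = [0.0] * len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]  # sample with replacement
        trees.append(
            build_tree([X[i] for i in idx], [y[i] for i in idx], rng, n_try, depth, gains)
        )
    total = sum(gains) or 1.0
    return trees, [g / total for g in gains]

def predict(trees, row):
    """Forest prediction: average of the individual tree predictions."""
    return mean([predict_tree(tree, row) for tree in trees])

# Synthetic stand-in data (assumption): 120 samples, 6 candidate predictors.
data_rng = random.Random(1)
X = [[data_rng.uniform(0, 1) for _ in range(6)] for _ in range(120)]
y = [10 * row[0] + 2 * row[3] + data_rng.gauss(0, 0.5) for row in X]

trees, importance = fit_forest(X, y)
ranking = sorted(range(6), key=lambda j: importance[j], reverse=True)
print("predictors ranked by importance:", ranking)
print("prediction for a new sample:", predict(trees, [0.9, 0.5, 0.5, 0.5, 0.5, 0.5]))
```

In practice one would normally rely on an established library implementation rather than this sketch; a classification forest follows the same pattern, with a class-impurity measure in place of squared error and a majority vote in place of the average.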