ABSTRACT

Air pollution is a significant problem worldwide, especially in developing countries. The combination of cloud services and IoT technologies holds great promise for the development of low-cost air-pollution monitoring systems that help monitor and forecast air-pollution levels. Several cloud-based forecasting models have been proposed for air pollution. However, little research has evaluated individual or ensemble models for real-time forecasting of air pollution in the cloud. The primary objective of this research is to develop and test individual and ensemble multivariate time-series models for forecasting air pollution in the cloud. A PM2.5 dataset collected at the US Embassy in Beijing, China, was used for the analyses. The data consisted of one pollutant (PM2.5) and six meteorological parameters (dew point, temperature, atmospheric pressure, wind speed, snow, and rain). The first 80% of the data were used for model training, and the last 20% were used for model testing. Four individual multivariate models were trained: a statistical SARIMAX model, a neural multilayer perceptron (MLP) model, a spatial convolutional neural network (CNN) model, and a temporal long short-term memory (LSTM) model. A weighted ensemble model was then developed by assigning weights to the forecasts of the individual multivariate models. Among the individual models, the MLP performed best, followed by the CNN, LSTM, and SARIMAX models. The weighted ensemble model performed best overall on both the training and the test datasets. We highlight the potential of using weighted ensembles for air-pollution monitoring in the cloud.
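
As an illustration of the chronological 80/20 split and the weighted combination of individual forecasts described above, a minimal sketch follows. The abstract does not state how the ensemble weights were chosen, so the inverse-RMSE rule used here is an assumption made only for illustration, and all function names are hypothetical rather than taken from the study.

```python
import numpy as np

def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered array into a leading training part and a trailing test part."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def inverse_error_weights(actual, predictions):
    """ASSUMED weighting rule (not from the paper): weight each model by 1/RMSE,
    then normalize so the weights sum to 1."""
    errors = np.array([rmse(actual, p) for p in predictions])
    inverse = 1.0 / np.maximum(errors, 1e-12)  # guard against division by zero
    return inverse / inverse.sum()

def weighted_ensemble(predictions, weights):
    """Combine per-model forecasts (shape: models x time steps) into one forecast."""
    stacked = np.asarray(predictions, dtype=float)
    return np.average(stacked, axis=0, weights=weights)
```

In such a setup, the weights would be fitted on the training portion only and then applied unchanged to the forecasts for the test portion, so that the test data do not influence the ensemble.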