ABSTRACT

For traffic flow forecasting, this chapter proposes a method that runs a prediction on images containing moving vehicles. It studies the characteristics and the performance of Convolutional Neural Networks (CNN) to explore how it may utilise a distributed environment. The chapter introduces existing machine learning and deep learning models, as well as illustrates some preliminary results by comparing the performance of CNN with that of traditional machine learning methods. It describes the proposed solution: Distributed CNN running on Apache Spark cluster and illustrates the performance evaluation. In 2009, a class project was proposed in University of California, Berkeley, which was about building a cluster management framework that could support different types of cluster computing systems. This project was called Mesos, and it is the basic element of Apache Spark. Random Forest can be considered an advanced version of the decision tree, so sometimes people also called it a random decision tree.