ABSTRACT

Developing intelligent transportation systems is a core requirement for smart cities, and transportation mode detection is a prerequisite for many transportation planning and scheduling tasks. With the spread of GPS technology in devices such as smartphones and the availability of open-source data, vast amounts of data are now accessible. This study aims to predict five transportation modes (bike, bus, car, train, and walking) from GPS-recorded points, spatial data, and contextual data. To our knowledge, no previous study has combined kinematic, spatial, and contextual feature groups while also addressing the interpretability of transportation modes by feature group and examining the important features of each separate classification model. In the proposed approach, after all three feature groups are extracted, a hybrid feature selection is applied to reduce complexity. Then, RF, GB, XGBoost, CatBoost, LightGBoost, SVM, and KNN classifiers are analyzed using F-score evaluation and the SHAP method. Finally, for the first time in transportation mode detection, supervised machine learning classification models are combined with ensemble learning techniques, including Stacking, MVE, and WAPVE, which avoids one-sided prediction and increases reliability. The results show that using spatial features along with kinematic features increases the prediction accuracy for all transportation modes. Moreover, using contextual and kinematic features together has the most significant impact on the car, bus, and bike modes. Employing a feature selection algorithm reduces training time in all classification models and increases the F-score of the KNN, SVM, and CatBoost models relative to using all the features.
Among the models, the stacking algorithm for the bike, train, and walking modes and the LightGBoost algorithm for the bus and car modes achieve higher F-scores than the other models. With a weighted-average F-score of 91.32% and a micro-average F-score of 91.42%, the stacking algorithm is the best classification model for predicting all the transportation modes.