ABSTRACT

According to the National Highway Traffic Safety Administration (NHTSA), human error is the critical cause of more than 90 percent of motor vehicle crashes. Several risky driving behaviors, such as drunk and distracted driving, have been identified as main contributors to the high cost of traffic crashes. Numerous studies have predicted the real-time likelihood of crash occurrence on a given freeway segment over a short period, but without considering the driver’s personalized safety factors involved in crashes. The objective of this paper is to address this gap by developing a novel approach that formulates the real-time traffic safety risk of individual drivers as the likelihood of crash and near-crash events, and by creating data-driven frameworks to predict drivers’ individualized safety risks. To evaluate the proposed framework, we used the 100-Car Naturalistic Driving Study (NDS) dataset. We developed an ensemble of Breiman’s random forest and a newly proposed Multivariate Time Series Random Forest to classify driving events into crash and near-crash classes based on a set of safety factors. Replicated k-fold cross-validation was employed to evaluate the models’ performance. The results of this study provide useful insight into the human factors contributing to crash and near-crash events and can help researchers and transportation agencies better understand driver errors and other human-related contributing factors in crashes, which in turn supports the development of effective strategies to mitigate crash injury severity. Moreover, this paper provides valuable information for car insurance companies developing behavior-based auto insurance applications.