ABSTRACT

The purpose of this project is to develop a model that predicts the box office performance of a theatrical movie. Because investing in this area of entertainment is very risky, such a tool can help determine how much to invest in order to make a profit. In addition, the pandemic of 2019-2022 is expected to change the means of distribution, so our model also helps determine the best direction of distribution. We develop a model that predicts first-weekend revenue using an ensemble method called "hard voting", in which each classifier casts one vote and the majority class is chosen. Our main tools are three classification algorithms: decision trees, Light Gradient-Boosting Machine (LightGBM), and k-nearest neighbors. Our model yields a balanced accuracy of 84.62%.
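
As an illustration of the two technical terms above, the following is a minimal sketch of hard voting and of balanced accuracy in plain Python. It is not the project's implementation: the function names `hard_vote` and `balanced_accuracy`, the revenue classes "low", "mid", and "high", and all predictions are hypothetical and chosen only to show the mechanics. In practice the three prediction lists would come from trained decision tree, LightGBM, and k-nearest neighbors classifiers.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine class labels from several models by majority vote."""
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        # Collect one vote per model for sample i.
        votes = [preds[i] for preds in predictions_per_model]
        # most_common sorts by count; on a tie, the label seen first wins.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def balanced_accuracy(y_true, y_pred):
    """Average of per-class recall, so rare classes count as much as common ones."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical per-model predictions for four movies (assumed revenue classes).
tree_preds = ["low", "high", "mid", "low"]
lgbm_preds = ["low", "mid", "mid", "high"]
knn_preds = ["mid", "high", "mid", "low"]

ensemble = hard_vote([tree_preds, lgbm_preds, knn_preds])

# Hypothetical ground truth, used only to demonstrate the metric.
y_true = ["low", "high", "mid", "high"]
score = balanced_accuracy(y_true, ensemble)
```

Balanced accuracy is a natural choice when the revenue classes are unevenly populated, since a model that always predicts the most common class would score well on plain accuracy but poorly on this metric.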