ABSTRACT

Advances in technology have enabled a new kind of bullying, one that frequently results in public humiliation. Interventions must be regularly updated to help reduce the incidence of cyberbullying. Because cyberspace has expanded beyond the Internet in today's technology-mediated world, it is imperative that we investigate the growing problem of cyberbullying.

In this research, we construct Naive Bayes and Bi-LSTM models to effectively identify cyberbullying across a collection of datasets. A large dataset covering several forms of cyberbullying, including violence, attacks, racism, and sexism, was created by combining Kaggle data with data from Twitter, Wikipedia Talk pages, and YouTube. The dataset's text contains multiple instances of cyberbullying, including hate speech, hostility, insults, and toxic language.
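To make the first approach concrete, the sketch below shows how a multinomial Naive Bayes text classifier with Laplace smoothing works in principle. The toy messages, labels, and tokenizer are illustrative assumptions, not the paper's actual dataset or preprocessing pipeline.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Simplistic whitespace tokenizer; a real pipeline would clean and normalize more.
    return text.lower().split()

def train(samples):
    """samples: list of (text, label). Returns log-priors and smoothed log-likelihoods."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    total = sum(label_counts.values())
    priors = {c: math.log(n / total) for c, n in label_counts.items()}
    likelihoods = {}
    for c, counts in word_counts.items():
        denom = sum(counts.values()) + len(vocab)  # Laplace (add-one) smoothing
        likelihoods[c] = {w: math.log((counts[w] + 1) / denom) for w in vocab}
        likelihoods[c]["<unk>"] = math.log(1 / denom)  # unseen-word fallback
    return priors, likelihoods

def predict(text, priors, likelihoods):
    # Score each class by log-prior plus summed word log-likelihoods.
    scores = {}
    for c in priors:
        s = priors[c]
        for w in tokenize(text):
            s += likelihoods[c].get(w, likelihoods[c]["<unk>"])
        scores[c] = s
    return max(scores, key=scores.get)

# Hypothetical toy training data for illustration only.
samples = [
    ("you are stupid and worthless", "bullying"),
    ("i hate you go away", "bullying"),
    ("great game last night", "neutral"),
    ("see you at lunch tomorrow", "neutral"),
]
priors, likelihoods = train(samples)
print(predict("you are so stupid", priors, likelihoods))
```

In practice a library implementation (e.g. scikit-learn's `MultinomialNB` over TF-IDF features) would replace this hand-rolled version; the sketch only exposes the underlying probability model.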

Both models produced strong results. The Naive Bayes model achieved a precision of 98%, although its accuracy was the lowest at 86%. The PyTorch Bi-LSTM performed best, with an overall accuracy of 92%.
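The Bi-LSTM architecture mentioned above can be sketched in PyTorch as follows. The layer sizes, vocabulary size, and two-class output are assumptions for illustration; the paper's exact hyperparameters are not given in this abstract.

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    """Minimal bidirectional-LSTM text classifier sketch (assumed sizes, not the paper's)."""

    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(hidden_dim * 2, num_classes)  # forward + backward states

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)        # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)        # hidden: (2, batch, hidden_dim)
        # Concatenate the final forward and backward hidden states.
        combined = torch.cat([hidden[0], hidden[1]], dim=1)
        return self.fc(combined)                    # (batch, num_classes) logits

model = BiLSTMClassifier()
batch = torch.randint(1, 10000, (4, 20))  # 4 dummy sequences of 20 token ids
logits = model(batch)
print(logits.shape)  # torch.Size([4, 2])
```

Reading the sequence in both directions lets the final representation capture context on either side of each token, which is one reason Bi-LSTMs tend to outperform bag-of-words models like Naive Bayes on toxic-language classification.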