ABSTRACT

Detecting hateful content on social media is an important task. Distinguishing hate speech from normal text is challenging because there is no precise boundary between hate speech, offensive speech, and neutral speech. Social media text poses further challenges: typing errors, the presence of common words, changed word boundaries, hate speech embedded within offensive speech, offensive speech embedded within hate speech, and offensive or hateful words appearing in neutral speech. Most existing state-of-the-art methods for detecting hateful content in tweets are based on deep learning models. In this chapter, we concentrate on classifying microblog posts, i.e., tweets, as hate speech, offensive speech, or neutral speech. We propose a self-attention-based deep neural network algorithm and a CNN-LSTM-based classification algorithm that use n-gram features. The proposed model combines n-gram features with a CNN-LSTM architecture and attention, which helps the model learn to focus selectively on the important information. The model is evaluated on a benchmark dataset. Experimental results show that this approach can effectively classify a given tweet as hateful, offensive, or neutral. The experiments indicate an overall F1-measure of 0.99, higher than the state-of-the-art work, which reports an F1-measure of 0.90. The proposed method therefore achieves an overall improvement of 0.09 in F1-measure (9 percentage points), in addition to an F1 score of 0.96 on categorizing tweets as hateful or offensive.
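
As a small illustration of the n-gram features mentioned above, the following sketch counts word-level unigrams and bigrams in a single tweet. It is only an assumption about what such features might look like: the function names `word_ngrams` and `ngram_features`, the whitespace tokenization, and the example sentence are hypothetical and are not taken from the chapter's implementation.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word-level n-grams of `text` as a list of tuples.

    Tokenization is a plain lowercase whitespace split (an assumption;
    real tweet preprocessing is usually more involved).
    """
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=2):
    """Count every n-gram of length 1..max_n that occurs in `text`."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts


# Hypothetical example tweet: 5 unigrams and 4 bigrams are counted.
features = ngram_features("you are not welcome here", max_n=2)
```

In a setup like the one described, counts or indices of this kind would typically be mapped to vectors before being passed to the CNN-LSTM and attention layers; how the chapter does this is not specified here.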