ABSTRACT

With enough training examples, an ML sentiment classifier can achieve improved performance. Even though an ML rule-based approach for SA can result in high accuracy (if the rules are carefully refined by an expert), building and maintaining rules is expensive; therefore, most typical SA algorithms are created using a particular type of ML algorithm. Some of the popular ML algorithms for SA include

∙ Naive Bayes (NB)
∙ Maximum Entropy (MaxEnt)
∙ Support Vector Machines (SVM)

When the amount of training data is small, MaxEnt and SVM tend to perform better than NB. However, these reported gains may dissipate as NB’s performance improves with very large amounts of training data.[2]
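
To make the first of these algorithms concrete, the following is a minimal sketch of a multinomial Naive Bayes sentiment classifier written with only the Python standard library. It is an illustration under stated assumptions, not the method of any particular SA engine: the whitespace tokenizer, the add-one (Laplace) smoothing, the four-sentence training set, and the names `tokenize`, `train_naive_bayes`, and `TRAINING_DATA` are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Train on (text, label) pairs and return a classifier function."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)

    total_docs = sum(class_counts.values())
    total_words = {c: sum(word_counts[c].values()) for c in class_counts}

    def classify(document):
        best_label, best_score = None, -math.inf
        for label in class_counts:
            # Start from the log prior of the class.
            score = math.log(class_counts[label] / total_docs)
            for word in tokenize(document):
                if word not in vocab:
                    continue  # skip words never seen during training
                # Add-one smoothed log likelihood of the word given the class.
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words[label] + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    return classify


# Toy training data, invented for illustration only.
TRAINING_DATA = [
    ("great movie loved it", "positive"),
    ("wonderful acting and a great story", "positive"),
    ("terrible plot and awful acting", "negative"),
    ("boring film hated it", "negative"),
]

classifier = train_naive_bayes(TRAINING_DATA)
print(classifier("loved the great story"))  # positive
print(classifier("awful boring plot"))      # negative
```

With so few examples the sketch only demonstrates the mechanics; any comparison against MaxEnt or SVM would require a realistic labeled corpus.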

Regardless of the supervised learning algorithm employed, developing an SA engine requires a training phase that produces a sentiment classifier, which is then incorporated into a sentiment engine that can generalize to new, unseen examples. The training process contributes significantly to the performance of any SA engine. Once the sentiment classifier is created, it can be employed in numerous SA applications. Applying the sentiment classifier requires

∙ A document d
∙ A learned classifier f: d → c

At a minimum, the SA application is composed of a feature extractor (i.e., the result of the NLP task) and a sentiment classifier (i.e., the result of the ML task). The sentiment engine can be deployed in numerous formats, including stand-alone software, a web service, or a compiled binary library. During operation, the sentiment engine receives as input one or more text documents, which are translated into word vectors and processed using the sentiment classifier to determine the sentiment class. This process is illustrated in Fig. 2.
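
The flow described above (documents in, word vectors in the middle, sentiment classes out) can be sketched as follows. This is a minimal illustration, not the engine shown in Fig. 2: the `SentimentEngine` class, the four-word `VOCABULARY`, and the hand-set `WEIGHTS` are hypothetical placeholders, and the linear scoring rule simply stands in for whatever classifier the training phase actually produced.

```python
from collections import Counter

# Hypothetical vocabulary and hand-set weights, standing in for a
# classifier that would normally be learned during training.
VOCABULARY = ["good", "great", "bad", "awful"]
WEIGHTS = [1.0, 1.5, -1.0, -1.5]
BIAS = 0.0


def extract_features(document):
    """Feature extractor: map a text document to a word-count vector."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in VOCABULARY]


def classify(vector):
    """Sentiment classifier: map a word vector to a sentiment class."""
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, vector))
    return "positive" if score >= 0 else "negative"


class SentimentEngine:
    """Combines a feature extractor and a classifier behind one interface."""

    def __init__(self, feature_extractor, classifier):
        self.feature_extractor = feature_extractor
        self.classifier = classifier

    def analyze(self, documents):
        """Accept one or more documents; return one class per document."""
        if isinstance(documents, str):
            documents = [documents]
        return [self.classifier(self.feature_extractor(d)) for d in documents]


engine = SentimentEngine(extract_features, classify)
print(engine.analyze(["a great film", "bad script and awful pacing"]))
# ['positive', 'negative']
```

Keeping the extractor and the classifier as separate, swappable components mirrors the split between the NLP task and the ML task, and the same `analyze` method could sit behind a command-line tool, a web service, or a library call.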