ABSTRACT

Natural-language processing dates back to the early 1950s. In a business setting, analysts would like users to be as specific as possible when describing the issues they are having. A single piece of text entered by a user contains two main kinds of terms: those that help convey what the user is saying, and those that merely join a sentence together. A common practice when working with text-based data is to create n-grams. In large collections of user comments, the same specific words are often used together. However, when hundreds or thousands of records are under analysis, the data quickly become difficult to use because irrelevant text can dominate the results. The sentiment of a piece of text is the categorization of a record as positive, negative, or neutral.
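As a minimal illustration of the n-gram idea mentioned above, the following Python sketch counts bigrams across a few user comments after dropping joining words. The stop-word list, the function names, and the sample comments are all hypothetical and chosen only for illustration; they are not taken from the work itself.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real analysis would use a fuller list.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "my", "i", "it", "when"}


def tokenize(text):
    """Lowercase the text, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,!?;:\"'()") for word in text.lower().split())
    return [t for t in tokens if t]


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(comments, n=2, stop_words=STOP_WORDS):
    """Count n-grams over all comments after removing stop words.

    Note: because stop words are removed first, an n-gram may join words
    that were separated by a stop word in the original comment.
    """
    counts = Counter()
    for comment in comments:
        tokens = [t for t in tokenize(comment) if t not in stop_words]
        counts.update(ngrams(tokens, n))
    return counts


# Made-up sample comments, for illustration only.
comments = [
    "The login page is slow.",
    "Login page crashes when I submit.",
]
bigram_counts = count_ngrams(comments, n=2)
print(bigram_counts.most_common(3))
```

In this toy input the pair ("login", "page") appears in both comments, which is the kind of overlap that n-gram counts are meant to surface. Removing stop words before forming n-grams is one design choice among several; forming n-grams first and filtering afterwards would preserve original adjacency.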