Parts-of-Speech Tagging in NLP: Utility, Types, and Some Popular POS Taggers

doi:10.1201/9780367808495-6

Chapter

ABSTRACT

Most basic models in Natural Language Processing (NLP) are based on the Bag of Words representation. The primary limitation of Bag of Words models is their inability to capture syntactic relations between words. Part-of-speech (POS) tags are useful for improving on this technique. POS tagging is the process of assigning each word in a corpus a part-of-speech tag (noun, verb, adverb, adjective, pronoun, conjunction, or one of their sub-categories) based on its context and definition. POS tags are useful for building parse trees, which in turn are used to build Named Entity Recognition (NER) systems and to extract relations between words. POS tagging also has a major application in building lemmatizers, which reduce a word to its root form. It is an important step in understanding the meaning of a sentence, extracting relationships, and building a knowledge graph. This chapter introduces the POS tagging problem, its applications, and its types, with special emphasis on Markov model based POS tagging, and closes with Python implementations of some popular POS taggers.
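As an illustration of the Markov model based tagging the abstract mentions, the following is a minimal sketch of a bigram hidden Markov model tagger decoded with the Viterbi algorithm. It is not taken from the chapter: the toy corpus, the tag names, the add-one smoothing, and the function names `train` and `viterbi` are all assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-tagged toy corpus (an assumption, not data from the chapter).
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
    [("they", "PRON"), ("run", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("a", "DET"), ("run", "NOUN"), ("helps", "VERB")],
    [("they", "PRON"), ("bark", "VERB")],
    [("we", "PRON"), ("run", "VERB")],
]

START = "<s>"  # pseudo-tag marking the start of a sentence


def train(corpus):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    vocab = set()
    for sentence in corpus:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), len(vocab)


def viterbi(words, trans, emit, tags, vocab_size):
    """Return the most probable tag sequence under the bigram HMM."""
    def log_trans(prev, tag):
        # Add-one smoothing over the tag set (an assumed, simple choice).
        total = sum(trans[prev].values())
        return math.log((trans[prev][tag] + 1) / (total + len(tags)))

    def log_emit(tag, word):
        # Add-one smoothing; the extra +1 reserves mass for unseen words.
        total = sum(emit[tag].values())
        return math.log((emit[tag][word] + 1) / (total + vocab_size + 1))

    if not words:
        return []

    # Each column maps tag -> (best log-probability, backpointer to previous tag).
    columns = [{t: (log_trans(START, t) + log_emit(t, words[0]), None) for t in tags}]
    for word in words[1:]:
        prev_col = columns[-1]
        col = {}
        for t in tags:
            score, back = max((prev_col[p][0] + log_trans(p, t), p) for p in tags)
            col[t] = (score + log_emit(t, word), back)
        columns.append(col)

    # Follow backpointers from the best final tag.
    path = [max(tags, key=lambda t: columns[-1][t][0])]
    for col in reversed(columns[1:]):
        path.append(col[path[-1]][1])
    path.reverse()
    return list(zip(words, path))


model = train(TOY_CORPUS)
print(viterbi(["the", "run"], *model))
print(viterbi(["they", "run"], *model))
```

On this toy data the sketch tags "run" as a noun after "the" and as a verb after "they", which is the kind of context dependence a Bag of Words model cannot express. A practical tagger would be trained on a large annotated corpus and would need a more careful treatment of unknown words.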