ABSTRACT

Natural language processing (NLP) is a branch of artificial intelligence concerned with computational techniques for analyzing and synthesizing the natural language that humans write and speak. The human brain recognizes natural language almost instantly, but a machine can work with it only after the raw input has been processed. NLP makes it possible to build computer applications that handle languages with varying structures. The process takes unstructured data (human language) as input and converts it into a machine-readable representation. To do so, it must take into account the attributes and the grammatical structure of the natural language. NLP performs syntactic and semantic analysis to identify patterns hidden in natural language. Syntactic analysis, also called parsing, is the phase that checks the input against the formal grammar rules of the underlying language and recovers its grammatical structure; determining what a statement means is the task of semantic analysis. The main aim of parsing is to ensure that the sequence of symbols in a natural-language string complies with the grammar rules. The Graph of Words model can be used as an alternative to the Bag of Words model for text processing, one that retains information Bag of Words discards. Whereas Bag of Words treats words as independent of one another, a graph representation of the input documents captures dependence between words by taking into account word order and the distance between related words.
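
The contrast between the two models can be illustrated with a minimal sketch. It assumes whitespace tokenization and a directed sliding-window co-occurrence graph, which is one common way to build a graph of words and not necessarily the construction used in this work. The function names and the `window` parameter are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def bag_of_words(tokens):
    """Count each token; word order and distance are discarded."""
    return Counter(tokens)


def graph_of_words(tokens, window=2):
    """Build a directed co-occurrence graph (illustrative assumption).

    An edge (u, v) is added whenever v follows u within `window` tokens,
    counting u itself; the edge weight is the number of such co-occurrences.
    """
    edges = defaultdict(int)
    for i, u in enumerate(tokens):
        # Link u to the next (window - 1) tokens that follow it.
        for j in range(i + 1, min(i + window, len(tokens))):
            v = tokens[j]
            if u != v:
                edges[(u, v)] += 1
    return dict(edges)


# Two sentences with identical words but different order.
doc_a = "the dog bit the man".split()
doc_b = "the man bit the dog".split()

same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)
same_graph = graph_of_words(doc_a) == graph_of_words(doc_b)
```

Under these assumptions the two sentences have identical bags of words, but their graphs differ: only the first contains the edge from "dog" to "bit". Increasing `window` lets the graph also record relations between words that are further apart.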