ABSTRACT

As knowledge graphs (KGs) evolve and gain broad adoption in industry, automatic KG generation regardless of input format has become necessary. In this chapter, we discuss how to address this problem when creating a KG from text. The natural language processing (NLP) approach to capturing knowledge from natural language sources has gained traction over the years; it is applicable to general-domain text processing but has limitations in specific domains. Recent efforts in NLP development have shown that semantic deep neural networks can learn the complex syntax and semantics of natural language and thus offer more potential for automation, even in highly complex domains such as legal documents. This chapter provides an overview of existing research in the field of ontology learning as well as a methodology for automatic ontology population. Furthermore, NLP techniques are applied in a case study that uses NASA's heliophysics text corpus and the meeting notes from the Center for Helio-Analytics (CfHA) in the data extraction pipeline, resulting in a model that extracts named entities and relationships to automatically populate a CfHA ontology. The developed case study and the model are available online. 1