ABSTRACT
Publicly available large language models (LLMs) such as GPT, Gemini, and Claude are powerful, yet they struggle to represent knowledge from a given textual context without also carrying over biases from their extensive pre-training data. This creates problems for tasks that require accurate, context-specific question answering. We address these limitations with an approach that minimizes the likelihood of irrelevant and inconsistent responses by extracting information strictly from the user's text corpus. Using the Mistral 7B Zephyr model and LangChain, we extract meaningful embeddings locally to preserve data integrity; these are then synthesized into structured ontologies and visualized in Neo4j, yielding a rich knowledge graph (KG) for fast question answering. The approach rests on Retrieval-Augmented Generation (RAG) end to end, ensuring that all answers are grounded solely in the input corpus. Our method thus mitigates the limitations of existing large language models in producing canonical representations and retrieving reliable knowledge. By organizing information in a structured manner, it exposes human-interpretable representations of textual knowledge that are usable at two scales: atomic factual units and complex queries.
