Semantic Information Extraction From Natural Language Using a Learning and Rule-Based Approach
Description
Open Information Extraction (OIE) is a subfield of Natural Language Processing (NLP) concerned with converting natural language into structured, machine-readable data. This thesis uses data in Resource Description Framework (RDF) triple format, in which each triple comprises a subject, a predicate, and an object. The extraction of RDF triples from natural language is an essential step towards importing data into web ontologies as part of the linked open data cloud on the Semantic Web. A number of related techniques exist for extracting triples from plain natural language text, including but not limited to ClausIE, OLLIE, ReVerb, and DeepEx. This proposed study aims to reduce the dependency on conventional machine learning models, since they require training datasets and are not easily customizable or explainable. By leveraging a context-free grammar (CFG) based model, this thesis aims to address some of these issues while minimizing the trade-offs in performance and accuracy. Furthermore, a deep dive is conducted to analyze the strengths and limitations of the proposed approach.
Date Created
The date the item was originally created (prior to any relationship with the ASU Digital Repositories).
2023
Agent
- Author (aut): Singh, Varun
- Thesis advisor (ths): Bansal, Srividya
- Committee member: Bansal, Ajay
- Committee member: Mehlhase, Alexandra
- Publisher (pbl): Arizona State University