Multi-class and Multi-label classification of Darkweb Data
Description
In this research, I address a multi-class, multi-label classification problem, where
the goal is to automatically assign one or more labels (tags) to discussion topics seen
on the deep web. I observed a natural hierarchy in our dataset, and I used different
techniques to enforce a hierarchical integrity constraint on the predicted tag list. To
address the 'class imbalance' and 'scarcity of labeled data' problems, I developed a
semi-supervised model based on the Elasticsearch (ES) document relevance score. I evaluate
our models using standard K-fold cross-validation. Enforcing hierarchical
integrity constraints improved the F1 score by 11.9% over standard supervised learning,
while our ES-based semi-supervised learning model outperformed the other models in
precision (78.4%) while maintaining comparable recall (21%).
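The abstract does not say which techniques were used to enforce the hierarchical integrity constraint. As a rough illustration only, one common approach is ancestor closure: whenever a child tag is predicted, every ancestor of that tag is added to the prediction as well. The sketch below assumes this approach; the tag names, the `PARENT` mapping, and the `enforce_hierarchy` function are hypothetical and are not taken from the thesis.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical tag hierarchy: child tag -> parent tag (None marks a root).
# These tags are illustrative only and do not come from the thesis dataset.
PARENT: Dict[str, Optional[str]] = {
    "malware": None,
    "ransomware": "malware",
    "marketplace": None,
    "drugs": "marketplace",
}


def enforce_hierarchy(
    predicted: Iterable[str], parent: Dict[str, Optional[str]]
) -> Set[str]:
    """Return the predicted tags plus every ancestor of each predicted tag."""
    closed: Set[str] = set()
    for tag in predicted:
        node: Optional[str] = tag
        # Walk up towards the root, stopping early if this part of the
        # chain was already added while processing another tag.
        while node is not None and node not in closed:
            closed.add(node)
            node = parent.get(node)
    return closed
```

For example, under this assumed hierarchy a prediction of only `ransomware` would be expanded to include `malware`, so the final tag list never contains a child without its parent. The opposite policy (dropping a child whose parent was not predicted) would also satisfy the constraint, trading recall for precision.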
Date Created
The date the item was originally created (prior to any relationship with the ASU Digital Repositories).
2018
Agent
- Author (aut): Patil, Revanth
- Thesis advisor (ths): Shakarian, Paulo
- Committee member: Doupe, Adam
- Committee member: Davulcu, Hasan
- Publisher (pbl): Arizona State University