Description
In this research, I address a multi-class, multi-label classification problem in which
the goal is to automatically assign one or more labels (tags) to discussion topics seen
on the deep web. I observed a natural hierarchy in the dataset, and I used different
techniques to enforce a hierarchical integrity constraint on the predicted tag list. To
address the 'class imbalance' and 'scarcity of labeled data' problems, I developed a
semi-supervised model based on the Elasticsearch (ES) document relevance score. I evaluate
the models using standard K-fold cross-validation. Enforcing the hierarchical
integrity constraint improved the F1 score by 11.9% over standard supervised learning,
while the ES-based semi-supervised model outperformed the other models in
precision (78.4%) while maintaining comparable recall (21%).
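The abstract does not say which techniques were used to enforce the hierarchical integrity constraint, so the following is only a minimal sketch of two common post-processing options, not the thesis's method. It assumes the tag hierarchy is an acyclic child-to-parent map; the tag names, the `PARENT` map, and the function names are all hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical child -> parent map (None marks a top-level tag).
# These tag names are illustrative and do not come from the thesis.
PARENT: Dict[str, Optional[str]] = {
    "malware": None,
    "ransomware": "malware",
    "hacking": None,
    "exploit-kit": "hacking",
}


def ancestors(tag: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of ancestors of `tag`, nearest first.

    Assumes the hierarchy has no cycles.
    """
    chain: List[str] = []
    current = parent.get(tag)
    while current is not None:
        chain.append(current)
        current = parent.get(current)
    return chain


def add_missing_ancestors(
    predicted: Iterable[str], parent: Dict[str, Optional[str]]
) -> Set[str]:
    """Option 1: repair the prediction by adding every missing ancestor."""
    result = set(predicted)
    for tag in list(result):
        result.update(ancestors(tag, parent))
    return result


def drop_orphans(
    predicted: Iterable[str], parent: Dict[str, Optional[str]]
) -> Set[str]:
    """Option 2: keep a tag only if all of its ancestors were also predicted."""
    tags = set(predicted)
    return {t for t in tags if all(a in tags for a in ancestors(t, parent))}


if __name__ == "__main__":
    raw = {"ransomware", "hacking"}
    print(sorted(add_missing_ancestors(raw, PARENT)))
    print(sorted(drop_orphans(raw, PARENT)))
```

Adding ancestors tends to favor recall, while dropping orphaned tags tends to favor precision; which trade-off suits a given dataset is an empirical question.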
Details
Title
- Multi-class and Multi-label classification of Darkweb Data
Contributors
- Patil, Revanth (Author)
- Shakarian, Paulo (Thesis advisor)
- Doupe, Adam (Committee member)
- Davulcu, Hasan (Committee member)
- Arizona State University (Publisher)
Date Created
The date the item was originally created (prior to any relationship with the ASU Digital Repositories).
2018
Note
- Masters Thesis Computer Science 2018