Full metadata
Title
Improving and Automating Machine Learning Model Compression
Description
Machine learning models are increasingly employed by smart devices on the edge to support important applications such as real-time virtual assistants and privacy-preserving healthcare. However, deploying state-of-the-art (SOTA) deep learning models on these devices faces multiple serious challenges. First, it is infeasible to deploy large models on resource-constrained edge devices, whereas small models cannot achieve SOTA accuracy. Second, it is difficult to customize models to diverse application requirements for accuracy and speed and to the diverse capabilities of edge devices. This study proposes several novel solutions that comprehensively address these challenges through automated and improved model compression. First, it introduces Automatic Attention Pruning (AAP), an adaptive, attention-based pruning approach that automatically reduces model parameters while meeting diverse user objectives for model size, speed, and accuracy. AAP achieves a 92.72% parameter reduction in ResNet-101 on Tiny-ImageNet without any accuracy loss. Second, it presents Self-Supervised Quantization-Aware Knowledge Distillation (SQAKD), a framework for reducing model precision without supervision from labeled training data; for example, it quantizes VGG-8 to 2 bits on CIFAR-10 without any accuracy loss. Finally, the study explores two additional approaches, the Contrastive Knowledge Distillation Framework (CKDF) and Log-Curriculum based Module Replacing (LCMR), for further improving the performance of small models. All of the proposed techniques are designed to address real-world challenges and have been successfully deployed on diverse hardware platforms, including cloud instances and edge devices, catalyzing AI for the edge.
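The abstract names two families of compression techniques: attention-based filter pruning and quantization-aware training. The dissertation's actual algorithms are not reproduced here; the PyTorch sketch below only illustrates the generic ideas those families build on — scoring convolutional filters by a simple importance proxy and masking the lowest-scoring ones, and fake-quantizing a tensor with a straight-through estimator so a low-precision model remains trainable. All function names, the L1-norm scoring choice, and the uniform quantizer are assumptions for illustration, not AAP's or SQAKD's actual criteria.

```python
import torch
import torch.nn as nn

def filter_importance(conv: nn.Conv2d) -> torch.Tensor:
    # Score each output filter by the L1 norm of its weights -- a simple
    # stand-in for the activation-based attention scores that
    # attention-guided pruning methods compute (illustrative assumption).
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def prune_mask(scores: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    # Keep the top `keep_ratio` fraction of filters and zero out the rest.
    k = max(1, int(keep_ratio * scores.numel()))
    threshold = scores.topk(k).values.min()
    return (scores >= threshold).float()

def fake_quantize(x: torch.Tensor, num_bits: int = 2) -> torch.Tensor:
    # Uniform fake quantization with a straight-through estimator:
    # quantize in the forward pass, but let gradients flow through
    # unchanged so low-bit training remains possible.
    qmax = 2 ** num_bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo).clamp(min=1e-8) / qmax
    q = ((x - lo) / scale).round().clamp(0, qmax)
    return x + (q * scale + lo - x).detach()

conv = nn.Conv2d(64, 128, kernel_size=3)
mask = prune_mask(filter_importance(conv), keep_ratio=0.5)
with torch.no_grad():
    # Soft pruning: zero out the pruned filters' weights in place.
    conv.weight.mul_(mask.view(-1, 1, 1, 1))
```

Per the abstract, AAP additionally adapts its pruning decisions to user objectives for size, speed, and accuracy, and SQAKD distills a quantized student without labeled data (presumably from a full-precision teacher); neither mechanism is shown in this sketch.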
Date Created
2024
Contributors
- Zhao, Kaiqi (Author)
- Zhao, Ming (Thesis advisor)
- Li, Baoxin (Committee member)
- Zou, Jia (Committee member)
- Yang, Yingzhen (Committee member)
- Arizona State University (Publisher)
Extent
134 pages
Language
eng
Copyright Statement
In Copyright
Peer-reviewed
No
Open Access
No
Handle
https://hdl.handle.net/2286/R.2.N.193384
Level of coding
minimal
Note
Partial requirement for: Ph.D., Arizona State University, 2024
Field of study: Computer Science
System Created
- 2024-05-02 01:20:12
System Modified
- 2024-05-02 01:20:19