Description
The introduction of parameterized loss functions for robustness in machine learning has raised the question of how the hyperparameters of these loss functions can be tuned. This thesis explores how Bayesian methods can be leveraged to tune such hyperparameters. Specifically, a modified Gibbs sampling scheme is used to generate a distribution over the loss parameters of tunable loss functions. The modified Gibbs sampler is a two-block sampler that alternates between sampling the loss parameter and optimizing the other model parameters. The sampling step is performed using slice sampling, while the optimization step is performed using gradient descent. This thesis applies the modified Gibbs sampler to alpha-loss, a tunable loss function with a single parameter $\alpha \in (0,\infty]$ that is designed for the classification setting. Theoretically, it is shown that the Markov chain generated by a modified Gibbs sampling scheme is ergodic; that is, the chain has, and converges to, a unique stationary (posterior) distribution. Further, the modified Gibbs sampler is evaluated in two experiments, one on a synthetic dataset and one on a canonical image dataset. The results show that the modified Gibbs sampler performs well under label noise, generating a distribution that indicates a preference for larger values of $\alpha$, matching the outcomes of previous experiments.
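The two-block scheme described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The alpha-loss formula follows the published alpha-loss literature rather than this record, and everything else is an assumption made for illustration: the logistic model, the standard normal prior on $\log\alpha$, the conditional density proportional to exp(-total loss) times that prior, the synthetic data, and all function names. The thesis's actual posterior and its ergodicity argument are not reproduced here.

```python
import numpy as np

def alpha_loss(p_true, alpha):
    """Alpha-loss of the probability assigned to the true label, alpha in (0, inf]."""
    p_true = np.clip(p_true, 1e-12, 1.0)
    if np.isinf(alpha):
        return 1.0 - p_true                      # limiting case alpha = inf
    if np.isclose(alpha, 1.0):
        return -np.log(p_true)                   # alpha = 1 recovers log-loss
    return alpha / (alpha - 1.0) * (1.0 - p_true ** ((alpha - 1.0) / alpha))

def prob_true(theta, X, y):
    """Logistic-model probability of the observed label, with y in {-1, +1}."""
    margin = np.clip(y * (X @ theta), -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-margin))

def grad_step(theta, X, y, alpha, lr=0.5):
    """One gradient-descent step on the mean alpha-loss (hypothetical model)."""
    p = prob_true(theta, X, y)
    # d(alpha-loss)/d(margin) = -p**(1 - 1/alpha) * (1 - p)
    weight = -(p ** (1.0 - 1.0 / alpha)) * (1.0 - p)
    return theta - lr * (X * (weight * y)[:, None]).mean(axis=0)

def log_conditional(u, theta, X, y):
    """ASSUMED unnormalised log-density of u = log(alpha) given theta."""
    alpha = np.exp(u)
    if alpha <= 0.0:                             # guards against exp underflow
        return -np.inf
    with np.errstate(over="ignore"):
        total = alpha_loss(prob_true(theta, X, y), alpha).sum()
    return -total - 0.5 * u ** 2                 # assumed N(0, 1) prior on log(alpha)

def slice_sample(log_density, x0, rng, width=1.0):
    """Univariate slice sampling with stepping-out and shrinkage."""
    log_y = log_density(x0) - rng.exponential()  # log of the auxiliary slice height
    left = x0 - width * rng.uniform()
    right = left + width
    while log_density(left) > log_y:             # step out to the left
        left -= width
    while log_density(right) > log_y:            # step out to the right
        right += width
    while True:                                  # shrink until a point is accepted
        x1 = rng.uniform(left, right)
        if log_density(x1) > log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1

def modified_gibbs(X, y, n_iter=100, gd_steps=20, seed=0):
    """Alternate between sampling alpha and optimising theta."""
    rng = np.random.default_rng(seed)
    theta, u = np.zeros(X.shape[1]), 0.0         # start at alpha = 1 (log-loss)
    alphas = []
    for _ in range(n_iter):
        # Block 1: slice-sample log(alpha) with theta held fixed
        u = slice_sample(lambda v: log_conditional(v, theta, X, y), u, rng)
        alpha = float(np.exp(u))
        # Block 2: gradient descent on theta with alpha held fixed
        for _ in range(gd_steps):
            theta = grad_step(theta, X, y, alpha)
        alphas.append(alpha)
    return np.array(alphas), theta

def make_noisy_data(n=200, flip=0.1, seed=0):
    """Hypothetical synthetic binary data with symmetric label noise."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    y = np.where(X @ np.array([2.0, -1.0]) > 0, 1.0, -1.0)
    y[rng.uniform(size=n) < flip] *= -1.0        # flip a fraction of the labels
    return X, y

X, y = make_noisy_data()
alphas, theta = modified_gibbs(X, y, n_iter=100)
print("mean of sampled alpha:", alphas.mean())
```

Sampling on the $\log\alpha$ scale keeps the slice sampler's support unconstrained, which is a convenience of this sketch and not a claim about how the thesis parameterizes $\alpha$.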
Details
Title
- Bayesian Methods for Tuning Hyperparameters of Loss Functions in Machine Learning
Contributors
- Cole, Erika Lingo (Author)
- Sankar, Lalitha (Thesis advisor)
- Lan, Shiwei (Thesis advisor)
- Pedrielli, Giulia (Committee member)
- Hahn, Paul (Committee member)
- Arizona State University (Publisher)
Date Created
2022
Subjects
Resource Type
Collections this item is in
Note
- Partial requirement for: M.A., Arizona State University, 2022
- Field of study: Mathematics