Visual impairment is a significant challenge that affects millions of people worldwide. Access to written text, such as books, documents, and other printed materials, can be particularly difficult for individuals with visual impairments. In order to address this issue, our project aims to develop a text-to-Braille and speech translating device that will help people with visual impairments to access written text more easily and independently.
Date Created
The date the item was original created (prior to any relationship with the ASU Digital Repositories.)
In this thesis, six computer-simulation experiments were conducted to replicate the negative association between sample size and accuracy that is repeatedly reported in the ML literature, by accounting for data leakage and publication bias. Understanding why this negative association occurs is critical because published studies have repeatedly reported overoptimistic accuracies for ML models, leading to results that cannot be reproduced despite multiple trials and experiments. After the negative association was replicated, parametric curves (learning curves described by a parametric function) were fitted to the empirical learning curves in order to evaluate performance. The accuracies vary significantly when the sample size is small but show little to no variation when the sample size is large. In other words, at a large sample size the empirical learning curves with data leakage and publication bias reach the same accuracy as the learning curve without data leakage.
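One way to see how selective reporting alone could produce a negative association between sample size and reported accuracy is a small simulation. The sketch below is not the thesis's experimental setup: it assumes a hypothetical classifier with a fixed true accuracy (`TRUE_ACC`), treats each test-set evaluation as a series of independent correct/incorrect draws, and mimics publication bias by reporting only the best of `RUNS` evaluations. Data leakage is not modeled, and all names and parameter values are illustrative assumptions.

```python
import random

TRUE_ACC = 0.70   # assumed true accuracy of a hypothetical classifier
RUNS = 20         # assumed number of attempts before the best one is "published"


def observed_accuracy(n_test, rng):
    """Accuracy measured on n_test samples, each correct with probability TRUE_ACC."""
    correct = sum(rng.random() < TRUE_ACC for _ in range(n_test))
    return correct / n_test


def reported_accuracy(n_test, rng, runs=RUNS):
    """Publication-bias proxy: only the best of `runs` evaluations is reported."""
    return max(observed_accuracy(n_test, rng) for _ in range(runs))


def mean_reported(n_test, repeats=100, seed=0):
    """Average reported accuracy over many simulated 'studies' of the same size."""
    rng = random.Random(seed)
    total = sum(reported_accuracy(n_test, rng) for _ in range(repeats))
    return total / repeats


if __name__ == "__main__":
    for n in (20, 200, 1000):
        print(f"n_test={n:5d}  mean reported accuracy={mean_reported(n):.3f}")
```

Because the spread of a measured accuracy shrinks as the test set grows, the best-of-`RUNS` value is inflated most at small sample sizes and approaches `TRUE_ACC` at large ones, which is in line with the behavior the abstract describes.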
In the standard pipeline for machine learning model development, several design decisions are made largely based on trial and error. Take the classification problem as an example. The starting point for classifier design is a dataset with samples from the classes of interest. From this, the algorithm developer must decide which features to extract, which hypothesis class to condition on, which hyperparameters to select, and how to train the model. The design process is iterative, with the developer trying different classifiers, feature sets, and hyperparameters and using cross-validation to pick the model with the lowest error. As there are no guidelines for when to stop searching, developers can continue "optimizing" the model to the point where they begin to "fit to the dataset". These problems are amplified in the active learning setting, where the initial dataset may be unlabeled and label acquisition is costly. The aim of this dissertation is to develop algorithms that provide ML developers with additional information about the complexity of the underlying problem to guide downstream model development. I introduce the concept of "meta-features": features extracted from a dataset that characterize the complexity of the underlying data-generating process. In the context of classification, the complexity of the problem can be characterized by understanding two complementary meta-features: (a) the amount of overlap between classes, and (b) the geometry/topology of the decision boundary. Across three complementary works, I present a series of estimators for the meta-features that characterize overlap and the geometry/topology of the decision boundary, and demonstrate how they can be used in algorithm development.
Linear-regression estimators have become widely accepted as a reliable statistical tool for predicting outcomes. Because linear regression is a long-established procedure, the properties of linear-regression estimators are well understood, and the estimators can be trained very quickly. Many estimators exist for modeling linear relationships, each with ideal conditions for optimal performance. The differences stem from the bias introduced into the parameter estimation by various regularization strategies. One of the more popular is ridge regression, which uses ℓ2-penalization of the parameter vector. In this work, the proposed graph-regularized linear estimator is compared against ridge regression when the parameter vector is known to be dense. When additional knowledge that the parameters are smooth with respect to a graph is available, it can be used to improve the parameter estimates. To achieve this goal, an additional smoothing penalty is introduced into the traditional loss function of ridge regression. The mean squared error (MSE) is used as the performance metric, and the analysis is presented for fixed design matrices having a unit covariance matrix. This specific problem setup makes it possible to study the theoretical conditions under which the graph-regularized estimator outperforms the ridge estimator. The eigenvectors of the Laplacian matrix, which encodes the graph of connections between the dimensions of the parameter vector, form an integral part of the analysis. Experiments were conducted on simulated data to compare the performance of the two estimators for Laplacian matrices of several types of graphs: complete, star, line, and 4-regular. The experimental results indicate that the theory can possibly be extended to more general settings by taking smoothness, a concept defined in this work, into consideration.
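To make the comparison concrete, the following is a minimal sketch of the two estimators. It assumes the smoothing penalty takes the common Laplacian quadratic form, so that the graph-regularized objective is ‖y − Xβ‖² + λ‖β‖² + γ βᵀLβ; the abstract does not spell out the exact penalty, so this form, the function names, and the values of `lam` and `gamma` are illustrative assumptions rather than the author's formulation.

```python
import numpy as np


def line_graph_laplacian(p):
    """Laplacian L = D - A of a line (path) graph on p nodes."""
    A = np.zeros((p, p))
    idx = np.arange(p - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A


def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def graph_ridge(X, y, lam, gamma, L):
    """Ridge with an assumed extra smoothness penalty gamma * beta' L beta."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p) + gamma * L, X.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 40, 20
    X = rng.standard_normal((n, p))
    beta_true = np.linspace(1.0, 2.0, p)   # dense and smooth along the line graph
    y = X @ beta_true + rng.standard_normal(n)
    L = line_graph_laplacian(p)

    for name, est in (("ridge", ridge(X, y, 1.0)),
                      ("graph-regularized", graph_ridge(X, y, 1.0, 5.0, L))):
        print(name, "squared parameter error:", float(np.sum((est - beta_true) ** 2)))
```

Setting `gamma = 0` recovers plain ridge regression, so the two estimators differ only through the Laplacian term. Whether the extra term helps depends on how smooth the true parameter vector is with respect to the graph.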
Modern physical systems are evolving rapidly, with growing size, increasingly complex structures, and the incorporation of new devices. This calls for better planning, monitoring, and control. However, achieving these goals is challenging because system knowledge (e.g., system structures and edge parameters) may be unavailable even for a normally operating system, let alone under dynamic changes such as maintenance, reconfigurations, and events. Therefore, extracting system knowledge becomes a central topic. Fortunately, advanced metering techniques provide abundant data, leading to the emergence of Machine Learning (ML) methods with efficient learning and fast inference. This work proposes a systematic framework of ML-based methods to learn system knowledge under three what-if scenarios: (i) What if the system is operated normally? (ii) What if the system undergoes dynamic interventions? (iii) What if the system is new and has limited data? For each case, this thesis proposes principled solutions supported by extensive experiments. Chapter 2 tackles scenario (i); the guiding principle is to learn an ML model that maintains physical consistency, which brings high extrapolation capacity under changing operating conditions. The key finding is that physical consistency can be linked to convexity, a central concept in optimization. Therefore, convexified ML designs are proposed, and their global optimality implies faithfulness to the underlying physics. Chapter 3 handles scenario (ii); the goal is to identify the event time, type, and locations. The problem is formalized as multi-class classification with special attention to accuracy and speed. Chapter 3 then builds an ensemble learning framework that aggregates different ML models for better prediction. Next, to process high-volume data quickly, tensors (multi-dimensional arrays) are used to store and process the data, yielding compact and informative vectors for fast inference. Finally, if no labels exist, Chapter 3 uses physical properties to generate labels for learning. Chapter 4 deals with scenario (iii); a feasible approach is to transfer knowledge from similar systems under the framework of Transfer Learning (TL). Chapter 4 proposes a system-level TL approach that considers the network structure, complex spatial-temporal correlations, and different physical information.
In the era of big data, more and more decisions and recommendations are being made by machine learning (ML) systems and algorithms. Despite their many successes, there have been notable deficiencies in the robustness, rigor, and reliability of these ML systems, which have had detrimental societal impacts. In the next generation of ML, these significant challenges must be addressed through careful algorithmic design, and it is crucial that practitioners and meta-algorithms have the necessary tools to construct ML models that align with human values and interests. In an effort to help address these problems, this dissertation studies a tunable loss function called α-loss for the ML setting of classification. The α-loss is a hyperparameterized loss function originating from information theory that continuously interpolates between the exponential (α = 1/2), log (α = 1), and 0-1 (α = ∞) losses, hence providing a holistic perspective on several classical loss functions in ML. Furthermore, the α-loss exhibits unique operating characteristics depending on the value (and regime) of α; notably, for α > 1, α-loss trains models robustly when noisy training data is present. Thus, the α-loss can provide robustness to ML systems for classification tasks, which has bearing on many applications, e.g., social media, finance, academia, and medicine. Indeed, results are presented in which α-loss produces more robust logistic regression models for COVID-19 survey data, with gains over state-of-the-art algorithmic approaches.
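The interpolation described above can be illustrated with a short sketch. It assumes the form of α-loss commonly given in the α-loss literature, ℓ_α(p) = α/(α − 1) · (1 − p^(1 − 1/α)), where p is the probability the model assigns to the true label; the abstract itself does not state the formula, so the definition and the function names here are assumptions.

```python
import math


def alpha_loss(p_true, alpha):
    """Assumed alpha-loss of the probability p_true assigned to the true label.

    alpha = 1 is taken as the log-loss limit; alpha = inf gives 1 - p_true.
    """
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must be in (0, 1]")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if alpha == 1:
        return -math.log(p_true)
    if math.isinf(alpha):
        return 1.0 - p_true
    return alpha / (alpha - 1.0) * (1.0 - p_true ** (1.0 - 1.0 / alpha))


def sigmoid(z):
    """Logistic function mapping a margin z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    p = 0.01  # a confidently wrong prediction, e.g. caused by a flipped label
    for a in (0.5, 1, 2, math.inf):
        print(f"alpha={a}: loss={alpha_loss(p, a):.3f}")
```

With a sigmoid output, α = 1/2 reduces to exp(−z) in the margin z, which is the exponential loss. For α > 1 the loss is bounded above by α/(α − 1), so a single confidently wrong (possibly mislabeled) example contributes less than it would under log-loss, which is one intuition for the robustness claim.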
A remarkable phenomenon in contemporary physics is quantum scarring in classically chaotic systems, where the wave functions tend to concentrate on classical periodic orbits. Quantum scarring has been studied for more than four decades, but efficiently detecting quantum scars has remained challenging, relying mostly on human visualization of wave function patterns. This paper develops a machine learning approach to detecting quantum scars in an automated and highly efficient manner. In particular, this paper exploits meta learning. The first step is to construct a few-shot classification algorithm, under the requirement that the one-shot classification accuracy be larger than 90%. The second step is to propose a scheme based on a combination of neural networks to improve the accuracy. This paper shows that the machine learning scheme can find the correct quantum scars from thousands of images of wave functions, without any human intervention, regardless of the symmetry of the underlying classical system. This is the first application of meta learning to quantum systems.
Interacting spin networks are fundamental to quantum computing. Data-based tomography of time-independent spin networks has been achieved, but an open challenge is to ascertain the structures of time-dependent spin networks using time series measurements taken locally from a small subset of the spins. Physically, the dynamical evolution of a spin network under time-dependent driving or perturbation is described by the Heisenberg equation of motion. Motivated by this basic fact, this paper articulates a physics-enhanced machine learning framework whose core is Heisenberg neural networks. This paper demonstrates that, from local measurements, not only can the local Hamiltonian be recovered, but the Hamiltonian reflecting the interacting structure of the whole system can also be faithfully reconstructed. The Heisenberg neural machine is applied to spin networks of a variety of structures; in the extreme case where measurements are taken from only one spin, the achieved tomography fidelity values can reach about 90%. The developed machine learning framework is applicable to any time-dependent system whose quantum dynamical evolution is governed by the Heisenberg equation of motion.
Due to their effectiveness in capturing similarities between different entities, graphical models are widely used to represent datasets that reside on irregular and complex manifolds. Graph signal processing offers support for handling such complex datasets. By extending the conceptual framework of digital signal processing from the time and frequency domains to the graph domain, operators such as the graph shift, graph filter, and graph Fourier transform are defined. In this dissertation, two novel graph filter design methods are proposed. First, a graph filter with multiple shift matrices is applied to semi-supervised classification; it can handle features of uneven quality through an embedded feature-importance evaluation process. Three optimization solutions are provided: an alternating minimization method that is simple to implement, a convex relaxation method that provides a theoretical performance benchmark, and a genetic algorithm that is computationally efficient and better at handling overfitting. Second, a graph filter with a splitting-and-merging scheme is proposed, which splits the graph into multiple subgraphs. The corresponding subgraph filters are trained in parallel, and the final graph filter is obtained by merging all the subgraph filters. Because of the splitting process, redundant edges in the original graph are dropped, which can save computational cost in semi-supervised classification. At the same time, this scheme also enables the filter to represent unevenly sampled data in manifold learning. To evaluate the performance of the proposed graph filter design approaches, simulation experiments with synthetic and real datasets are conducted. The Monte Carlo cross-validation method is employed to demonstrate the need for the proposed graph filter design approaches in various application scenarios. Criteria such as accuracy, Gini score, F1-score, and learning curves are provided to analyze the performance of the proposed methods and their competitors.
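For readers unfamiliar with the operators named above, the sketch below shows a basic single-shift polynomial graph filter and the graph Fourier transform. It is a textbook-style illustration, not either of the dissertation's proposed designs: it assumes a symmetric adjacency matrix as the graph shift `S`, and the filter coefficients `h` and the 4-node ring graph are hypothetical.

```python
import numpy as np


def graph_filter(S, h):
    """Polynomial graph filter H = sum_k h[k] * S^k (matrix powers of the shift)."""
    H = np.zeros_like(S, dtype=float)
    S_power = np.eye(S.shape[0])
    for coeff in h:
        H += coeff * S_power
        S_power = S_power @ S
    return H


def gft(S, x):
    """Graph Fourier transform for a symmetric shift: project x onto S's eigenvectors."""
    eigvals, V = np.linalg.eigh(S)
    return eigvals, V, V.T @ x


if __name__ == "__main__":
    # Assumed example: adjacency matrix of an undirected ring with 4 nodes.
    S = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    x = np.array([1.0, 0.0, 0.0, 0.0])   # a graph signal: one value per node
    h = [0.5, 0.25, 0.125]               # hypothetical filter taps

    y_vertex = graph_filter(S, h) @ x    # filtering in the vertex domain

    # Equivalent filtering in the graph frequency domain.
    eigvals, V, x_hat = gft(S, x)
    response = sum(c * eigvals ** k for k, c in enumerate(h))
    y_spectral = V @ (response * x_hat)

    print(np.allclose(y_vertex, y_spectral))
```

The two computations agree because a polynomial in `S` shares the eigenvectors of `S`, so filtering amounts to scaling each graph frequency component by the polynomial evaluated at the corresponding eigenvalue.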
The seminal work of Lasry and Lions showed the existence of Nash equilibria in the continuum limit of agents who try to optimize their own utility functions. However, much of the work in this area is predicated on strong assumptions about the asymptotic independence of the agents and their homogeneity. This work explores the existence of equilibria in the limit for Markov Decision Processes (MDPs) for density-dependent continuous-time Markov chains. Under suitable conditions, it is possible to show that the empirical measure of the agents converges in finite time to a time-invariant distribution, which makes the solution of the MDP tractable. This key step allows one to show not only the existence of equilibria for these MDPs without asymptotic independence but also a tractable means of finding said equilibria. Finally, this work shows that a fixed point does exist in the infinite state limit. However, showing that such a limit is indeed a Nash equilibrium remains an open problem.
With the emergence of next-generation wireless communication, a growing number of new applications such as the Internet of Things, autonomous cars, and drones are crowding the unlicensed spectrum. Licensed networks such as LTE are also moving into the unlicensed spectrum in order to deliver high-capacity content at low cost. However, LTE was not designed to share spectrum with other networks. A cooperation center for these networks is costly because they possess heterogeneous properties and any network can enter and leave the spectrum without restriction, which makes the design challenging. Since it is infeasible to cover a potentially infinite number of scenarios with one unified design, an alternative solution is to let each network learn its own coexistence policy. Previous solutions work only in fixed scenarios. In this work we present a reinforcement learning algorithm to handle the coexistence of Wi-Fi and LTE-LAA agents in the 5 GHz unlicensed spectrum. The coexistence problem was modeled as a Dec-POMDP, and a Bayesian approach with a nonparametric prior was adopted for policy learning to accommodate the uncertainty of the policy for different agents. A fairness measure was introduced in the reward function to encourage fair sharing between agents. We turned reinforcement learning into an optimization problem by treating the value function as a likelihood and using variational inference for posterior approximation. Simulation results demonstrate that this algorithm can reach high value with compact policy representations and remains computationally efficient when applied to the agent set.