Using ChatGPT to Aid in Concert Band Music Selection: A Pilot Study

Description
With the central focus of making it easier for wind band conductors to find repertoire written by composers from underrepresented communities, this pilot study aimed to incorporate a ChatGPT widget within The Wind Repertory Project database, allowing it to quickly pull accurate information from the corresponding website, www.windrep.org. During this research, I created a custom GPT that is directly linked to The Wind Repertory Project database and am in the process of adding the aforementioned widget. This paper documents how I obtained the knowledge necessary to link ChatGPT with The Wind Repertory Project website and provides a brief history of artificial intelligence (AI) and the evolution of ChatGPT.
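As an illustration only, not the study's implementation, the retrieval pattern behind such a widget can be sketched in Python: fetch a windrep.org page, strip it to text, and pass it to a chat model as grounding context. The model name, page URL, and prompt wording below are assumptions.

import requests
from bs4 import BeautifulSoup
from openai import OpenAI

def ask_about_page(page_url: str, question: str) -> str:
    # Fetch a Wind Repertory Project page and reduce it to plain text.
    html = requests.get(page_url, timeout=10).text
    page_text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True)
    # Ground the model's answer in the fetched page text.
    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[
            {"role": "system",
             "content": "Answer using only the provided windrep.org text."},
            {"role": "user",
             "content": f"{question}\n\nPage text:\n{page_text[:8000]}"},
        ],
    )
    return response.choices[0].message.content

# Hypothetical usage:
# print(ask_about_page("https://www.windrep.org/Main_Page", "Name one listed work."))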
Date Created
2024

Physics-guided Machine Learning in Air Traffic Flow Prediction and Optimization

Description
The increasing demands of air travel and the escalating complexity of air traffic management (ATM) necessitate advanced air traffic flow prediction and optimization methodologies. This dissertation delves into integrating physics-guided machine learning techniques to address these challenges. Encompassing four pivotal studies, it contributes to the ATM field by showcasing how data-driven insights and physical principles can revolutionize our understanding and management of air traffic density, state predictions, flight delays, and airspace sectorization. The first study investigates the Bayesian Ensemble Graph Attention Network (BEGAN), a novel machine learning framework designed for precise air traffic density prediction. BEGAN combines spatial-temporal analysis with domain knowledge, enabling the model to interpret complex air traffic patterns in a highly dynamic and regulated airspace environment. The second study introduces the Physics-Informed Graph Attention Transformer, a novel approach integrating graph-based spatial learning with temporal Transformers. This model excels in capturing dynamic spatial-temporal interdependencies and integrates partial differential equations from fluid mechanics, enhancing predictive accuracy and interpretability in ATM. The third study shifts focus to predictive modeling of aircraft delays, employing Physics-Informed Neural Networks. By utilizing sparse regression for system identification, this approach adeptly deciphers the intricate partial differential equations that dictate near-terminal air traffic dynamics, providing a novel perspective in forecasting flight delays with enhanced precision. The final study focuses on dynamic airspace sectorization, deploying an attention-based deep learning model that adeptly navigates the complexities of workload dynamics. In conjunction with constrained K-means clustering and evolutionary algorithms, it facilitates a more efficient and adaptable approach to airspace management, ensuring optimal traffic flow and safety. The findings of these studies demonstrate the significant impact of physics-guided machine learning in advancing ATM's safety and efficiency. They mark a shift from traditional empirical methods to innovative, data-driven approaches for air traffic management. This research enhances current practices and charts new paths for future technological advancements in aviation, especially in autonomous systems and digital transformation.
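As a hedged sketch of the general physics-guided idea, rather than any of the dissertation's actual models, a physics-informed training loss can combine a data-fitting term with the residual of a transport-type PDE computed by automatic differentiation. The network size, the 1-D continuity equation, and the constant advection speed below are all illustrative assumptions.

import torch

# Small net mapping (x, t) -> traffic density rho.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
)

def physics_residual(xt):
    # Residual of d(rho)/dt + u * d(rho)/dx = 0, via autograd.
    xt = xt.requires_grad_(True)
    rho = net(xt)
    grads = torch.autograd.grad(rho.sum(), xt, create_graph=True)[0]
    drho_dx, drho_dt = grads[:, 0:1], grads[:, 1:2]
    u = 1.0  # assumed constant advection speed
    return drho_dt + u * drho_dx

xt_obs, rho_obs = torch.rand(128, 2), torch.rand(128, 1)  # stand-in data
xt_col = torch.rand(256, 2)  # collocation points for the PDE term
loss = (torch.nn.functional.mse_loss(net(xt_obs), rho_obs)
        + (physics_residual(xt_col) ** 2).mean())
loss.backward()  # data fit and physics consistency trained jointly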
Date Created
2024

LanSAR – Language-commanded Scene-aware Action Response

Description
Robot motion and control remains a complex problem both in general and in the field of machine learning (ML). Without ML approaches, robot controllers are typically designed manually, which can take considerable time, generally requires accounting for a range of edge cases, and often produces models highly constrained to specific tasks. ML can decrease the time it takes to create a model while simultaneously allowing it to operate on a broader range of tasks. The utilization of neural networks to learn from demonstration is, in particular, an approach with growing popularity due to its potential to quickly fit the parameters of a model to mimic training data. Many such neural networks, especially in the realm of transformer-based architectures, act more as planners, taking in an initial context and then generating a sequence from that context one step at a time. Others hybridize the approach, predicting a latent plan and conditioning immediate actions on that plan. Such approaches may limit a model’s ability to interact with a dynamic environment, needing to replan to fully update its understanding of the environmental context. In this thesis, Language-commanded Scene-aware Action Response (LanSAR) is proposed as a reactive transformer-based neural network that makes immediate decisions based on previous actions and environmental changes. Its actions are further conditioned on a language command, serving as a control mechanism while also narrowing the distribution of possible actions around this command. It is shown that LanSAR successfully learns a strong representation of multimodal visual and spatial input, and learns reasonable motions in relation to most language commands. It is also shown that LanSAR can struggle with both the accuracy of its motions and its understanding of the specific semantics of language commands.
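A minimal sketch of the reactive pattern described above, assuming toy dimensions, random stand-in inputs, and a pre-computed language embedding (this is not LanSAR's actual architecture): at each step the command token and the recent observation/action history are re-encoded together, and the next action is read off directly, with no intermediate plan.

import torch
import torch.nn as nn

class ReactivePolicy(nn.Module):
    # Toy reactive transformer: conditions the next action on a language
    # embedding plus a short history of observation/action tokens.
    def __init__(self, d_model=64, action_dim=7):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, action_dim)

    def forward(self, lang_token, history_tokens):
        # lang_token: (B, 1, d_model); history_tokens: (B, T, d_model)
        tokens = torch.cat([lang_token, history_tokens], dim=1)
        encoded = self.encoder(tokens)
        return self.head(encoded[:, -1])  # act from the most recent token

policy = ReactivePolicy()
action = policy(torch.randn(2, 1, 64), torch.randn(2, 10, 64))  # (2, 7)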
Date Created
2024

Addressing Efficiency and Reliability Challenges in Natural Language Processing

Description
Recently developed large language models (LLMs) have achieved remarkable success on a wide range of natural language tasks. Furthermore, they have been shown to possess an impressive ability to generate fluent and coherent text. Despite all the notable abilities of these models, there exist several efficiency- and reliability-related challenges. For example, they are vulnerable to a phenomenon called 'hallucination', in which they generate text that is not factually correct, and they also have a large number of parameters, which makes their inference slow and computationally expensive. With the objective of taking a step closer towards enabling the widespread adoption of Natural Language Processing (NLP) systems, this dissertation studies the following question: how can the efficiency and reliability concerns of NLP systems be effectively addressed? Specifically, to improve the reliability of models, this dissertation first presents an approach that actively detects and mitigates the hallucinations of LLMs using a retrieval-augmented methodology. Note that another strategy to mitigate incorrect predictions is abstention from answering when error is likely, i.e., selective prediction. To this end, I present selective prediction approaches and conduct extensive experiments to demonstrate their effectiveness. Building on top of selective prediction, I also present post-abstention strategies that focus on reliably increasing the coverage of a selective prediction system without considerably impacting its accuracy. Furthermore, this dissertation covers multiple aspects of improving efficiency, including 'inference efficiency' (making model inferences in a computationally efficient manner without sacrificing prediction accuracy), 'data sample efficiency' (efficiently collecting data instances for training a task-specific system), 'open-domain QA reader efficiency' (leveraging external knowledge efficiently while answering open-domain questions), and 'evaluation efficiency' (comparing the performance of different models efficiently). In summary, this dissertation highlights several efficiency and reliability challenges pertinent to the development of NLP systems and provides effective solutions to address them.
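As a simple illustration of the selective prediction idea, not the dissertation's method, a classifier can abstain whenever its maximum softmax probability falls below a threshold; the threshold then trades coverage (fraction of inputs answered) against selective accuracy. The threshold value and toy probabilities below are assumptions.

import numpy as np

def selective_predict(probs: np.ndarray, threshold: float = 0.8):
    # probs: (N, num_classes) softmax outputs.
    # Returns the predicted class, or -1 (abstain) when confidence is low.
    confidence = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    return np.where(confidence >= threshold, predictions, -1)

probs = np.array([[0.95, 0.05], [0.55, 0.45], [0.30, 0.70]])
print(selective_predict(probs))  # [ 0 -1 -1]: only the confident case answers
# Coverage is the fraction answered; accuracy is measured on answered items only.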
Date Created
2024

Improving and Automating Machine Learning Model Compression

Description
Machine learning models are increasingly employed by smart devices on the edge to support important applications such as real-time virtual assistants and privacy-preserving healthcare. However, deploying state-of-the-art (SOTA) deep learning models on devices faces multiple serious challenges. First, it is infeasible to deploy large models on resource-constrained edge devices whereas small models cannot achieve the SOTA accuracy. Second, it is difficult to customize the models according to diverse application requirements in accuracy and speed and diverse capabilities of edge devices. This study proposes several novel solutions to comprehensively address the above challenges through automated and improved model compression. First, it introduces Automatic Attention Pruning (AAP), an adaptive, attention-based pruning approach to automatically reduce model parameters while meeting diverse user objectives in model size, speed, and accuracy. AAP achieves an impressive 92.72% parameter reduction in ResNet-101 on Tiny-ImageNet without causing any accuracy loss. Second, it presents Self-Supervised Quantization-Aware Knowledge Distillation (SQAKD), a framework for reducing model precision without supervision from labeled training data. For example, it quantizes VGG-8 to 2 bits on CIFAR-10 without any accuracy loss. Finally, the study explores two more works, Contrastive Knowledge Distillation Framework (CKDF) and Log-Curriculum based Module Replacing (LCMR), for further improving the performance of small models. All the works proposed in this study are designed to address real-world challenges, and have been successfully deployed on diverse hardware platforms, including cloud instances and edge devices, catalyzing AI for the edge.
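As a hedged sketch of structured pruning in general, using a plain magnitude criterion as a stand-in for AAP's attention-based one (the criterion and the pruning ratio below are assumptions): the output filters with the smallest L1 norms are zeroed until a target fraction is removed.

import torch

def prune_conv_filters(conv: torch.nn.Conv2d, prune_ratio: float = 0.5):
    # Zero out the output filters with the smallest L1 norms (structured pruning).
    with torch.no_grad():
        norms = conv.weight.abs().sum(dim=(1, 2, 3))  # one norm per output filter
        k = int(prune_ratio * norms.numel())
        idx = torch.argsort(norms)[:k]                # the weakest k filters
        conv.weight[idx] = 0.0
        if conv.bias is not None:
            conv.bias[idx] = 0.0
    return idx

conv = torch.nn.Conv2d(3, 16, kernel_size=3)
pruned = prune_conv_filters(conv, 0.25)  # indices of the zeroed filters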
Date Created
2024

Incorporating Causal Information using Temporal-Logic-Based Causal Diagram in Reinforcement Learning

Description
In this thesis, I investigate a subset of reinforcement learning (RL) tasks where the objective for the agent is to achieve temporally extended goals. A common approach in this setting is to represent the tasks using deterministic finite automata (DFA) and integrate them into the state space of the RL algorithms, yet such representations often disregard causal knowledge pertinent to the environment. To address this limitation, I introduce the Temporal-Logic-based Causal Diagram (TL-CD) in RL. A TL-CD encapsulates temporal causal relationships among diverse environmental properties. I leverage the TL-CD to devise an RL algorithm that significantly reduces environment exploration requirements. By synergizing TL-CD with task-specific DFAs, I identify scenarios wherein the agent can efficiently determine expected rewards early during the exploration phases. Through a series of case studies, I empirically demonstrate the advantages of TL-CDs, particularly highlighting the accelerated convergence of the algorithm towards an optimal policy facilitated by diminished exploration of the environment.
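As a generic sketch of the DFA-in-the-state-space construction this thesis builds on (the TL-CD machinery itself is not shown, and the task, labels, and rewards are illustrative), Q-learning can run over product states (environment state, automaton state), with the DFA advancing on observed event labels.

from collections import defaultdict

# Toy DFA for "reach a, then b": states 0 -> 1 -> 2 (accepting).
DFA = {(0, "a"): 1, (1, "b"): 2}

def dfa_step(q, label):
    # Self-loop on labels with no defined transition.
    return DFA.get((q, label), q)

Q = defaultdict(float)

def q_update(env_s, q, action, reward, env_s2, q2, actions,
             alpha=0.1, gamma=0.99):
    # Standard Q-learning update over the product state (env_s, q).
    best_next = max(Q[(env_s2, q2, a)] for a in actions)
    key = (env_s, q, action)
    Q[key] += alpha * (reward + gamma * best_next - Q[key])

# The automaton advances only on the right labels, in the right order:
assert dfa_step(0, "a") == 1 and dfa_step(1, "b") == 2 and dfa_step(0, "b") == 0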
Date Created
2024

Towards Unsupervised Denoising of Magnetic Resonance Imaging Scans

Description
Image denoising, a fundamental task in computer vision, poses significant challenges due to its inherently inverse and ill-posed nature. Despite advancements in traditional methods and supervised learning approaches, particularly in medical imaging such as Magnetic Resonance Imaging (MRI) scans, the reliance on paired datasets and known noise distributions remains a practical hurdle. Recent progress in noise statistical independence theory and diffusion models has revitalized research interest, offering promising avenues for unsupervised denoising. However, existing methods often yield overly smoothed results or introduce hallucinated structures, limiting their clinical applicability. This thesis tackles the core challenge of progressing towards unsupervised denoising of MRI scans. It aims to retain intricate details without smoothing or introducing artificial structures, thus ensuring the production of high-quality MRI images. The thesis makes a three-fold contribution: Firstly, it presents a detailed analysis of traditional techniques, early machine learning algorithms for denoising, and new statistical-based models, with an extensive evaluation study on self-supervised denoising methods highlighting their limitations. Secondly, it conducts an evaluation study on an emerging class of diffusion-based denoising methods, accompanied by additional empirical findings and discussions on their effectiveness and limitations, proposing solutions to enhance their utility. Lastly, it introduces a novel approach: Unsupervised Multi-stage Ensemble Deep Learning with diffusion models for denoising MRI scans (MEDL). Leveraging diffusion models, this approach operates independently of signal or noise priors and incorporates weighted rescaling of multi-stage reconstructions to balance over-smoothing and hallucination tendencies. Evaluation using benchmark datasets demonstrates an average gain of 1 dB in PSNR and 2% in SSIM over existing approaches.
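As a hedged illustration of the weighted-rescaling step described above (the weights and stand-in reconstructions are placeholders, not MEDL's actual procedure), several intermediate reconstructions can be blended with normalized weights and the result scored with PSNR.

import numpy as np

def weighted_ensemble(recons, weights):
    # recons: list of (H, W) reconstructions; weights are normalized to sum to 1.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * r for wi, r in zip(w, recons))

def psnr(reference, estimate, max_val=1.0):
    # PSNR = 10 * log10(max_val^2 / MSE), in dB.
    mse = np.mean((reference - estimate) ** 2)
    return 20 * np.log10(max_val) - 10 * np.log10(mse)

clean = np.random.rand(64, 64)                                 # stand-in image
stages = [clean + 0.05 * np.random.randn(64, 64) for _ in range(3)]
blended = weighted_ensemble(stages, [0.2, 0.3, 0.5])           # assumed weights
print(psnr(clean, blended))  # averaging suppresses independent noise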
Date Created
2024

QPMeL: Quantum Polar Metric Learning

Description
Deep metric learning has recently shown extremely promising results in the classical data domain, creating well-separated feature spaces. This idea was also adapted to quantum computers via Quantum Metric Learning (QMeL). QMeL consists of a two-step process: a classical model compresses the data to fit into the limited number of qubits, and a Parameterized Quantum Circuit (PQC) is then trained to create better separation in Hilbert space. However, on Noisy Intermediate-Scale Quantum (NISQ) devices, QMeL solutions result in high circuit width and depth, both of which limit scalability. The proposed Quantum Polar Metric Learning (QPMeL) uses a classical model to learn the parameters of the polar form of a qubit. A shallow PQC with Ry and Rz gates is then utilized to create the state, along with a trainable layer of ZZ(θ)-gates to learn entanglement. The circuit also computes fidelity via a SWAP test for the proposed Fidelity Triplet Loss function, used to train both classical and quantum components. When compared to QMeL approaches, QPMeL achieves 3X better multi-class separation while using only half the number of gates and depth. QPMeL is shown to outperform classical networks with similar configurations, presenting a promising avenue for future research on fully classical models with quantum loss functions.
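A minimal numerical sketch of the polar-form encoding and Fidelity Triplet Loss described above, using NumPy state vectors for a single qubit instead of an actual PQC (no ZZ(θ) entangling layer, and all angles are illustrative): a classical model would output the angles (θ, φ), and the fidelity below plays the role the SWAP test estimates on hardware.

import numpy as np

def qubit_state(theta, phi):
    # Polar-form single-qubit state: cos(θ/2)|0> + e^{iφ} sin(θ/2)|1>.
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)])

def fidelity(a, b):
    # |<a|b>|^2 -- the quantity a SWAP test estimates on hardware.
    return np.abs(np.vdot(a, b)) ** 2

def fidelity_triplet_loss(anchor, positive, negative, margin=0.5):
    # Pull the anchor toward the positive (high fidelity), push it from the negative.
    return max(0.0, margin - fidelity(anchor, positive) + fidelity(anchor, negative))

a = qubit_state(0.3, 0.1)
p = qubit_state(0.35, 0.15)  # same-class example: nearby angles
n = qubit_state(2.5, 2.0)    # different class: far away on the Bloch sphere
print(fidelity_triplet_loss(a, p, n))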
Date Created
2024

LUCI: Multi-Application Orchestration Agent

Description
Research in building agents by employing Large Language Models (LLMs) for computer control is expanding, aiming to create agents that can efficiently automate complex or repetitive computational tasks. Prior works showcased the potential of LLMs with in-context learning (ICL). However, they suffered from the limited context length and poor generalization of the underlying models, which led to poor performance on long-horizon tasks, in handling multiple applications, and in working across multiple domains. While initial work focused on extending the coding capabilities of LLMs to work with APIs to accomplish tasks, a newer body of work focused on Graphical User Interface (GUI) manipulation has shown strong success in web and mobile application automation. In this work, I introduce LUCI: Large Language Model-assisted User Control Interface, a hierarchical, modular, and efficient framework to extend the capabilities of LLMs to automate GUIs. LUCI utilizes the reasoning capabilities of LLMs to decompose tasks into sub-tasks and recursively solve them. A key innovation is the application-centric approach, which creates sub-tasks by first selecting the applications needed to solve the prompt. The GUI application is decomposed into a novel compressed Information-Action-Field (IAF) representation based on the underlying syntax tree. Furthermore, LUCI follows a modular structure, allowing it to be extended to new platforms without any additional training, as the underlying reasoning works on my IAF representations. These innovations, alongside the 'ensemble of LLMs' structure, allow LUCI to outperform previous supervised learning (SL), reinforcement learning (RL), and LLM approaches on MiniWoB++, overcoming challenges such as limited context length, exemplar memory requirements, and human intervention for task adaptability. LUCI shows a 20% improvement over the state-of-the-art (SOTA) in GUI automation on the Mind2Web benchmark. When tested in a realistic setting with over 22 commonly used applications, LUCI achieves an 80% success rate in undertaking tasks that use a subset of these applications. I also note an over 70% success rate on unseen applications, a drop of less than 5% compared to the fine-tuned applications.
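As a purely illustrative data structure, not the thesis's actual format (the field names and the toy element tree below are guesses), the Information-Action-Field idea can be sketched as compressed records extracted from a GUI's element tree:

from dataclasses import dataclass

@dataclass
class IAF:
    # One compressed record per interactable GUI element (illustrative only).
    information: str  # visible text / label, e.g. "Send"
    action: str       # supported interaction, e.g. "click", "type"
    field: str        # addressable locator, e.g. an element id or path

def compress_elements(elements):
    # elements: list of dicts from a (hypothetical) GUI/accessibility tree;
    # non-interactable nodes are dropped, which compresses the representation.
    return [IAF(e.get("text", ""), e.get("action", "click"), e["id"])
            for e in elements if e.get("interactable")]

tree = [{"id": "btn-send", "text": "Send", "action": "click", "interactable": True},
        {"id": "div-logo", "text": "", "interactable": False}]
print(compress_elements(tree))
# [IAF(information='Send', action='click', field='btn-send')]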
Date Created
2024

Joint Learning of Reward Machines and Policies for Multi-Agent Reinforcement Learning in Non-Cooperative Stochastic Games

Description
Multi-agent reinforcement learning (MARL) plays a pivotal role in artificial intelligence by facilitating the learning process in complex environments inhabited by multiple entities. This thesis explores the integration of learning high-level knowledge through reward machines (RMs) with MARL to effectively manage non-Markovian reward functions in non-cooperative stochastic games. Reward machines offer a sophisticated way to model the temporal structure of rewards, thereby providing an enhanced representation of agent decision-making processes. A novel algorithm, JIRP-SG, is introduced, enabling agents to concurrently learn RMs and optimize their best-response policies while navigating the intricate temporal dependencies present in non-cooperative settings. This approach employs automata learning to iteratively acquire RMs and utilizes the Lemke-Howson method to update the Q-functions, aiming for a Nash equilibrium. It is demonstrated that the introduced method reliably converges to accurately encode the reward functions and achieve the optimal best-response policy for each agent over time. The effectiveness of the proposed approach is validated through case studies, including a Pacman Game scenario and a Factory Assembly scenario, illustrating its superior performance compared to baseline methods. Additionally, the impact of batch size on learning performance is examined, revealing that a diligent agent employing smaller batches can surpass the performance of an agent using larger batches, which fails to summarize experiences as effectively.
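As a generic sketch of a reward machine, not JIRP-SG itself (the states, labels, and rewards below are illustrative), transitions are driven by high-level event labels and emit the non-Markovian reward:

class RewardMachine:
    # Reward machine: (state, label) -> (next state, reward).
    def __init__(self, transitions, initial_state=0):
        self.transitions = transitions
        self.state = initial_state

    def step(self, label):
        # Unlisted (state, label) pairs self-loop with zero reward.
        next_state, reward = self.transitions.get(
            (self.state, label), (self.state, 0.0))
        self.state = next_state
        return reward

# "Deliver a part, then assemble": reward only after both events, in order.
rm = RewardMachine({(0, "deliver"): (1, 0.0), (1, "assemble"): (2, 1.0)})
print(rm.step("assemble"))  # 0.0 -- assembling before delivering earns nothing
print(rm.step("deliver"))   # 0.0
print(rm.step("assemble"))  # 1.0 -- the temporal order is satisfied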
Date Created
2024