JEDAI.Ed: An Interactive Explainable AI Platform for Outreach with Robotics Programming

Description
While the growing prevalence of robots in industry and daily life necessitates knowing how to operate them safely and effectively, the steep learning curve of programming languages and formal AI education is a barrier for most beginner users. This thesis presents an interactive platform that combines a block-based programming interface with natural language instructions to teach robotics programming to novice users. An integrated robot simulator allows users to view the execution of their high-level plan, with the hierarchical low-level planning abstracted away from them. Using LLMs, the platform provides users with human-understandable explanations of their planning failures, as well as hints, to enhance the learning process. The results of a user study conducted with students having minimal programming experience show that JEDAI-Ed is successful in teaching robotic planning to users, as well as increasing their curiosity about AI in general.
Date Created
2024

An Approximate Dynamic Programming Framework for Occlusion-Robust Multi-Object Tracking

Description
In this work, the problem of multi-object tracking (MOT) is studied, particularly the challenges that arise from object occlusions. A solution based on a principled approximate dynamic programming approach, called ADPTrack, is presented. ADPTrack relies on existing MOT solutions and directly improves them. When matching tracks to objects at a particular frame, the proposed approach simulates executions of these existing solutions into future frames to obtain approximate track extensions. It then compares past and future appearance feature information to improve overall robustness to occlusion-based error. When applied to the MOT17 dataset, the proposed solution empirically demonstrates a 0.7% improvement in association accuracy (IDF1 metric) over the state-of-the-art baseline that it builds upon, while obtaining minor improvements with respect to all other metrics. Moreover, it is shown that this improvement is even more pronounced in scenarios where the camera maintains a fixed position. This implies that the proposed method is effective in addressing MOT issues pertaining to object occlusions.
Date Created
2024

Data-Efficient Paradigms for Personalized Assessment of Taskable AI Systems

Description
Recent advances in Artificial Intelligence (AI) have brought AI closer to laypeople than ever before. This leads to a pervasive problem: how would a user ascertain whether an AI system will be safe, reliable, or useful in a given situation? This problem becomes particularly challenging when it is considered that most autonomous systems are not designed by their users; the internal software of these systems may be unavailable or difficult to understand; and the functionality of these systems may even change from initial specifications as a result of learning. To overcome these challenges, this dissertation proposes a paradigm for third-party autonomous assessment of black-box taskable AI systems. The four main desiderata of such assessment systems are: (i) interpretability: generating a description of the AI system's functionality in a language that the target user can understand; (ii) correctness: ensuring that the description of the AI system's workings is accurate; (iii) generalizability: creating a solution approach that works well for different types of AI systems; and (iv) minimal requirements: creating an assessment system that does not place complex requirements on AI systems to support the third-party assessment, since otherwise the manufacturers of AI systems might not support such an assessment. To satisfy these properties, this dissertation presents algorithms and requirements that would enable user-aligned autonomous assessment that helps the user understand the limits of a black-box AI system's safe operability. This dissertation proposes a personalized AI assessment module that discovers the high-level "capabilities" of an AI system with arbitrary internal planning algorithms/policies and learns an accurate symbolic description of these capabilities in terms of concepts that a user understands. Furthermore, the dissertation includes the associated theoretical results and the empirical evaluations.
The results show that (i) a primitive query-response interface can enable the development of autonomous assessment modules that can derive a causally accurate user-interpretable model of the system's capabilities efficiently, and (ii) such descriptions are easier to understand and reason with for the users than the agent's primitive actions.
Date Created
2024

Uncertainty-Aware Neural Networks for Engineering Risk Assessment and Decision Support

Description
This dissertation contributes to uncertainty-aware neural networks using multi-modality data, with a focus on industrial and aviation applications. Drawing from seminal works in recent years that have significantly advanced the field, this dissertation develops techniques for incorporating uncertainty estimation and multi-modality information into neural networks for tasks such as fault detection and environmental perception. The escalating complexity of data in engineering contexts demands models that both predict accurately and quantify the uncertainty in their predictions. The methods proposed in this document use various techniques, including Bayesian Deep Learning, multi-task regularization and feature fusion, and efficient use of unlabeled data. Popular methods of uncertainty quantification are analyzed empirically to derive important insights on their use in real-world engineering problems. The primary objective is to develop and refine Bayesian neural network models for enhanced predictive accuracy and decision support in engineering. This involves exploring novel architectures, regularization methods, and data fusion techniques. Significant attention is given to data handling challenges in deep learning, particularly in the context of quality inspection systems. The research integrates deep learning with vision systems for engineering risk assessment and decision support tasks, and introduces two novel benchmark datasets designed for semantic segmentation and classification tasks. Additionally, the dissertation examines RGB-Depth data fusion for pipeline defect detection and the use of semi-supervised learning algorithms for manufacturing inspection tasks with imaging data. The dissertation contributes to bridging the gap between advanced statistical methods and practical engineering applications.
Date Created
2024

Advancing Precision in Medical Diagnostics using AI Expert-Guided Transformers for Enhanced Accuracy

Description
In medical diagnostics, achieving high accuracy is paramount. This motivates the refinement of AI models through expert-guided tuning, which aims to improve precision by ensuring the models adapt to complex datasets and optimize outcomes across various healthcare sectors. By incorporating expert knowledge into the fine-tuning process, these models become proficient at navigating the intricacies of medical data, resulting in more precise and dependable diagnostic predictions. As healthcare practitioners grapple with challenges presented by conditions requiring heightened sensitivity, such as cardiovascular diseases and continuous blood glucose monitoring, such refinement of Transformer models becomes indispensable. Temporal data, a common feature in medical diagnostics, is characterized by sequential observations over time and presents unique challenges for Transformer models, which must capture intricate temporal dependencies and complex patterns effectively. This study examines two pivotal healthcare scenarios: the detection of Coronary Artery Disease (CAD) using Stress ECGs and the identification of psychological stress using Continuous Glucose Monitoring (CGM) data. The CAD dataset was obtained from the Mayo Clinic Integrated Stress Center (MISC) database, which encompassed 100,000 Exercise Stress ECG signals (n=1200) sourced from multiple Mayo Clinic facilities. For the CGM scenario, expert knowledge was used to generate synthetic data using the Bergman minimal model, which was then fed to the transformers for classification. In the CAD example, the approach yielded a 28% improvement in Positive Predictive Value (PPV) over the current state of the art, reaching 91.2%. This improvement demonstrates the efficacy of the approach in enhancing diagnostic accuracy and underscores the impact of expert-guided fine-tuning in medical diagnostics.
Date Created
2024

Value and Policy Approximation for Two-player General-sum Differential Games

Description
Human-robot interactions can often be formulated as general-sum differential games where the equilibrial policies are governed by Hamilton-Jacobi-Isaacs (HJI) equations. Solving HJI PDEs faces the curse of dimensionality (CoD). While physics-informed neural networks (PINNs) alleviate CoD in solving PDEs with smooth solutions, they fall short in learning discontinuous solutions due to their sampling nature. This causes PINNs to have poor safety performance when they are applied to approximate values that are discontinuous due to state constraints. This dissertation aims to improve the safety performance of PINN-based value and policy models. The first contribution of the dissertation is to develop learning methods to approximate discontinuous values. Specifically, three solutions are developed: (1) hybrid learning uses both supervisory and PDE losses, (2) value-hardening solves HJIs with an increasing Lipschitz constant on the constraint violation penalty, and (3) the epigraphical technique lifts the value to a higher-dimensional state space where it becomes continuous. Evaluations through 5D and 9D vehicle and 13D drone simulations reveal that the hybrid method outperforms the others in terms of generalization and safety performance. The second contribution is a learning-theoretical analysis of PINNs for value and policy approximation. Specifically, by extending the neural tangent kernel (NTK) framework, this dissertation explores why the choice of activation function significantly affects PINN generalization performance, and why the inclusion of supervisory costate data improves the safety performance. The last contribution is a series of extensions of the hybrid PINN method to address real-time parameter estimation problems in incomplete-information games. Specifically, a Pontryagin-mode PINN is developed to avoid costly computation for supervisory data.
The key idea is the introduction of a costate loss, which is cheap to compute yet effectively enables the learning of important value changes and policies in space-time. Building upon this, a Pontryagin-mode neural operator is developed to achieve state-of-the-art (SOTA) safety performance across a set of differential games with parametric state constraints. This dissertation demonstrates the utility of the resultant neural operator in estimating player constraint parameters during incomplete-information games.
Date Created
2024

Autonomously Learning World-Model Representations For Efficient Robot Planning

Description
In today's world, robotic technology has become increasingly prevalent across various fields such as manufacturing, warehouses, delivery, and household applications. Planning is crucial for robots to solve various tasks in such difficult domains. However, most robots rely heavily on humans for world models that enable planning. Consequently, not only is it expensive to create such world models, as doing so requires human experts who understand the domain as well as robot limitations, but these models may also be biased by human embodiment, which can be limiting for robots whose kinematics are not human-like. This thesis answers the fundamental question: Can we learn such world models automatically? This research shows that we can learn complex world models directly from unannotated and unlabeled demonstrations containing only the configurations of the robot and the objects in the environment. The core contributions of this thesis are the first known approaches for i) task and motion planning that explicitly handle stochasticity, ii) automatically inventing neuro-symbolic state and action abstractions for deterministic and stochastic motion planning, and iii) automatically inventing relational and interpretable world models in the form of symbolic predicates and actions. This thesis also presents thorough and rigorous empirical experimentation. With experiments in both simulated and real-world settings, this thesis demonstrates the efficacy and robustness of automatically learned world models in overcoming challenges and generalizing beyond situations encountered during training.
Date Created
2024

Applications of Conditional Abstractions for Sample Efficient And Scalable Reinforcement Learning

Description
Reinforcement Learning (RL) presents a diverse and expansive collection of approaches that enable systems to learn and adapt through interaction with their environments. However, the widespread deployment of RL in real-world applications is hindered by challenges related to sample efficiency and the interpretability of decision-making processes. This thesis addresses these critical challenges, which are pivotal for advancing RL applications in complex, real-world scenarios. This work first presents a novel approach for learning dynamic abstract representations for continuous or parameterized state and action spaces. Empirical evaluations show that the proposed approach achieves higher sample efficiency and beats state-of-the-art Deep-RL methods. Second, it presents a new approach, HOPL, for Transfer Reinforcement Learning for Stochastic Shortest Path (SSP) problems in factored domains with unknown transition functions. This approach continually learns transferable, generalizable knowledge in the form of symbolically represented options and integrates search techniques with RL to solve new problems by efficiently composing the learned options. The empirical results show that the approach achieves superior sample efficiency compared to SOTA methods for transferring learned knowledge.
Date Created
2024

eTraM: Event-based Traffic Monitoring for Resource-Efficient Detection and Tracking Across Varied Lighting Conditions

Description
Traffic monitoring plays a crucial role in urban planning, transportation management, and road safety initiatives. However, existing monitoring systems often struggle to balance the need for high-resolution data acquisition and resource efficiency. This study proposes an approach that leverages neuromorphic sensor technology to enhance traffic monitoring efficiency while still exhibiting robust performance under difficult conditions. Neuromorphic cameras, also called event-based cameras, have found applications in various fields thanks to their high temporal resolution, high dynamic range, and minimal memory usage. Despite this potential, their use in static traffic monitoring is largely unexplored. This study introduces eTraM, the first-of-its-kind fully event-based traffic monitoring dataset, to address this gap in existing research. eTraM offers 10 hours of data from diverse traffic scenarios under varying lighting and weather conditions, providing a comprehensive overview of real-world situations. With 2M bounding box annotations, it covers eight distinct classes of traffic participants, ranging from vehicles to pedestrians and micro-mobility. eTraM's utility has been assessed using state-of-the-art methods, including RVT, RED, and YOLOv8. A quantitative evaluation of the ability of event-based models to generalize to nighttime and unseen scenes further substantiates the potential of event cameras for traffic monitoring, opening new avenues for research and application.
Date Created
2024