Ben Amor, Hani

Autonomous System Control of Multiple Robotic Arms Collaboration via Machine Learning

Description

Multiple robotic arms collaboration is to control multiple robotic arms to collaborate with each other to work on the same task. During the collaboration, theagent is required to avoid all possible collisions between each part of the robotic arms. Thus, incentivizing collaboration and preventing collisions are the two principles which are followed by the agent during the training process. Nowadays, more and more applications, both in industry and daily lives, require at least two arms, instead of requiring only a single arm. A dual-arm robot satisfies much more needs of different types of tasks, such as folding clothes at home, making a hamburger in a grill or picking and placing a product in a warehouse. The applications done in this paper are all about object pushing. This thesis focuses on how to train the agent to learn pushing an object away as far as possible. Reinforcement Learning (RL), which is a type of Machine Learning (ML), is then utilized in this paper to train the agent to generate optimal actions. Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER) are the two RL methods used in this thesis.

Date Created

2023

Agent

Author (aut): Lin, Steve
Thesis advisor (ths): Ben Amor, Hani
Committee member: Redkar, Sangram
Committee member: Zhang, Yu
Publisher (pbl): Arizona State University

Dynamic Potential Fields for Flexible Behavior-based Swarm Control via Reinforcement Learning

Description

In this thesis work, a novel learning approach to solving the problem of controllinga quadcopter (drone) swarm is explored. To deal with large sizes, swarm control is often achieved in a distributed fashion by combining different behaviors such that each behavior implements some desired swarm characteristics, such as avoiding ob- stacles and staying close to neighbors. One common approach in distributed swarm control uses potential fields. A limitation of this approach is that the potential fields often depend statically on a set of control parameters that are manually specified a priori. This paper introduces Dynamic Potential Fields for flexible swarm control. These potential fields are modulated by a set of dynamic control parameters (DCPs) that can change under different environment situations. Since the focus is only on these DCPs, it simplifies the learning problem and makes it feasible for practical use. This approach uses soft actor critic (SAC) where the actor only determines how to modify DCPs in the current situation, resulting in more flexible swarm control. In the results, this work will show that the DCP approach allows for the drones to bet- ter traverse environments with obstacles compared to several state-of-the-art swarm control methods with a fixed set of control parameters. This approach also obtained a higher safety score commonly used to assess swarm behavior. A basic reinforce- ment learning approach is compared to demonstrate faster convergence. Finally, an ablation study is conducted to validate the design of this approach.

Date Created

2022

Agent

Author (aut): Ferraro, Calvin Shores
Thesis advisor (ths): Zhang, Yu
Committee member: Ben Amor, Hani
Committee member: Berman, Spring
Publisher (pbl): Arizona State University

Traffic Accident Reconstruction Using Monocular Dashcam Videos

Description

Automated driving systems (ADS) have come a long way since their inception. It is clear that these systems rely heavily on stochastic deep learning techniques for perception, planning, and prediction, as it is impossible to construct every possible driving scenario to generate driving policies. Moreover, these systems need to be trained and validated extensively on typical and abnormal driving situations before they can be trusted with human life. However, most publicly available driving datasets only consist of typical driving behaviors. On the other hand, there is a plethora of videos available on the internet that capture abnormal driving scenarios, but they are unusable for ADS training or testing as they lack important information such as camera calibration parameters, and annotated vehicle trajectories. This thesis proposes a new toolbox, DeepCrashTest-V2, that is capable of reconstructing high-quality simulations from monocular dashcam videos found on the internet. The toolbox not only estimates the crucial parameters such as camera calibration, ego-motion, and surrounding road user trajectories but also creates a virtual world in Car Learning to Act (CARLA) using data from OpenStreetMaps to simulate the estimated trajectories. The toolbox is open-source and is made available in the form of a python package on GitHub at https://github.com/C-Aniruddh/deepcrashtest_v2.

Date Created

2022

Agent

Author (aut): Chandratre, Aniruddh Vinay
Thesis advisor (ths): Fainekos, Georgios
Thesis advisor (ths): Ben Amor, Hani
Committee member: Pedrielli, Giulia
Publisher (pbl): Arizona State University

Generative Models for Trajectory Prediction

Description

Trajectory forecasting is used in many fields such as vehicle future trajectory prediction, stock market price prediction, human motion prediction and so on. Also, robots having the capability to reason about human behavior is an important aspect in human robot interaction. In trajectory prediction with regards to human motion prediction, implicit learning and reproduction of human behavior is the major challenge. This work tries to compare some of the recent advances taking a phenomenological approach to trajectory prediction. \par The work is expected to mainly target on generating future events or trajectories based on the previous data observed across many time intervals. In particular, this work presents and compares machine learning models to generate various human handwriting trajectories. Although the behavior of every individual is unique, it is still possible to broadly generalize and learn the underlying human behavior from the current observations to predict future human writing trajectories. This enables the machine or the robot to generate future handwriting trajectories given an initial trajectory from the individual thus helping the person to fill up the rest of the letter or curve. This work tests and compares the performance of Conditional Variational Autoencoders and Sinusoidal Representation Network models on handwriting trajectory prediction and reconstruction.

Date Created

2021

Agent

Author (aut): Kota, Venkata Anil
Thesis advisor (ths): Ben Amor, Hani
Committee member: Venkateswara, Hemanth Kumar Demakethepalli
Committee member: Redkar, Sangram
Publisher (pbl): Arizona State University

Robotic Swarm Control using Deep Reinforcement Learning Strategies based on Mean-Field Models

Description

As technological advancements in silicon, sensors, and actuation continue, the development of robotic swarms is shifting from the domain of science fiction to reality. Many swarm applications, such as environmental monitoring, precision agriculture, disaster response, and lunar prospecting, will require controlling numerous robots with limited capabilities and information to redistribute among multiple states, such as spatial locations or tasks. A scalable control approach is to program the robots with stochastic control policies such that the robot population in each state evolves according to a mean-field model, which is independent of the number and identities of the robots. Using this model, the control policies can be designed to stabilize the swarm to the target distribution. To avoid the need to reprogram the robots for different target distributions, the robot control policies can be defined to depend only on the presence of a “leader” agent, whose control policy is designed to guide the swarm to a particular distribution. This dissertation presents a novel deep reinforcement learning (deep RL) approach to designing control policies that redistribute a swarm as quickly as possible over a strongly connected graph, according to a mean-field model in the form of the discrete-time Kolmogorov forward equation. In the leader-based strategies, the leader determines its next action based on its observations of robot populations and shepherds the swarm over the graph by probabilistically repelling nearby robots. The scalability of this approach with the swarm size is demonstrated with leader control policies that are designed using two tabular Temporal-Difference learning algorithms, trained on a discretization of the swarm distribution. To improve the scalability of the approach with robot population and graph size, control policies for both leader-based and leaderless strategies are designed using an actor-critic deep RL method that is trained on the swarm distribution predicted by the mean-field model. In the leaderless strategy, the robots’ control policies depend only on their local measurements of nearby robot populations. The control approaches are validated for different graph and swarm sizes in numerical simulations, 3D robot simulations, and experiments on a multi-robot testbed.

Date Created

2021

Agent

Author (aut): Kakish, Zahi Mousa
Thesis advisor (ths): Berman, Spring
Committee member: Yong, Sze Zheng
Committee member: Marvi, Hamid
Committee member: Pavlic, Theodore
Committee member: Pratt, Stephen
Committee member: Ben Amor, Hani
Publisher (pbl): Arizona State University

Evaluation of Machine Learning Algorithms for Modeling Therapist Assistance during Gait Rehabilitation

Description

Robotic assisted devices in gait rehabilitation have not seen penetration into clinical settings proportionate to the developments in this field. A possible reason for this is due to the development and evaluation of these devices from a predominantly engineering perspective. One way to mitigate this effect is to further include the principles of neurophysiology into the development of these systems. To further include these principles, this research proposes a method for grounded evaluation of three machine learning algorithms to gain insight on what modeling approaches are able to both replicate therapist assistance and emulate therapist strategies. The algorithms evaluated in this paper include ordinary least squares regression (OLS), gaussian process regression (GPR) and inverse reinforcement learning (IRL). The results show that grounded evaluation is able to provide evidence to support the algorithms at a higher resolution. Also, it was observed that GPR is likely the most accurate algorithm to replicate therapist assistance and to emulate therapist adaptation strategies.

Date Created

2021

Agent

Author (aut): Smith, Mason Owen
Thesis advisor (ths): Zhang, Wenlong
Committee member: Ben Amor, Hani
Committee member: Sugar, Thomas
Publisher (pbl): Arizona State University

Differentiable Harvard Machine Architecture with Neural Network Controller

Description

There have been multiple attempts of coupling neural networks with external memory components for sequence learning problems. Such architectures have demonstrated success in algorithmic, sequence transduction, question-answering and reinforcement learning tasks. Most notable of these attempts is the Neural Turing Machine (NTM), which is an implementation of the Turing Machine with a neural network controller that interacts with a continuous memory. Although the architecture is Turing complete and hence, universally computational, it has seen limited success with complex real-world tasks.

In this thesis, I introduce an extension of the Neural Turing Machine, the Neural Harvard Machine, that implements a fully differentiable Harvard Machine framework with a feed-forward neural network controller. Unlike the NTM, it has two different memories - a read-only program memory and a read-write data memory. A sufficiently complex task is divided into smaller, simpler sub-tasks and the program memory stores parameters of pre-trained networks trained on these sub-tasks. The controller reads inputs from an input-tape, uses the data memory to store valuable signals and writes correct symbols to an output tape. The output symbols are a function of the outputs of each sub-network and the state of the data memory. Hence, the controller learns to load the weights of the appropriate program network to generate output symbols.

A wide range of experiments demonstrate that the Harvard Machine framework learns faster and performs better than the NTM and RNNs like LSTM, as the complexity of tasks increases.

Date Created

2020

Agent

Author (aut): Bhatt, Manthan Bharat
Thesis advisor (ths): Ben Amor, Hani
Committee member: Zhang, Yu
Committee member: Yang, Yezhou
Publisher (pbl): Arizona State University

Neural Network Architecture with External Memory and Domain-aware Weight Switching Mechanism

Description

Humans have an excellent ability to analyze and process information from multiple domains. They also possess the ability to apply the same decision-making process when the situation is familiar with their previous experience.

Inspired by human's ability to remember past experiences and apply the same when a similar situation occurs, the research community has attempted to augment memory with Neural Network to store the previously learned information. Together with this, the community has also developed mechanisms to perform domain-specific weight switching to handle multiple domains using a single model. Notably, the two research fields work independently, and the goal of this dissertation is to combine their capabilities.

This dissertation introduces a Neural Network module augmented with two external memories, one allowing the network to read and write the information and another to perform domain-specific weight switching. Two learning tasks are proposed in this work to investigate the model performance - solving mathematics operations sequence and action based on color sequence identification. A wide range of experiments with these two tasks verify the model's learning capabilities.

Date Created

2020

Agent

Author (aut): Patel, Deep Chittranjan
Thesis advisor (ths): Ben Amor, Hani
Committee member: Banerjee, Ayan
Committee member: McDaniel, Troy
Publisher (pbl): Arizona State University

Sequencing Behavior in an Intelligent Pro-active Co-Driver System

Description

Driving is the coordinated operation of mind and body for movement of a vehicle, such as a car, or a bus. Driving, being considered an everyday activity for many people, still has an issue of safety. Driver distraction is becoming a critical safety problem. Speed, drunk driving as well as distracted driving are the three leading factors in the fatal car crashes. Distraction, which is defined as an excessive workload and limited attention, is the main paradigm that guides this research area. Driver behavior analysis can be used to address the distraction problem and provide an intelligent adaptive agent to work closely with the driver, fay beyond traditional algorithmic computational models. A variety of machine learning approaches has been proposed to estimate or predict drivers’ fatigue level using car data, driver status or a combination of them.

Three important features of intelligence and cognition are perception, attention and sensory memory. In this thesis, I focused on memory and attention as essential parts of highly intelligent systems. Without memory, systems will only show limited intelligence since their response would be exclusively based on spontaneous decision without considering the effect of previous events. I proposed a memory-based sequence to predict the driver behavior and distraction level using neural network. The work started with a large-scale experiment to collect data and make an artificial intelligence-friendly dataset. After that, the data was used to train a deep neural network to estimate the driver behavior. With a focus on memory by using Long Short Term Memory (LSTM) network to increase the level of intelligence in two dimensions: Forgiveness of minor glitches, and accumulation of anomalous behavior., I reduced the model error and computational expense by adding attention mechanism on the top of LSTM models. This system can be generalized to build and train highly intelligent agents in other domains.

Date Created

2020

Agent

Author (aut): Monjezi Kouchak, Shokoufeh
Thesis advisor (ths): Gaffar, Ashraf
Committee member: Doupe, Adam
Committee member: Ben Amor, Hani
Committee member: Cheeks, Loretta
Publisher (pbl): Arizona State University

Sample-Efficient Reinforcement Learning of Robot Control Policies in the Real World

Description

The goal of reinforcement learning is to enable systems to autonomously solve tasks in the real world, even in the absence of prior data. To succeed in such situations, reinforcement learning algorithms collect new experience through interactions with the environment to further the learning process. The behaviour is optimized by maximizing a reward function, which assigns high numerical values to desired behaviours. Especially in robotics, such interactions with the environment are expensive in terms of the required execution time, human involvement, and mechanical degradation of the system itself. Therefore, this thesis aims to introduce sample-efficient reinforcement learning methods which are applicable to real-world settings and control tasks such as bimanual manipulation and locomotion. Sample efficiency is achieved through directed exploration, either by using dimensionality reduction or trajectory optimization methods. Finally, it is demonstrated how data-efficient reinforcement learning methods can be used to optimize the behaviour and morphology of robots at the same time.

Date Created

2019

Agent

Author (aut): Luck, Kevin Sebastian
Thesis advisor (ths): Ben Amor, Hani
Committee member: Aukes, Daniel
Committee member: Fainekos, Georgios
Committee member: Scholz, Jonathan
Committee member: Yang, Yezhou
Publisher (pbl): Arizona State University

Subscribe to Ben Amor, Hani