Imitation Learning on Bimanual Robots

Description
Bimanual robot manipulation, the coordinated control of two robot arms, holds great promise for enhancing the dexterity and efficiency of robotic systems across a wide range of applications, from manufacturing and healthcare to household chores and logistics. However, enabling robots to perform complex bimanual tasks with the same level of skill and adaptability as humans remains a challenging problem. A bimanual robot can be controlled with methods such as inverse dynamics controllers or reinforcement learning, but each of these methods has its own problems. An inverse dynamics controller cannot adapt to a changing environment, whereas reinforcement learning is computationally intensive and may require weeks of training for even simple tasks; formulating rewards for reinforcement learning is also often challenging and remains an open research topic. Imitation learning leverages human demonstrations to enable robots to acquire the skills necessary for complex tasks; it can be highly sample-efficient and reduces the need for exploration. Given these advantages, this thesis explores the application of imitation learning techniques to bridge the gap between human expertise and robotic dexterity in bimanual manipulation. Specifically, it examines the Implicit Behavioral Cloning imitation learning algorithm. Implicit behavioral cloning aims to capture the underlying behavior, or policy, of the expert using energy-based models, which frequently outperform explicit behavioral cloning policies. The assessment investigates how the quality of expert demonstrations affects the efficacy of the learned policies. Furthermore, computational and performance metrics of various training and inference techniques for energy-based models are compared.
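To make the idea of an implicit, energy-based policy concrete, the following is a minimal sketch and not the method used in the thesis. It assumes a hypothetical one-dimensional toy energy function (`toy_energy`), an InfoNCE-style contrastive loss for training, and a simple sample-and-refine search for inference. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def info_nce_loss(energy, obs, expert_action, negatives):
    """Negative log-probability of the expert action among the expert
    action and a set of sampled counter-example actions."""
    energies = [energy(obs, expert_action)] + [energy(obs, a) for a in negatives]
    # Numerically stable log-sum-exp over the negated energies.
    shift = max(-e for e in energies)
    log_z = shift + math.log(sum(math.exp(-e - shift) for e in energies))
    return energies[0] + log_z


def infer_action(energy, obs, low, high, n_samples=256, n_iters=4, seed=0):
    """Derivative-free inference: approximate argmin_a E(obs, a) by
    uniform sampling followed by Gaussian resampling around the best action."""
    rng = random.Random(seed)
    best = min((rng.uniform(low, high) for _ in range(n_samples)),
               key=lambda a: energy(obs, a))
    sigma = 0.1 * (high - low)
    for _ in range(n_iters):
        # Keep the current best so the result never gets worse.
        candidates = [best] + [
            min(high, max(low, rng.gauss(best, sigma))) for _ in range(64)
        ]
        best = min(candidates, key=lambda a: energy(obs, a))
        sigma *= 0.5  # narrow the search around the current best
    return best


def toy_energy(obs, action):
    """Hypothetical stand-in for a learned network: lowest energy at action = 2 * obs."""
    return (action - 2.0 * obs) ** 2
```

In a real system the energy function would be a neural network trained by minimizing this kind of loss over demonstration data, whereas an explicit policy would instead map observations directly to actions.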
Date Created
2023

Comparison of Evolutionary Strategies and Reinforcement Learning Algorithms on Custom and Non-Conventional Environment

Description
Reinforcement Learning (RL) algorithms have made a remarkable contribution to the field of robotics and to training human-like agents. Evolutionary Algorithms (EA), on the other hand, are not well explored or widely promoted for use in robotics, although they have excellent potential to perform well. In this thesis, various RL algorithms such as Q-learning and Deep Deterministic Policy Gradient (DDPG), and Evolutionary Algorithms such as the Harmony Search Algorithm (HSA), are tested on a customized Penalty Kick Robot environment. The experiments are conducted with both discrete and continuous action spaces for a penalty kick agent. The main goal is to identify which algorithm suits which scenario best. Furthermore, a goalkeeper agent is introduced to block the ball from reaching the goal post using a multi-agent learning algorithm.
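As an illustration of how an evolutionary method can search a continuous action space, here is a minimal sketch of the basic Harmony Search procedure. It is not the thesis's implementation: the cost function `kick_miss`, its target values, the parameter bounds, and all hyperparameter defaults are hypothetical choices made only for this example.

```python
import random


def harmony_search(cost, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.05,
                   iters=2000, seed=0):
    """Minimize `cost` over box-constrained parameters.

    hms: harmony memory size, hmcr: memory consideration rate,
    par: pitch adjustment rate, bw: pitch adjustment bandwidth.
    """
    rng = random.Random(seed)
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    costs = [cost(h) for h in memory]
    for _ in range(iters):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Reuse a value from memory, optionally nudging it slightly.
                value = rng.choice(memory)[d]
                if rng.random() < par:
                    value += rng.uniform(-bw, bw)
            else:
                # Otherwise draw a fresh random value within the bounds.
                value = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, value)))
        new_cost = cost(new)
        worst = max(range(hms), key=costs.__getitem__)
        if new_cost < costs[worst]:
            # Replace the worst stored harmony with the improved one.
            memory[worst], costs[worst] = new, new_cost
    best = min(range(hms), key=costs.__getitem__)
    return memory[best], costs[best]


def kick_miss(params):
    """Hypothetical cost: squared distance of (angle, power) from an ideal kick."""
    angle, power = params
    return (angle - 0.3) ** 2 + (power - 0.8) ** 2
```

A call such as `harmony_search(kick_miss, [(-1.0, 1.0), (0.0, 1.0)])` returns the best parameter vector found and its cost. Unlike Q-learning or DDPG, this search needs only a scalar score for each candidate, with no gradients or value function.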
Date Created
2021

DeepCrashTest: Translating Dashcam Videos to Virtual Tests for Automated Driving Systems

Description
Autonomous vehicle technology has come a long way, but currently no company is able to offer fully autonomous rides in any conditions, on any road, without human supervision. These systems must be extensively trained and validated to guarantee safe human transportation. Even small errors in system functionality may lead to fatal accidents and endanger human lives. Deep learning methods are widely used for environment perception and prediction of hazardous situations. These techniques require a huge amount of training data with both normal and abnormal samples to enable the vehicle to avoid a dangerous situation.

The goal of this thesis is to generate simulations from challenging real-world collision scenarios for training and testing autonomous vehicles. Dashcam crash videos from the internet can now be utilized to extract valuable collision data and recreate the crash scenarios in a simulator. The problem of extracting 3D vehicle trajectories from videos recorded by an unknown monocular camera source is solved using a modular approach. The framework is divided into two stages: (a) extracting meaningful adversarial trajectories from short crash videos, and (b) developing methods to automatically process and simulate the vehicle trajectories in a vehicle simulator.
Date Created
2019

Low to High Dimensional Modality Reconstruction Using Aggregated Fields of View

Description
Autonomous systems deployed in the real world today deal with a slew of different data modalities to perform effectively in tasks ranging from navigation in complex maneuverable robots to identity verification in simpler static systems. The performance of such a system depends heavily on a continuous supply of data from all modalities. These systems can face drastically increased risk when one or more modalities are lost due to an adverse scenario such as a hardware malfunction or hostile environmental conditions. This thesis investigates modality hallucination and its efficacy in mitigating the risks posed to the autonomous system. Modality hallucination is proposed as one effective way to ensure consistent modality availability, thereby reducing unfavorable consequences. While there has been significant research effort in high-to-low dimensional modality hallucination, such as RGB to depth, there has been considerably less interest in the other direction (low-to-high dimensional modality prediction). This thesis demonstrates the effectiveness of low-to-high modality hallucination in reducing the uncertainty in the affected system while also ensuring that the method remains task agnostic.

A deep neural network based encoder-decoder architecture is presented, with evidence of its efficacy, that aggregates multiple fields of view in its encoder blocks to recover the lost information of the affected modality from the extant modality. The hallucination process is implemented by capturing a non-linear mapping between the data modalities, and the learned mapping is used to aid the extant modality in mitigating the risk posed to the system in adverse scenarios that involve modality loss. The results are compared with a well-known generative model built for the task of image translation, as well as an off-the-shelf semantic segmentation architecture repurposed for hallucination. To validate the practicality of the hallucinated modality, extensive classification and segmentation experiments are conducted on the University of Washington's depth image database (UWRGBD) and the New York University database (NYUD); they demonstrate that hallucination indeed lessens the negative effects of modality loss.
Date Created
2019

Representation, Exploration, and Recommendation of Music Playlists

Description
Playlists have become a significant part of the music listening experience today because of digital cloud-based services such as Spotify, Pandora, and Apple Music. Owing to the meteoric rise in playlist usage, recommending playlists is crucial to music services today. Although much work has been done on playlist prediction, playlist representation has not received the same level of attention. Over the last few years, sequence-to-sequence models, especially in natural language processing, have shown the effectiveness of learned embeddings in capturing the semantic characteristics of sequences. Similar concepts can be applied to music to learn fixed-length representations for playlists, and the learned representations can then be used for downstream tasks such as playlist comparison and recommendation.

In this thesis, the problem of learning a fixed-length representation is formulated in an unsupervised manner, using Neural Machine Translation (NMT), where playlists are interpreted as sentences and songs as words. This approach is compared with other encoding architectures and evaluated using the suite of tasks commonly used for evaluating sentence embeddings, along with a few additional tasks pertaining to music. The aim of the evaluation is to study the traits captured by the playlist embeddings so that they can be leveraged for music recommendation. This work lays the foundation for analyzing music playlists and learning the patterns that exist in them in an end-to-end manner. The thesis concludes with a discussion of future directions for this research and its potential impact in the domain of Music Information Retrieval.
Date Created
2019

Bi-manual learning for a basketball playing robot

Description
Sports activities have been a cornerstone in the evolution of humankind through the ages, from the ancient Roman Empire to the Olympics in the 21st century. These activities have been used as a benchmark to evaluate how humans have progressed over time. In the 21st century, machines, aided by powerful computing and relatively new computing paradigms, have made a good case for taking up the mantle. Even though machines can perform complex tasks and maneuvers, they have struggled to match the dexterity, coordination, manipulability, and acuteness displayed by humans. Bi-manual tasks are more complex and bring additional variables such as coordination into the task, making it harder to evaluate.

A task capable of demonstrating the above skill set would be a good measure of progress in the field of robotics. Therefore, a dual-armed robot has been built and taught to handle the ball and make the basket successfully, thus demonstrating the capability of using both arms. A combination of machine learning techniques, reinforcement learning, and imitation learning has been used together with advanced optimization algorithms to accomplish the task.
Date Created
2016