Imitation Learning on Bimanual Robots

Description
Bimanual robot manipulation, the coordinated control of two robot arms, holds great promise for enhancing the dexterity and efficiency of robotic systems across a wide range of applications, from manufacturing and healthcare to household chores and logistics. However, enabling robots to perform complex bimanual tasks with the same level of skill and adaptability as humans remains a challenging problem. A bimanual robot can be controlled through various methods, such as inverse dynamics controllers or reinforcement learning, but each of these methods has its own problems. An inverse dynamics controller cannot adapt to a changing environment, whereas reinforcement learning is computationally intensive and may require weeks of training for even simple tasks; moreover, reward formulation for reinforcement learning is often challenging and remains an open research topic. Imitation learning leverages human demonstrations to enable robots to acquire the skills necessary for complex tasks; it can be highly sample-efficient and reduces the need for exploration. Given these advantages, this thesis explores the application of imitation learning techniques to bridge the gap between human expertise and robotic dexterity in the context of bimanual manipulation. Specifically, it examines the Implicit Behavioral Cloning imitation learning algorithm. Implicit behavioral cloning aims to capture the underlying behavior or policy of the expert by using energy-based models, which frequently outperform explicit behavioral cloning policies. The assessment investigates how the quality of expert demonstrations affects the efficacy of the learned policies. Furthermore, computational and performance metrics of various training and inference techniques for energy-based models are compared.
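As a rough illustration of the energy-based idea mentioned in this abstract, the sketch below picks an action by minimizing an energy function over sampled candidates. It is not the thesis's implementation: `toy_energy` is a hypothetical stand-in for a learned energy network, and uniform sampling is only one simple derivative-free inference scheme, assumed here for brevity.

```python
import random


def toy_energy(observation, action):
    # Hypothetical stand-in for a learned energy model E(o, a).
    # Low energy means the action is a good match for the observation.
    return (action - 2.0 * observation) ** 2


def infer_action(energy_fn, observation, low, high, num_samples=1024, seed=0):
    # Derivative-free inference: draw candidate actions uniformly from
    # [low, high] and return the candidate with the lowest energy.
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(num_samples)]
    return min(candidates, key=lambda a: energy_fn(observation, a))


if __name__ == "__main__":
    action = infer_action(toy_energy, observation=0.5, low=-2.0, high=2.0)
    print(f"selected action: {action:.3f}")
```

An explicit policy would instead map the observation directly to an action; the implicit form defers that choice to a search over the energy landscape at inference time.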
Date Created
2023

TIPANGLE: A Machine Learning Approach for Accurate Spatial Pan and Tilt Angle Determination of Pan Tilt Traffic Cameras

Description
Pan Tilt Traffic Cameras (PTTC) are a vital component of traffic management systems for monitoring and surveillance. In a real-world scenario, if a vehicle is in pursuit of another vehicle or an accident at an intersection has caused traffic stoppages, accurate and reliable data from PTTC is necessary to quickly localize the cars on a map for adept emergency response, as more and more traffic systems are being automated using machine learning. However, the position (orientation) of a PTTC with respect to the environment is often unknown, as most of them lack Inertial Measurement Units or encoders. Current state-of-the-art systems (1) demand high-performance compute and use Deep Neural Networks (DNN) with a heavy carbon footprint, (2) are only applicable to scenarios with appropriate lane markings or to roundabouts, and (3) require complex mathematical computations to determine the focal length and optical center before determining the pose. A compute-light approach, "TIPANGLE", is presented in this work. The approach uses Siamese Neural Networks (SNN) together with simple mathematical functions, i.e., Euclidean distance and contrastive loss, to achieve the objective. The effectiveness of the approach is assessed through a thorough comparison study with alternative approaches and by executing the approach on an embedded system, i.e., a Raspberry Pi 3.
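The two functions named in this abstract can be sketched in a few lines. This is one common form of the contrastive loss, not necessarily the exact formulation used in TIPANGLE; the `margin` value and the plain-list embeddings are assumptions made for illustration.

```python
import math


def euclidean_distance(a, b):
    # Straight-line distance between two equal-length embedding vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    # Similar pairs are penalized by their squared distance.
    # Dissimilar pairs are penalized only when closer than the margin.
    d = euclidean_distance(emb_a, emb_b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


if __name__ == "__main__":
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same=True))
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same=False))
```

In a Siamese setup, both embeddings would come from the same network applied to two input images, so minimizing this loss pulls matching pairs together and pushes non-matching pairs at least `margin` apart.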
Date Created
2023

Using Language Models to Generate Text-to-SQL Training Data: An Approach to Improve Performance of a Text-to-SQL Parser

Description
Code generation is a task that has seen rapid progress in Natural Language Processing (NLP) research. This thesis focuses on the text-to-Structured Query Language (SQL) task, where the input is a question about a specific database and the output is the SQL that, when executed, returns the desired answer. The data creation process is a bottleneck for current text-to-SQL datasets. The technical knowledge required to understand and create SQL makes crowd-sourcing a dataset expensive and time-consuming. Thus, existing datasets do not provide a robust enough training set for state-of-the-art semantic parsing models. This thesis outlines my technique for generating a text-to-SQL dataset using the Generative Pretrained Transformer 3 model (GPT-3) and prompt engineering techniques. My approach entails providing GPT-3 with particular instructions to build a rigorous text-to-SQL dataset. In this paper, I show that the created pairs have excellent quality and diversity, and when used as training data, they can enhance the accuracy of SQL generation models. I expect that my method will be of interest to researchers in NLP because it can considerably reduce the time, effort, and cost necessary to produce large, high-quality text-to-SQL datasets. Furthermore, my approach can be extended to other tasks and domains to alleviate the burden of curating human-annotated data.
Date Created
2023

Event Detection as Multi-Task Text Generation

Description
Event detection refers to the task of identifying event occurrences in a given natural language text. Event detection comprises two subtasks: recognizing event mentions (event identification) and the type of event (event classification). Breaking from the sequence labeling and word classification approaches, this work models event detection, and its constituent subtasks of trigger identification and trigger classification, as independent sequence generation tasks. This work proposes a prompted multi-task generative model trained on event identification, classification, and combined event detection. The model is evaluated on general-domain and biomedical-domain event detection datasets, achieving state-of-the-art results on the general-domain Roles Across Multiple Sentences (RAMS) dataset, establishing event detection benchmark performance on WikiEvents, and achieving competitive performance on the general-domain Massive Event Detection (MAVEN) dataset and the biomedical-domain Multi-Level Event Extraction (MLEE) dataset.
Date Created
2022

Inside the Box: Analysing Cyber-physical Systems, Exploiting Models and Specifications

Description
The safety of a system placed in an environment with humans and other machines is one of the primary concerns of practitioners deploying any cyber-physical system (CPS). Such systems, also called safety-critical systems, need to be exhaustively tested for erroneous behavior. This creates the need for algorithms that can help ascertain the behavior and safety of the system by generating tests where the system is likely to be falsified. In this work, three algorithms are presented that aim at finding falsifying behaviors in cyber-physical systems. PART-X intelligently partitions the input space while sampling it to provide probabilistic point and region estimates of falsification. PYSOAR-C and LS-EMIBO aim at finding falsifying behaviors in gray-box systems, where some information about the system is available. Specifically, PYSOAR-C aims to find falsifications while maximizing coverage using a two-phase optimization process, while LS-EMIBO aims at exploiting the structure of a requirement to find falsifications with lower computational cost than the state of the art. This work also shows the efficacy of the algorithms on a wide range of complex cyber-physical systems. The algorithms presented in this thesis are available as Python toolboxes.
Date Created
2022