Pose Estimation with Convolutional Neural Networks

132574-Thumbnail Image.png
Description
Convolutional neural networks boast a myriad of applications in artificial intelligence, but one of the most common uses for such networks is image extraction. The ability of convolutional layers to extract and combine data features for the purpose of image

Convolutional neural networks boast a myriad of applications in artificial intelligence, but one of the most common uses for such networks is image extraction. The ability of convolutional layers to extract and combine data features for the purpose of image analysis can be leveraged for pose estimation on an object - detecting the presence and attitude of corners and edges allows a convolutional neural network to identify how an object is positioned. This task can assist in working to grasp an object correctly in robotics applications, or to track an object more accurately in 3D space. However, the effectiveness of pose estimation may change based on properties of the object; the pose of a complex object, complexity being determined by internal occlusions, similar faces, etcetera, can be difficult to resolve.
This thesis is part of a collaboration between ASU’s Interactive Robotics Laboratory and NASA’s Jet Propulsion Laboratory. In this thesis, the training pipeline from Sharma’s paper “Pose Estimation for Non-Cooperative Spacecraft Rendezvous Using Convolutional Neural Networks” was modified to perform pose estimation on a complex object - specifically, a segment of a hollow truss. After initial attempts to replicate the architecture used in the paper and train solely on synthetic images, a combination of synthetic dataset generation and transfer learning on an ImageNet-pretrained AlexNet model was implemented to mitigate the difficulty of gathering large amounts of real-world data. Experimentation with pose estimation accuracy and hyperparameters of the model resulted in gradual test accuracy improvement, and future work is suggested to improve pose estimation for complex objects with some form of rotational symmetry.
Date Created
2019-05

Foundations of Human-Aware Planning -- A Tale of Three Models

157016-Thumbnail Image.png
Description
A critical challenge in the design of AI systems that operate with humans in the loop is to be able to model the intentions and capabilities of the humans, as well as their beliefs and expectations of the AI system

A critical challenge in the design of AI systems that operate with humans in the loop is to be able to model the intentions and capabilities of the humans, as well as their beliefs and expectations of the AI system itself. This allows the AI system to be "human- aware" -- i.e. the human task model enables it to envisage desired roles of the human in joint action, while the human mental model allows it to anticipate how its own actions are perceived from the point of view of the human. In my research, I explore how these concepts of human-awareness manifest themselves in the scope of planning or sequential decision making with humans in the loop. To this end, I will show (1) how the AI agent can leverage the human task model to generate symbiotic behavior; and (2) how the introduction of the human mental model in the deliberative process of the AI agent allows it to generate explanations for a plan or resort to explicable plans when explanations are not desired. The latter is in addition to traditional notions of human-aware planning which typically use the human task model alone and thus enables a new suite of capabilities of a human-aware AI agent. Finally, I will explore how the AI agent can leverage emerging mixed-reality interfaces to realize effective channels of communication with the human in the loop.
Date Created
2018
Agent

Training Robot Policies using External Memory Based Networks Via Imitation Learning

156971-Thumbnail Image.png
Description
Recent advancements in external memory based neural networks have shown promise

in solving tasks that require precise storage and retrieval of past information. Re-

searchers have applied these models to a wide range of tasks that have algorithmic

properties but have not applied

Recent advancements in external memory based neural networks have shown promise

in solving tasks that require precise storage and retrieval of past information. Re-

searchers have applied these models to a wide range of tasks that have algorithmic

properties but have not applied these models to real-world robotic tasks. In this

thesis, we present memory-augmented neural networks that synthesize robot navigation policies which a) encode long-term temporal dependencies b) make decisions in

partially observed environments and c) quantify the uncertainty inherent in the task.

We extract information about the temporal structure of a task via imitation learning

from human demonstration and evaluate the performance of the models on control

policies for a robot navigation task. Experiments are performed in partially observed

environments in both simulation and the real world
Date Created
2018
Agent

Beyond Deep Learning: Synthesizing Navigation Programs using Neural Turing Machines

133211-Thumbnail Image.png
Description
This thesis aims to improve neural control policies for self-driving cars. State-of-the-art navigation software for self-driving cars is based on deep neural networks, where the network is trained on a dataset of past driving experience in various situations. With previous

This thesis aims to improve neural control policies for self-driving cars. State-of-the-art navigation software for self-driving cars is based on deep neural networks, where the network is trained on a dataset of past driving experience in various situations. With previous methods, the car can only make decisions based on short-term memory. To address this problem, we proposed that using a Neural Turing Machine (NTM) framework adds long-term memory to the system. We evaluated this approach by using it to master a palindrome task. The network was able to infer how to create a palindrome with 100% accuracy. Since the NTM structure proves useful, we aim to use it in the given scenarios to improve the navigation safety and accuracy of a simulated autonomous car.
Date Created
2018-05
Agent

3D Printed Robotic Arm

134066-Thumbnail Image.png
Description
For those interested in the field of robotics, there are not many options to get your hands on a physical robot without paying a steep price. This is why the folks at BCN3D Technologies decided to design a fully open-source

For those interested in the field of robotics, there are not many options to get your hands on a physical robot without paying a steep price. This is why the folks at BCN3D Technologies decided to design a fully open-source 3D-printable robotic arm. Their goal was to reduce the barrier to entry for the field of robotics and make it exponentially more accessible for people around the world. For our honors thesis, we chose to take the design from BCN3D and attempt to build their robot, to see how accessible the design truly is. Although their designs were not perfect and we were forced to make some adjustments to the 3D files, overall the work put forth by the people at BCN3D was extremely useful in successfully building a robotic arm that is programmed with ease.
Date Created
2017-12
Agent

A Human-Robot Interaction Perspective on Assistive and Rehabilitation Robotics

128351-Thumbnail Image.png
Description

Assistive and rehabilitation devices are a promising and challenging field of recent robotics research. Motivated by societal needs such as aging populations, such devices can support motor functionality and subject training. The design, control, sensing, and assessment of the devices

Assistive and rehabilitation devices are a promising and challenging field of recent robotics research. Motivated by societal needs such as aging populations, such devices can support motor functionality and subject training. The design, control, sensing, and assessment of the devices become more sophisticated due to a human in the loop. This paper gives a human-robot interaction perspective on current issues and opportunities in the field. On the topic of control and machine learning, approaches that support but do not distract subjects are reviewed. Options to provide sensory user feedback that are currently missing from robotic devices are outlined. Parallels between device acceptance and affective computing are made. Furthermore, requirements for functional assessment protocols that relate to real-world tasks are discussed. In all topic areas, the design of human-oriented frameworks and methods is dominated by challenges related to the close interaction between the human and robotic device. This paper discusses the aforementioned aspects in order to open up new perspectives for future robotic solutions.

Date Created
2017-05-23
Agent

Programmable Insight: A Computational Methodology to Explore Online News Use of Frames

155511-Thumbnail Image.png
Description
The Internet is a major source of online news content. Online news is a form of large-scale narrative text with rich, complex contents that embed deep meanings (facts, strategic communication frames, and biases) for shaping and transitioning standards, values, attitudes,

The Internet is a major source of online news content. Online news is a form of large-scale narrative text with rich, complex contents that embed deep meanings (facts, strategic communication frames, and biases) for shaping and transitioning standards, values, attitudes, and beliefs of the masses. Currently, this body of narrative text remains untapped due—in large part—to human limitations. The human ability to comprehend rich text and extract hidden meanings is far superior to known computational algorithms but remains unscalable. In this research, computational treatment is given to online news framing for exposing a deeper level of expressivity coined “double subjectivity” as characterized by its cumulative amplification effects. A visual language is offered for extracting spatial and temporal dynamics of double subjectivity that may give insight into social influence about critical issues, such as environmental, economic, or political discourse. This research offers benefits of 1) scalability for processing hidden meanings in big data and 2) visibility of the entire network dynamics over time and space to give users insight into the current status and future trends of mass communication.
Date Created
2017
Agent

Mediating Human-Robot Collaboration through Mixed Reality Cues

155401-Thumbnail Image.png
Description
This work presents a communication paradigm, using a context-aware mixed reality approach, for instructing human workers when collaborating with robots. The main objective of this approach is to utilize the physical work environment as a canvas to communicate task-related instructions

This work presents a communication paradigm, using a context-aware mixed reality approach, for instructing human workers when collaborating with robots. The main objective of this approach is to utilize the physical work environment as a canvas to communicate task-related instructions and robot intentions in the form of visual cues. A vision-based object tracking algorithm is used to precisely determine the pose and state of physical objects in and around the workspace. A projection mapping technique is used to overlay visual cues on tracked objects and the workspace. Simultaneous tracking and projection onto objects enables the system to provide just-in-time instructions for carrying out a procedural task. Additionally, the system can also inform and warn humans about the intentions of the robot and safety of the workspace. It was hypothesized that using this system for executing a human-robot collaborative task will improve the overall performance of the team and provide a positive experience to the human partner. To test this hypothesis, an experiment involving human subjects was conducted and the performance (both objective and subjective) of the presented system was compared with a conventional method based on printed instructions. It was found that projecting visual cues enabled human subjects to collaborate more effectively with the robot and resulted in higher efficiency in completing the task.
Date Created
2017
Agent

Crossroads --- A Time-Sensitive Autonomous Intersection Management Technique

155312-Thumbnail Image.png
Description
For autonomous vehicles, intelligent autonomous intersection management will be required for safe and efficient operation. In order to achieve safe operation despite uncertainties in vehicle trajectory, intersection management techniques must consider a safety buffer around the vehicles. For truly safe

For autonomous vehicles, intelligent autonomous intersection management will be required for safe and efficient operation. In order to achieve safe operation despite uncertainties in vehicle trajectory, intersection management techniques must consider a safety buffer around the vehicles. For truly safe operation, an extra buffer space should be added to account for the network and computational delay caused by communication with the Intersection Manager (IM). However, modeling the worst-case computation and network delay as additional buffer around the vehicle degrades the throughput of the intersection. To avoid this problem, AIM, a popular state-of-the-art IM, adopts a query-based approach in which the vehicle requests to enter at a certain arrival time dictated by its current velocity and distance to the intersection, and the IM replies yes
o. Although this solution does not degrade the position uncertainty, it ultimately results in poor intersection throughput. We present Crossroads, a time-sensitive programming method to program the interface of a vehicle and the IM. Without requiring additional buffer to account for the effect of network and computational delay, Crossroads enables efficient intersection management. Test results on a 1/10 scale model of intersection using TRAXXAS RC cars demonstrates that our Crossroads approach obviates the need for large buffers to accommodate for the network and computation delay, and can reduce the average wait time for the vehicles at a single-lane intersection by 24%. To compare Crossroads with previous approaches, we perform extensive Matlab simulations, and find that Crossroads achieves on average 1.62X higher throughput than a simple VT-IM with extra safety buffer, and 1.36X better than AIM.
Date Created
2017
Agent