LiKamWa, Robert

Song Final Project (Spring 2022)

Description

Spatial audio can be especially useful for directing human attention. However, delivering spatial audio through speakers, rather than headphones that deliver audio directly to the ears, produces the issue of crosstalk, where sounds from each of the two speakers reach the opposite ear, inhibiting the spatialized effect. A research team at Meteor Studio has developed an algorithm called Xblock that solves this issue using a crosstalk cancellation technique. This thesis project expands upon the existing Xblock IoT system by providing a way to test the accuracy of the directionality of sounds generated with spatial audio. More specifically, the objective is to determine whether the usage of Xblock with smart speakers can provide generalized audio localization, which refers to the ability to detect a general direction of where a sound might be coming from. This project also expands upon the existing Xblock technique to integrate voice commands, where users can verbalize the name of a lost item using the phrase, “Find [item]”, and the IoT system will use spatial audio to guide them to it.

Date Created

2022-05

Agent

Author (aut): Song, Lucy
Thesis director: LiKamWa, Robert
Committee member: Berisha, Visar
Contributor (ctb): Barrett, The Honors College
Contributor (ctb): Computer Science and Engineering Program

Spatial Audio Localization with Internet of Things (IoT)

Description

Date Created

2022-05

Agent

Author (aut): Song, Lucy
Thesis director: LiKamWa, Robert
Committee member: Berisha, Visar
Contributor (ctb): Barrett, The Honors College
Contributor (ctb): Computer Science and Engineering Program

A Scalable and Programmable I/O Controller for Region-based Computing

Description

I present my work on a scalable and programmable I/O controller for region-based computing, which will be used in a rhythmic pixel-based camera pipeline. I provide a breakdown of the development and design of the I/O controller and how it fits into rhythmic pixel regions, along with a study on the memory traffic of rhythmic pixel regions and how this translates to energy efficiency. This rhythmic pixel region-based camera pipeline has been jointly developed through Dr. Robert LiKamWa’s research lab. High spatiotemporal resolutions allow high precision for vision applications, such as detecting features for augmented reality or face detection. High spatiotemporal resolution also comes with high memory throughput, leading to higher energy usage. This creates a tradeoff between high precision and energy efficiency, which becomes more important in mobile systems. In addition, not all pixels in a frame are necessary for the vision application, such as pixels that make up the background. Rhythmic pixel regions aim to reduce the tradeoff by creating a pipeline that allows an application developer to specify regions to capture at a non-uniform spatiotemporal resolution. This is accomplished by encoding the incoming image and sending only the pixels within these specified regions. Later, these encoded representations are decoded to a standard frame representation usable by traditional vision applications. My contribution to this effort has been the design, testing, and evaluation of the I/O controller.
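As a rough illustration of the encode-then-decode idea described above, the following minimal sketch keeps only the pixels that fall inside developer-specified rectangles and later rebuilds a full-size frame. It is not the thesis's encoding: the `(x, y, width, height)` region format, the function names `encode_regions` and `decode_regions`, and the fill value are all assumptions made for this example, and the temporal (per-region frame rate) aspect is omitted.

```python
from typing import List, Tuple

# Hypothetical region format for this sketch: (x, y, width, height).
Region = Tuple[int, int, int, int]
Frame = List[List[int]]


def encode_regions(frame: Frame, regions: List[Region]) -> Tuple[List[int], List[List[bool]]]:
    """Keep only the pixels inside any region, in raster order, plus a mask."""
    height, width = len(frame), len(frame[0])
    mask = [[False] * width for _ in range(height)]
    for x, y, w, h in regions:
        # Clamp each rectangle to the frame bounds before marking it.
        for row in range(max(0, y), min(height, y + h)):
            for col in range(max(0, x), min(width, x + w)):
                mask[row][col] = True
    pixels = [frame[r][c] for r in range(height) for c in range(width) if mask[r][c]]
    return pixels, mask


def decode_regions(pixels: List[int], mask: List[List[bool]], fill: int = 0) -> Frame:
    """Rebuild a standard full-size frame; pixels outside the regions get `fill`."""
    stream = iter(pixels)
    return [[next(stream) if keep else fill for keep in row] for row in mask]
```

In this toy version, a 4x4 frame with one 2x2 region transmits 4 pixel values instead of 16, which is the kind of memory-traffic reduction the abstract refers to.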

Date Created

2020

Agent

Author (aut): Nguyen, Van
Thesis advisor (ths): LiKamWa, Robert
Committee member: Jayasuriya, Suren
Committee member: Yang, Yezhou
Publisher (pbl): Arizona State University

Thermal noise analysis of near-sensor image processing

Description

Commonly, image processing is handled on a CPU that is connected to the image sensor by a wire. In these far-sensor processing architectures, there is energy loss associated with sending data across an interconnect from the sensor to the CPU. In an effort to increase energy efficiency, near-sensor processing architectures have been developed, in which the sensor and processor are stacked directly on top of each other. This reduces energy loss associated with sending data off-sensor. However, processing near the image sensor causes the sensor to heat up. Reports of thermal noise in near-sensor processing architectures motivated us to study how temperature affects image quality on a commercial image sensor and how thermal noise affects computer vision task accuracy. We analyzed image noise across nine different temperatures and three sensor configurations to determine how image noise responds to an increase in temperature. Ultimately, our team used this information, along with transient analysis of a stacked image sensor’s thermal behavior, to advise thermal management strategies that leverage the benefits of near-sensor processing and prevent accuracy loss at problematic temperatures.

Date Created

2020-12

Agent

Author (aut): Jones, Britton Steele
Thesis director: LiKamWa, Robert
Committee member: Jayasuriya, Suren
Contributor (ctb): Watts College of Public Service & Community Solut
Contributor (ctb): Electrical Engineering Program
Contributor (ctb): Barrett, The Honors College

Exploring the Influence of Visualized Data: Inclusion and Collaboration Between University Members

Description

Visualizations are an integral component for communicating and evaluating modern networks. As data becomes more complex, infographics require a balance between visual noise and effective storytelling that is often restricted by layouts unsuitable for scalability. The challenge then rests upon researchers to effectively structure their information in a way that allows for flexible, transparent illustration. We propose network graphing as an operative alternative for demonstrating community behavior over traditional charts, which are unable to look past numeric data. In this paper, we explore methods for manipulating, processing, cleaning, and aggregating data in Python, a programming language well suited to handling structured data, which can then be formatted for analysis and modeling of social network tendencies in Gephi. We implement this data by applying an algorithm known as the Fruchterman-Reingold force-directed layout to datasets of Arizona State University’s research and collaboration network. The result is a visualization that analyzes the university’s infrastructure by providing insight about community behaviors between colleges. Furthermore, we highlight how the flexibility of this visualization provides a foundation for specific use cases by demonstrating centrality measures to find important liaisons that connect distant communities.
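To make the idea of using a centrality measure to find liaisons concrete, here is a minimal pure-Python sketch of betweenness centrality (Brandes' algorithm) on a small undirected graph. The abstract does not say which centrality measures were used, and the thesis performed its analysis in Gephi, so the choice of betweenness, the function names, and the toy node names are all assumptions for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Build a symmetric adjacency list from undirected edges."""
    adjacency: Dict[str, List[str]] = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    return adjacency


def betweenness_centrality(adjacency: Dict[str, List[str]]) -> Dict[str, float]:
    """Brandes' algorithm for an unweighted, undirected graph (unnormalized)."""
    centrality = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from `source`.
        order: List[str] = []
        predecessors: Dict[str, List[str]] = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        path_count[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_count[neighbor] += path_count[node]
                    predecessors[neighbor].append(node)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {node: 0.0 for node in adjacency}
        while order:
            node = order.pop()
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1.0 + dependency[node])
            if node != source:
                centrality[node] += dependency[node]
    # Each unordered pair was counted once from each endpoint.
    return {node: value / 2.0 for node, value in centrality.items()}


# Toy example: two tightly knit groups joined through a single "liaison" node.
toy_edges = [
    ("A1", "A2"), ("A1", "A3"), ("A2", "A3"),
    ("A3", "liaison"), ("liaison", "B1"),
    ("B1", "B2"), ("B1", "B3"), ("B2", "B3"),
]
toy_scores = betweenness_centrality(build_adjacency(toy_edges))
top_liaison = max(toy_scores, key=toy_scores.get)
```

In the toy graph, every shortest path between the two groups passes through the `liaison` node, so it receives the highest score. Graph tools such as Gephi compute the same kind of measure on much larger networks.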

Date Created

2020-05

Agent

Author (aut): McMichael, Jacob Andrew
Thesis director: LiKamWa, Robert
Committee member: Anderson, Derrick
Committee member: Goshert, Maxwell
Contributor (ctb): Arts, Media and Engineering Sch T
Contributor (ctb): Barrett, The Honors College

Investigating Methods of Achieving Photorealistic Materials for Augmented Reality Applications on Mobile Devices

Description

As the prevalence of augmented reality (AR) technology continues to increase, so too do methods for improving the appearance and behavior of computer-generated objects. This is especially significant as AR applications now expand to territories outside of the entertainment sphere and can be utilized for numerous purposes encompassing but not limited to education, specialized occupational training, retail & online shopping, design, marketing, and manufacturing. Due to the nature of AR technology, where computer-generated objects are being placed into a real-world environment, a decision has to be made regarding the visual connection between the tangible and the intangible. Should the objects blend seamlessly into their environment or purposefully stand out? It is not purely a stylistic choice. A developer must consider how their application will be used. In many instances an optimal user experience is facilitated by mimicking the real world as closely as possible; even simpler applications, such as those built primarily for mobile devices, can benefit from realistic AR. The struggle here lies in creating an immersive user experience that is not reliant on computationally expensive graphics or heavy-duty models. The research contained in this thesis provides several ways of achieving photorealistic rendering in AR applications using a range of techniques, all of which are supported on mobile devices. These methods can be employed within the Unity Game Engine and incorporate shaders, render pipelines, node-based editors, post-processing, and light estimation.

Date Created

2020-05

Agent

Author (aut): Schanberger, Schuyler Catherine
Thesis director: LiKamWa, Robert
Committee member: Jayasuriya, Suren
Contributor (ctb): Arts, Media and Engineering Sch T
Contributor (ctb): Barrett, The Honors College

Viewpoint Recommendation for Aesthetic Photography

Description

This thesis addresses the problem of recommending a viewpoint for aesthetic photography. Viewpoint recommendation means suggesting the best camera pose for capturing a visually pleasing photograph of the subject of interest using any end-user device, such as a drone, mobile robot, or smartphone. Solving this problem makes it possible to capture visually pleasing photographs autonomously in aerial photography, wildlife photography, landscape photography, or personal photography.

The viewpoint recommendation problem can be divided into two stages: (a) generating a set of dense novel views from the basis views captured around the subject, which help in understanding the scene and how the subject looks from different viewpoints; and (b) scoring each novel view by its aesthetic quality. The viewpoint with the highest aesthetic score is recommended for capturing a visually pleasing photograph.
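The second stage, scoring each candidate view and recommending the highest-scoring one, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the scoring model, so `toy_aesthetic_score` is a hypothetical stand-in based on the rule of thirds, and the view dictionaries with `pose` and `subject_xy` fields are invented for this example.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical view record: a pose label and the subject's normalized image position.
View = Dict[str, object]

# The four rule-of-thirds intersection points in normalized image coordinates.
THIRDS_POINTS = [(x, y) for x in (1 / 3, 2 / 3) for y in (1 / 3, 2 / 3)]


def toy_aesthetic_score(view: View) -> float:
    """Placeholder scorer: higher when the subject sits near a thirds point."""
    sx, sy = view["subject_xy"]
    nearest = min(math.hypot(sx - px, sy - py) for px, py in THIRDS_POINTS)
    return -nearest


def recommend_viewpoint(
    views: List[View],
    score: Callable[[View], float] = toy_aesthetic_score,
) -> Tuple[View, float]:
    """Score every candidate view and return the best one with its score."""
    scored = [(score(view), view) for view in views]
    best_score, best_view = max(scored, key=lambda pair: pair[0])
    return best_view, best_score


candidate_views = [
    {"pose": "front", "subject_xy": (0.50, 0.50)},
    {"pose": "left", "subject_xy": (0.34, 0.33)},
    {"pose": "high", "subject_xy": (0.90, 0.90)},
]
best_view, best_score = recommend_viewpoint(candidate_views)
```

Any scoring function with the same signature can be passed as `score`, so a learned aesthetic model could replace the placeholder without changing the selection step.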

Date Created

2019

Agent

Author (aut): Katukuri, Sathish Kumar
Thesis advisor (ths): LiKamWa, Robert
Committee member: Turaga, Pavan
Committee member: Jayasuriya, Suren
Publisher (pbl): Arizona State University

Accelerating Linear Algebra and Machine Learning Kernels on a Massively Parallel Reconfigurable Architecture

Description

This thesis presents efficient implementations of several linear algebra kernels, machine learning kernels, and a neural network based recommender systems engine on a massively parallel reconfigurable architecture, Transformer. The linear algebra kernels include Triangular Matrix Solver (TRSM), LU Decomposition (LUD), QR Decomposition (QRD), and Matrix Inversion. The machine learning kernels include an LSTM (Long Short-Term Memory) cell and a GRU (Gated Recurrent Unit) cell used in recurrent neural networks. The neural network based recommender systems engine consists of multiple kernels including fully connected layers, embedding layer, 1-D batchnorm, Adam optimizer, etc.

Transformer is a massively parallel reconfigurable multicore architecture designed at the University of Michigan. The Transformer configuration considered here is 4 tiles and 16 General Processing Elements (GPEs) per tile. It supports a two level cache hierarchy where the L1 and L2 caches can operate in shared (S) or private (P) modes. The architecture was modeled using Gem5 and cycle accurate simulations were done to evaluate the performance in terms of execution times, giga-operations per second per Watt (GOPS/W), and giga-floating-point-operations per second per Watt (GFLOPS/W).

This thesis shows that each linear algebra kernel achieves high performance in a particular cache mode, and that the best cache mode can change with the matrix size. For smaller matrix sizes, the L1P, L2P cache mode is best for TRSM, L1S, L2S is best for LUD, and L1P, L2S is best for QRD. For TRSM, for instance, L1P, L2P is best for smaller matrix sizes ($N=64, 128, 256, 512$), but the best mode changes to L1S, L2P for larger matrix sizes ($N=1024$). For machine learning kernels, L1P, L2P is the best cache mode for all network parameter sizes.

Gem5 simulations show that the peak performance for TRSM, LUD, QRD, and Matrix Inversion at the 14 nm node is 97.5, 59.4, 133.0, and 83.05 GFLOPS/W, respectively. For LSTM and GRU, the peak performance is 44.06 and 69.3 GFLOPS/W, respectively.

The neural network based recommender system was implemented in L1S, L2S cache mode. It includes a forward pass and a backward pass and is significantly more complex in terms of both computational complexity and data movement. The most computationally intensive block is the fully connected layer followed by Adam optimizer. The overall performance of the recommender systems engine is 54.55 GFLOPS/W and 169.12 GOPS/W.

Date Created

2019

Agent

Author (aut): Soorishetty, Anuraag
Thesis advisor (ths): Chakrabarti, Chaitali
Committee member: Kim, Hun Seok
Committee member: LiKamWa, Robert
Publisher (pbl): Arizona State University

Protecting Visual Information in Augmented Reality from Malicious Application Developers

Description

Visual applications – those that use camera frames as part of the application – provide a rich, context-aware experience. The continued development of mixed and augmented reality (MR/AR) computing environments furthers the richness of this experience by providing applications a continuous vision experience, where visual information continuously provides context for applications and the real world is augmented by the virtual. To understand user privacy concerns in continuous vision computing environments, this work studies three MR/AR applications (augmented markers, augmented faces, and text capture) to show that in a modern mobile system, the typical user is exposed to potential mass collection of sensitive information, posing privacy and security deficiencies to be addressed in future systems.

To address such deficiencies, a development framework is proposed that provides resource isolation between user information contained in camera frames and application access to the network. The design is implemented as a proof of concept on the Android operating system using existing system utilities, and its viability is demonstrated with a modern state-of-the-art augmented reality library and several augmented reality applications. The design is evaluated on a Samsung Galaxy S8 phone by comparing the applications from the case study with modified versions that better protect user privacy. Early results show that the new design efficiently protects users against data collection in MR/AR applications with less than 0.7% performance overhead.

Date Created

2019

Agent

Author (aut): Jensen, Jk
Thesis advisor (ths): LiKamWa, Robert
Committee member: Doupe, Adam
Committee member: Wang, Ruoyu
Publisher (pbl): Arizona State University

ECOAcoustic: A VR Experience

Description

Acoustic Ecology is an undervalued field that studies the relationship between the environment and sound. This project aims to educate people on this topic and demonstrate its importance by immersing them in virtual reality scenes. The scenes were created using VR180 content as well as 360° spatial audio.

Date Created

2019-05

Agent

Author (aut): Neel, Jordan Tanner
Thesis director: LiKamWa, Robert
Committee member: Feisst, Sabine
Contributor (ctb): Arts, Media and Engineering Sch T
Contributor (ctb): Department of Psychology
Contributor (ctb): Barrett, The Honors College
