Recent advances in Artificial Intelligence (AI) have brought AI closer to laypeople than ever before. This leads to a pervasive problem: how would a user ascertain whether an AI system will be safe, reliable, or useful in a given situation? This problem becomes particularly challenging when one considers that most autonomous systems are not designed by their users; the internal software of these systems may be unavailable or difficult to understand; and the functionality of these systems may even change from initial specifications as a result of learning. To overcome these challenges, this dissertation proposes a paradigm for third-party autonomous assessment of black-box taskable AI systems. The four main desiderata of such assessment systems are: (i) interpretability: generating a description of the AI system's functionality in a language that the target user can understand; (ii) correctness: ensuring that the description of the AI system's workings is accurate; (iii) generalizability: creating a solution approach that works well for different types of AI systems; and (iv) minimal requirements: creating an assessment system that does not place complex requirements on AI systems to support the third-party assessment, as the manufacturers of AI systems might not otherwise support such an assessment. To satisfy these properties, this dissertation presents algorithms and requirements that would enable user-aligned autonomous assessment to help the user understand the limits of a black-box AI system's safe operability. This dissertation proposes a personalized AI assessment module that discovers the high-level "capabilities" of an AI system with arbitrary internal planning algorithms/policies and learns an accurate symbolic description of these capabilities in terms of concepts that a user understands. Furthermore, the dissertation includes the associated theoretical results and empirical evaluations. The results show that (i) a primitive query-response interface can enable the development of autonomous assessment modules that efficiently derive a causally accurate, user-interpretable model of the system's capabilities, and (ii) such descriptions are easier for users to understand and reason with than the agent's primitive actions.
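The query-response interface is described only at a high level here; the following is a minimal Python sketch, under assumed names, of how an assessment module might probe a black-box system with primitive can-you-achieve-this queries and accumulate a capability model over user-interpretable concepts. It illustrates the paradigm, not the dissertation's actual algorithm.

from typing import Callable, Dict, List

# A state is modeled as a set of user-interpretable concepts (propositions).
State = frozenset

def assess_capabilities(
    answer_query: Callable[[State, State], bool],  # black box: can it reach goal from init?
    candidate_goals: List[State],
    initial_state: State,
) -> Dict[State, bool]:
    """Probe the black-box AI with primitive queries and record which
    high-level goals it reports being able to achieve."""
    capability_model: Dict[State, bool] = {}
    for goal in candidate_goals:
        capability_model[goal] = answer_query(initial_state, goal)
    return capability_model

# Example with a stub system that can only achieve goals involving "door-open".
model = assess_capabilities(
    answer_query=lambda init, goal: "door-open" in goal,
    candidate_goals=[frozenset({"door-open"}), frozenset({"box-lifted"})],
    initial_state=frozenset({"at-start"}),
)
print(model)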
Choosing a video streaming service to subscribe to involves a complex decision-making process consisting of multifaceted factors that consumers must carefully consider. This dissertation identifies and examines the factors influencing consumers' decisions by reviewing research from diverse fields such as human factors, psychology, economics, and human-computer interaction. The identified factors shaping consumers' choices, in order of importance, are advertisements, social value, price, content, and content discovery methods. Additionally, this study assesses consumers' willingness to pay for each factor and examines whether their stated explanations of their choice of platform align with their behavior in the choice experiments. Opportunities for future research are discussed, including the potential for developing an algorithm to determine one's likelihood of subscribing to a streaming platform based on the choice heuristics outlined in this study.
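As a rough illustration of the subscription-likelihood algorithm proposed as future work, the Python sketch below combines a consumer's ratings of the five identified factors into a single score. The weights are illustrative placeholders that merely preserve the importance ordering reported in the study (advertisements > social value > price > content > content discovery); they are not fitted values.

# Hypothetical weights preserving the study's reported importance ordering.
FACTOR_WEIGHTS = {
    "advertisements": 0.30,
    "social_value": 0.25,
    "price": 0.20,
    "content": 0.15,
    "content_discovery": 0.10,
}

def subscription_likelihood(ratings: dict) -> float:
    """Combine a consumer's 0-1 factor ratings into a likelihood score."""
    return sum(w * ratings.get(f, 0.0) for f, w in FACTOR_WEIGHTS.items())

# Example: a consumer who tolerates ads poorly but rates the catalog highly.
print(subscription_likelihood({
    "advertisements": 0.2, "social_value": 0.6, "price": 0.7,
    "content": 0.9, "content_discovery": 0.5,
}))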
Due to monumental advancements in large language models (LLMs), such as OpenAI's ChatGPT, there is widespread interest in integrating this general AI's capabilities into various applications, including robotics. However, the rush to deploy this technology has left safety as an afterthought, if it is considered at all. This study investigates the potential for LLM-fused robots to operate safely in real-world settings. The study begins with a review of ChatGPT, highlighting its capabilities and current challenges, particularly with integrating LLMs into robotics, and continues with similar applications of AI agents through APIs. To assess the safety implications of LLM-driven robots, the study presents experimental methods involving the navigation of a TurtleSim robot in 2D environments under different scenarios. Various parameters are analyzed to determine ChatGPT's current capability to adjust the agents it controls based on the situation. Current findings reveal that ChatGPT-driven robots demonstrate adaptive behavior based on the scenario provided, indicating their potential for real-time safety adjustments and motivating further research to ensure the safe and successful integration of these robots into diverse work environments.
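For concreteness, below is a minimal Python sketch of how such an experiment might wire ChatGPT to a TurtleSim robot: a scenario description is sent to the model and its reply is parsed into a capped velocity command. The prompt, parsing, model name, and safety cap are illustrative assumptions, not the study's exact setup, and a real deployment would need far stronger validation of the model's output.

import rospy
from geometry_msgs.msg import Twist
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def llm_velocity_for(scenario: str) -> Twist:
    # Ask the model for a velocity pair given the scenario (illustrative prompt).
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content":
                   f"Scenario: {scenario}\n"
                   "Reply with only two numbers: linear and angular velocity "
                   "for a 2D robot."}],
    ).choices[0].message.content
    linear, angular = (float(x) for x in reply.split()[:2])  # fragile; sketch only
    cmd = Twist()
    cmd.linear.x = max(min(linear, 1.0), -1.0)   # cap speeds before actuation
    cmd.angular.z = max(min(angular, 1.0), -1.0)
    return cmd

rospy.init_node("llm_turtle")
pub = rospy.Publisher("/turtle1/cmd_vel", Twist, queue_size=1)
rospy.sleep(1.0)  # let the publisher connect before the first command
pub.publish(llm_velocity_for("A person has stepped into the robot's path."))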
The purpose of the present study is to explore a potential rehabilitation alternative or supplement for cases where time, insurance, finances, or lack of knowledge limit mild traumatic brain injury (mTBI) executive function (EF) rehabilitation. The experimental intervention involved two sets of participants: an experimental group and a control group. Participants in both groups completed initial (week 1) and final (week 6) EF and TBI assessments. The experimental group additionally participated in four weeks (weeks 2-5) of an experimental intervention using the beta stage of a web-based application. The aim of the intervention was to train the EF skills of planning, organization, and cognitive flexibility through serious gamification. At the conclusion of the study, participants in the experimental group achieved higher scores on the experimental executive function assessment than the control group. The difference in scores can be attributed to the weekly participation in executive function training.
Proper allocation of attention while driving is imperative to the safety of the driver, as well as the safety of those around the driver. There is no doubt that in-vehicle alerts can effectively direct driver attention; in fact, visual, auditory, and tactile alert modalities have all been shown to be more effective than no alert at all. However, research on in-vehicle alerts has primarily been limited to single-hazard scenarios. The current research examines the effects of in-vehicle alert modality on driver attention toward simultaneously occurring hazards. When a driver is presented with multiple stimuli simultaneously, there is a risk of alert masking, in which one stimulus is obscured by the presence of another. As the number of concurrent stimuli increases, the ability to report targets decreases; meanwhile, the alert acts as yet another target that the driver must process. Recent research on the masking effects of simultaneous alerts has shown masking to cause breakdowns in the detection and identification of alarms during a task, outlining a possible cost of alert technology. Additionally, existing work has shown auditory alerts to be more effective in directing driver attention, resulting in faster reaction times (RTs) than visual alerts. Multiple Resource Theory suggests that, because of the highly visual nature of driving, drivers may have more auditory than visual resources available to process stimuli without becoming overloaded. Therefore, it was predicted that auditory alerts would be more effective in allowing drivers to recognize both potential hazards, measured through reduced brake reaction times and increased accuracy on a post-drive hazard observance question. The current study did not support this hypothesis: modality did not produce a significant difference in drivers' attention to simultaneously occurring hazards. The salience of the hazards in each scenario seemed to make the largest impact on whether participants observed them. Although the hypothesis was not supported, the study had several limitations, and the results nonetheless point to the importance of further research on simultaneously occurring hazards, which pose a risk to drivers, especially when attention is allocated to only one of the hazards.
With the increasing popularity of AI and machine learning, human-AI teaming has a wide range of applications in transportation, healthcare, the military, manufacturing, and people's everyday lives. Measurement of human-AI team effectiveness is essential for guiding the design of AI and evaluating human-AI teams. To develop suitable measures of human-AI teamwork effectiveness, we created a search and rescue task environment in Minecraft, in which Artificial Social Intelligence (ASI) agents inferred human teams' mental states, predicted their actions, and intervened to improve their teamwork (Huang et al., 2022). As a comparison, we also collected data from teams with a human advisor and with no advisor, and investigated the effects of human advisor interventions on team performance. In this study, we examined intervention data and compliance in a human-AI teaming experiment to gain insights into the efficacy of advisor interventions. The analysis categorized the types of interventions provided by a human advisor and the corresponding compliance. The findings of this paper are a preliminary step toward a comprehensive study of ASI agents, in which results from the human advisor study can provide valuable comparisons and insights. Future research will focus on analyzing ASI agents' interventions to determine their effectiveness, identifying the best measurements of human-AI teamwork effectiveness, and facilitating the development of ASI agents.
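The compliance tabulation described above might look like the following Python sketch, assuming a log with one row per advisor intervention; the column names and intervention categories are hypothetical, not the study's actual coding scheme.

import pandas as pd

# Hypothetical intervention log: one row per advisor intervention.
interventions = pd.DataFrame({
    "team": [1, 1, 2, 2, 3],
    "intervention_type": ["prompt_action", "share_info", "prompt_action",
                          "coordinate", "share_info"],
    "complied": [True, False, True, True, False],
})

# Compliance rate per intervention category, the quantity of interest when
# later comparing human-advisor interventions with ASI-agent interventions.
print(interventions.groupby("intervention_type")["complied"].mean())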
I compared situational awareness scores to mission performance scores from the Human-Robot Interaction Lab on the ASU campus. The study uses a virtual environment built in Roblox to simulate a search and rescue scenario. Higher situational awareness was positively correlated with mission performance scores, but the study is not yet complete.
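A minimal Python sketch of the reported analysis, with placeholder numbers rather than study data, would correlate the two score sets directly:

from scipy.stats import pearsonr

sa_scores = [62, 71, 55, 80, 68]         # hypothetical situational awareness scores
performance = [140, 165, 120, 190, 150]  # hypothetical mission performance scores

r, p = pearsonr(sa_scores, performance)
print(f"r = {r:.2f}, p = {p:.3f}")  # a positive r mirrors the reported trend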
Cyber operations form a complex sociotechnical system in which humans and computers operate in an environment in constant flux as new technology and procedures are applied. Once inside the network and having established a foothold, or beachhead, malicious actors can collect sensitive information, scan targets, and execute an attack. Increasing defensive capabilities through cyber deception shows great promise by providing an opportunity to delay and disrupt an attacker once network perimeter security has already been breached. Traditional Human Factors research and methods are designed to mitigate human limitations (e.g., mental, physical) to improve performance. These methods can also be used combatively to upend performance. Oppositional Human Factors (OHF) seek to strategically capitalize on cognitive limitations by eliciting decision-making errors and poor usability. Deceptive tactics that elicit decision-making biases might infuse attacker processes with uncertainty, make the overall attack economics unfavorable, and cause an adversary to make mistakes and waste resources.
Two online experimental platforms were developed to test the Sunk Cost Fallacy in an interactive, gamified, and abstracted version of cyber attacker activities. This work presents results from the Cypher platform, offering a novel approach to understanding decision-making and the Sunk Cost Fallacy as influenced by factors of uncertainty, project completion, and difficulty on progress decisions. Results demonstrate that these methods are effective in delaying attacker forward progress, though further research is needed to fully understand the contexts in which decision-making limitations do and do not occur. The second platform, Attack Surface, is also described, and limitations and lessons learned are presented for future work.
As people begin to live longer and the population shifts to having more older adults on Earth than young children, radical solutions will be needed to ease the burden on society. It will be essential to develop technology that can age with the individual. One solution is to keep older adults in their homes longer through smart home and smart living technology, allowing them to age in place. People have many choices when deciding where to age in place, including their own homes, assisted living facilities, nursing homes, or with family members. No matter where people choose to age, they may face isolation and financial hardships. It is crucial to keep finances in mind when developing smart home technology.

Smart home technologies seek to allow individuals to stay in their homes for as long as possible, yet little work examines how technology can be used across different life stages. Robots are poised to impact society and ease burdens at home and in the workforce, and special attention has been given to social robots as a means of easing isolation. As social robots become accepted into society, researchers need to understand how these robots should mimic natural conversation. My work attempts to answer this question within social robotics by investigating how to make conversational robots natural and reciprocal.

I investigated this through a 2x2 Wizard of Oz between-subjects user study. The study lasted four months and tested four different levels of interactivity with the robot. None of the levels were significantly different from the others, an unexpected result. I then investigated the robot's personality, the participant's trust, and the participant's acceptance of the robot, and how these influenced the study.
This paper documents a study of the relationship between heads-up display (HUD) customization and player performance. Additional measures captured satisfaction and prior gaming experience. The goal of this study was to develop a framework on which future Human Systems Engineering studies could build games tailor-made to examine a given area of interest. The study utilized a two-by-two design in which participants played a two-dimensional (2D) platformer game with a mechanic that incentivizes attention to the HUD. The study successfully developed a framework and was moderately successful in uncovering limitations and demonstrating areas for improvement in follow-on studies. Specifically, it illuminated issues with the low amount of usable data, caused by design issues, participant apathy, and reliance on self-reported data. Extensions of this study can utilize this framework and should look to recruit beyond crowdsourcing platforms, collect more diverse data, reduce participant effort, and address other considerations that were found during execution.