System-level Models for Network Monitoring and Change Detection

Description

Monitoring a system for deviations from standard or reference behavior is essential for many data-driven tasks. Whether it is monitoring sensor data or the interactions between system elements, such as edges in a path or transactions in a network, the goal is to detect significant changes from a reference. As technological advancements allow more data to be collected from systems, monitoring approaches should evolve to accommodate high-dimensional data and complex system settings. This dissertation introduces system-level models for monitoring tasks characterized by changes in a subset of system components, utilizing component-level information and relationships. A change may affect only a portion of the data or system (a partial change). The first three parts of this dissertation present applications and methods for detecting partial changes. The first part introduces a methodology for partial change detection in a simple, univariate setting. Changes are detected with posterior probabilities and statistical mixture models that allow only a fraction of the data to change. The second and third parts center on monitoring more complex multivariate systems modeled as networks, where the goal is to detect partial changes in the underlying network attributes and topology. Their contributions are two non-parametric, system-level monitoring techniques that consider relationships between network elements. The first algorithm, Supervised Network Monitoring (SNetM), leverages Graph Neural Networks and transforms the problem into supervised learning. The second, Supervised Network Monitoring for Partial Temporal Inhomogeneity (SNetMP), generates a network embedding and then transforms the problem into supervised learning. Finally, both SNetM and SNetMP construct monitoring measures and transform them into pseudo-probabilities that are monitored for changes.
The last topic addresses predicting and monitoring system-level delays on paths in a transportation/delivery system. For each item, the risk of delay is quantified. Machine learning is used to build a system-level model for delay risk, given the information available (such as environmental conditions) on the edges of a path, which integrates edge models. The outputs can then be used in a system-wide monitoring framework, and items most at risk are identified for potential corrective actions.
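The partial-change idea in the first part can be illustrated with a toy two-component Gaussian mixture, in which only a fraction of observations may shift to a new mean and each observation receives a posterior probability of having changed. This is a minimal, hypothetical sketch, not the dissertation's actual model; the means, standard deviation, and change fraction below are invented for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def change_posteriors(data, mu0, mu1, sigma, pi1):
    """Posterior probability that each observation came from the shifted
    (changed) component of the two-component mixture
    (1 - pi1) * N(mu0, sigma^2) + pi1 * N(mu1, sigma^2)."""
    posteriors = []
    for x in data:
        p0 = (1.0 - pi1) * gaussian_pdf(x, mu0, sigma)
        p1 = pi1 * gaussian_pdf(x, mu1, sigma)
        posteriors.append(p1 / (p0 + p1))
    return posteriors

# Invented setting: reference mean 0, hypothesized shifted mean 3,
# unit variance, prior change fraction 10%.
post = change_posteriors([0.1, -0.4, 3.2, 0.3], mu0=0.0, mu1=3.0,
                         sigma=1.0, pi1=0.1)
flagged = [i for i, p in enumerate(post) if p > 0.5]
print(flagged)  # → [2]
```

Only the observation near the shifted mean gets a high posterior probability of change, while the rest of the data remains attributed to the reference component.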
Date Created
2021
Agent

Machine Learning Models for High-Dimensional Matched Data

Description

Matching or stratification is commonly used in observational studies to remove bias due to confounding variables. Analyzing matched data sets requires specific methods that handle dependency among observations within a stratum. Also, modern studies often include hundreds or thousands of variables. Traditional methods for matched data sets are challenged by high-dimensional settings, mixed-type variables (numerical and categorical), and nonlinear and interaction effects. Furthermore, machine learning research for such structured data is quite limited. This dissertation addresses this important gap and proposes machine learning models for identifying informative variables from high-dimensional matched data sets. The first part of this dissertation proposes a machine learning model to identify informative variables from high-dimensional matched case-control data sets. The outcome of interest in this study design is binary (case or control), and each stratum is assumed to have one unit from each outcome level. The proposed method, referred to as Matched Forest (MF), is effective for a large number of variables and for identifying interaction effects. The second part of this dissertation proposes three enhancements of the MF algorithm. First, a regularization framework is proposed to improve variable selection performance in excessively high-dimensional settings. Second, a classification method is proposed to classify unlabeled pairs of data. Third, two metrics are proposed to estimate the effects of the important variables identified by MF. The third part proposes a machine learning model based on Neural Networks to identify important variables from a more general matched case-control data set in which each stratum has one unit from the case outcome level and more than one unit from the control outcome level. This method, referred to as Matched Neural Network (MNN), performs better than current algorithms at identifying variables with interaction effects.
Lastly, a generalized machine learning model is proposed to identify informative variables from high-dimensional matched data sets where the outcome has more than two levels. This method outperforms existing algorithms in the literature in identifying variables with complex nonlinear and interaction effects.
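The within-stratum dependency that these methods must respect can be seen in a simple sketch: for 1:1 matched data, reducing each stratum to case-minus-control covariate differences cancels confounding shared by the matched pair. This is not the Matched Forest algorithm itself (which builds on random forests); it is a hypothetical, stdlib-only illustration on invented data.

```python
import random

def stratum_differences(strata):
    """For 1:1 matched case-control data, reduce each stratum to the
    vector of (case - control) covariate differences. This removes any
    stratum-level confounder shared by the matched pair."""
    return [[c - k for c, k in zip(case, control)]
            for case, control in strata]

random.seed(0)
# Invented data: variable 0 is informative (cases shifted by +2),
# variable 1 is pure noise; "base" is a shared stratum confounder.
strata = []
for _ in range(200):
    base = random.gauss(0, 1)
    case = [base + 2 + random.gauss(0, 1), random.gauss(0, 1)]
    control = [base + random.gauss(0, 1), random.gauss(0, 1)]
    strata.append((case, control))

diffs = stratum_differences(strata)
means = [sum(col) / len(diffs) for col in zip(*diffs)]
print(means)  # variable 0's mean difference is near 2, variable 1's near 0
```

Any downstream learner (such as a forest, as in MF) applied to these differences sees the informative signal with the stratum effect already removed.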
Date Created
2021
Agent

On Memory and Physiological Signals of Experts and Novices-Case Study: Chess

Description

Chess has been a common research topic for expert-novice studies, and thus for learning science as a whole, because of its limited framework and longevity as a game. One factor is that chess studies are good at measuring how expert chess players use their memory and skills to approach a new chessboard configuration. Studies have shown that chess skill is based on memory, specifically, "chunks" of chess piece positions that have been previously encountered by players. However, debate exists concerning how these chunks are constructed in players' memory. These chunks could be constructed by proximity of pieces on the chessboard, along with their precise locations, or constructed through attack-defense relations. The primary objective of this study is to determine which of the two, proximity or attack/defense relations, is more in line with chess players' actual memory-based chess abilities. This study replicates and extends an experiment conducted by McGregor and Howe (2002), which explored the argument that pieces are primed more by attack and defense relations than by proximity. Like their study, the present study examined novice and expert chess players' response times for correct and error responses by showing slides of game configurations. In addition to these metrics, the present study also incorporated an eye tracker to measure visual attention and EEG to measure affective and cognitive states. These were added to allow the comparison of subtle and unconscious behaviors of both novice and expert chess players. Overall, most of McGregor and Howe's (2002) results were replicated, supporting their theory of chess expertise. This included statistical significance for skill in the error rates, with mean rates on the piece recognition tests of 70.1% for novices and 87.9% for experts, as well as significance for the two-way interaction of relatedness and proximity, with error rates of 22.4% for unrelated/far, 18.8% for related/far, 15.8% for unrelated/near, and 29.3% for related/near. Unfortunately, there was no statistical significance for any of the response-time effects, which McGregor and Howe had found for the interaction between skill and proximity. Although the eye-tracking and EEG data neither supported nor confirmed McGregor and Howe's theory on how chess players memorize chessboard configurations, these metrics did help build a secondary theory: novices typically rely on proximity to approach chess, and new visual problems in general. This was exemplified by the statistically significant results for short-term excitement in the two-way interaction of skill and proximity, where the largest short-term excitement score was for novices on near-proximity slides. This may indicate that novices, because they may lean toward using proximity to try to recall pieces, experience a short burst of excitement when the pieces are close to each other, because they are more likely to recall these configurations.
Date Created
2017-05
Agent

Leye (Lie) Detector — A Study of Lie Detection using Eye Tracking, Facial Gestures, and EEG

Description

Lie detection is used prominently in contemporary society for many purposes, such as pre-employment screenings, granting security clearances, and determining whether criminals or potential subjects may be lying, but it is by no means limited to that scope. However, lie detection has been criticized for being subjective, unreliable, inaccurate, and susceptible to deliberate manipulation. Furthermore, critics believe that the administrator of the test also influences the outcome. As a result, the polygraph machine, the contemporary device used for lie detection, has come under scrutiny when used as evidence in the courts. The purpose of this study is to use three entirely different tools and concepts to determine whether eye tracking systems, electroencephalography (EEG), and Facial Expression Emotion Analysis (FACET) are reliable tools for lie detection. This study found that certain constructs, such as the position of the left eye's gaze relative to its usual position (eye tracking) and engagement levels (EEG), could distinguish between truths and lies. However, FACET proved the most reliable tool of the three, providing not just one distinguishing variable but seven, all related to emotions derived from movements in the facial muscles during the present study. The emotions documented to distinguish between truthful and lying responses were joy, anger, fear, confusion, and frustration. In addition, overall measures of the subject's neutral and positive emotional expression were found to be distinctive factors. The implications of this study and future directions are discussed.
Date Created
2017-05
Agent

Enhancing Object Detection In An Augmented Reality Learning System

Description

The goal of the ANLGE Lab's AR assembly project is to create and save assemblies, as well as to replicate assemblies later with real-time AR feedback. In this iteration of the project, the SURF algorithm was used to provide object detection for five feature-rich objects (a Lego girl piece, a Lego guy piece, a blue Lego car piece, a window piece, and a fence piece). Functionality was also added to determine the locations of these five objects within a frame using the SURF keypoints associated with detection. Finally, the feedback mechanism by which the system detects connections between objects was improved to consider the size of the blocks in determining connections rather than using static values. Additional user features, such as adding a new object and using voice commands, were also implemented to make the system more user-friendly.
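The size-aware connection check described above can be sketched abstractly: instead of a fixed pixel threshold, the allowed gap between two detected blocks scales with block size. The function name, box format, and tolerance ratio below are illustrative assumptions, not the project's actual code.

```python
def blocks_connected(box_a, box_b, tolerance_ratio=0.25):
    """Decide whether two detected blocks are connected, scaling the
    allowed gap by block size instead of using a static pixel value.
    Boxes are (x, y, width, height) in pixels; the 25% tolerance ratio
    is an invented example value."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Allowed gap grows with the smaller of the two blocks' dimensions.
    tol = tolerance_ratio * min(aw, ah, bw, bh)
    # Horizontal and vertical gaps between the boxes (0 if overlapping).
    gap_x = max(bx - (ax + aw), ax - (bx + bw), 0)
    gap_y = max(by - (ay + ah), ay - (by + bh), 0)
    return gap_x <= tol and gap_y <= tol

# Two 40x40 blocks 5 px apart horizontally: within the scaled tolerance.
print(blocks_connected((0, 0, 40, 40), (45, 0, 40, 40)))  # → True
# Same blocks 30 px apart: gap exceeds the scaled tolerance.
print(blocks_connected((0, 0, 40, 40), (70, 0, 40, 40)))  # → False
```

Because the tolerance is derived from the smaller block, small pieces require tighter placement than large ones, which is the advantage of size-scaled thresholds over static values.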
Date Created
2015-05
Agent