Convoluted Processes: The Use and Misuse of Machine Learning in Data Analysis and Prediction

Description

With the rapid increase of technological capabilities, particularly in processing power and speed, the usage of machine learning is becoming increasingly widespread, especially in fields where real-time assessment of complex data is extremely valuable. This surge in popularity of machine learning gives rise to an abundance of potential research and projects on further broadening applications of artificial intelligence. From these opportunities comes the purpose of this thesis. Our work seeks to meaningfully increase our understanding of current capabilities of machine learning and the problems they can solve. One extremely popular application of machine learning is in data prediction, as machines are capable of finding trends that humans often miss. Our effort to this end was to examine the CVE dataset and attempt to predict future entries with Random Forests. The second area of interest lies within the great promise being demonstrated by neural networks in the field of autonomous driving. We sought to understand the research being put out by the most prominent bodies within this field and to implement a model on one of the largest standing datasets, Berkeley DeepDrive 100k. This thesis describes our efforts to build, train, and optimize a Random Forest model on the CVE dataset and a convolutional neural network on the Berkeley DeepDrive 100k dataset. We document these efforts with the goal of growing our knowledge on (and usage of) machine learning in these topics.

Date Created
2022-05
Agent

A Graph-Based Machine Learning Approach to Realistic Traffic Volume Generation

Description
In this work, we explore the potential for realistic and accurate generation of hourly traffic volume with machine learning (ML), using the ground-truth data of Manhattan road segments collected by the New York State Department of Transportation (NYSDOT). Specifically, we address the following question: can we develop an ML algorithm that generalizes the existing NYSDOT data to all road segments in Manhattan? We do so by introducing a supervised learning task of multi-output regression, where ML algorithms use road segment attributes to predict hourly traffic volume. We consider four ML algorithms (K-Nearest Neighbors, Decision Tree, Random Forest, and Neural Network) and tune their hyperparameters by evaluating the performance of each algorithm with 10-fold cross-validation. Ultimately, we conclude that neural networks are the best-performing models and require the least amount of testing time. Lastly, we provide insight into the quantification of “trustworthiness” in a model, followed by brief discussions on interpreting model performance, suggesting potential project improvements, and identifying the biggest takeaways. Overall, we hope our work can serve as an effective baseline for realistic traffic volume generation and open new directions in the processes of supervised dataset generation and ML algorithm design.
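The hyperparameter tuning by cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's code: it uses a small pure-Python K-Nearest Neighbors regressor, synthetic data with three outputs standing in for the hourly volumes, and mean absolute error as the score, all of which are assumptions made for illustration. The function names (knn_predict, k_fold_indices, cross_val_mae, make_toy_data) are hypothetical.

```python
import math
import random


def knn_predict(train_X, train_Y, x, k):
    """Multi-output KNN: average the output vectors of the k nearest rows."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    n_out = len(train_Y[0])
    return [sum(train_Y[i][j] for i in nearest) / len(nearest)
            for j in range(n_out)]


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle row indices and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]


def cross_val_mae(X, Y, k, n_folds=10, seed=0):
    """Mean absolute error of KNN with parameter k, averaged over the folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(X), n_folds, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_Y = [Y[i] for i in range(len(X)) if i not in held]
        abs_err, count = 0.0, 0
        for i in held_out:
            pred = knn_predict(train_X, train_Y, X[i], k)
            abs_err += sum(abs(p - t) for p, t in zip(pred, Y[i]))
            count += len(pred)
        fold_errors.append(abs_err / count)
    return sum(fold_errors) / len(fold_errors)


def make_toy_data(n=100, seed=1):
    """Synthetic stand-in: two segment attributes, three output values."""
    rng = random.Random(seed)
    X = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(n)]
    Y = [[10 * a + 5 * b, 20 * a, 3 * b] for a, b in X]
    return X, Y


X, Y = make_toy_data()
# Candidate values of k are an arbitrary illustrative grid.
scores = {k: cross_val_mae(X, Y, k) for k in (1, 3, 5, 10)}
best_k = min(scores, key=scores.get)
print("cross-validated MAE per k:", scores)
print("selected k:", best_k)
```

The same loop applies to any of the four model families named in the abstract: only the predict step changes, while the fold split and the averaged held-out error stay the same.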
Date Created
2022-05
Agent

An Introduction to Unstructured Case Management

Description
In the age of information, collecting and processing large amounts of data is an integral part of running a business. From training artificial intelligence to driving decision making, the applications of data are far-reaching. However, many types of data are difficult to process, unstructured data in particular. Unstructured data is “information that either does not have a predefined data model or is not organized in a pre-defined manner” (Balducci & Marinova 2018). Such data are difficult to put into spreadsheets and relational databases due to their lack of numeric values and often come in the form of text fields written by consumers (Wolff, R. 2020). The goal of this project is to help develop a machine learning model to aid CommonSpirit Health and ServiceNow, which is why an approach using unstructured data was selected. This paper provides a general overview of the process of unstructured data management and explores some existing implementations and their efficacy. It then discusses our approach to converting unstructured cases into usable data, which were used to develop an artificial intelligence model that is estimated to be worth $400,000 and to save CommonSpirit Health $1,200,000 in organizational impact.
Date Created
2022-05
Agent

C*-Algebra in Quantum Mechanics: Proving the Limitations of Our Typical Representations and the Need for C*-Algebra

Description

In this thesis we build up our operator theory for finite- and infinite-dimensional systems. We then prove that the Heisenberg and Schrödinger representations are equivalent for systems with finitely many degrees of freedom. We then make a case for switching to a C*-algebra formulation of quantum mechanics, as we prove that the Schrödinger and Heisenberg pictures become inadequate to fully describe systems with infinitely many degrees of freedom because of inequivalent representations. This becomes important as we shift from single-particle systems to quantum field theory, statistical mechanics, and many other areas of study. The goal of this thesis is to introduce these mathematical topics rigorously and prove that they are necessary for further study in particle physics.

Date Created
2022-05
Agent

Mathematical Assessment of the Impact of Insecticide-Based Intervention on Malaria Transmission Dynamics

Description
Malaria is a deadly, infectious, parasitic disease that is caused by Plasmodium parasites and transmitted between humans via the bite of adult female Anopheles mosquitoes. The primary insecticide-based interventions used to control malaria are indoor residual spraying (IRS) and long-lasting insecticide nets (LLINs). Larvicides are another insecticide-based intervention, though one that is less commonly used. In this study, we present a mathematical model for malaria transmission dynamics in an endemic region that incorporates the use of IRS, LLINs, and larvicides. The model is rigorously analyzed to gain insight into the asymptotic stability of the disease-free equilibrium. Simulations of the model show that individual insecticide-based interventions will not realistically control malaria in regions with high endemicity, but an integrated vector management strategy involving the use of multiple interventions could lead to the effective control of the disease. This study suggests that the use of larvicides alongside IRS and LLINs in endemic regions may be more effective than using only IRS and LLINs.
Date Created
2022-05
Agent

Addition of Predatory Bacteria Decreases the Mortality of C. elegans

Description

Bdellovibrio bacteriovorus (BB) is a Gram-negative predatory bacterium that uses other Gram-negative bacteria to proliferate non-binarily. Due to the predatory nature of BB, researchers have proposed using it as a potential biocontrol agent against other Gram-negative bacteria. The in vivo effect of predatory bacteria on a living host lacks thorough investigation. This paper explores the action of BB inside and outside of C. elegans. To study its internal action, C. elegans were pre-infected with E. coli and then treated with BB. After BB treatment, worm survivability increased and morbidity decreased. Externally, BB modulated the environment around the nematode, which reduced infection rates and increased nematode lifespan and survivability. Together, the internal and external results suggest that BB has the capability to act as a living antibiotic, acting topically and internally to reduce infection rates.

Date Created
2022-05
Agent

Game Theory and its Applications to Infrastructure Security: A Bibliometric Analysis

Description
Game theory, the study of mathematical models and simulations that often play out like a game, is applicable to a plethora of disciplines, one of which is infrastructure security. This is a rather new and niche subject area, and our aim is to perform a bibliometric analysis of the thematic makeup of a selected body of publications in this area, as well as of trends in paper publication, journal contributions, country contributions, and authorship.
Date Created
2022-05
Agent

Beauty Performance Across Different Formats of Values-Based Panhellenic Formal Recruitment at Arizona State University

Description
This thesis explores the relationship between the performance of beauty and Potential New Member (PNM) success across various formats of formal sorority recruitment at ASU. It builds on existing scholarship in the economics of beauty premiums in labor markets, as well as sociological research on the intersection of beauty and human interaction. Interviews with women who went through formal recruitment across three different modalities (in-person, virtual, and hybrid) revealed themes suggesting that, under the current ASU Panhellenic policies, the performance of beauty hinders a recruitment process that is truly values-based.
Date Created
2022-05
Agent

Introduction to Unstructured Case Management

Description
Unstructured data management is an increasingly valuable asset for organizations today, as the amount of data organizations own increases every year. The purpose of this project is to detail the process that ServiceNow and CommonSpirit Health use in developing their new IntelliRoute model, which aims to classify and auto-resolve a significant portion of CommonSpirit Health’s more than 3,000,000 HR service-related cases. This paper examines typical strategies used to manage unstructured data as well as ServiceNow’s approach. Their approach focuses on data labelling: attaching a criticality sentiment to unstructured data and relating helpful knowledge base articles to it. The labelled data is then used to train an artificial intelligence model that automatically labels cases and refers appropriate knowledge articles.
Date Created
2022-05
Agent

Quantitative Image Corrections for EMCCD Camera

Description

Electron Multiplying Charge Coupled Device (EMCCD) cameras are widely used for fluorescence microscopy experiments. However, the quantitative determination of biological parameters depends on the characteristics of the unavoidably inhomogeneous illumination profile that gives rise to an image. It is therefore of interest to learn these inhomogeneous illumination profiles, which can vary dramatically across images, alongside the camera parameters through a detailed camera model. In this manuscript, we create a detailed model to learn inhomogeneous illumination profiles as well as all associated camera parameters. We achieve this within a Bayesian paradigm, allowing us to determine full distributions over the unknowns.

Date Created
2022-05
Agent