Enhancing Binary Analysis through Cognitive Load Theory

171701-Thumbnail Image.png
Description
Reverse engineering is a process focused on gaining an understanding for the intricaciesof a system. This practice is critical in cybersecurity as it promotes the findings and patching of vulnerabilities as well as the counteracting of malware. Disassemblers and decompilers have become

Reverse engineering is a process focused on gaining an understanding for the intricaciesof a system. This practice is critical in cybersecurity as it promotes the findings and patching of vulnerabilities as well as the counteracting of malware. Disassemblers and decompilers have become essential when reverse engineering due to the readability of information they transcribe from binary files. However, these tools still tend to produce involved and complicated outputs that hinder the acquisition of knowledge during binary analysis. Cognitive Load Theory (CLT) explains that this hindrance is due to the human brain’s inability to process superfluous amounts of data. CLT classifies this data into three cognitive load types — intrinsic, extraneous, and germane — that each can help gauge complex procedures. In this research paper, a novel program call graph is presented accounting for these CLT principles. The goal of this graphical view is to reduce the cognitive load tied to the depiction of binary information and to enhance the overall binary analysis process. This feature was implemented within the binary analysis tool, angr and it’s user interface counterpart, angr-management. Additionally, this paper will examine a conducted user study to quantitatively and qualitatively evaluate the effectiveness of the newly proposed proximity view (PV). The user study includes a binary challenge solving portion measured by defined metrics and a survey phase to receive direct participant feedback regarding the view. The results from this study show statistically significant evidence that PV aids in challenge solving and improves the overall understanding binaries. The results also signify that this improvement comes with the cost of time. The survey section of the user study further indicates that users find PV beneficial to the reverse engineering process, but additional information needs to be included in future developments.
Date Created
2022
Agent

Extracting Semantic Information from Online Conversations to Enhance Cyber Defense

171434-Thumbnail Image.png
Description
Recent advances in techniques allow the extraction of Cyber Threat Information (CTI) from online content, such as social media, blog articles, and posts in discussion forums. Most research work focuses on social media and blog posts since their content is

Recent advances in techniques allow the extraction of Cyber Threat Information (CTI) from online content, such as social media, blog articles, and posts in discussion forums. Most research work focuses on social media and blog posts since their content is often contributed by cybersecurity experts and is usually of cleaner formats. While posts in online forums are noisier and less structured, online forums attract more users than other sources and contain much valuable information that may help predict cyber threats. Therefore, effectively extracting CTI from online forum posts is an important task in today's data-driven cybersecurity defenses. Many Natural Language Processing (NLP) techniques are applied to the cybersecurity domains to extract the useful information, however, there is still space to improve. In this dissertation, a new Named Entity Recognition framework for cybersecurity domains and thread structure construction methods for unstructured forums are proposed to support the extraction of CTI. Then, extend them to filter the posts in the forums to eliminate non cybersecurity related topics with Cyber Attack Relevance Scale (CARS), extract the cybersecurity knowledgeable users to enhance more information for enhancing cybersecurity, and extract trending topic phrases related to cyber attacks in the hackers forums to find the clues for potential future attacks to predict them.
Date Created
2022
Agent

A Preliminary Approach for Rewriting AVX Instructions for Binary Decompilation

166285-Thumbnail Image.png
Description

A proposed solution for the decompilation of binaries that include Intel Advanced Vector Extension instruction sets is presented, along with an explanation of the methodology and an overview of the difficulties encountered with the current decompilation process. A simple approach

A proposed solution for the decompilation of binaries that include Intel Advanced Vector Extension instruction sets is presented, along with an explanation of the methodology and an overview of the difficulties encountered with the current decompilation process. A simple approach was made to convert vector operations into scalar operations reflected in new assembly code. This new code overwrites instructions using AVX registers so that all available decompilation software is able to properly decompile binaries using these registers. The results show that this approach is functional and successful at resolving the decompilation problem. However, there may be a way to optimize the performance of the output. In conclusion, our theoretical work can easily be extended and applied to a wider range of instructions and instruction sets to further resolve related decompilation issues with binaries utilizing external instructions.

Date Created
2022-05
Agent

Exploring the Usage of Drones to Perform Network Reconnaissance and Other Wireless
Network Exploitation Methods

165086-Thumbnail Image.png
Description

Wardriving is when prospective malicious hackers drive with a portable computer to sniff out and map potentially vulnerable networks. With the advent of smart homes and other Internet of Things devices, this poses the possibility of more unsecure targets. The

Wardriving is when prospective malicious hackers drive with a portable computer to sniff out and map potentially vulnerable networks. With the advent of smart homes and other Internet of Things devices, this poses the possibility of more unsecure targets. The hardware available to the public has also miniaturized and gotten more powerful. One no longer needs to carry a complete laptop to carry out network mapping. With this miniaturization and greater popularity of quadcopter technology, the two can be combined to create a more efficient wardriving setup in a potentially more target-rich environment. Thus, we set out to create a prototype as a proof of concept of this combination. By creating a bracket for a Raspberry Pi to be mounted to a drone with other wireless sniffing equipment, we demonstrate that one can use various off the shelf components to create a powerful network detection device. In this write up, we also outline some of the challenges encountered by combining these two technologies, as well as the solutions to those challenges. Adding payload weight to drones that are not initially designed for it causes detrimental effects to various characteristics such as flight behavior and power consumption. Less computing power is available due to the miniaturization that must take place for a drone-mounted solution. Communication between the miniature computer and a ground control computer is also essential in overall system operation. Below, we highlight solutions to these various problems as well as improvements that can be implemented for maximum system effectiveness.

Date Created
2022-05
Agent

Exploring the Usage of Drones to Perform Network Reconnaissance and Other Wireless Network Exploitation Methods

165085-Thumbnail Image.png
Description
Wardriving is when prospective malicious hackers drive with a portable computer to sniff out and map potentially vulnerable networks. With the advent of smart homes and other Internet of Things devices, this poses the possibility of more unsecure targets. The

Wardriving is when prospective malicious hackers drive with a portable computer to sniff out and map potentially vulnerable networks. With the advent of smart homes and other Internet of Things devices, this poses the possibility of more unsecure targets. The hardware available to the public has also miniaturized and gotten more powerful. One no longer needs to carry a complete laptop to carry out network mapping. With this miniaturization and greater popularity of quadcopter technology, the two can be combined to create a more efficient wardriving setup in a potentially more target-rich environment. Thus, we set out to create a prototype as a proof of concept of this combination. By creating a bracket for a Raspberry Pi to be mounted to a drone with other wireless sniffing equipment, we demonstrate that one can use various off the shelf components to create a powerful network detection device. In this write up, we also outline some of the challenges encountered by combining these two technologies, as well as the solutions to those challenges. Adding payload weight to drones that are not initially designed for it causes detrimental effects to various characteristics such as flight behavior and power consumption. Less computing power is available due to the miniaturization that must take place for a drone-mounted solution. Communication between the miniature computer and a ground control computer is also essential in overall system operation. Below, we highlight solutions to these various problems as well as improvements that can be implemented for maximum system effectiveness.
Date Created
2022-05
Agent

Analyzing, Understanding, and Improving Predicted Variable Names in Decompiled Binary Code

161705-Thumbnail Image.png
Description
Reverse engineers use decompilers to analyze binaries when their source code is unavailable. A binary decompiler attempts to transform binary programs to their corresponding high-level source code by recovering and inferring the information that was lost during the compilation process.

Reverse engineers use decompilers to analyze binaries when their source code is unavailable. A binary decompiler attempts to transform binary programs to their corresponding high-level source code by recovering and inferring the information that was lost during the compilation process. One type of information that is lost during compilation is variable names, which are critical for reverse engineers to analyze and understand programs. Traditional binary decompilers generally use automatically generated, placeholder variable names that are meaningless or have little correlation with their intended semantics. Having correct or meaningful variable names in decompiled code, instead of placeholder variable names, greatly increases the readability of decompiled binary code. Decompiled Identifier Renaming Engine (DIRE) is a state-of-the-art, deep-learning-based solution that automatically predicts variable names in decompiled binary code. However, DIRE's prediction result is far from perfect. The first goal of this research project is to take a close look at the current state-of-the-art solution for automated variable name prediction on decompilation output of binary code, assess the prediction quality, and understand how the prediction result can be improved. Then, as the second goal of this research project, I aim to improve the prediction quality of variable names. With a thorough understanding of DIRE's issues, I focus on improving the quality of training data. This thesis proposes a novel approach to improving the quality of the training data by normalizing variable names and converting their abbreviated forms to their full forms. I implemented and evaluated the proposed approach on a data set of over 10k and 20k binaries and showed improvements over DIRE.
Date Created
2021
Agent

Exploiting and Mitigating Advanced Security Vulnerabilities

161278-Thumbnail Image.png
Description
Cyberspace has become a field where the competitive arms race between defenders and adversaries play out. Adaptive, intelligent adversaries are crafting new responses to the advanced defenses even though the arms race has resulted in a gradual improvement of the

Cyberspace has become a field where the competitive arms race between defenders and adversaries play out. Adaptive, intelligent adversaries are crafting new responses to the advanced defenses even though the arms race has resulted in a gradual improvement of the security posture. This dissertation aims to assess the evolving threat landscape and enhance state-of-the-art defenses by exploiting and mitigating two different types of emerging security vulnerabilities. I first design a new cache attack method named Prime+Count which features low noise and no shared memory needed.I use the method to construct fast data covert channels. Then, I propose a novel software-based approach, SmokeBomb, to prevent cache side-channel attacks for inclusive and non-inclusive caches based on the creation of a private space in the L1 cache. I demonstrate the effectiveness of SmokeBomb by applying it to two different ARM processors with different instruction set versions and cache models and carry out an in-depth evaluation. Next, I introduce an automated approach that exploits a stack-based information leak vulnerability in operating system kernels to obtain sensitive data. Also, I propose a lightweight and widely applicable runtime defense, ViK, for preventing temporal memory safety violations which can lead attackers to have arbitrary code execution or privilege escalation together with information leak vulnerabilities. The security impact of temporal memory safety vulnerabilities is critical, but,they are difficult to identify because of the complexity of real-world software and the spatial separation of allocation and deallocation code. Therefore, I focus on preventing not the vulnerabilities themselves, but their exploitation. ViK can effectively protect operating system kernels and user-space programs from temporal memory safety violations, imposing low runtime and memory overhead.
Date Created
2021
Agent

Natural Intent: The Use and Misuse of Intents in Android Applications

158486-Thumbnail Image.png
Description
The Java programing language was implemented in such a way as to limit the amount of possible ways that a program written in Java could be exploited. Unfortunately, all of the protections and safeguards put in place for Java

The Java programing language was implemented in such a way as to limit the amount of possible ways that a program written in Java could be exploited. Unfortunately, all of the protections and safeguards put in place for Java can be circumvented if a program created in Java utilizes internal or external libraries that were created in a separate, insecure language such as C or C++. A secure Java program can then be made insecure and susceptible to even classic vulnerabilities such as stack overflows, string format attacks, and heap overflows and corruption. Through the internal or external libraries included in the Java program, an attacker could potentially hijack the execution flow of the program. Once the Attacker has control of where and how the program executes, the attacker can spread their influence to the rest of the system.

However, since these classic vulnerabilities are known weaknesses, special types of protections have been added to the compilers which create the executable code and the systems that run them. The most common forms of protection include Address SpaceLayout Randomization (ASLR), Non-eXecutable stack (NX Stack), and stack cookies or canaries. Of course, these protections and their implementations vary depending on the system. I intend to look specifically at the Android operating system which is used in the daily lives of a significant portion of the planet. Most Android applications execute in a Java context and leave little room for exploitability, however, there are also many applications which utilize external libraries to handle more computationally intensive tasks.

The goal of this thesis is to take a closer look at such applications and the protections surrounding them, especially how the default system protections as mentioned above are implemented and applied to the vulnerable external libraries. However, this is only half of the problem. The attacker must get their payload inside of the application in the first place. Since it is necessary to understand how this is occurring, I will also be exploring how the Android operating system gives outside information to applications and how developers have chosen to use that information.
Date Created
2020
Agent

Everything You Ever Wanted to Know About Bitcoin Mixers (But Were Afraid to Ask)

158251-Thumbnail Image.png
Description
The lack of fungibility in Bitcoin has forced its userbase to seek out tools that can heighten their anonymity. Third-party Bitcoin mixers utilize obfuscation techniques to protect participants from blockchain analysis. In recent years, various centralized and decentralized Bitcoin mixing

The lack of fungibility in Bitcoin has forced its userbase to seek out tools that can heighten their anonymity. Third-party Bitcoin mixers utilize obfuscation techniques to protect participants from blockchain analysis. In recent years, various centralized and decentralized Bitcoin mixing implementations have been proposed in academic literature. Although these methods depict a threat-free environment for users to preserve their anonymity, public Bitcoin mixers continue to be associated with theft and poor implementation.

This research explores the public Bitcoin mixer ecosystem to identify if today's mixing services have adopted academically proposed solutions. This is done through real-world interactions with publicly available mixers to analyze both implementation and resistance to common threats in the mixing landscape. First, proposed decentralized and centralized mixing protocols found in literature are outlined. Then, data is presented from 19 publicly announced mixing services available on the deep web and clearnet. The services are categorized based on popularity with the Bitcoin community and experiments are conducted on five public mixing services: ChipMixer, MixTum, Bitcoin Mixer, CryptoMixer, and Sudoku Wallet.

The results of the experiments highlight a clear gap between public and proposed Bitcoin mixers in both implementation and security. Today's mixing services focus on presenting users with a false sense of control to gain their trust rather then employing secure mixing techniques. As a result, the five selected services lack implementation of academically proposed techniques and display poor resistance to common mixer-related threats.
Date Created
2020
Agent

A Two-Way Pushdown Automaton for the Lua Language

131235-Thumbnail Image.png
Description
A two-way deterministic finite pushdown automaton ("2PDA") is developed for the Lua language. This 2PDA is evaluated against both a purpose-built Lua syntax test suite and the test suite used by the reference implementation of Lua, and fully passes both.
Date Created
2020-05
Agent