Full metadata
Title
Big Data Network Analysis of Genetic Variation and Gene Expression in Individuals with Breast Cancer
Description
The advent of big data analytics tools and frameworks has enabled many new approaches to research and analysis, making data sets that were previously too large or complex more accessible and providing methods to collect, store, and investigate non-traditional data. These tools are beginning to be applied in more creative ways, including to improve on traditional computation methods through distributed computing. Statistical analysis of expression quantitative trait loci (eQTL) data has classically been performed with the open-source tool PLINK, which runs on high-performance computing (HPC) systems. However, progress has been made in running the statistical analysis within the ecosystem of the big data framework Hadoop, resulting in decreased run time, a reduced storage footprint, less job micromanagement, and increased data accessibility. Now that the data can be more readily manipulated, analyzed, and accessed, there are opportunities to use the modularity and power of Hadoop to process the data further. This project focuses on adding a component to the data pipeline that will perform graph analysis on the data. This will provide more insight into the relationship between genetic differences among individuals with breast cancer and the resulting variation, if any, in gene expression. Further, the investigation will examine whether anything can be gained from a shift in perspective: applying tools used in classical networking contexts (such as the Internet) to genetically derived networks.
Date Created
2016-12
Contributors
- Randall, Jacob Christopher (Author)
- Buetow, Kenneth (Thesis director)
- Meuth, Ryan (Committee member)
- Almalih, Sara (Committee member)
- Computer Science and Engineering Program (Contributor)
- Barrett, The Honors College (Contributor)
Topical Subject
Resource Type
Extent
26 pages
Language
eng
Copyright Statement
In Copyright
Primary Member of
Series
Academic Year 2016-2017
Handle
https://hdl.handle.net/2286/R.I.40866
Level of coding
minimal
Cataloging Standards
System Created
- 2017-10-30 02:50:58
System Modified
- 2021-08-11 04:09:57