Full metadata
Title
Machine Learning: A Sentiment Analysis of Customer Reviews
Description
Machine learning is the process of training a computer with algorithms to learn from data and make informed predictions. In a world where large amounts of data are constantly collected, machine learning is an important tool for analyzing that data to find patterns and extract useful information. Machine learning applications extend to numerous fields; for this thesis, however, I chose to focus on machine learning from a business perspective, specifically e-commerce.
The e-commerce market uses information to target customers and drive business. More and more online services have become available, allowing consumers to make purchases and interact with an online system. For example, Amazon is one of the largest Internet-based retail companies. As people shop through its website, Amazon gathers huge amounts of data on its customers, from personal information to shopping history to viewing history. After purchasing a product, a customer may leave a review and give a rating based on their experience. Performing analytics on all of this data can provide insights that support more informed business and marketing decisions, which can lead to business growth and an improved customer experience.
For this thesis, I trained binary classification models on a publicly available Amazon product review dataset to predict whether a review expresses positive or negative sentiment. Sentiment analysis involves analyzing and encoding human language, then extracting the sentiment from the resulting values. In the business world, sentiment analysis provides value by revealing insights into customer opinions and behavior. In this thesis, I explain how to perform a sentiment analysis and compare the results of several machine learning models: k-nearest neighbors (KNN), logistic regression, decision trees, random forest, Naïve Bayes, linear support vector machines, and support vector machines with an RBF kernel.
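The encode-then-classify process described above can be illustrated with a minimal sketch. It assumes a bag-of-words encoding and a multinomial Naïve Bayes classifier with add-one smoothing, written in plain Python; it is not the thesis's own implementation. The function names and the four toy reviews are hypothetical and do not come from the Amazon dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(reviews, labels):
    """Count words per class (bag-of-words) and documents per class."""
    word_counts = {label: Counter() for label in set(labels)}
    class_counts = Counter(labels)
    for text, label in zip(reviews, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-posterior score."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: fraction of training reviews with this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1)
                / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for illustration only.
train_reviews = [
    "Great product, works perfectly and arrived fast",
    "I love it, excellent quality",
    "Terrible quality, broke after one day",
    "Awful experience, do not buy this",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = train_naive_bayes(train_reviews, train_labels)
prediction_good = predict(model, "excellent product, I love it")
prediction_bad = predict(model, "terrible, broke after a day")
print(prediction_good, prediction_bad)
```

The other algorithms listed in the description would replace only the classification step; the text still has to be encoded as numeric features first.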
Date Created
2020-05
Contributors
- Madaan, Shreya (Author)
- Meuth, Ryan (Thesis director)
- Nakamura, Mutsumi (Committee member)
- Computer Science and Engineering Program (Contributor)
- Dean, W.P. Carey School of Business (Contributor)
- Barrett, The Honors College (Contributor)
Extent
21 pages
Language
eng
Copyright Statement
In Copyright
Primary Member of
Series
Academic Year 2019-2020
Handle
https://hdl.handle.net/2286/R.I.56744
Level of coding
minimal
Cataloging Standards
System Created
- 2020-05-02 12:12:23
System Modified
- 2021-08-11 04:09:57