Amazon in Review

Analyzing the Validity of Amazon User Evaluation

Spring 2019

How many times have you excitedly ordered a highly-rated item online, only to find the product is lacking in quality or features despite the glowing reviews?

The Problem of Fake Reviews

As the e-commerce giant has grown, it has fought a losing battle with fake reviews. In every category you look at, many of the most popular products come from no-name brands that somehow have thousands of shining 5-star reviews. On closer inspection, however, the most up-voted reviews tell a different story: that of a cheaply made product. This pattern strongly suggests that these companies have been paying third parties to post thousands of false testimonials to boost sales.

Clearly, there is a strong motivation for businesses to engage in this sort of unethical activity. Research has shown that positive reviews influence 90% of purchase decisions and 86% of non-purchase decisions.

Key Indicators of Inauthenticity

Users:

1. Have left only one review

2. Have rated other products from the same company or its direct competitors

3. Left all of their reviews within a short period of time

4. Gave only 5-star ratings

5. Wrote extremely short, non-descriptive reviews

Reviews:

1. Sudden spikes in the number of reviews

2. Repetitive or exact words/phrases

3. Wrong product name

4. Extremely short length

5. Post date: older reviews tend to be more accurate

6. No photo included
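As a rough illustration of how a few of the user-level indicators could be turned into checks, here is a minimal Python sketch. It is not the code used in this project (that work was done in R). The function name, the dictionary keys, and the cut-offs (`short_words`, `burst_days`) are all assumptions made for the example.

```python
from datetime import date

def user_indicator_flags(reviews, short_words=10, burst_days=7):
    """Return heuristic red flags for one reviewer's review history.

    `reviews` is a non-empty list of dicts with hypothetical keys:
    'stars' (int), 'date' (datetime.date), and 'text' (str).
    The cut-offs are illustrative, not values from the study.
    """
    if not reviews:
        raise ValueError("need at least one review")
    dates = [r["date"] for r in reviews]
    span_days = (max(dates) - min(dates)).days
    return {
        "single_review": len(reviews) == 1,
        "all_five_stars": all(r["stars"] == 5 for r in reviews),
        # A burst only means something when there is more than one review.
        "posted_in_burst": len(reviews) > 1 and span_days <= burst_days,
        "all_short": all(len(r["text"].split()) < short_words for r in reviews),
    }

# Made-up history for a single reviewer.
history = [
    {"stars": 5, "date": date(2019, 3, 1), "text": "Great product!"},
    {"stars": 5, "date": date(2019, 3, 2), "text": "Love it"},
    {"stars": 5, "date": date(2019, 3, 4), "text": "Works great, five stars"},
]
flags = user_indicator_flags(history)
```

No single flag proves a review is fake; the flags are only useful in combination.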

A Solution through Data Analysis

By analyzing Amazon reviews across different products, we can screen reviews for inauthenticity at scale. We can comb through them for telltale characteristics like those listed above and use tools that examine every one of a customer's past reviews. Putting all these indicators together gives us a fuller picture of what's going on.

After all, fake reviews hurt everyone by eroding customer confidence and discouraging legitimate sellers on Amazon's marketplace.


Case Study: Apple Earbuds

I picked this product because I expected it to have a high number of fake reviews. Apple does not usually sell its products outside its own website. In addition, the more negative reviews a product has (25% in this case), the greater the motivation for review manipulation.

Screen Shot 2020-06-03 at 1.23.08 AM.png

Ratings Analysis

Based on the graphs above, we can see that this product has a large number of extremely positive and extremely negative reviews, which could be a sign of manipulation. Positive reviews also tended to be shorter than reviews with other ratings. In the chart on the far right, I noticed several periods when only the number of 5-star ratings jumped. These could be months in which the company decided to buy fake reviews.
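The jumps in 5-star ratings were spotted by eye on the chart, but the same idea can be sketched in code. The Python example below is only an illustration, not the method used here. It counts 5-star reviews per month and flags any month whose count exceeds a multiple of the median monthly count. The function name, the `factor` of 3, and the sample data are assumptions. As a simplification, months with no 5-star reviews are left out of the median.

```python
from collections import Counter
from statistics import median

def five_star_spikes(reviews, factor=3.0):
    """Return the months whose 5-star count exceeds factor * median.

    `reviews` is an iterable of (month, stars) pairs, where month is a
    'YYYY-MM' string. Both the input shape and `factor` are hypothetical.
    """
    counts = Counter(month for month, stars in reviews if stars == 5)
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(m for m, c in counts.items() if c > factor * typical)

# Made-up data: March has far more 5-star reviews than the other months.
sample = (
    [("2019-01", 5)] * 2
    + [("2019-02", 5)] * 3
    + [("2019-03", 5)] * 12
    + [("2019-04", 5)] * 2
    + [("2019-03", 1)] * 4  # 1-star reviews are ignored by the count
)
spikes = five_star_spikes(sample)
```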

Screen Shot 2020-06-03 at 1.25.08 AM.png

Review Reliability Analysis

After going through all the reviews and manually coding them as real or fake, I made several additional plots to look for relationships with the factors I had scraped. In general, I found that reviewers who left fake reviews tended to have a much higher average rating across their reviews than normal users. This is likely because they leave extremely positive fake reviews for many products. In addition, consistent with expectations, fake reviews tended to be very short, some even less than a sentence.

Prediction Model

I divided the data from the earphone reviews into 80% training data and 20% test data. After testing several models, I found that logistic regression with a threshold of 0.39 had the highest accuracy on the test data, with 80.34% correct. I tested various combinations of factors to arrive at this conclusion.
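For readers who want to see the mechanics, here is a small Python sketch of an 80/20 random split and of applying a 0.39 threshold to predicted probabilities. It does not fit a model, and it is not the original R code. It assumes the model outputs the probability that a review is real (label 1) and that a probability at or above the threshold counts as real. The function names and example numbers are made up.

```python
import random

def train_test_split(rows, test_frac=0.2, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]

def classify(prob_real, threshold=0.39):
    """Map a predicted probability of 'real' to a label (1 = real, 0 = fake)."""
    return 1 if prob_real >= threshold else 0

def accuracy(probs, labels, threshold=0.39):
    """Share of reviews whose thresholded prediction matches the hand label."""
    hits = sum(classify(p, threshold) == y for p, y in zip(probs, labels))
    return hits / len(labels)

train, test = train_test_split(range(100))

# Made-up probabilities and hand-coded labels for four reviews.
example_probs = [0.90, 0.45, 0.30, 0.20]
example_labels = [1, 1, 0, 1]
example_accuracy = accuracy(example_probs, example_labels)
```

A threshold below 0.5 makes the model more willing to call a review real, so fewer reviews get flagged.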

After applying the prediction model, I found that the adjusted rating (the average of the ratings the model identified as reliable) for the product was 2.7 stars out of 5, with 35.84% of the reviews flagged as unreliable. Compare this with the Amazon rating of 3.8 stars out of 5.
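The adjusted rating is simple arithmetic once each review has a reliable/unreliable label. The Python sketch below shows the calculation on made-up numbers. The function name and input shape are assumptions, and the figures are not from the earbuds data.

```python
def adjusted_rating(reviews):
    """Return (mean stars of reliable reviews, percent flagged unreliable).

    `reviews` is a non-empty list of (stars, is_reliable) pairs, a
    hypothetical input shape. The mean is None if nothing is reliable.
    """
    reliable = [stars for stars, ok in reviews if ok]
    flagged_pct = 100 * (len(reviews) - len(reliable)) / len(reviews)
    mean = sum(reliable) / len(reliable) if reliable else None
    return mean, flagged_pct

# Made-up example: two 5-star reviews are flagged as unreliable.
toy = [(5, False), (5, False), (5, True), (1, True),
       (2, True), (4, True), (3, True), (3, True)]
raw_mean = sum(stars for stars, _ in toy) / len(toy)
adj_mean, pct_flagged = adjusted_rating(toy)
```

In this toy case the raw average is 3.5 stars, the adjusted average is 3.0 stars, and 25% of the reviews are flagged.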


Testing Amazon’s Most Popular Products

Scroll through the top products in each category to see how they stack up.

Beauty & Personal Care

Dark Spot Corrector Treatment by Olay ProX

Amazon's Choice for Olay

Amazon Rating: 3.9 stars

Number of Reviews: 1,081 reviews

Percentage of Unreliable Reviews: 24.1%

Adjusted Rating: 3.77 stars

essence | Lash Princess False Lash Effect Mascara

#1 Bestseller in Mascara

Amazon Rating: 4.0 stars

Number of Reviews: 2,427 reviews

Percentage of Unreliable Reviews: 21.7%

Adjusted Rating: 3.64 stars

Maybelline Instant Age Rewind Eraser Dark Circles Treatment Concealer

Amazon's Choice for Concealer

Amazon Rating: 4.1 stars

Number of Reviews: 5,412 reviews

Percentage of Unreliable Reviews: 32.6%

Adjusted Rating: 3.80 stars

Revlon One-Step Hair Dryer & Volumizer

Amazon's Choice for Hair Dryer & Styler

Amazon Rating: 4.4 stars

Number of Reviews: 5,634* reviews

Percentage of Unreliable Reviews: 26.6%

Adjusted Rating: 4.31 stars

Versace Bright Crystal Eau de Toilette Spray

Amazon Rating: 4.4 stars

Number of Reviews: 1,863* reviews

Percentage of Unreliable Reviews: 40.6%

Adjusted Rating: 4.13 stars

Home & Kitchen

COSORI Air Fryer

Amazon's Choice for Air Fryer

Amazon Rating: 4.7 stars

Number of Reviews: 981* reviews

Percentage of Unreliable Reviews: 21.6%

Adjusted Rating: 4.67 stars

Instant Pot Duo Mini

#1 Bestseller in Kitchen & Dining

Amazon Rating: 4.6 stars

Number of Reviews: 32,556* reviews

Percentage of Unreliable Reviews: 36.4%

Adjusted Rating: 4.39 stars

Keurig K-Classic Coffee Maker K-Cup Pod

Amazon's Choice for Keurig Coffee Maker

Amazon Rating: 3.9 stars

Number of Reviews: 9,309* reviews

Percentage of Unreliable Reviews: 37.6%

Adjusted Rating: 3.79 stars

Linenspa LS06TTGRSP 6 Inch Innerspring Mattress

Amazon Rating: 3.9 stars

Number of Reviews: 4,353* reviews

Percentage of Unreliable Reviews: 33.6%

Adjusted Rating: 3.60 stars

Mellanni Bed Sheet Set - Brushed Microfiber 1800 Bedding

Amazon's Choice for Bed Sheet Set

Amazon Rating: 4.4 stars

Number of Reviews: 52,884* reviews

Percentage of Unreliable Reviews: 31.7%

Adjusted Rating: 4.11 stars

Sports & Outdoors

Coleman Dome Tent for Camping

Amazon's Choice for 2-Person Tent

Amazon Rating: 4.4 stars

Number of Reviews: 6,752* reviews

Percentage of Unreliable Reviews: 27.3%

Adjusted Rating: 4.22 stars

Fit Simplify Resistance Loop Exercise Bands

Amazon's Choice for Resistance Bands

Amazon Rating: 4.2 stars

Number of Reviews: 9,334* reviews

Percentage of Unreliable Reviews: 42.4%

Adjusted Rating: 3.61 stars

LETSCOM Fitness Tracker HR

Amazon Rating: 3.9 stars

Number of Reviews: 4,114* reviews

Percentage of Unreliable Reviews: 25.4%

Adjusted Rating: 3.50 stars

Sport-Brella SPF 50+ Adjustable Umbrella

#1 Bestseller in Camping Sun Shelters

Amazon Rating: 4.0 stars

Number of Reviews: 4,410* reviews

Percentage of Unreliable Reviews: 26.1%

Adjusted Rating: 3.75 stars

UPOWEX Resistance Bands Set

#1 New Release in Exercise Bands

Amazon Rating: 4.8 stars

Number of Reviews: 2,000* reviews

Percentage of Unreliable Reviews: 42.3%

Adjusted Rating: 4.78 stars

*Only the most recent 1,200 reviews were sampled from products with over 1,500 reviews due to limitations in Web Scraper's processing time.


Methodology

Collection

The Amazon product was pulled from the Best Sellers page. I picked a product that I believed could have engaged in ratings manipulation.

Data was gathered using the Web Scraper Chrome extension. I scraped all the reviews for each product and also went into each reviewer's profile to scrape the reviews they had left for previous products. This resulted in two separate CSVs: one with the review data for a specific product and one with all the past reviews of each user.

Due to the scraper's limited speed, I was unable to scrape more than 1,500 reviews per product (otherwise the task would take 2+ hours). Since my analysis showed that date was not a significant predictor of reliability, I scraped only the 1,200 most recent reviews for products with over 1,500 reviews.
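The sampling rule can be stated compactly in code. In practice it was applied while scraping, so the Python sketch below only illustrates the rule and is not part of the actual pipeline. The function name, the `date` key, and the generated records are assumptions.

```python
from datetime import date, timedelta

def sample_recent(reviews, limit=1500, keep=1200):
    """Keep every review up to `limit`; beyond that, keep the `keep` newest.

    `reviews` is a list of dicts with a hypothetical 'date' key
    holding a datetime.date.
    """
    if len(reviews) <= limit:
        return list(reviews)
    newest_first = sorted(reviews, key=lambda r: r["date"], reverse=True)
    return newest_first[:keep]

# Made-up product with 2,000 reviews, one per day.
start = date(2014, 1, 1)
big_product = [{"date": start + timedelta(days=i)} for i in range(2000)]
sampled = sample_recent(big_product)

# A product at or under the limit is left untouched.
small_product = big_product[:1000]
untouched = sample_recent(small_product)
```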

Selector graph for the users

Selector graph for each review

Wrangling

Data was cleaned in R. This included trimming unnecessary strings, converting text to numbers, and formatting dates. Then, to calculate aggregate statistics such as the average date deviation of all of a user's reviews or their average star rating, the reviewer database was condensed into one entry per user. The two data frames for each product, reviews and reviewers, were then joined. During this process, I also performed sentiment analysis on the review text and title.
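The condense-then-join step was done in R, but its shape is easy to show in a language-neutral way. The Python sketch below is a stand-in for that step, not the original code. It collapses a reviewer-history table into one summary row per user and attaches that summary to each product review. All function names, column names, and sample records are hypothetical. The statistics shown (average stars, star deviation, date range, average length) are only examples of the aggregates described.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

def summarize_users(user_reviews):
    """Condense a reviewer-history table into one summary dict per user."""
    by_user = defaultdict(list)
    for row in user_reviews:
        by_user[row["user"]].append(row)
    summary = {}
    for user, rows in by_user.items():
        stars = [r["stars"] for r in rows]
        dates = [r["date"] for r in rows]
        summary[user] = {
            "n_reviews": len(rows),
            "avg_stars": mean(stars),
            "sd_stars": pstdev(stars),  # population standard deviation
            "date_range_days": (max(dates) - min(dates)).days,
            "avg_length": mean(len(r["text"].split()) for r in rows),
        }
    return summary

def join_reviews(product_reviews, summary):
    """Left-join each product review with its author's summary row."""
    return [{**r, **summary.get(r["user"], {})} for r in product_reviews]

# Made-up reviewer histories and one product review.
user_reviews = [
    {"user": "A", "stars": 5, "date": date(2019, 3, 1), "text": "Great"},
    {"user": "A", "stars": 5, "date": date(2019, 3, 3), "text": "Love it so much"},
    {"user": "B", "stars": 2, "date": date(2018, 6, 10),
     "text": "Stopped working after two weeks"},
]
summary = summarize_users(user_reviews)
joined = join_reviews([{"user": "A", "stars": 5, "text": "Great"}], summary)
```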

Modeling

To build the prediction model, I went through the Amazon data for the Apple earbuds and individually coded each review as authentic or unreliable based on my own judgement, since Amazon does not release this information. Next, I split the EarPods data into training and testing sets using random sampling. Then, I used the predictors to build a logistic regression model to predict the binary variable (0 = fake, 1 = real). I tweaked the predicting factors and threshold value and ended with 80.34% accuracy on the test data. I also created a table of correctly and incorrectly assigned reviews to see the model's precision and recall.
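As a sketch of how precision and recall fall out of such a table, the Python example below counts the four cells of a confusion table and derives the metrics from them. It treats "fake" (label 0) as the class of interest, which is an assumption. The function name and the labels are made up and are not the project's results.

```python
def fake_detection_metrics(predicted, actual, positive=0):
    """Compute precision, recall, and accuracy for the `positive` class.

    Labels follow the write-up's coding (0 = fake, 1 = real); treating
    'fake' as the positive class is an assumption of this sketch.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    tn = sum(p != positive and a != positive for p, a in pairs)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(pairs),
    }

# Made-up predictions and hand-coded labels for eight reviews.
predicted = [0, 0, 1, 1, 0, 1, 1, 1]
actual = [0, 1, 1, 1, 0, 0, 1, 1]
metrics = fake_detection_metrics(predicted, actual)
```

Here precision is the share of flagged reviews that really were fake, and recall is the share of fake reviews that were caught.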

The most significant negative predictors were the number of stars on the review and the average star rating for the user. Significant positive predictors were the length of the review, the average length of the user's reviews, the deviation in the user's star ratings, and the date range of all the reviews by that user.
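To make the direction of these effects concrete, here is a minimal Python sketch of how a logistic regression turns predictors into a probability that a review is real. The model itself was fit in R. The feature names, the intercept, and every coefficient value below are hypothetical. They were chosen only to match the signs described above and are not taken from the fitted model.

```python
import math

# Hypothetical coefficients: only the signs follow the description above.
COEFS = {
    "stars": -0.6,                  # negative predictor
    "user_avg_stars": -0.8,         # negative predictor
    "length_words": 0.02,           # positive predictor
    "user_avg_length": 0.01,        # positive predictor
    "user_sd_stars": 0.5,           # positive predictor
    "user_date_range_days": 0.002,  # positive predictor
}
INTERCEPT = 4.0  # hypothetical

def prob_real(features):
    """Logistic function of a weighted sum: estimated P(review is real)."""
    z = INTERCEPT + sum(COEFS[name] * features[name] for name in COEFS)
    return 1.0 / (1.0 + math.exp(-z))

# Made-up reviews: a short 5-star rave from an all-5-star account versus
# a long 3-star review from a user with a varied, long-running history.
gushing = {"stars": 5, "user_avg_stars": 5.0, "length_words": 4,
           "user_avg_length": 5, "user_sd_stars": 0.0,
           "user_date_range_days": 2}
detailed = {"stars": 3, "user_avg_stars": 3.6, "length_words": 120,
            "user_avg_length": 80, "user_sd_stars": 1.2,
            "user_date_range_days": 400}
p_gushing = prob_real(gushing)
p_detailed = prob_real(detailed)
```

With these illustrative weights, the short rave lands below a 0.39 threshold and the detailed review lands above it.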

I also tested other types of models (e.g., classification trees), but logistic regression had the highest accuracy on the test data.

All visualizations were created in R using the ggplot2 package.