efea-umich/SI-206-Final

Isolating Satire with Machine Learning

SI 206 Final Project

By Efe Akinci and Michael Zhou

Notes

To keep the repository small, we have not included the training data (~111 MB) used to train the machine learning model.
If you are interested in exploring the training data, you can find the dataset we used here.

Introduction

Onion articles (and other satirical articles) are pretty funny. Even though they are written like news, we can usually tell that they are meant to be satire. But, we wondered, could a computer do the same? We used Kaggle, Keras, TensorFlow, pandas, and other tools to find out.

"I don't have the equipment to train the machine learning model."

You can find a link to our pre-trained weights here (link is currently restricted to U-M accounts).
Warning: The file download is quite large at ~140 MB.

More info

Our final project report includes more information on what this project does, how it works, and how it could be improved. It is included in our project as Final_Project_Report.pdf.

Directions to Important Files

Getting Data

Our data collection file is onion_farmer.py

APIs/Websites Used: CNN, The Onion, AP News, Clickhole, Pushshift API (for Reddit).

Data is stored in the database file static\onion_barn.db.

To limit the number of data points collected to 25 at a time, we scrape webpages one article at a time. For the Pushshift API, we used a recursive call so that we could collect more than 25 articles in a single run without exceeding the limit of 25 per call.
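The recursive approach can be sketched roughly as follows. This is a minimal illustration, not the code from onion_farmer.py: `collect_posts`, `fake_fetch`, and the `created_utc` field used as a paging cursor are assumptions, and `fake_fetch` stands in for the real network call so the example runs offline.

```python
PAGE_LIMIT = 25  # per-call cap described above


def collect_posts(fetch_page, total, before=None):
    """Recursively gather up to `total` posts, never asking for more
    than PAGE_LIMIT in a single call (hypothetical helper)."""
    if total <= 0:
        return []
    batch = fetch_page(min(PAGE_LIMIT, total), before)
    if not batch:
        # The source ran out of posts before we reached `total`.
        return []
    # Use the oldest timestamp in this batch as the cursor for the next call.
    oldest = batch[-1]["created_utc"]
    return batch + collect_posts(fetch_page, total - len(batch), oldest)


def fake_fetch(size, before):
    """Stand-in for an API call: 100 posts with timestamps 100..1,
    returned newest first, strictly older than `before`."""
    start = 100 if before is None else before - 1
    stop = max(start - size, 0)
    return [{"id": t, "created_utc": t} for t in range(start, stop, -1)]


posts = collect_posts(fake_fetch, 60)
print(len(posts))  # 60, gathered in calls of 25, 25, and 10
```

Passing the fetch function in as an argument keeps the paging logic separate from the HTTP details, so a real request function could be swapped in for `fake_fetch`.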

Processing Data

Averages, percentages, and more are calculated from the data in visuals.py.

Our JOIN statement is used in most_commented on line 34 of visuals.py.
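For readers unfamiliar with JOINs in SQLite, here is a minimal sketch of what a "most commented" query might look like. The `posts` and `sources` tables, their columns, and the sample rows are all hypothetical; the actual schema of onion_barn.db and the real query in visuals.py may differ.

```python
import sqlite3


def most_commented(conn, limit=5):
    """Return (title, source name, comment count) for the most-commented
    posts, joining each post to its source (hypothetical schema)."""
    query = """
        SELECT posts.title, sources.name, posts.num_comments
        FROM posts
        JOIN sources ON posts.source_id = sources.id
        ORDER BY posts.num_comments DESC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


# Build a small in-memory database so the example runs on its own.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sources (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        title TEXT,
        source_id INTEGER REFERENCES sources(id),
        num_comments INTEGER
    );
    INSERT INTO sources VALUES (1, 'The Onion'), (2, 'Clickhole');
    INSERT INTO posts VALUES
        (1, 'A', 1, 12),
        (2, 'B', 2, 40),
        (3, 'C', 1, 3);
""")

top_two = most_commented(conn, limit=2)
print(top_two)
```

Storing source names in their own table and joining on an integer ID avoids repeating the same strings in every row.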

The output from calculations is written to static\caulculations.csv.

Visualizations

Our visualizations are stored in static\visuals.

We have also generated wordclouds from our training datasets, found in static\visualizations.

Report

Our project report can be found at Final_Project_Report.pdf.
