In the interest of keeping file sizes small, we have not included the training data used to create the machine learning algorithm (~111 MB).
If you are interested in exploring the training data, you can find the dataset we used here.
Onion articles (and other satirical articles) are pretty funny. Even though they are written like news, we can usually tell that they are meant to be satire. But, we wondered, could a computer do the same? We used Kaggle, Keras, TensorFlow, Pandas, and other tools to find out.
You can find a link to our pre-trained weights here (link is currently restricted to U-M accounts).
Warning: The file download is quite large at ~140 MB.
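For orientation, here is a minimal sketch of a Keras text classifier of the general kind this project trains. The layer choices, sizes, and the two toy headlines are illustrative assumptions, not our actual architecture or data; see the project report for the real model.

```python
# Minimal sketch of a satire-vs-news text classifier (illustrative only).
import tensorflow as tf

headlines = ["Area Man Does Thing", "Markets Close Higher After Fed Meeting"]
labels = [1, 0]  # 1 = satire, 0 = real news

vectorizer = tf.keras.layers.TextVectorization(max_tokens=10_000, output_sequence_length=64)
vectorizer.adapt(headlines)

model = tf.keras.Sequential([
    vectorizer,
    tf.keras.layers.Embedding(input_dim=10_000, output_dim=32),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability that the article is satire
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(tf.constant(headlines), tf.constant(labels), epochs=2, verbose=0)

# To reuse the downloaded weights, rebuild the project's actual architecture
# and call model.load_weights() on the downloaded file.
```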
Our final project report includes more information on what this project does, how it works, and how it could be improved. It is included in our project as Final_Project_Report.pdf.
Our data collection file is onion_farmer.py.
APIs/websites used: CNN, The Onion, AP News, Clickhole, Pushshift API (for Reddit).
Data is stored in the database file static\onion_barn.db.
To limit data collection to 25 items per run, we scrape web pages one article at a time. For the Pushshift API, we used a recursive call so that we could gather more than 25 articles in total without ever exceeding the 25-results-per-call limit (see the sketch below).
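The sketch below illustrates that recursive pagination idea. The endpoint and parameter names follow the public Pushshift submission-search API, but the subreddit, fields, and helper function are illustrative assumptions, not necessarily what onion_farmer.py does.

```python
# Sketch of recursive Pushshift pagination: each call asks for at most 25
# submissions, then recurses with the oldest timestamp seen so far until
# enough articles have been collected.
import requests

PUSHSHIFT_URL = "https://api.pushshift.io/reddit/search/submission/"

def fetch_submissions(subreddit, wanted, before=None, collected=None):
    collected = collected if collected is not None else []
    params = {"subreddit": subreddit, "size": 25, "sort": "desc"}
    if before is not None:
        params["before"] = before
    batch = requests.get(PUSHSHIFT_URL, params=params).json().get("data", [])
    collected.extend(batch)
    if not batch or len(collected) >= wanted:
        return collected[:wanted]
    # The next call starts just before the oldest post in this batch.
    return fetch_submissions(subreddit, wanted, before=batch[-1]["created_utc"], collected=collected)

posts = fetch_submissions("TheOnion", wanted=100)
```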
Averages, percentages, and more are calculated from the data in visuals.py.
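As a rough illustration of those calculations, the sketch below pulls the scraped rows into pandas and computes a per-source average and percentage. The table and column names ("articles", "source", "num_comments") and the output path are assumptions; the real schema lives in static\onion_barn.db and the real logic in visuals.py.

```python
# Hedged sketch: load scraped rows from the SQLite database and compute
# averages and percentages with pandas. Table/column names are hypothetical.
import sqlite3
import pandas as pd

conn = sqlite3.connect("static/onion_barn.db")
df = pd.read_sql_query("SELECT source, num_comments FROM articles", conn)
conn.close()

avg_comments = df.groupby("source")["num_comments"].mean()        # average comments per source
source_share = df["source"].value_counts(normalize=True) * 100    # percent of articles per source

summary = pd.DataFrame({"avg_comments": avg_comments, "percent_of_articles": source_share})
summary.to_csv("static/example_calculations.csv")  # illustrative output path
```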
Our JOIN statement is used in most_commented on line 34 of visuals.py.
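For readers unfamiliar with the query, a most_commented-style JOIN might look roughly like the sketch below; the table and column names are assumptions, and the actual statement is the one on line 34 of visuals.py.

```python
# Hedged sketch of a JOIN that ranks articles by comment count.
# Table and column names are hypothetical.
import sqlite3

conn = sqlite3.connect("static/onion_barn.db")
rows = conn.execute(
    """
    SELECT articles.title, comments.comment_count
    FROM articles
    JOIN comments ON comments.article_id = articles.id
    ORDER BY comments.comment_count DESC
    LIMIT 10
    """
).fetchall()
conn.close()
```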
The output from these calculations is written to static\caulculations.csv.
Our visualizations are stored in static\visuals.
We have also generated word clouds from our training datasets, found in static\visualizations.
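Generating one of those word clouds looks roughly like this with the `wordcloud` package; the input text and output filename are placeholders, not our actual training data.

```python
# Hedged sketch: build a word cloud image from headline text.
from wordcloud import WordCloud

headline_text = "Area Man Does Thing Markets Close Higher After Fed Meeting"
cloud = WordCloud(width=800, height=400, background_color="white").generate(headline_text)
cloud.to_file("static/visualizations/example_wordcloud.png")  # assumes the folder exists
```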
Our project report can be found at Final_Project_Report.pdf.