Analisis Netizen Indonesia

Background

This is a mini project that creates automated Twitter posts using Apache Airflow 🕘
The objective of this project is to observe the daily changes in public (netizen) opinion on a certain topic. For now, the daily visualization that gets posted is a wordcloud representing Indonesian netizens' thoughts on the Covid-19 pandemic.

Data Pipeline

Every day at 9 AM Western Indonesia Time (WIB), Airflow executes a collection of tasks (called a DAG, or Directed Acyclic Graph) that includes data scraping, data cleaning, wordcloud making, and Twitter posting. The tweet data are scraped via the standard Twitter API, then cleaned (stopwords removed using NLTK data, along with mentions, links, hashtags, etc.) and visualized as a wordcloud. The DAG runs automatically every day, which means you will see a wordcloud post on my Twitter account every day at 9 AM.
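To make the cleaning step more concrete, here is a minimal sketch of how it could look using only the standard library. It is not the project's actual code: the names `clean_tweet` and `word_frequencies` are hypothetical, and the tiny `STOPWORDS` set is an assumed stand-in for the NLTK stopword data mentioned above.

```python
import re
from collections import Counter

# Hypothetical stand-in for the NLTK stopword data used by the project;
# a tiny hand-picked set keeps this sketch self-contained.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu", "untuk", "rt"}

URL_RE = re.compile(r"https?://\S+")            # links
MENTION_OR_HASHTAG_RE = re.compile(r"[@#]\w+")  # @mentions and #hashtags
NON_LETTER_RE = re.compile(r"[^a-z\s]")         # digits, punctuation, emoji


def clean_tweet(text):
    """Lowercase a tweet, strip links, mentions, hashtags and
    non-letter characters, then drop stopwords. Returns a list of words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_OR_HASHTAG_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return [word for word in text.split() if word not in STOPWORDS]


def word_frequencies(tweets):
    """Count the cleaned words across a collection of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(clean_tweet(tweet))
    return counts
```

A frequency mapping like the one returned by `word_frequencies` could then be handed to the wordcloud library, for example through `WordCloud().generate_from_frequencies(...)`, although the project may just as well build the wordcloud from the joined text.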

Tools that I used

Operating System:

Linux (Ubuntu WSL) 🐧

Software:

Python 3.8.10 🐍
Venv
Apache Airflow 2.1.2
VSCode

Libraries:

Airflow
Datetime
Numpy
Pandas
Matplotlib
Tweepy
Re
NLTK
Wordcloud

In Progress... 👷

I'm currently working on sentiment analysis of the tweets and exploring other types of visualizations that I can use. So, stay tuned!