This is a mini project that automates Twitter posts using Apache Airflow 🕘
The objective of this project is to observe daily changes in public (netizen) opinion on a given topic. For now, the daily visualization that gets posted is a wordcloud representing Indonesian netizens' opinions on the Covid-19 pandemic.
Every day at 9 AM Western Indonesia Time (WIB, UTC+7), Airflow runs a DAG (Directed Acyclic Graph), which is a collection of tasks covering data scraping, data cleaning, wordcloud generation, and posting to Twitter. The tweets are scraped via the standard Twitter API, then cleaned (stopwords removed using NLTK data, along with mentions, links, hashtags, etc.) and visualized as a wordcloud. Because the DAG runs daily, a new wordcloud post appears on my Twitter account every day at 9 AM.
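The cleaning step described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names `clean_tweet` and `word_frequencies`, the regular expressions, and the tiny `STOPWORDS` set are all assumptions made for the example (the project itself takes its stopwords from NLTK data).

```python
import re
from collections import Counter

# Hypothetical mini stopword list for illustration only;
# the project uses NLTK stopword data instead.
STOPWORDS = {"rt", "yang", "dan", "di", "ini", "itu"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
MENTION_OR_HASHTAG_RE = re.compile(r"[@#]\w+")  # @mentions and #hashtags
NON_LETTER_RE = re.compile(r"[^a-z\s]")         # digits, punctuation, emoji


def clean_tweet(text, stopwords=STOPWORDS):
    """Lowercase a tweet, strip links/mentions/hashtags, and drop stopwords."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_OR_HASHTAG_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return [word for word in text.split() if word not in stopwords]


def word_frequencies(tweets, stopwords=STOPWORDS):
    """Count how often each cleaned word appears across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(clean_tweet(tweet, stopwords))
    return counts


if __name__ == "__main__":
    sample = [
        "RT @user: Vaksin covid-19 di Jakarta https://t.co/abc #covid19",
        "Vaksin dan masker itu penting!",
    ]
    print(word_frequencies(sample).most_common(3))
```

A frequency mapping like this is the kind of input a wordcloud can be drawn from, for example through the `generate_from_frequencies` method of the wordcloud library's `WordCloud` class. Whether the project builds its image from frequencies or from raw text is not stated here.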
Environment:
Linux (Ubuntu WSL) 🐧
Python 3.8.10 🐍
Venv
Apache Airflow 2.1.2
VSCode
Libraries:
Airflow
Datetime
Numpy
Pandas
Matplotlib
Tweepy
Re
NLTK
Wordcloud
I'm currently working on sentiment analysis of the tweets and exploring other types of visualizations that I can use. So, stay tuned!