aryanshb/dbt-mini-project

DBT Mini Project

Real-time Data Streaming and Visualization using Spark and Kafka

SRN Name
PES1UG20CS020 Aditya Mahesh
PES1UG20CS042 Ananya Jalan
PES1UG20CS084 Aryansh Bhargavan
PES1UG20CS093 Avni Gupta

Setup/Usage

  • Start Zookeeper
zookeeper-server-start.sh kafka/config/zookeeper.properties
  • Start the Kafka server
kafka-server-start.sh kafka/config/server.properties
  • Create the required topics in Kafka
kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic weather
kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic output
  • Run producer.py. It fetches data from the API and publishes it to the Kafka topic "weather".
python producer.py
  • Start the consumer with spark-submit. It processes the data using Spark Structured Streaming and sends the results to the Kafka topic "output".
spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.2 consumer.py
  • Run one of the output programs. Each reads the data from the Kafka topic "output" and visualizes it.
python output_stream.py     # for stream data
python output_batch.py      # for batch data 
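
The data flow in the steps above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual producer.py or output scripts: it assumes the kafka-python client, a broker on localhost:9092, JSON-encoded messages, and hypothetical payload fields ("city", "temperature"). The function names are also made up for the example.

```python
import json


def encode_reading(reading: dict) -> bytes:
    """Serialize one reading to UTF-8 JSON bytes for Kafka (assumed message format)."""
    return json.dumps(reading, sort_keys=True).encode("utf-8")


def decode_reading(payload: bytes) -> dict:
    """Inverse of encode_reading, used on the consuming side."""
    return json.loads(payload.decode("utf-8"))


def publish(readings, topic="weather", bootstrap_servers="localhost:9092"):
    """Publish an iterable of dict readings to a Kafka topic.

    The kafka-python import is kept local so the helpers above can be used
    without a Kafka client installed.
    """
    from kafka import KafkaProducer  # assumption: kafka-python is installed

    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=encode_reading,
    )
    for reading in readings:
        producer.send(topic, reading)
    producer.flush()  # block until buffered messages are sent
    producer.close()


def consume(topic="output", bootstrap_servers="localhost:9092"):
    """Yield decoded readings from a Kafka topic, starting at the earliest offset."""
    from kafka import KafkaConsumer  # assumption: kafka-python is installed

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        value_deserializer=decode_reading,
        auto_offset_reset="earliest",
    )
    for record in consumer:
        yield record.value


if __name__ == "__main__":
    # Hypothetical sample payload; the real fields depend on the API used.
    publish([{"city": "Bengaluru", "temperature": 27.5}])
```

Keeping serialization in small standalone functions makes it easy to check the message format without a running broker; the Spark job between the two topics is not shown here.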
