In this project, we aim to build an end-to-end machine learning application that classifies messages into 36 disaster-related categories. This is helpful for extracting only disaster-related messages from multiple media sources, so that the appropriate disaster relief agency can be contacted for help.
The project is comprised of three components: an ETL pipeline, an ML pipeline, and a Flask app.
To successfully run the project, the following dependencies need to be installed in the Python environment:
- python >3.6
- sklearn==0.0
- nltk==3.5
- SQLAlchemy==1.3.22
- pandas==1.1.5
- numpy==1.19.4
- plotly==4.14.1
- Flask==1.1.2
- Performs Extract, Transform and Load on the data provided by Figure Eight, including `messages` and their corresponding `categories`.
- The resulting table is stored in SQLite under the table `disaster_response`.
- Run `python process_data.py messages.csv categories.csv disaster_response.db disaster_response` in the `data` directory to execute the ETL pipeline.
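The merge-split-load transform that `process_data.py` performs can be sketched as below. This is a minimal, self-contained illustration, not the project's actual script: the tiny in-memory frames and the `water`/`food` category names are hypothetical stand-ins (the real dataset has 36 categories and is read from the two CSV files).

```python
import sqlite3

import pandas as pd

# Hypothetical miniature stand-ins for messages.csv and categories.csv
messages = pd.DataFrame({
    "id": [1, 2],
    "message": ["We need water", "Storm damaged the roads"],
})
categories = pd.DataFrame({
    "id": [1, 2],
    "categories": ["water-1;food-0", "water-0;food-0"],
})

def etl(messages, categories, db_path, table):
    # Extract/Transform: merge on id, then split the semicolon-separated
    # category string into one binary column per category
    df = messages.merge(categories, on="id")
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = cats.iloc[0].str.rsplit("-", n=1).str[0]
    for col in cats.columns:
        cats[col] = cats[col].str.rsplit("-", n=1).str[1].astype(int)
    df = pd.concat([df.drop(columns="categories"), cats], axis=1)
    df = df.drop_duplicates()
    # Load: write the cleaned table into SQLite
    with sqlite3.connect(db_path) as conn:
        df.to_sql(table, conn, index=False, if_exists="replace")
    return df

df = etl(messages, categories, ":memory:", "disaster_response")
```

The real script takes the CSV paths, database file, and table name as command-line arguments, as shown in the `process_data.py` invocation above.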
- Loads the data, builds the model, trains it, runs cross-validation to find the best parameters, and exports the model artifact.
- The target output is a `classifier.pkl` file, which is used later for prediction in the Flask app.
- Run `python train_classifier.py ./../data/disaster_response.db classifier.pkl` in the `models` directory to execute the ML pipeline.
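The build-tune-export loop can be sketched as follows. This is an assumed implementation, not the project's `train_classifier.py`: the toy corpus, the two target columns, and the TF-IDF + logistic-regression pipeline with a small `C` grid are stand-ins for whatever model and parameter grid the real script searches.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the disaster_response table
X = ["need water", "send food", "water please", "food shortage"] * 2
y = [[1, 0], [0, 1], [1, 0], [0, 1]] * 2  # assumed columns: water, food

# Text features + one classifier per category column
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(LogisticRegression(max_iter=1000))),
])

# Cross-validate over a small parameter grid for the best estimator
params = {"clf__estimator__C": [0.1, 1.0]}
search = GridSearchCV(pipeline, params, cv=2)
search.fit(X, y)

# Export the fitted model as a classifier.pkl-style pickle artifact
blob = pickle.dumps(search.best_estimator_)
model = pickle.loads(blob)
pred = model.predict(["we need water"])
```

The exported pickle is what the Flask app later loads to classify incoming messages.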
- Visualizes a report of the categories and message genre data.
- Classifies an input message from the dashboard.
- Run `python run.py` in the `app` directory to execute the Flask application.

The dashboard looks as below:
Credit goes to Figure Eight for providing the data.