To run the project, you may want to create a virtual environment:
# Create the environment
python3 -m venv venv
# Activate the environment
source venv/bin/activate
Then install the required libraries:
python3 -m pip install -r requirements.txt
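To confirm that the packages went into the virtual environment rather than the system Python, a quick check like the one below may help. It is only a sketch and not part of the project: the helper name `in_virtualenv` is hypothetical, and it assumes an environment created with `python3 -m venv`, where `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints `False`, the environment is probably not activated in the current shell.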
The file exploration contains the first part of the project and its insights.
The file prediction contains our results using Spark ML.