Please note: this infrastructure requires some environment variables to be set (e.g. the S3 location where the backend DB is stored, or the URL of the tracking server). Before running it, resolve every required environ that currently has TBD as its value.
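For reference, a minimal sketch of what resolving the environs could look like. S3_DATABASE_LOCATION is the name used by the sync cronjob below; the other variable names and values are assumptions, not the exact names required by the scripts:

export S3_DATABASE_LOCATION=s3://<your_bucket>/backend_database/   # S3 location of the backend DB
export MLFLOW_TRACKING_URI=https://<your_tracking_server>          # URL of the tracking server
export ARTIFACT_ROOT=s3://<your_bucket>/artifacts/                 # assumed name for the artifacts root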
- Docker container that launches MLflow behind Nginx as a reverse proxy (see the sketch after this list)
- Internal backend store with SQLite as the DB
- Cronjob to sync the backend DB to S3
- External artifact root in S3
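As a rough, non-authoritative sketch of what the container runs (the DB filename and port are assumptions; the data directory matches the one synced by the cronjob below), with Nginx proxying external HTTPS traffic to the MLflow port:

mlflow server \
    --backend-store-uri sqlite:///home/ec2-user/data/mlflow.db \
    --default-artifact-root s3://<artifact_bucket>/artifacts \
    --host 0.0.0.0 \
    --port 5000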
- launch a Linux AMI instance with the relevant Security Groups so that it is reachable externally via HTTPS
- attach an IAM role that grants access to AWS S3
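As a hedged sketch, these two steps could look like the following with the AWS CLI; the AMI ID, instance type, security group, key pair, instance ID, and profile name are all placeholders:

# launch the instance with a security group that allows inbound HTTPS
aws ec2 run-instances \
    --image-id ami-xxxxxxxx \
    --instance-type t3.medium \
    --security-group-ids sg-xxxxxxxx \
    --key-name <your_key_pair>
# attach an instance profile wrapping the IAM role with S3 access
aws ec2 associate-iam-instance-profile \
    --instance-id i-xxxxxxxxxxxxxxxxx \
    --iam-instance-profile Name=<mlflow_s3_access_profile>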
- install git and clone this repository
sudo yum install git
sudo git clone <this_repo_url>
- prepare and launch docker
./prepare_docker.sh
./launch_docker.sh
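Both scripts are part of this repository; as a hedged sketch of roughly what the launch step amounts to (the image name, env file, and port mapping are assumptions):

# build the image and run it; the env file passes the resolved environs,
# and the volume mount exposes the SQLite DB directory so the cronjob
# below can sync it to S3
docker build -t mlflow-nginx .
docker run -d --env-file .env -p 443:443 \
    -v /home/ec2-user/data:/home/ec2-user/data \
    mlflow-nginx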
- set up a cron job
sudo service crond start
crontab -e
- cronjob entry to include in the crontab (please note: S3_DATABASE_LOCATION is hardcoded below because resolving it from the environs is not working properly)
SHELL=/bin/sh
PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/ec2-user/.local/bin:/home/ec2-user/bin
0 1 * * * aws s3 sync /home/ec2-user/data/ s3://flixtech-primus-dev-mlflow-runs/backend_database/
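As an alternative to editing interactively with crontab -e, the job line can be appended non-interactively (note this adds only the job itself; the SHELL and PATH lines above still need to be present):

(crontab -l 2>/dev/null; echo "0 1 * * * aws s3 sync /home/ec2-user/data/ s3://flixtech-primus-dev-mlflow-runs/backend_database/") | crontab -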
- add environs to the cronjob and launch the cronjob together with Docker seamlessly (optional if there is no external DB; see the sketch after this list)
- External DB (e.g. Snowflake or RDS)
- SSM to store DB credentials
- you currently have to install git manually on every run with sudo yum install git; use a machine image with git preinstalled
- store the MLflow serving Docker image in an image repository
- CI/CD GitLab pipeline between AWS and GitLab
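For the first and third items above, one possible direction (the env-file path, the use of $S3_DATABASE_LOCATION in cron, and the SSM parameter name are all assumptions, not implemented here):

# cron entry that sources the environs instead of hardcoding S3_DATABASE_LOCATION
0 1 * * * . /home/ec2-user/mlflow.env && aws s3 sync /home/ec2-user/data/ "$S3_DATABASE_LOCATION"
# reading DB credentials from SSM Parameter Store, e.g. at container start
aws ssm get-parameter --name /mlflow/db_password --with-decryption --query Parameter.Value --output text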