Current target is to download NOAA GFS from AWS S3, using Prefect machinery.
dnf install epel-release -y
yum install -y screen fail2ban git wget
- Install conda & env
# Conda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh ./Miniconda3-latest-Linux-x86_64.sh
conda config --set auto_activate_base false
# Create env
conda create -n prefect
conda activate prefect
conda install -c conda-forge prefect
- Docker (for running prefect server only)
# Cf. https://docs.docker.com/engine/install/centos/
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install docker-ce docker-ce-cli containerd.io -y
sudo systemctl enable --now docker
sudo docker run hello-world
sudo usermod -Ga docker <<you_user>>
- Python reqs
git clone https://github.com/steph-ben/fetch-with-prefect.git
pip install -r requirements.txt
python setup.py develop
- Test our flow locally, wihtout any prefect server
python fetcher.py run
[2020-12-16 22:04:09+0000] INFO - prefect.FlowRunner | Beginning Flow run for 'aws-gfs-download'
[2020-12-16 22:04:10+0000] INFO - prefect.TaskRunner | Task 'check_run_availability': Starting task run...
[2020-12-16 22:04:10+0000] INFO - prefect.TaskRunner | 20201215 / 0 : Checking run availability ...
[2020-12-16 22:04:26+0000] INFO - prefect.TaskRunner | 20201215 / 0 : Run is available !
...
[2020-12-16 22:05:30+0000] INFO - prefect.TaskRunner | Task 'timestep_9_download': Finished task run for task with final state: 'Success'
[2020-12-16 22:05:30+0000] INFO - prefect.TaskRunner | Task 'timestep_18_download': Finished task run for task with final state: 'Success'
[2020-12-16 22:05:30+0000] INFO - prefect.FlowRunner | Flow run SUCCESS: all reference tasks succeeded
Working great ? Now let's run it inside prefect server
- Launching prefect and one local agent
prefect server start
prefect agent local start
- Register and trigger our flow
# Needed once
$ prefect create project "gfs-fetcher"
# Install our app in virtualenv, so the local agent can access our lib
$ python setup.py develop
# Interracting with Prefect !
$ python fetcher.py register
Result check: OK
Flow URL: http://localhost:8080/default/flow/ec5a906b-6b03-46ce-b7d8-b02e73d30d98
└── ID: 7f049f56-2c6d-49b4-8515-658b43850764
└── Project: gfs-fetcher
└── Labels: ['steph-laptop']
7f049f56-2c6d-49b4-8515-658b43850764
$ python fetcher.py trigger
{'data': {'flow': [{'id': '7f049f56-2c6d-49b4-8515-658b43850764', 'name': 'aws-gfs-download'}]}}
A new FlowRun has been triggered
Go check it on http://linux:8080/flow-run/1829c2f9-a5f4-43c4-ad3e-bd42a80068ea
Enjoy the view !