Add docs and values to run Airflow locally #186

Open
wants to merge 3 commits into
base: fix/dataset-ingest-fixes

ividito
Contributor

@ividito ividito commented Jul 11, 2024

Summary:

Adds docs for setting up Airflow on a local cluster hosted with Kubernetes in Docker (KinD). This enables local development without relying on MWAA environments, and can be more extensible than the mwaa-local-runner. It can also be used to develop SM2A DAGs while we prepare to migrate to that architecture.

@ividito ividito marked this pull request as ready for review July 12, 2024 01:15
@ciaransweet ciaransweet left a comment

Hopefully these are small comments, but I'm 'blocked' from the rest of the steps because the docker build fails.

Comment on lines +76 to +77
kind create cluster --name airflow-cluster
kubectx airflow-cluster

Running just the create command appeared to switch context for me:

kind create cluster --name airflow-cluster
Creating cluster "airflow-cluster" ...
 ✓ Ensuring node image (kindest/node:v1.31.0) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-airflow-cluster"
You can now use your cluster with:

kubectl cluster-info --context kind-airflow-cluster

Not sure what to do next? 😅  Check out https://kind.sigs.k8s.io/docs/user/quick-start/
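
Going by the output above, kind names the context `kind-<cluster name>` and selects it itself, so the explicit switch may only be needed when something else has changed the context since. A minimal sketch of a guarded switch, assuming `kubectl` is on the PATH; the function names `kind_context_name` and `ensure_kind_context` are hypothetical helpers, not part of this PR:

```shell
# Hypothetical helper: derive the kubectl context name that kind creates
# for a cluster (the output above shows the "kind-" prefix).
kind_context_name() {
  echo "kind-$1"
}

# Hypothetical helper: switch context only if we are not already on it.
ensure_kind_context() {
  expected="$(kind_context_name "$1")"
  current="$(kubectl config current-context 2>/dev/null || true)"
  if [ "$current" != "$expected" ]; then
    kubectl config use-context "$expected"
  fi
}

# Show the context name the docs would need to reference.
echo "expected context: $(kind_context_name airflow-cluster)"

# Usage (left commented out so the snippet has no side effects):
# ensure_kind_context airflow-cluster
```

If the docs keep a `kubectx` step, it would presumably need the `kind-airflow-cluster` name rather than `airflow-cluster`.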


Assuming you are in the root directory of the project:
```bash
docker build -f ./local-setup/Dockerfile.airflow -t veda-airflow .
```

This fails with what looks like a problem installing rasterio:

2.305 Collecting rasterio>=1.3.3
2.319   Downloading rasterio-1.3.10.tar.gz (412 kB)
2.334      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 412.9/412.9 kB 29.6 MB/s eta 0:00:00
2.370   Installing build dependencies: started
6.820   Installing build dependencies: finished with status 'done'
6.821   Getting requirements to build wheel: started
7.083   Getting requirements to build wheel: finished with status 'error'
7.086   error: subprocess-exited-with-error
7.086
7.086   × Getting requirements to build wheel did not run successfully.
7.086   │ exit code: 1
7.086   ╰─> [3 lines of output]
7.086       <string>:22: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
7.086       WARNING:root:Failed to get options via gdal-config: [Errno 13] Permission denied: 'gdal-config'
7.086       ERROR: A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
7.086       [end of output]
7.086
7.086   note: This error originates from a subprocess, and is likely not a problem with pip.
7.087 error: subprocess-exited-with-error
7.087
7.087 × Getting requirements to build wheel did not run successfully.
7.087 │ exit code: 1
7.087 ╰─> See above for output.
7.087
7.087 note: This error originates from a subprocess, and is likely not a problem with pip.
7.155
7.155 [notice] A new release of pip available: 22.3.1 -> 24.2
7.155 [notice] To update, run: pip install --upgrade pip
------
Dockerfile.airflow:4
--------------------
   2 |     COPY dags/requirements.txt .
   3 |     COPY dags/requirements-constraints.txt .
   4 | >>> RUN pip install --no-cache-dir -r requirements.txt -c requirements-constraints.txt
   5 |     COPY --chown=airflow:root ./dags/ /opt/airflow/dags/
   6 |
--------------------
ERROR: failed to solve: process "/bin/bash -o pipefail -o errexit -o nounset -o nolog -c pip install --no-cache-dir -r requirements.txt -c requirements-constraints.txt" did not complete successfully: exit code: 1

Fixed on Mac by adding `--platform linux/amd64`: docker build --platform linux/amd64 -f ./local-setup/Dockerfile.airflow -t veda-airflow .
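
If the docs should cover both cases, one option is to add the flag only on arm64 hosts. This is a rough sketch that assumes the rasterio failure is specific to non-amd64 hosts such as Apple Silicon, which I have not verified; `platform_args_for` is a hypothetical helper name:

```shell
# Hypothetical helper: choose extra `docker build` arguments from the host
# architecture. Assumption: only arm64/aarch64 hosts need the amd64 override.
platform_args_for() {
  case "$1" in
    arm64|aarch64) echo "--platform linux/amd64" ;;
    *) echo "" ;;
  esac
}

platform_args="$(platform_args_for "$(uname -m)")"

# Print the command instead of running it, so it can be reviewed first.
# $platform_args is intentionally unquoted so it splits into two arguments.
echo docker build $platform_args -f ./local-setup/Dockerfile.airflow -t veda-airflow .
```

Running the amd64 image under emulation may be slower, so this keeps native builds on hosts that do not need the override.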

@ividito ividito changed the base branch from fix/dataset-ingest-fixes to dev September 26, 2024 18:09
@ividito ividito changed the base branch from dev to fix/dataset-ingest-fixes September 26, 2024 18:09