Code Samples
============

Index
-----
- `Example 1: Ray jobs on an existing cluster`_
- `Ray Cluster Sample Spec (YAML)`_
- `Example 2: Using @ray.task for job lifecycle`_
- `Example 3: Using SubmitRayJob operator for job lifecycle`_
- `Example 4: SetupRayCluster, SubmitRayJob & DeleteRayCluster`_
Example 1: Ray jobs on an existing cluster
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you already have a Ray cluster set up, you can use the ``SubmitRayJob`` operator or the ``@ray.task`` decorator to submit jobs directly.

In the example below (``ray_taskflow_example_existing_cluster.py``), the ``@ray.task`` decorator is used to define a task that is executed on the Ray cluster:

.. important::
   Set the Ray Dashboard URL connection parameter or ``RAY_ADDRESS`` on your Airflow worker so it can connect to your cluster.

.. literalinclude:: ../../example_dags/ray_taskflow_example_existing_cluster.py
   :language: python
   :linenos:
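As a rough illustration of the connection requirement above, a worker-side lookup of the Ray address might prefer the ``RAY_ADDRESS`` environment variable with a fallback. This is a minimal sketch: the helper name and the default URL are assumptions for illustration, not part of the provider.

```python
import os


def resolve_ray_address(default: str = "http://localhost:8265") -> str:
    """Return the Ray dashboard/job-submission address the worker should use.

    Prefers the RAY_ADDRESS environment variable, mirroring how Ray's job
    submission client discovers its target; the default here is a
    hypothetical fallback for local development.
    """
    return os.environ.get("RAY_ADDRESS", default)


# With RAY_ADDRESS unset, the fallback applies.
os.environ.pop("RAY_ADDRESS", None)
print(resolve_ray_address())  # http://localhost:8265

# With RAY_ADDRESS set (e.g. to a head-node service), it takes precedence.
os.environ["RAY_ADDRESS"] = "http://ray-head:8265"
print(resolve_ray_address())  # http://ray-head:8265
```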
Ray Cluster Sample Spec (YAML)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. important::
   ``spec.headGroupSpec.serviceType`` must be ``LoadBalancer`` to spin up a service that exposes your dashboard externally.

Save this file in a location accessible to your Airflow installation, and reference it in your DAG code.

.. literalinclude:: ../../example_dags/scripts/ray.yaml
   :language: yaml
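To make the ``serviceType`` requirement concrete, a pre-flight check over an already-parsed cluster spec could look like the sketch below. The spec dictionary is a hypothetical, heavily trimmed stand-in for a parsed ``ray.yaml``, not the file's actual contents.

```python
def head_service_type(spec: dict) -> str:
    """Extract spec.headGroupSpec.serviceType from a parsed RayCluster spec."""
    return spec["spec"]["headGroupSpec"]["serviceType"]


# Hypothetical minimal spec fragment; a real ray.yaml defines many more
# fields (worker groups, container images, resource requests, ...).
cluster_spec = {
    "apiVersion": "ray.io/v1",
    "kind": "RayCluster",
    "spec": {"headGroupSpec": {"serviceType": "LoadBalancer"}},
}

# Fail fast if the dashboard would not be externally reachable.
assert head_service_type(cluster_spec) == "LoadBalancer"
print(head_service_type(cluster_spec))  # LoadBalancer
```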
Example 2: Using @ray.task for job lifecycle
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The example below showcases how to use the ``@ray.task`` decorator to manage the full lifecycle of a Ray cluster: setup, job execution, and teardown.

This approach is ideal for jobs that require a dedicated, short-lived cluster, optimizing resource usage by cleaning up after task completion.

.. literalinclude:: ../../example_dags/ray_taskflow_example.py
   :language: python
   :linenos:
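The setup/run/teardown lifecycle described above can be pictured as a plain ``try``/``finally`` pattern: the cluster is created, the job runs against it, and teardown happens even if the job fails. This is a schematic of the pattern only, not the provider's implementation; the function names are invented for illustration.

```python
def run_with_ephemeral_cluster(job, setup, teardown):
    """Run `job` against a cluster created by `setup`; always tear it down."""
    cluster = setup()
    try:
        return job(cluster)
    finally:
        # Teardown runs whether the job succeeded or raised, so the
        # short-lived cluster never outlives its task.
        teardown(cluster)


events = []
result = run_with_ephemeral_cluster(
    setup=lambda: events.append("setup") or "cluster-1",
    job=lambda cluster: events.append("job") or 42,
    teardown=lambda cluster: events.append("teardown"),
)
print(events)   # ['setup', 'job', 'teardown']
print(result)   # 42
```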
Example 3: Using SubmitRayJob operator for job lifecycle
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This example demonstrates how to use the ``SubmitRayJob`` operator to manage the full lifecycle of a Ray cluster and job execution.

This operator provides a more declarative way to define your Ray job within an Airflow DAG.

.. literalinclude:: ../../example_dags/ray_single_operator.py
   :language: python
   :linenos:

.. note::
   Remember to adjust file paths, connection IDs, and other specifics according to your setup.
Example 4: SetupRayCluster, SubmitRayJob & DeleteRayCluster
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This example shows how to use separate operators for cluster setup, job submission, and teardown, providing more granular control over the process and allowing for more complex workflows involving Ray clusters.

Key points:

- Uses the ``SetupRayCluster``, ``SubmitRayJob``, and ``DeleteRayCluster`` operators separately.
- Allows multiple jobs to be submitted to the same cluster before deletion.
- Demonstrates how to pass cluster information between tasks using XCom.

This method is ideal for scenarios where you need fine-grained control over the cluster lifecycle, such as running multiple jobs on the same cluster or keeping the cluster alive for a certain period.

.. important::
   The ``SubmitRayJob`` operator uses the ``xcom_task_key`` parameter ``"SetupRayCluster.dashboard"`` to retrieve the Ray dashboard URL. This URL, stored as an XCom variable by the ``SetupRayCluster`` task, is necessary for job submission.

.. literalinclude:: ../../example_dags/setup-teardown.py
   :language: python
   :linenos:
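The XCom handoff described in the note above can be sketched as a lookup keyed by ``task_id.key``. The dictionary below is an in-memory mock of the mechanism, not Airflow's XCom backend, and the dashboard URL is a placeholder.

```python
# Hypothetical in-memory stand-in for Airflow's XCom store.
xcom_store = {}


def xcom_push(task_id: str, key: str, value):
    """Publish a value under the combined 'task_id.key' lookup key."""
    xcom_store[f"{task_id}.{key}"] = value


def xcom_pull(xcom_task_key: str):
    """Retrieve a value by its combined 'task_id.key' lookup key."""
    return xcom_store[xcom_task_key]


# The cluster-setup task publishes the dashboard URL...
xcom_push("SetupRayCluster", "dashboard", "http://example-dashboard:8265")

# ...and the job-submission task retrieves it via xcom_task_key.
print(xcom_pull("SetupRayCluster.dashboard"))  # http://example-dashboard:8265
```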