diff --git a/docs/CONTRIBUTING.rst b/docs/CONTRIBUTING.rst
index 082cf32..fb70c17 100644
--- a/docs/CONTRIBUTING.rst
+++ b/docs/CONTRIBUTING.rst
@@ -16,7 +16,7 @@ To contribute to the **Ray provider** project:
 #. Please create a `GitHub Issue `_ describing your contribution.
 #. Open a feature branch off of the ``main`` branch and create a Pull Request into the ``main`` branch from your feature branch.
 #. Link your issue to the pull request.
-#. Once developments are complete on your feature branch, request a review and it will be merged once approved.
+#. Once development is complete on your feature branch, request a review and it will be merged once approved.
 
 Test Changes Locally
 --------------------
diff --git a/docs/_static/connection.png b/docs/_static/connection.png
new file mode 100644
index 0000000..294bda6
Binary files /dev/null and b/docs/_static/connection.png differ
diff --git a/docs/api/ray_provider.decorators.rst b/docs/api/ray_provider.decorators.rst
index 92fad9b..e2f5347 100644
--- a/docs/api/ray_provider.decorators.rst
+++ b/docs/api/ray_provider.decorators.rst
@@ -1,21 +1,8 @@
-ray\_provider.decorators package
-================================
 
-Submodules
+Decorators
 ----------
 
-ray\_provider.decorators.ray module
------------------------------------
-
 .. automodule:: ray_provider.decorators.ray
    :members:
    :undoc-members:
    :show-inheritance:
-
-Module contents
----------------
-
-.. automodule:: ray_provider.decorators
-   :members:
-   :undoc-members:
-   :show-inheritance:
diff --git a/docs/api/ray_provider.hooks.rst b/docs/api/ray_provider.hooks.rst
index 50987a9..6cf932c 100644
--- a/docs/api/ray_provider.hooks.rst
+++ b/docs/api/ray_provider.hooks.rst
@@ -1,21 +1,7 @@
-ray\_provider.hooks package
-===========================
-
-Submodules
-----------
-
-ray\_provider.hooks.ray module
-------------------------------
+Hook
+-----
 
 .. automodule:: ray_provider.hooks.ray
    :members:
    :undoc-members:
    :show-inheritance:
-
-Module contents
----------------
-
-.. automodule:: ray_provider.hooks
-   :members:
-   :undoc-members:
-   :show-inheritance:
diff --git a/docs/api/ray_provider.operators.rst b/docs/api/ray_provider.operators.rst
index 9bd1002..af1bfa6 100644
--- a/docs/api/ray_provider.operators.rst
+++ b/docs/api/ray_provider.operators.rst
@@ -1,21 +1,7 @@
-ray\_provider.operators package
-===============================
-
-Submodules
-----------
-
-ray\_provider.operators.ray module
-----------------------------------
+Operators
+---------
 
 .. automodule:: ray_provider.operators.ray
    :members:
    :undoc-members:
    :show-inheritance:
-
-Module contents
----------------
-
-.. automodule:: ray_provider.operators
-   :members:
-   :undoc-members:
-   :show-inheritance:
diff --git a/docs/api/ray_provider.triggers.rst b/docs/api/ray_provider.triggers.rst
index bbe58f0..4b71046 100644
--- a/docs/api/ray_provider.triggers.rst
+++ b/docs/api/ray_provider.triggers.rst
@@ -1,21 +1,7 @@
-ray\_provider.triggers package
-==============================
-
-Submodules
-----------
-
-ray\_provider.triggers.ray module
----------------------------------
+Trigger
+--------
 
 .. automodule:: ray_provider.triggers.ray
    :members:
    :undoc-members:
    :show-inheritance:
-
-Module contents
----------------
-
-.. automodule:: ray_provider.triggers
-   :members:
-   :undoc-members:
-   :show-inheritance:
diff --git a/docs/conf.py b/docs/conf.py
index a942bcd..b5feb91 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -29,5 +29,7 @@
 # -- Options for HTML output -------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
 
-html_theme = "pydata_sphinx_theme"
-html_title = "astro-provider-ray"
+html_theme = "furo"
+html_title = "Package documentation"
+
+html_last_updated_fmt = "%b %d, %Y"
diff --git a/docs/getting_started/code_samples.rst b/docs/getting_started/code_samples.rst
index 6dde7f1..c4bca26 100644
--- a/docs/getting_started/code_samples.rst
+++ b/docs/getting_started/code_samples.rst
@@ -1,22 +1,42 @@
 Code Samples
-^^^^^^^^^^^^^^^
+============
 
 There are two main scenarios for using this provider:
 
 Scenario 1: Setting up a Ray cluster on an existing Kubernetes cluster
-""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+----------------------------------------------------------------------
 
-If you have an existing Kubernetes cluster and want to install a Ray cluster on it, and then run a Ray job, you can use the ``SetupRayCluster``, ``SubmitRayJob``, and ``DeleteRayCluster`` operators. Here's an example DAG (``setup_teardown.py``):
+If you have an existing Kubernetes cluster and want to install a Ray cluster on it, and then run a Ray job, you can use the ``SetupRayCluster``, ``SubmitRayJob``, and ``DeleteRayCluster`` operators.
+
+This involves two steps:
+
+Create a YAML file defining your Ray cluster configuration.
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+.. note::
+   ``spec.headGroupSpec.serviceType`` must be 'LoadBalancer' to spin up a service that exposes your dashboard externally
+
+.. literalinclude:: ../../example_dags/scripts/ray.yaml
+
+Save this file in a location accessible to your Airflow installation, and reference it in your DAG code.
+
+
+Sample DAG (``setup_teardown.py``):
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 .. literalinclude:: ../../example_dags/setup-teardown.py
+   :language: python
+   :linenos:
 
 Scenario 2: Using an existing Ray cluster
-"""""""""""""""""""""""""""""""""""""""""
+-----------------------------------------
 
 If you already have a Ray cluster set up, you can use the ``SubmitRayJob`` operator or ``task.ray()`` decorator to submit jobs directly.
 
-In the below example(``ray_taskflow_example.py``), the ``@task.ray`` decorator is used to define a task that will be executed on the Ray cluster.
+In the example below (``ray_taskflow_example.py``), the ``@task.ray`` decorator is used to define a task that will be executed on the Ray cluster:
 
 .. literalinclude:: ../../example_dags/ray_taskflow_example.py
+   :language: python
+   :linenos:
 
-Remember to adjust file paths, connection IDs, and other specifics according to your setup.
+.. note::
+   Remember to adjust file paths, connection IDs, and other specifics according to your setup.
diff --git a/docs/getting_started/setup.rst b/docs/getting_started/setup.rst
index 753affd..2c0d093 100644
--- a/docs/getting_started/setup.rst
+++ b/docs/getting_started/setup.rst
@@ -1,47 +1,42 @@
-Getting started
-~~~~~~~~~~~~~~~
+Getting Started
+===============
 
-1. Pre-requisites
-^^^^^^^^^^^^^^^^^
-
-The ``SetupRayCluster`` and the ``DeleteRayCluster`` operator require helm to install the kuberay operator. See the `installing Helm `_ page for more details.
-
-2. Installation
-^^^^^^^^^^^^^^^
+**1. Install Helm:**
 
 .. code-block:: sh
 
-    pip install astro-provider-ray
+    curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
+    chmod 700 get_helm.sh
+    ./get_helm.sh
 
-The astro-provider-ray source code is available on this GitHub `page `_
+See the `installing Helm `_ page for other options.
 
-3. Setting up the connection
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+.. note::
+   This step is only required if you intend to use the ``SetupRayCluster`` and ``DeleteRayCluster`` operators.
 
-For SubmitRayJob operator (using an existing Ray cluster)
-"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+**2. Install the Python package:**
 
-- **Connection Type**: "Ray"
-- **Connection ID**: e.g., "ray_conn"
-- **Ray dashboard URL**: URL of the Ray dashboard
-- **Optional fields**: Cookies, Metadata, Headers, Verify SSL
+.. code-block:: sh
+
+    pip install astro-provider-ray
 
-For SetupRayCluster and DeleteRayCluster operators
-""""""""""""""""""""""""""""""""""""""""""""""""""
 
-- **Connection Type**: "Ray"
-- **Connection ID**: e.g., "ray_k8s_conn"
-- **Kube config path** OR **Kube config content (JSON format)** : Kubeconfig of the kubernetes cluster where Ray cluster must be setup
-- **Namespace**: The k8 namespace where your cluster must be created. If not provided, "default" is used
-- **Optional fields**: Cluster context, Disable SSL, Disable TCP keepalive
+**3. Setting up the connection**
 
-5. Setting up the Ray cluster spec
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+.. image:: ../_static/connection.png
+   :align: center
 
-Create a YAML file defining your Ray cluster configuration. Example:
+- For SubmitRayJob operator (using an existing Ray cluster)
 
-**Note:** ``spec.headGroupSpec.serviceType`` must be a 'LoadBalancer' to spin a service that exposes your dashboard externally
+  - **Connection Type**: "Ray"
+  - **Connection ID**: e.g., "ray_conn"
+  - **Ray dashboard URL**: URL of the Ray dashboard
+  - **Optional fields**: Cookies, Metadata, Headers, Verify SSL
 
-.. literalinclude:: ../../example_dags/scripts/ray.yaml
+- For SetupRayCluster and DeleteRayCluster operators
 
-Save this file in a location accessible to your Airflow installation, and reference it in your DAG code.
+  - **Connection Type**: "Ray"
+  - **Connection ID**: e.g., "ray_k8s_conn"
+  - **Kube config path** OR **Kube config content (JSON format)**: Kubeconfig of the Kubernetes cluster where the Ray cluster must be set up
+  - **Namespace**: The K8s namespace where your cluster must be created. If not provided, "default" is used
+  - **Optional fields**: Cluster context, Disable SSL, Disable TCP keepalive
diff --git a/docs/index.rst b/docs/index.rst
index bd7ee41..9698b41 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,5 +1,5 @@
-Welcome to astro-provider-ray documentation!
-===================================================
+Welcome to Ray provider documentation!
+======================================
 
 .. toctree::
    :hidden:
@@ -27,12 +27,18 @@ Benefits of using this provider include:
 Table of Contents
 -----------------
 
+- `Quickstart`_
 - `What is the Ray provider?`_
 - `Components`_
 - `Contact the Devs`_
 - `Changelog`_
 - `Contributing Guide`_
 
+Quickstart
+----------
+
+See the :doc:`Getting Started ` page for detailed instructions on how to begin using the provider.
+
 What is the Ray provider?
 -------------------------
 
@@ -55,7 +61,7 @@ The architecture diagram above shows how we can deploy both Airflow & Ray on a K
 
 
 Use Cases
-^^^^^^^^
+^^^^^^^^^
 - **Scalable ETL**: Orchestrate and monitor Ray jobs on on-demand compute clusters using the Ray Data library. These operations could be custom Python code or ML model inference.
 - **Model Training**: Schedule model training or fine-tuning jobs on flexible cadences (daily/weekly/monthly). Benefits include:
 
@@ -88,12 +94,12 @@ Decorators
 Operators
 ^^^^^^^^^
 - **SetupRayCluster**: Sets up or Updates a ray cluster on kubernetes using a kubeconfig input provided through the Ray connection
-- **DeleteRayCluster**: Deletes and existing Ray cluster on kubernetes using the same Ray connection
+- **DeleteRayCluster**: Deletes an existing Ray cluster on kubernetes using the same Ray connection
 - **SubmitRayJob**: Submits jobs to a Ray cluster using a specified host name and monitors its execution
 
 Triggers
 ^^^^^^^^
-- **RayJobTrigger**: Monitors asynchronous job execution submitted via ``SubmitRayJob`` or using the ``@task.ray()`` decorator and prints real-time logs to the the Airflow UI
+- **RayJobTrigger**: Monitors asynchronous job execution submitted via the ``SubmitRayJob`` operator or using the ``@ray.task()`` decorator and prints real-time logs to the Airflow UI
 
 
 Contact the devs
diff --git a/pyproject.toml b/pyproject.toml
index 85d834d..266c2e6 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -102,6 +102,7 @@ dependencies = [
     "sphinx",
     "sphinx-autoapi",
     "sphinx-autobuild",
+    "furo",
 ]
 
 [tool.hatch.envs.docs.scripts]