Listing DaskGateway clusters created via Python code alongside those created via the dask labextension UI #204

Open
consideRatio opened this issue Jul 28, 2021 · 4 comments

@consideRatio

What happened:

I can create a dask-gateway cluster via the dask-labextension view, and it then shows up in the extension's list of clusters.

[Screenshot: starting-new-dask-cluster]

But, if I create a dask-gateway cluster from a notebook using code like below, then no dask cluster shows up in the list of clusters.

from dask_gateway import Gateway
gateway = Gateway()
cluster = gateway.new_cluster()

My wish

My wish is that the dask clusters I've created should be listed visually. I'm not sure if this is possible or not, but I'd like to describe this wish here to explore if we can make it happen one way or another.
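As a stopgap, clusters started from code can at least be enumerated from a notebook through the gateway client itself. The following is a minimal sketch, assuming the dask-gateway client's `Gateway.list_clusters()` method and the `name` and `status` attributes of the reports it returns; the helper name `summarize_clusters` is hypothetical, and none of this makes the clusters appear in the labextension sidebar.

```python
from typing import Any, Dict, List, Optional


def summarize_clusters(gateway: Optional[Any] = None) -> List[Dict[str, str]]:
    """Return the name and status of each cluster the gateway reports.

    `summarize_clusters` is an illustrative helper, not part of dask-gateway.
    If no gateway is passed, one is created from the dask configuration
    (for example the DASK_GATEWAY__* environment variables).
    """
    if gateway is None:
        # Imported lazily so the helper can be exercised without dask-gateway.
        from dask_gateway import Gateway

        gateway = Gateway()

    summaries = []
    for report in gateway.list_clusters():
        # The status is assumed to be an enum-like value; fall back to str().
        status = getattr(report.status, "name", str(report.status))
        summaries.append({"name": report.name, "status": status})
    return summaries
```

In a notebook, `summarize_clusters()` would then be called after `gateway.new_cluster()` to check that the new cluster is known to the gateway, even though the sidebar does not show it.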

Environment:

JupyterHub (1.1.1 Helm chart) + Dask-Gateway (0.9.0 Helm chart).

$ conda list | grep dask
dask                      2021.6.0           pyhd8ed1ab_0    conda-forge
dask-core                 2021.6.0           pyhd8ed1ab_0    conda-forge
dask-gateway              0.9.0            py38h578d9bd_0    conda-forge
dask-glm                  0.2.0                      py_1    conda-forge
dask-kubernetes           2021.3.1           pyhd8ed1ab_0    conda-forge
dask-labextension         5.0.2              pyhd8ed1ab_0    conda-forge
dask-ml                   1.9.0              pyhd8ed1ab_0    conda-forge
pangeo-dask               2021.06.05           hd8ed1ab_0    conda-forge
$ python --version
Python 3.8.10

Operating System: Ubuntu 20.04
Install method: conda-forge

# The current environment and dask configuration via environment
DASK_DISTRIBUTED__DASHBOARD_LINK=/user/{JUPYTERHUB_USER}/proxy/{port}/status
DASK_GATEWAY__ADDRESS=http://10.100.116.39:8000/services/dask-gateway/
DASK_GATEWAY__AUTH__TYPE=jupyterhub
DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE={JUPYTER_IMAGE_SPEC}
DASK_GATEWAY__PROXY_ADDRESS=gateway://traefik-prod-dask-gateway.prod:80
DASK_GATEWAY__PUBLIC_ADDRESS=/services/dask-gateway/
DASK_LABEXTENSION__FACTORY__CLASS=GatewayCluster
DASK_LABEXTENSION__FACTORY__MODULE=dask_gateway
DASK_ROOT_CONFIG=/srv/conda/etc
@ian-r-rose
Collaborator

Thanks for raising this @consideRatio . In general, this is a hard problem, as dask doesn't really have a built-in cluster discovery method. Short of port sniffing, I'm not sure I know of a good way to handle auto-detecting any cluster in a given notebook (or set of notebooks). Indeed, part of the reason for creating the cluster manager sidebar in the first place was to be able to build some user interfaces around starting, stopping, and scaling clusters that the extension can actually keep track of and reason about.

That being said, my goal for this extension is to get out of the game of managing clusters directly, and instead investigate a solution like dask-ctl. This could allow different cluster providers to set up their own discovery and control services, which the labextension could then consume. There is some detailed discussion of this in #189, I encourage you to weigh in!

@jacobtomlinson
Member

I would really like dask-ctl to be the solution for this.

@dharhas

dharhas commented Apr 13, 2023

Related question: how do you configure the lab extension to use dask-gateway for creating new clusters? I can't find that anywhere in the docs, but the screenshot above clearly shows it is possible.

@ian-r-rose
Collaborator

@dharhas I haven't tried it recently myself, but the configuration that @consideRatio posted above looks like the correct approach to me (though it could also be configured using a yml file or what have you):

# The current environment and dask configuration via environment
DASK_DISTRIBUTED__DASHBOARD_LINK=/user/{JUPYTERHUB_USER}/proxy/{port}/status
DASK_GATEWAY__ADDRESS=http://10.100.116.39:8000/services/dask-gateway/
DASK_GATEWAY__AUTH__TYPE=jupyterhub
DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE={JUPYTER_IMAGE_SPEC}
DASK_GATEWAY__PROXY_ADDRESS=gateway://traefik-prod-dask-gateway.prod:80
DASK_GATEWAY__PUBLIC_ADDRESS=/services/dask-gateway/
DASK_LABEXTENSION__FACTORY__CLASS=GatewayCluster
DASK_LABEXTENSION__FACTORY__MODULE=dask_gateway
DASK_ROOT_CONFIG=/srv/conda/etc

In particular, the factory class and factory module options tell the labextension what to use when starting a new cluster.
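To make the link between those environment variables and the configuration keys concrete, here is a rough sketch of the naming convention: the `DASK_` prefix is dropped, the name is lowercased, and each double underscore marks one level of nesting. The function `env_to_nested_config` is hypothetical and only approximates what dask does when it reads its environment (dask also tries to parse values as Python literals, which is skipped here).

```python
from typing import Any, Dict, Mapping


def env_to_nested_config(env: Mapping[str, str]) -> Dict[str, Any]:
    """Approximate how DASK_* environment variables map to nested config keys.

    Illustrative only: values are kept as plain strings.
    """
    config: Dict[str, Any] = {}
    for name, value in env.items():
        if not name.startswith("DASK_"):
            continue
        # DASK_LABEXTENSION__FACTORY__CLASS -> ["labextension", "factory", "class"]
        keys = name[len("DASK_"):].lower().split("__")
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return config


example_env = {
    "DASK_LABEXTENSION__FACTORY__CLASS": "GatewayCluster",
    "DASK_LABEXTENSION__FACTORY__MODULE": "dask_gateway",
    "DASK_GATEWAY__AUTH__TYPE": "jupyterhub",
}
nested = env_to_nested_config(example_env)
print(nested["labextension"]["factory"])
```

Under this reading, the two labextension variables correspond to a `labextension.factory` section with `class` and `module` entries, which is presumably what an equivalent yml file would need to contain.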
