57 questions
0
votes
0
answers
23
views
Dask worker running long calls
The code running on the dask worker calls asyncio.run() and proceeds to execute a series of async calls (on the worker running event_loop) that gather data, and then run a small computation.
This ...
0
votes
1
answer
247
views
Connection problems with dask scheduler
I've set up a kubernetes cluster with GKE and installed the dask-kubernetes-operator.
When I try to start the cluster like this
cluster: KubeCluster = KubeCluster(custom_cluster_spec="cluster....
0
votes
1
answer
78
views
Mounting a folder in Dask distributed AWS ECS/EC2 cluster
I am using the dask distributed package to create an EC2/ECS cluster, and I want to read the ML models within the workers, something like
def read_model(model_path):
model = pickle.load(model_path)
...
0
votes
1
answer
306
views
Issue deploying latest version of daskhub helm chart in GKE
I am trying to deploy the latest version of 'daskhub' in a GKE cluster (v1.21.12-gke.1700), but getting the below error with 'traefik'
helm upgrade --wait --install --dry-run --debug --render-...
0
votes
1
answer
229
views
'HelmCluster' object has no attribute 'status' when connecting to Dask release from GCE
I've deployed the dask helm chart on GKE and can access the cluster with distributed.Client.
Now I need to connect to dask cluster with dask_kubernetes.HelmCluster, but it raises this exception. Code ...
2
votes
1
answer
442
views
Is it possible to use system memory instead of GPU memory for processing Dask tasks
We have been running DASK clusters on Kubernetes for some time. Up to now, we have been using CPUs for processing and, of course, system memory for storing our Dataframe of around 1,5 TB (per DASK ...
0
votes
1
answer
444
views
Dask Distributed: How to delete uploaded files from the cluster
I wanted to know if there's a function in dask.distributed that removes the files uploaded to the cluster using the client.upload_file()?
Basically, the opposite of the upload_file() function.
best ...
0
votes
1
answer
613
views
dask-kubernetes KubeCluster stuck
I'm trying to get up and running with dask on kubernetes. Below is effectively a hello world for dask-kubernetes, but I'm stuck on the error below.
main.py:
import os
from dask_kubernetes import ...
2
votes
1
answer
683
views
Transitioning from Local Dask to a Cluster …
I have a simple embarrassingly parallel program that I am successfully running locally on Dask. Yay! Now I want to move it to a cluster and crank up the problem size. In this case, I am using GCP. I ...
2
votes
1
answer
458
views
Why does starting daskdev/dask into a Pod fail?
Why does kubectl run dask --image daskdev/dask fail?
# starting the container with docker to make sure it basically works
➜ ~ docker run --rm -it --entrypoint bash daskdev/dask:latest
(base) root@...
3
votes
0
answers
189
views
How to View Dask Dashboard in Dask Gateway when using a private IP address/VPC?
We deployed Dask Gateway on Kubernetes on Google Cloud Platform. We are currently using an internal TCP load balancer to expose the traefik proxy for security purposes. Our users are able to create a ...
2
votes
1
answer
2k
views
How am I supposed to connect to a Dask-gateway deployed in Kubernetes from an outside service?
I'm a bit confused on how exactly I'm supposed to connect to a deployed Dask cluster created via Dask-helm chart from an external service. I deployed a Dask cluster as explained here
After a ...
0
votes
1
answer
322
views
How do Dask bag partitions and workers correlate?
I'm using a vanilla Dask-Kubernetes setup with two workers and one scheduler to iterate over the lines of some JSON file (and apply some functions which don't appear here for simplicity). I see only ...
2
votes
0
answers
176
views
ValueError: Unknown fields ['image']
I am trying to deploy Dask Gateway integrated with JupyterHub which is the reason I decided to give DaskHub Chart a try.
After following the instructions on https://docs.dask.org/en/latest/setup/...
1
vote
0
answers
127
views
Why does my Dask job's performance get worse after five workers?
I am running Dask on an eight-node Kubernetes cluster with my manifest specifying one scheduler replica and eight worker replicas. My code is processing 80 files of about equal size, and I wanted to ...
0
votes
1
answer
584
views
dask kubernetes aks (azure) virtual nodes
Using the code below it is possible to create a dask kubernetes cluster in Azure AKS.
It uses a remote scheduler (dask.config.set({"kubernetes.scheduler-service-type": "LoadBalancer"...
2
votes
1
answer
147
views
How to configure a GCP cluster for dask-workers in a different region from where the scheduler was started
I have one kubernetes cluster in region us-east1 where dask-scheduler was started, and I want to start another cluster in region us-west1 where I would like to run dask-workers. As I understand connection ...
1
vote
1
answer
2k
views
Why do my Dask Futures get stuck in 'pending' and never finish?
I have some long-running code (~5-10 minute processing) that I'm trying to run as a Dask Future. It's a series of several discrete steps that I can either run as one function:
result : Future = ...
0
votes
1
answer
495
views
What causes Dask futures to get stuck in 'pending' state?
I created my own very slightly modified Dockerfile based on the dask-docker Dockerfile that installs adlfs and copies one of my custom libraries into the container in order to make it available to all ...
0
votes
1
answer
113
views
How to Send .pem file to Dask Cluster?
I have a dask expression as follows where I'm trying to run a sqlalchemy query in a distributed way. However, it references a .pem key file that's inputted in the connect_args parameter. How do I ...
1
vote
1
answer
336
views
dask-kubernetes: Issue creating pod with uppercase username
I am learning dask-kubernetes on GKE.
I stumbled across an asyncio error (ERROR:asyncio:Task exception was never retrieved).
See steps below for the issue.
However, additional guidance on using ...
0
votes
1
answer
372
views
How does Dask execute code on multiple VMs in the cloud
I wrote a program with dask and delayed and now I want to run it on several machines in the cloud. But there's one thing I don't understand - how does dask run the code on multiple machines in the ...
1
vote
1
answer
556
views
How do you mount volumes on Dask workers with dask-kubernetes?
I used the following code to create a cluster
from dask_kubernetes import KubeCluster
cluster = KubeCluster.from_yaml('worker.yaml')
cluster.adapt(minimum=1, maximum=10)
with the following yaml code (...
3
votes
0
answers
967
views
How to pick proper number of threads, workers, processes for Dask when running in an ephemeral environment as single machine and cluster
Our company is currently leveraging prefect.io for data workflows (ELT, report generation, ML, etc). We have just started adding the ability to do parallel task execution, which is powered by Dask. ...
0
votes
1
answer
258
views
dask-kubernetes zero workers on GKE
Noob here. I want to have a Dask install with a worker pool that can grow and shrink based on current demands. I followed the instructions in zero to jupyterhub to install on GKE, and then went ...