2 votes · 0 answers · 50 views
I am trying to analyze the 30-day standardized precipitation index for a multi-state range of the southeastern US for the year 2016. I'm using xclim to process a direct pull of gridded daily ...
— helpmeplease
0 votes · 0 answers · 29 views
I need some advice. Right now I do some computation with the pandas library. The program uses multiprocessing and df.apply. A simple example showing my idea: import multiprocessing import ...
— luki
0 votes · 0 answers · 26 views
Using Python streamz and dask, I want to distribute the data of text files that are generated to threads, which will then process every new line generated inside those files. from streamz import Stream ...
— Ayan Banerjee
0 votes · 0 answers · 38 views
import os
from dask_cloudprovider.gcp import GCPCluster
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = r'C:\Users\Me\Documents\credentials\compute_engine_default_key\test-project123-...
— Adriano Matos
0 votes · 1 answer · 72 views
I am trying to deploy a dask cluster with 0 workers and 1 scheduler, and based on the workload scale the workers up as required. I found that adaptive deployment is the correct way to do this; I am using ...
— Arun Kumar
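For reference, a minimal local sketch of the adaptive pattern this question describes (LocalCluster stands in for the real deployment; the 0-to-4 worker bounds are arbitrary):

```python
from dask.distributed import Client, LocalCluster

# Start with zero workers; adapt() lets the scheduler scale
# between 0 and 4 workers based on pending work.
cluster = LocalCluster(n_workers=0)
cluster.adapt(minimum=0, maximum=4)
client = Client(cluster)

# Submitting a task triggers the adaptive scale-up.
result = client.submit(sum, range(100)).result()  # 4950
client.close()
cluster.close()
```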
0 votes · 1 answer · 237 views
I am trying to run a Dask scheduler and workers on a remote cluster using SLURMRunner from dask-jobqueue. I want to bind the Dask dashboard to 0.0.0.0 (so it's accessible via port forwarding) and ...
— user1834164
0 votes · 0 answers · 120 views
I am trying to get this code to work and then use it to train various models on two GPUs: from dask_cuda import LocalCUDACluster from dask.distributed import Client if __name__ == "__main__" ...
— Danilo Caputo
1 vote · 1 answer · 55 views
I am trying to learn dask, and have created the following toy example of a delayed pipeline:

+-----+  +-----+  +-----+
| baz +--+ bar +--+ foo |
+-----+  +-----+  +-----+

So baz has a dependency on ...
— Steve Lorimer
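The foo → bar → baz dependency chain can be sketched with dask.delayed; the function bodies here are invented, since the question's code is truncated:

```python
import dask

@dask.delayed
def foo():
    return 1

@dask.delayed
def bar(x):  # depends on foo's result
    return x + 1

@dask.delayed
def baz(x):  # depends on bar's result
    return x * 2

# Building the expression is lazy; compute() walks foo -> bar -> baz.
result = baz(bar(foo())).compute()  # (1 + 1) * 2 == 4
```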
0 votes · 1 answer · 43 views
I am using dask to parallelize an operation that is memory-bound, so I want to ensure each dask worker has access to a single NUMA node and prevent cross-node memory access. I can do this in the ...
— kgully
0 votes · 0 answers · 23 views
The code running on the dask worker calls asyncio.run() and proceeds to execute a series of async calls (on the worker's running event loop) that gather data, and then run a small computation. This ...
— Dirich
0 votes · 1 answer · 90 views
I need to find a way for a Python process to figure out if it was launched as part of a multiprocessing pool. I am using dask to parallelize calculations, using dask.distributed.LocalCluster. For UX ...
— pnjun
0 votes · 0 answers · 108 views
I am trying to read 23 CSV files into dask dataframes, merge them together using dask, and output to parquet. However, it's failing due to memory issues. I used to use pandas to join these together ...
— ifightfortheuserz
0 votes · 0 answers · 56 views
I have been trying to set up logging using the logging module in a Python script, and I have got it working properly: it can now log to both the console and a log file. But it fails when I set up a Dask ...
— RogUE
1 vote · 1 answer · 87 views
I'm trying to modularize my functions that use Dask, but I keep encountering the error "No module named 'setup'". I can't import any local module that is related to Dask, and currently, ...
— Anderson
0 votes · 1 answer · 80 views
I'm using dask to parallelize a simulation. It consists of a series of differential equations that are numerically solved using numpy arrays, compiled with numba @jit ...
— nsantana
0 votes · 0 answers · 172 views
I have some code that performs interpolation on a large number of arrays. This is extremely quick with numpy, but the data the code will work with in reality will often not fit in memory ...
— abinitio
0 votes · 1 answer · 296 views
I made my own filesystem in the fsspec library, and I am trying to read dask dataframes from this filesystem object to open the dataframe file. However, I am getting an error when I try to do this. ...
— Brian Moths
-1 votes · 1 answer · 192 views
I have implemented some data analysis in Dask using dask-distributed, but the performance is very far from the same analysis implemented in numpy/pandas, and I am finding it difficult to understand the ...
— abinitio
0 votes · 0 answers · 139 views
I am running the Dask scheduler on system A and workers on systems A and B. An NFS volume from system A is shared on the network with system B, and contains the data files. This folder has a symbolic ...
— Steffan
1 vote · 0 answers · 43 views
I have a program in which I define a class that is a subclass of a class I import. If I run this code without Dask, it runs successfully. When I plug in Dask, I get an error ...
— olivarb
1 vote · 0 answers · 92 views
This is an example:

import numpy as np
import zarr
from dask.distributed import Client, LocalCluster
from dask import array as da
from dask.distributed import progress

def same(x):
    return x

x = ...
— Kang Liang
1 vote · 0 answers · 186 views
I am trying to write some code using dask.distributed.Client and rioxarray to_raster that:
- Concatenates two rasters (dask arrays)
- Applies a function across all blocks in the concatenated array
- Writes ...
— katieb1
0 votes · 1 answer · 70 views
How does Dask manage file descriptors? For example, when creating a dask.array from an HDF5 file that is large enough to be chunked, do the created tasks inherit the file descriptor created ...
— Mitchou
1 vote · 1 answer · 68 views
Dask shows a slightly smaller size than the actual size of a numpy array. Here is an example of a numpy array that is exactly 32 MB:

import dask as da
import dask.array
import numpy as np

shape = (1000,...
— Ress
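The usual explanation for this mismatch is decimal versus binary units: dask's repr reports MiB (2^20 bytes), while "32 MB" counts 10^6 bytes. A quick check, using a 2000x2000 float64 array (exactly 32,000,000 bytes) since the question's shape is truncated:

```python
import numpy as np
import dask.array as da

x = np.zeros((2000, 2000), dtype="float64")  # 2000*2000*8 = 32_000_000 bytes
mb = x.nbytes / 1e6     # 32.0 megabytes (decimal units)
mib = x.nbytes / 2**20  # ~30.52 mebibytes (binary units, what dask displays)

d = da.from_array(x, chunks=(1000, 1000))  # its repr shows the MiB figure
```

So the array is not actually smaller; the two numbers just use different unit bases.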
1 vote · 0 answers · 187 views
I am trying to read the results of a query (from an AWS Athena database) into a dask dataframe, following the read_sql_query method in the official documentation. Here is how I am calling it: from dask ...
— Della
