Is it possible to use Dask distributed with pandas apply and multiprocessing?

I need some advice. Right now I do some computation with the pandas library. The program uses multiprocessing and df.apply. A simple example showing my idea is here:

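import multiprocessing

import pandas as pd


def f2(row, item):
    # do some computation with item and the row's values and return something
    return 'something'


def f1(item):
    d1 = {'col1': [4, 5, 6], 'col2': [7, 8, 9]}
    df = pd.DataFrame(d1)

    # axis=1 passes each row to f2; the default (axis=0) would pass columns
    df['col3'] = df.apply(f2, axis=1, args=(item,))


if __name__ == '__main__':

    l1 = [1, 2, 3]

    processes = []
    for item in l1:
        x = multiprocessing.Process(target=f1, args=(item,))
        x.start()
        processes.append(x)

    # wait for all worker processes to finish
    for x in processes:
        x.join()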

I have this PC and another one. That is why I am thinking about a "local cluster". How can I run this code using the dask distributed library?
What should I change in this code? Does dask distributed work with multiprocessing?

Will dask distributed be faster than working on a single machine?

It computes on a small df, ca. 25000 rows.

asked Aug 5 at 20:03

luki


Hi, is your example really working? (Sorry, I didn't test it.) It looks a bit weird to me: you'll end up with three different dataframes? With Dask, you don't have to worry about using multiprocessing, it does that for you, but you might want to check the Dask documentation, which answers your questions. You'll be able to use multiprocessing, distributed, or threaded mode, and apply your function in a parallel way.

– Guillaume EB

2025-08-08 17:38:30 +00:00
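
To illustrate the comment above, here is a minimal sketch of how the draft might look when Dask starts the worker processes instead of multiprocessing. It assumes dask[distributed] and pandas are installed. The body of f2 is a made-up placeholder, the name run_on_local_cluster is hypothetical, and n_workers=3 is an arbitrary choice, not a recommendation.

```python
import pandas as pd


def f2(row, item):
    # Placeholder computation (an assumption): combine the row's values with item.
    return row["col1"] * item + row["col2"]


def f1(item):
    # Build the small frame from the question and apply f2 row by row.
    df = pd.DataFrame({"col1": [4, 5, 6], "col2": [7, 8, 9]})
    df["col3"] = df.apply(f2, axis=1, args=(item,))
    return df


def run_on_local_cluster(items):
    # Imported lazily so f1 and f2 stay usable where dask is not installed.
    from dask.distributed import Client, LocalCluster

    # Process-based workers on this machine; Dask starts and stops them.
    with LocalCluster(n_workers=3, threads_per_worker=1, processes=True) as cluster:
        with Client(cluster) as client:
            futures = client.map(f1, items)  # one task per item
            return client.gather(futures)    # list of DataFrames, in input order
```

Because the workers are separate processes, run_on_local_cluster([1, 2, 3]) should be called from under an `if __name__ == '__main__':` guard, just like the multiprocessing version. Unlike the draft, f1 returns its dataframe so the results can be gathered back.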
This piece of code is a draft. Actually I want to check it with dask distributed, but I do not know how to configure it. Say I have PC1 and PC2 in a local network. Do you have an example? I will need an example for running a Python script: what should I do on PC1 and what on PC2? Right now I am reading the manuals, but they describe only a local cluster. Wha

– luki

2025-08-16 16:45:27 +00:00
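
For the PC1/PC2 setup asked about above, one possible arrangement (a sketch, not a tested configuration) is to run `dask scheduler` on PC1, then run `dask worker tcp://192.168.1.10:8786` on both PC1 and PC2, and finally run the script below on PC1. Older Dask releases name these commands `dask-scheduler` and `dask-worker`. The address 192.168.1.10 is a hypothetical stand-in for PC1's real address, the body of f2 is a placeholder, and the helper names are made up for illustration. Both machines would need matching versions of Python, dask, and pandas.

```python
import pandas as pd

# Hypothetical address of PC1 on the local network; replace with the real one.
PC1_HOST = "192.168.1.10"


def scheduler_address(host, port=8786):
    # 8786 is the Dask scheduler's default port.
    return f"tcp://{host}:{port}"


def f2(row, item):
    # Placeholder computation (an assumption), standing in for the real logic.
    return row["col1"] * item + row["col2"]


def f1(item):
    df = pd.DataFrame({"col1": [4, 5, 6], "col2": [7, 8, 9]})
    df["col3"] = df.apply(f2, axis=1, args=(item,))
    # Return a plain list so only the new column travels back over the network.
    return df["col3"].tolist()


def run_on_cluster(items, host=PC1_HOST):
    # Imported lazily; requires dask[distributed] on the machine running the script.
    from dask.distributed import Client

    # Connect to the scheduler that is already running on PC1.
    with Client(scheduler_address(host)) as client:
        futures = client.map(f1, items)  # tasks are spread across the workers
        return client.gather(futures)
```

Whether this beats a single machine is worth timing rather than assuming: with a frame of about 25000 rows, scheduling and network transfer may outweigh the gain from the second PC.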
