
I do some computationally expensive tasks in Python and found the threading module for parallelization. I have a function which does the computation and returns an ndarray as its result. Now I want to know how I can parallelize my function and get back the calculated arrays from each thread.

The following example is strongly simplified, with lightweight functions and calculations.

import threading
import numpy as np

def calculate_result(value):
    a = np.linspace(1.0, 1000.0, num=10000)   # just an example
    result = value * a
    return result

inputs = [1, 2, 3, 4]

for i in range(len(inputs)):
    t = threading.Thread(target=calculate_result, args=(inputs[i],))
    t.start()
    # Here I want to receive the return value from the thread

I am looking for a way to get the return value of the function from each thread, because in my task each thread calculates different values.

I found another question (how to get the return value from a thread in python?) where someone has a similar problem (but without ndarrays), which is handled with ThreadPool and async...
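
From that question, I understand the approach would look roughly like this (untested sketch, reusing calculate_result and inputs from the example above):

from multiprocessing.pool import ThreadPool

pool = ThreadPool(processes=len(inputs))
# apply_async returns an AsyncResult object; .get() hands back the ndarray
async_results = [pool.apply_async(calculate_result, (x,)) for x in inputs]
results = [r.get() for r in async_results]   # four ndarrays, in input order
pool.close()
pool.join()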

-------------------------------------------------------------------------------

Thanks for your answers! With your help I am now looking for a way to solve my problem with the multiprocessing module. To give you a better understanding of what I do, see the following explanation.

Explanation:

My 'input_data' is an ndarray with 282240 elements of type uint32.

In 'calculation_function()' I use a for loop to calculate a result from every 12 bits and put it into 'output_data'.

Because this is very slow, I split my input_data into e.g. 4 or 8 parts and calculate each part with calculation_function().

Now I am looking for a way to parallelize these 4 or 8 function calls.

The order of the data is essential, because the data is an image and each pixel has to be at the correct position. So function call no. 1 calculates the first pixels and the last function call the last pixels of the image.

The calculations work fine and the image can be completely rebuilt by my algorithm, but I need the parallelization to speed it up for time-critical aspects.

Summary: One input ndarray is divided into 4 or 8 parts. Each part contains 70560 or 35280 uint32 values. From every 12 bits I calculate one pixel, using 4 or 8 function calls. Each function returns one ndarray with 188160 or 94080 pixels. All return values are put together in a row and reshaped into an image.
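
For clarity, the sizes work out like this (with 4 parts; the 8-part case simply halves the per-part numbers):

    282240 uint32 / 4 parts        = 70560 uint32 per part
    70560 * 32 bit                 = 2257920 bit per part
    2257920 bit / 12 bit per pixel = 188160 pixels per part
    4 * 188160 pixels              = 752640 pixels in the whole image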

What already works: The calculations are working and I can reconstruct my image.

Problem: The function calls are done serially, one after another, and each image reconstruction is very slow.

Main goal: Speed up the image reconstruction by parallelizing the function calls.

Code:

def decompress(payload, WIDTH, HEIGHT):
    # INPUTS / OUTPUTS
    n_threads = 4
    img_input = np.fromstring(payload, dtype='uint32')
    img_output = np.zeros((WIDTH * HEIGHT), dtype=np.uint32)
    n_elements_part = np.int(len(img_input) / n_threads)
    input_part = np.zeros((n_threads, n_elements_part)).astype(np.uint32)
    output_part = np.zeros((n_threads, np.int(n_elements_part / 3 * 8))).astype(np.uint32)

    # DEFINE PARTS (here 4 different ones)
    start = np.zeros(n_threads).astype(np.int)
    end = np.zeros(n_threads).astype(np.int)
    for i in range(0, n_threads):
        start[i] = i * n_elements_part
        end[i] = (i + 1) * n_elements_part - 1

    # COPY IMAGE DATA
    for idx in range(0, n_threads):
        input_part[idx, :] = img_input[start[idx]:end[idx] + 1]

    # FUNCTION CALLS that should be parallelized
    for idx in range(0, n_threads):
        output_part[idx, :] = decompress_part2(input_part[idx], output_part[idx])

    # COPY PARTS INTO THE IMAGE
    img_output[0     : 188160] = output_part[0, :]
    img_output[188160: 376320] = output_part[1, :]
    img_output[376320: 564480] = output_part[2, :]
    img_output[564480: 752640] = output_part[3, :]

    # RESHAPE IMAGE
    img_output = np.reshape(img_output, (HEIGHT, WIDTH))

    return img_output

Please don't mind my beginner programming style :) I am just looking for a way to parallelize the function calls with the multiprocessing module and get back the returned ndarrays.

Thank you so much for your help!

4 Comments

  • Due to the GIL, threads are not good for parallelizing computationally expensive tasks. Use the multiprocessing module instead. Commented Nov 2, 2017 at 14:38
  • Write to a global variable. Commented Nov 2, 2017 at 14:44
  • Do you want each thread's result or a list of all results? Commented Nov 2, 2017 at 14:49
  • Each thread returns one ndarray, which will be put together later in the algorithm. Commented Nov 2, 2017 at 14:53

2 Answers


You can use a pool from the multiprocessing module:

    from multiprocessing.dummy import Pool

    def test(a):
        return a

    p = Pool(3)                          # pool with 3 workers
    a = p.starmap(test, zip([1, 2, 3]))  # a == [test(1), test(2), test(3)]
    print(a)
    p.close()
    p.join()
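
Note that zip([1, 2, 3]) just wraps each value in a 1-tuple so that starmap can unpack it; for a single-argument function, p.map(test, [1, 2, 3]) is equivalent.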

2 Comments

What does the 3 stand for in the Pool function? Number of processes?
Yes, that's correct. Check docs.python.org/2/library/multiprocessing.html. Kindly upvote if you found my answer useful, thanks :)

kar's answer works; however, keep in mind that it uses the .dummy module, which might be limited by the GIL. Here's more info on it: multiprocessing.dummy in Python is not utilising 100% cpu
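
For the question's actual use case, a rough sketch with a real process pool might look like this. It assumes decompress_part2 is defined at module level (so it can be pickled) and is changed to take just its input slice and return the decoded ndarray; untested:

    import numpy as np
    from multiprocessing import Pool

    def decompress(payload, WIDTH, HEIGHT):
        n_workers = 4
        img_input = np.fromstring(payload, dtype='uint32')
        # np.split keeps the order, so part 0 holds the first pixels of the image
        input_parts = np.split(img_input, n_workers)

        with Pool(processes=n_workers) as pool:
            # pool.map() returns the results in the same order as input_parts,
            # so the pixel positions stay correct when the parts are joined
            output_parts = pool.map(decompress_part2, input_parts)

        img_output = np.concatenate(output_parts)
        return np.reshape(img_output, (HEIGHT, WIDTH))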

