3

I'm trying to learn how to use multiprocessing in Python. I read about multiprocessing, and I'm trying to do something like this:

I have the following class (partial code), which has a method to produce Voronoi diagrams:

import math
import random


class ImageData:

    def generate_voronoi_diagram(self, seeds):
        """
        Generate a voronoi diagram with *seeds* seeds
        :param seeds: the number of seeds in the voronoi diagram
        """
        nx = []
        ny = []
        gs = []
        for i in range(seeds):
            # Generate a cell position
            pos_x = random.randrange(self.width)
            pos_y = random.randrange(self.height)
            nx.append(pos_x)
            ny.append(pos_y)

            # Save the f(x,y) data
            x = Utils.translate(pos_x, 0, self.width, self.range_min, self.range_max)
            y = Utils.translate(pos_y, 0, self.height, self.range_min, self.range_max)
            z = Utils.function(x, y)

            gs.append(z)

        for y in range(self.height):
            for x in range(self.width):
                # Start from the largest possible distance (the image diagonal)
                d_min = math.hypot(self.width - 1, self.height - 1)
                j = -1
                for i in range(seeds):
                    # The distance from seed i to the x, y point being considered
                    d = math.hypot(nx[i] - x, ny[i] - y)
                    if d < d_min:
                        d_min = d
                        j = i
                # Colour the point with the value of its nearest seed
                self.data[x][y] = gs[j]

I have to generate a large number of these diagrams, which takes a lot of time, so I thought this was a typical problem to parallelize. I was doing it the "normal" (sequential) way, like this:

if __name__ == "__main__":
    entries = []
    for n in range(images):
        entry = ImD.ImageData(width, height)
        entry.generate_voronoi_diagram(seeds)
        entry.generate_heat_map_image("ImagesOutput/Entries/Entry"+str(n))
        entries.append(entry)

To parallelize this, I tried the following:

if __name__ == "__main__":
    entries = []
    seeds = np.random.poisson(100)
    p = Pool()
    entry = ImD.ImageData(width, height)
    res = p.apply_async(entry.generate_voronoi_diagram,(seeds))
    entries.append(entry)
    entry.generate_heat_map_image("ImagesOutput/Entries/EntryX")

But this doesn't work even for a single diagram, and I also don't know how to specify that it has to be done N times.

Any help would be greatly appreciated. Thanks.

2 Answers

1

Python's multiprocessing doesn't share memory (unless you explicitly tell it to). That means you won't see the "side effects" of any function that runs in a worker process. Your generate_voronoi_diagram method works by adding data to an entry value, which is a side effect. To see the results, you need to pass them back as a return value from your function.
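As a rough illustration of that point, here is a minimal Python 3 sketch. The Counter class and the two worker functions are hypothetical stand-ins, not part of the question's code. The worker that only mutates its argument leaves the parent's object unchanged, while the one that returns the object hands the modified copy back.

```python
from multiprocessing import Pool


class Counter:
    """Hypothetical stand-in for ImageData: holds state that a method mutates."""

    def __init__(self):
        self.value = 0

    def bump(self):
        self.value += 1  # side effect only, nothing is returned


def bump_only(counter):
    # Mutates the worker's copy; the parent never sees this change.
    counter.bump()


def bump_and_return(counter):
    # Mutates the worker's copy and sends that copy back to the parent.
    counter.bump()
    return counter


if __name__ == "__main__":
    original = Counter()
    with Pool(processes=2) as pool:
        pool.apply(bump_only, (original,))
        print(original.value)  # 0: the parent's object is unchanged

        returned = pool.apply(bump_and_return, (original,))
        print(returned.value)  # 1: the returned copy carries the change
```

Calling bump_only directly in the parent process would of course change the object; the difference only shows up once the call goes through the pool.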

Here's one approach that handles the entry instance as an argument and return value:

def do_voroni(entry, seeds):
    entry.generate_voronoi_diagram(seeds)
    return entry

Now, you can use this function in your worker processes:

if __name__ == "__main__":
    entries = [ImD.ImageData(width, height) for _ in range(images)]
    seeds = numpy.random.poisson(100, images) # array of values

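    pool = multiprocessing.Pool()
    # starmap_async returns an AsyncResult; get() blocks until all entries are back
    results = pool.starmap_async(do_voroni, zip(entries, seeds)).get()
    for i, e in enumerate(results):
        e.generate_heat_map_image("ImagesOutput/Entries/Entry{:02d}".format(i))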

The e values in the loop are not references to the values in the entries list. Rather, they're copies of those objects, which have been passed out to the worker process (which added data to them) and then passed back.
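Objects travel to and from worker processes by being pickled, so a pickle round trip inside a single process gives a rough picture of what comes back from the pool. This is only a sketch: the Entry class and the round_trip helper are hypothetical names used for illustration.

```python
import pickle


class Entry:
    """Hypothetical stand-in for ImageData."""

    def __init__(self, width, height):
        self.data = [[0] * height for _ in range(width)]


def round_trip(obj):
    # Roughly what happens to an object sent to a worker and back:
    # it is serialized, then rebuilt as a brand-new object.
    return pickle.loads(pickle.dumps(obj))


original = Entry(2, 2)
copy = round_trip(original)
copy.data[0][0] = 42        # change the copy, as a worker would

print(copy is original)     # False: a distinct object
print(original.data[0][0])  # 0: the original is untouched
```

This is why the loop has to work with the returned objects rather than with the entries list.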

4 Comments

Huuum, thank you. But I'm getting a "AttributeError: 'Pool' object has no attribute 'starmap_async'". But looking at the reference I can find this method.
Yeah, actually, I don't have Python 3 installed, since other things that I need to use, such as matplotlib, have parts that didn't work with it at all. Is it possible to do this another way? Besides, there is a single seeds value for all the Voronoi diagrams; it represents the number of sites.
@pceccon: Ah, it does seem the starmap methods are only in Python 3. You can use the map_async method in Python 2, but you'll need to change the do_voroni function to accept a 2-tuple which you unpack in the function body to get the entry and seeds values: def do_voroni(tup): entry, seeds = tup; ...
@pceccon: By the way, if you're using Python 2, you probably want to make your ImageData class inherit from object, so you get a "new-style" class rather than the "old-style" that's deprecated (and only kept for backwards compatibility reasons). Inheriting from object (or any other new-style base class) will do the job. I doubt it will make a major difference in how your code works, but there are a few language features that only work with new-style classes. New-style is the default in Python 3!
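The map_async variant suggested in the comments above might look roughly like this. It is a sketch under assumptions: the Entry class is a hypothetical stub standing in for ImageData, and the seed count of 100 and the four images are made-up values. The code avoids Python 3-only features, so it should also run on Python 2.

```python
from multiprocessing import Pool


class Entry(object):
    """Hypothetical stub for ImageData (new-style class under Python 2)."""

    def __init__(self):
        self.seeds_used = None

    def generate_voronoi_diagram(self, seeds):
        self.seeds_used = seeds  # placeholder for the real computation


def do_voroni(tup):
    # map_async passes a single argument per task, so unpack the pair here.
    entry, seeds = tup
    entry.generate_voronoi_diagram(seeds)
    return entry


if __name__ == "__main__":
    entries = [Entry() for _ in range(4)]  # assumed number of images
    seeds = [100] * len(entries)           # same seed count for every diagram

    pool = Pool()
    results = pool.map_async(do_voroni, zip(entries, seeds)).get()
    pool.close()
    pool.join()

    for i, e in enumerate(results):
        print("{} {}".format(i, e.seeds_used))
```

Since the seed count is the same for every diagram, repeating it in a list keeps the one-tuple-per-task shape that map_async expects.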
0

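I might be wrong, but I think you should pass the arguments as a tuple (note the trailing comma) and then call get() on the result:

res = p.apply_async(entry.generate_voronoi_diagram, (seeds,))

res.get(timeout=1)

You may then get "Can't pickle type 'instancemethod'", because on Python 2 a bound method can't be pickled and sent to a worker.

I think the easiest way is something like this: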

import random
from multiprocessing import Pool


class ImageData:

    def generate_voronoi_diagram(self, seeds):
        pass  # body omitted

    def generate_heat_map_image(self, path):
        pass  # body omitted

def allinone(obj, seeds, path):
    obj.generate_voronoi_diagram(seeds)
    obj.generate_heat_map_image(path)

if __name__ == "__main__":
    entries = []
    seeds = random.random()
    p = Pool()
    entry = ImageData()
    res = p.apply_async(allinone, (entry, seeds, 'tmp.txt'))
    res.get(timeout=1)   
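To do this N times, one option is to submit one apply_async call per image and then collect the results. The following is only a sketch for Python 3: the ImageData stub, the output_path helper, and the values of images and seeds are assumptions made for illustration, and allinone is changed to return the object so the parent can keep it.

```python
from multiprocessing import Pool


class ImageData:
    """Hypothetical stub; the real class is the one in the question."""

    def __init__(self):
        self.log = []

    def generate_voronoi_diagram(self, seeds):
        self.log.append(("voronoi", seeds))

    def generate_heat_map_image(self, path):
        self.log.append(("heat_map", path))


def allinone(obj, seeds, path):
    obj.generate_voronoi_diagram(seeds)
    obj.generate_heat_map_image(path)
    return obj  # return the worker's copy so the parent can keep it


def output_path(n):
    # Path pattern taken from the question's sequential loop.
    return "ImagesOutput/Entries/Entry" + str(n)


if __name__ == "__main__":
    images = 4   # assumed number of diagrams
    seeds = 100  # assumed seed count, shared by all diagrams

    with Pool() as pool:
        # Submit every task first so they can run in parallel...
        pending = [
            pool.apply_async(allinone, (ImageData(), seeds, output_path(n)))
            for n in range(images)
        ]
        # ...then wait for each result in submission order.
        entries = [res.get() for res in pending]

    print(len(entries))
```

Calling get() only after all tasks are submitted matters: calling it right after each apply_async would make the loop wait for every task in turn and run them one at a time.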
