Julia code seems to take longer the more threads it is run with

Essentially, I'm following the example in the guide (posting only the relevant part; the full example is in this repo):

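using Base.Threads: @spawn, nthreads

# Create the whole population of embryos up front
embryos = [fertilising_room(population_model) for _ in 1:POPULATION_SIZE]

# Split the embryos into roughly one chunk per thread and run each chunk in its own task
chunks = Iterators.partition(embryos, length(embryos) ÷ nthreads())
tasks = map(chunks) do chunk
    @spawn get_offspring(chunk)
end
# Wait for every task and concatenate the per-chunk results
all_offspring = vcat([fetch(task) for task in tasks]...)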

@info "All offspring -> $(length(all_offspring))"

These are the timings with 1, 2 and 4 threads:

 % time julia examples/multi_only_crossover.jl 
  Activating project at `~/Code/julia/BraveNewAlgorithm.jl`
WARNING: using Distances.pairwise in module BraveNewAlgorithm conflicts with an existing identifier.
[ Info: Number of threads -> 1
[ Info: Reading parameters file
[ Info: All offspring -> 1000000
julia examples/multi_only_crossover.jl  7,25s user 0,31s system 114% cpu 6,629 total
% time julia --threads 2 examples/multi_only_crossover.jl
  Activating project at `~/Code/julia/BraveNewAlgorithm.jl`
WARNING: using Distances.pairwise in module BraveNewAlgorithm conflicts with an existing identifier.
[ Info: Number of threads -> 2
[ Info: Reading parameters file
[ Info: All offspring -> 1000000
julia --threads 2 examples/multi_only_crossover.jl  7,36s user 0,36s system 118% cpu 6,508 total
% time julia --threads 4 examples/multi_only_crossover.jl
  Activating project at `~/Code/julia/BraveNewAlgorithm.jl`
WARNING: using Distances.pairwise in module BraveNewAlgorithm conflicts with an existing identifier.
[ Info: Number of threads -> 4
[ Info: Reading parameters file
[ Info: All offspring -> 1000000
julia --threads 4 examples/multi_only_crossover.jl  7,88s user 0,35s system 134% cpu 6,139 total

Am I doing something wrong here?

asked Jul 30 at 17:10

jjmerelo


Your problem does not appear to benefit from multithreading. It is probably limited by some other, slower and more fundamental resource such as memory management or disk I/O. Profiling the code to see where it actually spends its time may be useful. It might also be worth checking whether the warning is trying to tell you something important.

– Martin Brown

2025-07-30 19:46:38 +00:00
Commented Jul 30 at 19:46
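
One thing worth ruling out before profiling in depth: `time julia script.jl` measures the whole process, including Julia startup, package loading and compilation, which largely do not benefit from extra threads. Below is a minimal sketch of timing only the parallel section from inside Julia. Note that `work` is a hypothetical stand-in for `get_offspring`, and `run_chunked` and the input size are made up for illustration; they are not taken from the repo.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical stand-in for get_offspring: CPU-bound work on one chunk.
work(chunk) = [sum(sin, 1:x) for x in chunk]

# Same chunk-and-spawn pattern as in the question, wrapped in a function.
function run_chunked(items)
    # cld rounds up, so there are at most nthreads() chunks; max avoids a zero chunk size.
    chunk_size = max(1, cld(length(items), nthreads()))
    tasks = map(Iterators.partition(items, chunk_size)) do chunk
        @spawn work(chunk)
    end
    return reduce(vcat, fetch.(tasks))
end

items = collect(1:2_000)

# Warm-up call on a small input so that compilation is not part of the measurement.
run_chunked(items[1:10])

result = run_chunked(items)
elapsed = @elapsed run_chunked(items)      # wall-clock seconds for the parallel section only
allocated = @allocated run_chunked(items)  # bytes allocated by the parallel section

@info "Parallel section: threads=$(nthreads()) elapsed=$(elapsed)s allocated=$(allocated) bytes"
```

If the elapsed time of this section is small compared with the total reported by `time`, most of the run is spent outside the threaded code, and adding threads cannot change much. A large `allocated` figure would point towards memory management instead.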

Does it still happen if you choose a less ambitious population size, like 1000 or 10000? I suspect that you may be running into virtual memory limitations here unless you have a lot of RAM.

– Martin Brown

2025-07-31 07:48:47 +00:00
Commented Jul 31 at 7:48
@MartinBrown I do have a lot of RAM, and yes, it happens across the board, regardless of the population size.

– jjmerelo

2025-07-31 10:35:18 +00:00
Commented Jul 31 at 10:35

@jjmerelo This does not depend on the amount of RAM unless you actually run out of it and swap is used instead (which drastically reduces performance, even for sequential code). If the code is memory-bound, then what matters is the memory throughput. The latter is basically 8 * ram_frequency * number_of_channels bytes/s for modern (DDR4/DDR5) RAM DIMMs. See this article for information about RAM on Linux. RAM throughput is a common bottleneck in most parallel codes (especially with many threads). But yes, profiling is critical here.

– Jérôme Richard

2025-07-31 17:12:47 +00:00
Commented Jul 31 at 17:12
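
To make the throughput formula above concrete, here is a small worked example. It assumes that `ram_frequency` means the effective transfer rate of the module (for example 3200 million transfers per second for DDR4-3200) and that the machine has two memory channels; both values are illustrative assumptions, not measurements from the asker's system, and `peak_bandwidth` is a hypothetical helper name.

```julia
# Theoretical peak bandwidth in bytes/s:
# 8 bytes per transfer (64-bit channel) * transfers per second * number of channels.
peak_bandwidth(transfers_per_second, channels) = 8 * transfers_per_second * channels

# Assumed example configuration: DDR4-3200, dual channel.
bytes_per_second = peak_bandwidth(3200e6, 2)

println("Theoretical peak: ", bytes_per_second / 1e9, " GB/s")
```

With these assumed numbers the peak is 51.2 GB/s, shared by all threads. Once a memory-bound loop gets close to that figure, extra threads mostly wait on memory rather than doing more work.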

@jjmerelo To find out whether you have enough available RAM (or whether slow swap storage is being used instead), a basic htop (or top) command is generally enough on Linux (or the Task Manager on Windows).

– Jérôme Richard

2025-07-31 17:14:09 +00:00
Commented Jul 31 at 17:14
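
The same check can be approximated from inside Julia, as a rough complement to htop. The sketch below uses `Sys.total_memory()`, `Sys.free_memory()` and `Base.summarysize`; the `population` array is a hypothetical stand-in for the real embryos, whose actual type and size are not shown in the question.

```julia
# Total and currently free RAM in GiB, as reported by the operating system.
total_gib = Sys.total_memory() / 2^30
free_gib = Sys.free_memory() / 2^30

# Hypothetical stand-in for the population: 100_000 small vectors.
population = [rand(8) for _ in 1:100_000]

# Approximate memory reachable from the population, in GiB.
population_gib = Base.summarysize(population) / 2^30

println("RAM: ", round(free_gib; digits=2), " GiB free of ", round(total_gib; digits=2), " GiB")
println("Population: ", round(population_gib; digits=4), " GiB")
```

If the population size is a small fraction of the free memory, swapping is unlikely to be the cause, which leaves memory throughput and allocation overhead as the more plausible suspects.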
