I have the code:
for (int i = 0; i < (int)(kpts1.size()); i++) {
    perform_operation(kpts1[i], *kpts2[i]);
}
where kpts1 and kpts2 are std::vector<> types. The function perform_operation takes kpts1[i], performs an operation on it, and stores the result in kpts2[i].
It seems like I should be able to multithread this. Since each iteration of the for loop is independent of the others, I should be able to run this in parallel with as many threads as there are CPU cores, right?
I've seen several SO questions that kind of answer this, but they don't really get at how to parallelize a simple for loop, and I'm not sure whether it is possible to read from the same kpts1 variable and write to the same kpts2 variable from several threads.
Or am I misunderstanding something, and this is not parallelizable?
I'd be happy if I could find a solution in C++ or C, but right now I am stuck.
If kpts1.size() is large and/or perform_operation() is very expensive, then it might be worth parallelizing. I would like to split up kpts1 so that the max number of threads equals the number of cores (I'm still deciding which AWS instance I will be using, so I don't know the max cores at the moment). But for the sake of argument, let's say I want to split this up into 8 processes. How could I do that?
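To make the question concrete, here is a minimal sketch of the kind of split I have in mind, using std::thread from C++11. It is only an illustration under several assumptions: the element type is a plain int, perform_operation here is a hypothetical stand-in that doubles its input, the output vector holds values directly (so the dereference from my loop is dropped), and process_in_parallel is a name I made up. As far as I understand, writing to distinct elements of a std::vector from different threads is not a data race as long as nobody resizes the vector, which is what this sketch relies on.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real perform_operation():
// reads one input element and writes one output element.
void perform_operation(const int& in, int& out) { out = in * 2; }

// Hypothetical helper: splits the index range [0, kpts1.size()) into at most
// num_threads contiguous chunks and processes each chunk on its own thread.
// Assumes kpts2 already has the same size as kpts1.
void process_in_parallel(const std::vector<int>& kpts1,
                         std::vector<int>& kpts2,
                         unsigned num_threads = 8) {
    const std::size_t n = kpts1.size();
    if (n == 0 || num_threads == 0) return;

    // Ceiling division so that every element lands in some chunk.
    const std::size_t chunk = (n + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, n);
        // Each thread touches only indices in [begin, end), so no two
        // threads ever write to the same element of kpts2.
        workers.emplace_back([&kpts1, &kpts2, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                perform_operation(kpts1[i], kpts2[i]);
            }
        });
    }

    // Wait for all chunks to finish before returning.
    for (auto& w : workers) w.join();
}
```

A call would look like process_in_parallel(kpts1, kpts2, 8). Once the instance type is known, std::thread::hardware_concurrency() could supply the thread count instead of the hard-coded 8 (it may return 0 if the value cannot be determined). With GCC or Clang on Linux this usually needs the -pthread flag.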