
I have C code that solves the Traveling Salesman Problem using a greedy algorithm. However, the current implementation is sequential, and I'd like to parallelize it using OpenMP to achieve better performance.

Here is the existing code:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <math.h>

#define N 5000
#define LARGE (2 * pow(N * 10, 2))

int X[N];
int Y[N];
float distance[N][N + 1];

int main(int na, char * arg[]) {
  assert(na == 2);

  printf("Dimension %s\n", arg[1]);
  int nn = atoi(arg[1]);
  assert(nn <= N);

  for (int i = 0; i < nn; i++)
    X[i] = rand() % (nn * 10);
  for (int i = 0; i < nn; i++)
    Y[i] = rand() % (nn * 10);

  for (int i = 0; i < nn; i++)
    distance[i][i] = 0;

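  // precompute the symmetric matrix of Euclidean distances between cities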
  for (int i = 0; i < nn; i++)
    for (int j = i + 1; j < nn; j++)
      distance[i][j] = distance[j][i] = sqrt(pow(X[i] - X[j], 2) + pow(Y[i] - Y[j], 2));

  float best = LARGE;
  int good[nn];
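  // try a greedy nearest-neighbour tour from every possible starting city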
  for (int first = 0; first < nn; first++) {
    float dist = 0;
    int path[nn];

    for (int i = 0; i < nn; i++)
      path[i] = -1;

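    // path[j] holds the visit order of city j; -1 means not visited yet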
    path[first] = 0;
    int current = first;

    for (int i = 1; i < nn; i++) {
      float dmin = LARGE;
      int index = 0;

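      // pick the nearest city that has not been visited yet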
      for (int j = 0; j < nn; j++) {
        if (path[j] == -1 && current != j && distance[current][j] < dmin) {
          dmin = distance[current][j];
          index = j;
        }
      }

      current = index;
      path[current] = i;
      dist += dmin;

      if (dist >= best) {
        dist = 0;
        break;
      }
    }

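    // a pruned start leaves dist == 0; otherwise close the tour back to the start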
    if (dist) {
      float dmin = distance[current][first];
      dist += dmin;

      if (dist < best) {
        for (int i = 0; i < nn; i++)
          good[path[i]] = i;
        best = dist;
      }

      distance[first][nn] = dist;
    }
  }

  printf("Solution :\n");
  for (int i = 0; i < nn; i++)
    printf("%d\n", good[i]);
  printf("Distance %g == %g\n", best, distance[good[0]][nn]);

  exit(0);
}

I'm looking for guidance on how to effectively parallelize this code using OpenMP to achieve the best possible speedup. Are there specific sections of the code that are more amenable to parallelization? What OpenMP constructs should I consider using, and are there any potential pitfalls I should be aware of?

Any insights, code snippets, or recommendations on how to approach parallelizing this code would be greatly appreciated. Thank you!

  • Not the solution, but use double, not float. Commented Mar 7, 2024 at 15:50
  • Note that this problem is NP-complete, so all known exact algorithms are exponential, but heuristics can be far faster than the naive greedy algorithm. The algorithm matters a lot compared to optimizations that can "just" speed up execution by a constant factor, like using multiple threads. Commented Mar 7, 2024 at 16:06
  • The Held–Karp algorithm runs in O(n² 2^n) rather than O(n!), which in practice is far faster while still providing exact results. AFAIK branch-and-bound algorithms are known to be pretty good for this problem. Commented Mar 7, 2024 at 16:11
  • As for OpenMP, you should focus on what is private to each thread and what is shared (and possibly protected). Then directives like #pragma omp parallel and #pragma omp for with some custom clauses are generally enough, as in the sketch after these comments (there are many tutorials showing how to use them, even official ones, not to mention books). Commented Mar 7, 2024 at 16:14
  • I don't think you can parallelize a greedy algorithm. Other algorithms for the TSP are much more amenable to parallelism. Commented Mar 7, 2024 at 19:31
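As a minimal sketch of what those directives look like in practice, assuming a hypothetical per-start cost array tour_cost[] rather than the question's full greedy pass (reduction(min:...) needs OpenMP 3.1 or newer):

#include <omp.h>

/* The outer loop over starting cities is split across threads, and each
 * thread's private minimum is combined with reduction(min:...).  The
 * function name and tour_cost[] are illustrative, not from the question. */
float shortest(int n, const float tour_cost[]) {
    float best = 1e30f;                     /* shared result */

    #pragma omp parallel for reduction(min : best)
    for (int first = 0; first < n; first++)
        if (tour_cost[first] < best)        /* 'best' is thread-private here */
            best = tour_cost[first];

    return best;
}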

1 Answer


If your objective is to parallelize your code with OpenMP and you prefer not to alter the algorithm, then you must use either a user-defined reduction or a manual reduction. Below is an example of the manual reduction, as requested in the comments.

// these are the shared variables you have to take care of
float best = LARGE;
int good[nn];

#pragma omp parallel
{
    // create private copies for each thread
    float private_best = LARGE;
    int private_good[nn];

    #pragma omp for
    for (int first = 0; first < nn; first++) {
        // use the private variables defined above inside the loop body:
        // best ---> private_best
        // good ---> private_good
        .....
        .....
    }

    // find the minimum of the 'private_best' values, copy it to 'best',
    // and copy the corresponding 'private_good' array back to 'good'
    #pragma omp critical
    if (private_best < best) {
        best = private_best;
        for (int i = 0; i < nn; i++)
            good[i] = private_good[i];
    }
}
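For reference, a complete, compilable version of this pattern applied to the question's loop could look like the sketch below: the greedy body is copied from the question with best/good replaced by the thread-private copies, the diagnostic printing is trimmed, and only the OpenMP scaffolding is new. Note that pruning against private_best is weaker than pruning against a shared best, so threads may do somewhat more work than the sequential version. Compile with something like gcc -O2 -fopenmp tsp.c -lm.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <math.h>
#include <omp.h>

#define N 5000
#define LARGE (2 * pow(N * 10, 2))

int X[N];
int Y[N];
float distance[N][N + 1];

int main(int na, char *arg[]) {
  assert(na == 2);
  int nn = atoi(arg[1]);
  assert(nn <= N);

  for (int i = 0; i < nn; i++) X[i] = rand() % (nn * 10);
  for (int i = 0; i < nn; i++) Y[i] = rand() % (nn * 10);

  for (int i = 0; i < nn; i++)
    for (int j = i + 1; j < nn; j++)
      distance[i][j] = distance[j][i] =
          sqrt(pow(X[i] - X[j], 2) + pow(Y[i] - Y[j], 2));

  float best = LARGE;
  int good[nn];

  #pragma omp parallel
  {
    float private_best = LARGE;   // per-thread copy of 'best'
    int private_good[nn];         // per-thread copy of 'good'

    #pragma omp for
    for (int first = 0; first < nn; first++) {
      float dist = 0;
      int path[nn];
      for (int i = 0; i < nn; i++) path[i] = -1;
      path[first] = 0;
      int current = first;

      for (int i = 1; i < nn; i++) {
        float dmin = LARGE;
        int index = 0;
        for (int j = 0; j < nn; j++)
          if (path[j] == -1 && current != j && distance[current][j] < dmin) {
            dmin = distance[current][j];
            index = j;
          }
        current = index;
        path[current] = i;
        dist += dmin;
        // prune on the thread-private bound (weaker than the shared one)
        if (dist >= private_best) { dist = 0; break; }
      }

      if (dist) {
        dist += distance[current][first];   // close the tour
        if (dist < private_best) {
          for (int i = 0; i < nn; i++) private_good[path[i]] = i;
          private_best = dist;
        }
        distance[first][nn] = dist;   // each iteration writes its own row: no race
      }
    }

    // merge the per-thread results back into the shared best/good
    #pragma omp critical
    if (private_best < best) {
      best = private_best;
      for (int i = 0; i < nn; i++) good[i] = private_good[i];
    }
  }

  printf("Distance %g\n", best);
  return 0;
}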

3 Comments

Could I use the omp clause reduction(min:best)? And do the manual reduction for the good array?
What would the user-defined approach look like?
Read this answer to get an idea of how to use a user-defined reduction. Your problem is very similar. You cannot mix the reduction clause and a manual reduction.
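For readers wondering what the user-defined approach could look like, below is a minimal, self-contained sketch. The tour_t struct, the best_tour reduction identifier and the toy cost() function are illustrative placeholders rather than part of the question's code; in the real program the loop body would be the full greedy tour construction, and the struct would carry the whole good[] permutation, making each combine an O(n) copy comparable to the critical-section copy in the manual version.

#include <stdio.h>

#define NN 1000

typedef struct {
    float best;   // best tour length found so far
    int   first;  // starting city that produced it (stand-in for the good[] array)
} tour_t;

// combiner keeps whichever operand has the smaller tour length;
// the initializer gives every thread's private copy an "infinite" length
#pragma omp declare reduction(best_tour : tour_t :                         \
        omp_out = (omp_in.best < omp_out.best ? omp_in : omp_out))         \
        initializer(omp_priv = { 1e30f, -1 })

static float cost(int first) {            // placeholder for one greedy pass
    return (float)((first * 37) % NN) + 1.0f;
}

int main(void) {
    tour_t result = { 1e30f, -1 };

    #pragma omp parallel for reduction(best_tour : result)
    for (int first = 0; first < NN; first++) {
        float dist = cost(first);
        if (dist < result.best) {         // updates the thread-private copy
            result.best  = dist;
            result.first = first;
        }
    }

    printf("best %g from start %d\n", result.best, result.first);
    return 0;
}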
