
I have C code that solves the Traveling Salesman Problem using a greedy algorithm. However, the current implementation is sequential, and I'd like to parallelize it using OpenMP to achieve better performance.

Here is the existing code:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <math.h>

#define N 5000
#define LARGE (2 * pow(N * 10, 2))

int X[N];
int Y[N];
float distance[N][N + 1];

int main(int na, char * arg[]) {
  assert(na == 2);

  printf("Dimension %s\n", arg[1]);
  int nn = atoi(arg[1]);
  assert(nn <= N);

  for (int i = 0; i < nn; i++)
    X[i] = rand() % (nn * 10);
  for (int i = 0; i < nn; i++)
    Y[i] = rand() % (nn * 10);

  for (int i = 0; i < nn; i++)
    distance[i][i] = 0;

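  // precompute the symmetric matrix of Euclidean distances between cities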
  for (int i = 0; i < nn; i++)
    for (int j = i + 1; j < nn; j++)
      distance[i][j] = distance[j][i] = sqrt(pow(X[i] - X[j], 2) + pow(Y[i] - Y[j], 2));

  float best = LARGE;
  int good[nn];
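  // try a greedy nearest-neighbour tour from every possible starting city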
  for (int first = 0; first < nn; first++) {
    float dist = 0;
    int path[nn];

    for (int i = 0; i < nn; i++)
      path[i] = -1;

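    // path[j] holds the visit order of city j; -1 means not visited yet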
    path[first] = 0;
    int current = first;

    for (int i = 1; i < nn; i++) {
      float dmin = LARGE;
      int index = 0;

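      // pick the nearest city that has not been visited yet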
      for (int j = 0; j < nn; j++) {
        if (path[j] == -1 && current != j && distance[current][j] < dmin) {
          dmin = distance[current][j];
          index = j;
        }
      }

      current = index;
      path[current] = i;
      dist += dmin;

      if (dist >= best) {
        dist = 0;
        break;
      }
    }

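    // a pruned start leaves dist == 0; otherwise close the tour back to the start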
    if (dist) {
      float dmin = distance[current][first];
      dist += dmin;

      if (dist < best) {
        for (int i = 0; i < nn; i++)
          good[path[i]] = i;
        best = dist;
      }

      distance[first][nn] = dist;
    }
  }

  printf("Solution :\n");
  for (int i = 0; i < nn; i++)
    printf("%d\n", good[i]);
  printf("Distance %g == %g\n", best, distance[good[0]][nn]);

  exit(0);
}

I'm looking for guidance on how to effectively parallelize this code using OpenMP to achieve the best possible speedup. Are there specific sections of the code that are more amenable to parallelization? What OpenMP constructs should I consider using, and are there any potential pitfalls I should be aware of?

Any insights, code snippets, or recommendations on how to approach parallelizing this code would be greatly appreciated. Thank you!

  • Not the solution, but use double, not float. Commented Mar 7, 2024 at 15:50
  • Note that this problem is NP-complete, so all known exact algorithms are exponential, but heuristics can be far faster than the naive greedy algorithm. The algorithm matters a lot compared to optimizations that can "just" speed up execution by a constant factor, like using multiple threads. Commented Mar 7, 2024 at 16:06
  • The Held–Karp algorithm runs in O(n² 2^n) rather than O(n!), which in practice is far faster while still providing exact results. AFAIK branch-and-bound algorithms are known to be pretty good for this problem. Commented Mar 7, 2024 at 16:11
  • As for OpenMP, you should focus on what is private to each thread and what is shared (and possibly protected). Then directives like #pragma omp parallel and #pragma omp for with some custom clauses are generally enough, as in the sketch after these comments (there are many tutorials showing how to use them, even official ones, not to mention books). Commented Mar 7, 2024 at 16:14
  • I don't think you can parallelize a greedy algorithm. Other algorithms for the TSP are much more amenable to parallelism. Commented Mar 7, 2024 at 19:31
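As a minimal sketch of what those directives look like in practice, assuming a hypothetical per-start cost array tour_cost[] rather than the question's full greedy pass (reduction(min:...) needs OpenMP 3.1 or newer):

#include <omp.h>

/* The outer loop over starting cities is split across threads, and each
 * thread's private minimum is combined with reduction(min:...).  The
 * function name and tour_cost[] are illustrative, not from the question. */
float shortest(int n, const float tour_cost[]) {
    float best = 1e30f;                     /* shared result */

    #pragma omp parallel for reduction(min : best)
    for (int first = 0; first < n; first++)
        if (tour_cost[first] < best)        /* 'best' is thread-private here */
            best = tour_cost[first];

    return best;
}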

1 Answer


If your objective is to parallelize your code with OpenMP and you prefer not to alter the algorithm, then you must use either a user-defined reduction or a manual reduction. Below is an example of the manual reduction, as requested in the comments.

// these are the shared variables you have to take care of
float best = LARGE;
int good[nn];

#pragma omp parallel
{
    // create private copies for each thread
    float private_best = LARGE;
    int private_good[nn];

    #pragma omp for
    for (int first = 0; first < nn; first++) {
        // use the private variables defined above inside the loop body:
        // best ---> private_best
        // good ---> private_good
        .....
        .....
    }

    // find the minimum of the 'private_best' values, copy it to 'best',
    // and copy the corresponding 'private_good' array back to 'good'
    #pragma omp critical
    if (private_best < best) {
        best = private_best;
        for (int i = 0; i < nn; i++)
            good[i] = private_good[i];
    }
}
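For reference, a complete, compilable version of this pattern applied to the question's loop could look like the sketch below: the greedy body is copied from the question with best/good replaced by the thread-private copies, the diagnostic printing is trimmed, and only the OpenMP scaffolding is new. Note that pruning against private_best is weaker than pruning against a shared best, so threads may do somewhat more work than the sequential version. Compile with something like gcc -O2 -fopenmp tsp.c -lm.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <math.h>
#include <omp.h>

#define N 5000
#define LARGE (2 * pow(N * 10, 2))

int X[N];
int Y[N];
float distance[N][N + 1];

int main(int na, char *arg[]) {
  assert(na == 2);
  int nn = atoi(arg[1]);
  assert(nn <= N);

  for (int i = 0; i < nn; i++) X[i] = rand() % (nn * 10);
  for (int i = 0; i < nn; i++) Y[i] = rand() % (nn * 10);

  for (int i = 0; i < nn; i++)
    for (int j = i + 1; j < nn; j++)
      distance[i][j] = distance[j][i] =
          sqrt(pow(X[i] - X[j], 2) + pow(Y[i] - Y[j], 2));

  float best = LARGE;
  int good[nn];

  #pragma omp parallel
  {
    float private_best = LARGE;   // per-thread copy of 'best'
    int private_good[nn];         // per-thread copy of 'good'

    #pragma omp for
    for (int first = 0; first < nn; first++) {
      float dist = 0;
      int path[nn];
      for (int i = 0; i < nn; i++) path[i] = -1;
      path[first] = 0;
      int current = first;

      for (int i = 1; i < nn; i++) {
        float dmin = LARGE;
        int index = 0;
        for (int j = 0; j < nn; j++)
          if (path[j] == -1 && current != j && distance[current][j] < dmin) {
            dmin = distance[current][j];
            index = j;
          }
        current = index;
        path[current] = i;
        dist += dmin;
        // prune on the thread-private bound (weaker than the shared one)
        if (dist >= private_best) { dist = 0; break; }
      }

      if (dist) {
        dist += distance[current][first];   // close the tour
        if (dist < private_best) {
          for (int i = 0; i < nn; i++) private_good[path[i]] = i;
          private_best = dist;
        }
        distance[first][nn] = dist;   // each iteration writes its own row: no race
      }
    }

    // merge the per-thread results back into the shared best/good
    #pragma omp critical
    if (private_best < best) {
      best = private_best;
      for (int i = 0; i < nn; i++) good[i] = private_good[i];
    }
  }

  printf("Distance %g\n", best);
  return 0;
}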

3 Comments

Could I use the omp clause reduction(min:best)? And do the manual reduction for the good array?
What would the user-defined approach look like?
Read this answer to get an idea of how to use a user-defined reduction. Your problem is very similar. You cannot mix the reduction clause and a manual reduction.
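For readers wondering what the user-defined approach could look like, below is a minimal, self-contained sketch. The tour_t struct, the best_tour reduction identifier and the toy cost() function are illustrative placeholders rather than part of the question's code; in the real program the loop body would be the full greedy tour construction, and the struct would carry the whole good[] permutation, making each combine an O(n) copy comparable to the critical-section copy in the manual version.

#include <stdio.h>

#define NN 1000

typedef struct {
    float best;   // best tour length found so far
    int   first;  // starting city that produced it (stand-in for the good[] array)
} tour_t;

// combiner keeps whichever operand has the smaller tour length;
// the initializer gives every thread's private copy an "infinite" length
#pragma omp declare reduction(best_tour : tour_t :                         \
        omp_out = (omp_in.best < omp_out.best ? omp_in : omp_out))         \
        initializer(omp_priv = { 1e30f, -1 })

static float cost(int first) {            // placeholder for one greedy pass
    return (float)((first * 37) % NN) + 1.0f;
}

int main(void) {
    tour_t result = { 1e30f, -1 };

    #pragma omp parallel for reduction(best_tour : result)
    for (int first = 0; first < NN; first++) {
        float dist = cost(first);
        if (dist < result.best) {         // updates the thread-private copy
            result.best  = dist;
            result.first = first;
        }
    }

    printf("best %g from start %d\n", result.best, result.first);
    return 0;
}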
