I have a C code that solves the Traveling Salesman Problem using a greedy algorithm. However, the current implementation is sequential, and I'd like to parallelize it using OpenMP to achieve better performance.
Here is the existing code:
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <math.h>
#define N 5000
#define LARGE (2 * pow(N * 10, 2))
int X[N];
int Y[N];
float distance[N][N + 1];
int main(int na, char * arg[]) {
assert(na == 2);
printf("Dimension %s\n", arg[1]);
int nn = atoi(arg[1]);
assert(nn <= N);
for (int i = 0; i < nn; i++)
X[i] = rand() % (nn * 10);
for (int i = 0; i < nn; i++)
Y[i] = rand() % (nn * 10);
for (int i = 0; i < nn; i++)
distance[i][i] = 0;
for (int i = 0; i < nn; i++)
for (int j = i + 1; j < nn; j++)
distance[i][j] = distance[j][i] = sqrt(pow(X[i] - X[j], 2) + pow(Y[i] - Y[j], 2));
float best = LARGE;
int good[nn];
for (int first = 0; first < nn; first++) {
float dist = 0;
int path[nn];
for (int i = 0; i < nn; i++)
path[i] = -1;
path[first] = 0;
int current = first;
for (int i = 1; i < nn; i++) {
float dmin = LARGE;
int index = 0;
for (int j = 0; j < nn; j++) {
if (path[j] == -1 && current != j && distance[current][j] < dmin) {
dmin = distance[current][j];
index = j;
}
}
current = index;
path[current] = i;
dist += dmin;
if (dist >= best) {
dist = 0;
break;
}
}
if (dist) {
float dmin = distance[current][first];
dist += dmin;
if (dist < best) {
for (int i = 0; i < nn; i++)
good[path[i]] = i;
best = dist;
}
distance[first][nn] = dist;
}
}
printf("Solution :\n");
for (int i = 0; i < nn; i++)
printf("%d\n", good[i]);
printf("Distance %g == %g\n", best, distance[good[0]][nn]);
exit(0);
}
I'm looking for guidance on how to effectively parallelize this code using OpenMP to achieve the best possible speedup. Are there specific sections of the code that are more amenable to parallelization? What OpenMP constructs should I consider using, and are there any potential pitfalls I should be aware of?
Any insights, code snippets, or recommendations on how to approach parallelizing this code would be greatly appreciated. Thank you!
doublenotfloat.O(n² 2^n)rather thanO(n!)which is in practice far faster while providing still exact results. AFAIK branch&bound algorithms are known to be pretty good for this problem.#pragma omp paralleland#pragma omp forwith some custom clauses are often enough (there are many tutorials showing how to use them, even official ones, not to mention books).