
I have a vector that is pseudo-3D most of the time but can also behave as pseudo-4D. By pseudo-3D/4D I mean that the 3 or 4 dimensions are not stored as separate arrays but all in one long array, so the 3 or 4 indices have to be converted to a single row. This is achieved by an overload like so:

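T& vector::at(int i, int j, int k, int s = 0){
    // Row-major offset; s defaults to 0 for the pseudo-3D case.
    int row = i*NJ*NK*NS + j*NK*NS + k*NS + s;
    return vector[row];  // returned by reference so the element can be assigned to
}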

With NI, NJ, NK, NS the size in the i, j, k, s dimension respectively.
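The index conversion described above can be sketched as a small self-contained class. The name `Flat4D`, its members, and the use of `std::vector` for storage are assumptions made for illustration; the actual class in the question is not shown in full.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's vector class (names are assumed).
template <typename T>
struct Flat4D {
    std::size_t NI, NJ, NK, NS;
    std::vector<T> data;  // one contiguous array for all dimensions

    Flat4D(std::size_t ni, std::size_t nj, std::size_t nk, std::size_t ns = 1)
        : NI(ni), NJ(nj), NK(nk), NS(ns), data(ni * nj * nk * ns) {}

    // Row-major flattening in Horner form; algebraically the same as
    // i*NJ*NK*NS + j*NK*NS + k*NS + s.
    std::size_t row(std::size_t i, std::size_t j, std::size_t k,
                    std::size_t s = 0) const {
        assert(i < NI && j < NJ && k < NK && s < NS);
        return ((i * NJ + j) * NK + k) * NS + s;
    }

    // Returns a reference so that v(i, j, k, s) = value; works.
    T& operator()(std::size_t i, std::size_t j, std::size_t k,
                  std::size_t s = 0) {
        return data[row(i, j, k, s)];
    }
};
```

With NS = 1 the last factor and the `s` term drop out, so the 3D case is the same formula with one multiplication by 1 left over.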

I need to loop over all the elements and copy the values of another vector type (without the overload) into my vector with the overload:

int n=0;
for (int i=0; i<NI; ++i)
    for (int j=0; j<NJ; ++j)
        for (int k=0; k<NK; ++k)
            for(int s=0; s<NS; ++s)
                myvector(i, j, k, s) = othervector[n++];

Now what I'm wondering is: would this be very inefficient if NS=1 most of the time? In this question I saw that today's CPUs are heavily optimized for linear access to memory. That would still be the case in this example, but my feeling is that the loop would be very inefficient if the innermost loop only ever runs a single iteration. Any ideas whether this is very undesirable from a performance point of view?

(I could do the following but I'm just wondering if the compiler/CPU is smart enough to make it efficient itself:

n=0;
if (NS>1){
    for (int i=0; i<NI; ++i)
        for (int j=0; j<NJ; ++j)
            for (int k=0; k<NK; ++k)
                for (int s=0; s<NS; ++s)
                    myvector(i, j, k, s) = othervector[n++];
} else {
    for (int i=0; i<NI; ++i)
        for (int j=0; j<NJ; ++j)
            for (int k=0; k<NK; ++k)
                myvector(i, j, k) = othervector[n++];
}

)
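One observation that may help here: with the row formula from the question, the nested loops visit rows 0, 1, 2, ... in increasing order, so the destination index always equals `n`. A minimal sketch of that equivalence follows, assuming both containers are plain `std::vector<double>`; the function names `fill_nested` and `fill_flat` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Nested version, mirroring the loops in the question.
void fill_nested(std::vector<double>& dst, const std::vector<double>& src,
                 std::size_t NI, std::size_t NJ, std::size_t NK, std::size_t NS) {
    assert(dst.size() == NI * NJ * NK * NS && src.size() == dst.size());
    std::size_t n = 0;
    for (std::size_t i = 0; i < NI; ++i)
        for (std::size_t j = 0; j < NJ; ++j)
            for (std::size_t k = 0; k < NK; ++k)
                for (std::size_t s = 0; s < NS; ++s)
                    dst[((i * NJ + j) * NK + k) * NS + s] = src[n++];
}

// Collapsed version: one loop over the flat index, no dependence on NS.
void fill_flat(std::vector<double>& dst, const std::vector<double>& src) {
    assert(dst.size() == src.size());
    for (std::size_t n = 0; n < src.size(); ++n)
        dst[n] = src[n];
}
```

Both functions write the same values to the same positions, so the single-loop form sidesteps the NS=1 question entirely. Whether it is measurably faster depends on the compiler and optimization level, so it is worth profiling.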

There's a good chance that a modern C++ compiler catches that pattern and optimizes it. Compilers usually perform loop unrolling of all kinds if there's enough information in the code to do so. At least Duff's device could be applied here. Commented Dec 13, 2020 at 16:13

1 Answer

For copying the whole array, std::copy_n (or memcpy, if the element type is trivially copyable) is likely to be more efficient than the nested loops:

std::copy_n(othervector_pointer, (NI*NJ*NK*NS), myvector_pointer);

You need to include the `<algorithm>` header. Or use memcpy from `<cstring>`; note that its last argument is a size in bytes, not an element count:

memcpy(myvector_pointer, othervector_pointer, (NI*NJ*NK*NS) * sizeof(T));
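A self-contained sketch of both calls, to make the difference in their size arguments concrete. The wrapper names `copy_all` and `copy_all_bytes` are hypothetical, and the pointers are assumed to refer to non-overlapping arrays of at least `count` elements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// std::copy_n takes an element count and works for any copy-assignable T.
template <typename T>
void copy_all(const T* src, T* dst, std::size_t count) {
    assert(count == 0 || (src != nullptr && dst != nullptr));
    std::copy_n(src, count, dst);
}

// std::memcpy takes a byte count and is only valid for trivially copyable T.
template <typename T>
void copy_all_bytes(const T* src, T* dst, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable element type");
    assert(count == 0 || (src != nullptr && dst != nullptr));
    std::memcpy(dst, src, count * sizeof(T));
}
```

Here `count` would be NI*NJ*NK*NS. For trivially copyable types, standard library implementations commonly lower `std::copy_n` to a memmove-style copy, so the two are often comparable in speed; `std::copy_n` is the safer default because it stays correct for other element types.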