This is a follow-up to "Restricted access for allocated arrays belonging to separate object", written after discovering that returning by value across compilation units doesn't help hint to the compiler that the memory allocations are disjoint.
I have two functions, make_v1 and make_v2, that each return an array-like object by value.
(This could be std::vector, for example; the important point is that the copy constructor clearly allocates memory.)
#include <vector>

// Defined in another compilation unit; both return by value.
std::vector<int> make_v1();
std::vector<int> make_v2();

template<class It> void modify_v(It);
void dreaded_function();

template<class It1, class It2>
void foo(It1 a, It2 b) {
    *a = 5;
    *b = 6;
    if(*a == *b) dreaded_function(); // this is never called if a and b do not overlap
}

int main() {
    std::vector<int> v1 = make_v1();
    std::vector<int> v2 = make_v2();
    foo(v1.begin(), v2.begin());
}
As you can see in the compiled code, the compiler does not assume that a and b are disjoint, so the call to dreaded_function is not optimized away.
This is just a minimal example of pointer aliasing, meant to diagnose the situation.
There are well-known cases where pointer aliasing causes worse performance problems.
https://godbolt.org/z/345PKrsnx
What high-level hints can I give the compiler, in C++ (or perhaps using extensions), that the iterator ranges refer to memory regions that cannot overlap?
If this were C, or if I were using pointers and C++ extensions, I could use __restrict and a pointer interface, but I want something more high-level.
The only option that I found was to be very explicit about the copy:
...
std::vector<int> v1 = static_cast<std::vector<int> const&>(make_v1());
std::vector<int> v2 = static_cast<std::vector<int> const&>(make_v2());
This at least convinces clang that it can make the optimization (at the cost of an extra copy, I am pretty sure).
https://godbolt.org/z/nYfKeTKqj
But this is ugly, and it also makes a copy. Also, if make_v becomes an included function later, the solution will have a cost.
(Also it still doesn't convince GCC!)
Is this the only way I can assure the compiler that the memory of v1 and v2 does not overlap? Is there a more streamlined solution?
Should [[assume]] work for this eventually?
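For concreteness, here is a rough sketch of what such a hint could look like with the C++23 [[assume]] attribute. The names ASSUME_TRUE, foo_assumed, run_example and the call counter are hypothetical and exist only for this illustration. Whether GCC or Clang actually uses the assumption to drop the call is not established here, so treat it as a form to experiment with rather than a known solution.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Expands to [[assume(expr)]] only when compiling as C++23 or later with a
// compiler that reports support for it; otherwise it expands to a no-op.
#if defined(__has_cpp_attribute) && (__cplusplus > 202002L)
#  if __has_cpp_attribute(assume)
#    define ASSUME_TRUE(expr) [[assume(expr)]]
#  endif
#endif
#ifndef ASSUME_TRUE
#  define ASSUME_TRUE(expr) ((void)0)
#endif

// Stand-in for the question's dreaded_function(); it counts calls so the
// effect is observable. The counter is an assumption of this sketch.
static int dreaded_calls = 0;
void dreaded_function() { ++dreaded_calls; }

template <class It1, class It2>
void foo_assumed(It1 a, It2 b) {
    // Take the addresses first so the assumed expression is a plain pointer
    // comparison; a compiler may ignore assumptions that could have side effects.
    auto* pa = std::addressof(*a);
    auto* pb = std::addressof(*b);
    assert(pa != pb);       // checked in debug builds
    ASSUME_TRUE(pa != pb);  // undefined behavior if this is ever false
    *pa = 5;
    *pb = 6;
    if (*pa == *pb) dreaded_function();
}

// Hypothetical usage: two separately allocated vectors.
int run_example() {
    std::vector<int> v1(4), v2(4);
    foo_assumed(v1.begin(), v2.begin());
    assert(v1[0] == 5 && v2[0] == 6);
    return dreaded_calls;
}
```

This only states that the two first elements are distinct, which is all this particular foo needs. Expressing that two whole ranges are disjoint would take a more involved condition.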
At this point, the best I am hoping for is some keyword/extension that "simulates" that the result of make_v1/make_v2 is copied into v1/v2.
__restrict? foo like foo_restricted that takes restricted pointer and call it foo_restricted_contiguous(&*v1.begin(), &*v2.begin())?
__restrict__ actually works: godbolt.org/z/dr79r967z - Now I want restrict as a keyword in C++ too :-)
foo_detail(&*a, &*b); where that function uses __restrict__.
arr.RESTRICT_begin() or by make_restricted(arr.begin()). Not sure how it will scale in general.
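The forwarding idea from the comments could be sketched roughly as follows. foo_detail is the name used in the comments; the call counter and run_example are hypothetical additions so the snippet can be run on its own. It uses the __restrict spelling from the question, which is a compiler extension rather than standard C++, and it assumes contiguous iterators whose elements live in non-overlapping storage.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Stand-in for the question's dreaded_function(); it counts calls so the
// effect is observable. The counter is an assumption of this sketch.
static int dreaded_calls = 0;
void dreaded_function() { ++dreaded_calls; }

// Pointer-level implementation. __restrict is a compiler extension, not
// standard C++. Calling this with overlapping pointers is undefined behavior.
template <class T>
void foo_detail(T* __restrict a, T* __restrict b) {
    *a = 5;
    *b = 6;
    if (*a == *b) dreaded_function();
}

// Iterator-level wrapper: only meaningful for contiguous iterators that
// point to elements in non-overlapping storage.
template <class It1, class It2>
void foo(It1 a, It2 b) {
    foo_detail(std::addressof(*a), std::addressof(*b));
}

// Hypothetical usage: two separately allocated vectors.
int run_example() {
    std::vector<int> v1(4), v2(4);
    foo(v1.begin(), v2.begin());
    assert(v1[0] == 5 && v2[0] == 6);
    return dreaded_calls;
}
```

The wrapper keeps the iterator interface for callers and confines the extension to one internal function. The cost is that the no-overlap promise is now implicit in every call to foo, so it may be safer to give the restricted version a distinct name, as the first comment suggests.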