How to use thrust::transform on larger Vector derived from smaller Vector?

Question

Input and starting arrays:

dv_A = { 5, -3, 2, 6} //4 elements
dv_B = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }

Expected output:

dv_B = { 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1 }

For every element in dv_A{}, there are (dv_A.size - 1) elements in dv_B{}. This is because each element of dv_A should have a child element in dv_B for each of the other dv_A elements (i.e. should exclude itself). Therefore, if there are 4 elements in dv_A, there should be 3 elements in dv_B for each of the dv_A elements.

I want to transform each dv_B element to have a value of 1 if its corresponding dv_A element has a value > 0. Correspondence is determined based on the position of the element in dv_B. For example:

The first 3 dv_B values will be transformed by the value in dv_A[0], The second 3 dv_B values will be transformed by the value in dv_A[1], Etc.

Here's my attempt so far

thrust::transform(
    dv_B.begin(),
    dv_B.end(),
    thrust::make_transform_iterator(
        dv_A.begin(),
        _1 % dv_A
    ), 
    dv_B.begin(),
    _2 > 0 //When 2nd argument is greater than 0 then set the position in dv_A to 1.
);

You can nest counting_iterator, transform_iterator (i / (A_size - 1)) and permutation_iterator to get the view into dv_A that you need to do the transform. — paleonix
– paleonix, Commented Sep 25, 2022 at 14:46
@paleonix I think that's what I'm after, but just can't get it to work. Updated my code with my best attempt before nesting the iterators, which I could not get to work either. — aiwyn
– aiwyn, Commented Sep 25, 2022 at 15:02
@paleonix that works-- turn it into an answer so that I can mark it. — aiwyn
– aiwyn, Commented Sep 25, 2022 at 16:42

Abator Abetor · Accepted Answer · 2022-09-26 10:07:33Z

2

The serial code could look something like this:

for(int i = 0; i < dv_b.size(); i++){
    const int readIndex = i / (dv_a.size() - 1);
    if(dv_a[readIndex] > 0) dv_b[i] = 1;
    else dv_b[i] = 0;
}

which can easily be written using for_each. I think this makes the code more clear compared to using transform together with various fancy iterators.

thrust::for_each(
    thrust::device,
    thrust::make_counting_iterator(0),
    thrust::make_counting_iterator(0) + dv_b.size(),
    [
     s = dv_a.size() - 1,
     dv_a = thrust::raw_pointer_cast(dv_a.data()),
     dv_b = thrust::raw_pointer_cast(dv_b.data())
    ] __device__ (int i){
        const int readIndex = i / s;
        if(dv_a[readIndex] > 0) dv_b[i] = 1;
        else dv_b[i] = 0;
    }
);

edited Sep 26, 2022 at 10:07

answered Sep 26, 2022 at 8:41

Abator Abetor

2,7181 gold badge12 silver badges13 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

aiwyn Over a year ago

This is more readable than the nested iterator idea from @paleonix's original comment on my post. One slight code improvement on this is to remove the if() in the last argument and naively set dv_b[i] = dv_a[readIndex] > 0 to eliminate the conditional (since >0 will return 1, and <=0 will return 0 anyway).

Abator Abetor Over a year ago

@aiwyn I don't think shortening the if is a code improvement. The way it is now clearly expresses that the value should be 1, not just the implicit conversion of a boolean value to int (which happens to be 1). In any case, the resulting ptx code is the same, anyways.

aiwyn Over a year ago

You're right, looks like the compiler handles it the same. Also, this version of the solution is 4x faster than the other one so I'm marking this as the answer.

paleonix · Accepted Answer · 2022-09-26 16:12:18Z

Packing the fancy iterator creation into an appropriately named factory function makes this version quite readable as well. Especially if you need this kind of pattern more than once, this solution might be more elegant.

#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/transform.h>

#include <iostream>

// nvcc doesn't seem to like __device__ or __host__ __device__ lambdas in auto
// return types, so I defined this functor instead
template <typename T>
class Div {
    T div_{};
    public:
    Div(T div) : div_{div} {}
    __host__ __device__ T operator()(T in) const noexcept { return in / div_; }
};

// I stole "repeat" from numpy. Another version using modulo (%) and therefore
// repeating the whole input instead of single elements would be called "tile".
template <class It>
auto make_repeat_it_begin(It input, int repetitions) {
    using diff_t = typename It::difference_type;
    return thrust::make_permutation_iterator(
                input,
                thrust::make_transform_iterator(
                    thrust::make_counting_iterator(diff_t{0}),
                    Div{static_cast<diff_t>(repetitions)}));
}

int main() {
    int A[] = {5, -3, 2, 6};
    constexpr int size_A = sizeof(A) / sizeof(A[0]);

    thrust::host_vector<int> hv_A(A, A + size_A);
    thrust::device_vector<int> dv_A(hv_A);
    thrust::device_vector<int> dv_B(size_A * (size_A - 1));

    auto A_repeat_it = make_repeat_it_begin(dv_A.begin(), size_A - 1);
    
    thrust::transform(A_repeat_it, A_repeat_it + dv_B.size(), 
                      dv_B.begin(),
                      [] __device__ (int a){ return a > 0 ? 1 : 0; });

    thrust::host_vector<int> hv_B(dv_B);
    thrust::copy(hv_B.begin(), hv_B.end(),
                 std::ostream_iterator<int>(std::cout, ","));
}

Due to the device lambda, nvcc needs the -extended-lambda flag.

Collectives™ on Stack Overflow

How to use thrust::transform on larger Vector derived from smaller Vector?

2 Answers 2

3 Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related