3 votes
1 answer
111 views

I am currently attempting to use the thrust::remove function on a thrust::device_vector of structs in my main function as shown below: #include <iostream> #include <thrust/device_vector.h>...
AowynB • 33
2 votes
1 answer
101 views

I am maintaining a matrix of integers with dimensions 𝑎 × 𝑏 as a flattened array in row-major order. Now, I need to rearrange the rows of this matrix according to a priority array (length 𝑎) (Think ...
Samiran K
3 votes
1 answer
300 views

Thrust has the thrust::reduce_by_key algorithm which works well for a problem of mine. I wanted to try to use CUB for finer control of memory and streams as well as interaction with my own kernels, ...
Treeman • 116
0 votes
1 answer
145 views

I’m using Thrust for some tasks at work and have found that there seems to be a maximum number of iterators when constructing a zip_iterator. For example #include <thrust/iterator/zip_iterator.h>...
DocWho • 11
1 vote
0 answers
219 views

This test program compiled fine with CUDA 12.4 and lower, but fails to compile w/ 12.5.1: #include <thrust/host_vector.h> #include <thrust/scan.h> #include <iostream> int main() { ...
Matt • 20.8k
2 votes
0 answers
91 views

I am a little bit new to Thrust. I am trying to migrate the following code to make use of GPUs, but this one seems a little difficult: #include <iostream> #include <complex> #include <...
kiragon kiriyo
2 votes
1 answer
1k views

My project contains many fill, copy and other basic operations. However, I'm new to CUDA programming, and my current implementation just uses a for loop to operate on a device_vector, which is far less ...
Dylan • 89
2 votes
2 answers
638 views

I'm receiving the compiler error static_assert failed: 'Attempt to use an extended __device__ lambda in a context that requires querying its return type in host code. Use a named function object, a ...
0xbadf00d • 18.4k
49 votes
5 answers
26k views

I am a newbie to Thrust. I see that all Thrust presentations and examples only show host code. I would like to know if I can pass a device_vector to my own kernel? How? If yes, what are the ...
Ashwin Nanjappa
0 votes
1 answer
141 views

I have a data structure already running on CUDA and collect the data as below: struct SearchDataOnDevice { size_t npair; int * id1; int * id2; }; I'd like to remove the duplicated id pair ...
holmessh
1 vote
2 answers
385 views

Compiling the 3-line program test-cuda.cpp #include <thrust/execution_policy.h> int main() { return 0; } results in a compiler warning/error: $ g++ -std=c++17 test-cuda.cpp -I/opt/cuda/targets/...
Matt • 20.8k
1 vote
1 answer
185 views

I have an algorithm I would like to implement, which involves coordinatewise addition, coordinatewise multiplication, and cyclic rotation of coordinates. My addition and multiplication are a little ...
Mark Schultz-Wu
3 votes
1 answer
194 views

I have two operations for manipulating elements in device vectors using CUDA Thrust. Which methods can implement these two operations more efficiently? Replace part of the values of a vector in batch with ...
Chris • 33
1 vote
1 answer
2k views

I want to sort an array using raw device pointers with thrust::sort() and thrust::sort_by_key() because it uses radix sort. The data is in a raw uint64_t device pointer, and I initialize with random ...
PlatinumFrog
1 vote
1 answer
267 views

In CUDA C++, I have a big array Arr of integers, and a function F: int -> int. I want to find the first index of some items in Arr that makes F maximal. How can I write a kernel that always keeps ...
Mojtaba Valizadeh
1 vote
1 answer
464 views

I want to store partial reduction results in an array. Say I have data[8] = {10,20,30,40,50,60,70,80}. If I divide the data with a chunk_size of 2, the chunks will be {10,20}, {30,40}, ... , {70,...
Sangjun Lee
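For reference, here is a minimal host-side sketch of the chunked reduction described in the excerpt above. It assumes the reduction is a sum, since the excerpt does not say which operator is intended. The helper name chunked_sums is hypothetical, and a trailing partial chunk is simply summed as-is. On the device, the same result is commonly obtained with thrust::reduce_by_key, using index / chunk_size as the key for each element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the question): sums each consecutive run of
// `chunk_size` elements of `data`. Assumes the reduction operator is addition.
std::vector<int> chunked_sums(const std::vector<int>& data, std::size_t chunk_size) {
    assert(chunk_size > 0);  // a zero chunk size would never advance the loop
    std::vector<int> out;
    out.reserve((data.size() + chunk_size - 1) / chunk_size);
    for (std::size_t begin = 0; begin < data.size(); begin += chunk_size) {
        // Clamp the end so a shorter final chunk is handled without overrun.
        const std::size_t end = std::min(begin + chunk_size, data.size());
        out.push_back(std::accumulate(data.begin() + begin, data.begin() + end, 0));
    }
    return out;
}
```

With the data from the excerpt and a chunk_size of 2, chunked_sums returns {30, 70, 110, 150}, which can serve as a reference when checking a GPU implementation.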
-2 votes
2 answers
330 views

Input and starting arrays: dv_A = { 5, -3, 2, 6} //4 elements dv_B = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } Expected output: dv_B = { 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1 } For every element in dv_A{},...
aiwyn • 278
0 votes
1 answer
193 views

Let's say we have two inputs, the first one is an array, and the second is a bitmap thrust::device_vector<point_t> points; Bitset bits; // Imagine this can be accessed within the kernel. What I ...
geng liang
1 vote
1 answer
158 views

I have a CUDA kernel which essentially looks like the following. __global__ void myOpKernel(double *vals, double *data, int *nums, double *crit, int N, int K) { int index = blockIdx.x*blockDim.x + ...
Sangjun Lee
2 votes
1 answer
411 views

I'm trying to transfer some data manipulations from CPU to GPU (CUDA), but there's one small part that requires instructions to be run in a specific order. In principle I could do the first few ...
defladamouse
0 votes
1 answer
371 views

I've implemented a for loop consisting of several Thrust transformations. My aim is to calculate r[i] for each value of i from 0 to N. To put it simply, r is a column vector and each of its elements can ...
Muhteva • 2,840
2 votes
1 answer
290 views

I am using the NVIDIA HPC SDK (2022) to compile the following code, the basic purpose of which is to sum an NxM matrix into a vector of size N. #include <thrust/host_vector.h> #include <thrust/...
batman216
0 votes
1 answer
432 views

Can I assume that Thrust stable_sort_by_key performed on unsigned int has complexity O(n)? If not, what should I do to be sure that this complexity will be achieved? (Except for implementing radix sort ...
complikator
1 vote
1 answer
1k views

I'm new to CUDA and the thrust library. I'm learning and trying to implement a function that will have a for loop doing a thrust function. Is there a way to convert this loop into another thrust ...
KLi2708 • 13
-1 votes
1 answer
72 views

I have a device vector that I continuously modify and then want to save in an HDF5 file. Because of the size of the device vector I cannot make multiple modifications and then save them to reduce the ...
Luluio • 129