OpenCL Reduce kernel returning only one value

Question

I'm testing an OpenCL 2.0 reduction kernel to get the max value of each 2D work-group.

I call:

//get global max
float max_v;
max_v = work_group_reduce_max(hsv_fi.z);

if(get_local_id(0) == 0 && get_local_id(1) == 0){
    result[get_group_id(0)] = 1337;// max_v;
}

at the end of my kernel. The 1337 is there for test purposes. I expect an array with one entry per work-group, each set to 1337.

But I somehow get only one value. Is there anything obviously wrong in the kernel?

I try to read it via:

int errcode;
float* resultData = (float*)oclEnvironment._commandQueue.enqueueMapBuffer(resultBuffer, true, CL_MAP_READ, 0, size, 0, 0, &errcode);
std::copy((float*)resultData, (float*)(resultData + size), (float*)dest);

My dest pointer points into a:

std::vector<float> resultArray;

which I reserve with the workgroup size, passing the pointer to its storage (dest) via:

resultArray.data()

Any suggestions? I feel like it's something minor that I'm missing.

Thanks !

EDIT:

cl::NDRange global(width, height);
//Intel HD 530 can have a max. workgroup size of 256.
int dim1 = 16;
int dim2 = 16;
cl::NDRange local(dim1, dim2);

//Calculate the number of workgroups
int numberOfWorkgroups = ceil((width * height) / (float)(dim1 * dim2));

//Each workgroup reduces the data to a single element. These elements are then reduced on the host in the final reduction step.
oclEnvironment._commandQueue.enqueueNDRangeKernel(_kernel, cl::NullRange, global, local);

EDIT 2:

I still don't understand what OpenCL is doing. I have an image of 480x360 pixels that I pass as a buffer with 3 bytes per element (RGB). My maximum work-group size is (16,16,1), which gives the 675 work-groups I expected. I now get about 30 valid maxima in the list instead of the 675 I expect; the remaining values are zeros. I tried padding the input to 480*368 so that both dimensions are multiples of 16, with the same result. It would be fine if those 30 values covered my whole data set, so that the max of the list is the global max, but since I don't know what is happening I can't be sure that is the case.
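The group counts for these sizes can be checked on the host. The following is a minimal sketch, not code from the original post; `groups_along` and `linear_group_index` are hypothetical helper names. It assumes a row-major layout of work-groups. Note that 480 / 16 = 30 is exactly the number of groups along dimension 0, which is what indexing `result` with `get_group_id(0)` alone would address if that indexing is still in the kernel; this is an inference from the posted snippet, not something confirmed in the thread.

```cpp
#include <cassert>
#include <cstddef>

// Work-groups along one dimension, rounding up for a partial last group.
std::size_t groups_along(std::size_t global, std::size_t local) {
    assert(local > 0);
    return (global + local - 1) / local;
}

// Row-major linear index of a 2D work-group. In a kernel this would be
// get_group_id(1) * get_num_groups(0) + get_group_id(0).
std::size_t linear_group_index(std::size_t gx, std::size_t gy, std::size_t groups_x) {
    return gy * groups_x + gx;
}
```

With a 16x16 local size, 480x360 rounds up to 30 x 23 = 690 groups (the same as the padded 480x368), whereas dividing the total pixel count by 256 gives 675. Counting per dimension is therefore the safer way to size the result buffer for a 2D launch.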

Shouldn't be a problem, but what are the additional (float*) casts in std::copy for?
– user10605163, Commented Dec 6, 2018 at 17:37
Is size the size in bytes or the number of floats? The usage seems inconsistent.
– user10605163, Commented Dec 6, 2018 at 17:51
The (float*) casts aren't necessary indeed. I can remove them, but it makes no difference for my problem. Size is the number of floats, i.e. the number of work-groups.
– Jonas Bräuer, Commented Dec 7, 2018 at 6:56
I suspect that enqueueMapBuffer takes the size in number of bytes, not number of items.
– user10605163, Commented Dec 7, 2018 at 9:09

Jonas Bräuer · Accepted Answer · 2018-12-11 10:42:09Z

The comments from user10605163 led me to the issue. As he correctly suspected, I mapped only as many bytes as there are work-groups instead of the number of work-groups times sizeof(float). What confused me is that I had also forgotten the sizeof(float) multiplier when creating the buffer, so the read size looked consistent and I did not suspect it. After a lot of debugging and experimenting, this simple hint turned out to be the solution, and OpenCL now works as I expect.
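To illustrate the bytes-versus-elements distinction, here is a minimal host-side sketch that avoids any OpenCL calls so it can be compiled on its own. `result_bytes` and `copy_results` are hypothetical helpers, and `mapped` stands in for the pointer returned by enqueueMapBuffer. The sketch also sizes the destination vector rather than only reserving it, on the assumption that "reserve" in the question refers to std::vector::reserve, which leaves size() at zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Bytes needed to hold one float per work-group. Buffer creation and
// enqueueMapBuffer both take a byte count, not an element count.
std::size_t result_bytes(std::size_t group_count) {
    return group_count * sizeof(float);
}

// Copies group_count floats out of a mapped region into a vector.
// `mapped` is a stand-in for the pointer returned by enqueueMapBuffer.
std::vector<float> copy_results(const float* mapped, std::size_t group_count) {
    assert(mapped != nullptr || group_count == 0);
    std::vector<float> out(group_count);  // sized, not merely reserved
    // Pointer arithmetic on float* advances in floats, so the element
    // count is the right quantity here, unlike in the map call.
    std::copy(mapped, mapped + group_count, out.begin());
    return out;
}
```

In the real code, `result_bytes(numberOfWorkgroups)` would be the size passed both when creating the buffer and when mapping it, while the element count is used for the copy.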

One more thing I'd like to mention, because it gave me headaches as well: if you try to avoid padding your data, don't write something like this in the kernel:

if(get_global_linear_id() >= bufferSize)
{
    return;
}else{
    ... work + blocking call (barrier or work_group_reduce_*) ...
}

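This can crash your system or keep your GPU occupied until you restart. A barrier or work-group function such as work_group_reduce_* must be reached by every work-item in the work-group; the work-items that return early never arrive, so the remaining ones wait forever. If you need to debug this, use a CPU platform so it won't freeze your system. Either way, change the pattern above: pad the data, for example with zeros, or handle the out-of-range work-items some other way so they don't get stuck like mine did.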
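One alternative to padding the buffer is to let out-of-range work-items take part in the reduction with a neutral value. The sketch below is a CPU emulation in plain C++ of a 1D launch, not kernel code and not the author's implementation; `group_max` is a hypothetical name, and the most negative float is chosen as the identity for max (zeros would only be safe if all inputs are non-negative).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// CPU emulation of a 1D launch in which every work-item reaches the reduction.
// Out-of-range items contribute the identity for max instead of returning early.
std::vector<float> group_max(const std::vector<float>& data, std::size_t local_size) {
    assert(local_size > 0);
    const std::size_t groups = (data.size() + local_size - 1) / local_size;
    std::vector<float> result(groups);
    for (std::size_t g = 0; g < groups; ++g) {
        float group_best = std::numeric_limits<float>::lowest();
        for (std::size_t l = 0; l < local_size; ++l) {
            const std::size_t gid = g * local_size + l;  // global linear id
            // Kernel equivalent: float v = (gid < bufferSize) ? data[gid] : -FLT_MAX;
            const float v = gid < data.size() ? data[gid]
                                              : std::numeric_limits<float>::lowest();
            // Stands in for work_group_reduce_max(v), which all items now call.
            group_best = std::max(group_best, v);
        }
        result[g] = group_best;
    }
    return result;
}
```

In the kernel, the bounds check then guards only the load, and the call to work_group_reduce_max sits outside any branch so that no work-item can skip it.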

This has nothing to do with the actual solution, but it cost me time to figure out and may help someone in the future.
