
I'm testing an OpenCL 2.0 reduction kernel to get the maximum value of each 2D work-group.

I call:

//get global max
float max_v;
max_v = work_group_reduce_max(hsv_fi.z);

if(get_local_id(0) == 0 && get_local_id(1) == 0){
    result[get_group_id(0)] = 1337;// max_v;
}

at the end of my kernel. The 1337 is there for test purposes. I expect an array with one entry per work-group, all filled with 1337.

But I somehow get only one value. Is there anything obvious I'm doing wrong in the kernel?

I try to read it via:

int errcode;
float* resultData = (float*)oclEnvironment._commandQueue.enqueueMapBuffer(resultBuffer, true, CL_MAP_READ, 0, size, 0, 0, &errcode);
std::copy((float*)resultData, (float*)(resultData + size), (float*)dest);

My dest pointer points into a:

std::vector<float> resultArray;

which I reserve with the number of work-groups, handing over the pointer to it (dest) via:

resultArray.data()

Any suggestions? I feel like it's something minor that I'm missing.

Thanks!

EDIT:

cl::NDRange global(width, height);
//Intel HD 530 can have a max. workgroup size of 256.
int dim1 = 16;
int dim2 = 16;
cl::NDRange local(dim1, dim2);

//Calculate the number of workgroups
int numberOfWorkgroups = ceil((width * height) / (float)(dim1 * dim2));

//Each workgroup reduces the data to a single element. These elements are then reduced on the host in the final reduction step.
oclEnvironment._commandQueue.enqueueNDRangeKernel(_kernel, cl::NullRange, global, local);

EDIT 2:

I still don't understand what OpenCL is doing. I have a 480x360 image that I want to use as a buffer with 3 bytes per element (RGB). I now get about 30 valid maxima in the list, but not the 675 I expect. My maximum work-group size is (16,16,1), so I calculated 675 work-groups (480 * 360 / 256). Somehow I receive only 30 values back; the others are zeros. I tried padding the input to 480*368 so that both dimensions are multiples of 16. Still the same. It would be okay if those 30 values covered my whole data set, so that the max of the list I receive is the overall max. But since I'm not sure what's happening, I can't be sure that is the case.
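One possible reading of the 30 values, offered as an assumption and not confirmed anywhere in this thread: the kernel writes to result[get_group_id(0)], and for a 480-wide range with 16-wide groups that index takes only 480 / 16 = 30 distinct values. The sketch below counts work-groups per dimension and builds a row-major index from both group ids. The helper names groupsAlong and linearGroupId are hypothetical. Note that 360 is not a multiple of 16, so rounding up per dimension gives 30 * 23 = 690 groups, not 675.

```cpp
#include <cstddef>

// Number of work-groups along one dimension, rounding up for a partial group.
constexpr std::size_t groupsAlong(std::size_t globalSize, std::size_t localSize) {
    return (globalSize + localSize - 1) / localSize;
}

// Row-major linear index of a 2D work-group. Inside a kernel the equivalent
// expression would be get_group_id(1) * get_num_groups(0) + get_group_id(0).
constexpr std::size_t linearGroupId(std::size_t groupX, std::size_t groupY,
                                    std::size_t groupsX) {
    return groupY * groupsX + groupX;
}
```

With these helpers, the result buffer would hold groupsAlong(width, 16) * groupsAlong(height, 16) floats, and each work-group would write to its own slot.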

  • Shouldn't be a problem, but what are the additional (float*) casts in std::copy for? Commented Dec 6, 2018 at 17:37
  • Is size the size in bytes or the number of floats? The usage seems inconsistent. Commented Dec 6, 2018 at 17:51
  • The (float*) aren't necessary indeed. I can remove them, makes no difference for my problem though. Size is the number of floats or number of work groups. Commented Dec 7, 2018 at 6:56
  • I suspect that enqueueMapBuffer takes the size in number of bytes, not number of items. Commented Dec 7, 2018 at 9:09
  • I tried it with size * sizeof(float) but it crashes. Commented Dec 7, 2018 at 9:58

1 Answer

The comments from user10605163 led me to the issue. As he correctly suspected, I was reading only as many bytes from the buffer as there are work-groups, instead of the number of work-groups times sizeof(float). This confused me because I had also forgotten the sizeof(float) multiplier when creating the buffer, so I did not think the read was wrong in the first place. After a lot of debugging and trial and error, this simple hint turned out to be the solution to my problem, and OpenCL now works as I expect.
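A minimal sketch of the bytes-versus-elements distinction, under stated assumptions: resultBytes and copyResults are hypothetical helpers, and the mapped pointer stands in for whatever enqueueMapBuffer returns. Buffer creation and mapping take a size in bytes, while pointer arithmetic on a float* counts elements.

```cpp
#include <cstddef>
#include <vector>

// Bytes needed for one float result per work-group. This is the size that
// buffer creation and the map call expect.
inline std::size_t resultBytes(std::size_t numGroups) {
    return numGroups * sizeof(float);
}

// Copies numGroups floats out of a mapped region. `mapped` is assumed to be
// the pointer returned by the map call (hypothetical helper for illustration).
inline std::vector<float> copyResults(const void* mapped, std::size_t numGroups) {
    const float* first = static_cast<const float*>(mapped);
    // Arithmetic on float* advances by elements, so no sizeof(float) here.
    return std::vector<float>(first, first + numGroups);
}
```

Constructing the vector from the range also gives it the right size. A vector that was only reserve()d still has size zero, so copying into its data() pointer would not make the elements visible through the vector.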

One more thing I'd like to mention here, because it gave me headaches as well: if you try to avoid padding your data, don't write something like this in the kernel:

if(get_global_linear_id() >= bufferSize)
{
    return;
}else{
    ... work + blocking call (barrier or work_group_reduce_*) ...
}

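This can freeze your system or occupy your GPU until you restart. Barriers and work-group functions such as work_group_reduce_* must be reached by every work-item in the work-group, so the work-items that returned early leave the others waiting forever. If you need to debug this, use a CPU platform so it won't freeze your system. Either way, change the pattern above: use padding (for example with zeros) or some other approach in which every work-item reaches the blocking call, so your work-items don't get stuck like mine did.

This had nothing to do with the actual solution, but it cost me time to figure out and may help someone in the future.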
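The padding approach could look roughly like the host-side sketch below. It assumes a single-channel, row-major float image, which is a simplification of the RGB data in the question, and the names roundUp and padImage are hypothetical. The fill value should be neutral for the reduction: the lowest float is a safe choice for a max, while zeros (as suggested above) only work if the data is non-negative.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Rounds n up to the next multiple of m (m must be greater than zero).
inline std::size_t roundUp(std::size_t n, std::size_t m) {
    return (n + m - 1) / m * m;
}

// Returns a copy of a row-major width x height image, padded so that both
// dimensions are multiples of the local work-group size. Padded cells hold
// `fill`, which should not affect the reduction result.
inline std::vector<float> padImage(const std::vector<float>& src,
                                   std::size_t width, std::size_t height,
                                   std::size_t localX, std::size_t localY,
                                   float fill = std::numeric_limits<float>::lowest()) {
    const std::size_t paddedW = roundUp(width, localX);
    const std::size_t paddedH = roundUp(height, localY);
    std::vector<float> dst(paddedW * paddedH, fill);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            dst[y * paddedW + x] = src[y * width + x];
        }
    }
    return dst;
}
```

With the global range set to the padded dimensions, every work-item has valid data to read, so no early return is needed before the barrier or work-group reduction.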
