Consider the following code, where p is a pointer to memory allocated on the GPU:

thrust::device_ptr<float> pWrapper(p);
thrust::device_ptr<float> fDevPos = thrust::min_element(pWrapper, pWrapper + MAXX * MAXY, thrust::minimum<float>());
fRes = *fDevPos;
*fDicVal = fRes;

Then I do the same thing on the CPU side:

float *hVec = new float[MAXX * MAXY];
cudaMemcpy(hVec, p, MAXX * MAXY * sizeof(float), cudaMemcpyDeviceToHost);
float min = 999;
int index = -1;
for (int i = 0; i < MAXX * MAXY; i++) {
    if (min > hVec[i]) {
        min = hVec[i];
        index = i;
    }
}
printf("index :%d a wrapper : %f, as vectorDevice : %f\n", index, fRes, min);
delete[] hVec;  // array form, to match new[]

I find that min != fRes. What am I doing wrong here?

1 Answer

thrust::min_element requires the user to supply a comparison predicate. That is, a function which answers the yes-or-no question "is x smaller than y?"

thrust::minimum is not a predicate; it answers the question "which of x or y is smaller?".
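To make the distinction concrete, here is a minimal host-side sketch. It does not use Thrust: `min_index_sketch` is a hypothetical sequential loop written for illustration, and `MinimumValue` is a hypothetical stand-in for thrust::minimum<float>. Thrust's parallel implementation may pick a different wrong element, but the underlying problem is the same: the returned float is interpreted as a bool.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for thrust::minimum<float>:
// returns the smaller VALUE, not a yes/no answer.
struct MinimumValue {
    float operator()(float x, float y) const { return x < y ? x : y; }
};

// Hypothetical sequential analogue of a min_element search.
// comp(a, b) is expected to answer "is a smaller than b?".
template <typename Compare>
std::size_t min_index_sketch(const std::vector<float>& v, Compare comp) {
    assert(!v.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        // The result of comp is used as a bool. A float return value is
        // converted implicitly: any nonzero value counts as "true".
        if (comp(v[i], v[best])) {
            best = i;
        }
    }
    return best;
}
```

With the input {3.0f, 1.0f, 2.0f}, passing std::less<float>() yields index 1 (the value 1.0f), while passing MinimumValue() makes every comparison "true" and the loop ends on index 2 (the value 2.0f).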

To find the smallest element using min_element, pass the thrust::less predicate:

ptr_to_smallest_value = thrust::min_element(first, last, thrust::less<T>()); 

Alternatively, don't pass anything. thrust::less is the default:

ptr_to_smallest_value = thrust::min_element(first, last); 

If all you're interested in is the value of the smallest element (not an iterator pointing to the smallest element), you can combine thrust::minimum with thrust::reduce:

smallest_value = thrust::reduce(first, last, std::numeric_limits<T>::max(), thrust::minimum<T>()); 
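The choice of initial value matters here. As a rough host-side illustration, the sketch below uses std::accumulate in place of thrust::reduce; the function name `smallest_value_sketch` and its `init` parameter are hypothetical. It shows why std::numeric_limits<T>::max() is a safe starting point, whereas a fixed sentinel such as the 999 used in the question's CPU loop is only correct if some element is below it.

```cpp
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical helper: fold a "pick the smaller value" operation over v,
// starting from init. Mirrors the shape of reduce(first, last, init, minimum).
float smallest_value_sketch(const std::vector<float>& v,
                            float init = std::numeric_limits<float>::max()) {
    return std::accumulate(v.begin(), v.end(), init,
                           [](float acc, float x) { return x < acc ? x : acc; });
}
```

For the input {1500.0f, 2000.0f}, the default initial value gives 1500.0f, while an initial value of 999.0f gives 999.0f, which is not an element of the array at all.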

2 Comments

And which way is faster? I used the approach you suggested last night, and it returned a pointer (device_ptr), which I had to dereference with the * operator to extract the value.
They should be nearly the same speed, since both are limited by the bandwidth of reading the array.
