gpuarray.dot() works too slow at the first calling #309

decoli · 2021-08-25T06:46:07Z

I found that the first call to gpuarray.dot() takes a long time.
Here is my code:

...
# first call
start.record()
# res_gpu = gpuarray.dot(coef_gpu, image_gpu)
gpuarray.dot(coef_gpu, image_gpu)
end.record()
end.synchronize()
elapsed_ms = start.time_till(end)  # time_till() returns milliseconds
print("\ntime cost: {:.3f}ms\n".format(elapsed_ms)) # time cost: 813.931ms


# second call
start.record()
# res_gpu = gpuarray.dot(coef_gpu, image_gpu)
gpuarray.dot(coef_gpu, image_gpu)
end.record()
end.synchronize()
elapsed_ms = start.time_till(end)  # time_till() returns milliseconds
print("\ntime cost: {:.3f}ms\n".format(elapsed_ms)) # time cost: 0.056ms
...

Why does this happen? And how can I avoid it?


inducer · 2021-08-25T15:57:01Z

That's because the first time the function is called, a few kernels are compiled behind the scenes to do the work. The basic assumption is that your program will run for long enough (otherwise, why are you using a GPU to speed it up?) that this cost will be more than amortized. Also, that cost should only be incurred once. The kernels should be in the disk cache after that, making them quick to load.

decoli · 2021-08-26T00:41:52Z

Thanks for your reply.
I guess that, before the real use of gpuarray.dot(), I can call it once so that the kernels get compiled, like this:

...
gpuarray.dot(like_coef_gpu, like_image_gpu) # warm-up call, only so the kernels get compiled
...
...
gpuarray.dot(coef_gpu, image_gpu) # the real call

Is this a good solution?

inducer · 2021-08-26T03:13:05Z

If that works for your use case, then yes, that should avoid compilation/module load delays on subsequent runs of the kernel.
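The warm-up pattern discussed above can be sketched without a GPU. This is only an illustration: `make_lazy_op` and `timed_ms` are hypothetical helpers, and the sleep is a stand-in for kernel compilation, not what PyCUDA actually does internally. With the real library, the warm-up arrays would presumably need the same dtypes as the real inputs, on the assumption that kernels are generated per dtype.

```python
import time


def make_lazy_op(setup_seconds=0.05):
    """Hypothetical stand-in for a routine that pays a one-time setup cost."""
    state = {"ready": False}

    def op(x, y):
        if not state["ready"]:
            # Pretend this is kernel compilation; it happens on the first call only.
            time.sleep(setup_seconds)
            state["ready"] = True
        return sum(p * q for p, q in zip(x, y))

    return op


def timed_ms(fn, *args):
    """Call fn(*args) and return (result, elapsed wall time in milliseconds)."""
    t0 = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - t0) * 1e3


op = make_lazy_op()

# Warm-up call on throwaway data: absorbs the one-time cost.
dummy = [0.0] * 4
_, warmup_ms = timed_ms(op, dummy, dummy)

# The call that matters no longer includes the setup cost.
result, real_ms = timed_ms(op, [1.0, 2.0], [3.0, 4.0])

print("warm-up: {:.3f}ms, real call: {:.3f}ms".format(warmup_ms, real_ms))
```

The same shape applies to the snippet in the question: issue one throwaway call during start-up, then time or run the call you care about.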

decoli · 2021-08-26T07:40:14Z

Oh... I found that gpuarray.dot() behaves differently from numpy.dot().

It seems that

import skcuda.linalg as linalg
linalg.dot()

can serve as a GPU counterpart that works together with pycuda.

However, it raises this error:
CUSOLVER library only available in CUDA 7.0 and later

New problem...
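
For reference, a NumPy-only sketch of the difference mentioned above. It assumes that gpuarray.dot() multiplies the two arrays element by element and sums everything to a single scalar, whereas numpy.dot() on 2-D inputs is a matrix product. The arrays here are made up for illustration, and no GPU call is involved.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
b = np.array([[5.0, 6.0], [7.0, 8.0]], dtype=np.float32)

# numpy.dot on two 2-D arrays: matrix product, result has shape (2, 2).
matrix_product = np.dot(a, b)

# Assumed gpuarray.dot semantics, emulated on the CPU:
# element-wise product over the flattened arrays, reduced to one scalar.
flat_dot = np.sum(a.ravel() * b.ravel())

print(matrix_product.shape)
print(float(flat_dot))
```

If a matrix product on the GPU is what is needed, that would explain why gpuarray.dot() alone does not give the same result as numpy.dot().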
