
I built a Docker image for a server that does inference with TensorFlow. I installed tensorflow-gpu with pip in the Docker image. It works fine on my machine with Titan X GPUs, but when I run the container on another machine with 1080 Ti GPUs, the first run becomes incredibly slow: it takes about 90 seconds, whereas it usually takes 7 seconds on the first run and 1 second on subsequent runs. I tried setting TF_CUDNN_USE_AUTOTUNE to 0, and also mounting a folder to save the CUDA cache, but neither really solves the problem. Does anyone have any suggestions?
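For reference, the two mitigations described above might look like this on the command line. This is only a sketch; the image name, host path, and port are placeholders, not details from the original post:

```shell
# Disable cuDNN autotuning (trades some peak throughput for a faster,
# more predictable startup) and mount a host folder into the container
# so that anything written under the cache path survives restarts.
# "my-inference-image" and the paths are hypothetical.
docker run --runtime=nvidia \
  -e TF_CUDNN_USE_AUTOTUNE=0 \
  -e CUDA_CACHE_PATH=/cache/ComputeCache \
  -v /srv/tf-cache:/cache \
  my-inference-image
```

Note that mounting a folder only helps if the CUDA cache path inside the container actually points at the mounted directory, which is why `CUDA_CACHE_PATH` is set explicitly here.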

1 Answer


Here's a link. I found this there:

After running TensorFlow once, the compiled kernels are cached by CUDA. If using a docker container, the data is not cached and the penalty is paid each time TensorFlow starts.
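This points at the CUDA JIT cache. The TensorFlow pip wheels of that era were built for only a few GPU compute capabilities, so on a 1080 Ti (compute capability 6.1) the kernels are JIT-compiled from PTX on first use, and on Linux the results are cached under `~/.nv/ComputeCache` by default. Persisting that directory across container runs should avoid paying the compilation cost on every start. A hedged sketch, assuming the server runs as root inside the container (image name and host path are placeholders):

```shell
# Keep the CUDA JIT cache on the host so it survives container
# restarts, and raise the cache size limit: the historical default
# (around 256 MB) can be too small for TensorFlow's kernels, causing
# eviction and recompilation even when the cache is persisted.
docker run --runtime=nvidia \
  -e CUDA_CACHE_MAXSIZE=2147483648 \
  -v /srv/nv-cache:/root/.nv \
  my-inference-image
```

After one slow warm-up run, subsequent container starts should find the compiled kernels in the mounted cache.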


1 Comment

Any idea where these cached kernels are located?
