It seems that the RTX 3080 has an issue with CUDA 10.1

* This post is for English writing practice and for keeping a record of this information.

I bought a GIGABYTE RTX 3080 Gaming OC 10GB for deep learning and used it to train a model.

The training loss was fine, but the validation loss was NaN.
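A symptom like this is easy to miss if you only watch the training loss. The snippet below is a minimal sketch of how the per-epoch losses could be scanned for NaN. The `history` dictionary and the helper name `first_nan_epoch` are made up for illustration (the numbers are not from my actual run); the dictionary is only shaped like what Keras returns in `History.history`.

```python
import math


def first_nan_epoch(losses):
    """Return the 0-based index of the first NaN loss, or None if there is none."""
    for epoch, value in enumerate(losses):
        if math.isnan(value):
            return epoch
    return None


# Hypothetical loss values, shaped like Keras History.history (not real results)
history = {
    "loss": [0.92, 0.61, 0.48],
    "val_loss": [float("nan"), float("nan"), float("nan")],
}

for name, values in history.items():
    epoch = first_nan_epoch(values)
    status = "ok" if epoch is None else f"NaN from epoch {epoch}"
    print(f"{name}: {status}")
```

Running the same check in each environment makes it easy to compare them side by side.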

I tested the same script in 4 environments (OS: Windows 10 x64):

1. 3700x + RTX 3080 (CUDA 10.1)

2. 3700x only (no GPU)

3. Another laptop (i7 8750H + GTX 1050 Ti)

4. 3700x + RTX 3080 (CUDA 11.0 + cuDNN 8.0.3)

The validation losses were fine in every environment except the 1st.

So I think there is some issue with the RTX 3080 + CUDA 10.1 combination.

If you have issues with an RTX 3080, using the TensorFlow nightly build with CUDA 11.0 may be a solution. A TensorFlow contributor also said that TensorFlow 2.4.0 will support CUDA 11.0.

Edit) 10/21/2020 - I tested the combination of TensorFlow nightly build + CUDA 11.1 + cuDNN 8.0.4, and it worked.
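Before comparing environments, it helps to confirm which CUDA and cuDNN versions your TensorFlow build was compiled against. Below is a minimal sketch, assuming TensorFlow 2.3 or later, where `tf.sysconfig.get_build_info()` is available. The helper name `describe_tf_build` is made up for this post. As far as I know, the `cuda_version` and `cudnn_version` keys are only present on GPU builds, so the code reads them with `.get()`.

```python
def describe_tf_build():
    """Return TensorFlow version, build-time CUDA/cuDNN versions, and visible GPUs.

    Returns None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # get_build_info() exists in TF 2.3+; fall back to an empty dict on older versions
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    build = dict(get_build_info()) if get_build_info else {}

    return {
        "tensorflow": tf.__version__,
        "cuda": build.get("cuda_version"),    # may be None on CPU-only builds
        "cudnn": build.get("cudnn_version"),  # may be None on CPU-only builds
        "gpus": [device.name for device in tf.config.list_physical_devices("GPU")],
    }


info = describe_tf_build()
if info is None:
    print("TensorFlow is not installed in this environment.")
else:
    for key, value in info.items():
        print(f"{key}: {value}")
```

Note that this reports the versions TensorFlow was built with, not necessarily the CUDA toolkit installed on the machine.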

