It seems that the RTX 3080 has an issue with CUDA 10.1

* This post is for English writing practice and for keeping a record of the information.

 

I bought a GIGABYTE RTX 3080 GAMING OC 10GB for deep learning and used it to train a model.

 

However, the validation loss was NaN even though the training loss was fine.
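
One way to confirm this symptom is to scan the recorded per-epoch losses for NaN. The sketch below is only an illustration: it assumes the losses are available as a dict of lists (the shape of a Keras `History.history`), and the helper name `first_nan_epoch` and the sample numbers are made up for the example, not taken from the actual training script.

```python
import math


def first_nan_epoch(losses):
    """Return the 1-based epoch of the first NaN loss, or None if there is none."""
    for epoch, value in enumerate(losses, start=1):
        if math.isnan(value):
            return epoch
    return None


# Hypothetical per-epoch losses, shaped like a Keras History.history dict.
history = {
    "loss": [0.92, 0.71, 0.58],
    "val_loss": [float("nan"), float("nan"), float("nan")],
}

for name, values in history.items():
    epoch = first_nan_epoch(values)
    status = "ok" if epoch is None else f"first NaN at epoch {epoch}"
    print(f"{name}: {status}")
```

Running the same check on the output of each environment makes it easy to see which setup produces the NaN values.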

I tested the same script in four environments (OS: Windows 10 x64):

 

  1. 3700X + RTX 3080 (CUDA 10.1)

  2. 3700X only (no GPU)

  3. Another laptop (i7-8750H + GTX 1050 Ti)

  4. 3700X + RTX 3080 (CUDA 11.0 + cuDNN 8.0.3)

 

The validation loss was fine in every environment except the first one.

So I think there is some issue with the RTX 3080 + CUDA 10.1 combination.

 

If you have issues with an RTX 3080, using the TensorFlow nightly build with CUDA 11.0 can be a solution. Also, a TensorFlow contributor said that TensorFlow 2.4.0 will support CUDA 11.0.
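
To check which CUDA version an installed TensorFlow build targets, something like the following sketch may help. It assumes TensorFlow 2.3 or later, where `tf.sysconfig.get_build_info()` is available. The helpers `parse_version`, `cuda_at_least`, and `report_tensorflow_build` are hypothetical names for this example. The format of the reported version string can differ between platforms, so a value that cannot be parsed is treated as unknown rather than guessed.

```python
def parse_version(text):
    """Turn a dotted version string such as '11.0' into a tuple of ints: (11, 0).

    Returns an empty tuple if any component is not a plain number.
    """
    parts = str(text).split(".")
    if not all(part.isdigit() for part in parts):
        return ()
    return tuple(int(part) for part in parts)


def cuda_at_least(build_version, minimum="11.0"):
    """Return True/False for a parseable version, or None if it is unknown."""
    parsed = parse_version(build_version)
    if not parsed:
        return None
    return parsed >= parse_version(minimum)


def report_tensorflow_build():
    """Print the CUDA/cuDNN versions this TensorFlow build targets and the GPUs it sees."""
    try:
        import tensorflow as tf  # imported lazily so the helpers work without it
    except ImportError:
        print("TensorFlow is not installed in this environment.")
        return
    info = tf.sysconfig.get_build_info()
    cuda = info.get("cuda_version")  # may be missing on CPU-only builds
    print("TensorFlow:", tf.__version__)
    print("Built against CUDA:", cuda, "/ cuDNN:", info.get("cudnn_version"))
    print("Visible GPUs:", tf.config.list_physical_devices("GPU"))
    if cuda is not None and cuda_at_least(cuda) is False:
        print("This build targets a CUDA version older than 11.0.")


if __name__ == "__main__":
    report_tensorflow_build()
```

Running this in each environment shows which CUDA version the installed TensorFlow was actually built against.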

 


Edit (10/21/2020): I tested the combination of TensorFlow nightly build + CUDA 11.1 + cuDNN 8.0.4, and it worked.