Installing and configuring tensorflow-gpu 2.6, CUDA, and cuDNN on Windows: some problems and their fixes

Preface

The versions of tensorflow-gpu, CUDA, and cuDNN must match each other!!!

For details, see: Build from source on Windows | TensorFlow (google.cn)
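As a small illustration of the version-matching rule, here is a minimal sketch of a compatibility check. The table holds only the combination used in this post (tensorflow-gpu 2.6 with CUDA 11.2 and cuDNN 8.1); `KNOWN_GOOD`, `major_minor`, and `check_versions` are hypothetical names introduced for this example, and entries for other releases would have to be copied from the official tested-configurations table.

```python
# Hypothetical helper: compare installed versions against a known-good combination.
# Only the combination from this post is listed; add others from the official table.
KNOWN_GOOD = {
    "2.6": {"cuda": "11.2", "cudnn": "8.1"},
}


def major_minor(version):
    """Reduce '11.2.0' or '8.1.0.77' to 'major.minor'."""
    return ".".join(version.split(".")[:2])


def check_versions(tf_version, cuda_version, cudnn_version):
    """Return a list of mismatch descriptions (an empty list means the versions agree)."""
    expected = KNOWN_GOOD.get(major_minor(tf_version))
    if expected is None:
        return [f"no known-good entry for tensorflow-gpu {tf_version}"]
    problems = []
    if major_minor(cuda_version) != expected["cuda"]:
        problems.append(f"CUDA {cuda_version} != expected {expected['cuda']}")
    if major_minor(cudnn_version) != expected["cudnn"]:
        problems.append(f"cuDNN {cudnn_version} != expected {expected['cudnn']}")
    return problems


# The versions used in this post
print(check_versions("2.6.0", "11.2.0", "8.1.0.77"))  # -> []
```

Only the major and minor parts are compared, on the assumption that patch-level differences within the same pairing are harmless.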

1. Environment

My setup: Windows 11 (Home, x64) + tensorflow-gpu 2.6.0 + cudatoolkit 11.2.0 + cudnn 8.1.0.77 + Anaconda3 (2024.06_1) + PyCharm 2023.3.2 + RTX 4050 GPU

2. Installation steps

1. Installation order

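Step 1: Install PyCharm.
Step 2: Install Anaconda3.
Step 3: Open Anaconda Prompt as administrator, and install the packages in the following steps with conda or pip commands.
Step 4: Install cudatoolkit with conda.
Step 5: Install cudnn with conda.
Step 6: Install tensorflow-gpu with pip.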
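Steps 3 to 6 can be sketched as a short list of commands. The snippet below only builds the command strings and prints them; it does not run anything. The environment name `tf-gpu-2.6` is inferred from the interpreter path shown later in this post, the environment is assumed to exist already, and `install_commands` is a hypothetical helper. Depending on your conda channels, the `conda install` lines may need a channel flag, which is not shown here.

```python
def install_commands(env_name, tf_version, cuda_version, cudnn_version):
    """Build the commands for steps 3 to 6 as strings; nothing is executed."""
    return [
        f"conda activate {env_name}",                  # work inside the virtual environment
        f"conda install cudatoolkit={cuda_version}",   # step 4
        f"conda install cudnn={cudnn_version}",        # step 5
        f"pip install tensorflow-gpu=={tf_version}",   # step 6
    ]


# Versions taken from the environment described in this post
for command in install_commands("tf-gpu-2.6", "2.6.0", "11.2.0", "8.1.0.77"):
    print(command)
```

Printing the commands first makes it easy to compare the pinned versions against each other before typing them into Anaconda Prompt.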

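2. Reference links

Installing in the order above completes the environment setup. The posts I followed:

Anaconda3 installation:
1. "Detailed Anaconda installation tutorial for Windows" (CSDN blog)
2. "Still don't understand what Anaconda is? This one article is enough" (CSDN blog, supplementary reference)

Steps 3 to 6: "Install Tensorflow-gpu 2.6.0 in ten minutes with a local CUDA 12, plus coordinating the versions of numpy, matplotlib, and other packages" (CSDN blog)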

Notes:

1. When installing Anaconda3, the first reference post has a section on changing conda's default virtual environment path. Method 1 is the one that took effect for me (method 2 also works).

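2. When configuring tensorflow-gpu, CUDA, and cuDNN by following the reference post, the package versions that conda list shows at the end may differ slightly from the ones the blogger shows. There is no need to rush to change them. My list after installation:

[Screenshot: my conda list output after installation]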
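To compare the key package versions without reading the whole listing by eye, the text printed by conda list can be parsed. This is a minimal sketch: `parse_conda_list` is a hypothetical helper, and the sample text is illustrative only. Its version numbers come from this post, but the path and build strings are made up.

```python
def parse_conda_list(text):
    """Map package name -> version from `conda list` style text."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        parts = line.split()
        if len(parts) >= 2:
            versions[parts[0]] = parts[1]
    return versions


# Illustrative sample: versions from this post, path and build strings are hypothetical
SAMPLE = """\
# packages in environment at (hypothetical path):
#
# Name                    Version                   Build  Channel
cudatoolkit               11.2.0               h_sample_0
cudnn                     8.1.0.77             h_sample_1
tensorflow-gpu            2.6.0                    pypi_0    pypi
"""

installed = parse_conda_list(SAMPLE)
for name in ("tensorflow-gpu", "cudatoolkit", "cudnn"):
    print(name, installed.get(name, "not installed"))
```

In practice the text would come from running conda list inside the activated environment and pasting or piping its output into the function.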

3. Testing and verification

1. TensorFlow-gpu test

The test code is as follows:

import tensorflow as tf

"""
    tensorflow-gpu-2.6.0 test
"""

print(tf.__version__)
print(tf.test.gpu_device_name())
# Note: this prints the function object itself; it does not call it
print(tf.config.experimental.set_visible_devices)
print('GPU:', tf.config.list_physical_devices('GPU'))
print('CPU:', tf.config.list_physical_devices(device_type='CPU'))
print(tf.config.list_physical_devices('GPU'))
# Deprecated in TF 2.x in favor of tf.config.list_physical_devices('GPU')
print(tf.test.is_gpu_available())
# Print the number of available GPUs
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
# Query GPU devices

Running it produced errors such as "Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll", that is, dynamic libraries that could not be loaded.
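When several libraries fail to load, it helps to collect their names from the log before looking for them on disk. The following is a minimal sketch that assumes the messages follow the "Could not load dynamic library '...'" wording quoted above; `missing_libraries` is a hypothetical helper, and the sample text repeats only the fragment quoted in this post.

```python
import re

# Accept straight or typographic quotes around the library name
_MISSING = re.compile(r"Could not load dynamic library ['‘’]([^'‘’]+)['‘’]")


def missing_libraries(log_text):
    """Return the library names from the messages, in order, without duplicates."""
    names = []
    for name in _MISSING.findall(log_text):
        if name not in names:
            names.append(name)
    return names


# The same fragment twice, to show that duplicates are collapsed
sample = (
    "Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll\n"
    "Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll\n"
)
print(missing_libraries(sample))  # -> ['cudart64_110.dll']
```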


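Searching online turned up a pile of answers that mostly say the same thing: find and download whichever DLL is missing, then drop it into C:\Windows\System. That treats the symptom, not the cause. Thinking it over, my tensorflow-gpu, CUDA, and cuDNN versions all follow the recommendations on the TensorFlow website, so they should not be wrong. So I first checked whether the missing DLLs were present in my conda virtual environment. The path I created is D:\ProgramData\Data\Anaconda_envs\envs\tf-gpu-2.6\Library\bin:

[Screenshot: the DLL files in that folder]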

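Every DLL named in the errors was there, so I suspected that the system simply could not find that path. I opened the system environment variables and looked at Path: sure enough, the folder had not been added. After adding it, I ran the test code again, with the following result:

[Screenshot: output of the test code after adding the folder to Path]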

With the result above the test passes, which means the TensorFlow-gpu environment is basically set up. Congratulations!!!
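The two checks made by hand above (are the DLLs in the environment's folder, and is that folder listed in Path?) can also be scripted. This is a minimal sketch using only the standard library; `dlls_present` and `is_on_path` are hypothetical helper names, and the folder and DLL names you pass in are whatever your own error messages and environment show.

```python
import os
from pathlib import Path


def dlls_present(directory, dll_names):
    """Map each DLL name to True/False depending on whether it exists in `directory`."""
    base = Path(directory)
    return {name: (base / name).is_file() for name in dll_names}


def is_on_path(directory, path_value=None):
    """True if `directory` is one of the entries of a PATH-style string.

    When `path_value` is None, the PATH of the current process is used.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normcase(os.path.normpath(str(directory)))
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(os.path.normcase(os.path.normpath(entry)) == target for entry in entries)
```

For the situation in this post, one would call `dlls_present` with the environment's Library\bin folder and the DLL names from the error messages, then `is_on_path` with the same folder. Files that exist while the folder is missing from Path matches the problem described above. Note that a process started before Path was edited still sees the old value, so the check should run in a freshly opened terminal or IDE.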

2. CUDA acceleration test

The test code is as follows:

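import cv2
import tensorflow as tf
from mtcnn import MTCNN

# TF1-style session config: cap GPU memory at 60% and allocate it on demand
config = tf.compat.v1.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.6
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)

detector = MTCNN()

# cv2.imread returns None if the file cannot be read
img = cv2.imread("Lena3.png")

# Returns a list with one dict per detected face
output = detector.detect_faces(img)

print(output)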

This produced the messages below, mainly:

Error 1. Call to CreateProcess failed. Error code: 2

Warning 2. Couldn't get ptxas version string: Internal: Couldn't invoke ptxas.exe --version

Warning 3. Internal: Failed to launch ptxas

D:\ProgramData\Data\Anaconda_envs\envs\tf-gpu-2.6\python.exe D:\Codes\Python\tf_gpu_test\cuda_test.py
2024-08-29 17:45:39.732544: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-29 17:45:40.595161: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3684 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
2024-08-29 17:45:40.618800: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3684 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
2024-08-29 17:45:40.827761: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2024-08-29 17:45:41.754506: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2024-08-29 17:45:42.639977: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2024-08-29 17:45:42.641084: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2024-08-29 17:45:42.641227: W tensorflow/stream_executor/gpu/asm_compiler.cc:77] Couldn't get ptxas version string: Internal: Couldn't invoke ptxas.exe --version
2024-08-29 17:45:42.648187: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2024-08-29 17:45:42.648554: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
2024-08-29 17:45:45.448664: I tensorflow/stream_executor/cuda/cuda_blas.cc:1760] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
[{'box': [202, 182, 158, 210], 'confidence': 0.9971387386322021, 'keypoints': {'left_eye': (267, 265), 'right_eye': (336, 267), 'nose': (316, 317), 'mouth_left': (265, 348), 'mouth_right': (322, 351)}}]

Process finished with exit code 0
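In a log like the one above, the single letter after the timestamp is the severity: I for info, W for warning, E for error (F marks a fatal error). A minimal sketch for pulling out only the lines that need attention follows; `problems` is a hypothetical helper, and the regular expression assumes the timestamp and `file:line]` layout seen in this log.

```python
import re

# timestamp: SEVERITY source_file:line] message
_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) (\S+):(\d+)\] (.*)$"
)


def problems(log_text):
    """Return (severity, source_file, message) for every non-info log line."""
    found = []
    for line in log_text.splitlines():
        match = _LINE.match(line.strip())
        if match and match.group(1) != "I":
            found.append((match.group(1), match.group(2), match.group(4)))
    return found


# Three lines copied from the log in this post
sample = """\
2024-08-29 17:45:41.754506: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2024-08-29 17:45:42.639977: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2024-08-29 17:45:42.648554: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Failed to launch ptxas
"""

for severity, source, message in problems(sample):
    print(severity, source, message)
```

Continuation lines without a timestamp (such as "Relying on driver to perform ptx compilation.") are skipped by this sketch, so the full log is still worth reading once a problem line has been found.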

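For error 1, Call to CreateProcess failed, I followed method 1 from another blogger's post: run conda install -c nvidia cuda-nvcc so that ptxas is available in the conda environment. (My cudatoolkit was installed inside the conda virtual environment, not from an installer downloaded from the official website, so that post's method 2 does not apply to me.) The full command: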

conda install -c nvidia cuda-nvcc

I tried it and still got errors, identical to the messages above, with no improvement at all... (Don't rush to uninstall cuda-nvcc.)

Searching the computer for ptxas.exe showed that it was sitting right there under my conda virtual environment path:

[Screenshot: location of ptxas.exe inside the conda environment]

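So, same trick as before: I added that path to Path in the system environment variables, closed PyCharm, reopened it, and ran the test code again. The errors and warnings above were all gone. Nice!!! The correct output: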

D:\ProgramData\Data\Anaconda_envs\envs\tf-gpu-2.6\python.exe D:\Codes\Python\tf_gpu_test\cuda_test.py
2024-08-29 19:11:11.700032: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-29 19:11:14.105552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3684 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
2024-08-29 19:11:14.129957: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3684 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
2024-08-29 19:11:14.347828: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2024-08-29 19:11:15.332231: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2024-08-29 19:11:21.279191: I tensorflow/stream_executor/cuda/cuda_blas.cc:1760] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
[{'box': [202, 182, 158, 210], 'confidence': 0.9971387386322021, 'keypoints': {'left_eye': (267, 265), 'right_eye': (336, 267), 'nose': (316, 317), 'mouth_left': (265, 348), 'mouth_right': (322, 351)}}]

Process finished with exit code 0
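Windows error code 2 is ERROR_FILE_NOT_FOUND, which fits the diagnosis above: the process could not locate ptxas.exe through Path. Whether a tool is reachable can be checked from Python with the standard library, as in the sketch below. `find_tool` and `prepend_to_path` are hypothetical helpers. Prepending a folder to PATH inside the running process is shown as an alternative to editing the system variable; it is an assumption that this is early enough for TensorFlow, so it would need to happen before TensorFlow is imported and should be verified on your own machine.

```python
import os
import shutil


def find_tool(name, path_value=None):
    """Return the full path of `name` if it is reachable via PATH, else None."""
    return shutil.which(name, path=path_value)


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of env['PATH'] unless it is already listed."""
    if env is None:
        env = os.environ
    entries = [entry for entry in env.get("PATH", "").split(os.pathsep) if entry]
    wanted = os.path.normcase(os.path.normpath(directory))
    already_listed = any(
        os.path.normcase(os.path.normpath(entry)) == wanted for entry in entries
    )
    if not already_listed:
        env["PATH"] = os.pathsep.join([directory] + entries)
    return env["PATH"]


# None means a child process started from here would not find ptxas either
print(find_tool("ptxas"))
```

Because `prepend_to_path` only changes the environment of the current process, it leaves the system settings untouched, which can be convenient when several CUDA versions live in different conda environments.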

3. Face detection code test

Now let's run a piece of face detection code as a test:

import cv2
import tensorflow as tf
from mtcnn import MTCNN

# TF1-style session config: cap GPU memory at 60% and allocate it on demand
config = tf.compat.v1.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.6
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)

detector = MTCNN()
img = cv2.imread('Lena3.png')
output = detector.detect_faces(img)
print(output)

# Use the first detected face; output[0] raises IndexError if no face was found
x, y, width, height = output[0]['box']
left_eye_X, left_eye_Y = output[0]['keypoints']['left_eye']
right_eye_X, right_eye_Y = output[0]['keypoints']['right_eye']
nose_X, nose_Y = output[0]['keypoints']['nose']
mouth_left_X, mouth_left_Y = output[0]['keypoints']['mouth_left']
mouth_right_X, mouth_right_Y = output[0]['keypoints']['mouth_right']

# OpenCV's channel order is BGR (Blue, Green, Red)
cv2.rectangle(img, pt1=(x, y), pt2=(x + width, y + height), color=(0, 255, 0), thickness=2)
cv2.circle(img, center=(left_eye_X, left_eye_Y), color=(0, 0, 255), thickness=2, radius=1)
cv2.circle(img, center=(right_eye_X, right_eye_Y), color=(0, 0, 255), thickness=2, radius=1)
cv2.circle(img, center=(nose_X, nose_Y), color=(255, 0, 0), thickness=2, radius=1)
cv2.circle(img, center=(mouth_left_X, mouth_left_Y), color=(255, 0, 0), thickness=2, radius=1)
cv2.circle(img, center=(mouth_right_X, mouth_right_Y), color=(255, 0, 0), thickness=2, radius=1)
cv2.imshow('result', img)

# Keep the window open until a key is pressed
cv2.waitKey(0)

The result:

[Screenshot: the image with the detected face box and keypoints drawn on it]
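The test script indexes the detection list with `output[0]` directly, which fails when no face is detected. A minimal sketch of a more defensive way to read the result, assuming the list-of-dicts structure printed in the logs earlier; `first_face` is a hypothetical helper and needs neither OpenCV nor TensorFlow.

```python
def first_face(detections):
    """Return (box, keypoints) for the first detection, or None if the list is empty."""
    if not detections:
        return None
    face = detections[0]
    return tuple(face["box"]), dict(face["keypoints"])


# Sample copied from the output printed earlier in this post
sample = [{
    'box': [202, 182, 158, 210],
    'confidence': 0.9971387386322021,
    'keypoints': {
        'left_eye': (267, 265), 'right_eye': (336, 267), 'nose': (316, 317),
        'mouth_left': (265, 348), 'mouth_right': (322, 351),
    },
}]

result = first_face(sample)
if result is None:
    print("no face detected")
else:
    (x, y, width, height), keypoints = result
    print("box:", x, y, width, height)
    print("nose:", keypoints["nose"])
```

The drawing calls can then be placed inside the `else` branch, so the script reports a missing face instead of stopping with an IndexError.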