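Problem description:

nvidia-smi produces output, so the GPU driver is present, and nvcc reports CUDA version 9.0 as expected, not 9.1. Even so, PyTorch could not use CUDA. Since I could not pinpoint the cause, I simply reinstalled everything.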

sudo tee /proc/acpi/bbswitch <<<ON
# ON
nvidia-smi

Output:

Tue May 28 22:21:07 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 950M    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   50C    P0    N/A /  N/A |      0MiB /  2004MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

nvcc --version

Output:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

lspci | grep -i nvidia

Output:

01:00.0 3D controller: NVIDIA Corporation GM107M [GeForce GTX 950M] (rev a2)
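
The versions shown above can also be compared by a small script rather than by eye. The sketch below pulls the driver version out of the nvidia-smi header and the toolkit release out of the nvcc output, then compares the driver against 384.81, the driver bundled with the CUDA 9.0.176 runfile. Treating that bundled version as the minimum for CUDA 9.0 is an assumption, and the helper names are hypothetical.

```python
import re

# Assumption: the driver bundled with cuda_9.0.176_384.81_linux.run
# is taken as the minimum driver for CUDA 9.0.
MIN_DRIVER_FOR_CUDA_9_0 = (384, 81)


def parse_driver_version(smi_text):
    """Return the driver version from nvidia-smi output as a tuple of ints, or None."""
    match = re.search(r"Driver Version:\s*(\d+(?:\.\d+)*)", smi_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def parse_nvcc_release(nvcc_text):
    """Return the toolkit release string (e.g. '9.0') from nvcc --version output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_text)
    return match.group(1) if match else None


def driver_supports_cuda_9_0(driver_version):
    """True if the parsed driver version is at least the assumed minimum."""
    return driver_version is not None and driver_version >= MIN_DRIVER_FOR_CUDA_9_0


if __name__ == "__main__":
    # Sample lines copied from the outputs in this post.
    smi = "| NVIDIA-SMI 390.67 Driver Version: 390.67 |"
    nvcc = "Cuda compilation tools, release 9.0, V9.0.176"
    driver = parse_driver_version(smi)
    print(driver, parse_nvcc_release(nvcc), driver_supports_cuda_9_0(driver))
```

With the values from this machine the check passes, which matches the observation that the driver and toolkit versions themselves looked fine.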

Check whether PyTorch can use CUDA:

python -c 'import torch; print(torch.cuda.is_available())'

Output:

False
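
A bare False says little about why CUDA is unavailable. As a rough sketch, the helper below (the function name is hypothetical) gathers a few more details from PyTorch: the CUDA version the build was compiled against, the number of visible devices, and the device name when one is found. It assumes nothing about the cause of the failure and only reports what PyTorch itself sees.

```python
def torch_cuda_report():
    """Collect what the installed PyTorch build knows about CUDA."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    if report["cuda_available"]:
        report["device_0"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If built_with_cuda is None, the installed PyTorch is a CPU-only build and reinstalling the driver or toolkit would not change the result.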

Uninstall CUDA

sudo /usr/local/cuda-9.0/bin/uninstall_cuda_9.0.pl
# After this, only the cuDNN files remain; they can be deleted entirely as well.
sudo rm -rf /usr/local/cuda-9.0/

Uninstall the NVIDIA driver and Bumblebee

sudo apt-get remove --purge nvidia-cuda-dev nvidia-cuda-toolkit nvidia-nsight nvidia-visual-profiler
sudo apt autoremove --purge bumblebee-nvidia nvidia-driver nvidia-settings

Install the GPU driver and Bumblebee

sudo apt-get install nvidia-smi
sudo apt-get install bumblebee-nvidia nvidia-driver nvidia-settings

Install the GPU driver test utilities

sudo apt-get install mesa-utils

Show information about the NVIDIA card:

optirun glxinfo|grep NVIDIA

Run the test program

optirun glxgears -info

The GPU driver is invoked successfully. Output:

GL_RENDERER   = GeForce GTX 950M/PCIe/SSE2
GL_VERSION    = 4.6.0 NVIDIA 390.67
GL_VENDOR     = NVIDIA Corporation

Install CUDA

https://developer.nvidia.com/cuda-90-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=runfilelocal

Download the runfile, then run it:

sudo ./cuda_9.0.176_384.81_linux.run

During installation, answer no only to this prompt, so that the driver installed above is kept:

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
(y)es/(n)o/(q)uit: n

Download and install cuDNN

<https://developer.nvidia.com/rdp/cudnn-archive>

Log in and download the version that matches your CUDA release. I chose:

cudnn-9.0-linux-x64-v7.5.0.56

Copy the cuDNN libraries and headers into the corresponding CUDA directories:

sudo cp lib64/* /usr/local/cuda/lib64/
sudo cp include/* /usr/local/cuda/include/
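
To confirm that the copied header is the expected cuDNN release, the version macros can be read back from it. The sketch below assumes the macros CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined in cudnn.h, which is the case for cuDNN 7.x; the function name is hypothetical.

```python
import re
from pathlib import Path

# Header location used in the copy step above.
CUDNN_HEADER = Path("/usr/local/cuda/include/cudnn.h")


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the CUDNN_* macros, or None if any is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            rf"^\s*#\s*define\s+{name}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    if CUDNN_HEADER.exists():
        print(parse_cudnn_version(CUDNN_HEADER.read_text(errors="replace")))
    else:
        print(f"{CUDNN_HEADER} not found")
```

For the archive chosen here, the first three components should correspond to 7.5.0.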

Then check the environment variables and switch the NVIDIA card on

# Check LD_LIBRARY_PATH and PATH
vim ~/.bashrc
# Use Bumblebee (bbswitch) to power on the NVIDIA card
sudo tee /proc/acpi/bbswitch <<<ON
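
The environment check can also be scripted. This is a minimal sketch that assumes the toolkit is reachable through /usr/local/cuda, so that PATH should contain /usr/local/cuda/bin and LD_LIBRARY_PATH should contain /usr/local/cuda/lib64. Both directories and the function name are assumptions; a setup that uses /usr/local/cuda-9.0 directly would need the constants adjusted.

```python
import os

# Assumed install locations; adjust if the toolkit lives elsewhere.
EXPECTED = {
    "PATH": "/usr/local/cuda/bin",
    "LD_LIBRARY_PATH": "/usr/local/cuda/lib64",
}


def missing_cuda_paths(env):
    """Return {variable: directory} for each expected directory absent from env."""
    missing = {}
    for variable, directory in EXPECTED.items():
        raw = env.get(variable, "")
        # Split on the platform separator and ignore trailing slashes.
        entries = [entry.rstrip("/") for entry in raw.split(os.pathsep) if entry]
        if directory not in entries:
            missing[variable] = directory
    return missing


if __name__ == "__main__":
    problems = missing_cuda_paths(os.environ)
    for variable, directory in problems.items():
        print(f"{variable} does not contain {directory}")
    if not problems:
        print("PATH and LD_LIBRARY_PATH contain the expected CUDA directories")
```

Run it in a fresh shell after editing ~/.bashrc so that the new values are actually loaded.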

Check again whether PyTorch can use CUDA

python -c "import torch;print(torch.cuda.is_available())"

Output:

True

Check whether TensorFlow can use the GPU

python3 -c "import tensorflow as tf;print(tf.test.is_gpu_available());print(tf.test.gpu_device_name())"

Output:

2019-05-28 22:52:25.862539: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-05-28 22:52:26.319239: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-28 22:52:26.319674: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 950M major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:01:00.0
totalMemory: 1.96GiB freeMemory: 1.92GiB
2019-05-28 22:52:26.319696: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
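
The one-liner above relies on tf.test.is_gpu_available(), which newer TensorFlow releases mark as deprecated in favour of tf.config.list_physical_devices. As a hedged sketch for readers on a different TensorFlow version, the hypothetical helper below uses whichever of the two calls the installed version provides.

```python
def tensorflow_gpu_report():
    """Report the GPUs TensorFlow can see, using whichever API is available."""
    try:
        import tensorflow as tf
    except ImportError:
        return {"tensorflow_installed": False}

    report = {"tensorflow_installed": True, "version": tf.__version__}
    config = getattr(tf, "config", None)
    if config is not None and hasattr(config, "list_physical_devices"):
        # Newer releases: list the physical GPU devices directly.
        report["gpus"] = [device.name for device in config.list_physical_devices("GPU")]
    else:
        # Older releases, such as the one used in this post.
        report["gpu_available"] = tf.test.is_gpu_available()
    return report


if __name__ == "__main__":
    for key, value in tensorflow_gpu_report().items():
        print(f"{key}: {value}")
```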

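Everything works now. It probably does not get more involved than this: a complete uninstall followed by a reinstall, with both the removal and the installation steps covered.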
