Kubernetes GPU
https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#deploying-nvidia-gpu-device-plugin
1. Install nvidia-docker (Ubuntu 14.04)
https://github.com/NVIDIA/nvidia-docker
Remove the old version (nvidia-docker 1.0, if installed) together with its existing GPU containers:
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker
# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
2. Set the Docker runtime
First, check and/or enable the nvidia runtime as the default runtime on the node. To do this, edit the Docker daemon config file, which is usually located at /etc/docker/daemon.json:
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
Restart Docker.
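A quick sanity check on the edited file can catch a typo before it matters. The sketch below only greps for the "default-runtime" key. It does not validate the JSON as a whole, and the function name has_nvidia_default_runtime is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed only if the given daemon.json names "nvidia" as the
# default runtime. The function name is hypothetical; the path defaults
# to /etc/docker/daemon.json, the location mentioned above.
has_nvidia_default_runtime() {
    config="${1:-/etc/docker/daemon.json}"
    # Match "default-runtime": "nvidia", tolerating whitespace around the colon.
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$config"
}
```

For example: has_nvidia_default_runtime && echo "default runtime is nvidia"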
Verify that a container started with the nvidia runtime can see the GPUs:
root@ogs-gpu02:/etc/ssl/certs# docker run --runtime=nvidia --rm registry.bst-1.cns.bstjpc.com:5000/nvidia/cuda nvidia-smi
Fri Mar 23 05:30:37 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20m          Off  | 00000000:04:00.0 Off |                    0 |
| N/A   27C    P0    48W / 225W |      0MiB /  4742MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20m          Off  | 00000000:43:00.0 Off |                    0 |
| N/A   27C    P0    48W / 225W |      0MiB /  4742MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K20m          Off  | 00000000:84:00.0 Off |                    0 |
| N/A   31C    P0    47W / 225W |      0MiB /  4742MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K20m          Off  | 00000000:C4:00.0 Off |                    0 |
| N/A   30C    P0    48W / 225W |      0MiB /  4742MiB |     43%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Add --feature-gates="DevicePlugins=true" to the kubelet startup arguments.
Then deploy nvidia-device-plugin through Kubernetes.
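Once the device plugin is running, a pod can request a GPU through the nvidia.com/gpu resource. The sketch below writes a minimal test manifest. It assumes the CUDA image from the docker test above is also reachable from the cluster. The pod name gpu-test, the default file name and the function name write_gpu_test_pod are all hypothetical.

```shell
#!/bin/sh
# Sketch: write a one-GPU test pod manifest to the given file.
# The function name and default file name are hypothetical.
write_gpu_test_pod() {
    out="${1:-gpu-test-pod.yaml}"
    cat > "$out" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test            # hypothetical pod name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    # Same image as in the docker run test above (assumed pullable by the node)
    image: registry.bst-1.cns.bstjpc.com:5000/nvidia/cuda
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1   # resource advertised by the NVIDIA device plugin
EOF
}
```

After calling write_gpu_test_pod, the manifest can be submitted with kubectl create -f gpu-test-pod.yaml. If scheduling works, kubectl logs gpu-test should show nvidia-smi output similar to the docker test.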
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
Alternatively, to use the GPU support built into Kubernetes, start kubelet with --feature-gates="Accelerators=true"