This article was first published on my personal blog, https://kezunlin.me/post/95370db7/. You are welcome to read the latest version there!

Keras multi-GPU training

Guide

multi_gpu_model

import tensorflow as tf
from keras.applications import Xception
from keras.utils import multi_gpu_model
import numpy as np

G = 8  # number of GPUs
batch_size_per_gpu = 32
batch_size = batch_size_per_gpu * G

num_samples = 1000
height = 224
width = 224
num_classes = 1000

# Instantiate the base model (or "template" model).
# We recommend doing this under a CPU device scope,
# so that the model's weights are hosted on CPU memory.
# Otherwise they may end up hosted on a GPU, which would
# complicate weight sharing.
with tf.device('/cpu:0'):
    model = Xception(weights=None,
                     input_shape=(height, width, 3),
                     classes=num_classes)

# Replicates the model on 8 GPUs.
# This assumes that your machine has 8 available GPUs.
parallel_model = multi_gpu_model(model, gpus=G)
parallel_model.compile(loss='categorical_crossentropy',
                       optimizer='rmsprop')

# Generate dummy data.
x = np.random.random((num_samples, height, width, 3))
y = np.random.random((num_samples, num_classes))

# This `fit` call will be distributed on 8 GPUs.
# Since the batch size is 256, each GPU will process 32 samples.
parallel_model.fit(x, y, epochs=20, batch_size=batch_size)

# Save model via the template model (which shares the same weights):
model.save('my_model.h5')
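
The comment "each GPU will process 32 samples" follows from the data-parallel scheme: one global batch is cut into per-GPU sub-batches, each replica processes its slice, and the outputs are joined back in order. The following is a minimal pure-Python sketch of that idea only. It is not the Keras implementation, and `split_batch`, `run_data_parallel`, and `replica_fn` are hypothetical names introduced here for illustration.

```python
def split_batch(batch, num_gpus):
    """Cut one global batch into contiguous sub-batches, one per GPU."""
    step = len(batch) // num_gpus
    sub_batches = []
    for i in range(num_gpus):
        start = i * step
        # The last slice takes whatever is left, so no sample is dropped.
        end = len(batch) if i == num_gpus - 1 else start + step
        sub_batches.append(batch[start:end])
    return sub_batches


def run_data_parallel(batch, num_gpus, replica_fn):
    """Apply replica_fn to every sub-batch and join the outputs in order.

    Here the replicas run one after another; on real hardware each
    sub-batch would be processed on its own device.
    """
    outputs = []
    for sub_batch in split_batch(batch, num_gpus):
        outputs.extend(replica_fn(sub_batch))
    return outputs


G = 8
batch_size_per_gpu = 32
batch = list(range(batch_size_per_gpu * G))  # stand-in for 256 samples

sizes = [len(s) for s in split_batch(batch, G)]
merged = run_data_parallel(batch, G, lambda sub: [v * 2 for v in sub])
print(sizes)
print(len(merged))
```

With a global batch of 256 and 8 replicas, every slice holds 32 samples. When the batch size is not divisible by the number of GPUs, this sketch gives the remainder to the last replica.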

Results

The following results are from "Multi-GPU training with Keras, Python, and deep learning" on Onepanel.io:

To validate this, we trained MiniGoogLeNet on the CIFAR-10 dataset with four V100 GPUs.

Using a single GPU, we obtained 63-second epochs with a total training time of 74m10s.

However, by using multi-GPU training with Keras and Python, we decreased training time to 16-second epochs with a total training time of 19m3s.

That is roughly a 4x speedup!
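
The speedup figure can be checked directly from the timings quoted above. The short sketch below only redoes that arithmetic; the `to_seconds` helper and the variable names are introduced here for illustration, and "efficiency" is taken to mean speedup divided by the number of GPUs.

```python
def to_seconds(minutes, seconds):
    """Convert a duration given as minutes and seconds to seconds."""
    return minutes * 60 + seconds


num_gpus = 4

# Per-epoch timings quoted above: 63 s on one GPU, 16 s on four GPUs.
epoch_speedup = 63 / 16

# Total training times quoted above: 74m10s versus 19m3s.
total_speedup = to_seconds(74, 10) / to_seconds(19, 3)

# Fraction of the ideal (linear) speedup that was actually achieved.
efficiency = total_speedup / num_gpus

print(f"epoch speedup: {epoch_speedup:.2f}x")
print(f"total speedup: {total_speedup:.2f}x")
print(f"scaling efficiency: {efficiency:.1%}")
```

Both ratios come out slightly below 4, so the result is close to linear scaling across the four GPUs rather than an exact 4x.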

Reference

History

  • 20190910: created.

Copyright
