【Repost】【TensorFlow】The difference between static_rnn and dynamic_rnn
Original article:
https://blog.csdn.net/qq_20135597/article/details/88980975
---------------------------------------------------------------------------------------------
TensorFlow provides two RNN interfaces: a static one (static_rnn) and a dynamic one (dynamic_rnn).
Typical usage:
1. Static interface: static_rnn
Mainly uses tf.contrib.rnn:
x = tf.placeholder("float", [None, n_steps, n_input])
x1 = tf.unstack(x, n_steps, 1)
lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
outputs, states = tf.contrib.rnn.static_rnn(lstm_cell, x1, dtype=tf.float32)
pred = tf.contrib.layers.fully_connected(outputs[-1], n_classes, activation_fn=None)
A static RNN builds a network of fixed length (n_steps) directly into the graph. This leads to:
Disadvantages:
- Graph creation takes longer, uses more memory, and produces a larger exported model;
- Sequences longer than originally specified (> n_steps) cannot be fed in.
Advantages:
- The model exposes the intermediate states of the sequence, which makes debugging easier.
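What tf.unstack does to the input, and why it freezes the step count, can be sketched in plain NumPy (a hypothetical illustration of the slicing, not TensorFlow code):

```python
import numpy as np

# tf.unstack(x, n_steps, 1) splits [batch, n_steps, n_input] along axis 1
# into a Python list of n_steps arrays of shape [batch, n_input].
# static_rnn then wires one copy of the cell per list element, so the
# number of time steps is fixed at graph-build time.
x = np.arange(2 * 3 * 4).reshape(2, 3, 4)        # batch=2, n_steps=3, n_input=4
steps = [x[:, t, :] for t in range(x.shape[1])]  # list of 3 arrays, each (2, 4)
print(len(steps), steps[0].shape)
```

Feeding a longer sequence would require a different-length list, i.e. a different graph, which is exactly the limitation described above.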
2. Dynamic interface: dynamic_rnn
Mainly uses tf.nn.dynamic_rnn:
x = tf.placeholder("float", [None, n_steps, n_input])
lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
outputs,_ = tf.nn.dynamic_rnn(lstm_cell ,x,dtype=tf.float32)
outputs = tf.transpose(outputs, [1, 0, 2])
pred = tf.contrib.layers.fully_connected(outputs[-1], n_classes, activation_fn=None)
When tf.nn.dynamic_rnn is executed, it uses a loop to build the graph dynamically. This means:
Advantages:
- Graph creation is faster and uses less memory;
- Variable-size batches can be fed in.
Disadvantages:
- Only the final state is available from the model.
A dynamic RNN creates the cell for a single time step only; the rest of the sequence is fed through that same cell in a loop at run time.
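That run-time loop can be sketched in plain NumPy with a minimal vanilla-RNN cell (a hypothetical stand-in for illustration; the cell, weights, and function here are not the TensorFlow implementation):

```python
import numpy as np

def rnn_step(x_t, h, W_xh, W_hh, b):
    # One application of the cell: the same weights are reused at every step.
    return np.tanh(x_t @ W_xh + h @ W_hh + b)

def dynamic_rnn(x, W_xh, W_hh, b):
    # x: [batch, time, features] -- the step count is read from the input
    # at run time, mirroring how tf.nn.dynamic_rnn loops instead of
    # unrolling a fixed number of copies into the graph.
    batch, n_steps, _ = x.shape
    h = np.zeros((batch, W_hh.shape[0]))
    outputs = []
    for t in range(n_steps):
        h = rnn_step(x[:, t, :], h, W_xh, W_hh, b)
        outputs.append(h)
    return np.stack(outputs, axis=1), h  # outputs: [batch, time, hidden]

rng = np.random.default_rng(0)
W_xh = rng.standard_normal((3, 4)) * 0.1
W_hh = rng.standard_normal((4, 4)) * 0.1
b = np.zeros(4)

# The same weights handle a 5-step batch and an 8-step batch:
out5, _ = dynamic_rnn(rng.standard_normal((2, 5, 3)), W_xh, W_hh, b)
out8, _ = dynamic_rnn(rng.standard_normal((2, 8, 3)), W_xh, W_hh, b)
print(out5.shape, out8.shape)  # (2, 5, 4) (2, 8, 4)
```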
Differences:
1. Different inputs and outputs:
dynamic_rnn allows the batches fed in at different iterations to have different sequence lengths, although all data within one batch must still share a single length. For example, the first iteration can feed data of shape=[batch_size, 10], the second shape=[batch_size, 12], the third shape=[batch_size, 8], and so on.
static_rnn cannot do this: the [batch_size, max_seq] shape of the batch fed at each iteration must stay fixed throughout training.
2. Different training behavior:
See reference 1 for details.
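When sequences inside one batch genuinely differ in length, they are usually zero-padded to a common maximum, and tf.nn.dynamic_rnn accepts an optional sequence_length argument so padded steps neither update the state nor emit output. A NumPy sketch of that masking behavior (hypothetical minimal cell, assuming zero-padded inputs):

```python
import numpy as np

def masked_rnn(x, seq_len, W_xh, W_hh, b):
    # x: [batch, max_time, features], zero-padded; seq_len: true length
    # per sample. Past a sample's true length the state is frozen and
    # the output is zero -- the behavior dynamic_rnn documents for its
    # sequence_length argument.
    batch, max_t, _ = x.shape
    h = np.zeros((batch, W_hh.shape[0]))
    outputs = np.zeros((batch, max_t, W_hh.shape[0]))
    for t in range(max_t):
        active = (t < seq_len)[:, None]            # which samples still run
        h_new = np.tanh(x[:, t, :] @ W_xh + h @ W_hh + b)
        h = np.where(active, h_new, h)             # freeze finished states
        outputs[:, t, :] = np.where(active, h_new, 0.0)
    return outputs, h

rng = np.random.default_rng(0)
W_xh = rng.standard_normal((3, 4)) * 0.1
W_hh = rng.standard_normal((4, 4)) * 0.1
b = np.zeros(4)
x = rng.standard_normal((2, 6, 3))
x[0, 4:] = 0.0                                     # sample 0 padded past step 4
out, h = masked_rnn(x, np.array([4, 6]), W_xh, W_hh, b)
print(out[0, 4:].any())                            # False: padded steps emit zeros
```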
A comparison of multi-layer LSTM implementations:
1. Static multi-layer RNN
import tensorflow as tf
# Import the MNIST dataset
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("c:/user/administrator/data/", one_hot=True)

n_input = 28     # MNIST data input (img shape: 28*28)
n_steps = 28     # timesteps
n_hidden = 128   # hidden layer num of features
n_classes = 10   # MNIST classes (digits 0-9, 10 classes)
batch_size = 128

tf.reset_default_graph()
# tf Graph input
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])

gru = tf.contrib.rnn.GRUCell(n_hidden * 2)
lstm_cell = tf.contrib.rnn.LSTMCell(n_hidden)
mcell = tf.contrib.rnn.MultiRNNCell([lstm_cell, gru])

x1 = tf.unstack(x, n_steps, 1)
outputs, states = tf.contrib.rnn.static_rnn(mcell, x1, dtype=tf.float32)
pred = tf.contrib.layers.fully_connected(outputs[-1], n_classes, activation_fn=None)

learning_rate = 0.001
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

training_iters = 100000
display_step = 10

# Launch the session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Reshape data to get 28 seq of 28 elements
        batch_x = batch_x.reshape((batch_size, n_steps, n_input))
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print("Iter " + str(step * batch_size) + ", Minibatch Loss= " +
                  "{:.6f}".format(loss) + ", Training Accuracy= " +
                  "{:.5f}".format(acc))
        step += 1
    print(" Finished!")

    # Calculate accuracy for 100 mnist test images
    test_len = 100
    test_data = mnist.test.images[:test_len].reshape((-1, n_steps, n_input))
    test_label = mnist.test.labels[:test_len]
    print("Testing Accuracy:",
          sess.run(accuracy, feed_dict={x: test_data, y: test_label}))
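The cost above is softmax cross-entropy; what tf.nn.softmax_cross_entropy_with_logits_v2 computes per example, followed by the tf.reduce_mean over the batch, can be sketched stand-alone in NumPy (a hypothetical re-derivation, not the TensorFlow op itself):

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # Numerically stable log-softmax (shift by the row max), then
    # cross-entropy against one-hot labels, averaged over the batch.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1).mean()

logits = np.array([[2.0, 1.0, 0.1]])
labels = np.array([[1.0, 0.0, 0.0]])   # one-hot, like the MNIST y above
print(round(softmax_cross_entropy(logits, labels), 3))  # 0.417
```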
2. Dynamic multi-layer RNN
import tensorflow as tf
# Import the MNIST dataset
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("c:/user/administrator/data/", one_hot=True)

n_input = 28     # MNIST data input (img shape: 28*28)
n_steps = 28     # timesteps
n_hidden = 128   # hidden layer num of features
n_classes = 10   # MNIST classes (digits 0-9, 10 classes)
batch_size = 128

tf.reset_default_graph()
# tf Graph input
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])

gru = tf.contrib.rnn.GRUCell(n_hidden * 2)
lstm_cell = tf.contrib.rnn.LSTMCell(n_hidden)
mcell = tf.contrib.rnn.MultiRNNCell([lstm_cell, gru])

outputs, states = tf.nn.dynamic_rnn(mcell, x, dtype=tf.float32)  # (?, 28, 256)
# Transpose to (28, ?, 256): 28 time steps, take the last one, outputs[-1] = (?, 256)
outputs = tf.transpose(outputs, [1, 0, 2])
pred = tf.contrib.layers.fully_connected(outputs[-1], n_classes, activation_fn=None)

learning_rate = 0.001
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

training_iters = 100000
display_step = 10

# Launch the session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # Reshape data to get 28 seq of 28 elements
        batch_x = batch_x.reshape((batch_size, n_steps, n_input))
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print("Iter " + str(step * batch_size) + ", Minibatch Loss= " +
                  "{:.6f}".format(loss) + ", Training Accuracy= " +
                  "{:.5f}".format(acc))
        step += 1
    print(" Finished!")

    # Calculate accuracy for 100 mnist test images
    test_len = 100
    test_data = mnist.test.images[:test_len].reshape((-1, n_steps, n_input))
    test_label = mnist.test.labels[:test_len]
    print("Testing Accuracy:",
          sess.run(accuracy, feed_dict={x: test_data, y: test_label}))
References:
1. https://www.jianshu.com/p/1b1ea45fab47
2. What's the difference between tensorflow dynamic_rnn and rnn?