When to use an RNN

A recurrent neural network (RNN) is well suited to processing and predicting sequential, time-ordered data.

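Characteristics of an RNN

The nodes of an RNN's hidden layer are connected across time steps: at each step, the input to the hidden layer is the output vector of the input layer concatenated with the hidden-layer state vector from the previous time step.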

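demo: an RNN whose recurrent cell is a single fully connected layer

Input layer dimension: x

Hidden layer dimension: h

Input size of each recurrent cell: x+h

Output size of each recurrent cell: h

The output of the recurrent cell is used in two ways:

  1. As part of the input to the recurrent cell at the next time step
  2. As the input to a separate fully connected network that produces the output for the current time step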
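
The dimensions and the two uses listed above can be traced in a few lines of NumPy. This is only a minimal sketch: the tanh activation, the random weights, and the concrete sizes (x=3, h=4, output size 2) are assumptions made for illustration and are not part of the original notes.

```python
import numpy as np

def rnn_step(x_t, h_prev, W, b):
    """One step of a recurrent cell built from a single fully connected layer."""
    # Concatenate the current input (size x) with the previous state (size h).
    concat = np.concatenate([x_t, h_prev])      # shape: (x + h,)
    # One fully connected layer maps the x + h inputs to h outputs.
    # tanh is an assumed activation; the notes do not name one.
    return np.tanh(concat @ W + b)              # shape: (h,)

x_dim, h_dim, out_dim = 3, 4, 2                 # illustrative sizes
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(x_dim + h_dim, h_dim))  # cell weights
b = np.zeros(h_dim)
W_out = rng.normal(scale=0.1, size=(h_dim, out_dim))    # separate output layer
b_out = np.zeros(out_dim)

h = np.zeros(h_dim)                             # initial state
inputs = rng.normal(size=(5, x_dim))            # a sequence of 5 time steps
outputs = []
for x_t in inputs:
    h = rnn_step(x_t, h, W, b)                  # use 1: fed into the next step
    outputs.append(h @ W_out + b_out)           # use 2: output for this step
outputs = np.stack(outputs)
print(outputs.shape)                            # (5, 2)
```

The same weights W and b are reused at every time step; only the state h changes.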

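Sequence length

In theory an RNN can handle sequences of any length, but very long sequences lead to vanishing gradients during optimization. In practice a maximum length is therefore set, and sequences longer than that are truncated.

Original paper: On the difficulty of training Recurrent Neural Networks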
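
Truncating to a maximum length can be sketched as below. The helper name pad_or_truncate and the choice of 0 as the padding id are hypothetical. Padding short sequences is an extra step not mentioned in the notes above, added here because fixed-size batches need it.

```python
def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Return exactly max_len ids: cut long inputs, right-pad short ones."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))

print(pad_or_truncate([5, 8, 2, 9, 4, 7], max_len=4))   # [5, 8, 2, 9]
print(pad_or_truncate([5, 8], max_len=4))               # [5, 8, 0, 0]
```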

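Long short-term memory networks (the LSTM structure)

Original paper: Long Short-Term Memory

Recurrent cell: a special network structure with an input gate, a forget gate, and an output gate

Forget gate: based on the current input, the previous state, and the previous output, decides which parts of the previous state to forget

Input gate: based on the current input, the previous state, and the previous output, decides which parts will enter the current state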
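
To make the gates concrete, here is a rough NumPy sketch of one LSTM step in the commonly used formulation. Note one assumption: in this version the gates read only the current input and the previous output. Variants with "peephole" connections also feed the previous state into the gates, which is closer to the description above. The weight layout and sizes are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (x + h, 4 * h); b has shape (4 * h,)."""
    z = np.concatenate([x_t, h_prev]) @ W + b
    f, i, o, g = np.split(z, 4)        # forget, input, output, candidate
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)
    g = np.tanh(g)
    c_t = f * c_prev + i * g           # forget part of the old state, add new content
    h_t = o * np.tanh(c_t)             # the output gate decides what is exposed
    return h_t, c_t

x_dim, h_dim = 3, 4                    # illustrative sizes
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(x_dim + h_dim, 4 * h_dim))
b = np.zeros(4 * h_dim)

h_t, c_t = np.zeros(h_dim), np.zeros(h_dim)
for x_t in rng.normal(size=(5, x_dim)):
    h_t, c_t = lstm_step(x_t, h_t, c_t, W, b)
print(h_t.shape, c_t.shape)            # (4,) (4,)
```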

RNN variants

  1. Bidirectional RNN
  2. Deep RNN
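
As an illustration of the first variant, a bidirectional RNN can be sketched as two independent recurrent passes, one over the sequence and one over its reverse, whose states are concatenated at each time step. The function names, the tanh cell, and the sizes below are assumptions for this sketch.

```python
import numpy as np

def run_rnn(inputs, W, b):
    """Run a tanh recurrent cell over inputs (T, x); return all states (T, h)."""
    h = np.zeros(b.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(np.concatenate([x_t, h]) @ W + b)
        states.append(h)
    return np.stack(states)

def bidirectional_rnn(inputs, W_fw, b_fw, W_bw, b_bw):
    fw = run_rnn(inputs, W_fw, b_fw)
    # Process the reversed sequence, then flip back so step t lines up with fw.
    bw = run_rnn(inputs[::-1], W_bw, b_bw)[::-1]
    return np.concatenate([fw, bw], axis=1)     # shape: (T, 2 * h)

x_dim, h_dim, T = 3, 4, 6                       # illustrative sizes
rng = np.random.default_rng(0)
W_fw = rng.normal(scale=0.1, size=(x_dim + h_dim, h_dim))
W_bw = rng.normal(scale=0.1, size=(x_dim + h_dim, h_dim))
b_fw, b_bw = np.zeros(h_dim), np.zeros(h_dim)
inputs = rng.normal(size=(T, x_dim))

out = bidirectional_rnn(inputs, W_fw, b_fw, W_bw, b_bw)
print(out.shape)                                # (6, 8)
```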

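Dropout in RNNs

Dropout is applied between recurrent cells in different layers, but not between recurrent cells in the same layer, that is, not on the recurrent connection from one time step to the next.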
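
The rule above, combined with the deep RNN variant, might look like this in NumPy. This is a sketch under assumptions: the tanh cell, the inverted-dropout scaling, and all names and sizes are illustrative choices.

```python
import numpy as np

def dropout(v, keep_prob, rng):
    """Inverted dropout: zero units with probability 1 - keep_prob, rescale the rest."""
    mask = rng.random(v.shape) < keep_prob
    return v * mask / keep_prob

def deep_rnn(inputs, layers, keep_prob, rng):
    """layers is a list of (W, b) pairs, one per stacked recurrent layer."""
    states = [np.zeros(b.shape[0]) for _, b in layers]
    top_outputs = []
    for x_t in inputs:
        layer_in = x_t
        for k, (W, b) in enumerate(layers):
            # Recurrent connection within a layer: previous state, no dropout.
            states[k] = np.tanh(np.concatenate([layer_in, states[k]]) @ W + b)
            # Connection to the next layer (or to the output): dropout applied.
            layer_in = dropout(states[k], keep_prob, rng)
        top_outputs.append(layer_in)
    return np.stack(top_outputs)

x_dim, h_dim, T = 3, 4, 5                       # illustrative sizes
rng = np.random.default_rng(0)
layers = [
    (rng.normal(scale=0.1, size=(x_dim + h_dim, h_dim)), np.zeros(h_dim)),
    (rng.normal(scale=0.1, size=(h_dim + h_dim, h_dim)), np.zeros(h_dim)),
]
inputs = rng.normal(size=(T, x_dim))

train_out = deep_rnn(inputs, layers, keep_prob=0.5, rng=rng)   # training
eval_out = deep_rnn(inputs, layers, keep_prob=1.0, rng=rng)    # evaluation
print(train_out.shape, eval_out.shape)          # (5, 4) (5, 4)
```

With keep_prob=1.0 the mask keeps every unit, so evaluation is deterministic.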

demo

import os
import re
import io
import requests
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from zipfile import ZipFile
from tensorflow.python.framework import ops
ops.reset_default_graph()

about zipfile

1. Start a graph session and set RNN parameters

sess = tf.Session()

epochs = 20 # Run 20 epochs; one epoch is one full pass over all batches of the training set.
batch_size = 250
max_sequence_length = 25
rnn_size = 10 # The RNN will be of size 10 units.
embedding_size = 50 # every word will be embedded in a trainable vector of size 50
min_word_frequency = 10 # We will only consider words that appear at least 10 times in our vocabulary
learning_rate = 0.0005
dropout_keep_prob = tf.placeholder(tf.float32)

2. Download or open data

Check whether the data was already downloaded and, if so, read in the file.

Otherwise, download the data and save it.

# Download or open data
data_dir = 'data'
data_file = 'text_data.txt'
if not os.path.exists(data_dir):
    os.makedirs(data_dir)

if not os.path.isfile(os.path.join(data_dir, data_file)):
    zip_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip'
    r = requests.get(zip_url)
    z = ZipFile(io.BytesIO(r.content))
    file = z.read('SMSSpamCollection')
    # Format Data: drop non-ASCII characters, then split into rows
    text_data = file.decode()
    text_data = text_data.encode('ascii',errors='ignore')
    text_data = text_data.decode().split('\n')

    # Save data to text file
    with open(os.path.join(data_dir, data_file), 'w') as file_conn:
        for text in text_data:
            file_conn.write("{}\n".format(text)) # append "\n" to each row; format is a built-in str method
else:
    # Open data from text file
    text_data = []
    with open(os.path.join(data_dir, data_file), 'r') as file_conn:
        for row in file_conn:
            text_data.append(row)
    text_data = text_data[:-1]

# Each row is "<label>\t<message>"; skip empty rows
text_data = [x.split('\t') for x in text_data if len(x)>=1]
[text_data_target, text_data_train] = [list(x) for x in zip(*text_data)]
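
The last two lines of the listing above split each row on the tab character and then transpose the list of pairs with zip(*...). A tiny standalone example, using two made-up rows rather than real dataset content:

```python
# Hypothetical rows in the "<label>\t<message>" layout; the final row is empty.
rows = ["ham\tsee you at lunch", "spam\twin a free prize now", ""]

pairs = [r.split("\t") for r in rows if len(r) >= 1]   # skip empty rows
labels, texts = [list(col) for col in zip(*pairs)]     # transpose pairs into columns

print(labels)   # ['ham', 'spam']
print(texts)    # ['see you at lunch', 'win a free prize now']
```

If a message could itself contain a tab, r.split("\t", 1) would keep it in one piece. Whether that occurs in this dataset is not something the notes say.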

3. Create a text cleaning function then clean the data

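def clean_text(text_string):
    # Remove punctuation, underscores, and digits.
    # \w matches any word character (including the underscore);
    # [^\s\w] matches any character that is neither whitespace nor a word character.
    text_string = re.sub(r'([^\s\w]|_|[0-9])+', '', text_string)
    # Collapse runs of whitespace into single spaces
    text_string = " ".join(text_string.split())
    text_string = text_string.lower()
    return text_string

# Clean texts
text_data_train = [clean_text(x) for x in text_data_train]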
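
To see what the cleaning function does to a single message, here is a self-contained run on a made-up string. The function body repeats the same regular expression so the snippet runs on its own.

```python
import re

def clean_text(text_string):
    # Strip punctuation, underscores, and digits, then normalize spaces and case.
    text_string = re.sub(r'([^\s\w]|_|[0-9])+', '', text_string)
    text_string = " ".join(text_string.split())
    return text_string.lower()

sample = "Free entry!!  Text WIN_NOW to 87121"   # hypothetical message
print(clean_text(sample))                        # free entry text winnow to
```

Because the underscore is removed rather than replaced with a space, WIN_NOW becomes the single token winnow.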

4. Change texts into numeric vectors

This will convert a text to an appropriate list of indices
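
The idea of this step can be sketched in plain Python. This is a stand-in written for illustration, not the demo's own vocabulary code: the function names, the use of index 0 for both padding and out-of-vocabulary words, and the "at least min_frequency occurrences" rule are all assumptions.

```python
from collections import Counter

def build_vocab(texts, min_frequency):
    """Map each sufficiently frequent word to a positive integer index."""
    counts = Counter(word for t in texts for word in t.split())
    vocab = {}
    for word, n in counts.items():
        if n >= min_frequency:
            vocab[word] = len(vocab) + 1   # 0 is reserved for padding / unknown
    return vocab

def texts_to_ids(texts, vocab, max_len):
    """Convert each text to exactly max_len indices (truncate or right-pad with 0)."""
    rows = []
    for t in texts:
        ids = [vocab.get(w, 0) for w in t.split()][:max_len]
        rows.append(ids + [0] * (max_len - len(ids)))
    return rows

texts = ["free prize now", "see you now", "free lunch now"]   # made-up examples
vocab = build_vocab(texts, min_frequency=2)
print(vocab)                           # {'free': 1, 'now': 2}
print(texts_to_ids(texts, vocab, 4))   # [[1, 0, 2, 0], [0, 0, 2, 0], [1, 0, 2, 0]]
```

Every text ends up as a row of the same length, which is what a placeholder of shape [None, max_sequence_length] expects.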


x_shuffled = text_processed[shuffled_ix]
y_shuffled = text_data_target[shuffled_ix]

# Split train/test set
ix_cutoff = int(len(y_shuffled)*0.80)
x_train, x_test = x_shuffled[:ix_cutoff], x_shuffled[ix_cutoff:]
y_train, y_test = y_shuffled[:ix_cutoff], y_shuffled[ix_cutoff:]
vocab_size = len(vocab_processor.vocabulary_)
print("Vocabulary Size: {:d}".format(vocab_size))
print("80-20 Train Test split: {:d} -- {:d}".format(len(y_train), len(y_test)))

# Create placeholders
x_data = tf.placeholder(tf.int32, [None, max_sequence_length])
y_output = tf.placeholder(tf.int32, [None])

# Create embedding
embedding_mat = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0))
embedding_output = tf.nn.embedding_lookup(embedding_mat, x_data)
#embedding_output_expanded = tf.expand_dims(embedding_output, -1)

# Define the RNN cell
# From TensorFlow 1.0 on, the RNN cells live under tf.contrib. Earlier versions are untested.
if tf.__version__[0]>='1':
    cell=tf.contrib.rnn.BasicRNNCell(num_units = rnn_size)
else:
    cell = tf.nn.rnn_cell.BasicRNNCell(num_units = rnn_size)

output, state = tf.nn.dynamic_rnn(cell, embedding_output, dtype=tf.float32)
output = tf.nn.dropout(output, dropout_keep_prob)

# Get output of RNN sequence: take the output at the last time step
output = tf.transpose(output, [1, 0, 2])
last = tf.gather(output, int(output.get_shape()[0]) - 1)

weight = tf.Variable(tf.truncated_normal([rnn_size, 2], stddev=0.1))
bias = tf.Variable(tf.constant(0.1, shape=[2]))
logits_out = tf.matmul(last, weight) + bias

# Loss function
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits_out, labels=y_output) # logits=float32, labels=int32
loss = tf.reduce_mean(losses)

accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(logits_out, 1), tf.cast(y_output, tf.int64)), tf.float32))

optimizer = tf.train.RMSPropOptimizer(learning_rate)
train_step = optimizer.minimize(loss)

init = tf.global_variables_initializer()
sess.run(init)

train_loss = []
test_loss = []
train_accuracy = []
test_accuracy = []

# Start training
for epoch in range(epochs):
    # Shuffle training data
    shuffled_ix = np.random.permutation(np.arange(len(x_train)))
    x_train = x_train[shuffled_ix]
    y_train = y_train[shuffled_ix]
    num_batches = int(len(x_train)/batch_size) + 1
    # TODO: calculate the number of batches exactly
    for i in range(num_batches):
        # Select train data
        min_ix = i * batch_size
        max_ix = np.min([len(x_train), ((i+1) * batch_size)])
        x_train_batch = x_train[min_ix:max_ix]
        y_train_batch = y_train[min_ix:max_ix]

        # Run train step
        train_dict = {x_data: x_train_batch, y_output: y_train_batch, dropout_keep_prob:0.5}
        sess.run(train_step, feed_dict=train_dict)

    # Run loss and accuracy for training
    temp_train_loss, temp_train_acc = sess.run([loss, accuracy], feed_dict=train_dict)
    train_loss.append(temp_train_loss)
    train_accuracy.append(temp_train_acc)

    # Run Eval Step
    test_dict = {x_data: x_test, y_output: y_test, dropout_keep_prob:1.0}
    temp_test_loss, temp_test_acc = sess.run([loss, accuracy], feed_dict=test_dict)
    test_loss.append(temp_test_loss)
    test_accuracy.append(temp_test_acc)
    print('Epoch: {}, Test Loss: {:.2}, Test Acc: {:.2}'.format(epoch+1, temp_test_loss, temp_test_acc))

# Plot loss over time
epoch_seq = np.arange(1, epochs+1)
plt.plot(epoch_seq, train_loss, 'k--', label='Train Set')
plt.plot(epoch_seq, test_loss, 'r-', label='Test Set')
plt.title('Softmax Loss')
plt.xlabel('Epochs')
plt.ylabel('Softmax Loss')
plt.legend(loc='upper left')
plt.show()

# Plot accuracy over time
plt.plot(epoch_seq, train_accuracy, 'k--', label='Train Set')
plt.plot(epoch_seq, test_accuracy, 'r-', label='Test Set')
plt.title('Test Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend(loc='upper left')
plt.show()

Vocabulary Size: 1124

80-20 Train Test split: 4459 -- 1115

C:\Users\Diane\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py
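
The training loop computes the number of batches as int(len(x_train)/batch_size) + 1, which produces one extra, empty batch whenever the training set size is an exact multiple of the batch size. One possible way to compute the batch boundaries exactly is sketched below. The helper name batch_bounds is hypothetical.

```python
import math

def batch_bounds(n_samples, batch_size):
    """Return (start, end) index pairs that cover n_samples with no empty batch."""
    num_batches = math.ceil(n_samples / batch_size)
    return [(i * batch_size, min(n_samples, (i + 1) * batch_size))
            for i in range(num_batches)]

print(len(batch_bounds(4459, 250)))   # 18
print(batch_bounds(4459, 250)[-1])    # (4250, 4459)
print(len(batch_bounds(500, 250)))    # 2
```

For the 4459 training samples reported above, both formulas give 18 batches. They differ only in the evenly divisible case, such as 500 samples with a batch size of 250.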
