1. Introduction:

      Every run of three consecutive words is treated as a single unit when training the model.
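
To make the grouping concrete, here is a minimal sketch of how a text can be cut into overlapping three-word groups. The helper name three_word_groups is hypothetical and is not part of the code further down; the sample sentence is the one from that code's comment.

```python
def three_word_groups(text):
    """Return every run of three consecutive words in text, in order."""
    words = text.lower().split(' ')
    # One group starts at each word position that still has two words after it.
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]


groups = three_word_groups("my friend makes the best raspberry pies")
for group in groups:
    print(group)
```

Each of these groups becomes a key in the chain, and the group that starts three words later becomes its successor.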

An example:

input:

      my dream is that I can be an engineer, so I design more applications for people to use.

      my dream is that I can be a bird, so I can fly to everywhere I want.

      it is also my dream that I can be a house, so I can warm you in the cold winter.

The generated Markov chain:

    

{'START': ['my dream is'], 'my dream is': ['that i can'], 'dream is that': ['i can be'], 'is that i': ['can be a'], 'that i can': ['be a house,'], 'i can be': ['a house, so'], 'can be an': ['engineer, so i'], 'be an engineer,': ['so i design'], 'an engineer, so': ['i design more'], 'engineer, so i': ['design more applications'], 'so i design': ['more applications for'], 'i design more': ['applications for people'], 'design more applications': ['for people to'], 'more applications for': ['people to use.\nmy'], 'applications for people': ['to use.\nmy dream'], 'for people to': ['use.\nmy dream is'], 'people to use.\nmy': ['dream is that'], 'to use.\nmy dream': ['is that i'], 'use.\nmy dream is': ['that i can'], 'can be a': ['house, so i'], 'be a bird,': ['so i can'], 'a bird, so': ['i can fly'], 'bird, so i': ['can fly to'], 'so i can': ['warm you in'], 'i can fly': ['to everywhere i'], 'can fly to': ['everywhere i want.\nit'], 'fly to everywhere': ['i want.\nit is'], 'to everywhere i': ['want.\nit is also'], 'everywhere i want.\nit': ['is also my'], 'i want.\nit is': ['also my dream'], 'want.\nit is also': ['my dream that'], 'is also my': ['dream that i'], 'also my dream': ['that i can'], 'my dream that': ['i can be'], 'dream that i': ['can be a'], 'be a house,': ['so i can'], 'a house, so': ['i can warm'], 'house, so i': ['can warm you'], 'i can warm': ['you in the'], 'can warm you': ['in the cold'], 'warm you in': ['the cold winter.'], 'you in the': ['cold winter.'], 'in the cold': ['winter.'], 'END': ['the cold winter.', 'winter.', 'cold winter.']}

The generated text:

my dream is that i can be a house, so i can warm you in the cold winter.
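
The sentence above can be traced by hand through the chain. The sketch below copies only the entries on that path from the printed chain and follows them from START until it reaches a group listed under END. The function name walk is hypothetical, and it always takes the first successor, which is enough here because every list in the excerpt has a single entry.

```python
# Excerpt of the chain printed above: only the entries on the path that is followed.
chain = {
    'START': ['my dream is'],
    'my dream is': ['that i can'],
    'that i can': ['be a house,'],
    'be a house,': ['so i can'],
    'so i can': ['warm you in'],
    'warm you in': ['the cold winter.'],
    'END': ['the cold winter.', 'winter.', 'cold winter.'],
}


def walk(chain):
    """Follow the chain from START until an END group is reached."""
    pieces = [chain['START'][0]]
    while pieces[-1] not in chain['END']:
        # Look up the latest group and append its (only) successor.
        pieces.append(chain[pieces[-1]][0])
    return " ".join(pieces)


sentence = walk(chain)
print(sentence)
```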

Code:

# The input text is split on spaces, and a dictionary stores the three words that follow each group of three words.
# At the same time, the first three words are stored as the beginning, and the last three, two, or one words as the end.
# To generate output.txt: begin at START and repeatedly pick a stored successor at random; generation stops once an END entry is reached.
import random

# Read the training text and lowercase it. Only single spaces separate tokens,
# so a token such as "use.\nmy" can span a line break.
with open("E:\\a2.txt", 'r', encoding='UTF-8') as fhand:
    dataset_file = fhand.read()  # e.g. dataset_file = 'my friend makes the best raspberry pies'
dataset_file = dataset_file.lower().split(' ')

model = {}
n = len(dataset_file)
for i in range(n - 2):
    key = " ".join(dataset_file[i:i + 3])
    if i == 0:
        # The first three words are the only starting point.
        model['START'] = [key]
    if i == n - 3:
        # The last three words, the last word, and the last two words all mark the end.
        model['END'] = [key, dataset_file[-1], " ".join(dataset_file[-2:])]
    else:
        # The successor is the next three words (only two or one near the end of the text).
        # A key that occurs again overwrites its earlier entry, so each key keeps
        # only its most recent successor.
        model[key] = [" ".join(dataset_file[i + 3:i + 6])]
print(model)

generated = []
while True:
    if not generated:
        words = model['START']
    elif generated[-1] in model['END']:
        break
    else:
        words = model[generated[-1]]
    generated.append(random.choice(words))

# Append the generated text to the output file and echo it to the console.
with open("E:\\output.txt", 'a') as fhand:
    for word in generated:
        fhand.write(word + " ")
        print(word, end=' ')
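
In the code above every key ends up with a single successor, so the generated text is always the same. As a possible variation (not the version that produced the chain shown earlier), the sketch below appends successors instead of overwriting them and splits on any whitespace, so line breaks no longer end up inside tokens. The names build_model and generate are hypothetical, and the text is held in a string rather than read from a file so the example runs on its own.

```python
import random
from collections import defaultdict


def build_model(text):
    """Map each three-word group to every group that follows it in text."""
    words = text.lower().split()  # split on any whitespace, including newlines
    model = defaultdict(list)
    for i in range(len(words) - 2):
        key = " ".join(words[i:i + 3])
        if i == 0:
            model['START'].append(key)
        if i == len(words) - 3:
            # The last three, one, and two words all terminate generation.
            model['END'] += [key, words[-1], " ".join(words[-2:])]
        else:
            # Append instead of overwrite, so a repeated key keeps all successors.
            model[key].append(" ".join(words[i + 3:i + 6]))
    return model


def generate(model, rng):
    """Start at START and pick random successors until an END group is chosen."""
    pieces = [rng.choice(model['START'])]
    while pieces[-1] not in model['END']:
        pieces.append(rng.choice(model[pieces[-1]]))
    return " ".join(pieces)


text = ("my dream is that I can be an engineer, so I design more applications for people to use.\n"
        "my dream is that I can be a bird, so I can fly to everywhere I want.\n"
        "it is also my dream that I can be a house, so I can warm you in the cold winter.")
model = build_model(text)
print(model['that i can'])
result = generate(model, random.Random())
print(result)
```

With this variant 'that i can' has three possible successors, so different runs can produce different sentences from the same input.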
