Preface

After studying Python theory for so long, it's time to start a hands-on project to consolidate what I've learned.

Preparation

First, the urllib library used for crawling. Note that urllib ships with Python's standard library, so there is nothing to install with pip — you can import it directly.
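Since urllib is part of the standard library, a quick sanity check is to import it and encode a sample parameter dict — this is also exactly what the crawler below does before sending the POST body:

```python
import urllib.parse
import urllib.request  # imports succeed on any stock Python 3 install

# Encode a dict into an application/x-www-form-urlencoded body;
# urlencode replaces spaces with '+' by default.
params = urllib.parse.urlencode({'i': 'i love python', 'doctype': 'json'})
print(params)  # i=i+love+python&doctype=json
```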

Get the Youdao Translate request URL



The parameters that need to be sent can be found in the Form Data panel of the browser's developer tools.

Example

import urllib.request
import urllib.parse

url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
data = {}
data['i'] = 'i love python'
data['from'] = 'AUTO'
data['to'] = 'AUTO'
data['smartresult'] = 'dict'
data['client'] = 'fanyideskweb'
data['salt'] = '16057996372935'
data['sign'] = '0965172abb459f8c7a791df4184bf51c'
data['lts'] = '1605799637293'
data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
data['doctype'] = 'json'
data['version'] = '2.1'
data['keyfrom'] = 'fanyi.web'
data['action'] = 'FY_BY_REALTlME'
data = urllib.parse.urlencode(data).encode('utf-8')
response = urllib.request.urlopen(url,data)
html = response.read().decode('utf-8')
print(html)

Running this returns an error with errorCode 50. The fix is to delete the _o from the URL.
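The failure arrives not as an HTTP error but as a small JSON body. The exact payload below is an assumption based on the typical Youdao response, but checking for an errorCode key before digging into the result is a safe pattern either way:

```python
import json

# Hypothetical error body returned when the '_o' endpoint rejects the request.
html = '{"errorCode":50}'

resp = json.loads(html)
if 'errorCode' in resp:
    # The request was rejected; report the code instead of crashing on a KeyError.
    print(f"request rejected, errorCode={resp['errorCode']}")
else:
    print(resp['translateResult'][0][0]['tgt'])
```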



After removing it, the request succeeds.



But the raw result still looks too cluttered and needs further refinement.

Import json, convert the response into a dictionary, and filter out just the translation.

import urllib.request
import urllib.parse
import json

url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
data = {}
data['i'] = 'i love python'
data['from'] = 'AUTO'
data['to'] = 'AUTO'
data['smartresult'] = 'dict'
data['client'] = 'fanyideskweb'
data['salt'] = '16057996372935'
data['sign'] = '0965172abb459f8c7a791df4184bf51c'
data['lts'] = '1605799637293'
data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
data['doctype'] = 'json'
data['version'] = '2.1'
data['keyfrom'] = 'fanyi.web'
data['action'] = 'FY_BY_REALTlME'
data = urllib.parse.urlencode(data).encode('utf-8')
response = urllib.request.urlopen(url,data)
html = response.read().decode('utf-8')
req = json.loads(html)
result = req['translateResult'][0][0]['tgt']
print(result)



But this program can only translate one hard-coded sentence — run it once and it's spent. So I refined it again.

import urllib.request
import urllib.parse
import json

def translate():
    centens = input('Enter the sentence to translate: ')
    url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
    head = {}  # add a request header to get past basic anti-crawler checks
    head['User-Agent'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'
    data = {}  # send the parameters captured from the Form Data panel
    data['i'] = centens
    data['from'] = 'AUTO'
    data['to'] = 'AUTO'
    data['smartresult'] = 'dict'
    data['client'] = 'fanyideskweb'
    data['salt'] = '16057996372935'
    data['sign'] = '0965172abb459f8c7a791df4184bf51c'
    data['lts'] = '1605799637293'
    data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
    data['doctype'] = 'json'
    data['version'] = '2.1'
    data['keyfrom'] = 'fanyi.web'
    data['action'] = 'FY_BY_REALTlME'
    data = urllib.parse.urlencode(data).encode('utf-8')
    req = urllib.request.Request(url, data, head)
    response = urllib.request.urlopen(req)
    html = response.read().decode('utf-8')
    req = json.loads(html)
    result = req['translateResult'][0][0]['tgt']
    # print(f'Translation result: {result}')
    return result

t = translate()
print(f'Translation result: {t}')

Refinement done; the result works well enough.
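One further step would be to make the tool reusable across multiple inputs by wrapping the translate() function above in a loop. A minimal sketch, with the quit command 'q' as my own convention and a stub standing in for the real network call so the loop logic can be shown on its own:

```python
# translate_stub stands in for the translate() function above
# (assumption: no network access here, so it just echoes the input).
def translate_stub(sentence):
    return f'[translated] {sentence}'

def repl(read_input, write_output):
    # Keep prompting until the user enters 'q' to quit.
    while True:
        sentence = read_input()
        if sentence.strip().lower() == 'q':
            break
        write_output(translate_stub(sentence))

# Usage with canned input instead of input()/print():
inputs = iter(['i love python', 'q'])
outputs = []
repl(lambda: next(inputs), outputs.append)
print(outputs)  # ['[translated] i love python']
```

Injecting read_input/write_output as callables keeps the loop testable; in the real script they would simply be input and print.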
