Lianjia Crawler: Synchronous vs. Asynchronous Execution Time Comparison
2024-08-28 23:40:44
Asynchronous execution time
import time
import asyncio
import aiohttp
from lxml import etree

start_time = time.time()


async def get_url(session, url):
    # Fetch one listing page and print the title of every listing on it
    async with session.get(url) as response:
        result = await response.text()
    terr = etree.HTML(result)
    ret = terr.xpath('//*[@id="content"]/div[1]/ul/li')
    for li in ret:
        title = li.xpath('.//div[@class="title"]//text()')
        print(title)


async def get_html():
    url = "https://sz.lianjia.com/ershoufang/pg{}"
    # Share one session across all requests and close it when finished
    async with aiohttp.ClientSession() as session:
        tasks = [get_url(session, url.format(rl)) for rl in range(1, 30)]  # one coroutine per page, pages 1-29
        await asyncio.gather(*tasks)  # run all requests concurrently on the event loop


if __name__ == '__main__':
    asyncio.run(get_html())  # create the event loop and run until every page is done
    end_time = time.time()
    print("Execution time: {}".format(end_time - start_time))  # Execution time: 6.241659641265869
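The asynchronous script starts all 29 requests at once. If the target site throttles bursts of requests, one option is to cap how many run at the same time. The following is a minimal sketch of that idea, not part of the original script: `bounded_gather`, `fake_fetch`, and the limit of 5 are hypothetical names and values chosen for illustration, and `asyncio.sleep` stands in for the real network call so the example runs offline.

```python
import asyncio


async def bounded_gather(coro_funcs, limit=5):
    # coro_funcs: zero-argument callables that each return a coroutine (hypothetical helper)
    semaphore = asyncio.Semaphore(limit)

    async def run_one(func):
        async with semaphore:  # at most `limit` coroutines are inside this block at once
            return await func()

    return await asyncio.gather(*(run_one(f) for f in coro_funcs))


async def demo():
    active = 0
    peak = 0

    async def fake_fetch(page):
        # Stand-in for session.get(...): sleeps instead of touching the network
        nonlocal active, peak
        active += 1
        peak = max(peak, active)  # record the highest number of simultaneous "requests"
        await asyncio.sleep(0.01)
        active -= 1
        return page

    jobs = [lambda p=p: fake_fetch(p) for p in range(1, 30)]  # pages 1-29
    results = await bounded_gather(jobs, limit=5)
    return results, peak


results, peak = asyncio.run(demo())
print(len(results), peak)
```

In a real crawler the `fake_fetch` stand-in would be replaced by a coroutine that performs the request; the semaphore logic would stay the same.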
Synchronous execution time
import time
import requests
from lxml import etree

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36'
}

start_time = time.time()


def get_url():
    url = "https://sz.lianjia.com/ershoufang/pg{}"
    for i in range(1, 30):  # pages 1-29, fetched one after another
        urli = url.format(i)
        result = requests.get(urli, headers=headers).text
        terr = etree.HTML(result)
        ret = terr.xpath('//*[@id="content"]/div[1]/ul/li')
        for li in ret:
            title = li.xpath('.//div[@class="title"]//text()')
            print(title)


get_url()
end_time = time.time()
print("Execution time: {}".format(end_time - start_time))  # Execution time: 82.57950687408447
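By the timings recorded in the comments, the asynchronous run (about 6.24 s) finished roughly 13 times faster than the synchronous one (about 82.58 s). The gap comes from how waiting is handled: the synchronous loop blocks on each page in turn, while the event loop overlaps the waits. The sketch below illustrates this without any network access. The 0.02 s delay and the function names are assumptions made for the illustration, so the printed numbers will not match the real measurements above.

```python
import asyncio
import time

DELAY = 0.02  # assumed per-page latency in seconds (illustrative only)
PAGES = range(1, 30)  # same 29 page numbers as the scripts above


def fetch_sync(page):
    time.sleep(DELAY)  # blocks: nothing else can run while waiting
    return page


async def fetch_async(page):
    await asyncio.sleep(DELAY)  # yields to the event loop while waiting
    return page


def run_sync():
    start = time.perf_counter()
    pages = [fetch_sync(p) for p in PAGES]  # total wait is about 29 * DELAY
    return pages, time.perf_counter() - start


async def _gather_all():
    return await asyncio.gather(*(fetch_async(p) for p in PAGES))


def run_async():
    start = time.perf_counter()
    pages = asyncio.run(_gather_all())  # waits overlap, so total is about one DELAY
    return pages, time.perf_counter() - start


sync_pages, sync_elapsed = run_sync()
async_pages, async_elapsed = run_async()
print("sync:  {:.3f}s".format(sync_elapsed))
print("async: {:.3f}s".format(async_elapsed))
```

Real requests also spend time on parsing and are subject to server-side limits, so the measured speedup on a live site is usually smaller than this idealized simulation suggests.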