1. Notes on my first Python crawler: downloading a video from a web page
This is my first time publishing notes on this platform. I just want to improve myself by recording something I learn every day.
Admittedly, I don't know much about web crawlers yet, so my understanding is still superficial.
But after only three days of learning, I found a method to crawl videos from the web.
I am very excited: I now know how to crawl something from the internet onto my computer's hard disk. It is only a start, the first step, and I just have to keep moving.
Step 1: Find a video on a web page, play it online, and press F12 to open the browser's developer tools (the element inspection page),
as shown in the following pictures:
Click a .ts file in the list of requests and you will see its URL. That URL is the key.
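The fragments of one video usually differ only in the number just before ".ts", so the URL copied in Step 1 can be turned into a pattern. Here is a minimal sketch of that idea. The function name segment_url_template and the example.com URL are made up for illustration, and it assumes the URL ends in a number followed by ".ts" with no query string after it.

```python
import re

def segment_url_template(segment_url):
    """Replace the number just before '.ts' with a '{}' placeholder."""
    # The lookahead only accepts digits directly followed by a final ".ts".
    template, count = re.subn(r"\d+(?=\.ts$)", "{}", segment_url)
    if count != 1:
        raise ValueError("no trailing fragment number found in: " + segment_url)
    return template

# Hypothetical URL, for illustration only.
template = segment_url_template("https://example.com/video/playlist17.ts")
print(template.format(0))  # https://example.com/video/playlist0.ts
```

With the pattern in hand, the URL of any fragment is just template.format(i) for the fragment number i.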
Step 2: Write the Python code, as follows:
from multiprocessing import Pool
import os
import requests

def demo(i):
    try:
        url = "https://vip.holyshitdo.com/2019/5/8/c2417/playlist%0d.ts" % i
        print(url)
        # simulate a browser
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36Name', 'Referer': 'http://91.com', 'Content-Type': 'multipart/form-data; session_language=cn_CN'}
        r = requests.get(url, headers=headers)
        r.raise_for_status()  # do not save an error page as a video fragment
        # save the fragment in binary format; the zero-padded name keeps the files in playback order
        with open('./mp4/{:03d}.ts'.format(i), 'wb') as f:
            f.write(r.content)
    except Exception as e:
        # report the failure instead of silently ignoring it
        print("failed:", i, e)
        return ""

if __name__ == '__main__':  # program entry
    os.makedirs('./mp4', exist_ok=True)  # the output directory must exist
    pool = Pool(10)  # create a process pool
    for i in range(193):
        pool.apply_async(demo, (i,))  # execute
    pool.close()
    pool.join()
Step 3: Run the code.
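A failed download does not stop the script, so after it finishes it is worth checking that all 193 fragments really arrived in ./mp4. Here is a minimal sketch of such a check. The helper name missing_segments is my own, and it assumes every saved file name ends with the fragment number followed by ".ts".

```python
import os
import re

def missing_segments(directory, total):
    """Return the fragment numbers in range(total) that have no non-empty .ts file."""
    found = set()
    for name in os.listdir(directory):
        match = re.search(r"(\d+)\.ts$", name)  # number right before ".ts"
        path = os.path.join(directory, name)
        if match and os.path.getsize(path) > 0:
            found.add(int(match.group(1)))
    return [i for i in range(total) if i not in found]

if __name__ == "__main__":
    # Only runs the check if the download directory from Step 2 exists.
    if os.path.isdir("./mp4"):
        print("missing fragments:", missing_segments("./mp4", 193))
```

An empty list means every fragment is there. Any number it prints can be downloaded again before merging.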
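Step 4: Last but not least, merge the .ts fragments into a single video file.
Open a terminal (the Windows command prompt), go to the directory where the fragments were saved, and run the command "copy /b *.ts newfile.mp4". copy /b joins the files in the order the wildcard matches them (normally by name), so check that the fragment names sort in playback order.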
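The same merge can also be done in Python, which makes the fragment order explicit and works outside Windows too. Here is a minimal sketch, not the only way to do it. The helper name merge_segments is my own, and it assumes each fragment file name ends with its number followed by ".ts". Like copy /b, it only concatenates the bytes and does not convert the video.

```python
import os
import re

def merge_segments(directory, output_path):
    """Concatenate the .ts files in `directory`, in numeric order, into `output_path`."""
    numbered = []
    for name in os.listdir(directory):
        match = re.search(r"(\d+)\.ts$", name)  # number right before ".ts"
        if match:
            numbered.append((int(match.group(1)), name))
    numbered.sort()  # sort by fragment number, not by file name
    with open(output_path, "wb") as out:
        for _, name in numbered:
            with open(os.path.join(directory, name), "rb") as segment:
                out.write(segment.read())
    return [name for _, name in numbered]

if __name__ == "__main__":
    # Only merges if the download directory from Step 2 exists.
    if os.path.isdir("./mp4"):
        merge_segments("./mp4", "newfile.mp4")
```

Sorting by the number rather than by the name means that fragment 2 comes before fragment 10 even when the names are not zero-padded.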
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
THAT IS ALL FOR NOW, TO BE CONTINUED~( ̄▽ ̄~)~