bs4 & Writing Images and Videos in Binary Mode

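When to use: all the data is present in the page source and can be extracted from it directly.

Example: the Beijing Xinfadi market website (北京新发地网)

How it works: fetch the page source as text, hand it to BeautifulSoup for parsing, then locate the relevant tags and read their values.

Keywords: BeautifulSoup, find, find_all, get / mode='wb', .content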

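import requests
from bs4 import BeautifulSoup

# url and headers are assumed to be defined earlier
resp = requests.get(url, headers=headers)
# print(resp.text)
# resp.encoding = 'utf-8'  # set this if the printed text comes out garbled

# Parse the data
# 1. Hand the page source to BeautifulSoup to build a parsed object
page = BeautifulSoup(resp.text, features="html.parser")  # features selects the parser, here the built-in HTML parser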


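# 2. Query the parsed object: find (first match only) or find_all (every match)
# find(tag_name, attribute=value)
# table = page.find("table", class_='hq_table')  # class is a Python keyword, so bs4 uses class_ instead
table = page.find("table", attrs={'class': 'hq_table'})  # same meaning as the line above

trs = table.find_all('tr')[1:]  # trs is a list; each element is a bs4.element.Tag ([1:] skips the first row)
for tr in trs:
    tds = tr.find_all('td')
    xx = tds[0].text  # text of the first cell in the row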
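The find / find_all flow above can be tried without any network access. The sketch below is only an illustration: SAMPLE_HTML is made-up markup standing in for resp.text, and extract_rows is a hypothetical helper name, not something from the original site or from bs4.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a real page source (resp.text).
SAMPLE_HTML = """
<table class="hq_table">
  <tr><td>Name</td><td>Low</td><td>High</td></tr>
  <tr><td>Cabbage</td><td>0.8</td><td>1.2</td></tr>
  <tr><td>Potato</td><td>1.0</td><td>1.5</td></tr>
</table>
"""


def extract_rows(html):
    """Return the cell texts of every row after the first in the hq_table table."""
    page = BeautifulSoup(html, features="html.parser")
    table = page.find("table", attrs={"class": "hq_table"})
    if table is None:  # find returns None when nothing matches
        return []
    rows = []
    for tr in table.find_all("tr")[1:]:  # [1:] skips the header row
        rows.append([td.text.strip() for td in tr.find_all("td")])
    return rows


rows = extract_rows(SAMPLE_HTML)
print(rows)
```

Checking for None before calling find_all avoids an AttributeError when the page layout changes and the table is no longer found.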

You can also query with select (CSS selectors), and the two approaches can even be mixed.

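res = page.select('tr[class="tr_color"]')  # every <tr> whose class attribute is "tr_color"

# item is a Tag, for example one element of a select() or find_all() result
xx = item.select('h3 > span[class="comment-info"] > span')[1].attrs['title']
ss = item.select('p > span')[0].text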
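Here is a small self-contained sketch of select, mixed with find on the same Tag. The markup is invented for illustration; only the selector strings are taken from the snippets above.

```python
from bs4 import BeautifulSoup

# Made-up markup shaped so that the selectors from the notes match it.
SAMPLE_HTML = """
<div class="comment-item">
  <h3>
    <span class="comment-info">
      <a href="#">reader_1</a>
      <span>Watched</span>
      <span class="rating" title="Recommended"></span>
    </span>
  </h3>
  <p><span class="short">Worth seeing twice.</span></p>
</div>
"""

page = BeautifulSoup(SAMPLE_HTML, features="html.parser")

results = []
for item in page.select("div.comment-item"):  # select always returns a list
    spans = item.select('h3 > span[class="comment-info"] > span')
    title = spans[1].attrs["title"]           # second direct child <span>
    text = item.select("p > span")[0].text
    author = item.find("a").text              # find works on the same Tag
    results.append((author, title, text))

print(results)
```

Because select returns a list even for a single match, indexing with [0] or [1] raises IndexError when the selector matches fewer elements than expected.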

Getting a tag's attribute value

one.get('src')  # one is a Tag; returns the value of its src attribute
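A minimal sketch of how get behaves when the attribute is missing, compared with square-bracket access. The HTML string is invented for the example.

```python
from bs4 import BeautifulSoup

# Two <img> tags: one with a src attribute, one without.
html = '<div><img src="/img/a.jpg" alt="first"><img alt="no source"></div>'
page = BeautifulSoup(html, features="html.parser")

imgs = page.find_all("img")

srcs = []
for img in imgs:
    src = img.get("src")  # returns None instead of raising when src is absent
    if src is not None:
        srcs.append(src)

first_alt = imgs[0]["alt"]           # bracket access raises KeyError if missing
fallback = imgs[1].get("src", "")    # get accepts a default, like dict.get

print(srcs, first_alt, repr(fallback))
```

Using get is the safer choice in a scraping loop, where some tags may lack the attribute.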

Writing images, videos, and other binary files

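# url and img_name are assumed to be defined earlier
img_resp = requests.get(url)
# Binary write. To keep the IDE from stalling while indexing, mark the image directory as excluded.
with open('image/' + img_name, mode='wb') as f:
    f.write(img_resp.content)  # write the image bytes to the file
img_resp.close()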

# n and resp3 are defined elsewhere; with closes the file automatically
with open(f'video/{n}.ts', mode='wb') as f:
    f.write(resp3.content)
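The binary-write step itself needs only the standard library. The sketch below assumes a hypothetical helper named save_binary and uses a few hard-coded bytes in place of a real response; with requests, the data argument would be something like img_resp.content.

```python
import os
import tempfile


def save_binary(data, directory, filename):
    """Write raw bytes to directory/filename, creating the directory if needed."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename)
    with open(path, mode="wb") as f:  # 'wb' = write bytes, no text encoding applied
        f.write(data)
    return path


# Stand-in for img_resp.content: the 8-byte PNG file signature.
fake_content = b"\x89PNG\r\n\x1a\n"

out_dir = tempfile.mkdtemp()  # temporary directory so the demo leaves the project untouched
saved = save_binary(fake_content, os.path.join(out_dir, "image"), "demo.png")

with open(saved, mode="rb") as f:  # read back in binary mode to verify
    round_trip = f.read()

print(saved, round_trip == fake_content)
```

Creating the directory up front avoids the FileNotFoundError that open raises when image/ or video/ does not exist yet.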
