Python web scraping - fetching the NIPS oral paper list with Python
2024-08-28 13:39:31
from lxml import html
import requests

# Fetch the NIPS 2019 schedule page and parse it so it can be queried with XPath.
# page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
page = requests.get('https://nips.cc/Conferences/2019/Schedule')
tree = html.fromstring(page.content)

# Earlier experiments, kept for reference:
# buyers = tree.xpath('//div[@title="buyer-name"]/text()')
# test = tree.xpath('//*[@id="maincard_15788"]/div[3]')
# print(test)
doc = tree

# Select every element whose class attribute starts with the given prefix.
# For oral papers, use the 'maincard narrower Oral' prefix instead:
# btags = doc.xpath("//*[@class[starts-with(., 'maincard narrower Oral') and string-length() > 3]]")
btags = doc.xpath("//*[@class[starts-with(., 'maincard narrower Spotlight') and string-length() > 3]]")

idx = 1
# utf-8 keeps non-ASCII author names writable on every platform.
with open('nips_paperlist_spotlight.txt', 'w', encoding='utf-8') as f:
    for b in btags:
        # Child divs 1, 3 and 5 of each card are read as type, title and authors.
        paper_type = b.xpath("div[1]")[0].text  # renamed from `type` to avoid shadowing the builtin
        title = b.xpath("div[3]")[0].text
        author = b.xpath("div[5]")[0].text
        out_str = "%d, %s, %s, %s\n" % (idx, paper_type, title, author)
        print(out_str)
        f.write(out_str)
        idx += 1
Using XPath
lxml, requests
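The prefix match on the class attribute can be tried offline before running it against the live page. The sketch below is only an illustration: the HTML in SAMPLE is invented to imitate the class names used in the script (it is not copied from the schedule page), and select_cards is a hypothetical helper. It passes the prefix as an XPath variable, which lxml binds from a keyword argument, so switching between Oral and Spotlight does not require editing the expression.

```python
from lxml import html

# Invented markup that imitates the class naming used in the script.
SAMPLE = """
<div>
  <div class="maincard narrower Oral" id="maincard_1">
    <div>Oral</div><div>10:00</div><div>Paper A</div><div></div><div>Alice, Bob</div>
  </div>
  <div class="maincard narrower Spotlight" id="maincard_2">
    <div>Spotlight</div><div>10:05</div><div>Paper B</div><div></div><div>Carol</div>
  </div>
  <div class="maincard narrower Poster" id="maincard_3">
    <div>Poster</div><div>10:30</div><div>Paper C</div><div></div><div>Dave</div>
  </div>
</div>
"""


def select_cards(doc, kind):
    """Return elements whose class attribute starts with 'maincard narrower <kind>'."""
    prefix = "maincard narrower " + kind
    # $prefix is an XPath variable; lxml fills it in from the keyword argument.
    return doc.xpath("//*[starts-with(@class, $prefix)]", prefix=prefix)


doc = html.fromstring(SAMPLE)
oral_titles = [card.xpath("div[3]")[0].text for card in select_cards(doc, "Oral")]
spotlight_titles = [card.xpath("div[3]")[0].text for card in select_cards(doc, "Spotlight")]
print(oral_titles)       # ['Paper A']
print(spotlight_titles)  # ['Paper B']
```

If the real page changes its class names or the order of the child divs, the prefix and the div indexes would need to be adjusted accordingly.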
https://docs.python-guide.org/scenarios/scrape/
https://stackoverflow.com/questions/12393858/xpath-using-contains-with-a-wildcard