1. Installing Requests
Windows: pip install requests
Linux: sudo pip install requests
If installation is slow from mainland China, download the package manually from:
http://www.lfd.uci.edu/~gohlke/pythonlibs/

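Search that page for "requests" and download the .whl file.
Rename the .whl extension to .zip, extract the archive, and copy the requests folder into Python's Lib directory.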
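
Whichever install route is used, one quick way to confirm the result is to ask Python whether it can locate the package. This is a minimal sketch using only the standard library; the helper name is_installed is made up for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


# Prints True once the requests package is on Python's import path.
print(is_installed("requests"))
```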

2. Fetching a website's content

import requests
useragent = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36'}
html = requests.get("http://tieba.baidu.com/f?ie=utf-8&kw=python",headers=useragent)
print(html.text)
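
The call above assumes the request succeeds and that Requests picks the right text encoding. A slightly more defensive variant might look like the sketch below. The helper name fetch_html and the 10-second timeout are assumptions for illustration, not part of the original notes.

```python
import requests

USER_AGENT = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36'
}


def fetch_html(url, headers=None, timeout=10):
    """Fetch a page and return its text, raising on HTTP errors."""
    response = requests.get(url, headers=headers or USER_AGENT, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    # When the server declares no charset, Requests may fall back to ISO-8859-1
    # (or None); in that case use the encoding detected from the body instead.
    if response.encoding is None or response.encoding.lower() == 'iso-8859-1':
        response.encoding = response.apparent_encoding
    return response.text


# Example call (needs network access):
# print(fetch_html("http://tieba.baidu.com/f?ie=utf-8&kw=python"))
```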

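3. Submitting data to a web page
GET retrieves data from the server.
POST sends data to the server.
GET passes its parameters in the URL's query string.
POST carries its data in the request body rather than in the URL.

Content loaded via AJAX does not appear in the page source. To get that data, send the same request the page's script sends, which is often a POST.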

data={
'type':'',
'sort':'',
'currentPage':''
}
html_text = requests.post("http://xxxxxx/student/courses/searchCourses",data=data)
print(html_text.text)
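
To see where each method actually puts its data, a request can be built and inspected without being sent. This is a minimal sketch: the endpoint http://example.com/search and the field values are placeholders, and Request(...).prepare() only constructs the request, with no network traffic.

```python
import requests

# Placeholder form fields, made up for this illustration.
fields = {'type': 'python', 'sort': 'hot', 'currentPage': '1'}

# Build (but do not send) one GET and one POST carrying the same fields.
get_req = requests.Request('GET', 'http://example.com/search', params=fields).prepare()
post_req = requests.Request('POST', 'http://example.com/search', data=fields).prepare()

print(get_req.url)    # the fields are appended to the URL as a query string
print(get_req.body)   # a GET built this way has no body
print(post_req.url)   # the URL is left unchanged
print(post_req.body)  # the fields are form-encoded into the request body
print(post_req.headers['Content-Type'])
```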

---------------------------------------------------------------------------------------------

Here is a small example, from notes I took while watching a Jikexueyuan video course.

# -*- coding: utf-8 -*-
import re

import requests


class spider(object):
    def changepage(self, url, total_page):
        # Build the list of page URLs from the current page up to total_page.
        now_page = int(re.search(r'pageNum=(\d+)', url, re.S).group(1))
        page_group = []
        for i in range(now_page, total_page + 1):
            # Pass flags by keyword: the 4th positional argument of re.sub is count.
            link = re.sub(r'pageNum=\d+', 'pageNum=%s' % i, url, flags=re.S)
            page_group.append(link)
        return page_group

    def getsource(self, url):
        # Download the page and return its HTML as text.
        html = requests.get(url)
        return html.text

    def geteveryclass(self, source):
        # Each course sits in its own <li id=...> element.
        everyclass = re.findall('(<li id=.*?</li>)', source, re.S)
        return everyclass

    def getinfo(self, eachclass):
        # Extract the fields of one course from its <li> block.
        info = {}
        info['title'] = re.search('alt="(.*?)"', eachclass, re.S).group(1)
        info['content'] = re.search('display: none;">(.*?)</p>', eachclass, re.S).group(1)
        timeandlevel = re.findall('<em>(.*?)</em>', eachclass, re.S)
        info['classtime'] = timeandlevel[0]
        info['classlevel'] = timeandlevel[1]
        info['learnnum'] = re.search('"learn-number">(.*?)</em>', eachclass, re.S).group(1)
        return info

    def saveinfo(self, classinfo):
        # open(path + filename, mode). Common modes: r read-only, r+ read/write,
        # w create (overwrites an existing file), a append, b binary.
        with open('info.txt', 'a', encoding='utf-8') as f:
            for each in classinfo:
                f.write('title:' + each['title'] + '\n')
                # f.write('content:' + each['content'] + '\n')
                # f.write('classtime:' + each['classtime'] + '\n')
                # f.write('classlevel:' + each['classlevel'] + '\n')
                # f.write('learnnum:' + each['learnnum'] + '\n\n')


if __name__ == '__main__':
    classinfo = []  # list that will hold one dict per course
    url = 'http://www.jikexueyuan.com/course/?pageNum=1'
    jikespider = spider()  # instantiate the spider
    all_links = jikespider.changepage(url, 2)  # URLs for pages 1 through 2
    for link in all_links:
        print('Reading page: ' + link)
        html = jikespider.getsource(link)  # fetch the current page's source
        everyclass = jikespider.geteveryclass(html)  # all <li> blocks on the current page
        for each in everyclass:
            info = jikespider.getinfo(each)  # extract the fields of one course
            classinfo.append(info)  # add it to the list
    jikespider.saveinfo(classinfo)  # write the results to file
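
The parsing steps can be tried offline before pointing the spider at the live site. The sketch below runs the same regular expressions against a hand-written HTML fragment. The fragment and the helper names page_urls and parse_course are invented for illustration; the fragment only imitates the structure those patterns expect, so the real page markup may differ.

```python
import re

# Hypothetical markup that mimics what the spider's patterns look for.
SAMPLE = '''
<ul>
<li id="1"><img alt="Python Basics"><p style="display: none;">Intro course</p>
<em>2 hours</em><em>Beginner</em><em class="learn-number">1200</em></li>
<li id="2"><img alt="Web Scraping"><p style="display: none;">Requests and re</p>
<em>3 hours</em><em>Intermediate</em><em class="learn-number">800</em></li>
</ul>
'''


def page_urls(url, total_page):
    """Rewrite the pageNum parameter for every page from the current one up."""
    now_page = int(re.search(r'pageNum=(\d+)', url).group(1))
    return [re.sub(r'pageNum=\d+', 'pageNum=%d' % i, url)
            for i in range(now_page, total_page + 1)]


def parse_course(fragment):
    """Extract one course's fields from a single <li> block."""
    time_and_level = re.findall('<em>(.*?)</em>', fragment, re.S)
    return {
        'title': re.search('alt="(.*?)"', fragment, re.S).group(1),
        'content': re.search('display: none;">(.*?)</p>', fragment, re.S).group(1),
        'classtime': time_and_level[0],
        'classlevel': time_and_level[1],
        'learnnum': re.search('"learn-number">(.*?)</em>', fragment, re.S).group(1),
    }


# re.S lets "." match newlines, so each <li> block may span several lines.
courses = [parse_course(li) for li in re.findall('(<li id=.*?</li>)', SAMPLE, re.S)]

for link in page_urls('http://www.jikexueyuan.com/course/?pageNum=1', 3):
    print(link)
for course in courses:
    print(course)
```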
