Web Crawling with HttpClient
2024-08-24 06:01:48
HttpClient official site: http://hc.apache.org/httpcomponents-client-4.5.x/quickstart.html
Problem 1: the browser gets data for the request, but the same request through HttpClient prints nothing
import java.io.IOException;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
import org.junit.Test;

// Wrapper class so the test method compiles; the class name is arbitrary.
public class HttpClientTest {
    @Test
    public void httpClientTest1() {
        CloseableHttpClient httpclient = HttpClients.createDefault();
        try {
            String url = "https://www.80s.tw";
            HttpGet httpGet = new HttpGet(url);
            System.out.println("executing request " + httpGet.getURI());
            // Return the body as a string for 2xx responses; fail on any other status.
            ResponseHandler<String> responseHandler = new ResponseHandler<String>() {
                @Override
                public String handleResponse(final HttpResponse response) throws ClientProtocolException, IOException {
                    int status = response.getStatusLine().getStatusCode();
                    if (status >= 200 && status < 300) {
                        HttpEntity entity = response.getEntity();
                        return entity != null ? EntityUtils.toString(entity) : null;
                    } else {
                        throw new ClientProtocolException("Unexpected response status: " + status);
                    }
                }
            };
            String responseBody = null;
            try {
                responseBody = httpclient.execute(httpGet, responseHandler);
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                // Report the failure: a TLS handshake error lands here and leaves responseBody null.
                e.printStackTrace();
            }
            System.out.println("-------------------------------------------");
            System.out.println(responseBody);
            System.out.println("-------------------------------------------");
        } finally {
            try {
                httpclient.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
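Before guessing at a cause, it helps to see which exception the request actually throws, because a silently caught IOException also produces an empty result. Below is a minimal JDK-only sketch that turns the exception type into a hint. The class name FailureHints and the wording of the messages are made up for illustration and are not part of HttpClient.

```java
import java.io.IOException;
import java.net.UnknownHostException;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLHandshakeException;

// Hypothetical helper: maps common I/O failures of an HTTPS request to a readable hint.
final class FailureHints {
    static String describe(IOException e) {
        // SSLHandshakeException is a subclass of SSLException, so it must be checked first.
        if (e instanceof SSLHandshakeException) {
            return "TLS handshake failed (certificate not trusted or no shared protocol): " + e.getMessage();
        }
        if (e instanceof SSLException) {
            return "TLS error: " + e.getMessage();
        }
        if (e instanceof UnknownHostException) {
            return "Host could not be resolved: " + e.getMessage();
        }
        return "I/O error: " + e.getMessage();
    }
}
```

Calling System.err.println(FailureHints.describe(e)) inside the catch (IOException e) block of the test would show whether the empty output comes from a handshake failure or from something else.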
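Cause 1: the site may require a certificate that the JVM does not trust
How to obtain the certificate: http://www.cnblogs.com/zhumengke/p/8846912.html
Then make the request again, this time loading the certificate we downloaded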
import java.io.File;
import javax.net.ssl.SSLContext;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
import org.apache.http.conn.ssl.TrustSelfSignedStrategy;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.ssl.SSLContexts;
import org.apache.http.util.EntityUtils;
import org.junit.Test;

// Wrapper class so the test method compiles; the class name is arbitrary.
public class HttpClientSslTest {
    @Test
    public void httpTest() throws Exception {
        // Truststore that holds the downloaded certificate; adjust path and password to your setup.
        File file = new File("D:/java/jre/lib/security", "jssecacerts");
        // Let a load failure propagate instead of continuing with a null SSLContext.
        SSLContext sslcontext = SSLContexts.custom()
                .loadTrustMaterial(file, "changeit".toCharArray(), new TrustSelfSignedStrategy())
                .build();
        // TLSv1 is disabled by default in recent JDKs; TLSv1.2 here assumes the server supports it.
        SSLConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(sslcontext, new String[] { "TLSv1.2" }, null,
                SSLConnectionSocketFactory.getDefaultHostnameVerifier());
        CloseableHttpClient httpclient = HttpClients.custom().setSSLSocketFactory(sslsf).build();
        try {
            HttpGet httpget = new HttpGet("https://www.80s.tw");
            System.out.println("Executing request " + httpget.getRequestLine());
            CloseableHttpResponse response = httpclient.execute(httpget);
            try {
                HttpEntity entity = response.getEntity();
                System.out.println("----------------------------------------");
                System.out.println(response.getStatusLine());
                System.out.println(EntityUtils.toString(entity));
                // Make sure the entity content is fully consumed so the connection can be released.
                EntityUtils.consume(entity);
            } finally {
                response.close();
            }
        } finally {
            httpclient.close();
        }
    }
}
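The protocol array handed to SSLConnectionSocketFactory has to name protocols that both the JVM and the server accept, otherwise the handshake fails even with the right certificate. As a rough sketch using only the JDK, the array can be built from a preference list instead of being hard-coded. The class name ProtocolPicker and the preference order shown in the usage note are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: keeps only the preferred protocols that are actually supported.
final class ProtocolPicker {
    // Returns the entries of `preferred` that also appear in `supported`, in preference order.
    static String[] pick(String[] supported, String... preferred) {
        List<String> available = Arrays.asList(supported);
        List<String> chosen = new ArrayList<>();
        for (String protocol : preferred) {
            if (available.contains(protocol)) {
                chosen.add(protocol);
            }
        }
        return chosen.toArray(new String[0]);
    }
}
```

For the supported list, one option is SSLContext.getDefault().getSupportedSSLParameters().getProtocols(), for example ProtocolPicker.pick(supported, "TLSv1.3", "TLSv1.2"). An empty result means the JVM supports none of the preferred protocols, which is worth checking before building the socket factory.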