Snoopy - the PHP net client v1.2.4

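Snoopy is a PHP class that simulates a web browser: it can fetch web page content and submit forms.
Features of Snoopy:
1. Fetches the contents of a web page: fetch
2. Fetches the text of a web page (HTML tags stripped): fetchtext
3. Fetches the links and forms of a web page: fetchlinks, fetchform
4. Supports proxy hosts
5. Supports basic username/password authentication
6. Supports setting the user_agent, referer, cookies and header content
7. Supports browser redirects and lets you limit the redirect depth
8. Expands links in the page to fully qualified URLs (default)
9. Submits form data and retrieves the response
10. Supports following HTML frames
11. Supports passing cookies on redirects
Snoopy only requires PHP 4 or later. Because it is a plain PHP class, it needs no extensions, which makes it a good choice when the server does not support cURL.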

Synopsis:

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    // fetch the text of a page
    $snoopy->fetchtext("http://www.php.net/");
    print $snoopy->results;

    // fetch the links of a page
    $snoopy->fetchlinks("http://www.phpbuilder.com/");
    print $snoopy->results;

    // submit a form
    $submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";

    $submit_vars["q"] = "amiga";
    $submit_vars["submit"] = "Search!";
    $submit_vars["searchhost"] = "Altavista";

    $snoopy->submit($submit_url,$submit_vars);
    print $snoopy->results;

    // fetch a framed page; each frame is one entry in $snoopy->results
    $snoopy->maxframes=5;
    $snoopy->fetch("http://www.ispi.net/");
    echo "<PRE>\n";
    echo htmlentities($snoopy->results[0]);
    echo htmlentities($snoopy->results[1]);
    echo htmlentities($snoopy->results[2]);
    echo "</PRE>\n";

    // fetch the form elements of a page
    $snoopy->fetchform("http://www.altavista.com");
    print $snoopy->results;

Class methods:

    fetch($URI)
    -----------

    This is the method used for fetching the contents of a web page.
    $URI is the fully qualified URL of the page to fetch.
    The results of the fetch are stored in $this->results.
    If you are fetching frames, then $this->results
    contains each frame fetched in an array.

    fetchtext($URI)
    ---------------

    This behaves exactly like fetch() except that it only returns
    the text from the page, stripping out html tags and other
    irrelevant data.

    fetchform($URI)
    ---------------

    This behaves exactly like fetch() except that it only returns
    the form elements from the page, stripping out html tags and other
    irrelevant data.

    fetchlinks($URI)
    ----------------

    This behaves exactly like fetch() except that it only returns
    the links from the page. By default, relative links are
    converted to their fully qualified URL form.

    submit($URI,$formvars)
    ----------------------

    This submits a form to the specified $URI. $formvars is an
    array of the form variables to pass.

    submittext($URI,$formvars)
    --------------------------

    This behaves exactly like submit() except that it only returns
    the text from the page, stripping out html tags and other
    irrelevant data.

    submitlinks($URI,$formvars)
    ---------------------------

    This behaves exactly like submit() except that it only returns
    the links from the page. By default, relative links are
    converted to their fully qualified URL form.
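
Since $this->results holds a single string for an ordinary page but an array when frames are fetched, calling code may want to handle both cases the same way. The following is a minimal sketch under that assumption; `snoopy_results_list()` is a hypothetical helper and not part of Snoopy, and a literal string stands in for a real fetch so the snippet runs without network access.

```php
<?php
// Hypothetical helper (not part of Snoopy): always return the fetched
// content as a list, whether or not frames were followed.
function snoopy_results_list($results)
{
    // With frames, results is already an array (one entry per frame);
    // otherwise wrap the single string in a one-element array.
    return is_array($results) ? $results : array($results);
}

// With a real instance this would be snoopy_results_list($snoopy->results).
foreach (snoopy_results_list("<html>single page</html>") as $page) {
    echo htmlspecialchars($page), "\n";
}
```

The same loop then works for framed and unframed pages without indexing results[0], results[1] and so on by hand.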

Class variables (default value in parentheses):

    $host           the host to connect to
    $port           the port to connect to
    $proxy_host     the proxy host to use, if any
    $proxy_port     the proxy port to use, if any
    $agent          the user agent to masquerade as (Snoopy v0.1)
    $referer        referer information to pass, if any
    $cookies        cookies to pass, if any
    $rawheaders     other header info to pass, if any
    $maxredirs      maximum redirects to allow. 0=none allowed. (5)
    $offsiteok      whether or not to allow redirects off-site. (true)
    $expandlinks    whether or not to expand links to fully qualified URLs (true)
    $user           authentication username, if any
    $pass           authentication password, if any
    $accept         http accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
    $error          where errors are sent, if any
    $response_code  response code returned from server
    $headers        headers returned from server
    $maxlength      max return data length
    $read_timeout   timeout on read operations (requires PHP 4 Beta 4+)
                    set to 0 to disallow timeouts
    $timed_out      true if a read operation timed out (requires PHP 4 Beta 4+)
    $maxframes      number of frames we will follow
    $status         http status of fetch
    $temp_dir       temp directory that the webserver can write to. (/tmp)
    $curl_path      system path to cURL binary, set to false if none
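
Several of these variables describe the outcome of a request, so they can be read together after a fetch. The sketch below is only an illustration: `describe_fetch()` is a hypothetical helper that is not part of Snoopy, a plain object with the same property names stands in for a Snoopy instance so the snippet runs without network access, and the sample `response_code` value assumes the variable holds the raw status line.

```php
<?php
// Hypothetical helper (not part of Snoopy): summarise the outcome of a
// request using the $timed_out, $read_timeout, $error and $response_code
// variables listed above. $ok is the return value of fetch()/submit().
function describe_fetch($client, $ok)
{
    if ($client->timed_out) {
        return "timed out after " . $client->read_timeout . "s";
    }
    if (!$ok) {
        return "failed: " . $client->error;
    }
    return "ok: " . trim($client->response_code);
}

// Stand-in object; with the real class you would set
// $snoopy->read_timeout before calling $snoopy->fetch($url).
$fake = new stdClass;
$fake->read_timeout  = 10;
$fake->timed_out     = false;
$fake->error         = "";
$fake->response_code = "HTTP/1.1 200 OK\r\n"; // assumed format: raw status line

echo describe_fetch($fake, true), "\n";
```

Checking `timed_out` first keeps a timeout from being reported as a generic failure.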

Examples:

Example: fetch a web page and display the return headers and
the contents of the page (html-escaped):

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    $snoopy->user = "joe";
    $snoopy->pass = "bloe";

    if($snoopy->fetch("http://www.slashdot.org/"))
    {
        echo "response code: ".$snoopy->response_code."<br>\n";
        // foreach replaces each(), which was removed in PHP 8
        foreach($snoopy->headers as $key => $val)
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";

        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";

Example: submit a form and print out the result headers
and html-escaped page:

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    $submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";

    $submit_vars["q"] = "amiga";
    $submit_vars["submit"] = "Search!";
    $submit_vars["searchhost"] = "Altavista";

    if($snoopy->submit($submit_url,$submit_vars))
    {
        foreach($snoopy->headers as $key => $val)
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";

        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";

Example: showing functionality of all the variables:

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    $snoopy->proxy_host = "my.proxy.host";
    $snoopy->proxy_port = "8080";

    $snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";
    $snoopy->referer = "http://www.microsnot.com/";

    // cookie values are sent as text, so the ID is given as a string
    $snoopy->cookies["SessionID"] = "238472834723489";
    $snoopy->cookies["favoriteColor"] = "RED";

    $snoopy->rawheaders["Pragma"] = "no-cache";

    $snoopy->maxredirs = 2;
    $snoopy->offsiteok = false;
    $snoopy->expandlinks = false;

    $snoopy->user = "joe";
    $snoopy->pass = "bloe";

    if($snoopy->fetchtext("http://www.phpbuilder.com"))
    {
        foreach($snoopy->headers as $key => $val)
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";

        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";

Example: fetch framed content and display the results:

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    $snoopy->maxframes = 5;

    if($snoopy->fetch("http://www.ispi.net/"))
    {
        echo "<PRE>".htmlspecialchars($snoopy->results[0])."</PRE>\n";
        echo "<PRE>".htmlspecialchars($snoopy->results[1])."</PRE>\n";
        echo "<PRE>".htmlspecialchars($snoopy->results[2])."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";
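
The examples above print the returned headers one by one. If a specific header is needed instead, the lines can be turned into a lookup table. This is a rough sketch, assuming each entry of $snoopy->headers is a raw "Name: value" line; `header_map()` is a hypothetical helper, not part of Snoopy, and the sample lines are made up for illustration.

```php
<?php
// Hypothetical helper (not part of Snoopy): turn raw "Name: value" header
// lines into an associative array keyed by lower-cased header name.
// If a header appears more than once, the last value wins.
function header_map($lines)
{
    $map = array();
    foreach ($lines as $line) {
        $pos = strpos($line, ":");
        if ($pos === false) {
            continue; // not a "Name: value" pair, e.g. the status line
        }
        $name = strtolower(trim(substr($line, 0, $pos)));
        $map[$name] = trim(substr($line, $pos + 1));
    }
    return $map;
}

// Sample data standing in for $snoopy->headers (assumed format).
$sample = array(
    "HTTP/1.1 200 OK\r\n",
    "Content-Type: text/html\r\n",
    "Server: Apache\r\n",
);
$map = header_map($sample);
echo $map["content-type"], "\n";
```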
