Simulating a login in PHP: fetching Weibo pages with curl
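Let's start with Weibo as an example. The script reuses the cookies of an already logged-in session, POSTs a search to weibo.cn, and saves the returned page to an HTML file: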
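<?php
header("Content-Type: text/html;charset=utf-8");

$szUrl = 'http://weibo.cn/search/?pos=search&vt=4';
$UserAgent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506; .NET CLR 3.5.21022; .NET CLR 1.0.3705; .NET CLR 1.1.4322)';

// Search form fields sent in the POST body
$search = array(
    'keyword' => '坏蛋',   // search term ("bad guy")
    'smblog'  => '搜微博', // submit button value ("Search Weibo")
);
$referer = "http://weibo.cn/search/?vt=4";

// Cookies of the logged-in session. You can also export a cookie file directly
// from Firefox; just make sure the cookie expiry time is set far enough ahead.
// __DIR__ is used here because realpath() returns false if the file does not exist yet.
$cookie_file = __DIR__ . '/cookies.txt';

$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $szUrl);
curl_setopt($curl, CURLOPT_HEADER, 0); // 0 = leave response headers out of the output, 1 = include them
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); // return the body from curl_exec() instead of printing it
// Certificate checks are disabled; this has no effect on an http:// URL and is unsafe for https://
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_COOKIEJAR, $cookie_file);  // cookies are written here when the handle is closed
curl_setopt($curl, CURLOPT_COOKIEFILE, $cookie_file); // cookies are read from here for the request
curl_setopt($curl, CURLOPT_USERAGENT, $UserAgent);
curl_setopt($curl, CURLOPT_REFERER, $referer);
curl_setopt($curl, CURLOPT_POST, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($search));

$data = curl_exec($curl);
if ($data === false) {
    // curl_exec() returns false on failure; stop instead of saving an empty file
    $error = curl_error($curl);
    curl_close($curl);
    exit('Fetch failed: ' . $error);
}
$info = curl_getinfo($curl); // request details such as http_code, available for debugging

// The output file is named after the date; iconv() is a no-op for this ASCII-only name
$keywords = "2016-4-27";
$keywords = iconv("utf-8", "gb2312", $keywords);
file_put_contents($keywords . '.html', $data);
//echo 'Fetched successfully';

// Close the curl handle to free its memory
curl_close($curl);
?>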
cookies.txt: the cookie file that $cookie_file points to in the script above.
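The comment in the script mentions exporting cookies and setting their expiry time. As a rough sketch of what such a file contains: curl reads cookie files in the Netscape format, one cookie per line with seven tab-separated fields (domain, include-subdomains flag, path, secure flag, expiry as a Unix timestamp, name, value). The helper name `netscapeCookieLine`, the cookie name `example_session`, its value, and the 30-day lifetime are all hypothetical placeholders, not Weibo's real cookies.

```php
<?php
// Hypothetical helper: build one line of a Netscape-format cookie file.
function netscapeCookieLine(
    string $domain,
    string $name,
    string $value,
    int $expires,
    string $path = '/',
    bool $secure = false
): string {
    // A leading dot on the domain means the cookie also applies to subdomains.
    $includeSubdomains = (strncmp($domain, '.', 1) === 0) ? 'TRUE' : 'FALSE';

    return implode("\t", array(
        $domain,
        $includeSubdomains,
        $path,
        $secure ? 'TRUE' : 'FALSE',
        (string) $expires, // Unix timestamp; a past value means the cookie has expired
        $name,
        $value,
    ));
}

// Assumed lifetime: 30 days from now.
$expires = time() + 30 * 24 * 3600;

// Placeholder cookie name and value; replace them with the ones exported from the browser.
$contents = implode("\n", array(
    '# Netscape HTTP Cookie File',
    netscapeCookieLine('.weibo.cn', 'example_session', 'PLACEHOLDER_VALUE', $expires),
)) . "\n";

echo $contents;
```

Writing `$contents` to cookies.txt with `file_put_contents()` would give `CURLOPT_COOKIEFILE` a file it can parse. If the expiry timestamp is already in the past, curl treats the cookie as expired and does not send it, which is why the expiry time needs adjusting after an export.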