<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Crawlers &#8211; ChaBug Security</title>
	<atom:link href="/tags/%E7%88%AC%E8%99%AB/feed" rel="self" type="application/rss+xml" />
	<link>/</link>
	<description>A blog for sharing knowledge, meeting peers, and pooling resources</description>
	<lastBuildDate>Mon, 13 Aug 2018 15:46:13 +0000</lastBuildDate>
	<language>zh-CN</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.5.5</generator>
	<item>
		<title>[NSFW] Scraping mm131 photo sets with Python (Baidu Cloud link included)</title>
		<link>/code/320.html</link>
		
		<dc:creator><![CDATA[Y4er]]></dc:creator>
		<pubDate>Sun, 21 Jan 2018 08:46:00 +0000</pubDate>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Crawlers]]></category>
		<category><![CDATA[Baidu Cloud]]></category>
		<guid isPermaLink="false">/?p=6</guid>

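					<description><![CDATA[With some free time lately I have been learning to write crawlers in Python, and pin-up photos have always been my motivation for learning to scrape. Among the glamour photo sites, mm131 stands out: its models really are good-looking...]]></description>
										<content:encoded><![CDATA[<p>With some free time lately I have been learning to write Python <span class="wpcom_tag_link"><a href="/tags/%e7%88%ac%e8%99%ab" title="爬虫" target="_blank">crawlers</a></span>, and pin-up photos have always been my motivation for learning to scrape.</p>
<p>Among the glamour photo sites, mm131 stands out: its models really are good-looking, and that is what prompted this post.</p>
<p>Here is a sample image first:<br /><img src="https://ws1.sinaimg.cn/large/006xriynly1fnigyl09vuj30jg0t6whp.jpg" alt="" title=""><br />Got your attention?<br />Hold on, let's look at the code.</p>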
<pre><code class="lang-python">#!/usr/bin/env python3
# coding=utf-8
import os
import re

import requests
from bs4 import BeautifulSoup


def downloadpic(url):
    # Browser-like headers sent with every image request
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'Connection': 'keep-alive',
        'Cookie': 'UM_distinctid=160c072721f36a-049309acceadc2-e323462-144000-160c0727220f67; CNZZDATA3866066=cnzz_eid%3D1829424698-1494676185-%26ntime%3D1494676185; bdshare_firstime=1515057214243; Hm_lvt_9a737a8572f89206db6e9c301695b55a=1515057214,1515074260,1515159455; Hm_lpvt_9a737a8572f89206db6e9c301695b55a=1515159455',
        'Host': 'img1.mm131.me',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
        'Referer': 'http://www.mm131.com/'
    }
    # Example: url = 'http://www.mm131.com/xinggan/3561.html'
    r = requests.get(url)
    # r.encoding = 'gb2312'
    r.encoding = r.apparent_encoding
    first_html = r.text
    soup = BeautifulSoup(first_html, 'lxml')
    # Album title
    title = soup.find('h5').get_text()
    # Page count: the first run of digits in the pager text
    page = soup.find('span', {'class': 'page-ch'}).get_text()
    print(page)
    page = re.search(r'\d+', page).group()
    # Create a folder named after the album title and page count
    path=&quot;E:\\pic\\&quot;
    folder = path + title + page + 'P'
    os.makedirs(folder, exist_ok=True)
    img_pattern = re.compile(r'img alt=.* src=&quot;(.*?)&quot; /', re.S)
    for i in range(1, int(page) + 1):
        # Page 1 is the album URL itself; page i lives at NAME_i.html
        if i == 1:
            html = first_html
        else:
            html = requests.get(url[:-5] + '_' + str(i) + '.html').text
        a = img_pattern.search(html)
        print(a.group(1))
        # Download the image and write it to disk
        pic = requests.get(a.group(1), headers=headers)
        with open(folder + '\\' + str(i) + '.jpg', 'wb') as f:
            f.write(pic.content)


if __name__ == '__main__':
    url = 'http://www.mm131.com/xinggan/'
    html = requests.get(url).text
    urls = BeautifulSoup(html, 'lxml').find('dl', {'class': 'list-left public-box'}).find_all('a', {'target': '_blank'})
    for url in urls:
        url = url['href']
        print(url)
        # downloadpic(url)  # uncomment to download the first list page as well
    for i in range(2, 122):
        print('Page ' + str(i))
        url = 'http://www.mm131.com/xinggan/list_6_' + str(i) + '.html'
        html = requests.get(url).text
        urls = BeautifulSoup(html, 'lxml').find('dl', {'class': 'list-left public-box'}).find_all('a', {'target': '_blank'})
        for url in urls:
            url = url['href']
            print(url)
            downloadpic(url)</code></pre>
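<p>The script derives every page address from the album URL: page 1 is the album URL itself, and page i replaces the <code>.html</code> suffix with <code>_i.html</code>. Below is a minimal sketch of just that step, with no network access. The helper names <code>parse_page_count</code> and <code>album_page_urls</code> and the sample pager text are assumptions made for illustration; they are not part of the original script.</p>

```python
import re


def parse_page_count(pager_text):
    """Return the first run of digits in the pager text as an int."""
    match = re.search(r'\d+', pager_text)
    if match is None:
        raise ValueError('no page count found in %r' % pager_text)
    return int(match.group())


def album_page_urls(album_url, page_count):
    """List the URL of every page in an album, first page included."""
    # Strip the trailing ".html" so the page suffix can be appended.
    base = album_url[:-len('.html')]
    urls = [album_url]
    for i in range(2, page_count + 1):
        urls.append('%s_%d.html' % (base, i))
    return urls


if __name__ == '__main__':
    # The pager text here is a made-up sample, not copied from the site.
    count = parse_page_count('共3页')
    for page_url in album_page_urls('http://www.mm131.com/xinggan/3561.html', count):
        print(page_url)
```

<p>Keeping the URL arithmetic in a pure function like this makes it easy to check without touching the site.</p>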
<p>P.S.</p>
<ul>
<li>Remember to install the required modules.</li>
<li>Remember to change the save path in <code>downloadpic</code>: <code>path=&quot;E:\\pic\\&quot;</code></li>
</ul>
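<p>Because the save path is hard-coded with Windows separators, it may help to build the album folder with <code>os.path.join</code> instead. The sketch below is one possible way to do it; the helper name <code>album_folder</code> is hypothetical, and the demo writes into a temporary directory rather than a real download folder.</p>

```python
import os
import tempfile


def album_folder(root, title, page_count):
    """Return the folder for one album, creating it if needed."""
    # Same naming scheme as the script: title + page count + 'P'.
    folder = os.path.join(root, '%s%sP' % (title, page_count))
    # exist_ok avoids an error when the script is run a second time.
    os.makedirs(folder, exist_ok=True)
    return folder


if __name__ == '__main__':
    # A throwaway root directory, used only for this demonstration.
    demo_root = tempfile.mkdtemp()
    print(album_folder(demo_root, 'demo', 12))
```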
<p>The code is open source on <a href="https://github.com/chabug/mm131">GitHub</a>; a star would be appreciated.</p>
<p><span class="wpcom_tag_link"><a href="/tags/%e7%99%be%e5%ba%a6%e4%ba%91" title="百度云" target="_blank">Baidu Cloud</a></span> link to the photo sets: <a href="https://pan.baidu.com/s/4dFQ7Tdv">https://pan.baidu.com/s/4dFQ7Tdv</a></p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>[Tutorial] Python crawler: fetching the weather for any region</title>
		<link>/code/323.html</link>
		
		<dc:creator><![CDATA[Y4er]]></dc:creator>
		<pubDate>Tue, 16 Jan 2018 11:50:00 +0000</pubDate>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[Crawlers]]></category>
		<guid isPermaLink="false">/?p=17</guid>

					<description><![CDATA[Hi everyone, I'm Aotian. Let's get straight to our crawler! First, a screenshot of the result. My environment: Windows 7, Python 3.6. Dependencies: BeautifulSoup, re...]]></description>
										<content:encoded><![CDATA[<blockquote><p>Hi everyone, I'm Aotian</p></blockquote>
<hr />
<p>Let's get straight to our <span class="wpcom_tag_link"><a href="/tags/%e7%88%ac%e8%99%ab" title="爬虫" target="_blank">crawler</a></span>!</p>
<h1>First, the result</h1>
<p><img src="https://ws1.sinaimg.cn/large/006xriynly1fniq48fh4rj30go09dgm3.jpg" alt="Result" title="Result"></p>
<p>My environment:</p>
<ul>
<li>Windows 7</li>
<li>Python 3.6</li>
</ul>
<p>Dependencies:</p>
<ul>
<li>BeautifulSoup</li>
<li>requests</li>
<li>pinyin</li>
</ul>
<h1>On to the code</h1>
<pre><code>import requests
import pinyin
from bs4 import BeautifulSoup
from os import system
class Get_url_weather(object):
    # Request a weather page by URL and parse the data out of it
    def __init__(self, url, timeout=2):
        # Takes the page URL and a request timeout (2 seconds by default)
        self.r = requests.get(url, timeout=timeout)
        if self.r.status_code == 404:
            print(&quot;Error: please check your input. If it keeps failing, this program cannot look up the weather for your location.&quot;)
    def get(self):
        soup = self.get_soup()
        # Everything we want sits in one dl with class=&quot;weather_info&quot;
        html = self.get_dl_weather_info(soup)
        if html is None:
            # Nothing to parse, for example after a 404
            return [&quot;No weather data found on the page.&quot;]
        kongqi = html.find(&quot;dd&quot;, class_=&quot;kongqi&quot;)
        shidu = html.find(&quot;dd&quot;, class_=&quot;shidu&quot;)
        a = []
        a.append(&quot;Header: {}&quot;.format(html.img[&quot;alt&quot;]))
        a.append(&quot;Region: {}&quot;.format(html.dd.h2.text))
        a.append(kongqi.h6.text)
        # The span holds two values back to back; split after 9 characters
        a.append(kongqi.span.text[:9])
        a.append(kongqi.span.text[9:])
        a.append(shidu.b.text)
        a.append(shidu.find_all(&quot;b&quot;)[1].text)
        a.append(shidu.find_all(&quot;b&quot;)[2].text)
        a.append(kongqi.h5.text)
        a.append(&quot;Current time: {}&quot;.format(html.find(&quot;dd&quot;, class_=&quot;week&quot;).text))
        a.append(&quot;Current weather: {}&quot;.format(html.find(&quot;span&quot;).b.text))
        a.append(&quot;Temperature for the day: {}&quot;.format(html.find(&quot;span&quot;).text))
        return a
    def get_soup(self):
        return BeautifulSoup(self.r.text, &quot;html.parser&quot;)
    def get_dl_weather_info(self, soup):
        return soup.find(&quot;dl&quot;, attrs={'class': 'weather_info'})
if __name__ == &quot;__main__&quot;:
    URL = &quot;http://www.tianqi.com/&quot;
    # The site addresses each place by its toneless pinyin, so convert the Chinese name
    url_path = pinyin.get(input(&quot;Enter a place name in Chinese (without 市 or 省): &quot;), format=&quot;strip&quot;)
    URL = URL + url_path
    Data = Get_url_weather(URL)
    data = Data.get()
    print('\n'.join(data))
    # Windows only: keep the console window open
    system(&quot;pause&quot;)
    #print(str(pinyin.get(&quot;你好&quot;, format=&quot;strip&quot;)))</code></pre>
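<p>The main block builds the page address by appending the toneless pinyin of the place name to the site root. The sketch below isolates that step with the standard library only; <code>weather_url</code> is a hypothetical helper, and it assumes the caller has already converted the name to pinyin (the script does this with <code>pinyin.get</code>).</p>

```python
from urllib.parse import quote, urljoin

BASE_URL = 'http://www.tianqi.com/'


def weather_url(place_pinyin, base=BASE_URL):
    """Return the page URL for a place given as toneless pinyin."""
    slug = place_pinyin.strip().lower()
    # Reject anything that is not plain ASCII letters, such as raw Chinese input.
    if not (slug.isascii() and slug.isalpha()):
        raise ValueError('expected plain pinyin letters, got %r' % place_pinyin)
    return urljoin(base, quote(slug))


if __name__ == '__main__':
    print(weather_url('beijing'))
```

<p>Validating the slug before the request gives a clearer error than waiting for the site to answer with a 404.</p>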
<p>Want to learn web scraping? Join the discussion group: 62851737</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>[Tutorial] Playing with WeChat in Python using itchat</title>
		<link>/code/322.html</link>
		
		<dc:creator><![CDATA[Y4er]]></dc:creator>
		<pubDate>Tue, 16 Jan 2018 11:48:00 +0000</pubDate>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[Crawlers]]></category>
		<guid isPermaLink="false">/?p=16</guid>

					<description><![CDATA[Hi everyone, I'm Aotian. Let's get straight to our crawler! My environment: Windows 7, Python 3.6. Dependencies: itchat. I admit the section above was copied from the previous article...]]></description>
										<content:encoded><![CDATA[<blockquote><p>Hi everyone, I'm Aotian</p></blockquote>
<hr />
<p><img src="https://ws4.sinaimg.cn/mw690/006xriynly1fnjqr9ricoj30u01o0435.jpg" alt="" title=""><br />Let's get straight to our <span class="wpcom_tag_link"><a href="/tags/%e7%88%ac%e8%99%ab" title="爬虫" target="_blank">crawler</a></span>!</p>
<h1>My environment</h1>
<ul>
<li>Windows 7</li>
<li>Python 3.6</li>
</ul>
<p>Dependencies:</p>
<ul>
<li>itchat</li>
</ul>
<p>I admit the section above was copied from the previous article.</p>
<h1>On to the code</h1>
<pre><code>&quot;&quot;&quot;
    This program runs on a server and sends messages to certain friends
    at a fixed time every day. For now the plan is to send the weather
    early each morning.
&quot;&quot;&quot;
import itchat
from time import sleep
import time
Wchat = itchat.auto_login(hotReload=True)
friends = itchat.get_friends()[0:]
# Map each friend's remark name to their user id
friends_name = {}
for i in friends:
    if i[&quot;RemarkName&quot;]:
        if i[&quot;RemarkName&quot;] not in friends_name:
            friends_name[i[&quot;RemarkName&quot;]] = i[&quot;UserName&quot;]
while True:
    # For testing: itchat.send_msg(&quot;xxx&quot;, toUserName='filehelper')
    # Current local time as HHMM
    time_now = time.strftime('%H%M', time.localtime(time.time()))
    if time_now == '0700':  # it is 7:00, so send the message
        # Replace 老爸 (Dad) with any contact, but it must be a remark name you have set for that friend
        itchat.send_msg(time.strftime('%Y-%m-%d-%H-%M-%S', time.localtime(time.time())), friends_name[&quot;老爸&quot;])
        sleep(60)  # wait out the minute so the message is sent only once
    else:
        sleep(10)  # poll a few times a minute instead of spinning the CPU
</code></pre>
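<p>Polling the clock works, but the wait until the next send time can also be computed directly. The sketch below is an alternative under that assumption, not the method used in the post; <code>seconds_until</code> is a hypothetical helper built on the standard <code>datetime</code> module.</p>

```python
from datetime import datetime, timedelta


def seconds_until(hour, minute, now=None):
    """Seconds from now until the next occurrence of hour:minute."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot has already passed (or is right now), aim for tomorrow.
    if now >= target:
        target += timedelta(days=1)
    return (target - now).total_seconds()


if __name__ == '__main__':
    print(seconds_until(7, 0))
```

<p>A sending loop could then call <code>time.sleep(seconds_until(7, 0))</code> before each message instead of checking the time every few seconds.</p>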
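<p>Of course this program is not finished yet; for now it simply sends one message at 7:00 every morning.<br />Next I will merge it with the weather crawler so that it reports the weather to you at 7:00 each morning.<br />That is why I wrote the previous crawler as a class. Stay tuned for the next post. And once again, the discussion group: 62851737</p>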
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
