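Scrapy 1.2.1 has been released.
Scrapy is an asynchronous crawling framework built on Twisted and implemented in pure Python. Users only need to customize a few modules to build a crawler that scrapes web page content and images.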
Changes:

New features
New FEED_EXPORT_ENCODING setting to customize the encoding used when writing items to a file. This can be used to turn off \uXXXX escapes in JSON output. It is also useful for those who want something other than UTF-8 for XML or CSV output (#2034).
The startproject command now supports an optional destination directory to override the default one based on the project name (#2005).
New SCHEDULER_DEBUG setting to log request serialization failures (#1610).
The JSON encoder now supports serialization of set instances (#2058).
application/json-amazonui-streaming is now interpreted as TextResponse (#1503).
scrapy is imported by default when using the shell tools (shell, inspect_response) (#2248).
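
To make the \uXXXX escapes and set serialization mentioned above concrete, here is a minimal sketch using only Python's standard json module. It is not Scrapy's own encoder: the class name SetAwareEncoder and the sample item are hypothetical, and converting sets to sorted lists is an assumption made for the illustration.

```python
import json

# Hypothetical item; the field names and values are illustrative only.
item = {"title": "爬虫", "tags": {"python"}}


class SetAwareEncoder(json.JSONEncoder):
    """Illustrative encoder that writes sets as sorted lists."""

    def default(self, o):
        if isinstance(o, (set, frozenset)):
            return sorted(o)
        return super().default(o)


# Default behaviour: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(item, cls=SetAwareEncoder)

# With ensure_ascii=False the characters are written as-is, which is
# the kind of output one would expect after setting
# FEED_EXPORT_ENCODING = "utf-8" in a project's settings.py.
readable = json.dumps(item, cls=SetAwareEncoder, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same data; only the on-disk representation differs.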

Bug fixes
The DefaultRequestHeaders middleware now runs before the UserAgent middleware (#2088). Warning: this is technically backwards incompatible, though we consider it a bug fix.
The HTTP cache extension and plugins that use the .scrapy data directory now work outside projects (#1581). Warning: this is technically backwards incompatible, though we consider it a bug fix.
Selector no longer allows passing both response and text (#2153).
Fixed logging of the wrong callback name with scrapy parse (#2169).
Fix for an odd gzip decompression bug (#1606).
Fix for selected callbacks when using CrawlSpider with scrapy parse (#2225).
Fix for invalid JSON and XML files when a spider yields no items (#872).
Implemented flush() for StreamLogger, avoiding a warning in logs (#2125).
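
The flush() fix above concerns a file-like object that forwards writes to the logging system. The following is a rough sketch of that idea using only the standard library. The class name StreamToLogger is hypothetical and this is not Scrapy's actual StreamLogger implementation.

```python
import logging


class StreamToLogger:
    """Hypothetical file-like wrapper that forwards written text to a logger."""

    def __init__(self, logger, level=logging.INFO):
        self.logger = logger
        self.level = level

    def write(self, text):
        # Log each non-empty line separately; a bare newline produces nothing.
        for line in text.rstrip().splitlines():
            self.logger.log(self.level, line.rstrip())

    def flush(self):
        # Nothing is buffered in this sketch. The method exists so that
        # code expecting a file-like object can call flush() safely.
        pass
```

An instance of this class could, for example, be passed to contextlib.redirect_stdout so that print() output ends up in the log.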

Refactoring
canonicalize_url has been moved to w3lib.url (#2168).

Download: