Scrapy 1.2.2 Released: Web Crawling Framework

Scrapy 1.2.2 has been released.

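Scrapy is a web crawling framework written in pure Python and built on Twisted, an asynchronous networking framework. Users only need to customize a few modules to build a crawler that scrapes web page content and images.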

Changes:

Bug fixes

Fix a cryptic traceback when a pipeline fails on open_spider() (issue 2011)

Fix embedded IPython shell variables (fixing issue 396 that re-appeared in 1.2.0, fixed in issue 2418)

A couple of patches when dealing with robots.txt:

handle (non-standard) relative sitemap URLs (issue 2390)

handle non-ASCII URLs and User-Agents in Python 2 (issue 2373)
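To illustrate what a relative sitemap URL in robots.txt looks like and how it can be resolved, here is a minimal sketch using only the standard library. It is not Scrapy's actual implementation; the function name `sitemap_urls` and the example.com URLs are hypothetical, chosen only for illustration.

```python
from urllib.parse import urljoin


def sitemap_urls(robots_url, robots_body):
    """Collect Sitemap entries from robots.txt text as absolute URLs.

    Hypothetical helper for illustration; not part of Scrapy's API.
    """
    urls = []
    for line in robots_body.splitlines():
        line = line.strip()
        # Directive names in robots.txt are matched case-insensitively.
        if line.lower().startswith("sitemap:"):
            value = line.split(":", 1)[1].strip()
            if value:
                # urljoin leaves absolute URLs untouched and resolves
                # relative ones against the robots.txt location.
                urls.append(urljoin(robots_url, value))
    return urls


robots_body = (
    "User-agent: *\n"
    "Sitemap: /sitemap.xml\n"
    "Sitemap: https://example.com/other.xml\n"
)
resolved = sitemap_urls("https://example.com/robots.txt", robots_body)
print(resolved)
```

The relative entry `/sitemap.xml` is resolved against the robots.txt URL, while the absolute entry passes through unchanged.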

Documentation

Document "download_latency" key in Request's meta dict (issue 2033)

Remove page on (deprecated & unsupported) Ubuntu packages from ToC (issue 2335)

A few fixed typos (issue 2346, issue 2369, issue 2369, issue 2380) and clarifications (issue 2354, issue 2325, issue 2414)
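As a rough illustration of the "download_latency" meta key mentioned above, the sketch below reads it from a response's meta dict the way a spider callback might. To stay runnable without Scrapy installed, it uses a stand-in object with a `meta` attribute; the helper name `describe_latency` and the 0.25 value are assumptions made for this example.

```python
from types import SimpleNamespace


def describe_latency(response):
    """Format the download latency stored in response.meta, if present.

    Hypothetical helper; in a real spider, `response` would be a
    scrapy.http.Response passed to a callback.
    """
    # The key is filled in by Scrapy once the response has been downloaded,
    # so read it defensively with .get().
    latency = response.meta.get("download_latency")
    if latency is None:
        return "latency not recorded"
    return "fetched in {:.3f}s".format(latency)


# Stand-in for a Scrapy response, for demonstration only.
fake_response = SimpleNamespace(meta={"download_latency": 0.25})
print(describe_latency(fake_response))
```

Inside an actual callback the same lookup would be `response.meta.get("download_latency")`.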

Other changes

Advertise conda-forge as Scrapy's official conda channel (issue 2387)

More helpful error messages when trying to use .css() or .xpath() on non-Text Responses (issue 2264)

startproject command now generates a sample middlewares.py file (issue 2335)

Add more dependencies' version info in scrapy version verbose output (issue 2404)

Remove all *.pyc files from source distribution (issue 2386)

Full changelog

Download