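Scrapy 1.2.2 Released: Web Crawling Framework
Scrapy 1.2.2 has been released. Scrapy is an asynchronous processing framework built on Twisted, a crawler framework implemented in pure Python. Users only need to customize a few modules to build a crawler that scrapes web page content and images. What's new: Bug fixes: Fix a cryptic traceback when a pipeline fails on open_spider() (issue 2011) Fix...
Visualizing Tweet Vectors Using Python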
I try to experiment with a lot of different technologies. I’ve found that having experience with a diverse set of concepts, languages, libraries, tools etc. leads to more robust thinking when trying to...
tryexceptpass: Threaded Asynchronous Magic and How to Wield It
Photo credit: Daniel Schwen via Wikipedia. Threaded Asynchronous Magic and How to Wield It: a dive into Python’s asyncio tasks and event loops. OK, let’s face it. Clock speeds no longer govern the pace at...
Marcos Dione: ayrton-0.9
Another release, but this time not (only) a bugfix one. After playing with bool semantics I converted the file tests from a _X format, which, let's face it, was not pretty, into the more usual -X...
Talk Python to Me: #88 Lightweight Django
Django is a very popular Python web framework. One reason is that you have many building blocks to drop in for large sections of your application. Need a full-on admin table editor backend? That's a few...
Obey the Testing Goat: Second Edition update: Virtualenvs, Django 1.10, REST...
A brief update on my progress for the second edition. Getting there! Virtualenvs all the way down. In the first edition, I made the judgement call that telling people to use virtualenvs at the very...
Writing autofill plugins for TeamPlayer
Background TeamPlayer is a Django-based streaming radio app with a twist. A while back it gained a feature called "shake things up" where, instead of dead silence, "DJ Ango" would play tracks from the...
How to Create Group By Queries With Django ORM
This tutorial is about how to implement SQL-like group by queries using the Django ORM. It’s a fairly common operation, especially for those who are familiar with SQL. The Django ORM is actually an...
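Python Crawler Basics, Part 6: Using Cookies
Hi everyone. In the previous part we looked at exception handling in crawlers, so next let's look at how to use cookies. Why use cookies? Cookies are data (usually encrypted) that some websites store on the user's local device in order to identify the user and track sessions...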
Introducing: fastparquet
A compliant, flexible and speedy interface to Parquet format files for Python, fastparquet provides seamless translation between in-memory pandas DataFrames and on-disk storage. In this post, we will...
Python 3.6.0 release candidate is now available
Python 3.6.0rc1 is the release candidate for Python 3.6, the next major release of Python. Code for 3.6.0 is now frozen. Assuming no release critical problems are found prior to the 3.6.0 final...
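Notes on Python's tempfile Module
The tempfile module is mainly used to create temporary files and directories that are deleted automatically after use, which saves you from creating a file, using it, and then deleting it yourself. The most commonly used functions are TemporaryFile and NamedTemporaryFile; the rest only need a quick look. TemporaryFile creates a temporary file that is deleted automatically when it is closed: In [81]: tmp = tempfile.TemporaryFile() In [82]:...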
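The behaviour described in this excerpt can be tried with a minimal sketch like the one below. It is not the article's own session (which is cut off above); the file contents and variable names are made up for illustration, and only standard-library calls are used.

```python
import os
import tempfile

# TemporaryFile: an anonymous scratch file that is removed when closed.
with tempfile.TemporaryFile(mode="w+") as tmp:
    tmp.write("hello tempfile")
    tmp.seek(0)  # rewind before reading back what was written
    anonymous_content = tmp.read()

# NamedTemporaryFile: same idea, but the file has a visible path while open.
with tempfile.NamedTemporaryFile(mode="w+", suffix=".txt") as named:
    named_path = named.name
    named.write("named file")
    named.flush()
    existed_while_open = os.path.exists(named_path)

# With the default settings, the named file is gone once the block exits.
exists_after_close = os.path.exists(named_path)
```

Using the functions as context managers ensures the file is closed, and therefore deleted, even if an exception is raised inside the block.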
GET and POST requests using Python
This post discusses two HTTP (Hypertext Transfer Protocol) request methods, GET and POST, and their implementation in Python. What is HTTP? HTTP is a set of protocols designed to enable...
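MonkeyRunner Pitfalls: Jython
MonkeyRunner uses the jython-standalone-2.5.3 environment. Only when I ran my finished Python script did I find that import json raised an import error. The Jython 2.7 package does include the module, so I tried swapping it in, but that did not work and I had to find another way. The final solution was to download simplejson manually: import sys,time,datetime...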
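Using yield in Python
yield has to be used inside a function; it can no longer be used on its own. A function that contains yield is generally regarded as a generator, or as a function that produces a generator. Here is the Fibonacci example found online: def fab(max): n, a, b = 0, 0, 1 while n < max: #print b yield b a, b = b, a + b n = n + 1...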
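Laid out as runnable code, the Fibonacci generator quoted in this excerpt might look like the sketch below. The parameter is renamed from max to max_count (my choice, to avoid shadowing the built-in), and the calling code at the end is an added illustration that is not part of the excerpt.

```python
def fab(max_count):
    """Yield the first max_count Fibonacci numbers, one at a time."""
    n, a, b = 0, 0, 1
    while n < max_count:
        yield b          # hand one value to the caller and pause here
        a, b = b, a + b  # resume from this line on the next request
        n += 1


gen = fab(5)       # calling the function only creates a generator object
first = next(gen)  # runs the body up to the first yield
rest = list(gen)   # drains the remaining values
```

Nothing in the function body executes until a value is requested, which is why the first call to next() is what actually starts the loop.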
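Why Resist Python 3
Source: CSDN. This document lists the reasons why beginners should avoid learning Python 3. I give two kinds of reasons here: the first is aimed at complete beginners, the other at people who already have some programming background. The first part discusses the matter from a non-technical angle, to help beginners make a sound decision without being swayed by outside hype and pressure. The second part discusses the flaws Python 3 currently has and why these flaws get in programmers' way. I will not teach beginners Python...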
Codementor: Extending Apache Pig with Python UDFs
Introduction: Apache Pig is a popular system for executing complex Hadoop map-reduce based data-flows. It adds a layer of abstraction on top of Hadoop's map-reduce mechanisms in order...
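Life Is Short, I Use Python: Day 18, Regular Expressions + Components + the Django Framework
Contents: 1. Regular expressions 2. Components 3. The Django framework. Part 1, regular expressions. Purpose: 1. check whether a string matches a given regular expression (test); 2. extract the matched data (exec). When a user logs in, regular expressions are often needed to check whether the input meets the requirements. Example 1: checking whether a string meets the requirements of a defined regular expression. How to use exec: rep = /\d+/; defines a regular expression that matches digits. str =...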
Python Web Frameworks
Structures (source: Eos Maia). Introduction: At the time of this writing, the web development landscape is dominated by JavaScript tools. Frameworks like ReactJS and AngularJS are very popular, and...