
Gunicorn & LRU cache pitfall


In Python 3 you can use the @lru_cache decorator from the functools module. It stores the result of the decorated function in a cache. Imagine that you have a simple Flask application:

from flask import Flask, jsonify
from functools import lru_cache

app = Flask(__name__)

@app.route('/')
def main():
    store_to_cache()
    return jsonify({'message': 'Stored'})

@lru_cache(maxsize=2)
def store_to_cache():
    return {'this_goes_to_cache': 'and_this_too'}

You request the root URL and a dictionary is stored in the cache. The cache is set up to hold only 2 elements. You also have a helper endpoint for getting information about the state of that cache:

@app.route('/get_cache_info')
def get_cache_info():
    cache_info = store_to_cache.cache_info()
    return jsonify({
        'Hits': cache_info.hits,
        'Misses': cache_info.misses,
        'Maxsize': cache_info.maxsize,
        'Currsize': cache_info.currsize
    })

When you run this application in development mode - without Gunicorn - everything works as expected: you store to the cache and receive the proper information back:

$ curl -X GET http://127.0.0.1:8000
{
  "message": "Stored"
}
$ curl -X GET http://127.0.0.1:8000/get_cache_info
{
  "Currsize": 1,
  "Hits": 0,
  "Maxsize": 2,
  "Misses": 1
}

Let's run the same code, but this time with Gunicorn and two workers:

$ gunicorn --workers=2 application:app
$ curl -X GET http://127.0.0.1:8000
$ curl -X GET http://127.0.0.1:8000/get_cache_info
{
  "Currsize": 1,
  "Hits": 0,
  "Maxsize": 2,
  "Misses": 1
}
$ curl -X GET http://127.0.0.1:8000/get_cache_info
{
  "Currsize": 0,
  "Hits": 0,
  "Maxsize": 2,
  "Misses": 0
}

Sometimes the request reports that there is one item in the cache, and other times that the cache is empty. Why is that? Because the LRU cache is a cache per worker process. When a user hits your site, the value is cached - but only in the worker that handled the request! When the same user comes back and the request is handled by the second worker, that worker has nothing stored in its cache!

For this reason, a per-worker cache is not a good idea in a web application. What can you use instead? A centrally stored cache such as Memcached. You will thank yourself in the future.
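As a minimal sketch of that idea - assuming a Memcached server on 127.0.0.1:11211 and the pymemcache client library, neither of which appears in the original post - every Gunicorn worker would then read and write the same shared cache:

import json

from flask import Flask, jsonify
from pymemcache.client.base import Client

app = Flask(__name__)
# Each worker opens its own connection, but all workers talk to the
# same Memcached server, so they all see the same cached data.
cache = Client(('127.0.0.1', 11211))

@app.route('/')
def main():
    cached = cache.get('my_key')
    if cached is None:
        # Store the value centrally instead of in a per-process LRU cache.
        cache.set('my_key', json.dumps({'this_goes_to_cache': 'and_this_too'}))
        return jsonify({'message': 'Stored'})
    return jsonify({'message': 'Already cached'})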

That's all for today! Feel free to comment - maybe you have a better idea of which cache to use to avoid pitfalls?

The example of how the LRU cache works is based upon this article .

The code that I have made so far is available on GitHub . Stay tuned for the next blog post in this series.

Cover image by Tim Green under CC BY-SA 2.0 .


Python String Concatenation Methods in Detail


Strings are one of the most commonly used data types in Python, and there are many ways to concatenate them. This post walks through each method with examples and explains its particular characteristics.

>>> a = 'hello'
>>> b = 'python'
>>> c = '!'
>>> a + ' ' + b + ' ' + c
'hello python !'
>>> ' '.join([a,b,c])
'hello python !'
>>> '%s %s,I love %s %s' % (a,b,b,c)
'hello python,I love python !'
>>> '{} {} {}'.format(a,b,c)
'hello python !'
>>> '{1} {2} {0}'.format(a,b,c)
'python ! hello'
>>> '{x1} {x2} {x3}'.format(x1=a,x2=b,x3=c)
'hello python !'
>>>

We first create three string objects a, b, and c; from the examples above we can summarize the characteristics of each concatenation method.

The first method joins strings with the "+" operator. Note that the strings are concatenated directly: if you are building a sentence and want spaces between the words, you must add the spaces yourself.

The second method is .join. Note that it takes exactly one argument, so to join several objects you first put them into a list or tuple - and every element of that list or tuple must be a string. The string before the dot is the separator inserted at each join point (a space, for example). This method is effectively the inverse of .split.

Example:

>>> '*'.join([a,b,c])
'hello*python*!'
>>> 'xxx'.join([a,b,c])
'helloxxxpythonxxx!'
>>>

The third method uses "%s" string formatting. Each %s acts as a placeholder occupying one position in the format string, and the % operator after the string supplies the objects to fill in. This method is typically used to insert a variable into a long string. Besides %s, string formatting also offers %d for integers, %f for floating-point numbers, and so on.
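As a small illustration of those other conversion specifiers (an interpreter session of my own, consistent with the article, not taken from it):

>>> 'I have %d apples weighing %.2f kg' % (3, 1.5)
'I have 3 apples weighing 1.50 kg'
>>> '%f' % 1.5
'1.500000'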

The fourth method is .format. Like %s, format is a string-formatting method, except that it uses curly braces {} as placeholders.

'{}{}{}'.format(a,b,c)

When the braces are empty, they default to indexes 0, 1, 2, ... and format's arguments are filled in, in order.

'{1}{2}{0}'.format(a,b,c)

When the braces contain an index, the arguments are filled in according to those indexes.

'{n1}{n2}{n3}'.format(n1=a,n2=b,n3=c)

The braces can also contain names; the values are then supplied as keyword arguments, whose order does not matter.

Doing query param request in Django


Typically, I want my API endpoint URL to be flat, like this:

mydomain.com/shit_api/v1/person/123

That URL is pretty straightforward: it's trying to get the person with an ID of 123 from shit_api. But it's not always that clear-cut.

I've been working on a particular API that needed a pair of optional filters that were a pain to work out in a regular expression. And this is where I found out about Django's QueryDict object . With it I can write API URLs with query params:

mydomain.com/shit_api/v1/something/?filter1=value1&filter2=value2

And in the urls and views, I can handle it as such:

# urls.py
urlpatterns = [
    url(r'^parts/$', SomethingView.as_view(), name='Something'),
]

# views.py
class SomethingView(View):
    def get(self, request, *args, **kwargs):
        filter1_value = request.GET.get('filter1', 'default')
        filter2_value = request.GET.get('filter2', 'default')
        # do more here
        return Response('some response')

The key here is the request.GET.get() method having a default value. This makes a parameter missing from the URL a non-problem. So a call like:

mydomain.com/shit_api/v1/something/

Will still work.
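For completeness, here's a minimal interpreter-style sketch of that behaviour (my own, not from the original post), exercising Django's QueryDict directly:

>>> from django.http import QueryDict
>>> q = QueryDict('filter1=value1&filter1=value2')
>>> q.get('filter1', 'default')   # the last value wins for repeated keys
'value2'
>>> q.get('filter2', 'default')   # a missing key falls back to the default
'default'
>>> q.getlist('filter1')          # all values for a repeated key
['value1', 'value2']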

Bonus: I don't have to deal with a convoluted regular expression. Yehey!

Django Dynamic Formsets


Django forms are one of the most important parts of the stack: they enable us to write declarative code that will validate user input, and ensure we protect ourselves from malicious input.

Formsets are an extension of this: they deal with a set of homogeneous forms, and will ensure that all of the forms are valid independently (and possibly do some inter-form validation, but that’s a topic for a later day).

The Django Admin contains an implementation of a dynamic formset: that is, it handles adding and removing forms from a formset, and maintains the management form accordingly. This post details an alternative implementation.

A Formset contains a Form (and has zero or more instances of that Form). It also contains a “Management Form”, which has metadata about the formset: the number of instances of the form that were provided initially, the number that were submitted by the user, and the maximum number of forms that should be accepted.

A Formset has a “prefix”, which is prepended to each element within the management form:

<input type="hidden" name="prefix-INITIAL_FORM_COUNT" value="..."> <input type="hidden" name="prefix-TOTAL_FORM_COUNT" value="..."> <input type="hidden" name="prefix-MIN_NUM_FORM_COUNT" value="..."> <input type="hidden" name="prefix-MAX_NUM_FORM_COUNT" value="...">

Each Form within the Formset uses the prefix, plus its index within the list of forms. For instance, if we have a Formset that contains three forms, each containing a single “name” field, we would have something similar to:

<input type="text" name="prefix-0-name" value="Alice"> <input type="text" name="prefix-1-name" value="Bob"> <input type="text" name="prefix-2-name" value="Carol">

Note that the form’s prefix is <formset_prefix>-<form_index> .

To make a Formset dynamic, we just need to be able to add (and possibly remove, but there’s a little more complexity there) extra forms. The management form needs to be updated to reflect this, and we need to ensure that the new form’s fields are named appropriately.

A Formset also contains an empty_form . This is an unbound form, where the form’s “index” is set to __prefix__ . Thus, the empty form for the above formset might look somewhat like:

<input type="text" name="prefix-__prefix__-name" value="">

We can leverage this to allow us to have simpler code: instead of having to duplicate elements and remove the value, we can just duplicate the empty form, and replace the string __prefix__ with whatever the index of the newly created form should be.
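To make the empty form concrete, here is a minimal sketch of my own (not from the post); PersonForm is a hypothetical one-field form, while formset_factory and empty_form are standard Django:

from django import forms
from django.forms import formset_factory

class PersonForm(forms.Form):
    name = forms.CharField()

PersonFormSet = formset_factory(PersonForm, extra=0)
formset = PersonFormSet(prefix='prefix')

# The empty form renders its field as name="prefix-__prefix__-name";
# client-side code duplicates this HTML, replaces __prefix__ with the
# next form index, and bumps prefix-TOTAL_FORMS in the management form.
print(formset.empty_form.as_p())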

Here’s an implementation that has no dependencies, but does make some assumptions:

Extracting a TOC from Markup


In today’s edition of “really simple things that come in handy all the time” I present a simple script to extract the table of contents from Markdown or AsciiDoc files:
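The script itself was embedded as a gist in the original post; the sketch below is my own reconstruction from the description that follows (a top-to-bottom regex scan, indentation by heading depth, extension-based type detection), so the details are assumptions:

import os
import re
import sys

# Heading patterns: "#" for Markdown, "=" for AsciiDoc.
PATTERNS = {
    'markdown': re.compile(r'^(#+)\s+(.+)$'),
    'asciidoc': re.compile(r'^(=+)\s+(.+)$'),
}

# Simple type detection using common extensions.
EXTENSIONS = {
    '.md': 'markdown', '.markdown': 'markdown',
    '.adoc': 'asciidoc', '.asc': 'asciidoc',
}

def toc(path):
    ext = os.path.splitext(path)[1].lower()
    pattern = PATTERNS[EXTENSIONS.get(ext, 'markdown')]
    with open(path) as f:
        for line in f:
            match = pattern.match(line)
            if match:
                # Indent one block per heading level below the top.
                depth = len(match.group(1)) - 1
                print('  ' * depth + '- ' + match.group(2).strip())

if __name__ == '__main__':
    toc(sys.argv[1])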

So this is pretty simple: just use regular expressions to look for lines that start with one or more "#" or "=" (for Markdown and AsciiDoc, respectively) and print them out with an indent according to their depth (e.g. indent an "## heading 2" by one block). Because this script goes from top to bottom, you get a quick view of the document structure without creating a nested data structure under the hood. I’ve also implemented some simple type detection using common extensions to decide which regex to use.

The result is a quick view of the structure of a markup file, which is especially useful when files get overly large. From the Markdown of one of my longer blog posts :

- A Practical Guide to Anonymizing Datasets with python
- Anonymizing CSV Data
- Generating Fake Data
- Creating A Provider
- Maintaining Data Quality
- Domain Distribution
- Realistic Profiles
- Fuzzing Fake Names from Duplicates
- Conclusion
- Acknowledgments
- Footnotes

And from the first chapter of Applied Text Analysis with Python :

- Language and Computation
-
-
- What is Language?
- Identifying the Basic Units of Language
- Formal vs. Natural Languages
- Formal Languages
- Natural Languages
- Language Models
- Language Features
- Contextual Features
- Structural Features
- The Academic State of the Art
- Tools for Natural Language Processing
- Language Aware Data Products
- Conclusion

OK, so clearly there are some bugs: those two blank "-" bullet points are a note callout, which has the form:

[NOTE]
====
Insert note text here.
====

The script therefore misidentifies the first and second ==== as level-4 headings. I tried a couple of regular-expression fixes for this, but couldn’t quite get it right. The next step is to add a simple loop over multiple paths so that I can print out the table of contents for an entire directory (e.g. to get the TOC for an entire book where one chapter == one file).

Rec-a-Sketch: a Flask App for Interactive Sketchfab Recommendations


After the long series of previous posts describing various recommendation algorithms using Sketchfab data, I decided to build a website called Rec-a-Sketch which visualizes the different algorithms' recommendations. In this post, I'll describe the process of getting this website up and running on AWS with nginx and gunicorn.

The goal of the website was two-fold.

1. I wanted to view the different algorithms' recommendations side-by-side for comparison.
2. I wanted to get "lost" in the recommendations like one gets lost clicking from link to link on Wikipedia.

I organized the page as follows so that (1) all recommendations were visible and (2) one can click on any of the recommended models to be taken to that model's recommendations.


Organization of the App

I decided to use Flask to build the web app because I already have some experience with it, and I'm not trying to reinvent the wheel! The functionality itself is fairly simple. Other than an about page, there is only one page and one Flask route in the whole site.

The functionality is relatively simple. When one initially goes to the page, there is a default list of models to select from, or one can input a link to a custom model. Once a model is selected, this sends a GET request to the main route . When the route receives this request, it must do two things:

1. Grab data about the input model (name, url, and thumbnail).
2. Find other recommended models and get their associated data.

I populated a SQLite database with data about the Sketchfab models. I do not store the thumbnails directly; rather, I include a link to the thumbnail on Sketchfab's servers.

For grabbing recommendations, I created a table with precomputed recommendations for each model. The recommendations are stored in the stupidest possible way: as a string of comma-separated model IDs. I pull down the string, split on the commas, and place everything in a list. The code looks something like what follows.

# mid is an inputted model ID
# type is the recommendation algorithm type (e.g. learning-to-rank)
c = conn.cursor()
sql = """
    SELECT
        type,
        recommended
    FROM recommendations
    WHERE mid = '{}'
""".format(mid)
c.execute(sql)
results = c.fetchall()
recommendations = []
for r in results:
    recommendations.append((r[0], [str(x) for x in r[1].split(',')]))
recommendations = dict(recommendations)

I should note that the above code is wide open to SQL injection. Please don't write code like this on a production server!
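A safer variant - my own sketch, not from the post - binds the model ID as a query parameter so the database driver handles the quoting (the placeholder style shown is sqlite3's ?; other drivers use %s):

# Parameterized query: mid is passed separately from the SQL text,
# so a malicious ID cannot alter the query structure.
sql = """
    SELECT type, recommended
    FROM recommendations
    WHERE mid = ?
"""
c.execute(sql, (mid,))
results = c.fetchall()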

The main route functionality was actually the easiest part of the whole project. The hardest parts were getting things running remotely.

This post from DigitalOcean was super helpful in getting things up and running. In fact, I almost followed that post verbatim.

For my purposes, I chose to use Amazon Web Services (AWS) instead of DigitalOcean for hosting Rec-a-Sketch. This was simply because I had previous experience with AWS. The first step is to set up an EC2 instance, which is a virtual server. Rec-a-Sketch is lightweight, so I chose a t2.nano instance because it's the cheapest.

One must create an Elastic IP address for the instance (which costs some money) as well as open ports 80 and 22. The ports can be opened by going to Network & Security -> Security Groups and creating a security group with the following ports:



When the EC2 instance is created, you can download a pem file which allows you to ssh into the EC2 box. Save the pem file to your computer, and set the permissions accordingly:

chmod 400 pemfile.pem

I usually place the file in ~/.ssh/ and then add the file to my ~/.ssh/config file for easy ssh-ing later on. The config file lets you setup quick aliases for ssh-ing (see here for more details).

Once you're able to ssh into the EC2 instance, it's time to set up the stack. The stack consists of the following:

nginx - a web server which can handle incoming requests and redirect them to your Flask app.

upstart - this makes sure that your Flask app stays up and running. If the app should die, upstart will start it back up again.

gunicorn - a python WSGI HTTP server. I freely admit that I don't quite get the purpose of gunicorn. One clear benefit is that you can run multiple "workers" or copies of your Flask app, which allows you to process multiple requests at once.

The DigitalOcean post walks through the setup of this stack quite nicely. One modification that I made is that I use miniconda for managing the python libraries, so in my upstart script I have to make sure to add miniconda to the PATH environment variable. The upstart script is on github here , and the nginx configuration is here .

I did run into some issues setting up both the upstart service and nginx (when do things ever work the first time around?). Both services have log files which can be helpful for debugging. nginx had access and error logs in /var/log/nginx/ , and each upstart service has its own log in /var/log/upstart/ .

I mentioned before that I do not actually host the Sketchfab model images on my server. I would have to pay for outgoing bandwidth, and this would add up quite fast (if people actually visit my website!). A simpler way to host images (though maybe morally dubious?) is to point to the url where Sketchfab hosts the image.

The Sketchfab API easily lets you find the location of an image thumbnail. At first I would just ping the Sketchfab API for each request that came into my Flask app. This proved super slow because I would have to wait for the Sketchfab API response each time. I tried to solve this by running a big script to store all API responses in my own database.

This worked for a bit, but then the image links started to break. I was confused for a bit, but maybe you can figure out what happened - here's an example image link:

https://dg5bepmjyhz9h.cloudfront.net/urls/a1194aa7be824b7da6accb1d0c788132/dist/thumbnails/93e331260a8142c6ab85d61f6a025476/200x200.jpeg

What's going on here? It turns out that Sketchfab smartly hosts their images using a Content Delivery Network, or CDN. CDNs are used to quickly serve files to users by hosting the files much closer to the user. There's no guarantee that the filename should stay the same at the CDN node, and it seems that they do not.

I did not want t

Gitee Recommendation | Hare, an ORM Framework Based on the ActiveRecord Pattern

Hare

hare is an ORM framework built on top of PyMySQL, following the ActiveRecord pattern. Inside a virtual environment, run:

pip install hare

to install it.

Currently, it only supports:

MySQL

Motivation

There are broadly two ways to operate on a database from python:

1. Using raw SQL
2. Using an ORM

Raw SQL

The raw SQL tools commonly used in python are:

MySQLdb
PyMySQL

The advantage of raw SQL is:

It gives developers a great deal of freedom: they know exactly which SQL will be executed, which makes SQL optimization straightforward.

The drawback is that it is cumbersome:

It is tedious to write, which slows development, and tedious to maintain as well.

ORM

The most widely used ORMs in python are SQLAlchemy and Peewee .

The advantage of an ORM is:

It is convenient to write and easy to maintain.

The drawbacks are:

It is opaque to developers, which gets in the way of SQL optimization; and the mainstream ORMs such as SQLAlchemy have a high learning cost, with far more functionality than a typical small or medium project will ever use.

Moreover, the usage philosophy of python ORM frameworks is:

Declare the fields and their types in a class by hand, then let the ORM create the corresponding table automatically.

Whereas many developers' philosophy is:

Create the table by hand with SQL, then build the corresponding ORM on top of it.

Comparing the two yields a new requirement: implement an ORM that satisfies the following:

1. Convenient mapping between ORM classes and database tables, ideally without declaring fields in the ORM
2. Support for raw SQL
3. No need for a complex API (anything too complex can be done directly with raw SQL)
4. Transaction support (declarative and imperative)

As you can easily guess, an ORM implemented in the Active Record style satisfies all of the above,

and so I implemented an ORM named Hare . A hare is a wild rabbit - the hope is that python database operations will be rabbit-fast.

Reference Frameworks

While designing and implementing Hare , I drew on the designs of the Flask framework and the jFinal framework.

jFinal

jFinal is a lightweight java web framework; I borrowed some of its design ideas while designing and implementing Hare :

Fetching the table schema automatically

On startup, jFinal fetches each table's schema from MySQL 's INFORMATION_SCHEMA , based on the table name the ORM class maps to;

Hare obtains the schema the same way.

Flask

Flask is a lightweight python web framework; I borrowed some of its design ideas while designing and implementing Hare :

Making the framework an object

In flask, you create an application object with:

app = Flask(__name__)

and that object stores the routes, handlers, and other related information;

Hare does something similar:

haredb = Hare(
    host='localhost',
    user='root',
    password='*****',
    db='test',
    charset='utf8',
    cursorclass=pymysql.cursors.DictCursor)

creates a data source object that holds everything needed for database operations.

Decorators

In flask, routes are defined with decorators:

@app.route('/home', methods=['GET'])
def home():
    pass

Hare also uses a decorator to define and register the mapping between a model class and a table, like this:

@haredb.table('user')
class User(Model):
    pass

which binds the User class to the user table.

Transactions in Hare can likewise be written with a decorator:

@haredb.tx
def func(...):
    ...

Usage

A database has a clear hierarchy: database -> table -> column. Hare is organized around the same structure.

The connection configuration provided by the user corresponds to one data source, that is, one Database:

haredb = Hare(
    host='localhost',
    user='root',
    password='135246',
    db='test',
    charset='utf8',
    cursorclass=pymysql.cursors.DictCursor)

Suppose a user table has already been created in the test database:

USER_TABLE = """CREATE TABLE `user` (
    `uid` int(11) NOT NULL AUTO_INCREMENT,
    `nickname` varchar(20) DEFAULT NULL,
    `email` varchar(20) DEFAULT NULL,
    PRIMARY KEY (`uid`)
) ENGINE=InnoDB AUTO_INCREMENT=59 DEFAULT CHARSET=utf8"""

A decorator declares which tables live under this database (here, a table named user mapped to the model User ):

@haredb.table('user')
class User(Model):
    pass

Then:

A complete usage example follows:

#! -*- coding: utf-8 -*-
from __future__ import absolute_import

import logging
from traceback import format_exc

import pymysql
from hare import Hare, Model

# Create a Hare object as the data source
haredb = Hare(
    host='localhost',
    user='root',
    password='********',
    db='test',
    charset='utf8',
    cursorclass=pymysql.cursors.DictCursor)

# Bind the user table to the User class
@haredb.table('user')
class User(Model):
    pass

# List all table names; returns ['user']
print haredb.tables

# Get the table object bound to the User class
table = User.table

# Print the table name
print table.name

# Empty the User table
table.truncate()

# Check whether a column belongs to this table
print table.is_column('uid')
print table.is_column('uid_not_exists')

# Create a new record
u = User()
u.set_many(**{'nickname': 'haha', 'email': 'a@q.com'}).save()

# Get the primary key
print u.uid

# Fetch a single record
u = User.get(uid=1)

# Change a field's value
u.nickname = 'new name'
u.update()

# Delete the object
u.delete()

# Fetch all user records; each element is a dict
users = User.select_many()

# Fetch all records matching a condition; each element is a dict
users = User.select_many(email='a@q.com')

# Paginate the User table
pagination = User.paginate(params={'nickname': ('is not', None)}, page=1, per_page=10)
print pagination.items

# Get a database connection
dbi = haredb.dbi

# Execute raw SQL
# a single record
users = dbi.select(u'SELECT * FROM user WHERE uid = 10')
# multiple records
users = dbi.select_many(u'SELECT * FROM user WHERE uid > 10')

# Execute a write
dbi.modify(u'DELETE FROM user WHERE uid = %s', 1)

# Bulk writes
rows = [{'nickname': 'test', 'email': 'test@qq.com'}]
dbi.modify_many(u'INSERT INTO user(nickname, email) VALUES(%(nickname)s, %(email)s)', rows)

# Run a transaction (declarative)
@haredb.tx
def save_user():
    user = User().set_many(**{'nickname': 'test2'})
    user.save()
    # 1/0  # uncomment this line and the save fails

# Run a transaction the other way (imperative)
def save_user2():
    user = User().set_many(**{'nickname': 'test2'})
    user.save()
    # 1/0  # uncomment this line and the save fails

with haredb.get_tx() as tx:
    try:
        save_user2()
    except:
        logging.error(format_exc())
        tx.rollback()
    else:
        tx.commit()

print User.select_many()

API

doc/api.md

Visual Studio 2017 RC3 Adds .NET Core Support, Delays Python Support


The third release candidate of Visual Studio 2017 shipped last week, fixing the small installer problems found earlier. With those resolved, it is worth looking at what this release actually updates. (The version is build 26127.00, released on January 27.)

The most notable parts of RC3 are support for .NET Core and ASP.NET Core, updates to Team Explorer, and fixes for bugs related to the Visual Studio installer. According to a comment by Microsoft's John Montgomery, several workloads were removed from RC3 because there was not enough time to finish their localization before VS2017 ships as a product. Although the Data Science and Python Development workloads will not be available in the release version, F# support remains in VS2017 through the .NET Desktop and .NET Web Development workloads. Montgomery said the Data Science and Python Development workloads will be offered as separate downloads after the release.

.NET Core / ASP.NET Core

In VS2017 these two workloads are no longer mere previews. As part of this release, migrating .NET Core projects from project.json/xproj files to csproj is more reliable.

Team Explorer

Team Explorer has been updated to connect faster to Visual Studio Team Services and Team Foundation Server.

NuGet Updates

NuGet now supports <PackageReference> in WPF, Windows Forms, and UWP projects. Lightweight Solution Restore can work without loading the projects.

Other Notable Updates

RC3 fixes several bugs, including adding a retry button for failed installations, improving the incomplete offline-installation feature, and fixing shutdown delays. With the number of outstanding major changes shrinking, VS2017 appears close to its official release. Now is a good time to try out how the VS2017 RTM will behave in your environment and for your specific development needs.

VS2017 RC3 is available for download now. For more details, see the release notes .

View the original English article: Visual Studio 2017 RC3 Adds .NET Core, Delays Python Support


(Original) Why I Am Learning Python

I have been writing for a while now, and prefixing every title with "(Original)" has started to feel a bit tacky - but if I suddenly drop it, readers might wonder whether a post was copied from somewhere, which is a real dilemma. A dilemma it may be, but it needs settling: starting from the next post, "(Original)" will no longer appear in the title; the originality notice will go at the end of each article instead. (I carelessly bolded "(Original)" three times there - please forgive me [grinning].)

Back to the topic. I am currently learning python, a scripting language which, according to Baidu, appears to be one of the most popular scripting languages. Not that I am the sort of person with no opinions of my own who just believes Baidu - I chose this language only after rigorous verification (on Baidu). And after studying it for a while, I find it more and more useful. As a thoroughly orthodox science-and-engineering type, let me list my reasons for learning Python one by one:

1. I have studied C, C++, Java, MATLAB's M language, Verilog HDL, VHDL, and so on - a presence-free technical nobody drifting along the border between programmer and hardware engineer. That pile of languages never found a use anywhere I actually needed it in real life, and I somehow passed all those courses; the graders must have been drunk. But Python is different: dull as I am, I managed within a day or two to write a genuinely useful little program with a GUI (I'll keep the details private - guess away, haha). For once this presence-free technical nobody felt a strong sense of presence, so of course I happily kept learning.

2. Honestly, I did not make the decision off a casual Baidu search. I chose Python because I wanted a scripting language to handle work requiring lots of repetitive operations (such as processing Excel data) - it's hardly as if I just did a quick Baidu search; well, if Baidu hadn't told me scripting languages could do that... (oops, typed too fast and let that slip; I'll leave it in).

3. Another reason: artificial intelligence is very hot right now, and I hear AI especially likes to replace lowly people like me - what to do! So I went to learn about AI, only to discover that AI is basically a thing that plays Go by itself; it won't be replacing the likes of me any time soon. So I started learning this simple glue language - perhaps, under AI's future rule, I can still set up a street stall selling factory seconds.

4. Yet another reason: Python supports an absurd number of libraries. Things that used to take me dozens or hundreds of lines of C take this thing one line... one line... one line... unbelievable.

5. And one more reason... ahem, I forget~

All in all, Python is very simple and easy to use - even people with no programming experience can pick it up quickly. At this point someone may ask: what is the advantage of learning a cheap thing that anyone can easily learn? My view: if you want to make a living from it, you must study it in great depth, and that is no longer something everyone can easily do. But for those who want to use programming to free their hands from repetitive mouse-and-keyboard drudgery on this wretched computer, or to write little programs (crawling a few jpgs, say) to enrich their spare time, Python is absolutely a tool for improving life and raising national happiness.

So many of the articles here will be the small programs, summaries, and notes from my journey of learning Python (properly digested, of course, with only the key points served up to you). Forgive me if I occasionally post some nonsense. Those who want to learn Python are welcome to drop by often - let's see who masters it faster! (Go ahead and slack off every other day; if you are still slower than me, I lose.)

Is the Python Expression i += x Equivalent to i = i + x?


Cover image: unsplash.com, by Dmitry Pavlov

Is the python expression i += x equivalent to i = i + x? If your answer is yes, then congratulations - you are 50% correct. Why only half? By the usual understanding the two are equivalent, and for integer operations there is indeed no difference; but does the same hold for list operations? Look at the following two snippets:

Snippet 1

>>> l1 = range(3)
>>> l2 = l1
>>> l2 += [3]
>>> l1
[0, 1, 2, 3]
>>> l2
[0, 1, 2, 3]

Snippet 2

>>> l1 = range(3)
>>> l2 = l1
>>> l2 = l2 + [3]
>>> l1
[0, 1, 2]
>>> l2
[0, 1, 2, 3]

The value of l2 is the same in both snippets, but the value of l1 differs, so i += x and i = i + x are not equivalent. When are they equivalent, and when are they not?

Before we can answer, we must first understand two concepts: mutable objects and immutable objects.

In Python, every object has three universal properties: identity, type, and value.

Identity: identifies the object uniquely in memory; it never changes once the object is created. The id() function returns an object's identity.

Type: determines which operations the object supports; different types support different operations - a list has a length, for instance, while an integer does not. An object's type, likewise, never changes once set. The type() function returns an object's type information.

An object's value, unlike its identity, is not necessarily fixed. Some objects' values can be changed through certain operations: an object whose value can change is called mutable, and an object whose value cannot change is called immutable.

Immutable objects

For an immutable object, its value remains whatever it was at creation; any operation on the object leads to the creation of a new object instead.

>>> a = 1
>>> id(a)
32574568
>>> a += 1
>>> id(a)
32574544

The integer 1 is an immutable object. At the initial assignment, a points to the integer object 1. After a += 1, a points to a different integer object, 2, while the object 1 is still there, unchanged - the variable a has simply been rebound to the new object 2. Common immutable types include int, tuple, frozenset, and str.


Mutable objects

The value of a mutable object can be changed dynamically by certain operations. A list, for example, can keep growing through append calls, its value changing all the while. When one mutable object is assigned to two variables, the variables share the same instance and point to the same memory address; operating through either variable also affects the other.

>>> x = range(3)
>>> y = x
>>> id(x)
139726103041232
>>> id(y)
139726103041232
>>> x.append(3)
>>> x
[0, 1, 2, 3]
>>> y
[0, 1, 2, 3]
>>> id(x)
139726103041232
>>> id(y)
139726103041232

After the append, the object's memory address is unchanged; x and y still point to the very same object - only its value has changed.

Having understood mutable and immutable objects, back to the question itself: what is the difference between += and +?

A += operation first tries to call the object's __iadd__ method; if there is no such method, it tries the __add__ method instead. Let's look at the difference between the two.

The difference between __add__ and __iadd__

The __add__ method takes two operands and returns their sum; neither operand's value is changed.

The __iadd__ method also takes two operands, but it is an in-place operation: it changes the value of the first operand, which requires the object to be mutable - so immutable objects have no __iadd__ method.
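To see the dispatch concretely, here is a small illustrative class - my own addition, not from the original article - that implements both hooks:

class Bag:
    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Returns a new Bag; neither operand changes.
        return Bag(self.items + other.items)

    def __iadd__(self, other):
        # Mutates self in place and returns it, so the name on the
        # left of += stays bound to the same object.
        self.items.extend(other.items)
        return self

b1 = Bag([1, 2])
b2 = b1
b2 += Bag([3])        # calls __iadd__: b1 is affected too
print(b1.items)       # [1, 2, 3]
b2 = b2 + Bag([4])    # calls __add__: b2 is rebound, b1 unchanged
print(b1.items)       # [1, 2, 3]
print(b2.items)       # [1, 2, 3, 4]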

>>> hasattr(int, '__iadd__')
False
>>> hasattr(list, '__iadd__')
True

Clearly, integer objects have no __iadd__, while list objects do provide it.

>>> l2 += [3]  # snippet 1: uses __iadd__; the value of l2 is modified in place

The += in snippet 1 calls __iadd__, which modifies, in place, the value of the object that l2 points to.


>>> l2 = l2 + [3]  # snippet 2: calls __add__, which creates a new list and binds it to l2

The + in snippet 2 calls __add__, which returns a new object and leaves the original untouched; l1 still points to the original object, while l2 now points to a new one.

That is the difference between i += x and i = i + x. It means that using += on a list carries a potential bug - l1 changes whenever l2 does - much as mutable objects make poor keyword-argument defaults for functions.

python generators, coroutines, native coroutines and async/await


Abstraction is not about vagueness, it is about being precise at a new semantic level. - Dijkstra

I myself found these concepts confusing when learning python, especially since python 3 has kept adding keywords such as yield from, async, and await to support asynchronous programming. Having recently read a good blog post on the subject, I translate it here with my own understanding of the concepts: generators, coroutines, native coroutines, and the async/await syntax introduced in python 3.5. Please run the code samples with python 3.5.

Generators

In python, a generator is a function that produces a sequence of values. Normally a function returns a value with return and its scope is then destroyed; calling the function again re-executes it from scratch. A generator, by contrast, can yield a value and pause its execution, handing control back to the caller; we can later resume it and obtain the next value. An example:

def simple_gen():
    yield 'hello'
    yield 'world'

gen = simple_gen()
print(type(gen))  # <class 'generator'>
print(next(gen))  # 'hello'
print(next(gen))  # 'world'

Note that calling a generator function does not return a value directly: it returns a generator object, which behaves like an iterable. We can call next() on it to step through the values, or use a for loop.

Generators are often used to save memory; for example, a generator function can yield values one at a time instead of returning a large, memory-hungry sequence:

def f(n):
    res = []
    for i in range(n):
        res.append(i)
    return res

def yield_n(n):
    for i in range(n):
        yield i

Coroutines

The previous section used a generator to pull data out of a function - but what if we want to push data in? This is where coroutines come into play. The yield keyword can not only produce values but also act as an expression (on the right-hand side of =) inside a function, and we can send values into the function by calling the send() method on the generator object. This is called a "generator-based coroutine". An example:

def coro():
    hello = yield 'hello'  # yield on the right of = is an expression; a value can be sent in
    yield hello

c = coro()
print(next(c))          # prints 'hello'
print(c.send('world'))  # prints 'world'

What happened here? As before, we first call next(); execution reaches yield 'hello' and we receive 'hello'. We then call send('world'), which resumes coro, assigns the argument 'world' to the variable hello, runs on to the next yield statement, and yields the value of hello. So the return value of send() is 'world'.

With generator-based coroutines, the terms "generator" and "coroutine" are often used to mean the same thing, even though strictly they are not. python 3.5 added the async/await keywords to support native coroutines, which we discuss below.

Async I/O and the asyncio module

python 3.4 added the asyncio module to the standard library for cleaner asynchronous programming. With the asyncio module we can easily implement asynchronous I/O using coroutines. Here is an example from the official documentation:

import asyncio
import datetime
import random

@asyncio.coroutine
def display_date(num, loop):
    end_time = loop.time() + 50.0
    while True:
        print('Loop: {} Time: {}'.format(num, datetime.datetime.now()))
        if (loop.time() + 1.0) >= end_time:
            break
        yield from asyncio.sleep(random.randint(0, 5))

loop = asyncio.get_event_loop()
asyncio.ensure_future(display_date(1, loop))
asyncio.ensure_future(display_date(2, loop))
loop.run_forever()

We create a coroutine display_date(num, loop) that takes a number and an event loop and continuously prints the current time. It uses the yield from keyword to await the result of asyncio.sleep(). asyncio.sleep() is itself a coroutine that completes after the given delay. We then schedule both coroutines on the default event loop with asyncio.ensure_future(), and finally tell the event loop to run forever.

If we run this code, we see the two coroutines executing concurrently. When we use yield from, the event loop knows the coroutine will be busy for a while (sleeping, in this case), suspends it, and switches to running the other coroutine. The two coroutines thus run concurrently (note: concurrently, not in parallel - the event loop is single-threaded, so they never literally run "at the same time").

For now, just know that yield from is syntactic sugar for the form below; it simply makes the code more concise.

# the yield from line
yield from asyncio.sleep(random.randint(0, 5))
# is roughly equivalent to
for x in asyncio.sleep(random.randint(0, 5)):
    yield x

Native Coroutines and async/await

Remember that up to this point we have still been using generator-based coroutines. In python 3.5, python added native coroutines with the async/await syntax. The earlier function can be rewritten with async/await like this:

import asyncio
import datetime
import random

async def display_date(num, loop):
    end_time = loop.time() + 50.0
    while True:
        print('Loop: {} Time: {}'.format(num, datetime.datetime.now()))
        if (loop.time() + 1.0) >= end_time:
            break
        await asyncio.sleep(random.randint(0, 5))

loop = asyncio.get_event_loop()
asyncio.ensure_future(display_date(1, loop))
asyncio.ensure_future(display_date(2, loop))
loop.run_forever()

Can you spot the changes? We removed the @asyncio.coroutine decorator, put the async keyword before the definition, and replaced yield from with await. Isn't that cleaner?

Native vs Generator-Based Coroutines: Interoperability

Apart from the syntax, there is in fact no functional difference between native coroutines (async/await) and generator-based coroutines (@asyncio.coroutine / yield from). Note, however, that the two syntaxes cannot be mixed in one function: you cannot use await inside a generator-based coroutine, or yield / yield from inside a native coroutine.

Other than that, the two kinds interoperate: we can await a generator-based coroutine from a native coroutine, and yield from a native coroutine defined with async from a generator-based coroutine.

For example, here both kinds of coroutine run in the same event loop:

import asyncio
import datetime
import random
import types

@types.coroutine
def my_sleep_func():
    yield from asyncio.sleep(random.randint(0, 5))  # note: await cannot be used here

async def display_date(num, loop):
    end_time = loop.time() + 50.0
    while True:
        print('Loop: {} Time: {}'.format(num, datetime.datetime.now()))
        if (loop.time() + 1.0) >= end_time:
            break
        await my_sleep_func()  # note: yield from cannot be used here

loop = asyncio.get_event_loop()
asyncio.ensure_future(display_date(1, loop))
asyncio.ensure_future(display_date(2, loop))
loop.run_forever()

Ref

PYTHON: GENERATORS, COROUTINES, NATIVE COROUTINES AND ASYNC/AWAIT

A PyTorch Implementation of Attention Transfer


This project is a PyTorch implementation of the paper "Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer".

Project address: https://github.com/szagoruyko/attention-transfer

The paper has been submitted to ICLR 2017 and is currently under review: https://openreview.net/forum?id=Sks9_ajex



The repository currently contains:

- Activation-based AT code for the CIFAR-10 experiments

- Code for the ImageNet experiments (ResNet-18-ResNet-34 student-teacher)

Coming soon:

- Gradient-based AT

- Scenes and CUB activation-based AT code

- A pretrained activation-based AT ResNet-18

The code uses PyTorch. The original experiments were done with torch-autograd; we have now verified that the CIFAR-10 results are fully reproducible in PyTorch, and we are doing the same for ImageNet (the PyTorch results are very slightly worse, due to hyperparameters).

Citation:

@article{Zagoruyko2016AT,
  author = {Sergey Zagoruyko and Nikos Komodakis},
  title = {Paying More Attention to Attention: Improving the Performance of
           Convolutional Neural Networks via Attention Transfer},
  url = {https://arxiv.org/abs/1612.03928},
  year = {2016}}

Requirements

First install PyTorch, then install torchnet:

git clone https://github.com/pytorch/tnt
cd tnt
python setup.py install

Install OpenCV with Python bindings, and torchvision with OpenCV transforms:

git clone https://github.com/szagoruyko/vision
cd vision; git checkout opencv
python setup.py install

Finally, install the remaining Python packages:

pip install -r requirements.txt

Experiments

CIFAR-10

This section describes how to obtain the results in the first table of the paper.

First, train the teachers:

python cifar.py --save logs/resnet_40_1_teacher --depth 40 --width 1
python cifar.py --save logs/resnet_16_2_teacher --depth 16 --width 2
python cifar.py --save logs/resnet_40_2_teacher --depth 40 --width 2

Train with activation-based AT:

python cifar.py --save logs/at_16_1_16_2 --teacher_id resnet_16_2_teacher --beta 1e+3

Train with KD:

python cifar.py --save logs/kd_16_1_16_2 --teacher_id resnet_16_2_teacher --alpha 0.9

We plan to add AT+KD with beta decay next, to obtain the best knowledge-transfer results.

ImageNet

Pretrained models

We provide a ResNet-18 model pretrained with activation-based AT:



Training from scratch

Download the pretrained weights for ResNet-34 (see functional-zoo for details):

wget https://s3.amazonaws.com/pytorch/h5models/resnet-34-export.hkl

Prepare the data following fb.resnet.torch, and start training (e.g. with 2 GPUs):

python imagenet.py --imagenetpath ~/ILSVRC2012 --depth 18 --width 1 \
--teacher_params resnet-34-export.hkl --gpu_id 0,1 --ngpu 2 \
--beta 1e+3

nvim-completion-manager - A completion framework for neovim


:heart: for my favorite editor

A Completion Framework for Neovim

This is my experimental completion framework for neovim, which offers great flexibility for writing your own completion plugin, including async support. For more information, please read the Why section.

Current Completion Sources

- Keyword from current buffer
- Ultisnips hint
- File path completion
- Python code completion
- Javascript code completion
- Golang code completion
- PHP code completion
- Language specific completion for markdown

Requirements

1. Neovim python3 support. :help provider-python . For lazy linux users, I recommend the plugin python-support.nvim . (Note: self promotion)
2. For Python code completion, you need to install the jedi library. For Python code completion in markdown files, you also need to install mistune .
3. For Javascript code completion, you need to install nodejs and npm on your system.
4. For Golang code completion, you need to install gocode .

Installation and Configuration

Assuming you're using vim-plug :

" `npm install` For javascript code completion support Plug 'roxma/nvim-completion-manager', {'do': 'npm install'} " PHP code completion is moved to a standalone plugin Plug 'roxma/nvim-cm-php-language-server', {'do': 'composer install && composer run-script parse-stubs'}

If you are using python-support.nvim , add the following code to your vimrc to satisfy requirements 1 and 2:

Plug 'roxma/python-support.nvim'
" for python completions
let g:python_support_python3_requirements = add(get(g:,'python_support_python3_requirements',[]),'jedi')
" enable python completions on markdown file
let g:python_support_python3_requirements = add(get(g:,'python_support_python3_requirements',[]),'mistune')

Add this to suppress the annoying completion messages:

" don't give |ins-completion-menu| messages. For example, " '-- XXX completion (YYY)', 'match 1 of 2', 'The only match', set shortmess+=c

Note that there's no guarantee that this plugin will be compatible with other completion plugins in the same buffer. Use let g:cm_enable_for_all=0 and call cm#enable_for_buffer() to enable this plugin for a specific buffer only.

Tab Completion

inoremap <expr> <Tab> pumvisible() ? "\<C-n>" : "\<Tab>"
inoremap <expr> <S-Tab> pumvisible() ? "\<C-p>" : "\<S-Tab>"

How to extend this framework?

- For really simple, lightweight completion candidate calculation, refer to autoload/cm/sources/ultisnips.vim
- For a truly async completion source, refer to the file path completion example: autoload/cm/sources/cm_filepath.py

Why?

I'm writing this for fun, feeding my own need, and it's working pleasingly for me now. And it seems there are lots of differences between deoplete, YCM, and nvim-completion-manager, by design.

I haven't read the source of YCM yet. So here I'm describing the main design of NCM (from now on, I'm using NCM as short for nvim-completion-manager) and some of the differences between deoplete and this plugin.

Async architecture

Each completion source should be a standalone process. The manager notifies the completion sources of any text change, even while the popup menu is visible, and a completion source notifies the manager when it has complete matches available. After some basic priority sorting between completion sources, and some simple filtering, the completion manager triggers the popup menu with the complete() function.

As shown intentionally in the python jedi completion demo, if one completion source takes a long, long time to calculate matches, the popup menu will still be shown quickly as long as the other completion sources work properly. And if the user hasn't changed anything, the popup menu will be updated after the slow completion source finishes its work.

As of the time this plugin was created, deoplete gathered completion candidates with the gather_candidates() method of each Source object, inside a for loop, in deoplete's process. A slow completion source may thus defer the display of the popup menu. Of course, it will not block the UI.

Scoping

I write markdown files with code blocks quite often, so I've also implemented language specific completion for markdown files. This is a framework feature, which is called scoping. It should work for any markdown code block whose language completion source is available to NCM. I've also added support for javascript completion in the script tag of html files.

Experimental hacks

Note that there are some hacks in NCM. It uses a 30ms timer to detect changes even while the popup menu is visible, instead of using the TextChangedI event, which only triggers when no popup menu is visible. This is important for implementing the async architecture. I'm hoping one day neovim will offer a better option than a timer or the limited TextChangedI .

Also note that the context in which NCM calls nvim's complete() function does not meet the requirement in the documentation :help complete() , which says:

You need to use a mapping with CTRL-R = |i_CTRL-R|. It does not work after CTRL-O or with an expression mapping.

I work on remote VMs quite often. I tend to avoid the CTRL-R = mapping, because it triggers text updates on neovim's command line and is potentially slowing down the UI. Luckily, it seems to work when calling this function directly. This is why I claimed it's experimental . I'm hoping one day I can confirm that the calling context is legal.

Deoplete and YCM are mature and legit; they have tons of features I'm not offering currently, which should be considered a main difference too.

FAQ

Vim 8 support?

Sorry, no plan for that. #1

Related Projects

asyncomplete.vim

Demo

- Keyword from current buffer
- Ultisnips hint
- File path completion
- Python code completion
- Javascript code completion
- Golang code completion
- Language specific completion for markdown

I've also added python completion for markdown files , just for fun. Note that this is a framework feature, which is called scoping ; it should work for any markdown code block whose language completion source is added to NCM.

The Road to Python, Chapter 22: The Django Form Component

The Form Component

Django's Form component mainly provides the following features:

- Generating HTML tags
- Validating user data (and displaying error messages)
- Re-populating the HTML form with the previously submitted data
- Initializing the content displayed on the page

Creating a Form class mainly involves fields and widgets: fields validate the data in the user's request, and widgets generate the HTML automatically.

1. Built-in fields

Field
    required=True,             whether the value may be empty
    widget=None,               HTML widget
    label=None,                text for the generated label tag
    initial=None,              initial value
    help_text='',              help text (shown next to the tag)
    error_messages=None,       error messages, e.g. {'required': 'cannot be empty', 'invalid': 'bad format'}
    show_hidden_initial=False, whether to add a hidden widget with the default value after this one
                               (can be used to check whether the input changed between two submissions)
    validators=[],             custom validation rules (usage shown further below)
    localize=False,            whether to enable localization
    disabled=False,            whether the field is editable
    label_suffix=None          suffix appended after the label
    Note: fields inheriting from Field accept all the parameters above.

CharField(Field)
    max_length=None,           maximum length
    min_length=None,           minimum length
    strip=True                 whether to strip surrounding whitespace from the input

IntegerField(Field)
    max_value=None,            maximum value
    min_value=None,            minimum value

FloatField(IntegerField)
    ...

DecimalField(IntegerField)
    max_value=None,            maximum value
    min_value=None,            minimum value
    max_digits=None,           total number of digits
    decimal_places=None,       number of decimal places

BaseTemporalField(Field)
    input_formats=None         time formats

DateField(BaseTemporalField)        format: 2015-09-01
TimeField(BaseTemporalField)        format: 11:12
DateTimeField(BaseTemporalField)    format: 2015-09-01 11:12

DurationField(Field)                time interval: %d %H:%M:%S.%f

RegexField(CharField)               equivalent to CharField plus validators
    regex,                     custom regular expression
    max_length=None,           maximum length
    min_length=None,           minimum length
    error_message=None,        ignore this; use error_messages={'invalid': '...'} instead

EmailField(CharField)
    ...

FileField(Field)                    for file uploads
    allow_empty_file=False     whether empty files are allowed

ImageField(FileField)
    ...                        note: requires the PIL module (pip3 install Pillow)
    When using the two fields above, mind two things:
    - the form tag needs enctype="multipart/form-data"
    - the view must use obj = MyForm(request.POST, request.FILES)

URLField(Field)
    ...

BooleanField(Field)
    ...

NullBooleanField(BooleanField)
    ...

ChoiceField(Field)                  single-select dropdown
    choices=(),                options, e.g. choices = ((0,'Shanghai'),(1,'Beijing'),)
    required=True,             whether a value is required
    widget=None,               widget; defaults to the Select widget
    label=None,                label text
    initial=None,              initial value
    help_text='',              help text

ModelChoiceField(ChoiceField)       dropdown backed by a queryset
    (django.forms.models.ModelChoiceField)
    queryset,                  database rows to choose from
    empty_label="---------",   text shown for the empty choice
    to_field_name=None,        model field used for the HTML value attribute
    limit_choices_to=None      extra filtering of the queryset in a ModelForm

ModelMultipleChoiceField(ModelChoiceField)
    (django.forms.models.ModelMultipleChoiceField)

TypedChoiceField(ChoiceField)
    coerce=lambda val: val     conversion applied to the selected value
    empty_value=''             default used for the empty value

MultipleChoiceField(ChoiceField)
    ...

TypedMultipleChoiceField(MultipleChoiceField)
    coerce=lambda val: val     conversion applied to each selected value
    empty_value=''             default used for the empty value

ComboField(Field)
    fields=()                  applies several validators; e.g. validate max length 20 and email format:
                               fields.ComboField(fields=[fields.CharField(max_length=20), fields.EmailField(),])

MultiValueField(Field)              abstract; subclasses aggregate several fields into one value,
                                    to be used together with MultiWidget

SplitDateTimeField(MultiValueField) renders several input boxes at once
    input_date_formats=None,   format list, e.g. ['%Y--%m--%d', '%m%d/%Y', '%m/%d/%y']
    input_time_formats=None    format list, e.g. ['%H:%M:%S', '%H:%M:%S.%f', '%H:%M']

FilePathField(ChoiceField)          file choice; the files in a directory are listed on the page,
                                    and the submitted value is the file path
    path,                      directory path
    match=None,                regex filter
    recursive=False,           recurse into subdirectories
    allow_files=True,          allow files
    allow_folders=False,       allow folders
    required=True, widget=None, label=None, initial=None, help_text=''

GenericIPAddressField
    protocol='both',           supported IP formats: both, ipv4, ipv6
    unpack_ipv4=False          unpack ipv4, e.g. ::ffff:192.0.2.1 is parsed as 192.0.2.1
                               (protocol must be 'both' to enable this)

SlugField(CharField)                digits, letters, underscores, hyphens

UUIDField(CharField)                uuid type

Django built-in fields

2. Built-in widgets

Widgets generate the HTML; every widget accepts attrs={'class': 'c1'} to set default attributes.

TextInput(Input)
NumberInput(TextInput)
EmailInput(TextInput)
URLInput(TextInput)
PasswordInput(TextInput)
HiddenInput(TextInput)
Textarea(Widget)
DateInput(DateTimeBaseInput)
DateTimeInput(DateTimeBaseInput)
TimeInput(DateTimeBaseInput)
CheckboxInput
Select
NullBooleanSelect
SelectMultiple
RadioSelect
CheckboxSelectMultiple
FileInput
ClearableFileInput
MultipleHiddenInput
SplitDateTimeWidget
SplitHiddenDateTimeWidget
SelectDateWidget

Django built-in widgets

3. Form validation

The view file:

from django import forms

class Verification(forms.Form):
    # Each field name matches a name attribute submitted by the form
    user = forms.CharField(error_messages={'required': 'username cannot be empty'})
    pwd = forms.CharField(
        max_length=12,
        min_length=6,
        error_messages={'required': 'password cannot be empty',
                        'min_length': 'password must be at least 6 characters',
                        'max_length': 'password must be at most 12 characters'}
    )
    email = forms.EmailField(error_messages={'required': 'email cannot be empty',
                                             'invalid': 'invalid email format'})

def login(request):
    if request.method == "GET":
        obj = Verification()
        return render(request, 'login.html', {'obj': obj})
    elif request.method == "POST":
        # Take the user's data and validate every submitted value.
        # On success: all the cleaned values are available.
        # On failure: display the error messages.
        obj = Verification(request.POST)
        result = obj.is_valid()  # run validation; returns True/False
        if result:
            print(obj.cleaned_data)    # the user's validated data
        else:
            print(obj.errors.as_json)  # all error messages (obj.errors)
            return render(request, 'login.html', {'obj': obj})  # pass obj to the template
        return redirect('/login/')

The HTML file:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
    <form action="/login/" method="post">
        {% csrf_token %}
        <p>User: {{ obj.user }}{{ obj.errors.user.0 }}</p>
        <p>Password: {{ obj.pwd }}{{ obj.errors.pwd.0 }}</p>
        <p>Email: {{ obj.email }}{{ obj.errors.email.0 }}</p>
        <input type="submit" name="Submit" />
    </form>
</body>
</html>

login.html

Using other template tags:

<form method="POST" enctype="multipart/form-data">
    {% csrf_token %}
    {{ form.xxoo.label }}
    {{ form.xxoo.id_for_label }}
    {{ form.xxoo.label_tag }}
    {{ form.xxoo.errors }}
    <p>{{ form.user }} {{ form.user.errors }}</p>
    <input type="submit" />
</form>

Other template tags

4. More validation options:

The form definition file:

from django import forms
from django.forms import widgets
from django.forms import fields

class Verification(forms.Form):
    # Each field name matches a name attribute submitted by the form
    user = fields.CharField(
        widget=widgets.Textarea(attrs={'class': 'c1'}),  # custom widget: a long text box with class=c1
        label="Username:"                                # text displayed on the left
    )
    pwd = fields.CharField(
        max_length=12,
        min_length=6,
        widget=widgets.PasswordInput()                   # password input
    )
    f = fields.FileField()                               # file upload
    p = fields.FilePathField(path='app01')               # shows paths; the submitted value is the path
    email = fields.EmailField()
    city1 = fields.ChoiceField(                          # single-select dropdown
        choices=[(0, 'Shanghai'), (1, 'Guangzhou'), (2, 'Dongguan')]
    )
    city2 = fields.MultipleChoiceField(                  # multi-select dropdown
        choices=[(0, 'Shanghai'), (1, 'Guangzhou'), (2, 'Dongguan')]
    )

The HTML file:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
    <form action="/login/" method="post">
        {% csrf_token %}
        <p>{{ obj.user }}{{ obj.errors.user.0 }}</p>
        <p>{{ obj.pwd }}{{ obj.errors.pwd.0 }}</p>
        <p>{{ obj.email }}{{ obj.errors.email.0 }}</p>
        <p>{{ obj.f }}{{ obj.errors.f.0 }}</p>
        <p>{{ obj.p }}</p>
        <p>{{ obj.city1 }}</p>
        <p>{{ obj.city2 }}</p>
        <input type="submit" name="Submit" />
    </form>
</body>
</html>

login.html
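The field reference in section 1 mentions the validators=[] parameter; as a minimal sketch of custom validation rules (my own example, not from the original post), a field can combine Django's RegexValidator with a plain function validator:

from django import forms
from django.core.exceptions import ValidationError
from django.core.validators import RegexValidator

def validate_not_test_number(value):
    # A validator is just a callable that raises ValidationError on bad input.
    if value.endswith('0000'):
        raise ValidationError('test numbers are not allowed')

class PhoneForm(forms.Form):
    phone = forms.CharField(
        validators=[
            RegexValidator(r'^1[3-9]\d{9}$', 'invalid mobile number'),
            validate_not_test_number,
        ]
    )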

5. Common choice widgets

# single radio, value is a string
user = fields.CharField(
    initial=2,
    widget=widgets.RadioSelect(choices=((1, 'Shanghai'), (2, 'Beijing'),))
)

# single radio, value is a string
user = fields.ChoiceField(
    choices=((1, 'Shanghai'), (2, 'Beijing'),),
    initial=2,
    widget=widgets.RadioSelect
)

# single select, value is a string
user = fields.CharField(
    initial=2,
    widget=widgets.Select(choices=((1, 'Shanghai'), (2, 'Beijing'),))
)

# single select, value is a string
user = fields.ChoiceField(
    choices=((1, 'Shanghai'), (2, 'Beijing'),),
    initial=2,
    widget=widgets.Select
)

# multi-select, value is a list
user = fields.MultipleChoiceField(
    choices=((1, 'Shanghai'), (2, 'Beijing'),),
    initial=[1, ],
    widget=widgets.SelectMultiple
)

# single checkbox
user = fields.CharField(
    widget=widgets.CheckboxInput()
)

# multi-checkbox, value is a list
user = fields.MultipleChoiceField(
    initial=[2, ],
    choices=((1, 'Shanghai'), (2, 'Beijing'),),
    widget=widgets.CheckboxSelectMultiple
)

Django choice widgets

6. Initializing data

When building features in a web application, you frequently need to fetch data from the database and use the values to initialize the HTML form fields.

The form definition file:

from django import forms
from django.forms import widgets
from django.forms import fields

class Verification(forms.Form):
    # Each field name matches a name attribute submitted by the form
    user = fields.CharField(
        widget=widgets.Textarea(attrs={'class': 'c1'}),  # custom widget: a long text box with class=c1
        label="Username:"                                # text displayed on the left
    )
    pwd = fields.CharField(
        max_length=12,
        min_length=6,
        widget=widgets.PasswordInput()                   # password input
    )
    # f = fields.FileField()                             # file upload
    p = fields.FilePathField(path='app01')               # shows paths; the submitted value is the path
    email = fields.EmailField()
    city1 = fields.ChoiceField(                          # single-select dropdown
        choices=[(0, 'Shanghai'), (1, 'Guangzhou'), (2, 'Dongguan')]
    )
    city2 = fields.MultipleChoiceField(                  # multi-select dropdown
        choices=[(0, 'Shanghai'), (1, 'Guangzhou'), (2, 'Dongguan')]
    )

The validation class

The view file:

def login(request):
    if request.method == "GET":
        # fetch the data from the database
        dic = {
            "user": 'r1',
            'pwd': '123123',
            'email': 'sdfsd',
            'city1': 1,
            'city2': [1, 2]
        }
        obj = Verification(initial=dic)
        return render(request, 'login.html', {'obj': obj})
    elif request.method == "POST":
        obj = Verification(request.POST)
        result = obj.is_valid()  # run validation; returns True/False
        if result:
            print(obj.cleaned_data)    # the user's validated data
        else:
            print(obj.errors.as_json)  # all error messages (obj.errors)
            return render(request, 'login.html', {'obj': obj})  # pass obj to the template
        return redirect('/login/')

The view function

The HTML file:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
    <form action="/login/" method="post">
        {% csrf_token %}
        <p>{{ obj.user }}{{ obj.errors.user.0 }}</p>
        <p>{{ obj.pwd }}{{ obj.errors.pwd.0 }}</p>
        <p>{{ obj.email }}{{ obj.errors.email.0 }}</p>
        {# <p>{{ obj.f }}{{ obj.errors.f.0 }}</p> #}
        <p>{{ obj.p }}</p>
        <p>{{ obj.city1 }}</p>
        <p>{{ obj.city2 }}</p>
        <input type="submit" name="Submit" />
    </form>
</body>
</html>

login.html

Yes, People Still Want to Learn Excel


A question I get all the time: "Why do you blog about Excel? R/Python/name-your-program is better."

There’s no doubt that Microsoft is facing increased competition in the business analytics market.

Microsoft’s problem, though, is not in the technology but the marketing. Excel can do amazing things. Too few people know about them.

In part it’s up to Excel’s user base to demonstrate the program’s capabilities. This was a topic of great discussion between me, Alex Power, and Oz du Soleil. Check out our conversation here.

Regardless, as a blogger with a focus on Excel, I like to keep up on trends in software training. I like to check out Google Trends to track search interest in various topics.

My first query compared training for Excel, Python, and R.



Looks just about like what I would expect: more searches for Excel than for R and Python combined. It was surprising not to see increased search interest in any of the terms.

I changed my search criteria a bit in the second query and the story changed radically:



Wow. Looks like many more people want to learn Python. And a few more want R.

But guess what? While interest in Python and R has increased, interest in Excel has not decreased (for now).

One more.

R and Python are free programs popular for data analysis. Let’s compare search interest for them versus SPSS and SAS, traditional proprietary packages:



There is a pattern. Free programs are gaining on paid programs.

Is it fair to extend this pattern to Excel? Probably not. The search trends appear not to indicate that.

One thing Excel does have that SAS and SPSS do not is a huge user base (nearly every workstation on the planet is running Office). Network effects loom large.

Microsoft would have to become MySpace for a Facebook to take over. And I do not foresee that, because Microsoft has been savvy about offering highly valuable proprietary software while staying open to partnerships with open-source programs.

Here’s the thrust:

The adoption of Python and R will NOT cannibalize the use of Excel.

They each have strengths. In fact, there is synergy (ugh, I used the S word.).

The question of "Why learn/teach Excel? Why not Python/R?" is a false dichotomy. People keep learning Excel, so I will keep teaching it (and learning R myself).


0016 Getting Started with Programming in Python: Modules and Processes



Last Lesson's Homework

1. Modify the program that reads a year-month-day date and prints the day of the week: make the leap-year check a function, the day counting a function, and the weekday calculation a function.

The code is as follows:



2. Add parallelograms and trapezoids to the shape-area functions.


Turning It into a Module

If we keep adding more shapes to this area-calculation program, it will grow longer and longer and become inconvenient to read.

It also cannot satisfy one need: suppose two classmates work on this project, one has finished the area functions for 3 shapes and the other for 2 - how can their code be merged conveniently? And if more people are writing more shapes, and yet more people are using those functions, how should the code be organized logically?

Modules let you organize your python code more logically.

Simply put, a module is a file that contains python code. A module can define functions, classes, and variables, and may also contain executable code.

Save the earlier file containing the 5 area functions as area.py, keep only those 5 functions, and delete the rest of the code:


Importing the module from another file

Use import module1[,module2[,... moduleN]] to import modules.

For example, having just created area.py, put import area at the top of your code to import it,

then call area.triangle(base, high) to access the function inside the module.

The full code is as follows:


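The code appears only as a screenshot in the original; the reconstruction below matches the description above (the exact function bodies are my assumption):

# area.py - the module: keep only the area functions
def triangle(base, high):
    return base * high / 2

def rectangle(width, high):
    return width * high

# main.py - another file that imports and uses the module
import area

print(area.triangle(3, 4))   # area of a triangle with base 3, height 4
print(area.rectangle(3, 4))  # area of a 3 x 4 rectangle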

Remember the module we imported once before, random?

import random imports the random module, and

random.randint(1,99) generates a random integer from 1 to 99.

The Date and Time Module

Use import time to import the date-and-time module and handle common date-format conversions.

Time intervals are floating-point numbers measured in seconds.

A timestamp represents how much time has elapsed since 0:00:00 on January 1, 1970.

For example, to get the timestamp of the current moment, run the following in python:


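The original shows this as a screenshot; a minimal equivalent (my reconstruction) is:

>>> import time
>>> time.time()   # seconds elapsed since 1970-01-01 00:00:00
1486371234.567    # example output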

This representation suits computer storage and computation, but it is not friendly to humans, so a function is usually used to convert the timestamp into a familiar format.

Run the following code to try it out:

Here time.localtime is used to obtain the current local time, and strftime converts it into a display format.

%Y is the 4-digit year, %m the month (01-12), %d the day of the month (0-31), %H the hour (0-23, 24-hour clock), %M the minute (00-59), and %S the second (00-59).
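The conversion code, again a screenshot in the original, amounts to (my reconstruction):

>>> import time
>>> time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())
'2017-02-06 14:35:09'   # example output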

There are many other format parameters; you can search online for the full list.

The time module has many other functions. A common one is time.sleep(seconds), which pauses the program for that many seconds - type time.sleep(5) in python to experience a 5-second pause.

For the other time functions, look them up online and try them in python yourself.

The Concept of a Process

A computer system starts many programs, and not all of them are visible on the current screen. For example, if a python program keeps running without exiting and we are not looking at its terminal, we may not see it running at all. This is what we call a process.

On the Raspberry Pi, you can use the command sudo ps aux in LXTerminal to list all currently running processes:



Note that the PID column is the process ID.

To search for processes containing a given name, use sudo ps aux|grep name.

For example, to find processes containing "python", use sudo ps aux|grep python.



If you only see one final line containing grep - that is the search command itself - then no python process is currently running.

Now open another LXTerminal and run python in it to enter the python environment.



Then return to the first LXTerminal and run sudo ps aux|grep python again.



You will find an extra process in the results, with PID 1913, running the python program.

How to Kill an Unwanted Process

Sometimes a process keeps running - perhaps stuck in an infinite loop - and normal means cannot remove it. In that case you can force-kill it with:

the command sudo kill -9 PID.

For example, the python process above has PID 1913, so run sudo kill -9 1913 to kill it. Listing the processes again shows it is gone:



Now switch to the other LXTerminal window: the python program has been forcibly terminated:



Today's lesson covered how to define modules, how to use them, and how to find and kill a process stuck in an infinite loop.

Homework

1. Write an infinite-loop program that sleeps for 3 seconds on each iteration; run it, then force-quit it by killing its process.

2. Look up the datetime package online and rewrite the year-month-day-to-weekday (in Chinese) program in 2 lines of code.

Previous Tutorials

The tutorials in this series build strongly on one another, so please read them in publication order via the message history of the WeChat account 零基础学编程.

QQ Group

Everyone is welcome to join QQ group 603559164 (零基础学编程) to discuss, learn, and improve together.

This Is How the Programmer Next Door Taught Me Python Web Development


Live Broadcast Series

Besides writing crawlers and doing data mining, we can also use Python for web development. Python makes web code very concise and easy to maintain afterwards. This course, jointly produced by CSDN and teacher Wei Wei, teaches web development step by step using Python 3, entirely through hands-on cases, so that students learn Python web development from real scenarios and end up able to write actual web projects.

This Session's Topic

Python WEB!!!

Classic development cases, hands-on

About the Instructor



The course is taught entirely in Python 3.x. Python 2 may be more stable, but Python 3 is definitely the future; mastering Python 3 now means getting a head start.

The instructor offers interactive Q&A. The course emphasizes practice and is taught entirely through hands-on work - pure substance.

Registered users can join a dedicated study group to communicate at any time.

Video replays and slides are provided after the live sessions, for repeated study and reinforcement.

Course Syllabus

1.1 Setting up a web server
1.2 Python CGI programming basics
1.3 Installing and configuring Django
1.4 Basic Django commands in practice
1.5 Django MVC programming in practice
1.6 Building a simple web page with Django
1.7 MySQL database basics in practice

Session 2 (Feb 16): Develop your company website with Django and deploy it to a server

2.1 How to approach developing a company website
2.2 Beautiful pages made easy: hands-on with the BootStrap front-end toolkit
2.3 Database design for the company website
2.4 Front-end page design for the company website in practice
2.5 Back-end development for the company website in practice
2.6 Deploying the finished company website to Tencent Cloud (or Alibaba Cloud)

Session 3 (Feb 23): Hands-on development of a Python forum system

3.1 How to approach the forum system
3.2 Forum database design in practice
3.3 Forum development: implementing user login
3.4 Forum development: implementing boards
3.5 Forum development: implementing posting
3.6 Forum development: implementing replies

4.1 How to approach the CMS system
4.2 CMS database design in practice
4.3 CMS development: front-end page design
4.4 CMS development: implementing admin login
4.5 CMS development: implementing article publishing
4.6 CMS development: implementing article viewing and search
4.7 CMS development: implementing article comments

5.1 How to approach the shop (e-commerce) system
5.2 Shop database design in practice
5.3 Shop development: front-end page design
5.4 Shop development: implementing user login
5.5 Shop development: implementing the management back end (product publishing, order queries, payment queries, bulk email notifications to users)
5.6 Shop development: implementing the product display pages
5.7 Shop development: implementing product search
5.8 Shop development: implementing purchases and order processing
5.9 Shop development: implementing the Alipay interface
5.10 Shop development: implementing payment

Course Registration

Press and hold the QR code to register (live on February 9!)

You are welcome to join the group to chat!

If the WeChat group is full, you can tap "Read the original" to get the latest group QR code. You can also add (web + your name).

A Summary of Using Django Signals


I was recently adding features to an already-built project and thought of django signals; this post is an organized record for future reference.

What are django signals?

The official documentation describes them as follows:

Django includes a "signal dispatcher" which helps allow decoupled applications get notified when actions occur elsewhere in the framework. In a nutshell, signals allow certain senders to notify a set of receivers that some action has taken place. They're especially useful when many pieces of code may be interested in the same events.

My own understanding: django signals are hooks inside django. When an event occurs, other code can react to it through handler functions (receivers) that the signal calls back, decoupling the system to a greater degree.

Best use cases

Notifications

Notification is one of the most common scenarios for signals. For example, in a forum, notify the thread starter when a post gets a reply. Technically we could put the notification logic in the code that saves the reply, but that is not a good approach: it increases coupling and hurts later extension and maintenance. If saving a reply only sends a simple signal, and external notification logic picks up the signal and sends the notification, then the reply logic and the notification logic stay separate, and both remain easy to maintain and extend.

Initialization

Another example of signals is performing a series of initialization steps after an event completes.

Other usage guidelines

Do not use signals when:

- the signal is closely tied to one model and the logic can move into that model's save()
- the logic can be handled by a model manager instead
- the signal is closely tied to one view and the logic can move into that view

Signals are appropriate when:

- the receiver needs to modify several models at once
- the same signal from several apps should be routed to one receiver
- a cache must be cleared after some model is saved
- there is no other option and you need a callback function to handle some problem

How to use them

Using django signals involves two parts:

- signal: the signal definition and the event that triggers it
- receiver: the function that receives the signal

Using the built-in signals

django ships with predefined signals for us to use.

Model-related:

- pre_save: fired before an object is saved
- post_save: fired after an object is saved
- pre_delete: fired before an object is deleted
- post_delete: fired after an object is deleted
- m2m_changed: fired after a ManyToManyField changes

Request-related:

- request_started: fired before a request is processed
- request_finished: fired after a request is processed

For django's built-in signals, we only need to write the receiver; usage is as follows.

Step 1: write the receiver and bind it to the signal

myapp/signals/handlers.py

from django.dispatch import receiver
from django.core.signals import request_finished

# binding with the decorator
@receiver(request_finished, dispatch_uid="request_finished")
def my_signal_handler(sender, **kwargs):
    print("Request finished!================================")

# plain binding
def my_signal_handler(sender, **kwargs):
    print("Request finished!================================")
request_finished.connect(my_signal_handler)

#####################################################
# a signal on a model
from django.dispatch import receiver
from django.db.models.signals import post_save
from polls.models import MyModel

@receiver(post_save, sender=MyModel, dispatch_uid="mymodel_post_save")
def my_model_handler(sender, **kwargs):
    print('Saved: {}'.format(kwargs['instance'].__dict__))

# dispatch_uid ensures this receiver is registered and called only once

Step 2: load the signals

myapp/__init__.py

default_app_config = 'myapp.apps.MyAppConfig'

myapp/apps.py

from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # import the signal handlers so they are registered and can be used
        import myapp.signals.handlers

With this in place, the receiver runs whenever the system finishes handling a request.

For the other built-in signals, see the official documentation:

https://docs.djangoproject.com/en/1.9/topics/signals/

Using custom signals

For a custom signal, we write both the signal and the receiver.

Step 1: write the signal

myapp/signals/signals.py

import django.dispatch

my_signal = django.dispatch.Signal(providing_args=["my_signal_arg1", "my_signal_arg_2"])

Step 2: load the signals

myapp/__init__.py

default_app_config = 'myapp.apps.MySendingAppConfig'

myapp/apps.py

from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # import the signal handlers so they are registered and can be used
        import myapp.signals.handlers

Step 3: send the signal when the event occurs

myapp/views.py

from .signals.signals import my_signal

my_signal.send(sender="some function or class",
               my_signal_arg1="something",
               my_signal_arg_2="something else")

For custom signals, django already implements the event listening for us.

Step 4: on receiving the signal, the receiver runs

myapp/signals/handlers.py

from django.dispatch import receiver
from myapp.signals.signals import my_signal

@receiver(my_signal, dispatch_uid="my_signal_receiver")
def my_signal_handler(sender, **kwargs):
    print('my_signal received')

With that, our custom signal is fully implemented.
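One detail worth knowing: send() returns a list of (receiver, response) pairs, which is handy for checking that the receiver actually ran, e.g. in tests. A small sketch:

responses = my_signal.send(sender=None,
                           my_signal_arg1="something",
                           my_signal_arg_2="something else")
for receiver_fn, response in responses:
    print(receiver_fn.__name__, response)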

Summary

Django signals are handled synchronously, so don't use them for heavy batch work; keep receivers thin (see the sketch below).
Django signals help a great deal with decoupling, code reuse, and maintainability.
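Because receivers run synchronously inside the request, a common pattern is to keep them thin and push heavy work onto a task queue. A minimal sketch, assuming a Celery task send_bulk_emails and a Newsletter model that are both hypothetical:

from django.db.models.signals import post_save
from django.dispatch import receiver

from myapp.models import Newsletter          # hypothetical model
from myapp.tasks import send_bulk_emails     # hypothetical Celery task

@receiver(post_save, sender=Newsletter, dispatch_uid="newsletter_post_save")
def schedule_bulk_emails(sender, instance, created, **kwargs):
    # enqueue and return immediately instead of blocking the request
    if created:
        send_bulk_emails.delay(instance.pk)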

The above is my personal view; if you have questions, feel free to discuss.

References

http://sabinemaennel.ch/django/signals-in-django/

https://docs.djangoproject.com/en/1.10/topics/signals/

http://www.weiguda.com/blog/38/

http://www.python88.com/topic/151

How to fix Django’s HTTPS redirects in nginx


In the nginx configuration (inside the location block), specify this:

proxy_redirect off;
proxy_set_header X-Forwarded-Proto $scheme;

The proxy_redirect off statement tells nginx that, if the backend returns an HTTP redirect, it should leave it as is. By default, nginx assumes the backend is stupid and tries to fix the response; if, for example, the backend returns an HTTP redirect that says “redirect to http://localhost:8000/somewhere”, nginx replaces it with something similar to “http://yourowndomain.com/somewhere”. But Django isn’t stupid (or it can be configured to not be stupid), and it will typically return a relative URL. If nginx attempts to “fix” the relative URL, it will likely break things. Instead, we use proxy_redirect off so that nginx merely passes the redirection as is.

The second line is only necessary if your Django project ever uses request.is_secure() or similar. It's a good idea to have it anyway: even if it doesn't today, it will tomorrow, and it does no harm. Django does not know whether the request has been made through HTTPS or plain HTTP; nginx knows that, but the request it subsequently makes to the Django backend is always plain HTTP. We tell nginx to pass this information with the X-Forwarded-Proto HTTP header, so that related Django functionality such as request.is_secure() works properly. You will also need to set SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https') in your settings.py.
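Putting both lines in context, a location block could look like the sketch below; the upstream address 127.0.0.1:8000 is just a placeholder for wherever gunicorn (or another backend) listens:

location / {
    # placeholder upstream; adjust to your gunicorn socket or port
    proxy_pass http://127.0.0.1:8000;
    proxy_set_header Host $host;
    proxy_redirect off;
    proxy_set_header X-Forwarded-Proto $scheme;
}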

How to plot your own bike/jogging route using Python and Google Maps API


Apart from being a data scientist, I also spend a lot of time on my bike. It is therefore no surprise that I am a huge fan of all kinds of wearable devices. Lots of the time, though, I get quite frustrated with the data processing and data visualization software that major providers of wearable devices offer. That's why I have been trying to take things into my own hands. Recently I have started to play around with plotting my bike route from Python using the Google Maps API. My novice's guide to all this follows in the post.



Recently I was playing with my sports data and wanted to create a Google map with my bike ride, like Garmin Connect or Strava do.

That led me to the Google Maps API, specifically to their JavaScript API. And since I was playing with my data in Python, we'll be creating the map from there.

But first things first. To get the positional data for some of my recent bike rides I downloaded a TCX file from Garmin Connect. Parsing the TCX is easy, but more about that some other time. For now let me just show a basic Python 3.x snippet that parses latitude and longitude from my TCX file and stores them in a pandas data frame.

from lxml import objectify
import pandas as pd

# helper function to handle missing data in my file
def add_trackpoint(element, subelement, namespaces, default=None):
    in_str = './/' + subelement
    try:
        return float(element.find(in_str, namespaces=namespaces).text)
    except AttributeError:
        return default

# activity file and namespace of the schema
tcx_file = 'activity_1485936178.tcx'
namespaces = {'ns': 'http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2'}

# get activity tree
tree = objectify.parse(tcx_file)
root = tree.getroot()
activity = root.Activities.Activity

# run through all the trackpoints and store lat and lon data
trackpoints = []
for trackpoint in activity.xpath('.//ns:Trackpoint', namespaces=namespaces):
    latitude_degrees = add_trackpoint(trackpoint, 'ns:Position/ns:LatitudeDegrees', namespaces)
    longitude_degrees = add_trackpoint(trackpoint, 'ns:Position/ns:LongitudeDegrees', namespaces)
    trackpoints.append((latitude_degrees, longitude_degrees))

# store as dataframe
activity_data = pd.DataFrame(trackpoints, columns=['latitude_degrees', 'longitude_degrees'])

Now we can focus on the Google Maps JavaScript. The documentation is really great, so there is no point in rewriting it myself. This tutorial got me started. In a nutshell, the plan was to create an HTML file that sources the Google Maps JavaScript API and uses its syntax to create a map and plot the route on it.

The following JavaScript code initializes a new map.

var map;
function show_map() {{
    map = new google.maps.Map(document.getElementById("map-canvas"), {{
        zoom: {zoom},
        center: new google.maps.LatLng({center_lat}, {center_lon}),
        mapTypeId: 'terrain'
    }});
}}
What we need to work out is where to centre the map and what the zoom should be. The first task is easy, as you can simply take the average of the minimal and maximal latitude and longitude. Zoom is where things get a bit tricky.
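For example, with the activity_data frame built earlier, that midpoint can be computed directly:

# midpoint of the route's bounding box, used to centre the map
center_lat = (activity_data.latitude_degrees.min() + activity_data.latitude_degrees.max()) / 2
center_lon = (activity_data.longitude_degrees.min() + activity_data.longitude_degrees.max()) / 2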

Zoom is documented here, plus I found an extremely useful answer on stackoverflow. The trick is to take the extreme coordinates of the route and deal with the Mercator projection Google Maps uses, in order to get the zoom needed to show the whole route on one screen. This is done by the functions _get_zoom and _lat_rad, shown further down in the Map class I use.

Once we have a map that is correctly centered and zoomed, we can start plotting the route. This step is done using simple polylines. Such a polyline is initialised by the following JavaScript code.

var activity_route = new google.maps.Polyline({{
    path: activity_coordinates,
    geodesic: true,
    strokeColor: '#FF0000',
    strokeOpacity: 1.0,
    strokeWeight: 2
}});

Here activity_coordinates contains the coordinates of my route.
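To give an idea of how activity_coordinates can be produced from Python (a sketch; the Map class below does essentially the same with the points it has collected):

import math

# render the route as a javascript array of LatLng literals, skipping missing points
activity_coordinates = '[{}]'.format(',\n'.join(
    'new google.maps.LatLng({}, {})'.format(lat, lon)
    for lat, lon in zip(activity_data.latitude_degrees, activity_data.longitude_degrees)
    if not (math.isnan(lat) or math.isnan(lon))))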

I wrapped all this into a Python class called Map that looks as follows:

from __future__ import print_function
import math

class Map(object):
    def __init__(self):
        self._points = []

    def add_point(self, coordinates):
        """
        Adds coordinates to map
        :param coordinates: latitude, longitude
        :return:
        """
        # add only points with existing coordinates
        if not ((math.isnan(coordinates[0])) or (math.isnan(coordinates[1]))):
            self._points.append(coordinates)

    @staticmethod
    def _lat_rad(lat):
        """
        Helper function for _get_zoom()
        :param lat:
        :return:
        """
        sinus = math.sin(math.radians(lat))
        rad_2 = math.log((1 + sinus) / (1 - sinus)) / 2
        return max(min(rad_2, math.pi), -math.pi) / 2

    def _get_zoom(self, map_height_pix=900, map_width_pix=1900, zoom_max=21):
        """
        Algorithm to derive zoom from the activity route. For details please see
        - https://developers.google.com/maps/documentation/javascript/maptypes#WorldCoordinates
        - http://stackoverflow.com/questions/6048975/google-maps-v3-how-to-calculate-the-zoom-level-for-a-given-bounds
        :param zoom_max: maximal zoom level based on Google Map API
        :return:
        """
        # at zoom level 0 the entire world can be displayed in an area that is 256 x 256 pixels
        world_height_pix = 256
        world_width_pix = 256
        # get boundaries of the activity route
        max_lat = max(x[0] for x in self._points)
        min_lat = min(x[0] for x in self._points)
        max_lon = max(x[1] for x in self._points)
        min_lon = min(x[1] for x in self._points)
        # calculate longitude fraction
        diff_lon = max_lon - min_lon
        if diff_lon < 0:
            fraction_lon = (diff_lon + 360) / 360
        else:
            fraction_lon = diff_lon / 360
        # calculate latitude fraction
        fraction_lat = (self._lat_rad(max_lat) - self._lat_rad(min_lat)) / math.pi
        # get zoom for both latitude and longitude
        zoom_lat = math.floor(math.log(map_height_pix / world_height_pix / fraction_lat) / math.log(2))
        zoom_lon = math.floor(math.log(map_width_pix / world_width_pix / fraction_lon) / math.log(2))
        return min(zoom_lat, zoom_lon, zoom_max)

    def __str__(self):
        """
        A Python wrapper around Google Map Api v3; see
        - https://developers.google.com/maps/documentation/javascript/
        - https://developers.google.com/maps/documentation/javascript/examples/polyline-simple
        - http://stackoverflow.com/questions/22342097/is-it-possible-to-create-a-google-map-from-python
        :return: string to be stored as html and opened in a browser
        """
        # centre of the route's bounding box
        center_lat = (max(x[0] for x in self._points) + min(x[0] for x in self._points)) / 2
        center_lon = (max(x[1] for x in self._points) + min(x[1] for x in self._points)) / 2
        # the route as a javascript array of LatLng literals
        journey = ',\n'.join('new google.maps.LatLng({}, {})'.format(lat, lon)
                             for lat, lon in self._points)
        # inject centre, zoom and route into the javascript shown above
        return """
            <script src="https://maps.googleapis.com/maps/api/js?v=3.exp"></script>
            <div id="map-canvas" style="height: 100%; width: 100%"></div>
            <script type="text/javascript">
                var map;
                function show_map() {{
                    map = new google.maps.Map(document.getElementById("map-canvas"), {{
                        zoom: {zoom},
                        center: new google.maps.LatLng({center_lat}, {center_lon}),
                        mapTypeId: 'terrain'
                    }});
                    var activity_coordinates = [{journey}];
                    var activity_route = new google.maps.Polyline({{
                        path: activity_coordinates,
                        geodesic: true,
                        strokeColor: '#FF0000',
                        strokeOpacity: 1.0,
                        strokeWeight: 2
                    }});
                    activity_route.setMap(map);
                }}
                google.maps.event.addDomListener(window, 'load', show_map);
            </script>
        """.format(zoom=self._get_zoom(),
                   center_lat=center_lat,
                   center_lon=center_lon,
                   journey=journey)