
Why Python Has Secured Its Place as the Top Language of the AI Era


一点号 THU数据派 (THU Data Pie), 2 days ago

This article is about 4,200 characters long; suggested reading time: 8 minutes.

It examines Python's strengths and applications in the AI field.



Which language will be the number-one development language of the AI and big-data era?

This should no longer be a question worth debating. Three years ago, Matlab, Scala, R, Java, and Python each still had a chance and the picture was unclear; three years later, the trend is unmistakable. Especially after Facebook open-sourced PyTorch a few days ago, Python's position as the top language of the AI era is essentially settled. The only remaining suspense is who will take the second seat.

Still, there is some noise in the market. A young woman interested in learning data science recently told me that a friend had advised her to start with Java, because big-data infrastructure such as Hadoop is written in Java.

Coincidentally, a personal blog post published last month on IBM developerWorks (https://www.ibm.com/developerworks/community/blogs/jfp/entry/What_Language_Is_Best_For_Machine_Learning_And_Data_Science?lang=en) ran a statistical analysis using data from the job site Indeed.

The article itself is fair and factual, but once it spread to China, some commentators twisted its message, claiming that Python's dominance is not yet established, that the outcome is anyone's guess, and that learners should not blindly follow the trend but should keep hedging their bets across as many languages as possible.

Let me state my position clearly here: for developers hoping to enter the AI and big-data industry, putting all your eggs in the Python basket is not only safe but necessary.

Or, to put it another way: if you want to make it in this industry, don't overthink it; just learn Python first.

Of course, Python is not without problems and shortcomings. You can, and probably should, pair it with one or even several other languages. But there is no doubt that Python will hold the number-one position for data analysis and AI.

I would even argue that, because Python has secured this position, because the industry will need practitioners in huge numbers, and because Python is rapidly becoming the first-choice teaching language for introductory programming courses in schools and universities worldwide, this open-source dynamic scripting language has a real chance of becoming, in the near future, the first true lingua franca of programming.

Debating the merits and fortunes of programming languages has long been considered flame-war material, beneath the notice of seasoned developers. But I believe Python's rise this time is a big deal.

Imagine this: if, fifteen years from now, every knowledge worker under forty, in China or abroad, from doctors to civil engineers, from office secretaries to film directors, from composers to salespeople, could use the same programming language for basic data processing, for calling AI APIs in the cloud, for driving intelligent robots, and thereby for exchanging ideas with one another, then that network of universal programming collaboration would matter far more than any language war. As things stand, Python is the strongest candidate for that role.

Python's victory is surprising, because its shortcomings are glaring.

Its syntax is a school of its own, which many veterans find off-putting. "Bare" Python is slow, roughly tens to thousands of times slower than C depending on the task. Because of the Global Interpreter Lock (GIL), a single Python program cannot execute concurrently across multiple cores. Python 2 and Python 3 have coexisted for years, forcing many modules to maintain two versions and creating needless confusion and hassle for developers. And because it is not controlled by any single company, no tech giant has ever thrown its full weight behind Python.

So, relative to how widely Python is applied, the investment and support behind its core infrastructure have been remarkably thin.



Even today, at 26 years old, Python still has no official, standard JIT compiler. By comparison, Java got a standard JIT within the first three years after its release.

Another episode is even more telling. The core GIL code was written in 1992 by the language's creator, Guido van Rossum, and for the next eighteen years not one person changed a single byte of that critical code.

Eighteen years! Not until 2010 did Antoine Pitrou make the first improvement to the GIL in nearly two decades, and even that landed only in Python 3.x. Which means that today, every program written by the majority of developers still on Python 2.7 remains firmly constrained by a piece of code written 26 years ago.

Speaking of Python's shortcomings reminds me of a small anecdote of my own. Years ago I declared in an article that I was bullish on Python and bearish on Ruby.

About two years ago, a netizen found me on Weibo and reproached me at length: because he had read that article, he had been led astray, devoted himself single-mindedly to Python, and kept Ruby at a respectful distance.

He had indeed mastered Python, but having recently picked up Ruby and found it so lovely, so sweet, he was beside himself with joy, and then angrily realized that I had misled him completely; he had missed the most beautiful programming language during the best years of his life.

I didn't argue with him further, and I don't know whether he has since successfully retrained from Python backend, big-data analysis, machine-learning, and AI engineer into a Rails rapid-development expert. I was only left feeling that truly recognizing the value of something is not easy.

Python, then, is a racing driver who charged into the leading pack carrying all sorts of ailments. Even a few years ago, few believed it had a chance at the crown: many thought Java's position was unshakable, and others said everything would be rewritten in JavaScript.

But look again today. Python is already the first language of data analysis and AI, the first hacker language for network attack and defense, is becoming the first language of introductory programming education, and the first language of cloud system administration.

Python has also long been one of the mainstream languages for web development, game scripting, computer vision, IoT management, and robotics, and with the growth in Python users that we can expect, it has a chance to reach the top in several more fields.

And don't forget: the vast majority of future Python users will not be professional programmers, but the people who today use Excel, PowerPoint, SAS, Matlab, and video editors.

Take AI. We should first ask: where will the bulk of the AI workforce be? If we look at the question statically, today, you might answer that it consists of AI scientists at research institutes, and machine-learning and algorithm experts with doctorates.

But Kai-Fu Lee's "three stages of the AI dividend," which I discussed last time, makes it clear that if you look just three to five years ahead, the AI industry's workforce will form a huge pyramid. The AI scientists above are only the tiny tip; 95% or more of AI practitioners will be AI engineers, application engineers, and users of AI tools.

I believe almost all of these people will be swept up by Python, becoming its vast reserve army.

These potential Python users are still outside the tech circle today, but as AI applications develop, teachers, office workers, engineers, translators, editors, doctors, salespeople, managers, and civil servants in their millions will pour into the Python and AI wave, carrying the domain knowledge and data resources of their respective fields, and will profoundly reshape the entire IT, or rather DT (data technology), industry.

So why was Python able to come from behind?

Speaking in generalities, I could easily list some of Python's virtues: clean and elegant language design, programmer-friendliness, high development productivity. But I don't think these are the root cause, because several other languages do just as well on those counts.

Some argue instead that Python's advantage lies in its resources: solid infrastructure for numerical algorithms, charting, and data processing, and an excellent ecosystem that attracted scientists and domain experts of every stripe, rolling the snowball ever larger.

But I think that mistakes the effect for the cause. Why was Python, of all languages, able to attract users and build such good infrastructure? Why doesn't PHP, "the best language in the world," have libraries on the level of NumPy, NLTK, scikit-learn, pandas, and PyTorch? Why did JavaScript's explosive boom leave a sprawl of libraries of wildly uneven quality, while Python's libraries are both abundant and orderly, maintaining a high standard?

I believe the most fundamental reason comes down to one point: Python is the only mainstream language whose strategic positioning is clear and which has held to that original positioning without wavering. Too many other languages keep letting unprincipled tactical diligence erode and blur their strategic positioning, and can only end up second-rate.

What is Python's strategic position? It is simple: to be a simple, easy-to-use yet professional and rigorous general-purpose glue language, one that lets ordinary people get started easily and snap basic program components together into a coordinated whole.

Precisely because it holds to this position, Python has always put the beauty and consistency of the language itself ahead of clever tricks, developer efficiency ahead of CPU efficiency, and breadth of horizontal reach ahead of depth of vertical specialization. Holding to these strategic choices over the long term has given Python an ecosystem other languages can only envy.

For example, anyone willing to learn can master the basics of Python in a few days and then get a great many things done; that return on investment is probably unmatched by any other language.

Another example: precisely because the Python language itself is slow, developers writing the heavily used core libraries lean on C to work alongside it, with the result that real programs built with Python run very fast, because quite possibly more than 80% of the time the system is executing code written in C.

Conversely, had Python refused to accept this and insisted on competing on speed, the likely outcome is that its bare speed would have improved a few times over, but nobody would have had an incentive to develop C modules for it. The final speed would fall far short of the hybrid model, and the language itself would probably have grown more complex: the result, a slow and ugly language.

More importantly, Python's wrapping ability, composability, and embeddability are all excellent; every kind of complexity can be packed inside a Python module and exposed through a beautiful interface.

Very often a library itself is written in C/C++, yet you find that calling it directly from C or C++, from environment setup through to the interface calls, is a real hassle, while going through its Python wrapper one level removed is cleaner and tidier, fast and pretty. In the AI field, these traits become Python's formidable advantages.

Riding AI and data science, Python has also climbed to the top position in the programming-language food chain. Bound together with AI, Python can regard e-commerce, search engines, social networks, and smart hardware alike as mere downstream links in the ecosystem: data cows, electronic nerves, and execution tools, all answering to it.



People with little knowledge of the history of programming languages may feel that Python's strategic positioning is cynical and unambitious. But the facts show that being simple yet rigorous, easy to use yet professional, all at once, is very hard, and holding fast to the glue-language position is harder still.

Some languages were designed from the start for academic rather than practical ends; their learning curves are too steep for ordinary people to approach.

Some languages depend too heavily on the commercial backing of a corporate patron: boundlessly glamorous when in favor, struggling even to survive once cast aside.

Some languages were designed with a specific scenario in mind, whether massive concurrency, matrix computation, or web template rendering, and the moment they leave that scenario, everything about them grates.

Still more languages, having barely tasted success, rush to become all-around champions, straining to extend tentacles in every direction, often with excessive zeal in expressiveness and performance, not hesitating to mutate the core language beyond recognition, until they end up as behemoths nobody can control.

By contrast, Python is a model of success in modern programming-language design and evolution.

The reason Python's strategic positioning is so clear, and its strategic discipline so firm, comes down in the end to its community having built an exemplary decision-making and governance mechanism.

This mechanism centers on Guido van Rossum (the BDFL; Pythonistas all know what that means), David Beazley, Raymond Hettinger, and others, with the PEP process as its organizing platform: democratic yet orderly, centralized yet open-minded. As long as this mechanism itself is sustained, Python will keep climbing steadily for the foreseeable future.

The most likely challenger to Python is, of course, Java. Java has a large installed base of users, and it too is a language with a clear and very firm strategic position.

But I don't see much of a chance for Java, because it is at heart a language designed for building large, complex systems. What is a large, complex system? One that people describe and construct explicitly and precisely, whose scale and complexity are exogenous, conferred from outside.

The essence of AI, by contrast, is a self-learning, self-organizing system whose scale and complexity are grown by a mathematical model fed on data; they are endogenous.

As a result, most of Java's language constructs find no purchase in big-data processing and AI development: its strengths are of no use here, and what is needed here is awkward for it to do.

Python's concise power in data handling, meanwhile, has long been common knowledge. Put a Java and a Python machine-learning program with identical functionality side by side, and any normal person needs only a glance to judge that the Python program is the cleaner and more satisfying of the two.

Around 2003 or 2004 I bought a Python book whose author was Brazilian. He wrote that he had committed firmly to Python because, as a child, he often dreamed that the future world would be ruled by a great python (the snake).

At the time I felt sorry for the fellow: even his dreams were that terrifying. Seen from today, though, perhaps he was simply like Anderson, the programmer in The Matrix, who slipped accidentally into the future and glimpsed the truth of the world.

Typesetting for this issue: Lu Miaomiao


Character-Set Encodings and Python's Encoding History


Yesterday; source: cnblogs

ASCII (American Standard Code for Information Interchange) is a character encoding system based on the Latin alphabet, used mainly to display modern English and other Western European languages. ASCII is a character encoding, and also the simplest character-set encoding: it maps 128 characters to the integers 0 through 127.

In ASCII, although what you see is a character, what the computer stores is a binary integer. You may see abcd, but internally the computer stores 97 98 99 100.
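A quick illustration in Python (my addition; any Python 3 interpreter will do):

# Each ASCII character is stored as a small integer.
for ch in "abcd":
    print(ch, ord(ch))   # prints: a 97, b 98, c 99, d 100
print(chr(97))           # 'a': from the integer back to the character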

ASCII is a very simple 7-bit character encoding system; each character is stored in the computer as one byte. For example, I installed an ascii utility; below is the detailed information:

[root@iZ28b4rx8dxZ ~]# yum install ascii
[root@iZ28b4rx8dxZ ~]# ascii
Usage: ascii [-dxohv] [-t] [char-alias...]
   -t = one-line output
   -d = Decimal table
   -o = octal table
   -x = hex table
   -h = This help screen
   -v = version information
Prints all aliases of an ASCII character. Args may be chars, C \-escapes,
English names, ^-escapes, ASCII mnemonics, or numerics in decimal/octal/hex.

Dec Hex     Dec Hex     Dec Hex   Dec Hex   Dec Hex   Dec Hex   Dec Hex   Dec Hex
  0 00 NUL   16 10 DLE   32 20     48 30 0   64 40 @   80 50 P   96 60 `  112 70 p
  1 01 SOH   17 11 DC1   33 21 !   49 31 1   65 41 A   81 51 Q   97 61 a  113 71 q
  2 02 STX   18 12 DC2   34 22 "   50 32 2   66 42 B   82 52 R   98 62 b  114 72 r
  3 03 ETX   19 13 DC3   35 23 #   51 33 3   67 43 C   83 53 S   99 63 c  115 73 s
  4 04 EOT   20 14 DC4   36 24 $   52 34 4   68 44 D   84 54 T  100 64 d  116 74 t
  5 05 ENQ   21 15 NAK   37 25 %   53 35 5   69 45 E   85 55 U  101 65 e  117 75 u
  6 06 ACK   22 16 SYN   38 26 &   54 36 6   70 46 F   86 56 V  102 66 f  118 76 v
  7 07 BEL   23 17 ETB   39 27 '   55 37 7   71 47 G   87 57 W  103 67 g  119 77 w
  8 08 BS    24 18 CAN   40 28 (   56 38 8   72 48 H   88 58 X  104 68 h  120 78 x
  9 09 HT    25 19 EM    41 29 )   57 39 9   73 49 I   89 59 Y  105 69 i  121 79 y
 10 0A LF    26 1A SUB   42 2A *   58 3A :   74 4A J   90 5A Z  106 6A j  122 7A z
 11 0B VT    27 1B ESC   43 2B +   59 3B ;   75 4B K   91 5B [  107 6B k  123 7B {
 12 0C FF    28 1C FS    44 2C ,   60 3C <   76 4C L   92 5C \  108 6C l  124 7C |
 13 0D CR    29 1D GS    45 2D -   61 3D =   77 4D M   93 5D ]  109 6D m  125 7D }
 14 0E SO    30 1E RS    46 2E .   62 3E >   78 4E N   94 5E ^  110 6E n  126 7E ~
 15 0F SI    31 1F US    47 2F /   63 3F ?   79 4F O   95 5F _  111 6F o  127 7F DEL
[root@iZ28b4rx8dxZ ~]#

Latin-1

Although ASCII is convenient, one byte per character is often not enough. For example, various symbols and accented characters fall outside the range of characters ASCII defines.

To accommodate special characters, some standards use all eight bits of a byte to represent characters, so that one byte can represent up to 256 characters. One such standard is Latin-1.

Latin1 is an alias of ISO-8859-1, in some contexts written Latin-1. ISO-8859-1 is a single-byte encoding, backward compatible with ASCII. Its range is 0x00-0xFF: 0x00-0x7F is fully identical to ASCII, 0x80-0x9F are control characters, and 0xA0-0xFF are printable symbols.

Because Latin-1 uses all of the space within a single byte, a byte stream in any other encoding can be transmitted and stored on a Latin-1 system without anything being discarded. In other words, treating a byte stream of any encoding as Latin-1 never causes a problem. This is an important property; MySQL's default encoding being Latin1 exploits exactly this.

In Latin-1, character codes above 127 are assigned to accented and other special characters.

To summarize the differences between Latin-1 and ASCII:

1. A byte in a computer is 8 binary bits, which can represent 256 characters, but ASCII defines only 128. In other words, the eighth bit goes unused; only the lower 7 bits represent ASCII. Latin-1 puts that eighth bit to work, that is, it also uses the highest bit of the byte.

2. Although Latin-1 extends ASCII to represent at most 256 characters, it still covers only a handful of Western languages. It cannot represent the world's languages in general, especially ideographic scripts such as Chinese (writing systems whose symbols record words or morphemes rather than directly or purely representing sounds).

GB 2312

GB 2312 is a Chinese national standard for encoding simplified Chinese characters. It encodes Chinese characters with two bytes and consists of 6,763 common Chinese characters and 682 full-width non-Chinese characters. The Chinese characters are split into two tiers by frequency of use: 3,755 first-tier characters and 3,008 second-tier characters. Because the number of characters is large, GB 2312 uses a two-dimensional matrix scheme to encode all of them.

BIG5

Big5 is the traditional-Chinese character-set standard of the Taiwan region, also a double-byte encoding, also known as the "Big Five" code. I won't go into the specifics; there is plenty of material online.

GBK

GBK is a Chinese national character-encoding standard released in December 1995, an extension of the GB 2312 encoding, again encoding Chinese characters with two bytes. The GBK character set contains 21,003 Chinese characters, covering all of the CJK characters in the national standard GB 13000-1 as well as every character in the Big5 encoding.

At the very beginning, before Unicode appeared, countries and regions devised their own encoding systems: Chinese had its own encodings, such as GB 2312 and Big5; Japanese had its own as well, mainly JIS. This caused confusion and inconvenience. A text mixing several languages could not be represented correctly, so the encodings could not be used together.

Unicode

Unicode is an industry standard in computer science that covers character sets, encoding schemes, and more. Unicode arose to overcome the limitations of traditional character encodings: it catalogs and encodes most of the world's writing systems, allowing computers to present and process text in a simpler way and meeting the demands of cross-language, cross-platform text conversion and processing.

Unicode defines a unique code (an integer) for every character, not for every glyph. In other words, Unicode handles characters abstractly (as numbers), leaving the work of visual rendering (font size, appearance, shape, style, and so on) to other software, such as a web browser or word processor.

Each character maps to an integer code point, in the range 0 to 0x10FFFF. A Unicode character is conventionally written as "U+" followed by a group of hexadecimal digits. In Python it is written \uxxxx, so if you see a \u followed by four hexadecimal digits while writing Python, that character is being specified by its Unicode code point.

Unicode also defines various ways of storing code points, called Unicode Transformation Formats (UTF).

The popular UTFs today are UTF-8, UTF-16, and UTF-32. Each UTF stores a code point as one or more code units: UTF-8's code unit is the 8-bit byte, UTF-16's is 16 bits, and UTF-32's is 32 bits. Apart from UTF-32, both UTF-8 and UTF-16 are variable-length encodings.

UTF-8 has become the most popular format on the internet, for several reasons:

1. It uses 8-bit code units, and eight bits is exactly one byte; in other words, its unit of encoding is the byte, so there are no byte-order (big-endian vs. little-endian) issues.

Big-endian mode: the low-order part of a value is stored at the higher memory address and the high-order part at the lower address, which matches our reading habits; addresses grow in the opposite direction to significance. Little-endian mode: the low-order part is stored at the lower address and the high-order part at the higher address; addresses grow in the same direction as significance.

With a single byte there is no endianness problem. With two bytes, say a and b, you must consider whether a or b comes first: big-endian puts a first, little-endian puts b first. The x86 architecture we commonly use is little-endian.

2. If a program originally stores characters as bytes, in theory it can process UTF-8 data without any special changes.

3. Each ASCII character needs only one byte of storage.

The implementation of UTF-8

First, to be clear: UTF-8 and Unicode are not in competition with each other.

UTF-8's code unit is the 8-bit byte, and each code point is encoded as 1 to 4 bytes. The encoding scheme is simple: according to the range the code point falls in, its binary digits are split across 1 to at most 4 bytes.
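For reference, here are the standard UTF-8 byte patterns by code-point range (the original article appears to have shown this table as an image; it is reconstructed here from the UTF-8 specification):

U+0000  - U+007F    ->  0xxxxxxx
U+0080  - U+07FF    ->  110xxxxx 10xxxxxx
U+0800  - U+FFFF    ->  1110xxxx 10xxxxxx 10xxxxxx
U+10000 - U+10FFFF  ->  11110xxx 10xxxxxx 10xxxxxx 10xxxxxx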

One benefit of the UTF-8 scheme is that code points in the range U+0000 through U+007F are encoded as a single byte, compatible with ASCII; the Unicode code points in this range are the same as the corresponding ASCII characters. An ASCII text is therefore also a UTF-8 text.

In other words, encoding Unicode as UTF-8 follows its own well-defined set of rules.



Let's work through an example in Python:

Encode "张三" (a common Chinese name) into UTF-8.

The conversion process goes as follows:

Translate 5f20 and 4e09 (the code points of 张 and 三) into binary, then convert to UTF-8.

5f20:

In binary:

0101 1111 0010 0000

Following the code-point ranges above, 5f20 falls in the third range (three bytes), so it becomes 1110+0101 (four bits) + 10+111100 (six bits) + 10+100000 (six bits):

1110 0101 1011 1100 1010 0000

= e5 bc a0

4e09:

In binary:

0100 1110 0000 1001

It also falls in the third range, so it becomes 1110+0100 (four bits) + 10+111000 (six bits) + 10+001001 (six bits):

1110 0100 1011 1000 1000 1001

= e4 b8 89
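We can verify the hand calculation in Python 3 (a quick check added here; it was not part of the original walkthrough):

s = "张三"
print([hex(ord(c)) for c in s])   # ['0x5f20', '0x4e09']
encoded = s.encode("utf-8")
print(encoded.hex())              # e5bca0e4b889
assert encoded == b"\xe5\xbc\xa0\xe4\xb8\x89"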

The next post will dig into Unicode and UTF-8 in Python in detail.

Conda's New Noarch Packages


Beginning with conda version 4.3 and conda-build 2.1, two new types of noarch packages are supported. Noarch Python packages cut down on the overhead of building multiple different pure-Python packages for different architectures and Python versions by sorting out platform- and Python-version-specific differences at install time. Noarch generic packages allow users to distribute docs, datasets, and source code in conda packages.

It's true that conda-build has had a noarch_python option for a while, but the user experience has been suboptimal. The deprecated noarch_python flag adds a pre-link script to the package to handle the several install-time platform-dependent differences. Our noarch Python implementation teaches conda itself about noarch packages of the Python type. The noarch Python install logic is moved out of the package and into conda, where any extra capabilities and bugs can be directly addressed without the need to rebuild the package.

How to Build Noarch Packages

To build a noarch Python package, specify noarch in your meta.yaml:

build:
  noarch: python

Similarly, to build a noarch generic package, specify noarch in your meta.yaml:

build:
  noarch: generic

While there are currently only two flavors of supported noarch packages, generic and python, we'll likely extend the concept in future releases.

The Anatomy of a Python Noarch Package

Similar to a regular conda package, a noarch Python package contains site-packages and info directories that define the package. In addition to the standard files in the info directory there is package_metadata.json, which defines the type of noarch package and any entry points or scripts. Entry points, defined in the setup.py style entry_points['console_scripts'], will be created by conda when the package is installed. These packages also do not contain any .pyc files, since these differ among Python versions. Instead, generation of .pyc files is handled by conda at install time. All other scripts associated with the package, for example those found in bin or Scripts, will be included in the python-scripts directory.

The package structure will look something like:

package
- info/
  - files
  - about.json
  - index.json
  - package_metadata.json
  - recipe/
    - ...
- site-packages/
  - ...
- python-scripts/
  - ...

package_metadata.json will have a noarch section that looks something like:

{
  "noarch": {
    "type": "python",
    "entry_points": [
      "pkg = pkg.foo:main"
    ]
  }
}

The Noarch Python Package Build Process

By defining noarch: python in meta.yaml, conda-build will create a noarch Python package as defined above, without any .pyc files or __pycache__ directories. It will also create an info/package_metadata.json file with information about the type of noarch package and the entry points.

For example, consider the flask package on the anaconda-recipes repo. This is a pure Python package that can easily be turned into a noarch package by a slight modification to the meta.yaml file:

build:
  noarch: python
  entry_points:
    - flask = flask.cli:main

Then build the package as you would any normal conda package: conda build . The resulting package is a noarch flask package, installable on any architecture and Python version (that the package itself supports).

The Noarch Python Package Install Process

To install these packages, conda will map the site-packages and python-scripts directories to the corresponding correct locations within the install prefix. It will then generate the entry points for the package, if applicable. On Windows systems, it will include the shim script required for entry points to work. Finally, conda will compile .pyc files. From the user's perspective, installing a noarch package is the same as installing any other, and is as simple as conda install <package>.

Uninstalling noarch packages works the same way that uninstalling regular conda packages works. That is, conda uninstall <package> will remove it from the environment.

Looking to the Future

This new way of treating noarch packages aims to provide users with a flexible way of creating conda packages that are platform- and Python-version agnostic. Further, it should provide the flexibility to create noarch packages for other interpreted languages, like R, Lua, and Ruby.

Wingware News: Wing IDE 6.0.2: February 2, 2017


Wingware has released version 6.0.2 of Wing IDE, our cross-platform integrated development environment for the Python programming language.

Wing IDE is a Python IDE with powerfully integrated editing, debugging, unit testing, and project management features. Wing runs on Windows, Linux, and OS X, to make all your Python development fast, accurate, and fun.

Changes in 6.0.2

This release of Wing IDE (1) adds support for remote development using OpenSSH via Cygwin or Git Bash on Windows, (2) allows X11 forwarding for remote development, (3) adds a drop-down of found Python installations to Python Executable properties, (4) introduces refactoring operations to convert symbols between lowerCamelCase, UpperCamelCase, under_scored_name, and UNDER_SCORE_NAME styles, (5) improves adding selections by clicking, (6) fixes debugging Jupyter notebooks, and makes about 40 other improvements. See the change log for details.

New in Wing IDE 6

Wing 6 is a major release with many new features.


Improved Multiple Selections

Wing Pro and Personal 6 make working with multiple selections on the editor much easier. The Edit > Multiple Selections menu and selections toolbar item can be used to select matching occurrences of text one at a time, or within the current block, function, method, class, or file. Once multiple selections are made, edits are applied to all of them.

For details, see the Multiple Selections documentation.


Easy Remote Development

Wing Pro 6 adds the ability to connect to a remote host through a secure SSH tunnel in order to work with files stored remotely in the same way that Wing supports working with files on your local system.

This is done by setting up SSH access to a remote host outside of Wing IDE and then configuring the remote host from Wing's Project > Remote Hosts menu item and using that host for the Python Executable in Project Properties . Files and directories added to the project can be on any configured remote host, and the project file can be stored either remotely or locally. Editing, debugging, testing, searching, version control, Python Shell, OS Commands, and other features work with remote files as if they were stored locally.

For detailed instructions see Remote Hosts.


Debugging in the Python Shell

All product levels of Wing 6 make it possible to turn on debugging for code that is executed in the Python Shell . This is done by pressing the bug icon in the top right of the Python Shell tool. Once enabled, a breakpoint margin will appear and Wing's debugger will stop on any breakpoints and exceptions reached in code, either within the Python Shell tool or in source files.

For details see Debugging Code in the Python Shell . If you are debugging multi-threaded code from the shells, you will want to read and understand how threads are managed in this style of debugging.


Recursive Debugging

In Wing Pro it is possible to debug code invoked from the Debug Probe , allowing for one level of recursive debugging.

The Python Shell and Debug Probe can also debug recursively to any depth by checking Enable Recursive Prompt in their Options menus. When enabled, Wing displays a new prompt whenever the debugged code reaches a breakpoint or exception, so that you can continue to interact with the paused debug process from the command line, optionally debugging other code in context of the currently selected debug stack frame. Continuing or stopping debug will exit one level of recursion rather than exiting the debug process entirely.


PEP 484 and PEP 526 Type Hints

Wing 6 can understand type hints in the style standardized by PEP 484 (Python 3.5+) and PEP 526 (Python 3.6+).

For details see Helping Wing Analyze Code .


Improved Raspberry Pi Support

Wing Pro 6 makes it much easier to work with code on the Raspberry Pi, using the new remote development support to set up remote access to the Raspberry Pi.

For details, see Using Wing IDE with Raspberry Pi .


Annual License Option and Pricing Changes

Wing 6 adds the option of purchasing a lower-cost expiring annual license for Wing IDE Pro. An annual license includes access to all available Wing IDE versions while it is valid, and then ceases to function if it is allowed to expire. Pricing for annual licenses is US$ 179/user for Commercial Use and US$ 69/user for Non-Commercial Use.

The cost of extending Support+Upgrades subscriptions on Non-Commercial Use perpetual licenses for Wing IDE Pro has been dropped from US$ 89 to US$ 39 per user.

For details, see the Wingware store.


Wing IDE Personal is Now Free

Wing IDE Personal is now a free product and no longer requires a license to run. It now also includes the Source Browser , PyLint , and OS Commands tools, and supports the scripting API and Perspectives.

However, Wing Personal does not include Wing Pro's advanced editing, debugging, testing and code management features, such as remote host access, refactoring, find uses, version control, unit testing, interactive debug probe, multi-process and child process debugging, move program counter, conditional breakpoints, debug watch, framework-specific support (for matplotlib, Django, and others), find symbol in project, and other features.


Other Improvements

Wing 6 adds many other new features and improvements, including the following:

Support for Python 3.6 and Stackless 3.4
Optimized debugger, particularly for multi-threaded and multi-process code
Support for OS X full-screen mode
Restore editor selection after undo and redo
Added One Dark color palette
Support for Move Program Counter in recent Python versions
Refactoring operations to convert easily between lowerCamelCase, UpperCamelCase, under_scored_name, and UNDER_SCORE_NAME symbol name formatting
Holding modifier keys and clicking in the Key field for the Custom Key Bindings preference produces a binding (for example, Ctrl-Right-button-click) that can be bound to a command
Better support for portable installs, by allowing auto-activation of a stored license and using --settings and --cache command line arguments to specify the location of the settings and cache directories
Always move breakpoints and Debug To Here positions to valid lines that will actually be reached by the Python interpreter
Support for custom Python builds on Windows
Automatically find Python installations that follow PEP 514 on Windows
Updated French localization (thanks to Jean Sanchez and Laurent Fasnacht)
Updated German localization (thanks to Christoph Heitkamp)

Not all of these features are available in Wing IDE Personal and Wing IDE 101. See the feature matrix for details.

Wing 6 installs side by side with earlier versions of Wing, so there is no need to remove old versions in order to try Wing 6. Wing 6 will read and convert Wing 5 preferences, settings, and projects. Projects should be saved to a new name since earlier versions of Wing cannot read Wing 6 projects.

See Upgrading for details and Migrating from Older Versions for a list of compatibility notes.

More About Wing IDE

Wing IDE is an integrated development environment designed specifically for the Python programming language. It integrates powerful editing, testing, debugging, and project management features to help reduce development and debugging time, cut down on coding errors, and make it easier to understand and navigate Python code. Wing IDE can be used to develop any kind of Python code for web, GUI, embedded scripting, and other applications.

Wing IDE is available in three product levels: Wing IDE Professional is the full-featured Python IDE, Wing IDE Personal is a free alternative that offers a reduced feature set for students and hobbyists, and Wing IDE 101 is a very simplified free product designed for teaching beginning programming courses with Python.

Version 6 of Wing IDE Professional includes the following major features:

Native user interface on OS X, Windows, and Linux
Powerful code editor with vi, emacs, Visual Studio, Eclipse, XCode, and other keyboard personalities
Code intelligence for Python: auto-editing, auto-completion, call tips, find uses, goto-definition, error indicators, refactoring, find symbol, smart indent and rewrapping, source navigation, and support for PEP 484, PEP 526, and other type hinting styles
Advanced multi-process and multi-threaded debugger with graphical UI, command line interaction, conditional breakpoints, data value tooltips, watch tool, move program counter, sharable launch configurations, named entry points, interactive Python shell debugging, recursive debugging, and externally launched and remote debugging
Easy remote development via secure SSH tunnels
Powerful search and replace options including keyboard driven and graphical UIs, multi-file, wild card, and regular expression search and replace
Version control integration for Subversion, CVS, Bazaar, git, Mercurial, and Perforce
Integrated unit testing with unittest, pytest, nose, doctest, and Django testing frameworks
Many other features including project manager, multiple selections, bookmarks, recursive code snippets, diff/merge tool, integrated OS command invocation, indentation manager, PyLint integration, named file sets, and perspectives
Extremely configurable and may be extended with Python scripts
Extensive product documentation, tutorial, and How-Tos for Django, Flask, Google App Engine, matplotlib, Raspberry Pi, Plone, wxPython, PyQt, mod_wsgi, Autodesk Maya, blender, NUKE/NUKEX, and many other Python libraries and applications
Django support: debugs Django templates, provides project setup tools, and runs Django unit tests

For more information, see the Wing IDE product overview and the feature comparison for a detailed listing of features by product level.

System requirements are Windows 7 or later, OS X 10.7 or later, or a recent 64-bit Linux system. Remote development is also supported to 32-bit and 64-bit Linux systems that are compatible with PEP 513's manylinux1 policy, and Raspberry Pi May 2016 or newer. Wing IDE 6.0 supports Python versions 2.5 through 2.7 and 3.2 through 3.6 and Stackless Python.

Downloads

Wing IDE Pro -- A full-featured Python IDE. Requires a license to run, or obtain a free trial when you start the product.

Wing IDE Personal -- A free simplified IDE for students and hobbyists.

Wing IDE 101 -- A very simplified free IDE for teaching beginning programmers with Python.

Purchasing and Upgrading Wing Pro Licenses

Purchase a new license -- For Perpetual or Annual Use licenses for Wing IDE Pro version 6.x.

Upgrade a license -- For users of Wing Pro 5.x and earlier licenses that are not covered by Support+Upgrades.

Django Conditional Expressions in Queries


Django Conditional Expressions were added in Django 1.8.

By using conditional expressions, we can use "if ... elif ... else" logic while querying the database.

Conditional expressions execute a series of conditions while querying the database: the conditions are checked against every record of the table, and the matching results are returned.

Conditional expressions can be nested and also can be combined.

The following are the conditional expressions in Django. Consider the model below for the sample queries.

class Employee(models.Model):
    ACCOUNT_TYPE_CHOICES = (
        ("REGULAR", 'Regular'),
        ("GOLD", 'Gold'),
        ("PLATINUM", 'Platinum'),
    )
    name = models.CharField(max_length=50)
    joined_on = models.DateField()
    # DecimalField requires max_digits and decimal_places
    salary = models.DecimalField(max_digits=10, decimal_places=2)
    account_type = models.CharField(
        max_length=10,
        choices=ACCOUNT_TYPE_CHOICES,
        default="REGULAR",
    )

1. WHEN

A When() object is used as a condition inside the query.

from django.db.models import When, F, Q
>>> When(field_name1_on__gt=date(2014, 1, 1), then="field_name2")  # if we want the value in another field
>>> When(field_name1_on__gt=date(2014, 1, 1), then=5)  # we can specify an external value in place of "5"
>>> When(Q(name__startswith="John") | Q(name__startswith="Paul"), then="name")  # we can also use nested lookups

2. CASE

A Case() expression is like the if ... elif ... else statement in Python. It executes the conditions one by one until one of them is satisfied. If no condition is satisfied, the default value is returned if it is provided; otherwise None is returned.

from django.db.models import CharField, Case, Value, When
>>>ModelName.objects.annotate(
... field=Case(
... When(field1="value1", then=Value('5%')),
... When(field1="value2", then=Value('10%')),
... default=Value('0%'),
... output_field=CharField(),
... ),
... )

Suppose we want to update the account type to PLATINUM if an employee has more than 3 years of experience, to GOLD if the employee has more than 2 years of experience, and otherwise to REGULAR.

For this we can write the code in two ways:

Case 1: a good query, without conditional expressions

>>> from datetime import date, timedelta
>>> above_3yrs = date.today() - timedelta(days=365 * 3)
>>> above_2yrs = date.today() - timedelta(days=365 * 2)
>>> # Apply GOLD first, then PLATINUM, so the oldest rows end up PLATINUM
>>> Employee.objects.filter(joined_on__lt=above_2yrs).update(account_type="GOLD")
>>> Employee.objects.filter(joined_on__lt=above_3yrs).update(account_type="PLATINUM")

The code above hits the database twice to apply the change.

Case 2: a better query, using conditional expressions

>>> from datetime import date, timedelta
>>> above_3yrs = date.today() - timedelta(days=365 * 3)
>>> above_2yrs = date.today() - timedelta(days=365 * 2)
>>> Employee.objects.update(
...     account_type=Case(
...         When(joined_on__lte=above_3yrs,
...              then=Value("PLATINUM")),
...         When(joined_on__lte=above_2yrs,
...              then=Value("GOLD")),
...         default=Value("REGULAR")
...     ),
... )

This hits the database only once. By reducing the number of queries against the database, we can ultimately improve efficiency and response time.

Rene Dudfield: Where the code & data things are. Part 3.


This is part three of a series of articles about packaging Python games. Part one, Part two. More discussion is happening on the pygame mailing list.

TLDR; I think we should follow the Python conventions and fix the problems they cause.

src, gamelib, and mygamepackage

1) One thing that is different between the skellington layout and the sampleproject one is that the naming in skellington is more specific about where the code goes.

Skellington layout:

gamelib/

data/

Why is this good? Because you can start writing your code without first having to decide on a name. It's a small thing, but in the context of game competitions it matters more. I'm not sure it's really worth keeping that idea, though.

Sampleproject layout after doing "skellington create mygamepackage":

mygamepackage/

data/

(Where skellington is the name of our tool. It could be pygame create... or whatever)

The benefits of this are that you can go into the repo and do:

import mygamepackage

And it works, because mygamepackage is just a normal package.

You can also do:

python mygamepackage/run.py

Whilst naming is important, the name of the package doesn't need to be the name of the game. I've worked on projects where the company name and app name changed at least three times whilst the package name stayed the same. So I don't think people need to worry so much. Also, it's not too hard to change the package name later on.

My vote would be to follow the Python sampleproject way of doing things.

Data folder, and "get_data"

2) The other aspect I'm not sure of is having a data folder outside of the source folder. Having it inside the source folder means you can more easily include it inside a zip file, for example. It also makes packaging slightly easier, since from within a MANIFEST.in you can just use a recursive include of the whole "mygamepackage" folder.

Having data/ separate is the choice of the sampleproject, and the skellington.

I haven't really seen a modern justification for keeping data out of the package folder. I have vague recollections of the reason being "because Debian does it". My recollection is that Debian traditionally did it to keep code updates smaller: if you only change 1KB of source code, there's no point in shipping a 20MB update every time.

A bonus of keeping data/ separate is that it forces you to use relative addressing of file locations; you don't hardcode "data/myfile.png" in all your paths. Do we recommend the Python way of finding the data folder? That means the package_data and data_files setup attributes: https://github.com/pypa/sampleproject/blob/master/setup.py

They are a giant pain. One, because they require mentioning every single file rather than just the whole folder. Two, because they require updating configuration in both MANIFEST.in and setup.py. Also, different files get included depending on which Python packaging option you use.

See the packaging.python.org documentation on including data files: https://packaging.python.org/distributing/?highlight=data#data-files

Another issue is that, using the Python way, pkg_resources from setuptools needs to be used at runtime. pkg_resources gets you access to the resources in an abstract way, which means you need setuptools at runtime (not just at install time). There is already work going into keeping that separate here: https://github.com/pypa/pkg_resources So I'm not sure this will be a problem in the coming months.

I haven't confirmed if pkg_resources works with the various .exe making tools. I've always just used file paths. Thomas, does it work with Pynsist?

A single-file .exe on Windows, including all of the data, used to be possible with pygame. It worked by appending a .zip file to the .exe and then decompressing it before running. It actually made startup slower, but the benefit was that distribution was pretty easy. However, putting everything in a .zip file was just as good.

Perhaps we could work on adding a find_data_files() type function for setuptools, which would recursively add data files from data/. We could include it in our 'skellington' package until such a thing is more widely available in setuptools; a sketch follows below.
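A minimal sketch of such a helper (my illustration, not an existing setuptools API; it just builds the (directory, [files]) pairs that setup()'s data_files argument expects):

import os

def find_data_files(source_dir):
    # Walk source_dir and emit one (directory, [file, ...]) pair per folder,
    # the shape that setuptools' data_files argument expects.
    result = []
    for root, _dirs, files in os.walk(source_dir):
        if files:
            result.append((root, [os.path.join(root, f) for f in files]))
    return result

# Usage in setup.py (illustrative):
# setup(..., data_files=find_data_files('data'))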

Despite all the issues of having a separate data/ folder, it is the convention so far. So my vote is to follow that convention and try fixing the issues in setuptools/pkg_resources.

Too many files, too complex for newbies.

3) Modern Python packages have 20-30 files in the root folder. I have heard the complaint many times that this makes it difficult to figure out where to put things; it makes things complex. This was the strong feedback I got in one pyweek where lots of people decided to use the older skellington instead.

We can help people asking the questions "where does my game code live?" and "where do my image files go?" by putting the answers right at the top of the readme.

We can also help by using dotfiles (".file") so they are hidden, and by using .gitignore and the like. We can also try to keep as many packaging-related files as possible in a "dist" folder. Even better would be to put things in our "skellington" package, in setuptools, or upstream wherever possible.

It used to be convention to have a "dist" folder which would contain various distribution and packaging scripts. (It's where distutils puts things too.) I'm not sure putting scripts in there is a good idea.

Another reason I think a package-based layout will work now is that, compared to three years ago, the Python packaging system has improved a lot. As well, we don't need to support older Pythons with more broken things. Also, I think if a few people iterate on the skellington, it should become clearer and less buggy than what I presented to people three years ago.

The other problem with having a million config files is that the question "where do I change the app description?" becomes harder. With cookiecutter, we can make a template which fills everything in with the right metadata. However, you often want to change that after you have started. Maybe there's no real solution right now for all this, but it is definitely a concern we need to address at least in some way.

I think it's important that we test the structure with people and gain their feedback early on. To do this, I'd like to ask someone who hasn't done a python package before and who has done a game to package it up using our structure and tools.

My vote would be to add simple instructions to the top of the readme, to work on fixing things upstream as much as possible, and to be very mindful about adding extra config files or scripts, moving as much config out of the repo as possible.

Talk Python to Me: #97 Flask, Django style with Flask-Diamond


There's a whole spectrum of Python web frameworks. On one end we have the micro-frameworks like bottle, flask, and to some degree Pyramid. On the other are things like Django and even CMSes like Wagtail (built on Django) at the far end.

While this is often positioned as an either/or choice, this week you'll meet Ian Dennis Miller, the creator of Flask-Diamond, an extension to Flask which brings many of the good things from Django to Flask's simple and small API.

Links from the show:

Flask-Diamond : flask-diamond.org

People API : pplapi.com

GitHub Impact : gh-impact.com

GThnk : gthnk.com

Rollbar's Talk Python Offer : rollbar.com/talkpythontome

Hired's Talk Python Offer : hired.com/talkpythontome

On mocks and stubs in python (free monad or interpreter pattern)


A few weeks ago I watched a video where Ken Scambler talks about mocks and stubs. In particular he talks about how to get rid of them.

One part is about coding IO operations as data and using the GoF interpreter pattern to execute them.

What he’s talking about is of course free monads, but I feel he’s glossing over a lot of details. Based on some of the questions asked during the talk I think I share that feeling with some people in the audience. Specifically I feel he skipped over the following:

How does one actually write such code in a mainstream OO/imperative language?
What's required of the language in order to allow using the techniques he's talking about?
Errors tend to break abstractions, so how does one deal with errors (i.e. exceptions)?

Every time I’ve used mocks and stubs for unit testing I’ve had a feeling that “this can’t be how it’s supposed to be done!” So to me, Ken’s talk offered some hope, and I really want to know how applicable the ideas are in mainstream OO/imperative languages.

The example

To play around with this I picked the following function (in Python):

import os

def count_chars_of_file(fn):
    fd = os.open(fn, os.O_RDONLY)
    text = os.read(fd, 10000)
    n = len(text)
    os.close(fd)
    return n

It's small and simple, but I think it suffices to highlight a few important points. The goal is to rewrite this function such that calls to IO operations (actions, e.g. os.read) are replaced by data (an instance of some data type) conveying the intent of the operation. This data can later be passed to an interpreter of actions.

Thoughts on the execution of actions and the interpreter pattern

When reading the examples in the description of the interpreter pattern what stands out to me is that they are either

a list of expressions, or a tree of expressions

that is passed to an interpreter. Will this do for us when trying to rewrite count_chars_of_file ?

No, it won’t! Here’s why:

A tree of actions doesn't really make sense: our actions are small and simple, each encoding the intent of a single IO operation. A list of actions can't deal with interspersed non-actions; in this case it's the line n = len(text) that causes a problem.

The interpreter pattern misses something that is crucial in this case: the running of the interpreter must be intermingled with running non-interpreted code. The way I think of it is that not only does the action need to be present and dealt with, but also the rest of the program; that latter thing is commonly called a continuation.

So, can we introduce actions and rewrite count_chars_of_file such that we pause the program when interpretation of an action is required, interpret it, and then resume where we left off?

Sure, but it’s not really idiomatic Python code!

Actions and continuations

The IO operations (actions) are represented as a named tuple:

import collections

Op = collections.namedtuple('Op', ['op', 'args', 'k'])

and the functions returning actions can then be written as

def cps_open(fn, k):
    return Op('open', [fn], k)

def cps_read(fd, k):
    return Op('read', [fd], k)

def cps_close(fd, k):
    return Op('close', [fd], k)

The interpreter is then an if statement checking the value of op.op, with each branch executing the IO operation and passing the result to the rest of the program. I decided to wrap it directly in the program runner:

def runProgram(prog):
    def runOp(op):
        if op.op == 'open':
            fd = os.open(*op.args, os.O_RDONLY)
            return op.k(fd)
        elif op.op == 'read':
            text = os.read(*op.args, 10000)
            return op.k(text)
        elif op.op == 'close':
            os.close(*op.args)
            return op.k()
    while isinstance(prog, Op):
        prog = runOp(prog)
    return prog

So far so good, but what will all of this do to count_chars_of_file?

Well, it's not quite as easy to read any more (basically it's rewritten in CPS):

def count_chars_of_file(fn):
    def cont_1(text, fd):
        n = len(text)
        return cps_close(fd, lambda n=n: n)
    def cont_0(fd):
        return cps_read(fd, lambda text, fd=fd: cont_1(text, fd))
    return cps_open(fn, cont_0)

Generators to the rescue

Python does have a notion of continuations in the form of generators. By making count_chars_of_file into a generator, it's possible to remove the explicit continuations, and the program actually resembles the original one again.

The type for the actions loses one member, and the functions creating them lose an argument:

Op = collections.namedtuple('Op', ['op', 'args'])

def gen_open(fn):
    return Op('open', [fn])

def gen_read(fd):
    return Op('read', [fd])

def gen_close(fd):
    return Op('close', [fd])

The interpreter and program runner must be modified to step the generator until its end:

def runProgram(prog):
    def runOp(op):
        if op.op == 'open':
            fd = os.open(op.args[0], os.O_RDONLY)
            return fd
        elif op.op == 'read':
            text = os.read(op.args[0], 10000)
            return text
        elif op.op == 'close':
            os.close(op.args[0])
            return None
    try:
        op = prog.send(None)
        while True:
            r = runOp(op)
            op = prog.send(r)
    except StopIteration as e:
        return e.value

Finally, the generator version of count_chars_of_file goes back to being a bit more readable:

def count_chars_of_file(fn):
    fd = yield gen_open(fn)
    text = yield gen_read(fd)
    n = len(text)
    yield gen_close(fd)
    return n
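Driving it is then just a matter of handing the generator to the runner. A usage sketch (my addition; it assumes the definitions above and that a file named 'example.txt' exists):

prog = count_chars_of_file('example.txt')   # creates the generator, runs nothing yet
print(runProgram(prog))                     # interprets the actions and prints the count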

Generators all the way

Limitations of Python generators mean that we either have to push the interpreter (runProgram) down to where count_chars_of_file is used, or make all intermediate layers into generators and rewrite the interpreter to deal with this. It could look something like this:

import types

def runProgram(prog):
    def runOp(op):
        if op.op == 'open':
            fd = os.open(op.args[0], os.O_RDONLY)
            return fd
        elif op.op == 'read':
            text = os.read(op.args[0], 10000)
            return text
        elif op.op == 'close':
            os.close(op.args[0])
            return None
    try:
        op = prog.send(None)
        while True:
            if isinstance(op, Op):
                r = runOp(op)
                op = prog.send(r)
            elif isinstance(op, types.GeneratorType):
                r = runProgram(op)
                op = prog.send(r)
    except StopIteration as e:
        return e.value

Final thoughts

I think I've shown one way to achieve, at least in part, what Ken talks about. The resulting code looks almost like "normal Python". There are some things to note:

Exception handling is missing. I know of no way to inject an exception into a generator in Python, so I'm guessing that exceptions from running the IO operations would have to be passed in.
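This is also where the payoff for testing shows up: because the IO operations are now plain data, a second, pure interpreter can stand in for mocks and stubs. The sketch below is my own illustration, not code from the talk or from this post; it assumes the generator-based Op and gen_* definitions above:

def runProgramPure(prog, fake_fs):
    # Interpret actions against an in-memory dict instead of touching the OS.
    try:
        op = prog.send(None)
        while True:
            if op.op == 'open':
                r = ('FD', op.args[0])        # hand back a fake file descriptor
            elif op.op == 'read':
                r = fake_fs[op.args[0][1]]    # canned file contents
            elif op.op == 'close':
                r = None
            op = prog.send(r)
    except StopIteration as e:
        return e.value

assert runProgramPure(count_chars_of_file('f.txt'), {'f.txt': b'hello'}) == 5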

Learning Python


In my Why Automate? post I discussed how important I feel automation skills will be for engineers in the near future. I even went as far as to say that those who don't learn to use automation will one day be left behind. Not only do I still stand by that statement, but I'd like to extend it to cover coding as well, for the exact same reasons as the ones covered in my automation post.

With this in mind, I plan to write a series of posts covering Python. The reason I've chosen Python is that it's great for those who are new to coding, it's extremely powerful, it has a great community behind it, and it offers a countless number of useful modules.

Getting Started

Once you’ve completed a few online courses and read through some Python books, there are three main ways to continue your learning:

1. Think of a basic project you'd like to create.
2. Dissect someone else's project.
3. A combination of the above two.

I've tried all three options and I find that my preference, at the time of writing, is Option 2. The reason is that I've reached a point where I understand the basics and am now looking to more experienced coders to see what techniques I can learn from them and their code.

If you decide to create your own project, the best advice I can give you is to get the basics working first, then optimise and add additional features later. If you spend your time trying to optimise every single line of code, or you try to implement too much functionality in one go, you'll never finish a project, which will stunt your learning.

Stay tuned for my next post, where I'll be dissecting Kirk Byers' fantastic Netmiko Python module.

As always, if you have any questions or have a topic that you would like me to discuss, please feel free to post a comment at the bottom of this blog entry, e-mail me at will@oznetnerd.com, or drop me a message on Twitter (@OzNetNerd).

Note: This website is my personal blog. The opinions expressed in this blog are my own and not those of my employer.

Learning Python: Dissecting Netmiko, Part 1


Netmiko is a "multi-vendor library to simplify Paramiko SSH connections to network devices". It is written and maintained by Kirk Byers and is widely used by the Python community and by other Python modules alike. One such module is NTC Ansible, which I've posted about previously and will post about again in the near future.

Before we start dissecting Kirk’s code though, I’d like to give both him and Gabriele Gerbino a shout out for all their assistance with my Python and Ansible queries. Both of these gentlemen have been invaluable for my studies and I thank them for it.

Follow the Bouncing Ball

Note: As time goes by, Netmiko's code will change. The code analysed in this series of posts can be found here.

The Netmiko README file gives us an example of how it can be used to obtain 'show' command output from a Cisco router. After adapting this example to suit my lab environment, it looks like this:

from netmiko import ConnectHandler
cisco_gns3 = {
'device_type': 'cisco_ios',
'ip': '10.255.0.254',
'username': 'cisco',
'password': 'cisco',
'secret': 'cisco', # optional, defaults to ''
'verbose': False, # optional, defaults to False
}
net_connect = ConnectHandler(**cisco_gns3)
output = net_connect.send_command('show ip int brief')
print(output)

Running this code produces the following output:

(new_venv) will@ubuntu:~/all/netmiko$ python ../nettest.py
Interface IP-Address OK? Method Status Protocol
FastEthernet0/0 10.255.0.254 YES NVRAM up up
FastEthernet0/1 10.255.2.1 YES NVRAM up up
FastEthernet1/0 10.255.3.1 YES NVRAM up up
FastEthernet2/0 unassigned YES NVRAM up up
FastEthernet3/0 unassigned YES NVRAM up up
FastEthernet4/0 unassigned YES unset up down
FastEthernet4/1 unassigned YES unset up down
FastEthernet4/2 unassigned YES unset up down
FastEthernet4/3 unassigned YES unset up down
FastEthernet4/4 unassigned YES unset up down
FastEthernet4/5 unassigned YES unset up down
FastEthernet4/6 unassigned YES unset up down
FastEthernet4/7 unassigned YES unset up down
FastEthernet4/8 unassigned YES unset up down
FastEthernet4/9 unassigned YES unset up down
FastEthernet4/10 unassigned YES unset up down
FastEthernet4/11 unassigned YES unset up down
FastEthernet4/12 unassigned YES unset up down
FastEthernet4/13 unassigned YES unset up down
FastEthernet4/14 unassigned YES unset up down
FastEthernet4/15 unassigned YES unset up down
Vlan1 unassigned YES NVRAM up down

Now that we know it works, let's see what Netmiko is actually doing. But where do we start? Well, given that we're using a function called ConnectHandler, let's start looking there.

ssh_dispatcher.py

ConnectHandler can be found in ssh_dispatcher.py. However, instead of jumping straight to the ConnectHandler section, let's approach this file in a top-down manner.

First we see a bunch of imports, a small subset of which are below:

from netmiko.cisco import CiscoIosSSH
from netmiko.arista import AristaSSH
from netmiko.huawei import HuaweiSSH
from netmiko.f5 import F5LtmSSH
from netmiko.juniper import JuniperSSH

What these imports do is make these classes (CiscoIosSSH, AristaSSH, HuaweiSSH, etc., which can be found in the vendor directories) available to the code in this file.

If you’re wondering why one would want to split the code up into different files when this file (ssh_dispatcher.py) needs access to it all, it’s becausesplitting it up makse the code more modular and easier to manage than it would be if everything were contained in a single file.

Next we see a CLASS_MAPPER_BASE dictionary which contains key:value pairs, some of which are listed below:

CLASS_MAPPER_BASE = {
'cisco_ios': CiscoIosSSH,
'huawei': HuaweiSSH,
'f5_ltm': F5LtmSSH,
'juniper': JuniperSSH,
'arista_eos': AristaSSH,
}

Notice how the values in CLASS_MAPPER_BASE match the class imports we saw earlier? Also notice how the 'cisco_ios' key matches the 'device_type' value in the code at the start of this post? As you might have guessed, these matches are no coincidence. Let's keep dissecting and see what we find…

The next block of code we come across is this:

# Also support keys that end in _ssh
new_mapper = {}
for k, v in CLASS_MAPPER_BASE.items():
new_mapper[k] = v
alt_key = k + u"_ssh"
new_mapper[alt_key] = v
CLASS_MAPPER = new_mapper
# Add telnet drivers
CLASS_MAPPER['cisco_ios_telnet'] = CiscoIosTelnet

What this code does is it takes the CLASS_MAPPER_BASE dictionary we saw earlier and does the following:

1. Adds the contents of CLASS_MAPPER_BASE to new_mapper.
2. Adds a _ssh suffix to each CLASS_MAPPER_BASE key and adds those entries to new_mapper.

The above results in new_mapper being twice the size of CLASS_MAPPER_BASE, because it contains both an 'original' key and a new '_ssh' key for each entry. For example:

'cisco_ios': CiscoIosSSH,
'cisco_ios_ssh': CiscoIosSSH,
'huawei': HuaweiSSH,
'huawei_ssh': HuaweiSSH

The code then saves the new_mapper dictionary as CLASS_MAPPER and adds a new 'cisco_ios_telnet': CiscoIosTelnet entry as a telnet driver.
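Based on that mapping, the dispatch inside ConnectHandler presumably boils down to a dictionary lookup. Here's a simplified sketch of the flow (my reading, not Netmiko's verbatim code):

def ConnectHandler(*args, **kwargs):
    # Look up the class registered for the requested device_type and
    # instantiate it with the supplied connection parameters.
    device_type = kwargs['device_type']
    if device_type not in CLASS_MAPPER:
        raise ValueError('Unsupported device_type: {}'.format(device_type))
    ConnectionClass = CLASS_MAPPER[device_type]
    return ConnectionClass(*args, **kwargs)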

How do I add a placeholder on a CharField in Django?


This is a popular question since the Django documentation doesn't address this concern directly.

The general question is:

How do I add custom HTML attributes to any form field in Django?

In this post, I'd give you multiple answers to the above question.

The Problem

Consider the form:

class SearchForm(forms.Form):
    q = forms.CharField()

By default, it renders the input tag:

<input id="id_q" name="q" type="text" required />

The problem is how you get it to render the input tag with other HTML attributes like autofocus, placeholder, and class:

<input class="search field" id="id_q" name="q" placeholder="Search GitHub" type="text" autofocus required />

Solution #1

Replace the default widget.

A widget is Django's representation of a field as an HTML element.

Hence, if you want to customize how a field is rendered, you usually need to fiddle with its widget.

Every widget has an attrs attribute that is a dictionary containing HTML attributes to be set when it is rendered.

You can populate the attrs attribute when instantiating a widget.

So, to solve your problem you can do the following:

class SearchForm(forms.Form):
    q = forms.CharField(
        widget=forms.TextInput(
            attrs={
                'autofocus': True,
                'class': 'search field',
                'placeholder': 'Search GitHub'
            }
        )
    )

Solution #2

Customize the default widget.

The default widget class for a CharField is TextInput. Hence, you don't need to instantiate a new TextInput just to change attrs; you can access the existing instance and update its attrs attribute.

Here's how you do it:

class SearchForm(forms.Form):
    q = forms.CharField()

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.fields['q'].widget.attrs.update({
            'autofocus': True,
            'class': 'search field',
            'placeholder': 'Search GitHub'
        })

Solution #3

If you're dealing with a ModelForm then the previous two solutions will work, since a ModelForm is a Form. But you also have the option to specify a custom widget for a field by setting the widgets attribute of the inner Meta class.

class CommentForm(forms.ModelForm):
    class Meta:
        model = Comment
        widgets = {
            'body': forms.Textarea(attrs={'cols': 80, 'rows': 20})
        }

Conclusion

In conclusion, if you ever need to add custom HTML attributes to a form field in Django then you either need to replace its default widget or customize its default widget.

And, in general, whenever you need to change the visual representation of a form field, look to its widget for guidance.

Further reading:

Widgets
attrs
Overriding the default fields on a ModelForm

I hope this post has been helpful. Let me know in the comments below which solution you prefer and why.

P.S. Subscribe to my newsletter if you're interested in getting exclusive Django content.

Reuven Lerner: Python function brain transplants

What happens when we define a function in Python?

The "def" keyword does two things: it creates a function object, and then assigns a variable (our function name) to that function object. So when I say:

def foo():
    return "I'm foo!"

Python creates a new function object. Inside of that object, we can see the bytecodes, the arity (i.e., number of parameters), and a bunch of other stuff having to do with our function.

Most of those things are located inside of the function object’s __code__ attribute. Indeed, looking through __code__ is a great way to learn how Python functions work. The arity of our function “foo”, for example, is available via foo.__code__.co_argcount. And the byte codes are in foo.__code__.co_code.
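For instance (a quick illustration in a Python 3 session; output noted in comments):

def foo():
    return "I'm foo!"

print(foo.__code__.co_argcount)   # 0 -- foo takes no parameters
print(foo.__code__.co_code)       # the raw bytecode, as a bytes object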

The individual attributes of the __code__ object are read-only. But the __code__ attribute itself isn’t! We can thus be a bit mischievous, and perform a brain transplant on our function:

def foo():
    return "I'm foo!"

def bar():
    return "I'm bar!"

foo.__code__ = bar.__code__

Now, when we run foo(), we’re actually running the code that was defined for “bar”, and we get:

"I'm in bar!"

This is not likely to be something you want to put in your actual programs, but it does demonstrate some of the power of the __code__ object, and how much it can tell us about our functions. Indeed, I’ve found over the last few years that exploring the __code__ object can be quite interesting!

Django Data Migrations


Django supports an ORM model: without writing a single SQL statement, we can conveniently manage the database in an object-oriented way. Database management can be divided into schema migrations and data migrations.

A schema migration changes only the structure of tables. We simply modify Django's data models, run python manage.py makemigrations to auto-generate the migration code that needs to be executed (usually saved under the app/migrations folder), and then run python manage.py migrate to apply the changes to the database.

The other kind is data migration. My previous approach was to write a URL mapped to a view and put the database-manipulating code in the view, deleting the URL once the code had run. This has many drawbacks:

It opens up database operations to the outside (even if admin permissions can be required).
It pollutes the URL and view code.
It prevents keeping all the database-management code together under migrations.

The upside is that, if used often, it can serve as one way for an administrator to "maintain the database". But for that purpose, a proper, conventional view is the better choice.

In fact, Django has data migration built in; my approach above was far too much of a hack.

A data migration is similar to a schema migration, except that Django doesn't directly generate the migration code for you; it generates a skeleton, and you write the code you need within that skeleton. This is actually more flexible.

1. Generate an empty data-migration file

Run the command below to generate a migration script, waiting for you to fill in the operations to execute.

python manage.py makemigrations --empty yourappname

The generated empty file looks like this:

# -*- coding: utf-8 -*-
# Generated by Django A.B on YYYY-MM-DD HH:MM
from __future__ import unicode_literals

from django.db import migrations, models


class Migration(migrations.Migration):

    initial = True

    dependencies = [
        ('yourappname', '0001_initial'),
    ]

    operations = [
    ]

2. Write the data-migration code

The operations list in the file is empty; this is where the data-migration operations to be executed go. The operations can be written as functions, which are then run inside operations with RunPython.

The function's first argument is the app registry; the second is a SchemaEditor, which can be used to modify the table structure manually. That is not recommended, though; it may cause problems when running automatic migrations later.

The migration below automatically generates name from the previous first_name and last_name. The way the model is imported is a bit special: write the operations to perform into a Python function, then run that function inside operations with RunPython. Besides RunPython, there is also RunSQL, among others.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import migrations, models


def combine_names(apps, schema_editor):
    # Don't import the Person model directly: the model here must be the
    # specific historical version for this migration. A direct import
    # would give you the latest version instead.
    Person = apps.get_model("yourappname", "Person")
    for person in Person.objects.all():
        person.name = "%s %s" % (person.first_name, person.last_name)
        person.save()


class Migration(migrations.Migration):

    initial = True

    dependencies = [
        ('yourappname', '0001_initial'),
    ]

    operations = [
        migrations.RunPython(combine_names),
    ]

3. Run migrate to apply the data migration

Just as when applying a schema migration, simply run the command below.

python manage.py migrate

4. Using models from other apps

If the RunPython function uses a model from another app (not the app this migrations folder belongs to), you should add that app's name and its latest migration to dependencies, following the example below.

Otherwise, when calling apps.get_model(), you will run into LookupError: No installed app with label 'myappname'.

In the example below, a migration in app1 uses a model from app2.

class Migration(migrations.Migration):

    dependencies = [
        ('app1', '0001_initial'),
        # Add the other app's name and its latest migration
        ('app2', '0004_foobar'),
    ]

    operations = [
        migrations.RunPython(move_m1),
    ]

5. References

The Django documentation: Migrations

Python Command-line skeleton


Writing a command-line interface (CLI) is an easy way to extend the functionality and ease of use of any code you write.

Python comes with a built-in module, argparse, that can be used to easily develop command-line interfaces. To speed up the process, I have developed a 'skeleton' application that can be forked on GitHub and used to quickly develop CLI programs in Python.
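For readers who haven't used argparse before, a minimal CLI looks something like this (a generic sketch, not the skeleton repo's actual code):

import argparse

def main():
    # Declare the program's arguments and help text.
    parser = argparse.ArgumentParser(description='Greet somebody.')
    parser.add_argument('name', help='who to greet')
    parser.add_argument('--shout', action='store_true',
                        help='print the greeting in upper case')
    args = parser.parse_args()

    greeting = 'Hello, {}!'.format(args.name)
    print(greeting.upper() if args.shout else greeting)

if __name__ == '__main__':
    main()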

The repo has the following features built in:

- Testing with travis-ci and py.test
- Coverage analysis using coveralls
- A setup file that will install the command
- A simple argparse interface

To get started, you should sign up for accounts on travis-ci and coveralls, and fork the repo!

python-cli-skeleton on Github

Python Top 10 Articles for the Past Year (v.2017)



For the past year, we’ve ranked nearly 10,000 Python articles to pick the Top 10 stories (0.1% chance) that can help you advance your career in 2017.

This Python list covers topics such as Django, Data Science, NumPy, Data Mining, Stock Trading, Home Automation, Self Driving Cars, and Datasets. The Machine Learning Top 10 is published separately.

This is an extremely competitive list, and Mybridge has not been solicited to promote any publishers. Mybridge A.I. ranks articles based on the quality of content as measured by our machine, plus a variety of human factors including engagement and popularity. Academic papers were not considered in this batch.

Give yourself plenty of time to read all of the articles you’ve missed this year. You’ll find the experience and techniques shared by the Python leaders particularly useful.


Rank 1

The Hitchhiker’s Guide to Python: Best practices guidebook written for Humans.


Rank 2

Scipy Lecture Notes ― Learn numerics, science, and data with Python.


Rank 3

30 Essential Python Tips and Tricks for Programmers.


Rank 4

Computational and Inferential Thinking for Data Science. Courtesy of UC Berkeley


Rank 5

Welcome to Python cheatsheet.



PyCharm: PyCharm 2017.1 EAP 5 (build 171.2822.19)


The fifth Early Access Program (EAP) release of PyCharm 2017.1 is available now. Get it from our website!

This EAP introduces several new features:

Support for the ‘six’ library. The six library is a tool for writing Python applications (or libraries) that support both Python 2 and Python 3.

A faster debugger for Python 3.6: we’re using language features introduced in PEP 523 to make the debugging experience quicker.

We’ve revamped our test runner; it now communicates with test frameworks using the TeamCity test protocol. This ensures that your unit tests will be run identically on your machine and on the CI server. The new runner enables a more consistent, and more debuggable, testing experience. If you’d like to read more technical details, check out the confluence page our developer wrote.

We’ve added a ‘Data View’ window: if you’re doing data science using PyCharm, you can now have an overview window with your Pandas DataFrames and NumPy arrays.
We’ve added the Google JavaScript style guide as a preset. Load it by going to Settings | Editor | Code Style | JavaScript, and then use the ‘set from’ link on the right to choose ‘Google JavaScript Style Guide’. [Pro only]

We’ve also worked hard to fix bugs:

Many Pyramid bugs have been resolved: template language selection, run configuration issues, some exceptions, and last but not least we updated the logo. [Pro only]

Issues with Django test configurations [Pro only]

Jupyter Notebook issues

Any improvements marked ‘Pro only’ are only available in PyCharm Professional Edition. You can use the EAP version of PyCharm Professional Edition for free for 30 days.

We’d like to encourage you to try out this new EAP version. To keep up to date with our EAP releases, set your update channel to Early Access Program: Settings | Appearance & Behavior | System Settings | Updates, then automatically check updates for “Early Access Program”.

We do our best to find all bugs before we release, but in these preview builds there might still be some bugs in the product. If you find one, please let us know on YouTrack , or contact us on Twitter @PyCharm .

-PyCharm Team

The Drive to Develop

DC SVD I: Just Can’t Let It Go


It’s been way, way, way too long since I’ve posted. I haven’t been slacking though, I’ve merely been busy. Really.

I decided to dive back into the SVD problem and look at an alternative to the QR based SVD computations. Namely, I’m going to give a breakdown of the divide-and-conquer approach to computing the SVD. Similar to the relationship between bidiagonal SVD and tridiagonal QR decompositions, there is a close relationship between dividing-and-conquering QR and SVD. I’m going to start from “the inside out” with the innermost task of the DC (divide-and-conquer) process: solving a particular equation known as the secular equation (secular meaning “not heavenly”, i.e., earthly or planetary; check it out on Wikipedia).

One note: I feel very guilty about still using Python 2. I had intended to transition this project to Python 3 over the course of the last year. Alas, my major training clients over the last year were using Python 2 and I didn’t really want to have a mixed development environment on my laptop. Well, at least there’s something to do if I make a book out of these posts!

Enough preamble, let’s get down to business.

The Secular Equation

Divide-and-conquer SVD is built on computing the roots of the secular equation. The roots, or zeros, of an equation are the \(\lambda\)s such that \(f(\lambda) = 0\). The secular equation is (where we take \(\rho=1\)):

\[f(\lambda) = 1 + \rho \sum_{i=1}^n \frac{z_i^2}{d_i - \lambda} = 1 + \sum_{i=1}^n \frac{z_i^2}{d_i - \lambda}\]

Here is a graph of a secular equation and its roots:

In [1]:

import numpy as np
import numpy.linalg as nla
import matplotlib.pyplot as plt
%matplotlib inline

xs = np.linspace(-1, 10, 10000)

# graph secular equation (blue curve)
# plot z,d --> y
zs = np.array([.6, 1.2, 1.8])
ds = np.array([0.0, 3, 5])

with np.errstate(divide='ignore'):
    # plain vanilla to "fancy-schmancy"
    # ys1 = 1.0 + (1.0 / (0.0-xs_sq)) + (4.0 / (9-xs_sq)) + (16.0 / (25 - xs_sq))
    # ys2 = 1.0 + (zs[0]**2 / (ds[0]**2-xs_sq)) + (zs[1]**2 / (ds[1]**2-xs_sq)) + (zs[2]**2 / (ds[2]**2 - xs_sq))
    ys3 = 1.0 + ((zs**2).reshape(3, 1) / np.subtract.outer(ds, xs)).sum(0)
    # assert np.allclose(ys1, ys2) and np.allclose(ys2, ys3)

plt.plot(xs, ys3)

# add an x-axis (yellow horizontal line)
plt.plot(xs, np.zeros_like(xs), 'y-')

# add poles/asymptotes (grey vertical lines)
plt.vlines(ds, -10, 10, '.75')

# add roots (red dots)
# use equivalent matrix (for these z,d) and eigenvalue computation to find roots
# ds are diagonal entries, zs are first columns (note, ds[0] is 0.0 by definition)
ROU = np.diag(ds) + np.outer(zs, zs)  # tm[:,0] = zs
zeros_act = nla.eig(ROU)[0]
plt.plot(zeros_act, np.zeros_like(zeros_act), 'r.')

# scaling
plt.ylim(-10, 10), plt.xlim(-1, 10);

A quick note: the secular equation as written above is most directly used to compute eigenvalues, not singular values. There is a strong relationship between the two sets of values, and they lead to only slightly different forms of the secular equation. I mention this because you might see slightly different forms of the secular equation depending on whether you are reading about eigenvalues or singular values. We will use the form above for solving both problems by slightly modifying the resulting \(\lambda\)s.

Back to our regularly scheduled program. We will find our desired roots by isolating \(f(\lambda)\) between each pair of poles. The poles occur at the values of \(d_i\). So, on the interval \((d_i, d_{i+1})\), the problem simplifies to finding a single root of the secular equation. Throughout this post, we’ll assume that \(d_i < d_j\) for \(i<j\) (in English, the \(d_i\)s are distinct and sorted). We’ll find a single root \(n\) times and thereby find all \(n\) roots. The cleverest will now point out that \(n\) poles only hold \(n-1\) zeros between them. The last zero is to the right of \(d_n\).

How do we find the single roots?

Newton’s Method

Let’s take a second and review Newton’s method for finding a root of an equation \(f(x)\) near \(x_0\) .

1. Approximate \(f(x)\) by a linear function \(l(x) = ax+b\).
2. Apply constraints such that \(f(x_0) = l(x_0)\) and \(f'(x_0) = l'(x_0)\). These determine the coefficients of \(l(x)\).
3. Find the root of \(l(x)\) and call it \(x_1\). In this case, the root is the \(x\)-intercept of \(l(x)\).
4. That root of \(l(x)\) is a better guess as to the root of \(f(x)\).

Newton’s method is a great technique, and it is used broadly because it is conceptually and formally simple, easy to implement, and the iteration steps are reasonably fast. However, in the case of the secular equation, as the numerators get very small, the bend in the graph goes from gentle to sharp. In other words, the graph goes from vertical to horizontal very quickly. When this happens, a linear approximation on the nearly horizontal part (which is the majority of the pole-to-pole interval) will also be approximately horizontal and be aimed far away from the true root in this interval.
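For concreteness, a bare-bones Newton iteration looks like this (my own minimal sketch, not code from this series' eventual solver):

def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    # Repeatedly jump to the x-intercept of the tangent line
    # at the current guess.
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:
            break
    return x

# e.g., newton(lambda x: x*x - 2.0, lambda x: 2.0*x, 1.0) converges to sqrt(2)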

A Modified Newton’s Method

So, we need to try something else. Fortunately, we can maintain the outline of Newton’s method while using a different approximating function. Instead of using a linear form, we will use the following rational function of \(\lambda\) which has poles at \(d_i\) and \(d_{i+1}\) :

\[h(\lambda) = \frac{C_1}{d_i-\lambda} + \frac{C_2}{d_{i+1}-\lambda} + C_3\]

So, we are approximating \(f(\lambda)\) with \(h(\lambda)\) :

\[f(\lambda) = 1 + \sum_{i=1}^n \frac{z_i^2}{d_i - \lambda} \approx
\frac{C_1}{d_i-\lambda} + \frac{C_2}{d_{i+1}-\lambda} + C_3 = h(\lambda)\]

Also, since we want to avoid numerical problems with term cancellation, we will break the sum in \(f(\lambda)\) into two parts: (1) the sum up to term \(k\) and (2) the sum from term \(k+1\) on. Along with the ordering assumption on the \(d_i\), this means that the lower terms are all negative and the upper terms are all positive for \(\lambda \in (d_k, d_{k+1})\). We will also give names to those partial sums.

\[
\begin{eqnarray}
f(\lambda) &=& 1 + \sum_{i=1}^k \frac{z_i^2}{d_i - \lambda} + \sum_{i=k+1}^n \frac{z_i^2}{d_i - \lambda} &=& 1 + \Psi_1(\lambda) + \Psi_2(\lambda)\\
f'(\lambda) &=& \sum_{i=1}^k \frac{z_i^2}{(d_i - \lambda)^2} + \sum_{i=k+1}^n \frac{z_i^2}{(d_i - \lambda)^2} &=& \Psi'_1(\lambda) + \Psi'_2(\lambda)
\end{eqnarray}
\]

Note that the derivative (and the derivatives of the partial sums) is strictly positive except at \(\lambda = d_i\) (the poles of \(f\) ).
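As a quick numerical sanity check (my own sketch, reusing the zs and ds from the plotting cell above):

import numpy as np

zs = np.array([0.6, 1.2, 1.8])
ds = np.array([0.0, 3.0, 5.0])

def psi_split(lam, k):
    # Split the secular sum after term k (1-based, as in the text).
    # For lam in (d_k, d_{k+1}), the lower terms are all negative
    # and the upper terms are all positive.
    terms = zs**2 / (ds - lam)
    return terms[:k].sum(), terms[k:].sum()

psi1, psi2 = psi_split(1.5, 1)  # 1.5 lies between d_1 = 0 and d_2 = 3
print(psi1 < 0, psi2 > 0)       # True True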

Since we have broken \(f\) into pieces, we will also break \(h\) into pieces:

\[h(\lambda)=1 + h_1(\lambda) + h_2(\lambda)\]

and we will give them each the same form, which is similar to that of \(h\) :

\[h_1(\lambda) = \hat{c}_1 + \frac{C_1}{d_k - \lambda} \quad
h_2(\lambda) = \hat{c}_2 + \frac{C_2}{d_{k+1} - \lambda}\]

We will also enforce the “Newton conditions”: both the value and the derivative of each approximation \(h_i\) must match those of the corresponding \(\Psi_i\) at the current guess.

Julien Danjou: Gnocchi 3.1 unleashed


It's always difficult to know when to release, and we really wanted to do it earlier. But it seems that each week more awesome work was being done in Gnocchi , so we kept delaying it while having no pressure to push it out.

A photo posted by Julien Danjou (@juldanjou) on Jan 22, 2017 at 5:43am PST

I've made my own gnocchis to celebrate!

But now that the OpenStack cycle is finishing, even if Gnocchi does not strictly follow it, it seemed to be a good time to cut the leash and let this release go.

There are again some major changes coming since 3.0. The previous version, 3.0, was tagged in October and had 90 changes merged from 13 authors since 2.2. This 3.1 version has 200 changes merged from 24 different authors. This is a great improvement in our contributor base and our rate of change, even though our delay to merge is very low. Once again, we pushed the use of release notes to document user-visible changes, and they can be read online.

Therefore, I am going to quickly summarize the major changes:

The REST API authentication mechanism has been modularized. It's now simple to provide any authentication mechanism for Gnocchi as a plugin. The default is now an HTTP basic authentication mechanism that does not implement any kind of enforcement. The Keystone authentication is still available, obviously.

Batching has been improved and can now create metrics on the fly, reducing the latency needed when pushing measures to non-existing metrics. This is leveraged by the collectd-gnocchi plugin, for example.

The performance of the Carbonara-based backends has been largely improved. This is not really listed as a change as it's not user-visible, but an amazing job of profiling and rewriting code from Pandas to NumPy has been done. While Pandas is very developer-friendly and generic, using NumPy directly offers far more performance and should decrease gnocchi-metricd CPU usage by a large factor.

The storage has been split into two parts: the storage of incoming new measures to be processed, and the storage and archival of aggregated metrics. This makes it possible to use, e.g., the file driver to store new measures being sent and, once processed, store them in, e.g., Ceph. Before this change, all the new measures had to go into Ceph. While there's no specific driver yet for incoming measures, it's easy to envision a driver for systems like Redis or Memcached.

A new Amazon S3 driver has been merged. It works in the same way as the file or OpenStack Swift drivers.



I will write more about some of these new features in the upcoming weeks, as they are very interesting for Gnocchi's users.

We are planning to run a scalability test and benchmarks using the ScaleLab in a few weeks, if everything goes as planned. I will obviously share the results here, but we also submitted a talk for the next OpenStack Summit in Boston to present the results of our scalability and performance tests, hoping the session will be accepted.

I will also be talking about Gnocchi this Sunday at FOSDEM .

We don't have a very determined roadmap for Gnocchi for the next weeks. Sure, we do have a few ideas about what we want to implement, but we are also very easily influenced by the requests of our users: so feel free to ask for anything!

A Data-Driven Look at Machine Learning Language Trends: Python's Commanding Lead


Qiku Academy, 2 hours ago

For developers, which programming languages make it easiest to land a machine-learning or data-science job? Many people care about this very practical question, and it has been debated over and over in many forums. Today Qiku Academy uses data to show how each development language is actually used in industry.

Let's look at the statistics on language usage in 2016: which language's user base climbed fastest, and which ones sit at the top?


Employer demand index by language for machine learning and data science

As the chart shows, these machine-learning and data-science hiring trends come from the US job search engine indeed.com: a count of the programming languages listed as requirements in job postings for these fields. It shows which language skills companies and employers are looking for. We can clearly see that the top four languages demanded by US employers are Python, Java, R, and C++, with Python overtaking Java for first place in mid-2015. Restricting the search to machine learning (dropping data science) gives much the same picture:


Employer demand index by language for machine learning

What can we infer from these two charts? Python is the market leader and deservedly the most popular machine-learning language. The gap between Python and Java is widening, while the gap between Java and R is narrowing.

But when we zoom in on the deep-learning niche, the data looks quite different:


Employer demand index by language for deep learning

In the deep-learning market, Python still tops hiring demand, but the top five languages become Python, C++, Java, C, and R: a clear tilt toward high-performance computing languages. Moreover, Java is growing at a startling rate and may soon take second place in the deep-learning market. R is unlikely to become the most popular deep-learning language in the foreseeable future. What's surprising is how little presence Lua has: Torch, one of the major open-source frameworks, is built on Lua, so many developers assume it holds a special place in the deep-learning market.

As for the question posed at the start, which languages do employers want developers to know, the answer is now clear: in the machine-learning and data-science market, Python, Java, and R see the most hiring demand; in deep learning, companies favor Python, Java, C++, and C.

From whichever angle you look, Python holds the leading position, and learning Python has become the most pressing task for many developers. Qiku Academy has built a premium Python development course to keep you a step ahead, at the front of the industry!

Sentry 8.13.0 Released: a Real-Time Logging Platform in Python


Sentry 8.13.0 has been released. Sentry is a real-time event logging and aggregation platform built on Django.

Sentry can automatically record every exception raised by your Python program, then present them in a friendly, searchable UI. Since handling exceptions is a necessary part of every program, Sentry is arguably an essential component for almost any project.



Changes:

Added individual filters for legacy browsers to improve customization of error filtering based on browser versions

Support for setting a custom security header for JavaScript fetching.

Start using ReleaseProject and Release.organization instead of Release.project.

Project quotas are no longer available, and must now be configured via the organizational rate limits.

Quotas implementation now requires a tuple of maximum rate and interval window.

Added security emails for adding and removing MFA and password changes.

Added the ability to download an Apple-compatible crash report for Cocoa events.

Add memory and storage information for Apple devices.

The legacy API keys feature is now disabled by default.

Show Images Loaded section for Cocoa events with version number.

Fixed bug where workflow notification subject may not include a custom email prefix.

Added configurable subject templates for individual alert emails ( mail:subject_template option).

Added data migration to populate ReleaseProject.new_groups

Added support for managing newsletter subscriptions with Sentry.io

Schema Changes

Added ReleaseProject.new_groups column.

Added OrganizationAvatar model.

API Changes

Added avatar and avatarType to /organizations/{org}/ endpoint.

Provide commit and author information associated with a given release

Provide repository information for commits

Downloads:
