The hardest problem in computer science

…is, of course, naming.

Not just naming variables or new technologies. Oh no. We can’t even agree on names for basic concepts.

A thousand overlapping vernaculars

Did you know that the C specification makes frequent reference to “objects”? Not in the OO sense, as you might think: a C “object” is defined as a “region of data storage in the execution environment, the contents of which can represent values”. The spec then goes on to discuss, for example, “objects of type char”.

“Method” is a pretty universal term, but you may encounter a C++ programmer who only knows it as “member function”. Conversely, Java may not have functions at all, depending on who you ask. “Procedure” and “subroutine” aren’t used much any more, but a procedure is a completely different construct from a function in Pascal.

Even within the same language, we get sloppy: see if you can catch a Python programmer using “property” to refer to a regular “attribute”, when a property is a special kind of attribute.
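To make the distinction concrete, here is a minimal sketch. The `Circle` class and its names are hypothetical, invented only for illustration: `radius` is a regular attribute stored on the instance, while `diameter` is a property, which reads like an attribute but is computed by a method on every access.

```python
class Circle:
    """Hypothetical class used only to contrast the two terms."""

    def __init__(self, radius):
        # A regular attribute: a value stored directly on the instance.
        self.radius = radius

    @property
    def diameter(self):
        # A property: looks like an attribute at the call site,
        # but runs this method on every access.
        return self.radius * 2


circle = Circle(3)
circle.radius = 5                         # plain attribute, freely assignable
diameter_after_resize = circle.diameter   # recomputed from the new radius

# The attribute lives in the instance dictionary; the property lives on the class.
radius_is_instance_data = "radius" in vars(circle)
diameter_is_instance_data = "diameter" in vars(circle)
diameter_is_property = isinstance(vars(Circle)["diameter"], property)
```

Because this property defines no setter, assigning to `circle.diameter` raises `AttributeError`, which is one practical way the two kinds of attribute differ.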

There’s a difference between “argument” and “parameter”, but no one cares what it is, so we all just use “argument” which abbreviates more easily. I use the word “signature” a lot, but I rarely see anyone else use it, and sometimes I wonder if anyone understands what I mean.
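For the record, the usual distinction can be sketched in a few lines of Python. The `scale` function is a made-up example; `inspect.signature` is the standard library's way of reading a function's signature.

```python
import inspect


def scale(value, factor=2):
    # `value` and `factor` are parameters: names declared by the function.
    return value * factor


# 10 and 3 are arguments: the values supplied at this particular call.
result = scale(10, factor=3)

# The signature is the function's declared interface: its parameters
# and their defaults (plus annotations, if there were any).
signature = inspect.signature(scale)
signature_text = str(signature)
parameter_names = list(signature.parameters)
```

Here `signature_text` is the string `(value, factor=2)`: the parameters and defaults, with no arguments in sight.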

A float is single-precision in C and double-precision in Python. I reflexively tense up whenever someone says “word” unqualified, because it could mean any of three or four different sizes.
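The float difference is easy to observe from Python itself. The sketch below assumes a typical platform where Python's float is an IEEE 754 double; it uses the standard `struct` module to squeeze a value through the 4-byte single-precision format that C calls `float`.

```python
import struct
import sys

# On typical platforms Python's float is a C double:
# 53 bits of mantissa, 8 bytes when packed.
mantissa_bits = sys.float_info.mant_dig
double_size = struct.calcsize("d")
single_size = struct.calcsize("f")

# Round-tripping through a 4-byte single-precision value loses bits,
# so the result no longer compares equal to the original.
original = 0.1
as_single = struct.unpack("f", struct.pack("f", original))[0]
survives_double = struct.unpack("d", struct.pack("d", original))[0] == original
survives_single = as_single == original
```

The value survives the double round trip unchanged but not the single one, which is exactly the gap between the two meanings of “float”.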

Part of the problem here is that we’re not actually doing computer science. We’re doing programming, with a wide variety (hundreds!) of imperfect languages with different combinations of features and restrictions. There are only so many words to go around, so the same names get used for vaguely similar features across many languages, and native speakers naturally attach their mother tongue’s baggage to the jargon it uses. Someone who got started with JavaScript would have a very different idea of what a “class” is than someone who got started with Ruby. People come to Python or JavaScript and exclaim that it “doesn’t have real closures” because of a quirk of name binding.
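One name-binding quirk that commonly prompts the closures complaint in Python (an assumption about which quirk is meant; the text does not spell it out) is that a closure captures the variable, not the value it held when the function was created. A minimal sketch, with hypothetical function names:

```python
def make_late_bound():
    functions = []
    for i in range(3):
        # Each lambda closes over the variable `i` itself,
        # so all three see whatever `i` holds when they are called.
        functions.append(lambda: i)
    return functions


def make_early_bound():
    functions = []
    for i in range(3):
        # A default argument is evaluated at definition time,
        # so each lambda keeps its own copy of the current value.
        functions.append(lambda i=i: i)
    return functions


late_results = [f() for f in make_late_bound()]
early_results = [f() for f in make_early_bound()]
```

`late_results` is `[2, 2, 2]` and `early_results` is `[0, 1, 2]`. The closures are perfectly real; they just share one binding of `i`.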

Most of the time, this is fine. Sometimes, it’s incredibly confusing. Here are my (least?) favorite lexical clashes. (That was one too!)

Arrays, vectors, and lists

In C, an array is a contiguous block of storage, in which you can put some fixed number of values of the same type. int[5] describes enough space to store five ints, all snuggled right next to each other. There’s no such thing as a “vector”. “List” would likely be interpreted as a linked list, in which each value is stored separately and has a pointer to the next one.

C++ introduced vector, an array that automatically expands to fit an arbitrary number of values. There’s also a standard list type, which is a doubly-linked list. (The exact implementations may be anything, but the types require certain properties that make an array and a linked list the most obvious choices.) But wait! C++11 introduced the initializer_list, which is actually an array.

Lisp dialects are of course nothing but lists, but under the hood, these tend to be implemented as linked lists, which is no doubt why Lisp originally handled lists in terms of heads and tails (very easy to do with linked lists), rather than random access (very easy to do with contiguous arrays). Haskell works similarly, and additionally has a Data.Array module which offers fast random access.
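A rough Python sketch shows why heads and tails come cheaply with linked lists while random access does not. It models a list as nested (head, tail) pairs ending in None; the helper names `cons`, `head`, `tail`, and `nth` are borrowed loosely from Lisp tradition and are not any particular dialect's API.

```python
def cons(value, rest):
    # Build one cell: a value plus a link to the rest of the list.
    return (value, rest)


def head(cell):
    # The first value: a single lookup, regardless of list length.
    return cell[0]


def tail(cell):
    # Everything after the first value: also a single lookup.
    return cell[1]


def nth(cell, index):
    # Random access has to walk the chain one cell at a time.
    for _ in range(index):
        cell = tail(cell)
    return head(cell)


numbers = cons(1, cons(2, cons(3, None)))
first = head(numbers)
rest = tail(numbers)
third = nth(numbers, 2)
```

Taking the head or tail touches one cell, while `nth` does work proportional to the index. A contiguous array has the opposite strengths.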

Perl (5)’s sequence type is the array, though “type” is a little misleading here, because it’s really one of Perl’s handful of shapes of variables. Perl also has a distinct thing called a “list”, but it’s a transient context that only exists while evaluating an expression, and is not a type of value. It’s weird and I can’t really explain it within a single paragraph.

Meanwhile, in Python, list is the fundamental sequence type, but it has similar properties to a C++ vector and (in CPython) is implemented with a C array. The standard library also offers the rarely-used array type, which packs numbers into C arrays to save space, a source of occasional confusion for new Python programmers coming from C, who think “array” is the thing they want.
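A short sketch of the difference, using only the built-in list and the standard library's `array` module (the variable names are arbitrary):

```python
from array import array

# A list holds values of any type and grows on demand, like a C++ vector.
mixed = [1, "two", 3.0]
mixed.append(None)

# An array stores one fixed numeric type; "i" is the typecode for a C signed int.
packed = array("i", [1, 2, 3])
packed.append(4)

try:
    packed.append("five")       # not an integer, so the array refuses it
    rejected_non_number = False
except TypeError:
    rejected_non_number = True

# Each element occupies a fixed number of bytes (platform-dependent for "i").
bytes_per_item = packed.itemsize
```

The array also grows on demand, so it is no closer to a fixed-size C array than list is; what it buys is compact, homogeneous storage.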

JavaScript has an Array type, but it’s (semantically) built on top of the only data structure in JavaScript, which is a hash table.

PHP’s sole data structure is called array, but it’s really an ordered hash table where you are free to use integers for the keys if you want. It also has a thing called list, but it’s not a type, just quirky syntax for doing destructuring assignment. People coming from PHP to other languages are occasionally frustrated that hash tables lose their order.

Lua likewise has only a single data structure, but is more upfront in calling its structure a “table”; there’s nothing in the language called “array”, “vector”, or “list”.

While I’m at it, the names for mapping types are all over the place:

C++: map (actually a binary tree; C++11’s unordered_map is a hash table)
JavaScript: object(!)
Lua: table
PHP: array(!)
Perl: hash (another “shape”, and somewhat misleading since a “hash” is also a different thing), though the documentation likes to say “associative array” a lot
Python: dict
Rust: HashMap

Pointers, references, and aliases

C has pointers, which are storage addresses. This is pretty easy for C to do, since it’s all about operating on one big array of storage. A pointer is just an index into that storage.
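The “index into one big array” picture can be mimicked in a few lines of Python. This is only a model of the idea, with a bytearray standing in for storage and made-up addresses; it is not how Python or any real C implementation manages memory.

```python
# Model "one big array of storage" as a bytearray; a "pointer" is an index.
storage = bytearray(16)

pointer = 4                  # hypothetical address of a one-byte value
storage[pointer] = 42        # roughly `*pointer = 42` in C

alias = pointer              # copying a pointer copies the address, not the value
storage[alias] = 7           # writing through the copy...
value_via_pointer = storage[pointer]   # ...is visible through the original

next_pointer = pointer + 1   # pointer arithmetic is just index arithmetic
storage[next_pointer] = 9
```

After this runs, `value_via_pointer` is 7: both “pointers” name the same slot, so a write through one shows up through the other.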

C++ inherited pointers from C, but chastises you for using them. As an alternative it introduced “references”, which are exactly like pointers, except you can leave off the *. This added a very strange new capability that didn’t exist in C: two regular ol’ local variables could refer to the same storage, so that a = 5; could also change the value of b.

And so all programming conversation was doomed forever, but more on that in a moment.

Rust has things called references, and uses the C++ reference syntax for their types, but they’re really “borrowed pointers” (i.e., pointers, but opaque and subject to compile-time lifetime constraints). It also has lesser-used “raw pointers”, which use C’s pointer syntax.

Perl has things called references. Two different kinds of things, in fact. The ones people generally refer to are “hard references”, which are pretty much like C pointers, except the “address” is supposed to be opaque and can’t be arbitrarily operated on. The others are “soft references”, where you use the contents of a variable as the name of another variable using much the same syntax as hard references, but this is forbidden by use strict so doesn’t see much use (and can be done other ways anyway). Perl also has things called aliases, which work like C++ references, but they don’t work on local variables, and they’re not really a type, just explicit manipulation of the symbol table. (Cool fact: Perl functions receive their arguments as aliases! It’s easy not to notice, because most
