Author: Andrew Montalenti
Date: 2015-06-01
How this was made
This document was created using Docutils/reStructuredText and S5.
Simplicity begets elegance.
Exercises
Me: I've been using Python for over 10 years. I use Python full-time, and have for the last 4 years.
Professionally: I'm the co-founder/CTO of Parse.ly, a tech startup in the digital media space. We build web analytics systems and APIs for the web's best publishers. I'm also the founder/principal at Aleph Point, an agile software engineering consulting and training firm.
E-mail me: andrew@alephpoint.com
Follow me on Twitter: amontalenti
Connect on LinkedIn: http://linkedin.com/in/andrewmontalenti
Simplicity begets elegance.
>>> nums = [45, 23, 51, 32, 5]
>>> for idx, num in enumerate(nums):
...     print idx, num
0 45
1 23
2 51
3 32
4 5
It's embedded in the heart of the language.
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
...
Watch out for the increasing human cost of a project as time goes on!
Didier is a Parse.ly cofounder and my favorite Python engineer on the planet.
My choices at the time were Java (the Eclipse-driven language), C (a language of last resort), C++ (a language that comes with a reference book), Perl (a Babylonian language), and Matlab (closed source).
At the time, I was done with computer science and I needed a language that I could use for prototyping, like Matlab was, but more general.
There are many programming languages that meet the minimum criteria for a strong platform for modern software development. Python, like many of those options, is:
Python is unique in that it also has:
>>> nums = [45, 23, 51, 32, 5]
>>> for idx, num in enumerate(nums):
...     print idx, num
0 45
1 23
2 51
3 32
4 5
>>> nums = [45, 23, 51, 32, 5]
Declares a label called nums, and binds it to a list value.
Could just as easily be written this way, although no Pythonista would ever do this.
>>> nums = list()
>>> nums.append(45); nums.append(23)
>>> nums.append(51); nums.append(32)
>>> nums.append(5)
Contrast to Java:
// Java code
List<Integer> nums = new ArrayList<Integer>();
nums.add(45); nums.add(23);
nums.add(51); nums.add(32);
nums.add(5);
Or, at best:
List<Integer> nums =
Arrays.asList(new Integer[] {45, 23, 51, 32, 5});
In Java, I have even tapped into some obscure features, using Arrays.asList to initialize the ArrayList, leveraging the fact that arrays, but not lists, have a concise initialization syntax.
In Python, lists are declared using [item1, item2, item3, ...], a simple and concise syntax that can be easily read.
This is but one example of hundreds made in the language.
>>> for idx, num in enumerate(nums):
Iterates over each item in the nums list, yielding a tuple for each iteration step that contains the index of the list value and the list value itself.
>>> list(enumerate(nums))
[(0, 45), (1, 23), (2, 51), (3, 32), (4, 5)]
...     print idx, num
The for loop has created two new bindings, idx and num, for each iteration step of the loop. These bindings are unpacked from the tuples yielded by the enumerate built-in function.
We can break this down by looking at how one step would work.
>>> idx, num = (0, 45)
>>> idx
0
>>> num
45
>>> print idx, num
0 45
Two readable lines of code that do a whole lot of work for you. You've already been subtly exposed to some features we'll learn about in the upcoming course:
- concise list syntax
- variable bindings and scope
- tuples
- iterators (and even generators!)
- value unpacking
- print keyword
Behind every good Python programmer is a good development environment.
... cue elevator music ...
In traditional compiled languages like C/C++, you think about user programs which run directly on the operating system.
Python introduces one more layer of abstraction, the Python interpreter, that is like a mini operating system running above the actual operating system.
There is no separate "compilation" step like C/C++/Java.
You simply run the python command, and you can start evaluating code.
>>> print 1 + 2
3
>>> print 'charles' + 'darwin'
charlesdarwin
The python command, when run with no arguments, opens the interactive shell.
When run with arguments, it acts as a runtime for code that can be stored in files or provided at the command line.
python -c "print 1 + 2"
python myprog.py
We need to move briskly.
Though labels can be re-assigned among types at will, the actual values behind the labels do have types.
And Python does not go out of its way to coerce types that cross "semantic boundaries".
>>> string = "two"
>>> number = 3
>>> string, number = number, string # swap values
>>> print string + number
Traceback (most recent call last):
    string + number
TypeError: unsupported operand type(s) for +: 'int' and 'str'
This makes Python a dynamic, but strongly typed language. Contrast with Perl, which is both dynamic and weakly typed.
"Wait a second. Are you serious? Whitespace is significant in Python?"
"Where the hell are my curly braces!?"
>>> from __future__ import braces
SyntaxError: not a chance (<ipython console>, line 1)
You heard the interpreter.
Short answer: because we do it anyway.
Better answer: because some of us don't.
Python Style Guide (PEP 8) recommends 4 spaces, and no tab characters.
One problem with significant indentation is how to break long statements on multiple lines.
Python has two options:
- The \ character, which indicates that the end of a line has been reached and tells the interpreter to treat the next line as a continuation of the current line; best to avoid this one if possible
- The (...), {...} and [...] characters, which create an implicit continuation; use this one if possible
Another problem with indentation is that it makes it harder to specify "empty statements". e.g., in JavaScript function() {} represents an "empty function". Python does this with a special keyword called pass, that simply does nothing. An example:
if normal_condition:
    do_something_normal()
elif special_condition:
    pass
else:
    do_default()
The middle condition is a "no-op". It's also used for stubs, e.g.:
def placeholder(): pass
# prefer this:
"{foo} {bar} {baz}".format(
    foo=foo, bar=bar, baz=baz)
# to this:
"{foo} {bar} {baz}"\
    .format(foo=foo, bar=bar, baz=baz)
>>> import string
>>> dir(string)
['__builtins__', '__doc__', ..., 'atof', 'atof_error', 'atoi', ...]
>>> print string.count.__doc__
count(s, sub[, start[,end]]) -> int
Return the number of occurrences of substring sub in string
s[start:end]. Optional arguments start and end are
interpreted as in slice notation.
With no arguments, the help function opens an interactive help prompt.
But usually, you only need help on a specific function or module. So, then you can call help with an argument.
>>> import string
>>> help(string)
Help on module string:
NAME
string - A collection of string operations...
FILE
/usr/lib/python2.6/string.py
MODULE DOCS
http://docs.python.org/library/string
...
>>> print "Hello World!"
Hello World!
>>> cars = 100
>>> print "There are", cars, "cars available"
There are 100 cars available
>>> print??
Type: builtin_function_or_method
...
print(value, ..., sep=' ', end='\n', file=sys.stdout)
Prints the values to a stream, or to sys.stdout by default.
#!/usr/bin/env python
# Written by John Doe, 6/5/2011
#
if __name__ == "__main__": # only runs when script is executed
    print "Hello, World"
>>> x = 3
>>> 1 < x < 3
False
>>> 2 < x < 5
True
>>> 3 == x
True
>>> x + 0.1
3.1000000000000001
Python number literals are either int or float, depending on context. If an expression is a float, IEEE 754 floating point rules apply. Otherwise, normal integer math rules apply.
>>> type(0.1)
<type 'float'>
>>> type(3)
<type 'int'>
>>> type(3 + 0.1)
<type 'float'>
There are two simple ways to do string formatting in Python, and both are very popular.
- %, defined on str objects.
- str.format, a more verbose method available on str.
>>> "%s cars crossed the intersection \
... in the last %s hours" % (5, 24)
'5 cars crossed the intersection in the last 24 hours'
>>> "{num} cars crossed the intersection \
... in the last {hrs} hours".format(num=5, hrs=24)
'5 cars crossed the intersection in the last 24 hours'
There are many programmers for whom strings play a much more vital role than number types.
I am one of those programmers.
My applications deal with the web and with large-scale text processing.
A core understanding of strings and a language with capabilities to manipulate them is critical to get work done in this environment.
In Python, strings can be enclosed with single, double, or triple quotes.
>>> 'single'
'single'
>>> "double"
'double'
>>> """triple"""
'triple'
>>> """though single and double are mostly interchangeable,
... triple quotes have a special property: they can span line
... breaks. Thus, they can be used as a kind of 'heredoc'."""
"though single and double are mostly interchangeable,\ntriple ..."
They may also be preceded with a special character prefix, indicating a special string mode. Currently, only two are supported: raw strings using prefix r, and unicode strings using prefix u.
>>> print('C:\new\node\nell.exe')
C:
ew
ode
ell.exe
>>> print(r'C:\new\node\nell.exe')
C:\new\node\nell.exe
>>> print(u"\u2192")
→
Strings support a healthy number of methods, including:
- common transformations like lower and upper
- conveniences like strip and startswith
- utilities like find, replace, split, and join
>>> ". ".join("PYTHON IS GREAT".lower().split()) + "."
'python. is. great.'
Strings also support some great operators:
- + for concatenation
- [idx] for indexing, and [i:j] for slicing
- * for repeating
- in for substring matching
>>> p = "PYTHON"
>>> g = ("GREAT " * 3).strip()
>>> "GREAT" in g
True
>>> p + (" %s " % "IS") + g[0:15] + "..."
'PYTHON IS GREAT GREAT GRE...'
Python automatically imports a slew of functions, types, and symbols, which are known collectively as built-ins.
These provide the "basic language constructs" before you start referencing the modules of the standard library.
>>> sorted(vars(__builtins__).keys())[-7:]
['tuple', 'type', 'unichr', 'unicode', 'vars', 'xrange', 'zip']
>>> sorted is __builtins__.sorted
True
Built-in functions are covered in this document:
http://docs.python.org/library/functions.html
Some early ones to look at are:
>>> line = "GOOG,100,490.10"
>>> field_types = [str, int, float]
>>> raw_fields = line.split(",")
>>> fields = [ty(val) for ty, val in zip(field_types, raw_fields)]
>>> fields
['GOOG', 100, 490.10000000000002]
The fact that everything in Python is first-class is often not fully appreciated by new programmers.
How would you have written this in Java / C# / C++ / C?
Among the imported symbols from __builtins__ are True, False, and None.
These core symbols are used in boolean logic, and are typically combined with keywords such as is, not, and, or, etc.
>>> x, y = (0, 1)
>>> y == True # aka, "y is truthy"
True
>>> x == False # aka, "x is falsy"
True
>>> y is True # aka, "y is True singleton"
False
>>> x is False # aka, "x is False singleton"
False
>>> y is not True
True
>>> y is not False
True
// Java code
switch (file.getType()) {
    case FileTypes.HTML:
        return "HTML Document";
    case FileTypes.DOC:
        return "MS Word";
    case FileTypes.EXCEL:
        return "MS Excel";
    default:
        return null;
    // ...
}
ftype = file.type
if ftype == "text/html":
    return "HTML Document"
elif ftype == "application/ms-word":
    return "MS Word"
elif ftype == "application/ms-excel":
    return "MS Excel"
else:
    return None
>>> x = []
>>> x.append(5)
>>> x.extend([6, 7, 8])
>>> x
[5, 6, 7, 8]
>>> x.reverse()
>>> x
[8, 7, 6, 5]
Lists are reference types, which means they simply contain pointers to objects that exist elsewhere.
Lists can be altered (mutated) at will, which does not require recreation of the entire list.
Lists can be joined with other lists using extend, which appends to the existing list in place and does not create a new list. The + operator, by contrast, builds a new list.
Lists have a convenient syntax for accessing single elements or even ranges of elements. Accessing a range is called slicing, and is written with the [i:j:stride] syntax or with the slice built-in function.
Lists can be aliased simply by binding a new label to the list. This sometimes leads to bugs!
Lists can also be arbitrarily nested, or contain other types altogether like tuples, sets, or dictionaries.
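To make the aliasing warning concrete, here is a minimal sketch. The variable names are made up for illustration, and the code avoids print so it runs unchanged on Python 2 and Python 3.

```python
# Aliasing: both labels refer to the very same list object.
original = [5, 6, 7, 8]
alias = original            # no copy is made here
alias.append(9)             # mutates the one shared list

# A slice with no bounds makes a (shallow) copy instead.
copied = original[:]
copied.append(10)           # only the copy changes

# The + operator builds a new list; extend mutates in place.
combined = original + [11]  # original is untouched by this line
original.extend([12])       # original (and alias) grow by one item

# Slicing with [i:j:stride]: every second item of the first ten integers.
evens = list(range(10))[0:10:2]
```

After this runs, alias and original are still the same object, so the append and the extend show up under both labels, while copied and combined went their own way. That shared identity is the usual source of the bugs mentioned above.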
>>> d = {}
>>> d['a'] = 5
>>> d['b'] = 4
>>> d['c'] = 18
>>> d
{'a': 5, 'c': 18, 'b': 4}
>>> d['a']
5
>>> e = dict(a=5, b=4, c=18)
>>> e
{'a': 5, 'c': 18, 'b': 4}
Dictionaries contain a "magically stored" set of items, which are key to value mappings.
long = file.type
long2short = {
    "text/html": "HTML Document",
    "application/ms-word": "MS Word",
    "application/ms-excel": "MS Excel",
}
return long2short.get(long, None)
>>> ppl = ["Andrew", "Joe", "Bob", "Joe", "Bob", "Andrew"]
>>> ppl = set(ppl)
>>> print ppl
set(["Andrew", "Joe", "Bob"])
>>> trainers = set(["Andrew"])
>>> print ppl - trainers
set(["Joe", "Bob"])
Similarly to dictionaries, sets "magically" store their values. Since a set can only store unique (hashable) values, building one from another collection, like a list, removes the duplicates. Sets also support fast set operations like union, intersection, and difference.
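A short sketch of the set operators may help here. The names below are invented for illustration, and set([...]) is used so that the code reads like the Python 2 sessions above while also running on Python 3.

```python
attendees = set(["Andrew", "Joe", "Bob"])
trainers = set(["Andrew", "Didier"])

everyone = attendees | trainers   # union: in either set
both = attendees & trainers       # intersection: in both sets
students = attendees - trainers   # difference: attendees who are not trainers

# De-duplicating a list is just a round trip through set().
unique_sorted = sorted(set([3, 1, 3, 2, 1]))
```

Note that sets have no defined order, which is why the last line sorts the result before using it.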
Good:
for key in d:
    print key
Bad:
for key in d.keys():
    print key
For consistency, use key in dict, not dict.has_key():
# do this:
if key in d:
    ...do something with d[key]
# not this:
if d.has_key(key):
    ...do something with d[key]
A list is a mutable heterogeneous sequence
A tuple is an immutable heterogeneous sequence
i.e., a list that can't be changed after creation
Why provide a less general type of collection?
>>> primes = (2, 3, 5, 7)
>>> print primes[0], primes[-1]
2 7
>>> empty_tuple = ()
>>> print len(empty_tuple)
0
>>> one_tuple = (0,)
>>> print len(one_tuple)
1
You must write (val,) for one-tuples, because of a grammar ambiguity: parentheses are overloaded for tuples, function invocation syntax, and operator grouping, so (val) on its own is just val.
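The ambiguity is easy to see in a few lines. This is only an illustrative sketch with made-up variable names:

```python
grouped = (0)        # parentheses only group here: this is the int 0
one_tuple = (0,)     # the trailing comma is what makes it a tuple
also_tuple = 0,      # the parentheses are optional; the comma is not

kinds = (type(grouped).__name__, type(one_tuple).__name__)
```

In other words, it is the comma, not the parentheses, that creates the tuple.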
>>> pairs = ((1, 10), (2, 20), (3, 30), (4, 40))
>>> for low, high in pairs:
...     print low + high
...
11
22
33
44
// Java code
List<String> colors = new ArrayList<String>();
colors.add("yellow");
colors.add("magenta");
colors.add("lavender");
for (int i = 0; i < colors.size(); i++) {
    String color = colors.get(i);
    System.out.println(i + " " + color);
}
Quite a contrast when compared to this!
# Python code
>>> colors = ['yellow', 'magenta', 'lavender']
>>> for i, name in enumerate(colors):
...     print i, name
A Python sequence is any type that supports these operations:
Examples of sequences:
def divide(a, b):
    """Divides operands a and b using integer division.
    Returns a quotient and remainder of division
    operation in a 2-tuple."""
    q = a // b
    r = a - q * b
    return q, r
>>> from mymath import divide
>>> help(divide)
Help on function divide in module mymath:
divide(a, b)
Divides operands a and b using integer division.
Returns a quotient and remainder of division
operation in a 2-tuple.
...
Logic flows around the function body, then re-enters it upon invocation.
>>> divide(5, 2)
(2, 1)
>>> quotient, remainder = divide(5, 2)
>>> print "quotient is %s, remainder is %s" \
... % (quotient, remainder)
quotient is 2, remainder is 1
>>> return_value = divide(5, 2)
>>> print "quotient is %s, remainder is %s" \
... % return_value
quotient is 2, remainder is 1
def connect(host, port=80, scheme="http", timeout=300):
    http = HTTPClient()
    http.connect("{scheme}://{host}:{port}".format(
        scheme=scheme,
        host=host,
        port=port), timeout=timeout)
    return http
Expect to receive a URL of the format you would find on the web:
"http://www.linkedin.com/in/andrewmontalenti"
Implement a function, url_parse, that splits this string into a dictionary with the component parts, including: scheme, port, host, path, fragment (hash), query string. For the above, it would be:
{ "scheme": "http", "host": "www.linkedin.com",
"path": "/in/andrewmontalenti",
"port": 80, "fragment": None, "query": None }
def url_parse(url):
    assert url is not None
    scheme, rest = url.split(":", 1)
    rest = rest[2:]
    offset = rest.find("/")
    if offset == -1:
        host = rest
        path = None
    else:
        host = rest[0:offset]
        path = rest[offset:]
    if ":" in host:
        host, port = host.split(":", 1)
    else:
        port = None
    return dict(
        Scheme=scheme, Host=host,
        Port=port, Path=path)
if __name__ == "__main__":
    print "Running test cases... ",
    url = "http://www.linkedin.com/in/andrewmontalenti"
    parts = url_parse(url)
    assert parts["Scheme"] == "http", "scheme must match"
    assert parts["Host"] == "www.linkedin.com", "host must match"
    assert parts["Port"] is None, "port must match"
    assert parts["Path"] == "/in/andrewmontalenti", "path must match"
    print "OK."
def format_url(d):
    if d["Port"] is None or str(d["Port"]) == "80":
        # leave the port out when it is missing or the HTTP default
        fmt = "{Scheme}://{Host}{Path}"
    else:
        fmt = "{Scheme}://{Host}:{Port}{Path}"
    return fmt.format(**d) # ** ignore that operator for now
url = "http://www.linkedin.com/in/andrewmontalenti"
# end-to-end test
assert format_url(url_parse(url)) == url
Python is a dynamic and opinionated language.
You've already learned the basics:
So far, we learned just enough to be dangerous.
Now, let's act dangerously.
data = []
for line in open('data/commented-data.txt'):
    if line.startswith("#"):
        continue
    data.append(int(line))
print data
data = [1, 2, 3, 4, 5]
f = open('data/output.txt', 'w')
for item in data:
    f.write("%s\n" % item) # file objects have write, not writeline
f.close()
reader = open('data/sizable.txt')
lines = reader.readlines()
sorted(int(line) for line in lines)
Explanation of that last line will come soon!
The only parts of the standard library we have used so far are the ones that are built-in -- either methods of built-in types like list and str, built-in functions like sorted, built-in constants like True, or built-in exception types like NameError.
There are a wealth of other functions available via the import mechanism and modules.
This is also how you utilize 3rd-party modules.
Every file named with {name}.py is a Python module called name automatically. If it's on the $PYTHONPATH, it can be imported. It's that simple!
A Python package is a directory full of Python modules containing a special file, __init__.py, that tells Python that the directory is a package.
Packages are for collections of library code that are too big to fit into single files, or that have some logical substructure (e.g. a central library along with various utility functions that all interact with the central library).
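To see the module and package rules in action, the sketch below builds a tiny package in a temporary directory and imports from it. The names shapes_demo, area, and square are hypothetical, chosen only for this example, and the code is written to run on Python 3.

```python
import importlib
import os
import sys
import tempfile

# Lay out a package on disk:  <root>/shapes_demo/__init__.py
#                             <root>/shapes_demo/area.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "shapes_demo")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")  # an empty file is enough to mark the directory as a package

with open(os.path.join(pkg_dir, "area.py"), "w") as f:
    f.write("def square(side):\n    return side * side\n")

# Putting the parent directory on sys.path has the same effect
# as listing it in $PYTHONPATH before starting the interpreter.
sys.path.insert(0, root)
importlib.invalidate_caches()

from shapes_demo.area import square

result = square(3)
```

The file area.py became the module shapes_demo.area without any registration step; its location on the import path was all that mattered.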
>>> import pprint
>>> from pprint import pprint as pp
Importing functions from other modules is probably the simplest form of code reuse available in the Python language
The import statement has a shorthand for aliasing a symbol, using the as keyword
The from...import syntax tends to be preferred to direct imports
>>> import pprint
>>> pprint = pprint.pprint
Eek, too many pprints!
First, a module.
Then, a label.
Then, a function within a module of the same name!
>>> from pprint import pprint, pformat
>>> pf = pformat
>>> pf is pformat
True
An integer is 32 bits of data...
...that labels can refer to
A string is a sequence of bytes representing characters...
...that labels can refer to
A function is a sequence of bytes representing instructions...
...and yes, labels can refer to them too
This turns out to be very useful, and very powerful
>>> def positive(x): return x >= 0
>>> print filter(positive, [-3, -2, 0, 1, 2])
[0, 1, 2]
>>> def negate(x): return -x
>>> print map(negate, [-3, -2, 0, 1, 2])
[3, 2, 0, -1, -2]
>>> def add(x, y): return x+y
>>> print reduce(add, [-3, -2, 0, 1, 2])
-2
So, you can pass functions around to other functions.
Just the same, functions can also return functions.
These are sometimes known as higher-order functions.
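As a small sketch of a function that returns a function, consider the example below. The names make_multiplier, double, and triple are invented for illustration.

```python
def make_multiplier(factor):
    """Return a new function that multiplies its argument by factor."""
    def multiply(x):
        # factor is remembered from the enclosing call
        return x * factor
    return multiply

double = make_multiplier(2)
triple = make_multiplier(3)

# The returned functions can be passed around like any other value.
doubled = list(map(double, [1, 2, 3]))
```

Each call to make_multiplier produces an independent function, so double and triple do not interfere with one another.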
def simple_func(arg1):
    pass
simple_func = memoize(simple_func)
simple_func = debug_log(simple_func)
Decorators were meant to make the use of higher-order functions easier in Python. Code like the above used to be common, but now, it can be written more simply.
@debug_log
@memoize
def simple_func(arg1):
    pass
Before we talk about decorators, though, we have to understand some more mechanics about functions. The star argument syntax:
def decorator(fn):
    def wrapper_fn(*args, **kwargs):
        # ... do something before ...
        val = fn(*args, **kwargs)
        # ... do something after ...
        return val
    return wrapper_fn

@decorator
def my_fn(arg1, arg2, default1=None):
    pass
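Here is one way the wrapper pattern above could be filled in. This memoize is a hypothetical sketch, not the implementation the slides had in mind: it assumes all arguments are hashable, and slow_square and calls exist only to show the effect.

```python
import functools

def memoize(fn):
    """Cache results of fn, keyed by the arguments it was called with."""
    cache = {}

    @functools.wraps(fn)  # keep fn's name and docstring on the wrapper
    def wrapper_fn(*args, **kwargs):
        # *args collects positional arguments into a tuple,
        # **kwargs collects keyword arguments into a dict.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = fn(*args, **kwargs)  # forward them unchanged
        return cache[key]
    return wrapper_fn

calls = []  # records every time the real function body runs

@memoize
def slow_square(n):
    calls.append(n)
    return n * n

first = slow_square(4)
second = slow_square(4)  # served from the cache; the body does not run again
```

The star syntax is what lets wrapper_fn stand in for any function, whatever its signature happens to be.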
it = iter(s)
while 1:
    try:
        item = it.next()
    except StopIteration:
        break
    process(item)
for item in s:
    process(item)
This is the right way to write it.
If a class supports __iter__, it is said to be Iterable.
The object returned by __iter__ needs to support two methods: __iter__ and next. This object is an Iterator. Think of it like a "cursor" on the sequence, or as a "yielder of values" from the sequence.
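A minimal sketch of a hand-written iterator follows. Countdown is a made-up class for illustration. Python 3 renamed the next method to __next__, so the sketch defines __next__ and aliases next to it, which lets it run on either version.

```python
class Countdown(object):
    """Iterates from start down to 1, acting as its own iterator."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # tells the for loop to stop
        value = self.current
        self.current -= 1
        return value

    next = __next__  # Python 2 spelling of the same method

values = [n for n in Countdown(3)]
```

Because the object is its own cursor, a Countdown can only be walked once; looping over the same instance a second time yields nothing.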
items = [process(item) for item in s]
Python also supports a special syntactic construct that is particularly popular among Pythonistas, known as the iterator expression or list comprehension.
It is basically a declarative version of this code:
items = []
for item in s:
    items.append(process(item))
items = [process(item)
         for item in s
         if item in process_list]
These can be very powerful. The above processes and filters a list at once.
>>> [n ** 2 for n in range(10) if n % 2]
[1, 9, 25, 49, 81]
>>> words = 'The quick brown fox jumps over the lazy dog'.split()
>>> [(w.upper(), w.lower(), len(w)) for w in words]
[('THE', 'the', 3),
('QUICK', 'quick', 5),
('BROWN', 'brown', 5),
('FOX', 'fox', 3),
('JUMPS', 'jumps', 5),
('OVER', 'over', 4),
('THE', 'the', 3),
('LAZY', 'lazy', 4),
('DOG', 'dog', 3)]
>>> import os
>>> from glob import glob
>>> [f for f in glob('*.py*') if os.stat(f).st_size > 6000]
>>> l = range(1000)
>>> [l[i] + l[i+1] for i in range(0, 1000, 2)][0:5]
[1, 5, 9, 13, 17]
Just because you can doesn't mean you should!
Generators take iterators one step further by allowing them to be easily written and lazily evaluated.
def infinity():
    i = 0
    while 1:
        i += 1
        yield i
Yes, this yields an infinite sequence of numbers. (Well, plus or minus overflow)
But calling infinity doesn't yield them all at once. It gives you a generator object, which is an iterator that will return the values specified in the yield expression!
Auto-generates the __iter__() and next() methods.
Saves application state up to the yield keyword.
Raises StopIteration when the generator terminates.
In combination, these features make it easy to create iterators with no more effort than writing a regular function.
>>> inf = infinity()
>>> inf
<generator object infinity at ... >
>>> inf.next()
1
>>> inf.next()
2
>>> for i in range(1000):
...     print inf.next(),
3 4 5 6 7 8 9 ...
>>> for idx, val in enumerate(infinity()):
...     if 1000 < idx < 1005:
...         print idx,
...     if idx == 5000:
...         break
1001 1002 1003 1004
You can think of the yield keyword similarly to the return keyword, but with a twist.
The function containing the yield is replaced with one that simply returns a generator.
The generator only executes your function upon first call of the .next() method. The function runs until it reaches a yield, and when it does, it yields the value and returns it from .next(). But then, execution stops.
Until the next .next().
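The stop-and-resume behavior can be observed directly by having the generator leave a trail. This is only a sketch: two_steps and log are invented names, and the built-in next(gen) is used because it works on both Python 2.6+ and Python 3, where gen.next() is spelled gen.__next__().

```python
log = []

def two_steps():
    log.append("started")
    yield 1
    log.append("resumed")
    yield 2

gen = two_steps()
log_after_call = list(log)    # nothing has run yet: calling only builds the generator

first = next(gen)             # runs up to the first yield, then pauses
log_after_first = list(log)

second = next(gen)            # resumes right after the first yield
log_after_second = list(log)
```

Comparing the three snapshots of log shows that the function body runs in slices, one slice per request for a value.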
def range(num_ints):
    vals = []
    i = 0
    while i < num_ints:
        vals.append(i)
        i += 1
    return vals
def xrange(num_ints):
    i = 0
    while i < num_ints:
        yield i
        i += 1
Similarly to list comprehensions / iterator expressions, generators have a declarative form.
>>> is_even = lambda x: x % 2 == 0
>>> sum(x for x in xrange(10000) if is_even(x))
24995000
>>> (x for x in xrange(10000))
<generator object <genexpr> at ...>
>>> sum([x**2 for x in range(1000)])
332833500
>>> sum(x**2 for x in xrange(1000))
332833500
Our listcomps allocate a new list and populate it with all the values specified in your listcomp expression.
Meanwhile, genexps create a generator object.
If output is only needed as an input for a function expecting a sequence, you should prefer genexps. They will keep memory stable.
If you need to mutate the list (delete or append elements), you need to stick with listcomps.
Most expressions of the form fn([listcomp]) can be rewritten as fn(genexp) safely.
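A rough way to see the memory difference is sys.getsizeof, which reports the size of the container object itself (not of the integers it refers to). The sketch below uses made-up variable names and runs on Python 3.

```python
import sys

squares_list = [x ** 2 for x in range(100000)]   # all values exist right now
squares_gen = (x ** 2 for x in range(100000))    # values are produced on demand

list_bytes = sys.getsizeof(squares_list)   # grows with the number of items
gen_bytes = sys.getsizeof(squares_gen)     # small and constant

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)

# A generator is used up after one pass; the list can be reused.
second_pass_gen = sum(squares_gen)
second_pass_list = sum(squares_list)
```

The last two lines show the trade-off: the genexp saves memory, but it can only be consumed once.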
Is Python a functional language?
Yes.
Is Python an object-oriented language?
Yes.
Is Python schizophrenic?
Maybe.
Perhaps programming paradigms aren't paradigms, but just states of mind.
If you are writing a math utility library, functions may be the best organizing principle.
If you are writing a database-oriented business application, perhaps classes and objects are appropriate.
If you are writing a framework that is meant to be both used and extended, maybe some combination of both is in order.
The standard library shows the flexibility in action.
urllib is basically a set of utility functions for dealing with URLs.
But it can be extended by looking into urllib.FancyURLopener.
pprint and pformat are handy functions for pretty-printing.
But if you need more, you can extend pprint.PrettyPrinter.
Just like modules, classes create a new namespace. But they also have special handling for class attributes.
Every class object is a function which, when called, creates an instance of the class. There is no new keyword.
>>> from profiles import Person
>>> help(Person)
>>> person = Person()
>>> person.<TAB> # (to see methods)
# class with no attributes
class Person(object): pass
person = Person()
# but attributes can be added
person.first_name = "John Doe"
person.age = 42
This is certainly a simple class, but doesn't do much for us.
This is roughly interchangeable with using a dict, except that you don't need string literals to get access to named values.
class CoffeeMaker(object):
    def __init__(self, num_cups, bean="arabica"):
        self.num_cups = num_cups
        self.bean = bean
    def make_coffee(self):
        fmt = ("made {num_cups} cups of coffee "
               "using {bean} beans")
        print fmt.format(num_cups=self.num_cups, bean=self.bean)
>>> c = CoffeeMaker(2)
>>> c.make_coffee()
made 2 cups of coffee using arabica beans
from datetime import datetime
class Employee(object):
    def __init__(self, id_, name, birth_year, role_id=None):
        self.id = id_
        self.name = name
        self.birth_year = birth_year
        self.role_id = role_id
    def get_age(self):
        return datetime.now().year - self.birth_year
    id2role = {
        "FTE": "Full-time Employee",
        "PT": "Part-time Employee",
        "INT": "Intern",
        "CNT": "Contractor"
    }
    def get_role(self):
        if self.role_id is None:
            return None
        return self.id2role.get(self.role_id, "UNKNOWN")
>>> emp1 = Employee(1, "John Doe", 1975, role_id="FTE")
>>> emp2 = Employee(2, "Jane Doe", 1954, role_id="PT")
>>> emp3 = Employee(3, "Peter Travis", 1982)
>>> emp1.get_age()
36
>>> emp2.get_role()
"Part-time Employee"
>>> emp3.get_role() is None
True
>>> employees = (emp1, emp2, emp3)
>>> age_sum = sum(employee.get_age() for employee in employees)
>>> avg_age = age_sum/len(employees)
>>> print avg_age
40
import fetchers
import re
class Client(object):
    # base URL for Guardian open content API
    base_url = 'http://content.guardianapis.com/'
    # Map HTTP paths to instance methods:
    path_method_lookup = (
        ('^/search$', 'search'),
        ('^/tags$', 'tags'),
        ('^/item/(\d+)$', 'item'),
    )
    def __init__(self, api_key, fetcher=None):
        self.api_key = api_key
        self.fetcher = fetcher or fetchers.best_fetcher()
    def search(self, query):
        self.do_call(query)
A real-world example.
class Person(object):
    def __init__(self, name, gender):
        self.name = name
        self.gender = gender
    def __repr__(self):
        return "<%s name=%s, gender=%s>" % (
            self.__class__.__name__,
            self.name, self.gender)
    char2gender = dict(M="Male", F="Female")
    def __str__(self):
        name = self.name
        gender = self.char2gender.get(self.gender, None)
        if gender is None: return name
        return "%s (%s)" % (name, gender)
>>> person = Person("John Doe", "M")
>>> person
<Person name=John Doe, gender=M>
>>> print person
John Doe (Male)
>>> str(person)
"John Doe (Male)"
>>> repr(person)
'<Person name=John Doe, gender=M>'
>>> person.gender = None
>>> print person
John Doe
Every function declared inside a class body is known as a method, which is automatically bound to an instance of the class at initialization time.
Inside a bound method, the first argument, typically named self, contains the instance of the object in question.
Many programmers find this annoying, and wish they could work around it.
I'm one of those programmers.
But I'm here to tell you, in practice, you just get used to it and it ain't so bad.
Plus, GvR blesses this decision and has said it won't change.
Any non-function attributes are considered "class attributes", and are not treated in any special way. Note these are shared for all instances of the class, somewhat similarly to static properties in other languages.
Any attributes added to the instance directly using self.attr = val are considered "instance attributes".
Sometimes, you'll hear Pythonistas refer to the class dictionary and the instance dictionary. That's because, under the hood, Python classes are basically fancy dictionaries.
A built-in, vars(object), lets you return the "instance dictionary" for a given instance of a class. This is useful to fetch all the data of a class in a generic way.
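The split between the class dictionary and the instance dictionary can be poked at with vars. The Robot class below is a made-up example for illustration, not part of the course code.

```python
class Robot(object):
    wheels = 4                      # class attribute: lives in the class dictionary

    def __init__(self, name):
        self.name = name            # instance attribute: lives in the instance dictionary

a = Robot("a")
b = Robot("b")

Robot.wheels = 6                    # changing the class attribute is seen by every instance
wheels_seen_by_a = a.wheels

b.wheels = 2                        # assigning on an instance creates an instance attribute
                                    # that shadows the class attribute for b only

a_dict = vars(a)
b_dict = vars(b)
```

Attribute lookup checks the instance dictionary first and falls back to the class dictionary, which is why a still sees the shared value while b sees its own.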
At its core, a class is simply a class definition and typically, an initialization sequence implemented by __init__.
Instances of that class can act either as data containers or, more typically, bundles of state and behavior.
Classes can let you design object-oriented interfaces that are intuitive for business people to grasp.
You might wonder, with all of this flexibility...
How do I decide between a function and a class?
Only take on as much complexity as you actually need! This is the Python Way!
Let's now do a quick run through a few useful Python standard library modules to give you a sense of what they mean by "batteries included".
>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(xrange(100), 10) # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random() # random float
0.17970987693706186
>>> random.randrange(6) # random integer chosen from range(6)
4
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368
>>> import json
>>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
'["foo", {"bar": ["baz", null, 1.0, 2]}]'
>>> print json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4)
{
"4": 5,
"6": 7
}
>>> json.loads('["foo", {"bar": ["baz", null, 1.0, 2]}]')
[u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
>>> from collections import defaultdict
>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> for k in s:
...     d[k] += 1
>>> d.items()
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
>>> is_even = lambda x: x % 2 == 0
>>> items = sorted(range(1000), key=is_even)
>>> from itertools import groupby
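>>> # copy each group into a list: groupby's groups share one underlying iterator
>>> odd, even = [(key, list(group)) for key, group in groupby(items, is_even)]
>>> odd[1][0]
1
>>> odd[1][1]
3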
...
Use your powers wisely, and always remember...
It's turtles all the way down!
Now it's time to explore a real Python web application that is using some of the best web prototyping technologies around. These include:
https://github.com/amontalenti/fastflask
https://github.com/amontalenti/fastflask/blob/master/README.rst
Fin!
Your host:
Andrew Montalenti, http://pixelmonkey.org
CTO, Parse.ly, http://parse.ly
Principal, Aleph Point, http://alephpoint.com