Pages tagged as ‘py3k’

Jinja 3K

March 27th, 2008

So Python 3 is getting ready, and looking at all the libraries I maintain I was really afraid of the final release. With all those language changes, porting sounded like a horrible job. But there is that nice 2to3 script, which is supposed to transform source code from Python 2 to Python 3 in a semi-automatic manner. And what should I say? It kicks ass.

Getting the current hg head of Jinja running on top of Python 3 was a matter of roughly 15 minutes. The only things I had to change were adding a missing sys import, needed because of the intern -> sys.intern translation, and fixing a couple of sort() calls. Sort no longer accepts a comparison function, so you have to use a key function instead. But that was easy to solve, and the old code was only there for Python 2.3 backwards compatibility anyway. The other minor thing I had to change was unicode behavior. Previously Jinja had an env.to_unicode function that coerced bytestrings and unicode strings into unicode strings depending on the encoding setting of the environment. With Python 3 everything uses unicode internally, so there is no need for an environment charset anymore, and I replaced env.to_unicode with str everywhere. The other unicode change was replacing file(foo, 'r').read().decode(charset) (pseudocode) with open(foo, encoding=charset).read().
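The three kinds of changes described above can be sketched roughly like this. It is a minimal illustration and not the actual Jinja code: read_template, legacy_cmp and the sample data are made-up names, and functools.cmp_to_key only exists in later Python versions than the alpha discussed here.

```python
import functools
import os
import sys
import tempfile

# intern() is no longer a builtin; it lives in the sys module now.
token = sys.intern("endblock")

# list.sort() lost its cmp argument; sort by a key function instead.
pairs = [("b", 2), ("a", 1), ("c", 0)]
pairs.sort(key=lambda pair: pair[1])


# A legacy comparison function can still be adapted with cmp_to_key.
def legacy_cmp(left, right):
    return (left > right) - (left < right)


names = sorted(["mako", "jinja", "genshi"], key=functools.cmp_to_key(legacy_cmp))


def read_template(path, charset="utf-8"):
    """Hypothetical helper: let open() do the decoding."""
    with open(path, encoding=charset) as handle:
        return handle.read()


# Usage: write some non-ASCII bytes, then read them back as str.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "hello.html")
    with open(path, "wb") as handle:
        handle.write("Grüße {{ name }}".encode("utf-8"))
    source = read_template(path)
```

The point of read_template is that the decoding step moves into open(), so the caller only ever sees str objects.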

That’s it! Well, I must admit Jinja is by far the easiest to port of the trio of templating languages I use (Jinja, Mako, Genshi), as it has its own parser and does not rely on the now removed compiler module, which was superseded by the new _ast module with a different structure. So how good the conversion turns out clearly depends on what you’re using in your library. For most libraries 2to3 will do a very good job.

Oh. What it also doesn’t convert are C extensions :) I don’t know if the Jinja traceback tools or the speedups module work in Python 3 too, as they are optional and I was too lazy to compile them, but that’s another thing you have to keep in mind. Also I was unable to run the Jinja testsuite, as py.test doesn’t work on Python 3 so far.

But I’m very happy. If the process stays that simple, my forecast of not using Python 3 before 2012 may be wrong. So, go on, port your libraries now, at least for testing purposes, and let’s make some of the best ones available right with the Python 3.0 release, which is scheduled for September. If the initial number of working libraries is high enough for some applications, it improves Python’s chances for a quicker adoption a lot. I know it’s very unlikely for big projects to switch in a short timeframe, but at least smaller projects will be able to benefit from Python 3 earlier.

And even if Guido disagrees: Python 3 is *the* chance to break *small pieces* of the API. Don’t break them in a way that leaves everybody confused and unsure how things work any more. But if there are some design flaws in the library you want to fix, change them with the move to Python 3 rather than between two Python 2 versions. (And of course document them!)

Multiple Inheritance Considered Awesome

March 3rd, 2008

I must admit that I was not that much interested in Python 3, pretty much because I maintain a couple of libraries and all that backwards incompatible stuff looks like a pain in the ass. Additionally lots of the stuff from the PEPs looks like it completely changes the way the language behaves. But I was wrong.

One of the things I always liked about Python was the ability to use multiple inheritance. Back in the pocoo days we started using interfaces in a Zope/trac-like manner and then noticed that we can do the very same with simple base classes and isinstance calls too. Yes, of course you can abuse that and create inheritance trees nobody wants to look at, but seriously, your fault then.

But what always confused me was that nobody used Ruby-like mixins. Multiple inheritance is predestined for mixing in functionality, but the only class in the standard library that did something like that was DictMixin. And that mixin had another problem: it was not a subclass of dict (obviously). Now many libraries do something like isinstance(foo, dict) to switch between modes. A very common situation where this is necessary is if you want to accept iterables of tuples or dicts. This is a situation where duck typing doesn’t really work out. Of course you can do hasattr(x, 'items') to check if that object implements the dict interface, but that is ugly and can lead to unexpected behavior. For example: what’s a dict? There are ancient dict-like implementations that are missing the __contains__ method and just have has_key, for example.

In many languages that lack multiple inheritance this is solved by specifying an IMapping interface and implementing that in custom classes. But of course Python can do better, and with Python 3 it finally did. And it did that in an incredibly cool way that integrates nicely into the language and doesn’t break the Zen of Python, which is freaking awesome.

So what’s the solution Python 3 has? Abstract base classes! So how do they work? As I already said above, in the past Python developers usually relied on two things: duck typing (testing if an object implements a specific method) or instance checks against specific types. Now abstract base classes do both in a clever way. The builtin isinstance and issubclass functions are now overridable via __instancecheck__ and __subclasscheck__. While I doubt that anyone will override those by hand, there is some cool metaclass magic going on in the abstract base classes that does that for you.
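To make the __instancecheck__ hook a bit more tangible, here is a minimal sketch of overriding it by hand on a metaclass. The names QuackMeta, Quacker, Duck and Stone are invented for the illustration; the real abstract base classes go through abc.ABCMeta rather than a hand-written metaclass like this one.

```python
class QuackMeta(type):
    """Hypothetical metaclass: isinstance() asks the class of the *class*."""

    def __instancecheck__(cls, instance):
        # Duck-typing test wrapped into an isinstance() check.
        return callable(getattr(instance, "quack", None))


class Quacker(metaclass=QuackMeta):
    """Never used as a base class, only as a target for isinstance()."""


class Duck:
    def quack(self):
        return "Quack!"


class Stone:
    pass


duck_is_quacker = isinstance(Duck(), Quacker)
stone_is_quacker = isinstance(Stone(), Quacker)
```

Duck does not inherit from Quacker at all, yet the first check succeeds because the metaclass answers the question on its behalf.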

So an abstract base class isn’t necessarily a base class of the object you are testing against, but it can be. Let me give you a small example. In Python 2.4 testing if an object is iterable worked like this:

try:
    iter(obj)
except TypeError:
    do_something_with_not_iterable_object(obj)
else:
    do_something_with_iterable_object(obj)

That wasn’t that bad, and it has the advantage that we don’t have to implement some IIterable interface in all the iterable things. To check if an object is iterable, a call to iter() is enough. But with Python 3 we also have an abstract base class called Iterable which we can use for testing now:

from collections.abc import Iterable  # released Python 3 keeps the ABCs in collections.abc
if isinstance(obj, Iterable):
    the_object_is_iterable()
else:
    the_object_is_not_iterable()

But how does that work? The obj does not necessarily inherit from that class. As said above, the metaclass of the abstract base class Iterable overrides the test functions and performs the checking for us. It sees that the type of the object defines __iter__ and returns True so that we can react to it. One caveat: objects that only support the old sequence iteration protocol (__getitem__ without __iter__) are not recognized by this check, so the iter() call is still the more thorough test for those.
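A rough sketch of that mechanism, assuming a made-up Renderable ABC: the class-level hook __subclasshook__ is what ABCMeta consults during isinstance() and issubclass(). As far as released Python 3 versions go, Iterable uses the same kind of hook and only looks for __iter__, which is why the __getitem__-only class below is not recognized even though iter() accepts it.

```python
from abc import ABC
from collections.abc import Iterable


class Renderable(ABC):
    """Hypothetical ABC: anything whose class defines render()."""

    @classmethod
    def __subclasshook__(cls, candidate):
        if cls is Renderable:
            # Look for the method anywhere in the candidate's MRO.
            if any("render" in vars(base) for base in candidate.__mro__):
                return True
        return NotImplemented  # fall back to the normal checks


class Template:
    def render(self):
        return "<h1>Hello</h1>"


class OldStyleSequence:
    """Iterable only through the old __getitem__ protocol."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index


template_matches = isinstance(Template(), Renderable)
number_matches = isinstance(42, Renderable)
old_style_is_iterable = isinstance(OldStyleSequence(), Iterable)
old_style_values = list(iter(OldStyleSequence()))
```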

Additionally the metaclass of the abstract base classes keeps a registry of classes that provide a compatible interface. This makes it possible to let isinstance(some_dict, Mapping) return True in no time by just comparing the object type against the list of known classes that registered themselves for the abstract base class.

This happens for example for all the builtin classes. Inside the python module that specifies the ABCs this piece of code can be found:

MutableMapping.register(dict)
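Because of that registration, the "iterable of tuples or dict" situation mentioned earlier can be handled with one check against Mapping. This is only a sketch; iter_pairs is a hypothetical helper name, and the import uses collections.abc as in released Python 3.

```python
from collections import OrderedDict
from collections.abc import Mapping


def iter_pairs(data):
    """Hypothetical helper: yield (key, value) pairs from either input shape."""
    if isinstance(data, Mapping):
        # Anything registered as (or derived from) Mapping ends up here.
        return iter(data.items())
    # Otherwise assume an iterable of 2-tuples.
    return iter(data)


from_dict = list(iter_pairs({"a": 1}))
from_tuples = list(iter_pairs([("a", 1), ("b", 2)]))
from_ordered = list(iter_pairs(OrderedDict([("x", 1), ("y", 2)])))
```

The check no longer cares whether the object is a real dict, only whether its type claims to be a mapping.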

Imagine you wrote your own C extension that implements a cool linked list implementation. All you have to do to register your list as Sequence is this piece of code:

from _yourlinkedlist import YourLinkedList
from collections.abc import Sequence  # released Python 3 keeps the ABCs in collections.abc
Sequence.register(YourLinkedList)
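Since the C extension is imaginary, here is the same idea with a pure-Python stand-in. TinyLinkedList is a made-up class; the part worth noticing is that register() only affects isinstance() and issubclass(). It neither verifies the interface nor adds the mixin methods.

```python
from collections.abc import Sequence


class TinyLinkedList:
    """Hypothetical stand-in for the C-implemented linked list."""

    def __init__(self, values=()):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __getitem__(self, index):
        return self._values[index]


# Declare the class a "virtual subclass" of Sequence.
Sequence.register(TinyLinkedList)

numbers = TinyLinkedList([1, 2, 3])
is_sequence = isinstance(numbers, Sequence)
in_mro = Sequence in TinyLinkedList.__mro__
has_index_method = hasattr(numbers, "index")
```

If the mixin methods such as index() or count() are wanted as well, the class has to actually inherit from Sequence instead of just registering.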

But that’s just one way to use abstract base classes. The nicest feature of them is that they ship a lot of the annoying repetitive bootstrapping code for you. For example in werkzeug I wrote that nice HeaderSet which is basically a sorted case-insensitive set. But what I do not implement is __and__ and all the other set stuff because I’m a) lazy and b) doubt that someone will seriously miss it for the use case of that object. But what if the set behavior came for free by just subclassing Set? That’s what’s now possible in Python 3*:

from collections.abc import Set  # released Python 3 keeps the ABCs in collections.abc

class HeaderSet(Set):

    def __init__(self, initial=()):
        self._ordering = list(initial)
        self._storage = set(map(str.lower, initial))

    def __contains__(self, x):
        return x.lower() in self._storage

    def __iter__(self):
        return iter(self._ordering)

    def __len__(self):
        return len(self._ordering)

    def __le__(self, other):
        return self._storage.__le__(other)

* actually I lied here. Right now it’s not possible. The basics work, but as soon as I do foo & bar I either get a NameError (itertools not defined) or a TypeError because __rle__ is not defined. But that’s why it’s an alpha ;-)
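For comparison, here is a sketch of the same idea against a released Python 3, where the Set mixin methods do work. It is a simplified variant and not the werkzeug implementation: it drops the custom __le__, deduplicates in __init__, and relies on the default Set behavior of building results by calling the class with an iterable.

```python
from collections.abc import Set


class HeaderSet(Set):
    """Simplified case-insensitive set that remembers insertion order."""

    def __init__(self, initial=()):
        self._ordering = []
        self._storage = set()
        for item in initial:
            key = item.lower()
            if key not in self._storage:
                self._storage.add(key)
                self._ordering.append(item)

    def __contains__(self, x):
        return x.lower() in self._storage

    def __iter__(self):
        return iter(self._ordering)

    def __len__(self):
        return len(self._ordering)


a = HeaderSet(["Content-Type", "Accept"])
b = HeaderSet(["accept", "Host"])

# None of these operators are defined above; they come from Set.
common = list(a & b)
combined = list(a | b)
only_in_a = list(a - b)
```

Only __contains__, __iter__ and __len__ are written by hand; the intersection, union and difference operators are inherited from the abstract base class.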

And what has this to do with multiple inheritance? In that example it might not be obvious but have a look at the class graph for that HeaderSet:

Set(Sized, Iterable, Container)
    HeaderSet

Very cool stuff. So what can this be used for? Specifying otherwise unspecified protocols. For example it was a common idiom in Python 2.x to accept io-like objects by duck typing. In Django it’s pretty common to do something like this:

response = HttpResponse(mimetype='image/png')
my_pil_image.save(response)
return response

But it was pretty much unspecified what PIL calls on that response object. Just write()? write() and writelines()? Does it seek()? Now we have io.IOBase, io.BufferedIOBase and a lot more. This is great stuff and seriously, I can’t wait to port my stuff to Python 3.
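As a small illustration of what a specified protocol buys you, a function can state which kind of stream it expects and check it up front. This is a sketch only: write_png_signature is a made-up function and has nothing to do with what PIL or Django actually do.

```python
import io

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def write_png_signature(stream):
    """Hypothetical writer that insists on a writable binary stream."""
    if not isinstance(stream, (io.RawIOBase, io.BufferedIOBase)):
        raise TypeError("expected a binary stream, got %r" % type(stream).__name__)
    if not stream.writable():
        raise ValueError("stream is not writable")
    return stream.write(PNG_SIGNATURE)


buffer = io.BytesIO()
written = write_png_signature(buffer)

try:
    write_png_signature(io.StringIO())  # a text stream, not a binary one
    text_stream_rejected = False
except TypeError:
    text_stream_rejected = True
```

A response-like class that wants to pass this check would have to derive from one of the io base classes or register itself with one, which is exactly the kind of explicit contract the duck-typed version lacked.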
