Cogitations

Command of the Day

ssh-keygen -t dsa -b 1024 -f /etc/ssh/ssh_host_dsa_key -N '' &&
  ssh-keygen -t rsa -b 1024 -f /etc/ssh/ssh_host_rsa_key -N '' &&
  /etc/init.d/ssh restart

I really don’t think distributors should try to patch cryptographic stuff, especially not to silence debuggers.

Mail Problems

May 12th, 2008

A short notice for anyone who tried to mail me during the last three days or so: while I was away in Vienna I noticed that HE disabled the e-mail routing for the domains I moved to domainfactory. Mails sent to me between Friday and today are probably lost.

Jinja2 Documentation Online

May 7th, 2008

I have now uploaded the documentation for Jinja2 to the website for those of you who are eager and want to play with it :-) On jinja.pocoo.org you can now choose between Jinja1 and Jinja2.

The new docs are powered by Sphinx and Jinja2 with a custom templating bridge.

Read the documentation.

Simple batch function for Python

Often I have an iterable I want to group, for example a list of integers that I want to process two at a time. Here is a pretty nice idiom I found in the documentation, translated to itertools:

from itertools import izip, repeat

def batch(iterable, n):
    return izip(*repeat(iter(iterable), n))

Use it like that:

>>> for key, value in batch([1, 2, 3, 4], 2):
...  print key, value
... 
1 2
3 4
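
On Python 3, where izip is gone and the builtin zip is already lazy, the same idiom could be sketched roughly like this. It is only an adaptation of the snippet above, not something from the documentation. Note that in both versions leftover items that do not fill a whole batch are silently dropped.

```python
from itertools import repeat

def batch(iterable, n):
    # repeat() hands out the *same* iterator n times, so zip() pulls
    # n consecutive items from it for every tuple it builds.
    return zip(*repeat(iter(iterable), n))

for key, value in batch([1, 2, 3, 4], 2):
    print(key, value)
```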

rst2html + git == personal wiki

This Makefile:

RSTOPTS=--time --link-stylesheet --stylesheet=style.css

SOURCES=$(wildcard *.rst)
HTML=$(foreach file,$(SOURCES),_build/$(basename $(file)).html)

all: html

_build/%.html: %.rst
        rst2html.py $(RSTOPTS) $^ > $@

html: $(HTML)

clean:
        rm -f $(HTML)

plus make html in .git/hooks/post-{commit,update} + python and docutils + a stylesheet in _build (all paths relative to your repository) is the perfect cross platform wiki :-)

Notice: my blog kills the tabs, copy/paste from the pastebin

How super() in Python3 works and why it’s retarded

I’m deeply sorry for the title of that post, but I hope that gives the topic the awareness I think it should get. In the last weeks something remarkable happened in the Python3 sources: self kinda became implicit. Not in function definitions, but in super calls. But not only self: also the class passed to super. That’s remarkable because it means that the language shifts into a completely different direction.

super was rarely used in the past, mainly because it was weird to use. In the most common use case the current class and the current instance were passed to it, and the super object it returned looked up the parent methods on the MRO for you. It was useful for multiple inheritance and mixin classes that don’t know their parent, but confusing for many.
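
To make the explicit form concrete, here is a minimal sketch of a mixin that doesn't know its parent. The class names (Base, LoggingMixin, Widget) and the describe method are made up for illustration; the two-argument spelling works on Python 2 new-style classes as well as on Python 3.

```python
class Base(object):
    def describe(self):
        return ["Base"]

class LoggingMixin(object):
    # The mixin has no idea which class follows it on the MRO;
    # super(LoggingMixin, self) resolves that at call time.
    def describe(self):
        return ["LoggingMixin"] + super(LoggingMixin, self).describe()

class Widget(LoggingMixin, Base):
    def describe(self):
        # Explicit class and explicit instance, as described above.
        return ["Widget"] + super(Widget, self).describe()

print([cls.__name__ for cls in Widget.__mro__])
print(Widget().describe())
```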

The main problem with replacing super(Foo, self).bar() with something like super.bar() is that self is explicit and the class (in that case Foo) can’t be determined by the caller. Furthermore the Python principle was always against functions doing stack introspection to find the caller. There are few examples in the stdlib or builtins that do some sort of caller introspection. Those are the special functions vars(), locals(), globals(), and __import__ and some functions in the inspect module. Four functions, and all of them do nothing more than getting the current frame and accessing the dict of locals or globals. What super in current Python 3 builds does goes way beyond that.

Currently if super is called without arguments Python performs these steps:

  • getting the current frame of the caller as well as the code object.
  • looking at “co_argcount” to make sure there is a first argument, if there is one it gets the object from the “f_localsplus” array on the frame object. This is btw an attribute not accessible from the Python code.
  • then it checks the “co_freevars” of the code object and iterates over all of them to check if one of them is “__class__” (because accessing __class__ in Python 3 creates a special bytecode that returns the class the function was defined in).
  • If it can’t find __class__ in there it dies. How does __class__ end up there? Apparently the compiler checks if “super” or “__class__” is accessed. That’s right: it breaks if you alias super to another name and try to call that name.
  • Once it has that information it uses it as the first two arguments: the class and the reference to self.
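
The steps above can be poked at from Python, at least on a current CPython 3. This is a small sketch with hypothetical class names (Base, Child, Broken) and a hypothetical alias my_super; the exact exception type has differed between releases, so both candidates are caught.

```python
class Base:
    def greet(self):
        return "Base"

class Child(Base):
    def greet(self):
        # The name "super" is used in this body, so the compiler makes
        # __class__ a free variable of the function.
        return "Child > " + super().greet()

my_super = super  # hypothetical alias for the builtin

class Broken(Base):
    def greet(self):
        # Only the alias is used here, so the function gets no
        # __class__ cell and the argument-less call cannot work.
        return "Broken > " + my_super().greet()

print(Child.greet.__code__.co_freevars)
print(Broken.greet.__code__.co_freevars)
print(Child().greet())

try:
    Broken().greet()
except (RuntimeError, SystemError) as exc:
    print("aliased super failed:", exc)
```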

I’m sorry, but that’s a very, very bad idea. It’s way more magical than anything we’ve had in Python in the past and just doesn’t fit into the language. We do have an explicit self in methods and we do not really have methods. Our methods are functions, just that a descriptor puts a method object around them to pass self as the first argument. That’s an incredibly cool thing and makes things very simple and non-magical. Breaking that principle by coming up with an automatic super harms the whole thing a lot. Defs in classes are not completely different from defs in the global scope or within another def.
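
A minimal sketch of that "methods are just functions" point, with made-up names (shout, Person): a plain module-level def becomes a method simply by being stored on a class, and the binding step can be done by hand through the function's __get__.

```python
def shout(self):
    # An ordinary module-level function with an explicit first argument.
    return self.name.upper() + "!"

class Person:
    def __init__(self, name):
        self.name = name

# Storing the function on the class is all it takes.
Person.shout = shout

p = Person("guido")
raw = Person.__dict__["shout"]   # still the very same function object
bound = raw.__get__(p, Person)   # what attribute access on p does for us

print(raw is shout)
print(bound())
print(p.shout())
```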

Another odd thing is that Python 3 starts keeping information on the C layer that we can’t access from within Python, which is a shame. Super is one example: it’s currently impossible to implement it from within Python. The other good example in Python 3 is methods. They are no longer wrapped by a descriptor if they are accessed via their classes. This as such is not a problem, as you can call them the same way (just that you can call them with completely different receivers now), but it becomes a problem if some of the functions are marked as staticmethods. Then they look exactly the same from the class’s perspective:

>>> class C:
...  normal = lambda x: None
...  static = staticmethod(lambda x: None)
... 
>>> type(C.normal) is type(C.static)
True
>>> C.normal
<function <lambda> at 0x4da150>

As far as I can see a documentation tool has no chance to keep them apart even though they are completely different on an instance:

>>> type(C().normal) is type(C().static)
False
>>> C().normal
<bound method C.<lambda> of <__main__.C object at 0x4dbcf0>>
>>> C().static
<function <lambda> at 0x4da198>
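
A hedged aside: the comparison above goes through normal attribute access. On a current Python 3, a tool that is allowed to look at the raw class dictionary instead (directly or via inspect.getattr_static) does seem to have a way to keep the two apart. A small sketch, reusing the class C from above:

```python
import inspect

class C:
    normal = lambda x: None
    static = staticmethod(lambda x: None)

# Through attribute access both look like plain functions.
print(type(C.normal) is type(C.static))

# The raw class dictionary still holds the staticmethod wrapper.
print(isinstance(C.__dict__["normal"], staticmethod))
print(isinstance(C.__dict__["static"], staticmethod))

# inspect.getattr_static fetches the attribute without invoking descriptors.
print(isinstance(inspect.getattr_static(C, "static"), staticmethod))
```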

While I was quite happy with the Python 3 progress so far, these two things are a major, major step in the wrong direction. I really hope that will be rolled back. If there is need for an automatic super, self has to go away and __class__ has to become a free variable all the time, or super a keyword. Everything else is just too magical.

Update: I posted the subject on the python-dev mailing list.

The Pythonistas are Wrong

There’s something that’s been bugging me for a long time that I need to get off my chest. Some of you may hate me for it, but perhaps there are others out there with the same complaint, silently in agony, wishing for death to take the pain away. It’s time to set the record straight, and prove once and for all that the Pythonistas are wrong.

Pythons almost NEVER look like this:
python logo

The frog shown here is what the Python Foundation refers to as a “snake” (though it looks more like a frog), more specifically a blue/yellow one. The name “Python” however refers to a group of six British Gentlemen* and something like 86.43% of people know that. The name was chosen because snakes just suck. Get it? It’s not a snake, they are British.

Pythons however are better represented by a 16-ton weight or a dead parrot. But they are NOT represented by snakes.

scipy logo
See that one in the scipy logo? That’s a public domain circle someone added a white snake to. A SNAKE. Look at the wikipedia article and search for “snake”. Yeah, no match.

pycon08 logo
Even the Pycon (where Guido van Rossum himself spoke) has made the mistake of choosing this stupid snake.

xml tag python
lxml is doing it wrong too.

And probably your favourite Python module too. So keep in mind: Pythons are not Snakes! And I think that proves once and for all that there are tons of projects with the wrong logo out there.

Sorry headius for taking advantage of your blog post but I wanted to blog about that for quite some time anyways ;-)

Update: fixed my mistake about all Pythons being British. Thanks Joe Pantuso.
Update 2: apparently they are all British now. *Terry Gilliam renounced his American citizenship. Thanks meow

JavaScript WTF

Today the daily wtf features the well known JavaScript parseInt behavior regarding octal numbers. I know that one, it’s old and not really a WTF. The only stupid thing about it is that the error return value of parseInt is 0 rather than undefined or an exception. But what I found today by accident is this (tested with Firefox):

>>> eval("09")
9
>>> eval("0100")
64
>>> parseInt("09")
0
>>> parseInt("0100")
64

Now that’s a WTF ;-)
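
For the parseInt half of this, a rough Python model of the old radix guessing may help to see where the 0 comes from. This is only a sketch under assumptions: legacy_parse_int is a made-up name, and it models the behavior shown above (leading 0 selects octal, parsing stops at the first invalid digit), not any real engine's code. It says nothing about why eval treats "09" as decimal.

```python
import string

def legacy_parse_int(text):
    """Rough model of old parseInt(text) called without a radix."""
    s = text.strip()
    sign = 1
    if s[:1] in ("+", "-"):
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    if s[:2].lower() == "0x":
        radix, s = 16, s[2:]
    elif len(s) > 1 and s[0] == "0":
        radix = 8          # the leading zero is taken as "octal"
    else:
        radix = 10
    digits = (string.digits + string.ascii_lowercase)[:radix]
    value, seen = 0, False
    for ch in s.lower():
        if ch not in digits:
            break          # stop at the first character that is no digit
        value = value * radix + digits.index(ch)
        seen = True
    return sign * value if seen else float("nan")

print(legacy_parse_int("09"))
print(legacy_parse_int("0100"))
```

In this model "09" is read as octal, the 0 is consumed, and the 9 ends the number, which leaves 0. That is a partial parse rather than an error value.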

Werkzeug Talk at GLT 08

April 20th, 2008

Yesterday I held my talk about Werkzeug at the Grazer Linuxtage. Unfortunately it did not go as well as I had hoped, so I’m not unhappy that there is no audio/video record of it ;-)

The slides however are useful I think so I uploaded them in German and English:

Hope someone finds them useful.

Jinja2 — Making Things Awesome

April 13th, 2008

Some of you might have noticed that pocoo.org redirects to the development center now. No direct reference to the old pocoo project any more. The reason for this is that a part of the pocoo team and the ubuntuusers webteam is working on a new software that combines wiki, forum, planet, pastebin and blog into one big portal software. It’s not yet open sourced but it will become a pocoo project, most likely licensed under the GPL around June, with a semi stable release at the end of 2008. I’ll post some details the next week I think.

But why am I writing about this? Mainly because the work on inyoka (the name of the pocoo successor) showed some weaknesses in Jinja and we did some brainstorming to come up with really cool improvements for Jinja that resulted in a complete rewrite. Because we love backwards compatibility these changes will go into a package called “jinja2” and both Jinja 1.x and Jinja 2.x will get updates in the future. Also, Jinja 2 is not yet released and you shouldn’t expect a release anytime soon. If you want to play with it you will need the current hg version.

But what are the changes compared to Jinja 1 and why is 2 better? What we noticed when working on inyoka is that quite a lot of display logic (not application logic though) goes into the templates. This isn’t a bad thing actually, but it showed that especially for large templates with many dynamic parts such as loops or even variable tags Jinja doesn’t have the performance we wished it would have. Especially when templates become more complex, quite a lot of CPU time is spent there, and there was room for improvement in Jinja. Unfortunately a design mistake made it impossible to do any optimization there and I was forced to change the scoping rules to something saner. I would say for 98% of the templates the changed scoping won’t make any difference as many have avoided the side effects of the old scoping anyways. But still, it’s a change that will break things, so Jinja 2.

But right after I decided to break backwards compatibility there was more room for improvement, which resulted in some real kickass features. Finally filters are simple python functions again and not factories any longer. They now also support keyword arguments, a feature that has been on the wishlist for nearly as long as Jinja exists. Dynamic inheritance is now something that works without pain too, and the template inclusions were simplified. And that also means that templates can be included into namespaces now. So {% include macros = "helpers/macros.html" %} gives you an object which you can print or where you can access all the macros or defined variables etc.
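
To illustrate the difference between factory filters and plain function filters, here is a toy sketch in pure Python. It does not use Jinja itself: the names (do_truncate_factory, do_truncate, apply_filter) are made up, and the factory shape is only an assumption about what "factories" roughly looked like.

```python
# Factory style (assumed shape): the registered object receives the
# template arguments and returns the function that does the work.
def do_truncate_factory(length=10):
    def truncate(value):
        return value[:length]
    return truncate

# Plain function style: the filter is the function itself. The value
# comes first, template arguments follow, keyword arguments included.
def do_truncate(value, length=10, end=""):
    return value[:length] + end

def apply_filter(func, value, *args, **kwargs):
    # Stand-in for what an engine would do for {{ value|f(...) }}.
    return func(value, *args, **kwargs)

print(do_truncate_factory(3)("abcdef"))
print(apply_filter(do_truncate, "abcdef", length=3, end="..."))
```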

What really makes me happy is that on the surface few things have changed. Most of the implementation details are hidden in the compiler and are not visible in the templates. For example, Jinja 2 knows five different ways to iterate over sequences, and none of that is visible in templates at all. Even with the changed scoping rules all of the user friendliness is still in place.

Other nice changes are that you can now filter for loops while iterating over them like list comprehensions in python and that the lexer knows line statements now. Line statements are lines prefixed with some optional whitespace and one or more marker characters. Everything after that until the end of the line is a single statement. That was shamelessly stolen from Mako and Cheetah and is something you have to enable explicitly, but I kinda like it for some template scenarios:

# for item in seq
  * {{ item }}
# endfor
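
The rule for line statements can be sketched as a tiny text pre-pass that rewrites such lines into ordinary block tags. This is only an illustration and not how the Jinja lexer actually does it; expand_line_statements and its prefix parameter are hypothetical names.

```python
import re

def expand_line_statements(source, prefix="#"):
    # Optional whitespace, one or more marker characters, then the
    # statement itself up to the end of the line.
    pattern = re.compile(r"^\s*" + re.escape(prefix) + r"+\s*(.*?)\s*$")
    out = []
    for line in source.splitlines():
        match = pattern.match(line)
        if match:
            out.append("{% " + match.group(1) + " %}")
        else:
            out.append(line)
    return "\n".join(out)

template = "# for item in seq\n  * {{ item }}\n# endfor"
print(expand_line_statements(template))
```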

Another very neat idea we implemented in Jinja 2 (wasn’t my idea unfortunately ^^) is that we broke the template scoping into globals and locals. Globals are known at compile time and render time, locals are known at render time only. That way Jinja can automatically evaluate parts of the template at compile time. The idea is that in many templates you have data that doesn’t change that often. A good example is a forum software. The list of forums changes seldom, at ubuntuusers about twice a year, but any information that lives longer than fifteen minutes or so can be treated the same way. Anyways. The list of forums is information that doesn’t change, so if we can give it to Jinja as a global variable for the template, Jinja can do quite a lot of compile time optimizations. In the best case the whole list of forums is one giant static block not processed at render time at all.
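
A toy model of that globals/locals split, in plain Python, might look like the following. It is only a sketch under assumptions: a "template" here is just a list of (needed names, render function) pairs, and compile_template and render_template are made-up names that have nothing to do with the real Jinja compiler.

```python
def compile_template(parts, global_vars):
    """parts: list of (needed_names, render_function) pairs."""
    compiled = []
    for names, render in parts:
        if set(names) <= set(global_vars):
            # Everything this part needs is known now: evaluate it once.
            piece = render(global_vars)
            if compiled and isinstance(compiled[-1], str):
                compiled[-1] += piece   # merge adjacent static text
            else:
                compiled.append(piece)
        else:
            compiled.append(render)     # needs locals, stays dynamic
    return compiled

def render_template(compiled, global_vars, local_vars):
    context = dict(global_vars, **local_vars)
    return "".join(p if isinstance(p, str) else p(context) for p in compiled)

parts = [
    ((), lambda ctx: "<h1>Forums</h1>"),
    (("forums",), lambda ctx: "".join("<li>%s</li>" % f for f in ctx["forums"])),
    (("user",), lambda ctx: "<p>Hello %s</p>" % ctx["user"]),
]
global_vars = {"forums": ["Install", "Desktop"]}
compiled = compile_template(parts, global_vars)
print(compiled[0])
print(render_template(compiled, global_vars, {"user": "alice"}))
```

Here the heading and the forum list collapse into one static string at compile time, and only the greeting is left to run per request.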

That keeps the templates designer friendly as the person who designs the template doesn’t have to know anything about that but gives the application developer a simple way to optimize performance. This approach however has some rough edges we have to polish.

And the last important change is that the sandbox is optional now and even disabled by default. Most users were not using it anyways and there is no need to put that into the system by default. It is however built-in functionality and will get some improvements as well. The new sandbox will give better control over what’s secure and what’s not.

How fast is it currently? I really don’t want to throw pointless numbers around but for the test table we’re using currently the speed without sandbox is more than comparable with mako and a lot, lot faster than django’s templates. But please take those numbers with a shovel of salt as we’re talking about an unreleased project here and a more than biased benchmark for one particular use case. Jinja tries to make templating as simple as possible and not as fast as possible.

Making a template engine that’s fast is incredibly simple. But making a template engine that doesn’t suck and performs well is a lot harder.
