Cogitations

Stupid Ruby Leak

def leak
  "a string".split(/\s+/)
end

leak while true

Workaround: assign the return value to a temporary variable and then return it.

How can I compile my Python Scripts?

I see that question about once a week, maybe a bit less often, on the German Python forum. Very often it turns out that py2exe is what the poster was looking for because he doesn’t want the user to have to install Python. I can understand that, because for the average Windows user it doesn’t really make sense why he should install that Python thing. But Python is a runtime environment for the application, like the JVM or the .NET framework, which Windows users do install. So if the Windows version of Python automatically upgraded itself somehow, that wouldn’t be a big problem.

However, the motivation is very often a different one: “I don’t want others to see my code” aka “I want to obfuscate my code”. Often with the addition that they are afraid others will steal their code. The first point is understandable; I just have to look at the code they post in their questions in that forum to see that they really have something to be ashamed of. However, that’s part of the learning process, and by sharing your applications with others, including the source, you can learn from that. The second argument is that people can steal your code. But surprise: they can do it even if they don’t have the source code. Python has very high-level bytecode and you can decompile it into very nice-looking code, even with the original variable names preserved. I would argue that it’s even easier for them to steal code from small closed source Python applications, because with a little bit of work the resulting source code looks so fundamentally different from the original that a direct comparison wouldn’t show many similarities.
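To make the decompilation point a bit more concrete, here is a minimal sketch. The function name secret_discount and its variables are made up for illustration. It compiles a piece of source, keeps only the code object, and then reads the names back out of it, which is the raw material a decompiler works from. It is not a decompiler itself, just a look at what survives compilation.

```python
import dis

# Hypothetical "secret" source; after compiling, only the code object is used.
source = '''
def secret_discount(price, customer_level):
    rebate = price * 0.1 * customer_level
    return price - rebate
'''

namespace = {}
exec(compile(source, "<hidden>", "exec"), namespace)
code = namespace["secret_discount"].__code__

# Argument and local variable names are stored verbatim in the code object.
recovered_names = code.co_varnames
print(recovered_names)

# The disassembly shows high-level operations on those same names.
dis.dis(code)
```

Shipping only the compiled file therefore hides very little: the names and the structure of the function are still there for anyone who looks.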

But really, the solution is a proper license, not trying to lock people out. Especially for small applications. Either pick one of the really good open source licenses, which range from a heavy copyleft such as the GPL to something like the do-what-the-fuck-you-want license, which allows everything, or pick a license that doesn’t allow any modifications and put your copyright notice into every file. If a bad person actually steals your (probably still crappy) code, you can then write a blog post and show that you were the first person who wrote that code. I doubt you want to sue the person who stole your code because that’s pretty expensive, but with the power of the blogosphere you can at least give that copycat a very bad reputation.

By having an open source codebase (even if it’s bad) you will attract other developers who will probably provide patches and help you improve the product. And if it’s good enough, you can make money with it. Just look at WordPress. Some parts of the code look like they were written by bozos not much better than the average person asking on the Python forum how to obfuscate Python code. And still, it is the number one blogging platform and I can only admire what they achieved.

Do you still want to obfuscate your code?

Mail Subjects

I found two mails in my Junk folder today that were ham. I spotted them by accident, as the subjects were very stupid. The first mail arrived with “Re:” as subject, the second with “Hi”. Please make my life easier and use subjects that don’t look like spam. Thanks in advance.

And no, I haven’t written a mail without subject in the first place.

Command of the Day

ssh-keygen -t dsa -b 1024 -f /etc/ssh/ssh_host_dsa_key -N '' &&
  ssh-keygen -t rsa -b 1024 -f /etc/ssh/ssh_host_rsa_key -N '' &&
  /etc/init.d/ssh restart

I really don’t think distributors should try to patch cryptographic stuff, especially not to silence debuggers.

Mail Problems

May 12th, 2008

Small notice for anyone who tried to mail me over the last ~three days: While I was off to Vienna I noticed that HE disabled the e-mail routing for the domains I moved to domainfactory. Mails sent to me between Friday and today are probably lost.

Jinja2 Documentation Online

May 7th, 2008

I have now uploaded the documentation for Jinja2 to the website for those of you who are eager and want to play with it :-) On jinja.pocoo.org you can now choose between Jinja1 and Jinja2.

The new docs are powered by Sphinx and Jinja2 with a custom templating bridge.

Read the documentation.

Simple batch function for Python

Often I have an iterable I want to group. For example, a list of integers of which I want to process two at once. Here is a pretty nice idiom I found in the documentation, translated to itertools:

from itertools import izip, repeat

def batch(iterable, n):
    return izip(*repeat(iter(iterable), n))

Use it like that:

>>> for key, value in batch([1, 2, 3, 4], 2):
...  print key, value
... 
1 2
3 4
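The snippet above is written for Python 2. As a hedged sketch, assuming Python 3 (where izip is gone and the built-in zip is already lazy), the same idiom could look like this. Note that a trailing group with fewer than n items is silently dropped, which is also how the izip version behaves.

```python
from itertools import repeat


def batch(iterable, n):
    """Yield tuples of n consecutive items; a trailing partial group is dropped."""
    # repeat() hands zip the *same* iterator n times, so every tuple
    # is filled by pulling n items in a row from that one iterator.
    return zip(*repeat(iter(iterable), n))


pairs = list(batch([1, 2, 3, 4, 5], 2))
for key, value in pairs:
    print(key, value)
```

With the five-element list used here, the 5 has no partner and does not show up in the output.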

rst2html + git == personal wiki

This Makefile:

RSTOPTS=--time --link-stylesheet --stylesheet=style.css

SOURCES=$(wildcard *.rst)
HTML=$(foreach file,$(SOURCES),_build/$(basename $(file)).html)

all: html

_build/%.html: %.rst
        rst2html.py $(RSTOPTS) $^ > $@

html: $(HTML)

clean:
        rm -f $(HTML)

plus make html in .git/hooks/post-{commit,update} + Python and docutils + a stylesheet in _build (all paths relative to your repository) is the perfect cross-platform wiki :-)

Notice: my blog kills the tabs, copy/paste from the pastebin

How super() in Python3 works and why it’s retarded

I’m deeply sorry for the title of that post, but I hope that gives the topic the awareness I think it should get. In the last few weeks something remarkable happened in the Python3 sources: self kinda became implicit. Not in function definitions, but in super calls. But not only self: also the class passed to super. That’s remarkable because it means that the language shifts in a completely different direction.

super was rarely used in the past, mainly because it was weird to use. In the most common use case the current class and the current instance were passed to it, and the super object it returned looked up the parent methods on the MRO for you. It was useful for multiple inheritance and mixin classes that don’t know their parent, but confusing for many.

The main problem with replacing super(Foo, self).bar() with something like super.bar() is that self is explicit and the class (in that case Foo) can’t be determined by the caller. Furthermore, Python’s philosophy has always been against functions doing stack introspection to find the caller. There are only a few examples in the stdlib or builtins that do some sort of caller introspection. Those are the special functions vars(), locals(), globals(), and __import__, and some functions in the inspect module. Four functions, and all of them do nothing more than getting the current frame and accessing the dict of locals or globals. What super in current Python 3 builds does goes way beyond that.
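For comparison, this is roughly the level of caller introspection that functions like locals() get away with. The sketch below is an illustration only: caller_locals and demo are hypothetical names, and sys._getframe is a CPython implementation detail rather than something every Python implementation has to provide.

```python
import sys


def caller_locals():
    """Return a copy of the local variables of whoever called this function."""
    frame = sys._getframe(1)      # one frame up: the caller
    return dict(frame.f_locals)   # nothing more than "get frame, read its dict"


def demo():
    answer = 42
    return caller_locals()


seen = demo()
print(seen)
```

That is all there is to it: grab a frame and read a dictionary. No code objects are searched and the compiler is not involved.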

Currently if super is called without arguments Python performs these steps:

  • It gets the current frame of the caller as well as the code object.
  • It looks at “co_argcount” to make sure there is a first argument; if there is one, it gets the object from the “f_localsplus” array on the frame object. This is, by the way, an attribute not accessible from Python code.
  • Then it checks the “co_freevars” of the code object and iterates over all of them to check if one of them is “__class__” (because accessing __class__ in Python 3 creates a special bytecode that returns the class the function was defined in).
  • If it can’t find __class__ in there, it dies. How does __class__ end up there? Apparently the compiler checks if “super” or “__class__” is accessed. That’s right: it breaks if you alias super to another name and try to call that name.
  • Once it has that information, it uses it as the first two arguments: the class and the reference to self.
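The aliasing problem from the steps above can be reproduced with a small script. This is a sketch against released CPython 3 versions, which may differ in detail from the development builds described here. Base, Child, Broken and the alias _super are names made up for the example.

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def greet(self):
        # The compiler sees the name "super" and adds a __class__ cell.
        return "child+" + super().greet()


_super = super  # hypothetical alias; the compiler does not recognise it


class Broken(Base):
    def greet(self):
        return "broken+" + _super().greet()


child_freevars = Child.greet.__code__.co_freevars
broken_freevars = Broken.greet.__code__.co_freevars

child_result = Child().greet()
try:
    Broken().greet()
    alias_error = None
except RuntimeError as exc:
    alias_error = str(exc)

print(child_freevars, broken_freevars, child_result, alias_error)
```

The only difference between the two methods is the spelling of the name being called, yet one of them gets a __class__ free variable and works while the other raises an error at call time.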

I’m sorry, but that’s a very, very bad idea. It’s way more magical than anything we’ve had in Python in the past and just doesn’t fit into the language. We do have an explicit self in methods and we do not really have methods. Our methods are functions; a descriptor just puts a method object around them to pass self as the first argument. That’s an incredibly cool thing and makes things very simple and non-magical. Breaking that principle by coming up with an automatic super harms the whole thing a lot. Defs in classes are not fundamentally different from defs in the global scope or within another def.
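The "methods are just functions plus a descriptor" point can be shown in a few lines. This is a minimal sketch with made-up names (Greeter, hello, shout); it only uses the descriptor protocol's __get__ as found in Python 3.

```python
class Greeter:
    def hello(self):
        return "hello from " + type(self).__name__


# Stored in the class is a plain function, nothing more.
plain = Greeter.__dict__["hello"]

g = Greeter()

# Attribute access on an instance calls the function's __get__,
# which wraps it in a bound method that passes g as self.
bound = plain.__get__(g, Greeter)
same_result = bound() == g.hello()


# A def from the global scope works as a method once attached to the class.
def shout(self):
    return self.hello().upper()


Greeter.shout = shout
shouted = g.shout()

print(type(plain).__name__, same_result, shouted)
```

Nothing about shout knows that it will end up in a class, which is exactly why a def that behaves differently depending on where it was compiled feels out of place.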

Another odd thing is that Python 3 starts keeping information on the C layer that we can’t access from within Python, which is a shame. Super is one example: it’s currently impossible to implement that from within Python. The other good example in Python 3 is methods. They don’t have a descriptor that wraps them if they are accessed via their classes. This as such is not a problem, as you can call them the same way (just that you can call them with completely different receivers now), but it becomes a problem if some of the functions are marked as staticmethods. Then they look completely the same when looking at them from a class’s perspective:

>>> class C:
...  normal = lambda x: None
...  static = staticmethod(lambda x: None)
... 
>>> type(C.normal) is type(C.static)
True
>>> C.normal
<function <lambda> at 0x4da150>

As far as I can see a documentation tool has no chance to keep them apart even though they are completely different on an instance:

>>> type(C().normal) is type(C().static)
False
>>> C().normal
<bound method C.<lambda> of <__main__.C object at 0x4dbcf0>>
>>> C().static
<function <lambda> at 0x4da198>
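The two interpreter sessions can be replayed as one script. As a hedged aside: in released Python 3 versions the raw class dictionary still holds the staticmethod wrapper, so a tool that reads C.__dict__ instead of using normal attribute access can see a difference. Whether that also held for the development builds discussed here is an assumption I have not checked.

```python
class C:
    normal = lambda x: None
    static = staticmethod(lambda x: None)


# Through normal attribute access on the class, both are plain functions.
same_on_class = type(C.normal) is type(C.static)

# On an instance, one is a bound method and the other still a function.
same_on_instance = type(C().normal) is type(C().static)

# The raw class dictionary bypasses descriptors and keeps the wrapper.
static_is_wrapped = isinstance(C.__dict__["static"], staticmethod)
normal_is_wrapped = isinstance(C.__dict__["normal"], staticmethod)

print(same_on_class, same_on_instance, static_is_wrapped, normal_is_wrapped)
```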

While I was quite happy with the Python 3 progress so far, these two things are a major, major step in the wrong direction. I really hope that will be rolled back. If there is a need for an automatic super, self has to go away and __class__ has to become a free variable all the time, or super has to become a keyword. Everything else is too magical.

Update: I posted the subject on the python-dev mailing list.

The Pythonistas are Wrong

There’s something that’s been bugging me for a long time that I need to get off my chest. Some of you may hate me for it, but perhaps there are others out there with the same complaint, silently in agony, wishing for death to take the pain away. It’s time to set the record straight, and prove once and for all that the Pythonistas are wrong.

Pythons almost NEVER look like this:
python logo

The frog shown here is what the Python Foundation refers to as a “snake” (though it looks more like a frog), more specifically a blue/yellow one. The name “Python”, however, refers to a group of six British Gentlemen* and something like 86.43% of people know that. The name was chosen because snakes just suck. Get it? It’s not a snake, they are British.

Pythons however are better represented by a 16-ton weight or a dead parrot. But they are NOT represented by snakes.

scipy logo
See that one in the scipy logo? That’s a public domain circle someone added a white snake to. A SNAKE. Look at the Wikipedia article and search for “snake”. Yeah, no match.

pycon08 logo
Even PyCon (where Guido van Rossum himself spoke) has made the mistake of choosing this stupid snake.

xml tag python
lxml is doing it wrong too.

And probably your favourite Python module too. So keep in mind: Pythons are not snakes! And I think that proves once and for all that there are tons of projects with the wrong logo out there.

Sorry headius for taking advantage of your blog post but I wanted to blog about that for quite some time anyways ;-)

Update: fixed my mistake about all Pythons being British. Thanks Joe Pantuso.
Update 2: apparently they are all British now. *Terry Gilliam renounced his American citizenship. Thanks meow
