Pages tagged as ‘django’

Jinja2 — Making Things Awesome

April 13th, 2008

Some of you might have noticed that pocoo.org redirects to the development center now. There is no direct reference to the old pocoo project any more. The reason for this is that part of the pocoo team and the ubuntuusers webteam are working on new software that combines wiki, forum, planet, pastebin and blog into one big portal. It’s not yet open sourced but it will become a pocoo project, most likely licensed under the GPL around June, with a semi-stable release at the end of 2008. I’ll post some details next week, I think.

But why am I writing about this? Mainly because the work on inyoka (the name of the pocoo successor) showed some weaknesses in Jinja and we did some brainstorming to come up with really cool improvements for Jinja that resulted in a complete rewrite. Because we love backwards compatibility these changes will go into a package called “jinja2” and both Jinja 1.x and Jinja 2.x will get updates in the future. Also Jinja 2 is not yet released and you shouldn’t expect a release anytime soon. If you want to play with it you will need the current hg version.

But what are the changes to Jinja 1 and why is 2 better? What we noticed when working on inyoka is that quite a lot of display logic code goes into the templates (not application logic though). This isn’t a bad thing actually but it showed that especially for large templates with many dynamic parts such as loops or even variable tags Jinja doesn’t have the performance we wished it would have. Especially if templates become more complex quite a lot of CPU time is spent there and there was room for improvement in Jinja. Unfortunately a design mistake made it impossible to do any optimization there and I was forced to change the scoping rules to something saner. I would say for 98% of the templates the changed scoping won’t make any difference as many have avoided the side effects of the old scoping anyways. But still, it’s a change that will break things, so Jinja 2.

But right after I decided to break backwards compatibility there was more room for improvement which resulted in some real kickass features. Finally filters are simple python functions again and not factories any longer. They now also support keyword arguments, a feature that has been on the wishlist for nearly as long as Jinja exists. Dynamic inheritance is now something that works without pain too and the template inclusions were simplified. And that also means that templates can be included into namespaces now. So {% include macros = "helpers/macros.html" %} gives you an object which you can print or on which you can access all the macros, defined variables etc.
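To make the difference between the two filter styles concrete, here is a plain-Python sketch. It is only an illustration: make_truncate, truncate and apply_filter are hypothetical names, not Jinja API, and the real engines do more work around the call.

```python
# Hypothetical illustration; none of these names are part of Jinja.

def make_truncate(length=10):
    """Factory style: configure first, get the actual filter back."""
    def wrapped(value):
        return value[:length]
    return wrapped


def truncate(value, length=10, end=""):
    """Plain-function style: the value comes first, options follow."""
    if len(value) <= length:
        return value
    return value[:length] + end


def apply_filter(func, value, *args, **kwargs):
    """Roughly what an engine has to do for {{ value|func(...) }}."""
    return func(value, *args, **kwargs)


old_style = make_truncate(5)("Hello World")
new_style = apply_filter(truncate, "Hello World", length=5, end="...")
```

With plain functions, keyword arguments come for free because the engine can simply forward them to the function.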

What really makes me happy is that on the surface few things have changed. Most of the implementation details are hidden in the compiler and are not visible in the templates. For example Jinja 2 knows five different ways to iterate over sequences, and none of that is visible in templates at all. Even with the changed scoping rules all of the user friendliness is still in place.

Other nice changes are that you can now filter for loops while iterating over them like list comprehensions in python and that the lexer knows line statements now. Line statements are lines prefixed with some optional whitespace and one or more marker characters. Everything after that until the end of the line is a single statement. That was shamelessly stolen from Mako and Cheetah and is something you have to enable explicitly, but I kinda like it for some template scenarios:

# for item in seq
  * {{ item }}
# endfor
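How a lexer could recognize such lines can be sketched in a few lines of Python. This is a simplified illustration, not the Jinja 2 lexer: it assumes “#” as the marker character and simply rewrites each line statement into the equivalent block tag; expand_line_statements is a made-up name.

```python
import re

# Assumption: "#" is the marker; optional whitespace may precede it.
LINE_STATEMENT = re.compile(r"^[ \t]*#+[ \t]*(.*?)[ \t]*$")


def expand_line_statements(source):
    """Rewrite line statements into the equivalent {% ... %} block tags."""
    out = []
    for line in source.splitlines():
        match = LINE_STATEMENT.match(line)
        if match:
            # Everything after the marker is a single statement.
            out.append("{% " + match.group(1) + " %}")
        else:
            out.append(line)
    return "\n".join(out)


template = "# for item in seq\n  * {{ item }}\n# endfor"
expanded = expand_line_statements(template)
```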

Another very neat idea we implemented in Jinja 2 (wasn’t my idea unfortunately ^^) is that we broke the template scoping into globals and locals. Globals are known at compile time and render time, locals are known at render time only. That way Jinja can automatically evaluate parts of the template at compile time. The idea is that in many templates you have data that doesn’t change that often. A good example is forum software. The list of forums changes seldom, at ubuntuusers about twice a year, but any information that lives longer than fifteen minutes or so can be treated the same way. Anyways. The list of forums is information that doesn’t change. If we can give it to Jinja as a global variable for the template, Jinja can do quite a lot of compile time optimizations. In the best case the whole list of forums is one giant static block not processed at render time at all.

That keeps the templates designer friendly as the person who designs the template doesn’t have to know anything about that but gives the application developer a simple way to optimize performance. This approach however has some rough edges we have to polish.
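The globals/locals split can be sketched with a toy compiler. This is a hypothetical model, not how Jinja 2 is implemented: a template is assumed to be a list of ("text", string) and ("var", name) nodes, and compile_template folds everything that only depends on globals into static text.

```python
# Toy model; node layout and function names are assumptions for illustration.

def compile_template(nodes, globals_):
    """Fold every node that only depends on globals into static text."""
    compiled = []
    for kind, value in nodes:
        if kind == "var" and value in globals_:
            # Known at compile time: evaluate now, never again.
            kind, value = "text", str(globals_[value])
        if kind == "text" and compiled and compiled[-1][0] == "text":
            # Merge adjacent static chunks into one block.
            compiled[-1] = ("text", compiled[-1][1] + value)
        else:
            compiled.append((kind, value))
    return compiled


def render(compiled, locals_):
    """Only the remaining variable nodes are looked up at render time."""
    return "".join(
        value if kind == "text" else str(locals_[value])
        for kind, value in compiled
    )


nodes = [
    ("text", "Forums: "),
    ("var", "forum_list"),
    ("text", " | Hello "),
    ("var", "username"),
]
compiled = compile_template(nodes, {"forum_list": "News, Support"})
page = render(compiled, {"username": "alice"})
```

After compilation the forum list is part of one static text block, and only username is still resolved per request.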

And the last important change is that the sandbox is optional now and even disabled by default. Most users were not using it anyways and there is no need to put that into the system by default. It is still built-in functionality however and will get some improvements as well. The new sandbox will give better control over what’s secure and what’s not.

How fast is it currently? I really don’t want to throw pointless numbers around but for the test table we’re using currently the speed without sandbox is more than comparable with mako and a lot, lot faster than django’s templates. But please take those numbers with a shovel of salt as we’re talking about an unreleased project here and a more than biased benchmark for one particular use case. Jinja tries to make templating as simple as possible and not as fast as possible.

Making a template engine that’s fast is incredibly simple. But making a template engine that doesn’t suck and performs well is a lot harder.

Genshi Slot @ GSoC 2008

March 26th, 2008

The TurboGears project has been accepted as a mentoring organization for the 2008 Google Summer of Code. Even if you’re not interested in TurboGears because your framework of choice is something else you might still be interested in that one as it includes two Genshi project ideas: Performance and Jython compatibility.

If you have a solid knowledge of XML/HTML, Python and you’re looking for a GSoC project that’s interesting, read Christopher’s blog post about it and go for it :-)

Sphinx Python Documentation Tool Released

March 21st, 2008

As mentioned in an earlier blog post I’m not a fan of fully automatically generated documentation as I think it’s clearly the wrong way to solve documentation problems. Whenever I encounter epydoc generated documentation and I find out that that’s the only kind of documentation I run away :)

In the past there have been two kinds of documentation in the python world: handwritten documentation (Django for example) and fully automated API documentation (Paste’s). Both of them have advantages and disadvantages for user and developer but neither of them was perfect. At least for me. Django’s documentation is one of the best I ever encountered but writing such documentation yourself is painful. I tried to do something similar with Werkzeug and Jinja and it sucks, mainly because you usually start improving stuff and add the documentation later on, often forgetting about it. It’s not unlikely that the documentation is slightly different from the actual implementation because someone (usually me) forgot to “sync” them.

With Werkzeug 0.2 I was experimenting with combining those two things. I added some directives to docutils that pulled docstrings automatically from the objects and added them to the rst file. That way the documentation is perfectly in sync with the sourcecode and because the members are specified explicitly I can hide implementation details and private functions. However the Werkzeug documentation builder was and is a hack and was never meant to be used by anyone except the Werkzeug project, and even that one just for one release. The reason for that is that Georg Brandl spent the last couple of weeks rewriting and improving parts of the python documentation tool which powers the new Python 2.6 and 3.0 docs to support non cpython projects too.

The resulting library Sphinx is in my opinion the best general purpose documentation tool since sliced bread! It’s intended to be a tool for handwritten documentation that builds the documentation into standalone HTML files, CHM HTML files, LaTeX or pickles which can be used to display the documentation in a WSGI application.

It uses reStructuredText as markup language for all the documentation, supports syntax highlighting of code blocks via pygments, embedded doctests so that you can extend your testsuite with doctests from your documentation, has support for automatic object documentation by including the docstrings from objects listed (semiautomated documentation as I did with Werkzeug), automatic cross linking, index generation, changelog generators, many custom roles and directives for rst and much more.
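The semiautomated approach (list the members explicitly, pull their docstrings in) can be sketched with the standard library alone. This is not the code behind Sphinx or the old Werkzeug builder; document_members and Toolbox are made-up names for illustration.

```python
import inspect


def document_members(obj, members):
    """Emit reStructuredText for the explicitly listed members of obj."""
    lines = []
    for name in members:
        member = getattr(obj, name)
        signature = str(inspect.signature(member))
        doc = inspect.getdoc(member) or "(undocumented)"
        lines.append(".. function:: %s%s" % (name, signature))
        lines.append("")
        # Indent the docstring so it becomes the directive's body.
        lines.extend("   " + line if line else "" for line in doc.splitlines())
        lines.append("")
    return "\n".join(lines)


class Toolbox:
    @staticmethod
    def slugify(text, sep="-"):
        """Turn text into a URL-friendly slug."""
        return sep.join(text.lower().split())

    @staticmethod
    def _internal():
        """Implementation detail that should stay out of the docs."""


# Only what is listed ends up in the output; _internal stays hidden.
rst = document_members(Toolbox, ["slugify"])
```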

While it’s the first release it’s already a very well documented and well tested library and is used for the Python documentation. While I hate the word “framework” I think Sphinx could be that for documentation tools. The extension API can be used to add missing features and may also be used for more automated documentation generation in the future.

Unfortunately I wasn’t that active in the implementation of Sphinx so far, so I’m clearly the wrong person for further questions about the development direction of Sphinx but I’m sure Georg will answer your questions. You can contact him in #pocoo on irc.freenode.net or via E-Mail at georg guesswhichcharcomeshere python dot org.

Werkzeug 0.2 Released!

February 13th, 2008

Wohoo. Werkzeug 0.2 is out now. Werkzeug started as a simple collection of various utilities for WSGI applications and has become one of the most advanced WSGI utility modules. It includes a powerful debugger, full featured request and response objects, HTTP utilities to handle entity tags, cache control headers, HTTP dates, cookie handling, file uploads, a powerful URL routing system and a bunch of community contributed addon modules.

So, what’s new in 0.2? Countless things and too many for this small list, but here are the most important ones:

  • The path converter limitation is gone. rejoice!
  • In the contrib package there is now a secure cookie (basically a hashed client side session storage)
  • Exceptions can now return response objects so that you can add headers etc.
  • You can now convert a response object to a different type of response objects on the fly (for example if you have your own response object subclass with special features but the response object returned by a function is a simple BaseResponse)
  • There are a bunch of extra features for response and request objects now (available as mixin classes) for HTTP header parsing and dumping
  • All the routing exceptions are now HTTPExceptions which simplifies dispatching a lot
  • werkzeug.script has a much simpler way of specifying boolean parameters
  • lazy_property is now called cached_property, update your code!
  • many cool small helper functions that deal with python modules and packages. There is find_modules which can return a generator for all the modules below a package and import_string which allows you to simply import objects from a string. No more __import__ hackery needed.
  • the usage of the map adapter is much easier now too and a lot more rest compliant. See the new documentation
  • dozens of small fixes and additions!
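To give an idea of what a helper like import_string saves you from, here is a minimal sketch built on importlib. It is an approximation written for this example and not Werkzeug’s implementation, which may accept other notations and handle errors differently.

```python
import importlib


def import_string(import_name):
    """Import a module or object from a dotted string such as 'os.path.join'."""
    try:
        # The whole string may already name a module.
        return importlib.import_module(import_name)
    except ImportError:
        if "." not in import_name:
            raise
    # Otherwise treat the last part as an attribute of the module before it.
    module_name, attr = import_name.rsplit(".", 1)
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr)
    except AttributeError:
        raise ImportError("cannot import %r from %r" % (attr, module_name))


join = import_string("os.path.join")
json_module = import_string("json")
```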

There is also a new website and documentation and the tutorial was translated to German. For 0.3 we will hopefully have some more translations of the tutorial and better documentation for the contrib modules which are currently just documented in docstrings.

Grab it from the cheeseshop while it’s hot.

New Werkzeug Website / Docs

February 6th, 2008

Disclaimer: Yes I know the colors are too bright. Everybody who saw that webpage told me that, so I will change that soon. Anyways. Because there is an upcoming 0.2 release I deployed the documentation for the 0.2 release already because most of the users are probably already working on the hg tip (at least that’s what the situation looks like in #pocoo).

The new website and documentation.

Additionally if everything works well there will be a Werkzeug presentation at the Grazer Linuxtage on April 19th. The Werkzeug 0.2 release will hopefully be on Feb 14th, until then I have to fight with Jabber, don’t ask :-)

Mercurial for Subversion Users

January 28th, 2008

More and more projects are switching over to mercurial or a similar DVCS. Great as mercurial is, it’s hard to get started if you are used to subversion because the concepts behind Subversion (svn) and mercurial (hg) are fundamentally different. This article should help you understand how mercurial and similar systems work and how you can use them to contribute patches to the pocoo projects.

If you compare Subversion to mercurial you won’t find that many similarities besides the command arguments. Subversion works like FTP whereas mercurial is bittorrent. In Subversion the server is special: it keeps the whole revision log and all the operations require a connection to this server. In mercurial I can take down the central repository, if there is one, and all developers will still be able to exchange changes. All the revision information is available to anyone and there is absolutely no difference between server and clients.

This fundamental design decision means that there are dozens of separate branches of the code. hg makes it easy to merge and branch and it’s developed exactly for that. In Subversion branching and merging is painful and often people just don’t branch and don’t commit their changes until the testsuite etc. passes again which of course results in huge changesets. But let’s step right into it!

The first thing you do in Subversion is either create a repository on the server or check it out on the client. In hg there is no difference between server and client so the process of creating a repository is available to everybody. Creating a repository is just as simple as typing “hg init name_of_the_repository”. If that folder does not exist yet it will create an empty folder and initialize it as root of the repository, otherwise it will create the repository in the existing folder of that name.

The process of checking out is a bit different from Subversion because it’s effectively the same as creating a branch. Say you want to check out the current Pygments version to do some changes. The first thing you will do is look for a way to access this repository. There are three very common ways to access it: filesystem, HTTP or SSH. Pygments is available via SSH and HTTP, but for non core developers only HTTP is available. Interestingly quite a few people have problems locating the checkout URL which is not very surprising because hgweb handles that. hgweb is the standard mercurial web interface which doesn’t only provide a way to look at the changesets and tree but also handles patch exchange. In the case of Pygments this command should give you a fresh checkout in a few seconds into the new folder “pygments”:

hg clone http://dev.pocoo.org/hg/pygments-main pygments

One thing you will notice is that it’s incredibly fast and even though the repository contains the whole history the checkout is pretty small. At the time I’m writing this blog post the pygments sourcecode including the unittests and example sourcecode without the revision history is 2.5MB. A complete mercurial checkout is only 5MB even though it includes 486 changesets.

After you got your very own repository by cloning the pygments one you will notice that all the subversion-like commands (“hg ci”, “hg add”, “hg up”, …) work locally only. You check into your local version of the repository and hg up won’t incorporate remote changes. One of the things that happen on hg clone is that mercurial will set the path to the repository you cloned from into the hgrc of the newly created repository. This file (“.hg/hgrc”) is used to store per-repository configuration like the path of remote repositories, the name used for checkins, plugins that are only enabled for this repository and more. Executing “hg pull” will automatically pull changes from this remote repository and put them into the current repository as second branch. To see what “hg pull” will pull from that remote repository you can execute “hg incoming” and it will print a list of changesets that are in the remote repository but not yet in the local one. After you have pulled you have to update the repository with “hg up” so that you can actually see the changes. If there were remote changes that require merging you have to “hg merge” them and “hg ci” the merge.

Because this process is very common there are ways to simplify it. “hg pull && hg update” can be written as “hg pull -u”. All the commands (pull, update, merge and checkin if required) can be handled in one go using “hg fe”. This command however is part of a plugin which is disabled by default. If you want to use it you have to add the following lines into the repository hgrc or your personal one:

[extensions]
hgext.fetch=
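Since the hgrc is an INI-style file, you can peek into it with Python’s configparser. The snippet below is only an illustration: the sample contents are assumed (the [paths] entry is what a clone of the Pygments repository would record) and mercurial’s own parser may accept syntax that configparser does not.

```python
from configparser import ConfigParser

# Assumed sample of a repository's .hg/hgrc after cloning and enabling fetch.
SAMPLE_HGRC = """\
[paths]
default = http://dev.pocoo.org/hg/pygments-main

[extensions]
hgext.fetch=
"""

config = ConfigParser()
config.read_string(SAMPLE_HGRC)

# The repository "hg pull" talks to when no other path is given.
default_path = config.get("paths", "default")
# Extensions are enabled just by being listed; the value may stay empty.
enabled_extensions = list(config["extensions"])
```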

The other important difference to subversion is how you push your changes back to the server. In open source projects usually only a small number of developers have access to the main repository and contributors create patches using “diff” or “svn diff” and mail them to one of the persons with commit rights or attach them to a ticket in the project’s tracker. If you are a person with push privileges you can do “hg push” and it will push the changesets which are not yet on the server (you can look at them using “hg outgoing”). If you don’t have push access you can create a bundle of changes and attach that to a ticket rather than a patch. A bundle stores multiple changesets in one file and it also preserves the correct author information and timestamps. Another way is mailing the changes to a different developer using the patchbomb extension (I won’t cover that here, just google it up). Or you can let other people pull from your repository. For that you either have to configure your apache to serve a hgweb instance or you just call “hg serve” and it will spawn a server on localhost:8000 everybody can pull from.

Once the developer has decided to put your changes into the central repository and pushed them, your changes will appear there unaltered and with the same revision hashes. What will be different is the local number the changeset is given. If the revision was called 42:deadbeef locally it could be called 52:deadbeef on the server because different changesets were applied first.
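Why the hash stays the same while the local number changes can be shown with a toy model. This is a simplification and not mercurial’s storage format: changeset_id and Repository are invented for this example, and the id is assumed to depend only on the parent id and the content.

```python
import hashlib


def changeset_id(parent_id, content):
    """Simplified stand-in for a changeset hash (parent plus content)."""
    data = (parent_id + "\n" + content).encode("utf-8")
    return hashlib.sha1(data).hexdigest()[:12]


class Repository:
    def __init__(self):
        self.changesets = []  # ids in the order this repository received them

    def add(self, node):
        if node not in self.changesets:
            self.changesets.append(node)

    def local_number(self, node):
        """The local number is just the position in arrival order."""
        return self.changesets.index(node)


root = changeset_id("", "initial import")
yours = changeset_id(root, "fix typo in docs")
theirs = changeset_id(root, "add new lexer")

laptop, server = Repository(), Repository()
for node in (root, yours):
    laptop.add(node)
for node in (root, theirs, yours):  # the server saw another change first
    server.add(node)
```

The id of your changeset is identical in both repositories, but it sits at position 1 on the laptop and at position 2 on the server.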

All the commands that interact with remote repositories (“hg pull”, “hg push”, “hg fe”, …) also take a different path than the default path from the hgrc as argument. This allows you to pull changes from repositories shared over the web.

A cool example of what mercurial allows you to do is our last ubuntuusers webteam meetup. There we used my notebook to store the central repository and everybody pushed their changes to it every once in a while. Additionally some people exchanged patches for not yet working features among each other so that the code on the central repo was seldom broken. When I left everybody had all the changes locally because they had pulled, so I could remove my notebook and everybody continued working on their way home. When we met again on IRC I copied my repo to the server and everybody pushed their local changes to it.

Secure Client Side Sessions

For a few changesets now Werkzeug has provided a module for client side sessions. While it’s of course not yet perfect I think it’s an interesting approach and we’re currently replacing django’s sessions with Werkzeug’s secure cookie.

How does it work? Basically the data is stored as pickled data inside the cookie. The pickled data is hashed and added to the cookie too. Whenever the data is loaded from the cookie a new hash is created and the two hashes are compared. If they match the data is unserialized, otherwise a new SecureCookie object is created with empty data. As a matter of fact there is no “session key” so if you want to use it with django you have to fake the “session key” by inserting it into the data. We use the following session middleware as replacement for the django session middleware:

try:
    from hashlib import md5
except ImportError:
    from md5 import md5
from time import time
from random import random
from django.conf import settings
from django.utils.http import cookie_date
from werkzeug.contrib.securecookie import SecureCookie

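class Session(SecureCookie):
    """Secure cookie that fakes the "session key" django expects."""

    @property
    def session_key(self):
        # Create the key lazily and keep it in the cookie data itself.
        if 'session_key' not in self:
            self['session_key'] = md5('%s%s%s' % (random(), time(),
                                      settings.SECRET_KEY)).hexdigest()
        return self['session_key']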

class ClientSideSessionMiddleware(object):

    def process_request(self, request):
        data = request.COOKIES.get(settings.SESSION_COOKIE_NAME)
        if data:
            session = Session.unserialize(data, settings.SECRET_KEY)
        else:
            session = Session(secret_key=settings.SECRET_KEY)
        request.session = session

    def process_response(self, request, response):
        try:
            modified = request.session.modified
        except AttributeError:
            return response

        if modified or settings.SESSION_SAVE_EVERY_REQUEST:
            if request.session.get('is_permanent_session'):
                max_age = settings.SESSION_COOKIE_AGE
                expires_time = time() + settings.SESSION_COOKIE_AGE
                expires = cookie_date(expires_time)
            else:
                max_age = expires = None
            response.set_cookie(settings.SESSION_COOKIE_NAME,
                                request.session.serialize(),
                                max_age=max_age, expires=expires,
                                domain=settings.SESSION_COOKIE_DOMAIN,
                                secure=settings.SESSION_COOKIE_SECURE or None)
        return response

In addition to what the normal django session middleware does, this middleware can handle both persistent sessions and sessions bound to the browser session. By default a session ends at the end of the browser session. If you want to make it persistent you have to set “is_permanent_session” in the session data to True.

Keep in mind that cookies have a size limit and that users will be able to look at that data (but not alter it). The example above requires the current werkzeug tip and not the 0.1 release version.
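The hash-and-compare idea can be sketched with the standard library. Note the swapped-in parts: this sketch uses JSON and HMAC-SHA256 instead of pickle, and the cookie layout (“signature?payload”) as well as the function names are assumptions made for this example, not Werkzeug’s format.

```python
import base64
import hashlib
import hmac
import json


def serialize(data, secret_key):
    """Return 'signature?payload' for a dict of JSON-serializable values."""
    raw = json.dumps(data, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return signature + "?" + payload.decode("ascii")


def unserialize(cookie, secret_key):
    """Return the stored dict, or an empty dict if the signature is wrong."""
    try:
        signature, payload = cookie.split("?", 1)
        signature_bytes = signature.encode("ascii")
        payload_bytes = payload.encode("ascii")
    except ValueError:
        return {}
    expected = hmac.new(secret_key, payload_bytes, hashlib.sha256).hexdigest()
    # Recompute the signature and compare it in constant time.
    if not hmac.compare_digest(signature_bytes, expected.encode("ascii")):
        return {}
    return json.loads(base64.urlsafe_b64decode(payload_bytes))


SECRET = b"not-a-real-secret"
cookie = serialize({"user_id": 42}, SECRET)
# A user can read the payload, but any change invalidates the signature.
tampered = cookie[:-2] + ("AA" if not cookie.endswith("AA") else "BB")
```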

Django’s Problems and Why 2.0 is a Bad Idea

December 12th, 2007

I stumbled upon this thread on django-developers which proposes calling the Django 1.0 release Django 2.0. On the one hand version numbers say nothing. Just take Wine or Trac as two examples that are already very stable but still below the magical 1.0 release. Open Source software often takes some time until a 1.0 is released and that’s perfectly fine. However skipping a version number is purely a marketing trick IMO. Just think of Java which currently names 1.6 Java 6 whereas 1.4 still was Java 2.

With django it looks like the plan is to keep up with Rails which went to 2.0 a few days ago. While I love to see that django kicks ass and is moving toward a stable release I have a bad feeling about naming it 2.0, because unfortunately there is currently a huge gap between rails and django. Rails has gained really good integrated migrations, REST webservices, a debugger and many other things django is still lacking.

Django makes an incredibly good framework if your problem fits into the use case of django. But as soon as you break out of it and need something that goes beyond what’s possible in django you wish you had chosen something else. The django ORM is far from optimal, the admin rocks but as soon as the number of users exceeds 10,000 it’s impossible to use (choose yourself in a dropdown of 50,000 users …) or becomes utterly complex. Complex data models also look awkward in the admin or become too complicated to manage. And if you want to stick with the admin you cannot replace the user model. Now what do you do if you have a forum and want to count the posts? Use a UserProfile module? And how do you want to display a list of users sorted by their number of posts?

Yes there are ways to hack around it but the more complex the application becomes the more of django’s strengths become obsolete. The application I’m working on right now is only using two contrib modules any more, the auth and the admin, and it looks like we have to drop them too, due to the limitations. All applications in that project hack around ORM limitations, we have an incredible number of recreated base middlewares, and we have to monkey patch the request object to hack in subdomain support.

I was talking with David Cramer from curse gaming about some of the issues and he told me that they have forked django at a given point and patched the ORM. The django template engine was replaced by Jinja (our application does the same) and they are caching the hell out of the application to scale it. Bryan McLemore from the curse team told me some time ago that some pages have up to 30 queries on a page.

I don’t want to say that django fails in what it’s doing. But it’s far from 2.0.

Werkzeug Debugger in Django

(Thanks monkey patching). Ugly but works: using the werkzeug debugger with django.

Now digg/reddit it ;-)

Werkzeug 0.1 Released

December 9th, 2007

Finally (sorry for the long delay) Werkzeug 0.1 is out. Here are parts of the current feature list:

  • Provides Request and Response objects for WSGI
  • Handles file uploads by using temporary files for incoming data.
  • Provides a middleware for static data for development purposes
  • Tiny wrapper around wsgiref for easier development (autoreload, optional
    multithreaded environment)
  • Unicode aware data processing. Just use unicode everywhere, werkzeug
    handles that for you.
  • Mini template engine. Sometimes string formattings just are not enough
    and real template engines are too big for that tiny task.
  • Context locals. Don’t pass request/user/database connections and
    other objects around. Put them on a global context local object and
    werkzeug makes sure that everything is cleaned up and delivered well.
  • Test utilities. Create fake WSGI environments and requests to test
    your application.
  • Interactive debugger. Application dies with an error? Hook the debugger
    in and inspect every frame.
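The context local idea from the list above can be approximated with threading.local from the standard library. This is only a sketch of the concept, not Werkzeug’s implementation; handle_request and current_user are made-up names.

```python
import threading

# One global object; each thread only sees the attributes it set itself.
local = threading.local()


def current_user():
    """Code deep in the call stack reads the state without extra arguments."""
    return local.user


def handle_request(user, results):
    local.user = user  # bind the state for this thread only
    try:
        results[user] = "Hello %s" % current_user()
    finally:
        del local.user  # clean up when the request is done


results = {}
threads = [threading.Thread(target=handle_request, args=(name, results))
           for name in ("alice", "bob")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```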

Here is a screenshot of the debugger in action:
[Image: screenshot of the debugger]

Small werkzeug example applications can be found in the trac. In the werkzeug.contrib package are also some pieces of code that can be useful for django developers. For example there is a stream wrapper that limits incoming form data to a given number of bytes. This is useful for django because django streams into the memory and not to the file system.
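What such a limiting stream wrapper does can be sketched in a few lines. This is a simplified stand-in written for this example; the class in werkzeug.contrib may have a different name, signature and behavior when the limit is reached.

```python
import io


class LimitedStream:
    """Wrap a stream and never read more than `limit` bytes from it."""

    def __init__(self, stream, limit):
        self._stream = stream
        self._remaining = limit

    def read(self, size=-1):
        # Clamp every read to what is still allowed.
        if size < 0 or size > self._remaining:
            size = self._remaining
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


body = io.BytesIO(b"name=value&payload=" + b"x" * 1000)
limited = LimitedStream(body, limit=16)
first = limited.read(10)
rest = limited.read()  # asks for everything, gets only what is left
```

A form parser that reads from the wrapper instead of the raw input can never pull more than the configured number of bytes into memory.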

Have fun and report bugs / feature wishes in the trac :-) Get it while it’s hot from the Cheeseshop.
