Why Jinja is not Django and why Django should have a look at it

Today I wrote a little Django to Jinja2 template converter. While it can translate most of the builtin template tags into Jinja constructs it doesn’t fully automate the process because you have to extend it for your own custom tags and it doesn’t adapt your templates to the changed semantics.

And these differences in semantics (and the underlying architecture) are something I want to discuss a bit here. Whenever someone mentions Jinja in the Django IRC channel you can be pretty sure that someone else will write something like “… if you don’t have your logic under control” into the channel and position Jinja in the corner where failed concepts lurk around. Of course Jinja leaves more room for abuse than Django does… But this time this isn’t actually what I want to talk about here :)

First of all a small disclaimer: This article covers Jinja 2.0 and Django 1.0.

Lexing

If you compare Jinja and Django template system internals you have a lexer in both of them. The lexer basically breaks the template into small pieces for easier processing. But that’s where the similarities end because the lexers operate on very different levels. Take the following template as a simple example:

Hello {{ name|upper }}!

This is one of those templates that look and work exactly the same in Jinja and Django. First have a look at the tokens the Jinja2 tokenizer yields:

>>> from jinja2 import Environment
>>> for token in Environment().lex("Hello {{ name|upper }}!"):
...  print token
... 
(1, 'data', u'Hello ')
(1, 'variable_begin', u'{{')
(1, 'whitespace', u' ')
(1, 'name', u'name')
(1, 'operator', u'|')
(1, 'name', u'upper')
(1, 'whitespace', u' ')
(1, 'variable_end', u'}}')
(1, 'data', u'!')

And here is what Django outputs:

>>> from django.template import Lexer, StringOrigin
>>> origin = StringOrigin("Hello {{ name|upper }}!")
>>> for token in Lexer(origin.source, origin).tokenize():
...  print token
... 
<Text token: "Hello ...">
<Var token: "name|upper...">
<Text token: "!...">

So as you can see, whereas Jinja breaks the input string into very small pieces, Django only distinguishes between four different kinds of tokens: text, variables, blocks and line comments. While this is a lot easier to implement for the developer of the template engine, it doesn’t have any advantages over the concept Jinja has chosen. It actually has a lot of negative side effects. For example it’s impossible to write {{ '{% a block in a variable %}' }} in Django. (I know you can use templatetag openblock and templatetag closeblock, but beautiful is something else.) It also has the huge disadvantage that every tag has to split up its own contents, which often leads to different semantics and syntactic specialities between tags, and it means hugely more work for the developer of such a tag. The former is probably the worse part of it. For example the url tag in Django takes arguments separated by commas (that are not even allowed to be followed by whitespace) but cycle expects arguments to be separated by whitespace.

The root of the problem is definitely the weak lexer of the Django template engine and I really think that it should be replaced by something that yields proper tokens. That would simplify things for tag developers a lot and also lead to a more intuitive experience for template designers, who could expect the same basic syntax rules everywhere.
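To make the difference between the two lexing levels concrete, here is a toy sketch in modern Python 3 (unlike the Python 2 sessions above). It is neither Django’s nor Jinja’s actual lexer: the names coarse_lex, fine_lex and TOKEN_SPEC are made up for this illustration, the fine lexer drops whitespace tokens and ignores comments, and it only knows a handful of operators.

```python
import re

# Coarse, Django-style: split on the delimiters and keep each tag's
# contents as one opaque string that the tag has to pick apart itself.
COARSE_RE = re.compile(r"(\{\{.*?\}\}|\{%.*?%\}|\{#.*?#\})")
KINDS = {"{{": "var", "{%": "block", "{#": "comment"}


def coarse_lex(source):
    # re.split with one capturing group alternates text and matched tags.
    for index, bit in enumerate(COARSE_RE.split(source)):
        if index % 2 == 0:
            if bit:
                yield ("text", bit)
        else:
            yield (KINDS[bit[:2]], bit[2:-2].strip())


# Fine, Jinja-style: the contents of a tag are tokenized as well.
START_RE = re.compile(r"\{\{|\{%")
TOKEN_SPEC = [
    ("string", r"'[^']*'|\"[^\"]*\""),
    ("name", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("integer", r"\d+"),
    ("operator", r">=|<=|==|!=|[|.,()+\-*/<>]"),
    ("whitespace", r"\s+"),
]
INNER_RE = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def fine_lex(source):
    pos = 0
    while pos < len(source):
        start = START_RE.search(source, pos)
        if start is None:
            yield ("data", source[pos:])
            return
        if start.start() > pos:
            yield ("data", source[pos:start.start()])
        opener = start.group()
        closer = "}}" if opener == "{{" else "%}"
        kind = "variable" if opener == "{{" else "block"
        yield (kind + "_begin", opener)
        pos = start.end()
        # String literals are consumed as a whole, so delimiter characters
        # inside them never end the tag.
        while not source.startswith(closer, pos):
            match = INNER_RE.match(source, pos)
            if match is None:
                raise SyntaxError("unexpected input at offset %d" % pos)
            if match.lastgroup != "whitespace":
                yield (match.lastgroup, match.group())
            pos = match.end()
        yield (kind + "_end", closer)
        pos += len(closer)
```

Feeding both lexers a string literal that contains delimiter characters, such as {{ 'a }} b' }}, shows the effect: the coarse lexer cuts the tag off at the first }} it sees, while the fine lexer yields a single string token between variable_begin and variable_end.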

Parsing

The next step is converting those tokens into meaningful elements. That’s what people usually refer to as “parsing”. Jinja2 has very basic grammatical rules that can be parsed with a simple LL(1) parser (I think it’s LL(1), but don’t ask me, I’m not a compiler guy). The parser goes through the stream of incoming tokens from the lexer and converts those into logical nodes that belong together. For example if you have the template {{ 1 + 2 + 3 }} and the “cursor” of the parser is right before the first digit in the simple calculation, the parser parses this into Add(Add(Const(1), Const(2)), Const(3)). This is useful because the developer of a custom tag doesn’t have to deal with that; the parser already knows what an expression looks like. Now you could argue that calculations don’t belong in templates and that my point is not valid, but even in the Django template language you have expressions.
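How a parser with one token of lookahead arrives at Add(Add(Const(1), Const(2)), Const(3)) can be sketched in a few lines. This is a toy written for this post in Python 3, not Jinja’s parser: the node classes and the Parser class only share their names with the concepts above, and the grammar knows nothing but integers, names and “+”.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Name:
    name: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(\+))")


def tokenize(source):
    source = source.rstrip()
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError("unexpected input at offset %d" % pos)
        integer, name, operator = match.groups()
        if integer is not None:
            tokens.append(("integer", integer))
        elif name is not None:
            tokens.append(("name", name))
        else:
            tokens.append(("operator", operator))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # The single token of lookahead the grammar needs.
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", "")

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse_expression(self):
        # Fold operands to the left so that 1 + 2 + 3 becomes (1 + 2) + 3.
        node = self.parse_primary()
        while self.peek() == ("operator", "+"):
            self.advance()
            node = Add(node, self.parse_primary())
        return node

    def parse_primary(self):
        kind, value = self.advance()
        if kind == "integer":
            return Const(int(value))
        if kind == "name":
            return Name(value)
        raise SyntaxError("expected a number or a name, got %r" % (value,))
```

Calling Parser(tokenize("1 + 2 + 3")).parse_expression() gives back the nested Add node from the paragraph above. A custom tag that needs an expression would only have to call parse_expression() and never look at individual characters.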

The only expressions Django knows about are filter expressions. In Jinja2 the parser converts {{ var|escape|upper }} into a proper filter node for you. Django provides a TokenParser for that which can do something very similar. However that parser is not used in every tag and has its limitations too. Furthermore, that parser was introduced long after the initial implementation of the template language, which means that many core tags don’t use it. Where in Jinja it’s a matter of calling parser.parse_expression() to get an expression parsed, the same requires a lot more typing and checking in Django. A lot of the tags that lurk around in various pastebins or websites don’t even support filters but only variables in some places. Even worse, some people are evaluating the part between the block braces using eval() against the context object.

Again, this simple design of the parser helps nobody but the developers of the template engine. I’ve seen enough Django projects by now that have to write their own template tags because the core tags just don’t do what they need, and in any case the process of developing the tag was more painful than it had to be.

With a newly implemented lexer that yields all tokens of a block or variable one after another, a new parser could be implemented based on the design of the Jinja one. And by doing that one has the chance to specify some operators. Nobody is harmed if the templating language supports {% if user.karma >= 20 and user.karma < 40 %} and that hardly counts as logic in templates.

Compilation

This is the step that Django is missing. After the parser has assembled a tree of blocks, variables, text and everything else (called an abstract syntax tree), Jinja compiles the tree down into Python bytecode. It does that by first generating Python code and passing that to the builtin compile() function to produce bytecode from it. It does not generate bytecode directly, though it would be significantly easier, in order to better support more exotic platforms like App Engine and Jython.

The compilation of the syntax tree into bytecode is not that interesting in general. Jinja does it because it’s possible and provides optimizations that are otherwise not possible. More about that in the semantics section a bit later.
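The “generate Python code, then hand it to compile()” idea can be sketched without any template engine at all. The following Python 3 toy is not Jinja’s code generator. It assumes a flat list of ("data", text) and ("var", name) nodes, assumes that variable names are valid Python identifiers, and the function names generate_source and compile_template are invented for this example.

```python
def generate_source(nodes):
    """Turn a flat list of ("data", text) / ("var", name) nodes into source."""
    used = []
    for kind, value in nodes:
        if kind == "var" and value not in used:
            used.append(value)

    lines = ["def root(context):"]
    # Every variable is looked up exactly once and then lives in a local.
    for name in used:
        lines.append("    l_%s = context.get(%r, '')" % (name, name))
    for kind, value in nodes:
        if kind == "data":
            lines.append("    yield %r" % value)
        else:
            lines.append("    yield str(l_%s)" % value)
    # Keeps root() a generator even for an empty template.
    lines.append("    yield from ()")
    return "\n".join(lines) + "\n"


def compile_template(nodes):
    source = generate_source(nodes)
    code = compile(source, "<template>", "exec")  # Python source -> bytecode
    namespace = {}
    exec(code, namespace)
    return namespace["root"]


root = compile_template([("data", "Hello "), ("var", "name"), ("data", "!")])
print("".join(root({"name": "World"})))
```

After compile_template() has run once, rendering is just a call to an ordinary Python generator function; the node list is no longer needed.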

Evaluation

What’s more important is what Django does on template evaluation. Django basically renders the syntax tree on template evaluation. That’s pretty nice and an often used pattern for simple languages from what I’ve seen so far. The problem with Django however is that it’s incredibly slow and currently anything but thread safe. Many tags in the core system modify state variables on the (shared) nodes during rendering. You can easily see that for yourself by using {% cycle "odd" "even" %} inside a loop that iterates over 5 items. Start up your Django server, go to that page and hit refresh over and over again. You will notice that one time the output starts with “even”, one time with “odd”. The reason for that is that the node tree is shared. If you start up the application on a multithreaded server and hit it with tons of ab/siege requests you will even notice that you often get lists that look like “even even even odd even odd” or something similar. And that’s not only true for cycle, it also affects block tags. If you extend from a variable template, block.super will probably point to a totally different template when the server is under high load.

This is unacceptable behaviour and should be fixed. I’m currently writing up a patch for that, as the ticket was changed from “thread in-safety” to “reset cycle tag after iteration”, which shows that at least the editor of that ticket doesn’t get the problem and has been lurking around in the Django trac for too long.
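The cycle problem does not need threads or Django to reproduce. Here is a toy sketch in Python 3 with two made-up node classes: one keeps its position on the node itself (which every render shares), the other keeps it in state that belongs to a single render. Neither class is Django’s real CycleNode, and the "_render_state" key is an assumption of this example.

```python
class SharedStateCycleNode:
    """Keeps its position on the node, which all renders share."""

    def __init__(self, values):
        self.values = values
        self.counter = 0

    def render(self, context):
        value = self.values[self.counter % len(self.values)]
        self.counter += 1  # survives until the next request
        return value


class PerRenderCycleNode:
    """Keeps its position in per-render state, keyed by the node."""

    def __init__(self, values):
        self.values = values

    def render(self, context):
        state = context.setdefault("_render_state", {})
        counter = state.get(self, 0)
        state[self] = counter + 1
        return self.values[counter % len(self.values)]


def render_loop(node, items):
    context = {}  # a fresh context for every "request"
    return " ".join(node.render(context) for _ in items)


shared = SharedStateCycleNode(["odd", "even"])
print(render_loop(shared, range(5)))  # first request
print(render_loop(shared, range(5)))  # second request starts with "even"
```

With five items the shared node starts the second request with “even”, which is exactly the refresh effect described above. Add a second thread that renders at the same time and the two requests interleave their increments.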

The evaluation of a Jinja template doesn’t work over the AST but by executing the previously generated bytecode. And yes, it’s thread safe but that’s not the point.

About Performance

The Django template engine has multiple problems as said above, and one is certainly the performance. Many people argue that the Django template engine is fast enough. Actually, that could be. But think about this for a moment: For many CRUD applications you pull stuff from the database without any joins and iterate over the result set. Now guess where (at least in Django) most of the action takes place: In the template. Even the database queries are often sent by the template engine because the querysets are lazy and the initial query is sent in the template. What makes this problematic is that Django’s template engine is an AST evaluator. For every node you have in the template (and there are a lot!) you have one render method that is called. Now imagine you have extended two templates and are four blocks and two ifs deep inside your template. That is already about 10 calls deep. Now try to read a profiler output.

To show you that, I’ve uploaded two profiler outputs (one for Jinja and one for Django) rendering the very same template with the difference that the Jinja version of the template is using a macro and the Django version custom template tags:

Before you try to understand them, a few notes: test_jinja / test_django are the functions that invoke the test rendering process. The reason why the Jinja graph is not joined is that the invocation of the bytecode Jinja generates doesn’t count as a regular call and the profiler is unable to connect those. So you have to imagine the line between render and root yourself. In both cases the template engine rendered the templates a few hundred times before the profiler profiles one single call, so the templates are already parsed (and compiled in Jinja’s case). If you are wondering why the template parser seems to be active in the Django graph, I’m wondering that too. You can have a look at the benchmark to see how it works. If you think the template parser invocation in that profiler output comes from djangoext.py, you are wrong. That’s what I suspected too. It turns out that even if I don’t use the loader there but preload the template, it’s still happening. So I take that as normal behaviour caused by template inheritance or something like that.

That profiler output shows only the rendering of a pretty normal template situation. Now imagine you have a query somewhere in there because of Django’s lazy querysets. Now try to figure out what the heck is going on. I was running the profiler against the changeset rendering page in bitbucket and had a call tree so complex that it was impossible for me to figure out what was going on, because of the 400ms for that page, 300ms were spent in the template, simply because the template invoked Mercurial’s diffing system. That’s insane. That AST evaluator is seriously killing every possibility to get useful profiler information out of the system.

Generating Python-Code Doesn’t Make it Faster

Someone on #django asked why I don’t contribute “the thing that makes Jinja fast” to Django. That’s quite easy to answer: because it’s not that simple. Jinja sets some limitations on the engine to achieve a high performance. For example in Jinja the template context (the data structure you pass to the template) is a data source, not a data container. In Django if you have a custom template tag it is passed a context object you can modify and it will hold the variables of the template. In Jinja the template context object exists, but after the initial creation it is not modified by the engine any more. It’s only used to load yet unknown variables into the namespace Jinja is actually using for template evaluation. What this means is that it’s impossible for a tag to modify the context unless the custom tag knows at compile time the name of the variable it wants to assign to.

This knowledge gives Jinja a huge advantage over Django. Take this little template code:

<ul class="users">
{% for user in users %}
  <li>{{ user.username }}</li>
{% endfor %}
</ul>
<div class="notification">Hello {{ user.username }}</div>

This template code executes in both Jinja2 and Django. However the assumptions the template engine makes are vastly different. Jinja2 is able to translate the template to this Python code internally (without the comments obviously):

# these two variables (users and user) are used in the template
# without being initialized in the template.
l_users = context.resolve('users')
l_user = context.resolve('user')
yield u'<ul class="users">\n'
# because the loop overrides user we assign it to a temporary variable
t_1 = l_user
for l_user in l_users:
    yield u'\n  <li>%s</li>\n' % (
        environment.getattr(l_user, 'username'),
    )
# after the loop we restore the variable
l_user = t_1
yield u'\n</ul>\n<div class="notification">Hello %s</div>' % (
    environment.getattr(l_user, 'username'),
)

If we wanted to transform the Django AST into Python code without changing the behavior we would have to do something like this:

buffer.append(u'<ul class="users">\n')
context.push()
context['forloop'] = t1 = {'parentloop': context.resolve('forloop')}
t2 = context.resolve('users')
if not hasattr(t2, '__len__'):
    t2 = list(t2)
t3 = len(t2)
for t4, item in enumerate(t2):
    # Shortcuts for current loop iteration number.
    t1['counter0'] = t4
    t1['counter'] = t4+1
    # Reverse counter iteration numbers.
    t1['revcounter'] = t3 - t4
    t1['revcounter0'] = t3 - t4 - 1
    # Boolean values designating first and last times through loop.
    t1['first'] = (t4 == 0)
    t1['last'] = (t4 == t3 - 1)
    buffer.append(u'\n  <li>%s</li>\n' % (
        environment.getattr(context.resolve('user'), 'username'),
    ))
context.pop()
buffer.append(u'\n</ul>\n<div class="notification">Hello %s</div>' % (
    environment.getattr(context.resolve('user'), 'username'),
))

As you can see, a 1:1 conversion of what Django templates currently do into Python code produces a lot more code. Now I can hear you arguing that the Django example does more because it puts a forloop object into the context. However it has to do that. Because the variables in Jinja are not guaranteed to show up anywhere we have a lot of room for optimizations. If a loop doesn’t use the special loop variable, Jinja won’t create one. It’s that simple. If you don’t access loop variables that require knowledge about the length, Jinja won’t convert the object into a list. What’s a bit unfair is that the Django example has to use buffering. But because tags must have the chance to render the nodes that are stored inside them, buffering is necessary unless the custom tag system is changed too.
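The “only create the loop variable if the body uses it” decision is something a code generator can make because it sees every variable name at compile time. Here is a rough Python 3 sketch of that single decision. It is not Jinja’s compiler: generate_loop is an invented helper, the body is a flat list of ("data", text) and ("var", name) tuples, and the only special name this toy understands is "loop.index".

```python
def generate_loop(target, iterable, body):
    """Return source lines for a loop of `target` over `iterable`."""
    needs_loop = any(
        kind == "var" and name.startswith("loop.") for kind, name in body
    )
    if needs_loop:
        # Pay for the counter only when the template asks for it.
        header = "for l_loop_index, l_%s in enumerate(l_%s, 1):" % (
            target, iterable)
    else:
        header = "for l_%s in l_%s:" % (target, iterable)

    lines = [header]
    for kind, value in body:
        if kind == "data":
            lines.append("    yield %r" % value)
        elif value == "loop.index":
            lines.append("    yield str(l_loop_index)")
        elif value.startswith("loop."):
            raise NotImplementedError(value)  # outside this toy's scope
        else:
            lines.append("    yield str(l_%s)" % value)
    return lines


print("\n".join(generate_loop("user", "users", [("var", "user")])))
print("\n".join(generate_loop("user", "users", [("var", "loop.index")])))
```

The first call emits a plain for loop, the second one wraps the iterable in enumerate(). An engine in which any tag may read forloop from the context at render time cannot take that shortcut.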

What’s even worse than the list object inside this loop is context.resolve. And that’s something Django does for every variable access. Imagine you are three levels inside your template (a with, a loop and another loop) and now you try to access a variable inside your loop that was passed to the template. Django has to traverse the context four levels up to get to that data. That’s very expensive, especially compared to what Jinja does. A local variable in Python as used by Jinja does not end up in a dictionary unless locals() is called or frame.f_locals is accessed. And as long as it’s not in a dictionary no hash code is calculated and no dict resizing takes place. Instead the name gets a number and a place to be. When the function is called Python has already reserved space for that variable. These fast-locals (the internal name for those) are already blazingly fast compared to a normal dict lookup, and even faster compared to what Django does to resolve variables. You can’t get that without creating bytecode or generating Python code and compiling that.
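The cost difference can be made visible with a toy context that counts how many dictionaries it has to look into. This is a Python 3 sketch with invented names (StackedContext, render_with_context, render_with_local), not Django’s Context class, and it counts dictionary probes rather than measuring time.

```python
class StackedContext:
    """A stack of dicts that is searched from the innermost scope outwards."""

    def __init__(self, data):
        self.dicts = [dict(data)]
        self.probes = 0  # how many dicts resolve() had to look into

    def push(self):
        self.dicts.append({})

    def pop(self):
        self.dicts.pop()

    def resolve(self, name):
        for scope in reversed(self.dicts):
            self.probes += 1
            if name in scope:
                return scope[name]
        return None


def render_with_context(context, times):
    # AST-evaluator style: every access walks the stack again.
    out = []
    for _ in range(times):
        out.append(context.resolve("title"))
    return out


def render_with_local(context, times):
    # Generated-code style: resolve once, then use a function local.
    l_title = context.resolve("title")
    out = []
    for _ in range(times):
        out.append(l_title)
    return out


context = StackedContext({"title": "Users"})
for _ in range(3):  # a with, a loop and another loop
    context.push()

render_with_context(context, 10)
print(context.probes)  # 40: four dicts for each of the ten accesses

context.probes = 0
render_with_local(context, 10)
print(context.probes)  # 4: one walk up the stack, then only the local
print(render_with_local.__code__.co_varnames)
```

The co_varnames tuple lists the names that got a fixed slot in the function’s frame. l_title is one of them, so reading it inside the loop involves no dictionary at all.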

Synopsis

Django templates are currently…

  • …a lot slower than they have to be
  • …held back by a very weak design that doesn’t really help anyone
  • …also not threadsafe due to some bugs
  • …impossible to further optimize, especially not by “just compiling it to python”
  • …Django’s weakest component
  • …a pain in the ass if you want to profile Django

My Pony Request

Django 1.0 is out but that doesn’t mean it’s a good time to stop working on making Django better. It doesn’t help to justify the template language implementation by saying it’s fast to parse. All the sub-parsers involved make it rather slow, and if you have the threading problems under control the templates stay in memory until the process shuts down anyway.

Improving the template engine is possible, not that hard and will make everybody happier, and you don’t have to sacrifice your logic-less templates for that.

And if that’s too radical it would be a step in the right direction to get the threading problems solved.

21 Responses to “Why Jinja is not Django and why Django should have a look at it”

  1. Great post. I want one this pony too.

    Comment by SmileyChris — Tuesday, September 16th, 2008 @ 11:43 pm
  2. Very convincing. I have to say that I’ve always written off Jinja as a hacked up version of Django’s template language, but I’ll definitely give it a second look. One question–I have written several templatetags that depend on having access to the context. For example django-pagination does a lot with the context and I use it in pretty much all of my projects. What would be the process for porting something like that over to Jinja? Would it be something that would simply be impossible with Jinja?

    Thanks, and great work!

    Comment by Eric Florenzano — Wednesday, September 17th, 2008 @ 12:11 am
  3. Also, I want to say sorry for having written it off!

    Comment by Eric Florenzano — Wednesday, September 17th, 2008 @ 12:31 am
  4. Nice writeup!

    The doomsday of some of Django’s initial design decisions has been held off. More often than not for good reason.
    Path dependency factors may render a full scale revision of the template system, as well as native support for SQLAlchemy, unlikely at best. Time will tell.

    Regards,
    zoura

    Comment by zoura — Wednesday, September 17th, 2008 @ 2:21 am
  5. Very good post, thanks for sharing the experience!

    Comment by sean — Wednesday, September 17th, 2008 @ 2:58 am
  6. How about using the Django template system to render its own templates into Jinja2 templates? :)

    Comment by Chris Leary — Wednesday, September 17th, 2008 @ 4:29 am
  7. There’s lots to like about Jinja, but here’s what’s keeping me from considering it:

    From the Jinja docs: “By writing extensions you can add custom tags to Jinja2. This is a non trival task and usually not needed as the default tags and expressions cover all common use cases.”

    This seems odd to me. Although writing custom tags is a coding process, the actual use of the tags should be designer-friendly. I’m not sure expressions are.

    I just don’t understand the idea that Jinja or any template language could cover all common use cases. Almost every Django application I’ve seen includes custom tags that extend templating in a way that could not possibly be addressed by the authors of the template engine. The same applies to all the other systems I have worked with.

    I must be missing something.

    Comment by Greg Fuller — Wednesday, September 17th, 2008 @ 5:01 am
  8. Yes, this is why I’m using django + jinja2 now.

    Comment by Andrey Popp — Wednesday, September 17th, 2008 @ 6:43 am
  9. When you say ‘also not threadsafe due to some bugs’, are these bugs still in Django 1.0?

    At this point had been saying that Django 1.0 was probably okay to use in multithreaded configuration with mod_wsgi, but if there are still multithreading problems, may have to go back to counselling caution in using multithreading and suggest keeping with a prefork single threaded type model.

    Comment by Graham Dumpleton — Wednesday, September 17th, 2008 @ 7:15 am
  10. > That AST evaluator is seriously killing every possibility to get useful profiler information out of the system.

    Oh yes, profiling Django is something unbelievably hard. :\

    Comment by Alexander Solovyov — Wednesday, September 17th, 2008 @ 7:29 am
  11. Great post! I agree even if i think the world would be better without templates at all.

    Whats your nick on irc?

    Comment by M — Wednesday, September 17th, 2008 @ 8:30 am
  12. This is really well done. The world needs more of these in-depth posts.

    Comment by Le Roux — Wednesday, September 17th, 2008 @ 8:52 am
  13. @7: There are some tags missing in (core) Jinja where I agree that the extension system is useful. For example “{% cache %}” or “{% blocktrans %}” (which however is named “trans” in the implementation shipped in the extension module). But those are the minority of things you are writing tags for in Django currently. For example instead of writing a tag for “{% url %}” it’s easier for you and the template designer to just call a function “{{ url() }}” in Jinja.

    I like the custom tag system in Django, and I wouldn’t want to see it go away if the template engine is improved. But for example “ifnotequal foo bar” “endifnotequal” is in no way better than “if foo != bar” “endif”.

    @9: Yep, they are still there.

    Comment by Armin Ronacher — Wednesday, September 17th, 2008 @ 9:30 am
  14. @11: mitsuhiko on irc.freenode.net

    Comment by Armin Ronacher — Wednesday, September 17th, 2008 @ 10:10 am
  15. @7: I’m not thinking of those kinds of tags at all. I’m thinking of application specific tags, not utility or logic tags. Many systems have them, here’s a couple of examples:

    http://www.movabletype.org/documentation/appendices/tags/

    They can be the primary way to expose content, or a way to expose certain kinds of content conditionally, or even a way of pulling in forms:

    http://docs.djangoproject.com/en/dev/ref/contrib/comments/#comment-template-tags

    Comment by Greg Fuller — Wednesday, September 17th, 2008 @ 12:20 pm
  16. @15: thats what jinja expressions just do fine

    Comment by RonnyPfannschmidt — Wednesday, September 17th, 2008 @ 12:54 pm
  17. @15: should have said @13, not @7

    Comment by Greg Fuller — Wednesday, September 17th, 2008 @ 1:02 pm
  18. I’ve found custom template tags extremely useful to pull in data in situations where I’m overriding a template tag provided by a reusable app and I don’t want to touch the view code. Last time I wanted to show some custom stuff on a certain page in my Satchmo installation, so I simply wrote a custom template tag that assigned to the context, smth like: {% get_my_data as my_data %}.

    This may sound stupid, but can I do that in Jinja? Macros can’t access the database but templatetags can.

    Comment by Erik Allik — Wednesday, September 17th, 2008 @ 7:19 pm
  19. @18: You would put a function `get_my_data` into the global template context and do something like “{% set my_data = get_my_data() %}”.

    Comment by Armin Ronacher — Wednesday, September 17th, 2008 @ 7:50 pm
  20. Malcolm Tredinnick wrote this on the django-developers mailing list:
    http://groups.google.com/group/django-developers/browse_thread/thread/4f5b01719e497325/a60123ee5af0c566#a60123ee5af0c566

    Comment by Jens Diemer — Thursday, September 18th, 2008 @ 8:20 am
  21. I like jinja2 because I write less code and it’s faster than with django. That’s what I am looking for. I have just migrated the user home page of mixin.com (http://www.mixin.com/users/ndengler/) from django to jinja2 on our test environment. The time used for rendering the template shrunk from ~1000ms to ~400ms (which is still too much). It’s more than 2 times faster and we are planning to migrate our complete application to jinja2. I have also removed many template filters which were just calling a method with arguments. I wish I had known jinja before. Thank you for your work.

    Comment by jvisinand — Thursday, September 18th, 2008 @ 4:18 pm
