Armin Ronacher

WSGI and the unsolved problems

written by Armin Ronacher, on Friday, June 9, 2006 0:00.

WSGI rocks. But there are still some unsolved problems.

WSGI defines how a python application should talk to a gateway, which in turn talks to the webserver. That works quite well, and so does the middleware system. But there is much more to a web application than just sending output to the client's browser.

Basically a web application has to:

  • send static content (css files...)
  • send dynamic content
  • get user submitted data (parse POST data...)
  • handle multiple connections
  • be fast

When I started to develop web applications I used PHP, as many other people did. PHP works in a very basic way. When the webserver encounters a .php file it calls the interpreter with that file as argument. Using mod_php or FastCGI instead of CGI improves performance by keeping the interpreter in RAM. But for the PHP developer that doesn't matter. He just uses echo to send data and the magic superglobal variables to access user-provided data.

PHP developers also don't have to think about static files, since everything lies in the document root, static and dynamic content alike. URLs normally don't get designed but rewritten using the apache mod_rewrite module.

The advantages are: Easy to deploy, easy to develop, you don't have to think about threading or internal structures.

Now let's think about the python situation. We have many frameworks and many cgi scripts. And something called mod_python, twisted, insert your development system here, ...

Some years ago Phillip J. Eby recognized the problem and wrote the famous PEP 333. The situation changed a bit. We now have another 20 new implementations like web.py, colubrid, django, rhubarbtart... But they have something in common: they build upon WSGI, so they run on FastCGI, mod_python, CGI and some standalone servers. A big improvement in my opinion. But this doesn't solve problems like handling static files, parsing form data...

Each of those implementations parses form data on its own. django and colubrid use the python email package, web.py uses the cgi FieldStorage and others do similar things. Many of them (paste, pylons, django, colubrid) implement a request object which provides simple access to user data...
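To make the duplicated work concrete, here is a minimal sketch of the kind of form parsing every framework ends up writing. It assumes Python 3 and only handles urlencoded POST bodies (no multipart uploads); the helper name parse_form is made up for this example and is not taken from any of the frameworks mentioned.

```python
import io
from urllib.parse import parse_qs


def parse_form(environ):
    """Return urlencoded POST fields as a dict of name -> list of values.

    Hypothetical helper for illustration; multipart bodies are not handled.
    """
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    body = environ["wsgi.input"].read(length)
    return parse_qs(body.decode("utf-8"), keep_blank_values=True)


# Hand-built environ standing in for what a WSGI server would pass in.
payload = b"name=Armin&tag=wsgi&tag=python"
environ = {
    "REQUEST_METHOD": "POST",
    "CONTENT_TYPE": "application/x-www-form-urlencoded",
    "CONTENT_LENGTH": str(len(payload)),
    "wsgi.input": io.BytesIO(payload),
}
form = parse_form(environ)
```

Repeated fields such as tag come back as a list, which is one of the details each implementation has to decide on by itself.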

And then there is the "how to send data to the browser" problem. django, rhubarbtart, colubrid and pylons use a response object to send data to a central controller which converts that to a valid wsgi iterable. Others like web.py monkeypatch sys.stdout with a thread local request mapper which internally collects data from the views.
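The response object approach could look roughly like the sketch below. It assumes Python 3; the Response class, view and controller are hypothetical names for this example and do not reproduce the API of django, rhubarbtart, colubrid or pylons.

```python
class Response:
    """Hypothetical response object that knows how to speak WSGI."""

    def __init__(self, body="", status="200 OK",
                 content_type="text/plain; charset=utf-8"):
        self.body = body
        self.status = status
        self.headers = [("Content-Type", content_type)]

    def __call__(self, environ, start_response):
        # Convert the collected data into a valid WSGI iterable.
        data = self.body.encode("utf-8")
        headers = self.headers + [("Content-Length", str(len(data)))]
        start_response(self.status, headers)
        return [data]


def view(environ):
    # The view never touches start_response; it just returns an object.
    return Response("Hello World!\n")


def controller(environ, start_response):
    """Central controller: call the view, then let the response do the WSGI part."""
    response = view(environ)
    return response(environ, start_response)
```

The view stays free of WSGI details, and no global state is needed to get the data back to the server.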

The web.py way looks like the PHP way: just use the python print statement. Pylons does similar things by implementing thread local objects like c, m...

None of them thinks of static files... (subdomains and one-folder solutions are a solution for some cases, which I'll explain later)
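For comparison, a thread-local stdout patch could be sketched like this. It assumes Python 3 (where print is a function) and is not web.py's actual code; ThreadLocalStdout and render_view are made-up names for this example.

```python
import sys
import threading


class ThreadLocalStdout:
    """Hypothetical sys.stdout replacement with one capture buffer per thread."""

    def __init__(self, fallback):
        self._local = threading.local()
        self._fallback = fallback

    def start_capture(self):
        self._local.buffer = []

    def finish_capture(self):
        chunks = self._local.buffer
        del self._local.buffer
        return "".join(chunks)

    def write(self, text):
        buffer = getattr(self._local, "buffer", None)
        if buffer is None:
            # Not inside a request on this thread: behave like the real stdout.
            return self._fallback.write(text)
        buffer.append(text)
        return len(text)

    def flush(self):
        self._fallback.flush()


def render_view(proxy, view):
    """Run view() and return everything it printed on this thread."""
    proxy.start_capture()
    try:
        view()
    finally:
        body = proxy.finish_capture()
    return body
```

To use it, sys.stdout has to be replaced with a ThreadLocalStdout instance for the whole process, which is exactly the kind of hidden global state the print-style approach depends on.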

Python isn't PHP

That's the most important part. PHP doesn't allow you to manage threads since it only gives you access to the thread of the current request, with a new root scope. (I know that PHP isn't threadsafe by now, but once it is, it won't work any differently)

We can't do the same in python if we want fast applications. I know that it's possible to do something like this:

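import os

root = '/home/www-data/public_html'

def run(filename):
    # Execute the requested script in a fresh namespace on every call,
    # just like PHP does. (execfile only exists in python 2.)
    ns = {}
    execfile(os.path.join(root, filename), ns)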

This would then work like PHP. But that's not fast. And I don't think that's a pythonic way of doing web development.

So, in my mind, patching sys.stdout or implementing a module with a magic thread local request object isn't a good way.

WSGI

WSGI proposes that applications are callables which get passed a function for sending headers to the server and a dict with access to CGI variables. Each iteration of that application sends data and flushes afterwards. That's pythonic. But not easy to use.
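As a minimal sketch of that interface, assuming Python 3 (where PEP 3333 requires the body chunks to be bytes): application is a plain WSGI callable, and call_app is a hypothetical helper that drives it by hand the way a gateway would.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI application: a callable taking the environ dict and start_response."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    # Each yielded item is one chunk of the response body.
    yield b"Hello World!\n"
    yield ("PATH_INFO: %s\n" % environ.get("PATH_INFO", "")).encode("utf-8")


def call_app(app, path="/"):
    """Call the app like a gateway would and collect status, headers and body."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    result = {}

    def start_response(status, headers, exc_info=None):
        result["status"] = status
        result["headers"] = headers

    result["body"] = b"".join(app(environ, start_response))
    return result
```

Everything a framework offers on top (request objects, responses, dispatching) has to be built from these two arguments.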

Also WSGI assumes that you have one application file. So nothing with "pass everything which matches *.py to the webserver". You have one handler file and use the passed PATH_INFO to feed your own url dispatcher. This is pythonic too and leads to nice urls, and to a problem: where the hell can the application publish static files if everything gets passed to an internal url dispatcher?
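A url dispatcher fed by PATH_INFO could be sketched as follows. This assumes Python 3; URL_MAP, the handler names and the "app.url_args" environ key are all invented for this example.

```python
import re


def index(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"index\n"]


def show_post(environ, start_response):
    post_id = environ["app.url_args"]["post_id"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("post %s\n" % post_id).encode("utf-8")]


def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found\n"]


# Hypothetical url map: regular expression -> WSGI callable.
URL_MAP = [
    (re.compile(r"^/$"), index),
    (re.compile(r"^/post/(?P<post_id>\d+)/?$"), show_post),
]


def dispatcher(environ, start_response):
    """Pick a handler based on PATH_INFO; every request lands here first."""
    path = environ.get("PATH_INFO", "") or "/"
    for pattern, handler in URL_MAP:
        match = pattern.match(path)
        if match:
            # "app.url_args" is a made-up environ key for this sketch.
            environ["app.url_args"] = match.groupdict()
            return handler(environ, start_response)
    return not_found(environ, start_response)
```

A request for /static/style.css ends up in not_found here, because the dispatcher only knows about views and nothing about files on disk.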

You can...

  • ...use an alias in the server config to map a filesystem folder to a subfolder of your application, which bypasses the wsgi application.
  • ...use a subdomain
  • ...serve the files using your wsgi app (which is dangerously slow)
  • ...avoid using static files :-)
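The third option, serving the files from the wsgi app itself, could be sketched as a small middleware. This assumes Python 3; StaticFiles is a hypothetical class for this example, it reads whole files into memory and does no caching, so it is nowhere near what a real webserver does.

```python
import mimetypes
import os


class StaticFiles:
    """Hypothetical middleware: serve files below `prefix` from `directory`."""

    def __init__(self, app, prefix, directory):
        self.app = app
        self.prefix = prefix.rstrip("/") + "/"
        self.directory = os.path.realpath(directory)

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if not path.startswith(self.prefix):
            # Not a static url: hand the request to the wrapped application.
            return self.app(environ, start_response)

        relative = path[len(self.prefix):]
        filename = os.path.realpath(os.path.join(self.directory, relative))
        # Refuse anything that escapes the static directory (e.g. "../").
        inside = filename.startswith(self.directory + os.sep)
        if not inside or not os.path.isfile(filename):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found\n"]

        content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        with open(filename, "rb") as f:
            data = f.read()  # fine for a sketch; a real server would stream
        start_response("200 OK", [
            ("Content-Type", content_type),
            ("Content-Length", str(len(data))),
        ])
        return [data]
```

Wrapping an application as StaticFiles(app, "/static", "/path/to/files") keeps the url dispatcher untouched, but every file request still runs through python.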

django wants you to use a subdomain. But imagine you want to use a Moin wiki, a pocoo forum and a django cms. And every application requires its own subdomain or static files alias...

Or think about an extensible application like a forum system with plugins. Each plugin might publish static javascript files, images... No problem with PHP, but it requires some additional code in python. And imagine that you have more than one application, maybe four. And each of your applications handles url dispatching and static file serving on its own. Four times. That's redundant code. A common problem in the python community.

Shift me to a higher level

Now restart where mod_php starts. Each .py file gets passed to the python interpreter. The python interpreter imports that file and looks for a function called process_request:

#!/usr/bin/python_cgi
# -*- coding: utf-8 -*-

def process_request(req):
    req.start_response(200)
    req.header('Content-Type: text/plain')
    req.write('Hello World!\n\n')
    req.write('QUERY: %r\n' % req.args)
    req.write('POST: %r\n' % req.form)
    req.flush()

The first important line is the hashbang. It doesn't open this file with python but with a python file that might look like this:

import sys
from cgi import execute
execute(sys.argv[1])

Next, the start_response method. As you can see, the only parameter passed to process_request is a single object you can use for both sending and retrieving data. It features already parsed form data and url parameters as well as the raw values and access to the input, output and error streams.

You're kidding...

Yes, I am. Because the solution posted above isn't better. It's even worse. Looks like the bastard child of mod_python. And that's what it is. It would fix the static file problem, but remove the cool middleware thingy and the url dispatching. Welcome to the world of a python that looks like PHP.

And now?

Over the last few days I looked at mod_ruby, mod_perl, mod_python, mod_php, various FastCGI implementations and even java. Looks like the python situation isn't the worst. We don't have to edit xml files, restart servers for code changes or compile source, we don't lack a unified interface like wsgi and we don't have threading issues. The only problems left:

  • a missing base wsgi package
  • static files

The latter is unfixable IMO. The first one is easy to fix. Take a bunch of good python web developers, do some brainstorming and introduce a wsgi package which helps you to parse form data, provides a standalone server with code reloading and some often used middlewares (debugging, static exports...), and avoids thread local magic. Then frameworks can use this module to avoid DIY solutions.

And once that is finished, rewrite the import statement to allow eggs to be imported into subpackages of applications, bundle setuptools and use those eggs to deploy applications and plugins.

ps3 and wii

written by Armin Ronacher, on Thursday, June 8, 2006 0:00.

This year Nintendo releases the wii and Sony does the same with their playstation3. Here is my comment on the situation.

I like videogames. I own a Nintendo64, a GameCube and a Nintendo DS. Looks like I'm one of those nintendo fanboys. Maybe that is true. I often played SNES games together with a friend of mine and when I heard of that new Nintendo64 I wished that the Christ Child would bring me that cool new console with the Super Mario and Mario Kart games. And I liked them. Later I bought myself some more games and had fun playing them.

When I got older I bought a GameCube with Super Smash Bros which was one of the coolest multiplayer games ever.

I never bought a PS or PS2 because I played shooters like Battlefield, Ghost Recon and others on my computer. But I realized that the GameCube is a dead end so I wanted to buy a PS3. (Also because it will run Linux)

But hey. I read bad news:

Looks like PS3 will cost at least 500$ which is much more than I would pay for a console. But not a problem. I could wait until the console reaches 300$ or less.

PS3 will ship with a Blu-ray drive. I know that's a fact that is not that bad, but consider this: a) HD DVD is approved by the DVD Forum as the HDTV successor of the DVD. That means that a PS3 customer can't be sure that he will still be able to buy films in two years if the format isn't supported by more companies than Sony. HD DVD is supported by Microsoft and others and there are both media content and playback devices available by now. And b) I don't want a movie player, I want a console.

Hardware upgrades are possible. Argh. Not a second n64 expansion pack buy-or-die situation for gamers.

I watched the e3 sony presentation and it was very boring in comparison to the nintendo presentation. (And the psp demo of the emulator with the ridge racer ps1 game was depressing. Nobody applauded and the sony officer tried his best to motivate the audience)

The wii news and blog posts sound great so far. I can't wait for the launch to try it out myself.

I'm not sure I'll buy any of them this year. I'll have to leave school and it sounds like I've to learn a lot. Videogames aren't the best for improving success at school :-)

The 500 Miles Cronjob

written by Armin Ronacher, on Tuesday, June 6, 2006 0:00.

Imagine that you have one server located in Paris. And a second directly in the same datacenter, same rack and on the same switch. And a server you had nearly forgotten in Nuremberg - in a different country 500 miles away...

I've been working for the german ubuntu locoteam since November 2004 - shortly after the warty release. When Sascha and I started to build up a community portal we hosted a phpbb (arrr! we still use that crappy software) forum on a virtual server. As time went by and ubuntu got hyped, the vserver was too slow for the high number of connections. Matthias Urlichs then hosted ubuntuusers for us on his private box called netz, which is located in Nuremberg.

In July 2005 the server reached its limit and together with the french locoteam we got two new servers (actually more, but one isn't working yet and the other one is in an external datacentre) and moved the webpages to those new servers.

After moving the german portal software to mawu (the name of the apache server) I took a vacation and the others of the team finished the setup alone.

I always wanted to know where the searchindex indexer was running because I never saw a task for it. But I always thought the task might be called just python so I never thought more about it. Until today. I wanted to change something in the sourcecode and couldn't find the script. It looked like nobody had moved the script to the new server. But the searchindex looked up to date. I logged in on netz and top showed me a running indexer script. It used smurf's own python module for connecting to the database. Since we had updated the configs on netz too, it automagically connected to sql.ubuntu-eu.org, the server in Paris.

Damn. The whole indexer was running over a big distance and nobody noticed it...

btw: I noticed that the archives in this blog are broken. I'll try to fix that asap

AFK

written by Armin Ronacher, on Tuesday, May 30, 2006 0:00.

grml. afk for four weeks.

I'm afk for 4 weeks. I broke my upper arm, so there will be no updates from me in the repositories of any project I'm involved in. I'll try to be online from time to time to stay up to date; you may reach me on irc.freenode.net

Vim Theme

written by Armin Ronacher, on Monday, May 29, 2006 0:00.

Because someone recently asked me what vim theme I use, here is the theme for download.

It's a very basic dark theme as a replacement for the shipped "desert" colorscheme. A bit more colorful :-) You can find a screenshot here.

Please note that this is a gui theme. The commandline support isn't that good.

have fun

download native.vim