WSGI and the unsolved problems
WSGI rocks. But there are still some unsolved problems.
WSGI defines how a Python application talks to a gateway, which in turn talks to the webserver. That works quite well, and so does the middleware system. But there is much more to a web application than just sending output to the client's browser.
Basically a web application has to:
- send static content (css files...)
- send dynamic content
- get user submitted data (parse POST data...)
- handle multiple connections
- be fast
When I started developing web applications I used PHP, as many other people did. PHP works in a very basic way. When the webserver encounters a .php file, it calls the interpreter with that file as its argument. Using mod_php or FastCGI instead of CGI improves performance by keeping the interpreter in RAM. But for the PHP developer that doesn't matter. He just uses echo to send data and the magic superglobal variables to access user-provided data.
PHP developers also don't have to think about static files, since everything lives in the document root, static and dynamic content alike. URLs normally aren't designed but rewritten using the Apache mod_rewrite module.
The advantages are: easy to deploy, easy to develop, and you don't have to think about threading or internal structures.
Now let's think about the Python situation. We have many frameworks and many CGI scripts. And something called mod_python, twisted, insert your development system here, ...
Some years ago Phillip J. Eby recognized the problem and wrote the famous PEP 333. The situation changed a bit. We now have another 20 new implementations like web.py, colubrid, django, rhubarb tart... But they have something in common: they build upon WSGI, so they run on FastCGI, mod_python, CGI and some standalone servers. A big improvement in my opinion. But this doesn't solve problems like handling static files, parsing form data...
Each of those implementations parses form data on its own. django and colubrid use the Python email package, web.py uses cgi.FieldStorage, and others do similar things. Many of them (paste, pylons, django, colubrid) implement a request object which provides simple access to user data...
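To make the duplication concrete, here is a minimal sketch of the kind of form parsing each framework ends up writing for itself. It is not taken from any of the frameworks named above. The function name parse_form is hypothetical, it assumes Python 3, and it only handles urlencoded bodies (no file uploads).

```python
from urllib.parse import parse_qs


def parse_form(environ):
    """Return (args, form) parsed from a WSGI environ dict.

    args holds the URL parameters, form the urlencoded POST data.
    Both map each key to a list of values.
    """
    args = parse_qs(environ.get('QUERY_STRING', ''))
    form = {}
    content_type = environ.get('CONTENT_TYPE', '')
    if (environ.get('REQUEST_METHOD') == 'POST'
            and content_type.startswith('application/x-www-form-urlencoded')):
        # CONTENT_LENGTH may be missing or empty, treat that as zero
        length = int(environ.get('CONTENT_LENGTH') or 0)
        body = environ['wsgi.input'].read(length)
        form = parse_qs(body.decode('utf-8'))
    return args, form
```

Multipart bodies, size limits and encodings other than UTF-8 are left out here, and those are exactly the parts every framework solves a little differently.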
And then there is the "how to send data to the browser" problem. django, rhubarbtart, colubrid and pylons use a response object to send data to a central controller which converts that to a valid WSGI iterable. Others like web.py monkeypatch sys.stdout with a thread-local request mapper which internally collects data from the views.
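The response object approach can be sketched roughly like this. The names Response, hello_view and controller are made up for illustration and do not mirror the API of any of the frameworks mentioned. The sketch assumes Python 3.

```python
class Response(object):
    """Hypothetical response object: collects body, status and headers."""

    def __init__(self, body='', status='200 OK', content_type='text/plain'):
        self.body = body
        self.status = status
        self.headers = [('Content-Type', content_type + '; charset=utf-8')]

    def __call__(self, environ, start_response):
        # convert the collected data into what WSGI expects:
        # a start_response call plus an iterable of byte strings
        data = self.body.encode('utf-8')
        headers = self.headers + [('Content-Length', str(len(data)))]
        start_response(self.status, headers)
        return [data]


def hello_view(environ):
    """A view only builds a response and never touches start_response."""
    return Response('Hello World!\n')


def controller(environ, start_response):
    """Central controller: call the view, turn its response into WSGI."""
    response = hello_view(environ)
    return response(environ, start_response)
```

The view stays free of WSGI details, and no global state such as sys.stdout has to be patched to get the data out.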
The web.py way looks like the PHP way: just use the Python print statement. Pylons does similar things by implementing thread-local objects like c, m...
None of them thinks about static files... (subdomains and one-folder solutions work for some cases, as I'll explain later)
Python isn't PHP
That's the most important part. PHP doesn't allow you to manage threads, since it only gives you access to the thread of the current request, with a new root scope. (I know that PHP isn't thread-safe by now, but once it is, it won't work differently.)
We can't do the same in Python if we want fast applications. I know that it's possible to do something like this:
import os
root = '/home/www-data/public_html'
def run(filename):
    ns = {}
    execfile(os.path.join(root, filename), ns)
This would then work like PHP. But that's not fast. And I don't think that's a pythonic way to do web development.
So, in my mind, patching sys.stdout or implementing a module with a magic thread-local request object isn't a good way.
WSGI
WSGI proposes that applications are callables which get passed a dict with access to the CGI variables and a function for sending the status and headers to the server. Each item yielded by the application's return value is sent to the client and flushed afterwards. That's pythonic. But not easy to use.
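For readers who have not seen one, a bare WSGI application looks roughly like this. It is a minimal sketch for Python 3, where the body has to be bytes. The name application is only a common convention.

```python
def application(environ, start_response):
    """Minimal WSGI application written as a generator."""
    # environ is the dict with the CGI-style variables,
    # start_response is the callable that sends status and headers
    start_response('200 OK', [('Content-Type', 'text/plain')])
    # each yielded chunk is passed on to the client by the server
    yield b'Hello World!\n'
    path = environ.get('PATH_INFO', '/')
    yield ('You asked for %s\n' % path).encode('utf-8')
```

To try it out, the standard library's wsgiref.simple_server.make_server('localhost', 8080, application).serve_forever() is enough. Everything beyond this, such as form parsing, dispatching and static files, is left to the application.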
WSGI also assumes that you have one application file. So nothing like "pass everything which matches *.py to the webserver". You have one handler file and use the passed PATH_INFO to feed your own URL dispatcher. This is pythonic too, and leads to nice URLs and to a problem: where the hell can the application publish static files if everything gets passed to an internal URL dispatcher?
You can...
- ...use an alias in the server config to map a filesystem folder to a subfolder of your application, bypassing the WSGI application.
- ...use a subdomain
- ...serve the files using your wsgi app (which is dangerously slow)
- ...avoid using static files :-)
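The third option, serving the files from the WSGI app itself, could be sketched like this. The make_app factory, the /static/ prefix and the dummy index page are all assumptions made for illustration. The sketch assumes Python 3.

```python
import mimetypes
import os


def make_app(static_root):
    """Build a WSGI app that dispatches on PATH_INFO and serves
    everything below /static/ from the folder static_root."""
    base = os.path.realpath(static_root)

    def not_found(start_response):
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'Not Found\n']

    def serve_static(rel_path, start_response):
        path = os.path.realpath(os.path.join(base, rel_path))
        # refuse anything that escapes the static folder (e.g. "../")
        if not path.startswith(base + os.sep) or not os.path.isfile(path):
            return not_found(start_response)
        ctype = mimetypes.guess_type(path)[0] or 'application/octet-stream'
        # the whole file is read into memory for every single request
        with open(path, 'rb') as f:
            data = f.read()
        start_response('200 OK', [('Content-Type', ctype),
                                  ('Content-Length', str(len(data)))])
        return [data]

    def application(environ, start_response):
        path = environ.get('PATH_INFO', '') or '/'
        if path.startswith('/static/'):
            return serve_static(path[len('/static/'):], start_response)
        if path == '/':
            start_response('200 OK', [('Content-Type', 'text/plain')])
            return [b'dynamic index page\n']
        return not_found(start_response)

    return application
```

Every static request passes through the Python dispatcher and the file is read by Python code, which is why this is the slow option. There is also no caching and no support for conditional requests here.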
django wants you to use a subdomain. But imagine you want to use a Moin wiki, a pocoo forum and a django CMS, and every application requires its own subdomain or static files alias...
Or think about an extensible application like a forum system with plugins. Each plugin might publish static JavaScript files, images... No problem with PHP, but it requires some additional code in Python. And imagine that you have more than one application, maybe four, and each of them handles URL dispatching and static file serving on its own. Four times. That's redundant code. A common problem in the Python community.
Shift me to a higher level
Now restart where mod_php starts. Each .py file gets passed to the Python interpreter. The Python interpreter imports that file and looks for a function called process_request:
#!/usr/bin/python_cgi
# -*- coding: utf-8 -*-
def process_request(req):
    req.start_response(200)
    req.header('Content-Type: text/plain')
    req.write('Hello World!\n\n')
    req.write('QUERY: %r\n' % req.args)
    req.write('POST: %r\n' % req.form)
    req.flush()
The first important line is the hashbang. It doesn't open this file with python but with a python file that might look like this:
import sys
from cgi import execute
execute(sys.argv[1])
Next, the process_request function. As you can see, its only parameter is a request object you can use for sending and retrieving data. It features already parsed form data and URL parameters, as well as the raw values and access to the input, output and error streams.
You're kidding...
Yes, I am. Because the solution posted above isn't better. It's even worse. It looks like the bastard child of mod_python, and that's what it is. It would fix the static file problem, but remove the cool middleware thingy and the URL dispatching. Welcome to the world of a Python that looks like PHP.
And now?
Over the last few days I looked at mod_ruby, mod_perl, mod_python, mod_php, various FastCGI implementations and even Java. It looks like the Python situation isn't the worst. We don't have to edit XML files, restart servers for code changes or compile source. We don't lack a unified interface like WSGI, and we don't have threading issues. The only problems left:
- a missing base wsgi package
- static files
The latter is unfixable IMO. The first one is easy to fix. Take a bunch of good Python web developers, do some brainstorming and introduce a WSGI package which helps you parse form data, provides a standalone server with code reloading and some often-used middlewares (debugging, static exports...), and avoids thread-local magic. Then frameworks can use this module to avoid DIY solutions.
And once that is finished, rewrite the import statement to allow eggs to be imported into subpackages of applications, bundle setuptools, and use those eggs to deploy applications and plugins.