I'm very happy and proud to release the latest version of Werkzeug, the WSGI utility library for Python. This release is probably the most interesting one so far. We refactored a lot of internals, overhauled the documentation and pushed out some exciting new features. Unfortunately this release also adds some deprecations and drops Python 2.4 support, but you will notice that it's worth it :)
Improved Test Client
Werkzeug has shipped with a minimalist test client since one of the first releases. However, it was never really fun to use, and some of its functionality was also available through other, independent functions such as create_environ. With 0.5 we changed that: we improved the test client's interface and rewrote it to use those other functions as well. The big advantage is that whatever you use now, the parameters are the same.
Because of the unified interface you can now automatically create arbitrary WSGI environments for internal requests or whatever you want. Let me give you a small example that shows how you can create new WSGI environments:
>>> from werkzeug import EnvironBuilder
>>> b = EnvironBuilder(path='/foo', base_url='http://example.com/bar')
>>> b.base_url
'http://example.com/bar/'
>>> b.path
'/foo'
>>> b.script_root
'/bar'
>>> b.host
'example.com'
You can easily use this to create a WSGI environment with file uploads as payload:
>>> b.method = 'POST'
>>> b.files.add_file('file', '/path/to/the/file.txt')
And then create the WSGI environment:
>>> env = b.get_environ()
Not to forget: the test client supports cookies and internal redirects now.
Stream Limiting and Form Data Parsing
For the longest time WSGI has had the problem that the specification requires the input stream to provide a readline method but does not require that method to support the size hint parameter. Fortunately, every implementation we came across provides the size hint anyway. With 0.5 Werkzeug no longer requires readline to support the size argument for form data parsing, which makes it fully WSGI compliant for the first time. From 0.5 onwards the input streams you care about are also automatically limited to the content length. This means you can safely call read() on the input stream (the one on the request object!) without causing a lockup. If you don't want to use the request objects, you can still use the new LimitedStream class that implements the stream limiting. If you want to safely iterate over the input stream line by line without breaking the WSGI specification by passing a size to readline(), you can use the make_line_iter helper function, which returns an iterator over the lines of a stream.
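To make the idea concrete, here is a minimal standard-library sketch of stream limiting and size-free line iteration. It is not Werkzeug's code: the names LengthLimitedStream and iter_lines are made up for this illustration, and the real LimitedStream and make_line_iter may behave differently in the details.

```python
import io


class LengthLimitedStream:
    """Illustrative stand-in, not Werkzeug's LimitedStream."""

    def __init__(self, stream, limit):
        self._stream = stream
        self._remaining = limit

    def read(self, size=-1):
        # Never ask the underlying stream for more than the declared length,
        # so a bare read() cannot block waiting for bytes that never arrive.
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        if size == 0:
            return b""
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


def iter_lines(stream, chunk_size=4096):
    """Yield lines using only read(), never readline(size)."""
    buffer = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last newline is complete; keep the rest.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            yield line + b"\n"
    if buffer:
        yield buffer


# The underlying stream holds more bytes than the request declared.
raw = io.BytesIO(b"a=1\nb=2\nEXTRA BYTES FROM THE NEXT REQUEST")
body = LengthLimitedStream(raw, limit=8)
lines = list(iter_lines(body))
```

The wrapper stops at the declared eight bytes, so the trailing data is never touched and the line iterator terminates on its own.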
But there is more. The parsing system was rewritten for Werkzeug 0.5, which makes it possible to decide where to store a file before the upload starts. In the past Werkzeug always created a temporary file no matter what was uploaded. Now you can react to the size of the upload and provide a different stream instead. For example, you can decide to keep uploaded files in memory if they are smaller than one megabyte and stream larger ones directly to disk.
You can also limit the upload size so that Werkzeug exhausts the input stream and throws away the data if the user uploads a file over the threshold:
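The decision itself can be sketched with the standard library alone. The helper name choose_upload_stream and the one megabyte threshold are assumptions for this example; how such a function is hooked into the request object is not shown here.

```python
import io
import tempfile

ONE_MEGABYTE = 1024 * 1024


def choose_upload_stream(content_length, threshold=ONE_MEGABYTE):
    """Hypothetical helper: pick a target stream before the upload is read."""
    # Small uploads with a known size stay in memory.
    if content_length is not None and content_length < threshold:
        return io.BytesIO()
    # Large uploads, or uploads of unknown size, go to a temporary file.
    return tempfile.TemporaryFile("wb+")


small = choose_upload_stream(10 * 1024)
large = choose_upload_stream(5 * ONE_MEGABYTE)
small_in_memory = isinstance(small, io.BytesIO)
large_in_memory = isinstance(large, io.BytesIO)
large.close()
```

Treating an unknown length like a large upload is a deliberately cautious choice in this sketch, since nothing bounds the memory use otherwise.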
from werkzeug import Request

class LimitedRequest(Request):
    #: not more than 8 MB are accepted
    max_content_length = 1024 * 1024 * 8
    #: the maximum size for regular form data (not files) is 1 MB
    max_form_memory_size = 1024 * 1024
If the user tries to upload more than that and Werkzeug tries to parse the uploaded data, it will raise a RequestEntityTooLarge exception. You can return that as a generic error, catch it and display something nicer instead, or ignore it.
Another thing we noticed when working on file uploads is that many users store uploaded files on the file system under (nearly) the name the client sent. Because you can easily cause trouble that way, we added a function to secure a filename. It makes sure that a user can't upload a file with a spoofed filename in order to escape the upload target folder. It also removes non-ASCII characters and whitespace so that the code works the same on unicode and non-unicode file systems:
>>> from werkzeug import secure_filename
>>> secure_filename("My cool movie.mov")
'My_cool_movie.mov'
>>> secure_filename("../../../etc/passwd")
'etc_passwd'
>>> secure_filename(u'i contain cool ümlauts.txt')
'i_contain_cool_umlauts.txt'
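For readers curious about what such a sanitizer has to do, here is a rough standard-library sketch of the idea. It is an approximation written for this post, not Werkzeug's implementation, and the name sketch_secure_filename is made up. For real uploads, use secure_filename itself.

```python
import re
import unicodedata

# Anything outside this set is dropped from the final name.
_unsafe_chars = re.compile(r"[^A-Za-z0-9_.-]")


def sketch_secure_filename(filename):
    """Approximate a filename sanitizer; not Werkzeug's implementation."""
    # Reduce accented characters to their closest ASCII form, drop the rest.
    filename = unicodedata.normalize("NFKD", filename)
    filename = filename.encode("ascii", "ignore").decode("ascii")
    # Treat path separators like whitespace so no directory part survives.
    filename = filename.replace("/", " ").replace("\\", " ")
    # Collapse all whitespace runs into single underscores.
    filename = "_".join(filename.split())
    # Remove leftover unsafe characters and leading/trailing dots.
    return _unsafe_chars.sub("", filename).strip("._")


movie = sketch_secure_filename("My cool movie.mov")
passwd = sketch_secure_filename("../../../etc/passwd")
umlauts = sketch_secure_filename("i contain cool ümlauts.txt")
```

Stripping the leading dots and underscores at the end is what turns the traversal attempt into a harmless flat name.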
Other Major Changes
The common request headers such as content type, length, referrer or date are now exposed as attributes on the full request objects. The handling of content types was simplified and you can now directly access mimetype parameters. The user agent matcher was improved as well and knows about Google Chrome and some other modern browsers.
Again we beefed up our HTTP support and added more parsing and dumping functions for all kinds of headers. Besides content ranges we should have everything covered now. For better URL compliance the magic “&” / “;” switch on the URL decoder is gone for good.
All attributes on the request object are now read-only, and that includes collections. This gives us the opportunity to fine-tune the objects in the future without breaking code or causing confusing behavior. As part of this we also implemented read-only versions of all builtin container classes:
>>> from werkzeug import ImmutableDict
>>> d = ImmutableDict({"foo": "bar"})
>>> d["foo"]
'bar'
>>> isinstance(d, dict)
True
>>> d["bar"] = "new value"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'ImmutableDict' objects are immutable
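The general technique behind such a container can be sketched in a few lines. This is an illustration only: ReadOnlyDict is a made-up name, and Werkzeug's ImmutableDict may cover more methods or report errors differently.

```python
class ReadOnlyDict(dict):
    """Illustrative only; not Werkzeug's ImmutableDict."""

    def _immutable(self, *args, **kwargs):
        raise TypeError("%r objects are immutable" % type(self).__name__)

    # Route every mutating method to the same error.
    __setitem__ = __delitem__ = __ior__ = _immutable
    clear = pop = popitem = setdefault = update = _immutable


d = ReadOnlyDict({"foo": "bar"})
try:
    d["bar"] = "new value"
except TypeError as exc:
    error_message = str(exc)
```

Because the class still inherits from dict, isinstance checks and all read operations keep working, which is what lets existing code accept the object unchanged.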
The builtin webserver no longer requires wsgiref. It should now work a lot better on Windows systems, where we had some annoying problems with DNS lookups in the past. It also supports any HTTP method now and no longer causes trouble if your code emits a Date header.
Because web servers (and browsers) are often pretty buggy, we added some fixes for servers and browsers to the contrib module. These work around bugs in lighttpd and IIS servers and the infamous Internet Explorer.
New Documentation
Last but not least the best feature of this release: The new documentation. We worked hard documenting every single interface Werkzeug provides, improving the tutorial and finally documenting the contributed modules. The new documentation also has a section on how to configure web servers, how to test applications and some useful notes on request data handling.
Get It
You can get the latest release version directly from the Python Package Index and the Werkzeug website.