written on Saturday, April 14, 2012
A couple of months ago I finally came to the conclusion that the way I have been approaching HTTP is fundamentally flawed, and that I am already so far down the rabbit hole that it's nearly impossible to turn around and fix it.
The core problem is that I never added enough abstraction to the libraries that were implementing HTTP. While there are request and response objects, neither is entirely independent of the actual ongoing HTTP communication. By that I mean that you can't just serialize/unserialize request and response objects and expect them to work. Granted, it sort of works for response objects, and in Werkzeug and most libraries or frameworks you can actually serialize request objects as well if you know how to. But generally speaking, it's not something that was built into the design of those libraries; they are very lightweight wrappers around an external resource (the actual TCP socket connection to the browser).
The reason developers did not do this in the first place is that you need to support requests of arbitrary input size and responses of arbitrary output size. That might be a lot more than you want to keep in memory for the duration of the request, especially if you're handling thousands of requests at the same time. You really want as small a footprint as possible.
But the price we're paying for that is just too high. One direct consequence, for instance, is that the first WSGI request object that starts consuming form data ends up being the only one that can have it. The next request object that starts reading from the stream destroys your request to the point where the whole thing stalls and you have to wait for the browser to time out.
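Here is a minimal sketch of that failure mode, using Werkzeug's Request purely as an illustration (the WSGI application around it is made up):

    from werkzeug.wrappers import Request

    def application(environ, start_response):
        first = Request(environ)
        first.form    # parses the body and thereby consumes wsgi.input

        second = Request(environ)
        second.form   # too late: with an in-memory stream this comes back
                      # empty, with a real socket the parser can block
                      # waiting for bytes that are already gone, until
                      # the client times out

        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'ok']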
What if we did away with streaming? I know what you're saying next: “are you crazy Armin?” — but think about it. How many of the requests you're handling in your application actually need to be streamed? And how many responses? Obviously you can't get rid of streaming in general and I'm not arguing for that. But if I had to do everything over again, I would come up with an entirely separate API for streaming that you have to opt into explicitly.
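To make that concrete, here is a purely hypothetical sketch of what such an opt-in could look like; the expose decorator, the streamed flag and both handlers are made-up names, not any real framework's API:

    HANDLERS = {}

    def expose(path, streamed=False):
        """Registers a handler and records whether it opted into streaming."""
        def decorator(func):
            func.streamed = streamed   # metadata the framework inspects later
            HANDLERS[path] = func
            return func
        return decorator

    @expose('/score')                  # the default: body arrives fully buffered
    def score(data):
        return {'length': len(data)}   # data is plain bytes, already in memory

    @expose('/upload', streamed=True)  # streaming is a deliberate, visible choice
    def upload(stream):
        for chunk in iter(lambda: stream.read(8192), b''):
            pass                       # process each chunk incrementally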
For the code base at Fireteam we are actually exploring this a little bit; it was the natural result of a bunch of different requirements. Our API is not only available via HTTP, it's also available via a bunch of other protocols in order to better support console games. The internal code base still has some HTTP implementation details exposed, but at the end of the day we're essentially agnostic of HTTP.
Internally we're passing data around and it's glorious. If a request comes in over HTTP, we take the path and find the handler for it. That handler is not just a function that is invoked; it has metadata attached to it. Each handler specifies its semantics (whether it updates a resource, whether it deletes a resource, etc.) and what kind of data it accepts.
By the time the handling function is invoked, we have already dealt with everything from the HTTP layer. The function at that point does not care at all whether it was invoked via HTTP or anything else; we're just passing parsed data around. When data is finally about to be transmitted back to the client, the handler's metadata specifies what format is expected and the data is encoded appropriately.
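Here is a loose sketch of the idea with invented names; it is not our actual code, just the shape of it. Metadata lives on the handler, and there is one central place that parses, validates and encodes:

    import json

    def handler(semantics, accepts):
        """Attaches metadata to a plain function instead of wrapping it."""
        def decorator(func):
            func.semantics = semantics   # e.g. 'update' or 'delete'
            func.accepts = accepts       # the fields the handler expects
            return func
        return decorator

    @handler(semantics='update', accepts={'name': str, 'score': int})
    def update_player(data):
        # No request object in sight: just parsed, validated data.
        return {'ok': True, 'player': data['name']}

    def dispatch(func, raw_bytes):
        """The one central place that parses, validates and encodes."""
        data = json.loads(raw_bytes)
        for field, kind in func.accepts.items():
            if not isinstance(data.get(field), kind):
                raise ValueError('bad or missing field: %s' % field)
        return json.dumps(func(data))

    # By the time dispatch() runs it no longer matters whether the bytes
    # arrived over HTTP or some other protocol:
    dispatch(update_player, b'{"name": "joe", "score": 10}')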
Why do this? We're independent of HTTP internals for the most part, which makes the code incredibly nice to deal with. We have one central place that ensures the data we're handling is of the right format, so a client can't disrupt the service by sending too much data; that case is handled automatically. It also means that we can dispatch that data nicely across process and language boundaries if the need comes up.
How do we deal with streamed stuff? For the most part we don't. Request data can be streamed in, but as it arrives we construct objects from it and store them in memory before passing them to the functions that expect them. There are obviously situations where this does not work, namely file uploads. These, however, are handled entirely separately in our code base and don't share much with the rest of the interface.
We decided to explicitly split the implementations for streamed and non-streamed API functions. Not only does this make our server-side code base nicer, it also means that the client-side implementation can make assumptions about how it deals with the data and ignore streaming if it wants to.
How well could that work with a general purpose Python web framework? I think quite well. Imagine that, by default, everything is required to be buffered. The request comes in, the dispatching happens strictly on information available from the headers alone (HTTP method, path, hostname, etc.), and then you find out what function you want to invoke to handle the request.
Once that function is found, the metadata stored with it tells the framework whether it wants to operate on buffered or streamed data. If it operates on buffered data, the body is fully read into memory (after ensuring that it is not too large) and the function is invoked. Likewise the response is fully buffered in memory.
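A sketch of what that could look like at the WSGI level; the handler registry, the streamed flag and the size limit are made up and error handling is omitted, only the environ keys are real:

    MAX_BUFFERED = 1024 * 1024   # an arbitrary 1 MiB cap for buffered bodies
    HANDLERS = {}                # (method, path) -> function, filled elsewhere

    def wsgi_app(environ, start_response):
        # Dispatch strictly on header information: method and path here.
        func = HANDLERS[(environ['REQUEST_METHOD'], environ['PATH_INFO'])]

        length = int(environ.get('CONTENT_LENGTH') or 0)
        if not getattr(func, 'streamed', False):
            # The buffered default: refuse oversized bodies, then read it all.
            if length > MAX_BUFFERED:
                start_response('413 Request Entity Too Large', [])
                return [b'']
            body = environ['wsgi.input'].read(length)
        else:
            # The explicit opt-in gets the raw stream instead.
            body = environ['wsgi.input']

        result = func(body)   # assume handlers return bytes; the response
                              # is buffered in memory as well
        start_response('200 OK', [('Content-Type', 'application/octet-stream')])
        return [result]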
The internal request and response objects would then be written in a way that lets you pass them around efficiently, serialize/unserialize them and so on. You could ignore that a socket is involved and just pass the actual data around in any format you like (which could be HTTP if you really want).
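Here is one possible shape for such an object, again with made-up names (this is not Werkzeug's API): nothing but plain data, so a round-trip through any serialization format is trivial:

    import json

    class BufferedRequest:
        """A request that is just data: no socket, no stream, no environ."""

        def __init__(self, method, path, headers, body):
            self.method = method
            self.path = path
            self.headers = headers   # a plain dict, not a wrapper over a socket
            self.body = body         # bytes, already fully read

        def serialize(self):
            return json.dumps({
                'method': self.method, 'path': self.path,
                'headers': self.headers,
                # latin1 round-trips arbitrary bytes through a JSON string
                'body': self.body.decode('latin1'),
            })

        @classmethod
        def unserialize(cls, blob):
            d = json.loads(blob)
            return cls(d['method'], d['path'], d['headers'],
                       d['body'].encode('latin1'))

    # Hand it across a process boundary, a queue, or a different protocol:
    req = BufferedRequest('POST', '/players', {'Host': 'example.com'}, b'{}')
    assert BufferedRequest.unserialize(req.serialize()).body == b'{}'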
And when you do need streaming (either on input or on output), you typically would not mind the API being different. In fact, most of the time things have to work differently anyway to take advantage of streaming.