NIH in the WSGI World
Today I've seen yet another WSGI powered microframework. It does not do anything another framework does not do, but it exists. Which is not a problem per se. It probably does some things differently to other things out there and that would be perfectly okay. Except … more than half the code is repetitive WSGI bridging.
Seriously guys, stop doing that. For the following reasons:
- `cgi` as a module is for CGI, not WSGI. Don't try to use it for WSGI applications unless you know what you're doing. It expects the server's input stream to implement `readline()` with a size hint, which the specification does not provide for. Also, with the wrong invocation it will read your command line arguments and incorporate those into the parsing process, among other weird things.
- Half the frameworks out there are not implementing proper multi dicts, or they try to leave that to the user by returning either lists or strings from URL parsing or whatever else can yield multiple values for a key.
- Most frameworks get unicode terribly wrong. How hard can it be to properly implement unicode …
- Some of you are expecting EOF on input streams which might not be there.
- URL-decoding the path info is not something you should do. I know there are WSGI servers that are doing it wrong, but those things have to be fixed in a fixer middleware and not in your framework. You cannot reliably auto-detect that.
- Many frameworks are catching system exceptions such as `SystemExit` and `GeneratorExit` and others, which can cause ugly problems.
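Two of the points above can be sketched with nothing but the standard library. This is only an illustration under a few assumptions, not what any particular framework does: the helper names `read_form` and `catch_errors` are made up for this example, the body is assumed to be URL-encoded form data in UTF-8, and a real library handles many more corner cases (multipart bodies, size limits, chunked input).

```python
import sys
from urllib.parse import parse_qs


def read_form(environ):
    """Hypothetical helper: parse a URL-encoded request body."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    # Read only the declared number of bytes. The input stream is not
    # guaranteed to signal EOF, so an unbounded read() may block.
    raw = environ["wsgi.input"].read(length) if length > 0 else b""
    # parse_qs maps every key to a list, so repeated fields are kept
    # instead of being silently collapsed into a single string.
    return parse_qs(raw.decode("utf-8", "replace"), keep_blank_values=True)


def catch_errors(app):
    """Hypothetical middleware: turn application errors into a 500."""
    def wrapper(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            # `except Exception` lets SystemExit, KeyboardInterrupt and
            # GeneratorExit propagate. A bare `except:` would swallow them.
            start_response(
                "500 Internal Server Error",
                [("Content-Type", "text/plain")],
                sys.exc_info(),  # required if headers were already set
            )
            return [b"Internal Server Error"]
    return wrapper
```

Even this small sketch leaves plenty open (streaming responses that fail mid-iteration, for instance), which is rather the point: the existing libraries have already worked through those cases.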
There are libraries like WebOb and Werkzeug that are not doing anything besides the very basic things such as form data parsing and stuff. Especially Werkzeug can be used totally low-level, without any request or response objects. It's just doing the boring parts you don't want to implement anyway.
Why shouldn't you reimplement it yourself? There is not much to gain by doing that. A single dependency for your framework won't kill anyone. The microframeworks in the Ruby World all depend on various stuff (such as Rack). The main reasons for not reimplementing are server and browser bugs, limitations in WSGI that have to be worked around, complex issues that are hard to get right and other things where people should rather collaborate.
It's already problematic that there is the Django core, WebOb and Werkzeug that are all implementing low-level parsing and similar things, but I'm pretty sure that we can do better there. For future bugs in Werkzeug my policy is to check for similar problems in both WebOb and Django to ensure that nobody is missing anything here.
So I beg you: If you're working on a microframework, depend on WebOb, Werkzeug, Django or whatever, or at least steal the code with copy/paste. Talk back to other developers and share patches. You don't win anything by reimplementing basic things on your own. Not even an understanding of HTTP or WSGI; those turn out to be learned only by reading the specifications carefully.
I think part of the issue here is that WSGI looks deceptively simple to someone approaching it for the first time, and so the initial impression is "oh, this will be easy to handle, and I can write something that does just what I want". Of course, down the road it turns out not to be quite so easy since there are plenty of weird-ish provisions and corner cases in the spec that are tricky to get right.
But honestly, at this point I really feel like WSGI should be something that most developers never need to think about; it's simply too low-level (it is, after all, mostly just a wrapper around the CGI model), and the community now has many years of collective experience at developing better abstractions for HTTP handling. It seems like every PyCon I go to the major framework folks are talking about finally sitting down and doing a compatible and interoperable request/response API that they'll all use, but of course it always fizzles out. Which is a shame, because a good API for that, if adopted, would suddenly make a whole lot of code much more portable.
So what I'd love to see is:
— James Bennett on Thursday, July 30, 2009 20:51 #
I created Newf as a one-off project and decided to "release" it to GitHub and have gotten the attention of way more people than it should have received. The damn thing is really not optimal.
If I were to re-do it, I probably wouldn't.
— Jared on Thursday, July 30, 2009 21:01 #
I think the 2nd item James brings up is exceptionally important. I was speaking with Mark Ramm at PyOhio and one of the ideas we discussed was being able to plug Django apps into TurboGears sites and vice versa. Each of these frameworks has a different API that its view/controller functions work with, and that exacerbates any compatibility issues. Being able to have a unified request/response API would alleviate a lot of the concerns here.
— Alex on Thursday, July 30, 2009 21:41 #
How about web.py internals? How mature do you think is that code compared with Werkzeug?
— John on Thursday, July 30, 2009 22:03 #
Were there any options to avoid `NIH` when you implemented Werkzeug?
— Anonymous on Thursday, July 30, 2009 23:52 #
Too true. What I find particularly annoying about the proliferation of WSGI components/middleware is that they are more often than not created by someone as a by-product of trying to do something else. As a result they only implement the very narrow functionality required for the overall application they are trying to produce. This means that in the greater scheme of things they are generally useless as a reusable component that might be adopted by someone else. This doesn't stop the person, though, from publishing their WSGI component on PyPI or elsewhere, with the result being that when you go searching for a needed WSGI component, the good ones are hard to find amongst all the poorly documented and feature-incomplete crap that is out there. Because the WSGI component is a by-product and not the goal, if you report bugs, incorrect behaviour or non-adherence to the WSGI specification, the authors don't do anything about it. Their attitude is that if it doesn't affect how they use it, they don't care.
In my opinion, developing truly reusable code libraries or toolkits requires a completely different mindset and attention to detail than producing your run-of-the-mill code for a particular application. Good reusable code comes from that code itself being the goal and not just a by-product. At the same time though, it still requires the author to have a really good appreciation of how the code will or could be used, implementing it well and also documenting it well. Most coders just aren't good at this and frankly should not be allowed anywhere near code which is to be reused by others. These comments don't apply to just WSGI components/middleware but equally apply to any supposedly reusable code in any programming language.
Another good example of where I see this as a big issue in the Python world is buildout recipes. Here again they are often a by-product of achieving something else and every man and his dog is publishing their own custom recipes. Often these recipes are slight variations on other people's recipes because the original didn't do quite what was required. So, like with WSGI components/middleware there is a growing list of crap out there which one has to wade through to find the really useful stuff. For buildout recipes in particular the need for sanity is perhaps even greater than for WSGI. It is crying out for someone to come along and develop a far-ranging toolkit of really useful basic recipes, designed to be feature rich but flexible and very well documented. There are some collections of recipes out there, but for really basic stuff they don't go far enough and the recipes they provide are generally only there to support much more complex recipes designed for a narrow purpose and so haven't been made generic enough to be classed as good reusable components. Like with the basic WSGI components, there really needs to be a gold standard collection of basic buildout recipes which people should always use.
Anyway, I very much support Armin's call to developers out there: try to use low level toolkits like WebOb and Werkzeug. If you still want to develop your own then by all means do so, but keep it to yourself and don't publish it, or at least not until it comes up to a level of quality where it can truly compete with those existing toolkits.
— Graham Dumpleton on Friday, July 31, 2009 0:17 #
@4: I have not been using web.py for a long time. I guess web.py is a mature framework by now and people have fixed most bugs in the WSGI layer if there were any.
@5: the Werkzeug development was unfortunate. It started side by side with WebOb and I found the other project too late. Now we have two.
— Armin Ronacher on Friday, July 31, 2009 7:29 #
@James Bennett:
---
Please, please, can someone step up and actively drive these on web-sig? Graham, Armin, Ian, anyone?
— anonymous coward on Friday, July 31, 2009 15:51 #
the Werkzeug development was unfortunate. It started side by side with WebOb and I found the other project too late. Now we have two. ---
I'm glad. I like Werkzeug much better (though at first, I thought the name was unpronounceable). Finally a Python web framework that is really flexible and yet easy to use.
Thank you.
— JuanPablo on Sunday, August 2, 2009 15:50 #
In the Python world we have this weird notion of everyone working on their own little projects, without collaborating! This can be seen with the various template systems, the various WSGI libraries, the various web frameworks, the various WSGI server implementations etc.
While this approach produces a lot of "creative" code, it also produces a lot of half-finished projects.
I think you are spot on that we should start collaborating more and start working on projects that would benefit everyone. For example, a standard WSGI server that's secure, implements the specification, performs really well and is backed by the community.
As some others pointed out, one way to approach this could be to add parts to the standard library. For example, add a "real" WSGI server implementation to the standard library, add a "standard" template system, add a standard way of representing request and response objects... etc. Of course, some projects may die if they aren't picked, but really, I think it's better to have really strong building blocks than to have tons of half-broken building blocks.
— amix on Monday, August 3, 2009 23:29 #
Well, it's not too late, you can always switch Werkzeug to use WebOb; they are not that different right now, and via subclassing or upstream changes I'm sure it could be worked out. Be the change you advocate!
— Ian Bicking on Tuesday, August 4, 2009 17:59 #
@11: I strongly oppose that, after comparing the code of Werkzeug and WebOb.
— RonnyPfannschmidt on Friday, August 14, 2009 13:24 #