Pages tagged as ‘caveat’

About Bug-Fixing and Politeness

October 13th, 2007

It took less than a day. And yes, I was an asshole. I can’t blame the django team for fixing things too slowly: as Python’s biggest framework, it is easy to break existing code, and fixes often require careful consideration.

All in all I’m very happy with django and love using it, but the trac always gives me a feeling I cannot really describe. If you query open tickets you find around 800 of them, and the tickets I have posted so far never got real attention. However, that’s not a big problem because most of them were proposals or feature requests.

Two days ago someone posted that URL bug in the IRC channel and I bookmarked it. I thought that one of the developers would have the timeline in their RSS reader and fix it. Yesterday I added a patch that ignores malformed unicode in those URLs and thought I could get it fixed quickly by sending the link to the ticket to ubernostrum. The response was something like: “that requires discussing on the mailing list, I’m not sure if ignoring is the correct behavior.” And I guess my answer was something like “I’ve better things to do”.

That and the following blog post were just rude and unacceptable. I promise that I won’t do that again :-)

disclaimer: Europe/Vienna

Abusing XHTML

As a small follow-up to my previous post about XHTML/HTML, here is a small list of websites using XHTML that break when rendered by a browser in XHTML mode:

Not that all my XHTML pages are valid, but if they fail… How should browser vendors implement XHTML if that would break the internets?

Doctype Woes (back to HTML4)

October 3rd, 2007

At the moment I’m working together with the rest of the ubuntuusers webteam on the new portal of ubuntuusers.de, based on django. One of the things we will do is consolidate all templates. While doing so we have to decide on one HTML/XHTML standard to use, including the correct mimetype and doctype.

And selecting that is the hardest part, because once you’ve decided on something you have to live with the consequences and cannot really change it. For example, HTML and XHTML have a slightly different DOM and different rules for CSS (CSS, for example, has an exception that allows background colors on the body tag to affect the whole page; this exception does not exist for XHTML). Without a doubt many people use XHTML in a wrong way. Just have a look at how many people serve their webpages as text/html and only use HTML semantics. Those pages break or render incorrectly if you serve them as application/xhtml+xml.

But why do XML and SGML have different semantics? SGML itself was created long ago (I assume IBM had something to do with it, at least its predecessor was created there) and is an insane specification. At least that’s what the web told me. I cannot tell you if that’s true or not because the standard itself is not available without paying for it :-/

From what sources tell me, XML is a subset of SGML. I wonder how that’s possible though, because there are syntactic elements that in my opinion are not compatible. For example, XML’s self-closing tags clash with null end tags in SGML:

XML: <br />
SGML: <p/This is some text in a paragraph/

Because the slash has a special meaning in tags in SGML, it clashes with the closing slash of XML tags. Also, SGML is apparently case insensitive whereas XML is not. Maybe I’m wrong there and that part is up to the DTD, but quite frankly, I don’t care. I don’t even care about clashing slashes in tags because no browser implements the correct SGML behavior. And if they did, we would all see invalid output because the web is not valid. It’s not and it will never be.

But what’s indeed ridiculous is that it’s incredibly hard to write pages that are semantically and syntactically correct as both HTML4 and XHTML. However, you have to make your documents compatible with both if you want your page to be valid XHTML and render correctly. The reason is that no browser today selects the render mode by doctype, and even if they did, browsers would then break on the huge number of pages that incorrectly use XHTML.

XML is strict, very strict. Syntactical errors appear as big red error messages. I myself have to work on the wiki markup for the new portal, and one of the things I have to deal with is balancing elements. That is possible and simple, but what’s harder is adding paragraphs without breaking things. That’s not easy because not every element is allowed in a paragraph, and a paragraph cannot be mixed with every element, thanks to inline versus block elements.

Even HTML5 disallows that mixing of different element types, but at least it doesn’t complain. Sure, I could send the output through a validator and tell the user that his markup is bullshit and he should correct it. But I won’t do that. Users don’t give a fuck about their markup. And I cannot bloat the parser more than it is now. Server resources are limited and additional validation for such a high traffic site is nearly impossible.

Fortunately browsers will never show you those errors because they parse XHTML with the tag soup parser they use for HTML too. Even so, if we cannot ensure that all of our pages are valid XML and XHTML, we are not allowed to use the doctype because it would break browsers that support XHTML.
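To make the difference between a strict XML parser and a tag soup parser a bit more concrete, here is a small sketch using Python’s standard library. The sample markup and the names is_well_formed_xml and TagCollector are made up for illustration, and this is of course not what any browser does internally.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Acceptable HTML4 (br has no end tag), but not well-formed XML.
HTML_STYLE = "<p>first line<br>second line</p>"
# The same content with an XML-style self-closing tag.
XML_STYLE = "<p>first line<br />second line</p>"


def is_well_formed_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


class TagCollector(HTMLParser):
    """Lenient parser that just records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed(HTML_STYLE)
collector.close()

print(is_well_formed_xml(HTML_STYLE))  # strict parser rejects the unclosed <br>
print(is_well_formed_xml(XML_STYLE))   # strict parser accepts <br />
print(collector.tags)                  # lenient parser still finds both tags
```

The strict parser gives up on the first mismatch, while the lenient one happily walks through the same input. That is roughly the gap a wiki parser has to close by itself if its output is served as real XHTML.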

While this is hard for web designers, and especially for programmers who want to create parsers that generate XHTML, it’s an even harder job for the developers of browsers. In the end they have to have two independent parsers for HTML and XHTML. This makes it hard enough for the big browser vendors Microsoft, Mozilla, Opera and Apple, but even harder if you are new to the market and want to ship your own browser, because you have to be compatible not only with the new XHTML standard but also with the old HTML one. Nobody will translate all the old documents to XHTML, I’m sure ;-)

Details about the issues are summarized here:

Without a doubt we will have fun with XHTML in the future. Probably the web stays as it is today: we will still use the tag soup parsers, people will write XHTML that is in fact HTML, and browsers will interpret it like that.

For me the decision is HTML4 at the moment, with the subset that is valid for both HTML4 and HTML5. That could make the transition easier once the standard is ready (and I hope that’s earlier than 2022), and it’s a good idea now too. Who needs a u tag anyway?

Macbook Pro

August 26th, 2007

As of today I’m a proud owner of an Apple Macbook Pro. Well. Proud and annoyed at the same time. I was aware that switching over to a Macbook would take me some time, but it’s a whole lot harder than imagined. The hardest part is that the keyboard layout is totally different. And even if some of the keys work the same (normal letters), you still have the problem that all keyboard shortcuts are different, because you have to use the apple key where you would expect the control key if you’re used to linux.

The keyboard is definitely the worst part of the Macbook Pro. It’s missing a delete key and the Alt Gr key which you usually find on German keyboards. The Alt Gr key replaces the normal second alt key so that you can enter some of the symbols you use in programming languages (square brackets, or even the simple at sign). On German keyboards the “[” sign is on Alt Gr + 7. On a German Macbook Pro it’s on alt + 5 and not even displayed on the keyboard. While this is probably not a problem on OS X, it will become a problem as soon as you try to install ubuntu on that machine.

But the keyboard is not the only annoyance. The mouse isn’t much better. I’m used to having acceleration on touchpads, but not on normal mice. On OS X, however, there is no way to make the mouse movement linear, at least not in a non-hackish way.

Also the way windows are managed is totally new to me. I would expect the maximize button to maximize the window. But it just makes it a bit bigger, as big as the window manager thinks it should be?

So far I don’t think I will install ubuntu right now; I have too many problems getting into OS X. And the problems I would have with a German mac keyboard on a linux machine are too much right now :-(

BTW: I’m back from Italy

Ruby XMLRPC Vulnerability

June 23rd, 2007

Looks like the Ruby XMLRPC implementation still has a vulnerability:

#!/usr/bin/env ruby
require 'xmlrpc/server'

class TestHandler
  def foo
    42
  end
end

if __FILE__ == $0
  srv = XMLRPC::Server.new(5000)
  srv.add_handler('test', TestHandler.new)
  srv.serve
end

Connecting to it with the python shell now does this:

>>> from xmlrpclib import ServerProxy
>>> p = ServerProxy("http://localhost:5000/")
>>> p.test.send('foo')
42
>>> p.test.send('`', 'echo "Shit"')
'Shit\n'

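The handler object exposes all of its public methods, including the inherited send, and send in turn can call anything (here the backtick method, which runs a shell command). Something tells me there is no way to avoid this problem, so better just not use add_handler with an object. Explicit is better than implicit.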

Update: after some googling I found someone who discovered the same thing: Ruby, Python, and an XML-RPC Server Arbitrary Shell Command Execution Flaw.
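For comparison, this is roughly what “explicit” looks like on the Python side. It is only a sketch: it uses the dispatcher class from Python 3’s xmlrpc.server without opening a socket, the call helper is made up for this example, and _dispatch is the dispatcher’s internal entry point, called directly here just to keep the demo short.

```python
from xmlrpc.server import SimpleXMLRPCDispatcher


class TestHandler:
    def foo(self):
        return 42


handler = TestHandler()
dispatcher = SimpleXMLRPCDispatcher()

# Register exactly one callable under an explicit public name
# instead of handing the whole object to the server.
dispatcher.register_function(handler.foo, "test.foo")


def call(method, *params):
    """Hypothetical helper: dispatch a call without any network involved."""
    return dispatcher._dispatch(method, params)


print(call("test.foo"))

try:
    call("test.send", "foo")
except Exception as exc:
    # Anything that was not registered by name is simply unknown.
    print("rejected:", exc)
```

Only the names that were registered can be reached, so there is nothing like send to abuse. The price is that every new method has to be listed by hand.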

Kind of Directory Traversal

June 1st, 2007

If you see this code: what’s wrong?

from os import path

page_id = req.args.get('page_id', 'index')
filename = path.join(path.dirname(__file__), 'includes', page_id)
try:
    f = file(filename)
except IOError:
    handle_not_found()

First of all, the page_id comes from a user-submitted variable in a web application. It’s in fact a variable from the query string. Now someone could say ?page_id=../../../etc/htpasswd etc. There are numerous ways to fix that. For example you can do this:

from os import path
fn = path.join(*[x for x in fn.split('/') if x != '..'])

This also makes sure that the path separator is valid. I guess everybody that knows about security also knows how to defend against those problems.

What you might not know is that python itself doesn’t accept null bytes in the “file/open” call. This is some sort of built-in security feature. Other languages such as PHP forward the null byte to the C layer. Thus an attacker could cut off a string at a given position to gain further control over the input layer (cutting off a .php extension etc). This is more severe in Perl where you can also pipe stuff in the file call.

This security feature however doesn’t raise an IOError but a TypeError! So the corrected example from above looks like this:

from os import path

page_id = req.args.get('page_id', 'index')
filename = path.join(path.dirname(__file__), 'includes',
                     *[x for x in page_id.split('/') if x != '..'])
try:
    f = file(filename)
except (IOError, TypeError):
    handle_not_found()
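A slightly more defensive variant, as a sketch for Python 3 on a POSIX system: instead of filtering '..' segments it resolves the final path and checks that it is still below the includes folder. The function names resolve_include and read_page are made up for this example. Note that in Python 3 file() is gone and open() raises a ValueError for embedded null bytes, so the sketch rejects them explicitly rather than relying on the exception type.

```python
import os


def resolve_include(base_dir, page_id):
    """Return the absolute path for page_id, or None if it leaves base_dir."""
    # Reject null bytes up front instead of relying on open() to complain.
    if "\0" in page_id:
        return None
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and resolves symlinks.
    candidate = os.path.realpath(os.path.join(base, page_id))
    # The candidate must still live underneath the base directory.
    if os.path.commonpath([base, candidate]) != base:
        return None
    return candidate


def read_page(base_dir, page_id):
    """Return the file contents, or None where the code above
    would call handle_not_found()."""
    filename = resolve_include(base_dir, page_id)
    if filename is None:
        return None
    try:
        with open(filename) as f:
            return f.read()
    except OSError:
        return None
```

Checking the resolved path also catches absolute page ids like /etc/passwd and symlinks that point outside the folder, which a plain filter on '..' would not notice.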

Python Dict Rehashing

With this in mind I was looking for a way to rehash a dict in python. Things that do not work: d = dict(d) and d.update(d), because the first copies the structure of the dict over and the second does nothing because the dict is already the merged dict. What works is d.update(d.iteritems()).
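Here is a small sketch of the problem and of one way to rehash in Python 3, where iteritems no longer exists. The Key class is made up for the demo, and rebuilding the dict from a list of its items is just one option I am reasonably sure about (it inserts every pair again and therefore hashes every key again), not the only one.

```python
class Key:
    """Hashable but mutable key; its hash follows its current value."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value


key = Key(1)
d = {key: 42}

key.value = 2                # the stored entry still carries the old hash
found_before = key in d      # lookup uses the new hash and misses

d = dict(list(d.items()))    # re-insert every pair, hashing each key again
found_after = key in d

print(found_before, found_after, d[key])
```

Going through a list first also avoids changing the dict while iterating over it.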

Nobody Intends to Build a Surveillance State

“Nobody is planning to build a surveillance state in Germany” (Wolfgang Bosbach)

Mutable Hash Keys

Caveat. Mutable hash keys in ruby obviously behave differently than you would expect:

>> foo = [1, 2, 3]
[1, 2, 3]
>> bar = {foo => 42}
{[1, 2, 3]=>42}
>> foo << 4
[1, 2, 3, 4]
>> bar[foo]
nil

Update: there is Hash#rehash.
