Draconian Error Handling in XML

Mark found a broken blog; I have a nice broken XHTML page directly from the W3C. Screenshot after the jump.

We all love XHTML, don’t we? And yes, I do know that this blog’s XML is equally broken, but seriously, blame WordPress, not me.

2 Responses to “Draconian Error Handling in XML”

  1. ahah good catch.

    We are using MT which doesn’t catch all mistakes.
    MT publishes the entry on a machine. A cron job running locally on that machine checks whether there are new pages; if so, it runs cvs add and commits them to the W3C server.
    We have a program on the W3C server which every day takes the cvs commits from the cvs log and checks if the files are valid. If not, it sends an email to the committer saying “dude, your code is not valid, fix it.” (maybe less casual :p)

    The code we use for checking the URIs is the LogValidator (you can find it on the W3C site).

    So it’s not perfect, but we try to catch the errors when they show up. Maybe we could improve a bit by checking before committing, but that would require hacking MT a bit more, which makes it more fragile with regard to future updates.

    Comment by karl dubost, w3c — Friday, March 21st, 2008 @ 3:16 pm
  2. Well. I don’t want to be a jerk, so I don’t blame you for having a page that once had invalid XML on it. Without a proper XML toolchain it’s quite hard to get it right. Especially when dealing with data from external websites, encodings and broken markup make it tricky to still produce parsable XML.

    Comment by Armin Ronacher — Friday, March 21st, 2008 @ 4:05 pm
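
The "check the committed files and complain if they are broken" step described in the first comment can be sketched roughly as follows. This is not the LogValidator and says nothing about how the W3C setup is actually implemented; it is a minimal illustration using Python's standard xml.etree.ElementTree, and the function names and sample pages are made up for the example. It also only checks well-formedness, not validity against a DTD, but it does show the draconian part: one unclosed tag and the parser rejects the whole document.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if markup parses as XML, else the parser's error message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        # Draconian handling: the first error aborts parsing entirely.
        return str(exc)
    return None


def report_broken(pages):
    """Map each broken page name to its parse error; well-formed pages are omitted."""
    errors = {}
    for name, markup in pages.items():
        error = well_formedness_error(markup)
        if error is not None:
            errors[name] = error
    return errors


if __name__ == "__main__":
    # Hypothetical sample pages standing in for freshly committed files.
    pages = {
        "good.html": "<html><body><p>fine</p></body></html>",
        "bad.html": "<html><body><p>unclosed</body></html>",
    }
    for name, error in report_broken(pages).items():
        print(f"{name}: {error}")
```

Running a check like this before the commit rather than after it is the improvement the first comment mentions; the trade-off named there (having to hook it into the publishing tool) still applies.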
