Pages tagged as ‘wordpress’

How not to do XML

Imagine for a moment that there was a PHP blog software that could dump its blog posts into some sort of extended RSS 2 feed and import them again later, possibly into a different installation. That’s nice: XML is a flexible format and RSS allows extensions via namespaces. Even better, there are XML parsers for all major programming languages, and working with XML from Python is especially cool because of lxml and ElementTree. But there is a problem with that…

…that XML is not XML. It’s called WordPress eXtended RSS (WXR), but it’s not XML? And why in god’s name has nobody noticed so far? I mean, WordPress must have an importer for that.

Why is it not XML? It has XML syntax and XML namespace declarations, but what doesn’t it have? A doctype. What’s the problem? It’s referencing HTML entities! So step one for parsing: inject an inline DTD that defines those entities. Great fun, isn’t it? Then it parses. I was happy and finished my work. That XML doesn’t have HTML entities is something PHP developers probably don’t know, and their parser isn’t resolving any entities during the parsing process. Or worse, their XML parser expands HTML entities.
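For the curious, the inline DTD trick can be sketched in a few lines of Python. This is a minimal sketch and not my actual importer: it uses the standard library’s ElementTree instead of lxml, the helper names (html_entity_doctype, parse_with_html_entities) are made up for illustration, and it assumes the root element is called rss.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# XML already knows these five, so they are not redeclared.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def html_entity_doctype(root_name):
    """Build a DOCTYPE whose internal subset declares the HTML entities."""
    decls = "".join(
        '<!ENTITY %s "&#%d;">' % (name, codepoint)
        for name, codepoint in sorted(name2codepoint.items())
        if name not in XML_PREDEFINED
    )
    return "<!DOCTYPE %s [%s]>" % (root_name, decls)


def parse_with_html_entities(text, root_name="rss"):
    """Inject the DOCTYPE after the XML declaration (if any) and parse."""
    doctype = html_entity_doctype(root_name)
    match = re.match(r"<\?xml[^>]*\?>", text)
    pos = match.end() if match else 0
    return ET.fromstring(text[:pos] + doctype + text[pos:])


sample = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<rss><title>Caf&eacute;&nbsp;&amp; more</title></rss>"
)
root = parse_with_html_entities(sample)
print(repr(root.find("title").text))
```

Without the injected DOCTYPE the same sample fails with an undefined entity error, which is exactly the problem described above.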

But it’s worse! I loaded another dump that happened to have some broken HTML in comments (could happen, does happen, thanks to broken trackback support). What happens next? THE XML DOESN’T PARSE ANY MORE! Why? Because comments are neither escaped nor marked as CDATA. I wonder why, especially because it’s so much easier to dump embedded HTML/XHTML as CDATA than as XML, especially if you are working with PHP.
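Both ways of embedding arbitrary (even broken) HTML are tiny. Here is a rough sketch in Python; the function names and the comment element are just for illustration, and the only subtle part is that a literal "]]>" has to be split across two CDATA sections.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def as_escaped(html):
    """Escape &, < and > so the HTML becomes plain character data."""
    return escape(html)


def as_cdata(html):
    """Wrap the HTML in CDATA; "]]>" may not appear inside a section,
    so it is split across two adjacent sections."""
    return "<![CDATA[" + html.replace("]]>", "]]]]><![CDATA[>") + "]]>"


broken = "<p>unclosed <b>tag & a stray ]]> marker"
for wrap in (as_escaped, as_cdata):
    doc = "<comment>" + wrap(broken) + "</comment>"
    # A real XML parser hands back the original string either way.
    assert ET.fromstring(doc).text == broken
```

To a real XML parser the two forms are equivalent, which is why an importer should not care which one the exporter picked.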

But WordPress was able to import that… so I looked at their parser… WORDPRESS PARSES THAT WXR FILE USING REGULAR EXPRESSIONS!!! Argharhgarhghargh. That’s not XML what you are doing there, that’s nothing. WordPress can’t even parse its own file if you bind the WordPress exporter namespace to a different prefix! WordPress can’t handle its own file if you replace their CDATA foobar with properly escaped stuff. Dammit!
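To make the prefix problem concrete, here is a small Python sketch. The namespace URI and the status element are hypothetical stand-ins, not the real WXR ones, and the regex is my guess at the general style of such an importer, not WordPress’ actual code.

```python
import re
import xml.etree.ElementTree as ET

NS = "http://example.org/export/"  # hypothetical namespace URI

# The same document twice, with the namespace bound to different prefixes.
doc_a = '<item xmlns:wp="%s"><wp:status>publish</wp:status></item>' % NS
doc_b = '<item xmlns:x="%s"><x:status>publish</x:status></item>' % NS


def status_via_regex(text):
    """Regex "parsing": depends on the literal prefix in the file."""
    match = re.search(r"<wp:status>(.*?)</wp:status>", text)
    return match.group(1) if match else None


def status_via_parser(text):
    """Real parsing: only the namespace URI matters, not the prefix."""
    return ET.fromstring(text).findtext("{%s}status" % NS)


print(status_via_regex(doc_a), status_via_regex(doc_b))
print(status_via_parser(doc_a), status_via_parser(doc_b))
```

The two documents mean exactly the same thing as far as XML is concerned, but the regex only finds the value in the first one.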

I can’t even write a proper exporter using XML tools because what my XML tools generate is not compatible with WordPress. And what tops it all?

Reading that in the #wordpress channel:

<nickname_deleted> why does it matter what wp's xml format has flaws?
                   adapt your importer to the flaws

ARGHARGHARGHARGH. and then the webpage says:

WordPress is a state-of-the-art semantic personal publishing platform with a focus on aesthetics, web standards, and usability.

Without further comment… I lost my faith in standards at that moment. Wait a second, I lost it earlier. Still sad.

Until TextPress is ready…

February 5th, 2008
mitsuhiko@hammett:~$ cd lucumr/cogitations/
mitsuhiko@hammett:~/lucumr/cogitations$ rm xmlrpc.php
mitsuhiko@hammett:~/lucumr/cogitations$ 

TextPress and other lost stuff…

Some time ago I wrote a blog engine in Python to replace this WordPress installation. As you can see that never really happened, although the blog software is already in a usable state (at least the basic administration and plugin interface work). One of the reasons is that I would have to port over the theme, which is one of those stupid tasks I just hate; another one is that I haven’t had time to improve it any further. Another piece of software I recently rediscovered is a Python powered wiki called ordo, which I wrote about a year ago, I guess. It’s a shame that those applications were never released or properly licensed. ordo lingers around in my svn repository, but TextPress does not.

TextPress won’t be released for a few reasons. One is the name: it’s obviously taken from WordPress and I want to rename it before I release it. The other reason is that I don’t have the time to maintain the code right now. There are already many other projects and I don’t feel like maintaining too much code at the same time.

But because TextPress has some really unique features I at least want to show what it does :)

screenshot of the textpress blog

The screenshot above shows the default theme, the admin panel looks like this and this. It has a pretty interesting plugin interface. The event system was inspired by dokuwiki, the way the syntax based plugins work is IMHO unique :)

Basically what it does is lex the post’s SGML markup into a DOM-like structure which you can query and transform. Here, for example, is the complete source code of the pygments plugin:

from textpress.api import *
from textpress.htmlprocessor import DataNode
try:
    from pygments import highlight
    from pygments.lexers import get_lexer_by_name
    from pygments.formatters import HtmlFormatter
    have_pygments = True
except ImportError:
    have_pygments = False

class PygmentsHighlighter(object):

    def __init__(self, style):
        self.formatter = HtmlFormatter(style=style)

    def process_doc_tree(self, event):
        for node in event.data['doctree'].query('pre[@tp:lang]'):
            lexer = get_lexer_by_name(node.attributes.pop('tp:lang'))
            output = highlight(node.text, lexer, self.formatter)
            node.parent.children.replace(node, DataNode(output))

    def get_style(self, req):
        return Response(self.formatter.get_style_defs(), mimetype='text/css')

    def inject_style(self, event):
        add_link('stylesheet', url_for('pygments_support/style'), 'text/css')

def setup(app, plugin):
    if not have_pygments:
        return
    app.add_config_var('pygments_support/style', unicode, u'default')
    app.add_url_rule('/_shared/pygments_support/style.css',
                     endpoint='pygments_support/style')

    c = PygmentsHighlighter(app.cfg['pygments_support/style'])
    app.connect_event('process-doc-tree', c.process_doc_tree)
    app.connect_event('after-request-setup', c.inject_style)
    app.add_view('pygments_support/style', c.get_style)

You can see the event and doc tree system in action in the snippet above. The way the DOM is queried is inspired by jQuery ;-)
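Since TextPress itself isn’t available, here is a rough standalone sketch of the general idea: parse the markup into a small tree, query it, and swap matching nodes for pre-rendered output. It is not the TextPress implementation. Every class and function name below is made up, the query is a plain tag-plus-attribute match rather than the selector syntax shown above, the highlighter is a stand-in for Pygments, and the naive tree builder assumes every tag is properly closed (no void elements like br).

```python
from html import escape
from html.parser import HTMLParser


class TextNode:
    def __init__(self, text):
        self.text = text

    def render(self):
        return escape(self.text, quote=False)


class DataNode:
    """Pre-rendered markup that is emitted verbatim."""
    text = ""

    def __init__(self, markup):
        self.markup = markup

    def render(self):
        return self.markup


class Node:
    def __init__(self, tag, attributes, parent=None):
        self.tag = tag
        self.attributes = {k: v or "" for k, v in dict(attributes).items()}
        self.parent = parent
        self.children = []

    @property
    def text(self):
        return "".join(child.text for child in self.children)

    def query(self, tag, attr):
        """Yield descendant elements named `tag` that carry `attr`."""
        for child in list(self.children):  # copy: callers may replace nodes
            if isinstance(child, Node):
                if child.tag == tag and attr in child.attributes:
                    yield child
                yield from child.query(tag, attr)

    def render(self):
        inner = "".join(child.render() for child in self.children)
        if self.tag is None:  # the synthetic root has no tag of its own
            return inner
        attrs = "".join(
            ' %s="%s"' % (k, escape(v)) for k, v in self.attributes.items()
        )
        return "<%s%s>%s</%s>" % (self.tag, attrs, inner, self.tag)


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = self.current = Node(None, {})

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.children.append(TextNode(data))


def parse(markup):
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


def fake_highlight(code, lang):
    # Stand-in for a real highlighter such as Pygments.
    return '<div class="highlight %s">%s</div>' % (lang, escape(code, quote=False))


def process(tree):
    for node in tree.query("pre", "tp:lang"):
        lang = node.attributes.pop("tp:lang")
        siblings = node.parent.children
        siblings[siblings.index(node)] = DataNode(fake_highlight(node.text, lang))
    return tree


tree = process(parse('<p>Hi</p><pre tp:lang="python">x &lt; 1</pre>'))
print(tree.render())
```

The nice property is that a plugin only ever sees the tree and never has to run regular expressions over the raw post text.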

It’s a Patchday (sort of)

June 21st, 2007

Today they released a new WordPress version and there are a few security fixes in it. Because I don’t feel like upgrading I patched the holes myself. Here is what you should fix if you use WordPress:

Look for that in wp-includes/class-phpmailer.php:

$sendmail = sprintf("%s -oi -f %s -t", $this->Sendmail, $this->Sender);

Those **** forgot to escape the shell arguments, and they forward the string to popen a few lines later. How stupid…? Here, however, is the fix:

$sendmail = sprintf("%s -oi -f %s -t",
    $this->Sendmail, escapeshellarg($this->Sender));
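The same idea outside of PHP, as a minimal Python sketch: shlex.quote plays the role of escapeshellarg here. The function name and the sendmail path are made up for illustration, and when you control the call site, passing an argument list to subprocess without a shell is the more robust option.

```python
import shlex


def build_sendmail_command(sendmail_path, sender):
    """Build the command line with every variable part quoted for the shell."""
    return "%s -oi -f %s -t" % (shlex.quote(sendmail_path), shlex.quote(sender))


hostile = "a@example.com; rm -rf /"
command = build_sendmail_command("/usr/sbin/sendmail", hostile)

# The hostile sender survives as one single argument instead of a second command.
assert shlex.split(command) == ["/usr/sbin/sendmail", "-oi", "-f", hostile, "-t"]
```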

Then once again the xmlrpc.php file. Either delete it or set up a cron job that downloads the most recent one automatically. They fix more escaping bugs than they actually announce…

And in the kubrick theme (if you use it or a derived theme) there is an XSS hole: they don’t escape REQUEST_URI. Just replace

<?php echo $_SERVER['REQUEST_URI']; ?>

with

<?php echo htmlspecialchars($_SERVER['REQUEST_URI']); ?>
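For comparison, the equivalent escaping in Python is html.escape. This is only a sketch: the form action context and the function name are assumptions for the example, not what the theme actually looks like.

```python
from html import escape


def render_form_tag(request_uri):
    """Put an untrusted URI into an attribute, escaped for HTML."""
    return '<form action="%s">' % escape(request_uri, quote=True)


hostile = '/?"><script>alert(1)</script>'
print(render_form_tag(hostile))
```

With quote=True the double quote is escaped too, so the value cannot break out of the attribute.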

And the most interesting part about this update: security updates are marked as “minor”, a missing “<em>” is marked as major…

Update: they just inverted the colors… my fault

Pygments in Wordpress

May 30th, 2007

I wanted to have Pygments support in WordPress, so I hacked up a small WordPress plugin that enables it. Because I was lazy and PHP sucks like hell I just supported the case of a php.ini with magic quotes disabled. Somehow WordPress reinserts some of those annoying slashes automatically in some places though.

If you want to try it out: pygments.php. Note that I do not support it. It’s released under the BSD license like Pygments itself. It requires an installed Pygments with the pygmentize script, some PHP version (no idea which), disabled magic quotes I guess, and that you generate a pygments.css file yourself that matches the style defined in the plugin.

It caches in the text and has little overhead on rendering. Basically all you have to do is type <pre lang="LANGUAGE">code</pre> instead of using a normal pre tag. Escaping happens automatically.

Example:

# Server: ruby p2p.rb password server server-uri merge-servers
# Sample: ruby p2p.rb foobar server druby://localhost:1337 druby://foo.bar:1337
# Client: ruby p2p.rb password client server-uri download-pattern
# Sample: ruby p2p.rb foobar client druby://localhost:1337 *.rb
require'drb';F,D,C,P,M,U,*O=File,Class,Dir,*ARGV;def s(p)F.split(p[/[^|].*/])[-1
]end;def c(u);DRbObject.new((),u)end;def x(u)[P,u].hash;end;M=="client"&&c(U).f(
x(U)).each{|n|p,c=x(n),c(n);(c.f(p,O[0],0).map{|f|s f}-D["*"]).each{|f|F.open(f,
"w"){|o|o<<c.f(p,f,1)}}}||(DRb.start_service U,C.new{def f(c,a=[],t=2)c==x(U)&&(
t==0&&D[s(a)]||t==1&&F.read(s(a))||p(a))end;def y()(p(U)+p).each{|u|c(u).f(x(u),
p(U))rescue()};self;end;private;def p(x=[]);O.push(*x).uniq!;O;end}.new.y;sleep)