Re: D-Bus AFL/GPL issues (was Re: GLib plans for the next cycle)



On Thu, 23 Apr 2009 at 21:11:13 +0100, Robert McQueen wrote:
> dbus-python has had to duplicate a lot of the checking that libdbus does
> to validate calls before calling methods in libdbus, because whilst
> libdbus requires the application programmer gets stuff right at all
> times, dbus-python can be tricked by the python programmer into causing
> libdbus to abort. These checks are not in general exported as public
> API, and even with many checks in the Python code, it's still possible
> for the Python programmer to assert libdbus from Python code in various
> corner cases.

Various libdbus functions will (default-fatally) warn if you pass them an
invalid object path or bus name, but libdbus doesn't give you a way to check
whether a Python-programmer-supplied string is a valid object path. Yes, this
is programmer error, but it's programmer error that, in Python, should be
reported in the Python way, by raising ValueError.

In both dbus-python and telepathy-glib, I've ended up implementing a set of
functions that raise an appropriate language-specific exception (an Exception
or GError respectively) if an object path/bus name/error name is bad. I'd be
happy to place either of these under the MIT/X11 license proposed for libdbus
(indeed, the Python one is already under that license) if anyone is interested
in porting them to raise DBusError.
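
As an illustration of the kind of check meant here, a minimal sketch of
object-path validation in Python follows. It is not the dbus-python code; the
function name validate_object_path is hypothetical, and the rules encoded
(a leading '/', elements of [A-Za-z0-9_] separated by single slashes, no
trailing slash except for the root path) are the syntax rules as I understand
them from the D-Bus specification.

```python
import re

# Either the root path "/", or one or more "/element" groups where each
# element is a non-empty run of ASCII letters, digits and underscores.
_OBJECT_PATH_RE = re.compile(r'/|(?:/[A-Za-z0-9_]+)+')


def validate_object_path(path):
    """Return path unchanged, or raise ValueError if it is not valid.

    Hypothetical helper: the name and return convention are assumptions
    made for this sketch.
    """
    if not isinstance(path, str):
        raise TypeError('object path must be a str, not %r' % (type(path),))
    # fullmatch() rejects trailing newlines, which '$' would let through.
    if _OBJECT_PATH_RE.fullmatch(path) is None:
        raise ValueError('invalid object path: %r' % (path,))
    return path
```

The point is that the Python programmer gets a ValueError before the string
ever reaches libdbus, rather than a warning or abort from inside the library.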

On the other hand, because of libdbus' privileged roles as "the implementation
of dbus-daemon" and "a library from the dbus-daemon source package",
having a hard dependency on a newer libdbus is not necessarily a decision to
be taken lightly. Backporting a mere library like dbus-glib or dbus-python
is easy, but backporting dbus-daemon to an older distribution is less popular.

If nothing else, because (as its maintainers are so keen to remind us) the
system bus is a Critical System Service, correctly installing a new
dbus-daemon requires either a reboot, or a risky and discouraged system bus
restart, neither of which is a desirable action.

> There are silly cases like half way through packing a struct where the
> application has provided the D-Bus type, but later a value that doesn't
> fit that type. You can't close the struct because you'll abort libdbus.
> Unless you implement two-pass validation and check the types before
> building the message, there's no way out of this other than to fill the
> struct with nonsense, /then/ close the iter, /then/ discard the
> message. If it was Python code you could just throw an exception back to
> the app author and get on with life.

dbus-python does suffer from this. It throws exceptions wherever needed, but I
haven't had the spare time to implement either two-pass validation or the
"fill the struct with zeroes" unwinding. As a result, passing {'foo': None}
to a D-Bus method that takes an a{sv} or a{ss} will either warn to stderr and
then throw an appropriate exception anyway (in Debian-derived distros, where
libdbus has been patched to have non-fatal warnings by default), or just abort
(in e.g. Fedora).
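
For illustration, two-pass validation for the a{ss} case might look roughly
like the sketch below. Everything here is hypothetical: check_string_dict and
append_string_dict are made-up names, and a plain Python list stands in for
the message being built, so no real libdbus or dbus-python call is shown.

```python
def check_string_dict(value):
    """First pass: raise TypeError unless value fits signature a{ss}."""
    if not isinstance(value, dict):
        raise TypeError('a{ss} needs a dict, got %r' % (value,))
    for key, item in value.items():
        for role, obj in (('key', key), ('value', item)):
            if not isinstance(obj, str):
                raise TypeError('a{ss} %s must be str, got %r' % (role, obj))


def append_string_dict(out, value):
    """Second pass: append entries only after the whole dict has been checked.

    'out' is a list standing in for a half-built message (an assumption of
    this sketch); nothing is appended if validation fails.
    """
    check_string_dict(value)
    for key, item in value.items():
        out.append((key, item))
```

Because the first pass raises before anything is appended, there is never a
half-open container that has to be filled with dummy values and closed.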

> Further to this, the OOM stuff is simply not of interest to (m)any other
> apps other than the daemon and the X server, and indeed if you write a
> D-Bus implementation in your native language/VM/style then you almost
> certainly get this for free, rather than wiring it up every other line
> of code.

Right, in GLib you conventionally just abort on OOM, and in Python you raise
MemoryError (which is easier than it sounds: in the CPython API, basically
anything is allowed to raise an exception, so CPython programmers already have
to deal with them, and in Python itself, exceptions are syntactically special
in the expected way).

> It seems that if you're not using libdbus directly, or you're not the
> bus daemon, it's a pretty hostile library to write bindings with,
> especially if the language is dynamically typed. People hacking on both
> bindings at Collabora have lost hair and screamed and sworn they would
> rewrite them to not use libdbus given half a chance, and I honestly
> don't think we're alone in this sentiment.

dbus-python has a branch called 'purity' (as in "pure Python") which I've never
dared to merge; it stops using libdbus' object-path-registration mechanism
entirely, and catches method calls by using a filter instead. This was, perhaps
unexpectedly, a significant code reduction, because I no longer had to keep
a separate registry of objects in dbus-python (in order to be able to raise an
exception on double-registration rather than letting libdbus abort me).
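
A rough sketch of the filter-based idea, in plain Python: one dict serves both
as the dispatch table and as the double-registration check, so no second
registry is needed. This is not the code from the 'purity' branch; the class,
the NOT_HANDLED sentinel and the (path, member, args) calling convention are
all assumptions made for the example.

```python
# Sentinel meaning "this filter did not consume the message" (hypothetical).
NOT_HANDLED = object()


class ObjectRegistry:
    """Maps object paths to handlers and dispatches incoming method calls."""

    def __init__(self):
        self._handlers = {}

    def register(self, path, handler):
        # Double registration is reported as an ordinary Python exception.
        if path in self._handlers:
            raise KeyError('object path already registered: %r' % (path,))
        self._handlers[path] = handler

    def unregister(self, path):
        del self._handlers[path]

    def filter(self, path, member, args):
        """Would be called for every incoming method call message."""
        handler = self._handlers.get(path)
        if handler is None:
            return NOT_HANDLED
        return handler(member, args)
```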

(It also fixed an isolated piece of OOM misbehaviour in which dbus-python would
exacerbate an OOM at precisely the wrong time by leaking a string, because
there was nothing else I could safely do; see the source code if interested.)

If I'd had time to continue on that path, the next step was going to be to use
dbus_message_[un]marshal() to get the binary blob containing the message
payload and parse it into Python data structures myself rather than using
DBusMessageIter, which I fully expect would have been another net reduction in
dbus-python code.
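
To give a flavour of what parsing the blob in Python involves, here is a
minimal sketch that unmarshals a single STRING value using the struct module.
It relies on the wire format as I understand it from the D-Bus specification
(a 4-byte-aligned UINT32 byte length, the UTF-8 data, then a NUL terminator).
The function name is hypothetical, and offsets are assumed to be relative to
a suitably aligned start of the buffer.

```python
import struct


def unmarshal_string(buf, offset, little_endian=True):
    """Return (text, next_offset) for the STRING at or after offset.

    Raises ValueError if the terminating NUL is missing or the data is not
    valid UTF-8, and struct.error if the length field itself is cut short.
    """
    offset = (offset + 3) & ~3                  # skip padding to 4 bytes
    fmt = '<I' if little_endian else '>I'
    (length,) = struct.unpack_from(fmt, buf, offset)
    start = offset + 4
    end = start + length                        # index of the NUL byte
    if end >= len(buf) or buf[end] != 0:
        raise ValueError('string is truncated or not NUL-terminated')
    # UnicodeDecodeError is a subclass of ValueError.
    return buf[start:end].decode('utf-8'), end + 1
```

A malformed blob turns into an ordinary exception here, which is the behaviour
a Python programmer would expect.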

Havoc writes:
> > To add a Windows transport or SASL authentication or protocol v2 do 
> > we really want to run around having to fix the Python implementation,
> > the Qt implementation, the GLib implementation, etc.? The pain seems
> > mildly worth it with the managed languages (C#/Java) but largely
> > pure pain for the other languages that rely on C modules anyway.

As an aside, only one of the four Python implementations I'm aware of
(CPython, IronPython, Jython and PyPy) relies on C modules. True, CPython
is the one that in practice everyone uses, but that won't necessarily be
true forever.

> I'm pretty sure I've seen
> finished or at least partly started: one other C reimplementation, two
> Python ones, Ruby, Java, C#, and some guys in Collabora just did a
> Haskell one too.

dbus-java is an interesting one, because dbus-java 1.x was a binding. Matthew
Johnson watched Alp's progress on ndesk-dbus (the C# reimplementation) and
decided that being an implementation would actually be easier. As a result,
dbus-java 2.0 is a reimplementation. It might be worth talking to Matthew about
how easy/difficult this was (bearing in mind that Java isn't exactly the most
agile language...)

> I'm also quite certain that dbus-python has way more code to not trigger
> hidden landmines in libdbus than it would if it actually spoke the wire
> protocol itself. It's only a matter of time availability that has caused
> us not to excise the final hidden abort-able code paths by just ripping
> libdbus out.

As mentioned above, dropping my use of libdbus' "helpful" object path mapping
and just using a filter function was a net code reduction. As a bonus, most of
the deleted code was rather subtle C (the sort with more comments than code,
explaining why this particular implementation is the only one that can work),
and most of the added code was relatively simple Python.

Regards,
    Simon

