Re: [Tracker] New branch: dbus-fd-experiment



On 27 May 2010 17:08, Adrien Bustany <abustany gnome org> wrote:
Hello list!

You might have heard of the dbus-fd-experiment branch.

What is this branch about? I've been looking lately at how we
can improve our use of D-Bus, by not using it for passing large amounts of
data.

D-Bus isn't slow when used to pass small messages, but its performance goes
down when it has to handle large amounts of data.

The dbus-fd-experiment takes advantage of a new feature present in D-Bus 1.3
that allows passing UNIX file descriptors as message parameters. Using this
feature, we create a pipe on the server then pass it to the client.

Then we send the results over the pipe, saving the costs of D-Bus marshalling.
The protocol used to pass data over the pipe is described in the reports [1]
and [2].
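
To illustrate the mechanism, here is a minimal sketch of passing the read end
of a pipe to a peer and then streaming results over it. It is not the branch's
code and it does not use D-Bus at all: a plain UNIX socket pair with SCM_RIGHTS
ancillary data stands in for the D-Bus 1.3 file descriptor feature, and the
names send_fd, recv_fd and demo are made up for the example.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    # At least one byte of regular data must accompany the ancillary data.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo():
    # Stand-in for the D-Bus connection between store and client.
    server_sock, client_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    # "Server" side: create the pipe and hand its read end to the client.
    read_end, write_end = os.pipe()
    send_fd(server_sock, read_end)
    client_fd = recv_fd(client_sock)
    os.close(read_end)  # the server only keeps the write end

    # "Server" side: write the results straight into the pipe.
    os.write(write_end, b"row-1\nrow-2\n")
    os.close(write_end)  # EOF tells the client that all results were sent

    # "Client" side: read until EOF.
    chunks = []
    while True:
        chunk = os.read(client_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(client_fd)
    server_sock.close()
    client_sock.close()
    return b"".join(chunks)
```

Calling demo() returns the bytes that travelled through the pipe. The point is
that the payload itself never goes through the message bus, only the
descriptor does.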

It's designed to minimize marshalling costs and context switches between
client and server (for optimal performance).

I integrated this in tracker-store and libtracker-client, and the results are
pretty good.

** Give me the numbers! **
        | Normal DBus | DBus + FD passing ("steroids") | Relative speedup
Query 1 | 38 ms       | 28 ms                          | 25%
Query 2 | 142 ms      | 91 ms                          | 57%
Query 3 | 8 ms        | 7 ms                           | 14%
Query 4 | 449 ms      | 212 ms                         | 112%

Queries:

1: select ?a nmm:artistName(?a) where {?a a nmm:Artist}
 332 rows
 18874 bytes
2: select ?t nie:contentLastModified(?t) where {?t a nfo:FileDataObject}
 13542 rows
 654399 bytes
3: select ?r where {?r a nfo:FileDataObject; fts:match "test"}
 234 rows
 10764 bytes
4: select nie:plainTextContent(?f) where {?f a nfo:FileDataObject}
 231077 rows
 16184790 bytes
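
For a rough feel of what these numbers mean in terms of throughput, the result
sizes and timings above can be combined as in the sketch below. It assumes that
each timing covers the transfer of the whole result set, which the figures
above do not state explicitly; the function and variable names are invented for
the example.

```python
# query number -> (result size in bytes, normal D-Bus ms, FD passing ms),
# copied from the table and the query list above.
RESULTS = {
    1: (18874, 38, 28),
    2: (654399, 142, 91),
    3: (10764, 8, 7),
    4: (16184790, 449, 212),
}


def throughput_mb_per_s(size_bytes, time_ms):
    """Return throughput in MB/s (1 MB = 10**6 bytes)."""
    return (size_bytes / 1e6) / (time_ms / 1e3)


def throughput_table():
    """Map each query to its (normal, FD passing) throughput in MB/s."""
    return {
        query: (throughput_mb_per_s(size, normal_ms),
                throughput_mb_per_s(size, fd_ms))
        for query, (size, normal_ms, fd_ms) in RESULTS.items()
    }


if __name__ == "__main__":
    for query, (normal, fd) in sorted(throughput_table().items()):
        print(f"Query {query}: {normal:.1f} MB/s -> {fd:.1f} MB/s")
```

Under that assumption, the largest result set (query 4) goes from roughly
36 MB/s to roughly 76 MB/s.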

The tiny code I used to benchmark is hosted at [3].

** How it works under the hood **

My first approach was to use a client side iterator, and send the results
progressively from the server.

This approach is not good because while results are being sent from the
server to the client, a DB iterator is kept open in the store, blocking
concurrent INSERT queries.

Instead we fetch all the results into a buffer on the client side (which is a
bit more expensive), and then iterate on that buffer. That way, the DB
sqlite3_stmt on the server side is released ASAP.
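
The "fetch everything, then iterate" idea can be sketched as follows. This is
only an illustration: the real wire protocol is the one described in the
reports [1] and [2], whereas the framing used here (a 4-byte little-endian
length followed by UTF-8 bytes for each row) and the names write_rows,
fetch_all and iter_rows are assumptions made up for the example.

```python
import os
import struct


def write_rows(fd, rows):
    """Server side: write each row as <4-byte length><UTF-8 bytes>."""
    for row in rows:
        data = row.encode("utf-8")
        os.write(fd, struct.pack("<I", len(data)) + data)


def fetch_all(fd):
    """Client side: drain the pipe into one buffer before iterating."""
    chunks = []
    while True:
        chunk = os.read(fd, 65536)
        if not chunk:  # EOF: the server closed its end
            break
        chunks.append(chunk)
    return b"".join(chunks)


def iter_rows(buffer):
    """Client side: iterate over the rows of an already fetched buffer."""
    offset = 0
    while offset < len(buffer):
        (length,) = struct.unpack_from("<I", buffer, offset)
        offset += 4
        yield buffer[offset:offset + length].decode("utf-8")
        offset += length


def demo():
    read_end, write_end = os.pipe()
    write_rows(write_end, ["Artist A", "Artist B", ""])
    os.close(write_end)          # server is done; its statement can be freed
    buffer = fetch_all(read_end)
    os.close(read_end)
    return list(iter_rows(buffer))  # iteration no longer involves the server
```

Once fetch_all() has returned, the server has nothing left to hold open on
behalf of this client, however slowly the client walks through the rows.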

The code in tracker-store is in the two files tracker-steroids.[ch]. There are
also a few lines in tracker-dbus.c to add a new DBus object,
/org/freedesktop/Tracker1/Steroids. This object has two methods in the
interface org.freedesktop.Tracker1.Steroids, which are PrepareQuery and Fetch.

In libtracker-client I added a new query function,
tracker_resources_sparql_query_iterate. This function returns a
TrackerResultIterator which can be used with the tracker_result_iterator_*
functions. All those functions are documented, and there is an example client
in the examples dir.

The work has been thoroughly tested during this week. I also wrote unit tests
to ensure we get the same results when using both the traditional and the
"steroids" methods. GCov reports complete coverage of the code. You're of
course invited to test it more, and report any problems to me :)


Just out of curiosity...

Have you tried just using peer-to-peer DBus instead of going over the
bus daemon? That saves one roundtrip over the DBus socket right
there... And if I recall correctly data will not be validated either
when using p2p dbus, but I may be wrong on that one...

But very interesting work.

-- 
Cheers,
Mikkel


