Announcement/RFC: jhbuild continuous integration testing



Hello fellow GNOME developers,

this already came up as a side issue recently[1], but now we are at a
point where we have stabilized our GNOME jhbuild continuous
build/integration test server enough for it to be actually useful:

  https://jenkins.qa.ubuntu.com/view/Raring/view/JHBuild%20Gnome/

This is building gnome-suites-core-3.8.modules, which currently
consists of 160 modules. Builds are updated every 15 minutes, and
triggered whenever there is a new commit in a module or any of its
dependencies. This mostly uses the smarts of jhbuild; we just have
some extra scripts around it to pick the results apart for Jenkins and
drive the whole thing [2]. You can click through all the modules, all
their builds, and get their build logs.
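
As a rough illustration of the trigger rule (rebuild a module when it or
any of its dependencies got a new commit), here is a minimal Python
sketch. This is not how jhbuild or our scripts implement it; the DEPS
table and the modules_to_rebuild() helper are hypothetical, and the
dependency edges are only illustrative, not read from the real moduleset.

```python
from collections import deque

# Hypothetical dependency table: module -> modules it depends on directly.
# The edges are illustrative only, not read from the real moduleset.
DEPS = {
    "glib": [],
    "gvfs": ["glib"],
    "evolution-data-server": ["glib"],
    "folks": ["evolution-data-server"],
}


def modules_to_rebuild(changed, deps):
    """Return the changed module plus everything that depends on it,
    directly or transitively, as a sorted list."""
    # Invert the table: module -> modules that depend on it directly.
    rdeps = {name: [] for name in deps}
    for name, requires in deps.items():
        for req in requires:
            rdeps.setdefault(req, []).append(name)

    # Breadth-first walk over the reverse dependencies.
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in rdeps.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


print(modules_to_rebuild("glib", DEPS))
print(modules_to_rebuild("folks", DEPS))
```

With this made-up table, a commit in glib would retrigger all four
modules, while a commit in folks would only retrigger folks itself.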

Right now there are 151 successes (blue), 5 modules fail to build
(red), and 4 modules build but fail in "make check" (yellow). It's
been like that for a week or two now, so I'd say we are doing
reasonably well for now. Some details:

Build failures:
 - colord: recently started depending on libsystemd-login, which we
   don't have yet; that's a fault on the Ubuntu side
 - e-d-s: calls an undeclared g_cond_timed_wait(), not sure what this
   is about
 - folks: this started failing very recently, and thus is a perfect
   example why this is useful (unqualified ambiguous usage of
   HashTable)
 - gst-plugins-bad: unknown type GStaticRecMutex; this might be due to
   recent changes in gstreamer? That smells like a case of "broken by
   change in dependency, needs updating to new API"
 - mutter: worked until Jan 7, now failing on unknown XIBarrierEvent;
   that might be a fault in Ubuntu's X.org packages or upstream, I
   haven't investigated this yet

Test failures:
 - gst-plugins-good, empathy: one test failure, the other tests work
 - realmd: This looks like the test suite is making some assumptions
   about the environment which aren't true on a headless server?
 - webkit: I don't actually see an error in the log; we'll investigate
   this more closely on our side

This was set up by Jean-Baptiste Lallement, I mostly help out with
reviewing the daily status and cleaning up after some build/test
failures which are due to broken checkouts, stale files, new missing
build dependencies, and so on. It's reasonably maintenance intensive,
but that's something which the two of us are willing to do if this
actually gets used.

The main difference from Colin's ostree builds is that this also runs
"make check", which is one of the main points of this: We want to know
as soon as possible if e. g. a new commit in glib breaks something in
gvfs or evolution-data-server, where "soon" is measured in minutes
instead of days/weeks, so that the knowledge of what got changed and why
is still fresh in the developer's head. That's also why I recently
started to add integration tests to e. g. gvfs or
gnome-settings-daemon, so that over time we can cover more and more
functionality tests in these.

To make this really useful, we can't rely on developers checking this
every hour or every day, of course; instead we need push notifications
as soon as a module starts failing. That's the bit which needs broader
discussion and consent.

I see some obvious options for what to do when the status of a module
(OK/fails tests/fails build) changes:

 (1) mail the individual maintainers, as in the DOAP files
   (1a) do it for everyone, and let people who don't want this filter
   them out on a particular mail header (like "X-GNOME-QA:")
   (1b) do this as opt-in

   This most often reaches the people who can do something about the
   failure. Of course there are cases where it's not the module's fault, but a
   dependency changed/got broken. There is no way we can automatically
   determine whether it was e. g. a deliberate API break which modules
   need to adjust to, or indeed a bug in the depending library, so we
   might actually need to mail both the maintainers of the module that
   triggered the rebuild, and the maintainers of the module which now
   broke.

 (2) one big mailing list with all failures, and machine parseable
     headers for module/test

   This might be more interesting for e. g. the release team (we can
   CC: the release team in (1) as well, of course), but will be rather
   high-volume, and pretty much forces maintainers to carefully set up
   filters.
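
To make the "status changed, send a filterable mail" idea a bit more
concrete, here is a minimal Python sketch. Nothing like this exists on
the server yet; the state names, the helper functions, the addresses and
the value format of the "X-GNOME-QA" header are all assumptions for
illustration only.

```python
from email.message import EmailMessage

# The three states used above: OK, fails tests, fails build.
# The spellings are made up for this sketch.
STATES = ("ok", "fails-tests", "fails-build")


def status_changed(previous, current):
    """True if the module moved from one known state to another."""
    for state in (previous, current):
        if state not in STATES:
            raise ValueError(f"unknown state: {state!r}")
    return previous != current


def build_notification(module, previous, current, recipients):
    """Build a mail for a status change, or return None if nothing changed.

    The sender address and the header value format are assumptions.
    """
    if not status_changed(previous, current):
        return None
    msg = EmailMessage()
    msg["From"] = "jhbuild-ci@example.org"  # placeholder address
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"[jhbuild] {module}: {previous} -> {current}"
    # Filterable header as suggested in (1a); the value format is made up.
    msg["X-GNOME-QA"] = f"module={module}; status={current}"
    msg.set_content(f"Module {module} changed from {previous} to {current}.\n")
    return msg


mail = build_notification("folks", "ok", "fails-build", ["maintainer@example.org"])
print(mail["Subject"])
print(mail["X-GNOME-QA"])
```

For (1), the recipients would come from the DOAP files of the module
that broke and possibly of the module that triggered the rebuild; for
(2), it would simply be the list address.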

My gut feeling is that we might start with (2) for a while, see how it
goes, and later switch to (1) once we have gained some confidence in this?

Opinions most welcome!

Also, I'll gladly work with the developers of the currently failing
modules to get them succeeding. I have full access to the build
machine in case errors aren't reproducible.

Thanks,

Martin

[1] https://mail.gnome.org/archives/desktop-devel-list/2013-January/msg00006.html
[2] http://bazaar.launchpad.net/~jibel/charms/raring/jhbuild/trunk/files

-- 
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)
