[BuildStream] A reboot of the testing discussion



Hi all, a few weeks ago Antoine, several others and I met up in London and talked about testing. Unfortunately, I wasn't aware at the time that Tristan and Chandan had already had a fairly involved discussion about this:

https://mail.gnome.org/archives/buildstream-list/2018-July/msg00025.html

So we have covered some of the same ground, but I'd like to try to summarise both discussions and see if we can move forward.

The starting point is that we do not have any structured or recommended way to do testing inside BuildStream projects, other than appending test commands to the end of your existing build commands.
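For concreteness, a minimal sketch of that pattern (hello.bst and its source URL are made up); the (>) list directive appends to the autotools plugin's default build commands:

    # hello.bst - current practice: tests bolted onto the build
    kind: autotools
    sources:
    - kind: tar
      url: https://example.com/releases/hello-1.0.tar.gz
    config:
      build-commands:
        (>):
        - make check   # runs inside the same build job as 'make'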

My very brief summary of the original thread is:

* Having a test element separate from the build element can provide what we want, but it makes creating and maintaining .bst files more difficult at the moment.

* We would like better ways to create BuildStream elements, whether that's by having several in one YAML file, or some other more expressive means, and that can be done as a separate exercise.

* Nobody, as far as I can see, is in favour of making elements more complicated.

* Chandan wants elements to be able to assert that their dependencies are tested as well as built.

* You can always have an element which depends on lots of other elements, as we might want to do if we want to collect the results of lots of tests, but we may not want to manually list them all in an element description; we could do with better means of specifying that.

* Public data provides a reasonable means for a build element to specify default testing commands for a separate test element, which removes the need to have an autotools test element, a Maven test element, a Bazel test element and so on.
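As a rough sketch of how that could look. The 'test' public data domain, the 'test-commands' key and the 'test' element kind below are all assumptions for illustration, not existing BuildStream features:

    # gcc.bst - the build element exports suggested test commands
    # through a project-defined public data domain ('test' and
    # 'test-commands' are hypothetical conventions)
    kind: autotools
    sources:
    - kind: tar
      url: https://ftp.gnu.org/gnu/gcc/gcc-8.2.0/gcc-8.2.0.tar.xz
    public:
      test:
        test-commands:
        - make check

    # gcc-test.bst - hypothetical 'test' element kind which stages
    # its dependencies, reads the commands exported above and runs
    # them; note DejaGnu is needed only here, as an ordinary
    # build-time dependency of the test element
    kind: test
    depends:
    - filename: gcc.bst
      type: build
    - filename: dejagnu.bst
      type: build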

Please correct me if I'm wrong.

----

Our discussions in London left us with several questions. I've listed these, with comments based on my reading of the previous thread.

* Can we retroactively invalidate an element? Once a test fails, we may want to stop everything that depends on that element from building. - There is no means of doing this at the moment, other than aborting the build altogether. And maybe you don't want to abort the whole build - consider a build which takes several days, set off on a Friday evening. Do you stop it because of a linter error?

* We have build dependencies and runtime dependencies. Can we also add test dependencies? For example, DejaGnu is necessary to test GCC, but not necessary to build or run it. - This becomes irrelevant if we have separate testing elements; test-only packages are then just normal build-time dependencies of the test element.

* Can we produce two artifacts per element? I believe this has been asked before and wasn't popular. It would allow us to have a stripped output (for future building & integration) and a debugging output (for testing). - Unlikely to happen, and fairly easy to emulate with two elements. Tristan has suggested having the test element pass through the supplied artifact; it could also potentially pass through a stripped artifact, with symbols needed only for testing removed.

* If we can't produce two artifacts per element, can we create two elements for each software package, one for the main build and one for testing it? - Yes, and I think we're fairly well agreed this is a good idea.

* We also discussed the possibility of having a dependency on something having been tested, which is distinct from a dependency on a package necessary to run tests (as above). Then you can say that for a package to be tested, you require some of its dependencies to be tested too. You can also have a high-level "everything is tested" element, which does nothing except aggregate the results of other tests (see the sketch after this list). - Tristan hinted at this with his 'test shadow': if you make a build-time dependency on an element, you naturally depend on all its build-time dependencies. But we don't currently have a way to say "all our build-time dependencies, recursively, and their tests", which I think is what the test shadow would be.

* Do tests of an element become invalid whenever any of its dependencies changes? - Say, for example, you have updated `bash` to a new version; do we have to re-test `apache`, which is several layers of dependencies away from `bash`?

* Can we instruct BuildStream on which tests we'd like to run? E.g. `bst build freedesktop-sdk.bst --test quick` vs `--test soak`? - This last one can probably be investigated independently of the other features we're discussing.
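As an aside, the "everything is tested" aggregation above could be sketched today with the existing stack element kind, which does nothing except depend on other elements; the individual *-test.bst elements are hypothetical:

    # all-tested.bst - succeeds only once every listed test
    # element has built, i.e. once all of their tests have passed
    kind: stack
    depends:
    - gcc-test.bst
    - bash-test.bst
    - apache-test.bst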

----

So, given the 'separate test element' strategy, we still have two approaches to testing:

1) Have the finished artifact depend on all its usual dependencies and on all the tests. This allows tests of one element and the builds of its dependents to occur in parallel.

2) Make each BuildElement depend on the tested result of its build dependencies. This avoids wasted time spent building things whose dependencies fail tests.
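To make the difference concrete, here is how the dependency wiring might look for a made-up app.bst which builds against lib.bst (all element names here are hypothetical):

    # Approach 1: app.bst depends on lib.bst directly; a separate
    # gating element collects the tests, so lib-test.bst and
    # app.bst can be scheduled in parallel
    # app.bst:
    kind: autotools
    depends:
    - filename: lib.bst
      type: build

    # release.bst:
    kind: stack
    depends:
    - app.bst
    - lib-test.bst
    - app-test.bst

    # Approach 2: app.bst depends on the tested result instead;
    # this assumes lib-test.bst passes lib's artifact through, and
    # means app.bst is never scheduled if lib's tests fail
    # app.bst:
    kind: autotools
    depends:
    - filename: lib-test.bst
      type: build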

What we really want is the best of both worlds: pipeline build and test to get a fast build, but also abort the build when a test fails. Since, as far as I'm aware, a failure of any element stops BuildStream from scheduling further jobs, we will already avoid a lot of wasted builds.

Things I'd like to try and discuss now:

* Does everyone agree on the broad strategy of making elements simple and numerous, and finding better ways to specify how elements are configured, rather than keeping our existing YAML as it is and making elements more complicated?

* The testing element type - is that uncontroversial enough that we might make a plan to implement it?

* Can we do anything more clever about aborting builds?

If I've missed anything out, or claimed there is agreement where there is still uncertainty, please do speak up - I'm not trying to bulldoze anything in!

Jim


