[BuildStream] A reboot of the testing discussion



Hi all, a few weeks ago Antoine, several others and I met up in London and talked about testing. Unfortunately, I wasn't aware at the time that Tristan and Chandan had already had a fairly involved discussion about this:

https://mail.gnome.org/archives/buildstream-list/2018-July/msg00025.html

So we have covered some of the same ground, but I'd like to try to summarise both discussions and see if we can move forward.

The starting point is that we do not have any structured or recommended way to do testing inside BuildStream projects, other than appending test commands to the end of your existing build commands.
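For concreteness, a minimal sketch of that pattern (hello.bst and its source URL are made up); the (>) list directive appends to the autotools plugin's default build commands:

    # hello.bst - current practice: tests bolted onto the build
    kind: autotools
    sources:
    - kind: tar
      url: https://example.com/releases/hello-1.0.tar.gz
    config:
      build-commands:
        (>):
        - make check   # runs inside the same build job as 'make'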

My very brief summary of the original thread is:

* Having a test element separate from the build element can provide what we want, but it makes creating and maintaining .bst files more difficult at the moment.

* We would like better ways to create BuildStream elements, whether that's by having several in one YAML file, or some other more expressive means, and that can be done as a separate exercise.

* Nobody, as far as I can see, is in favour of making elements more complicated.

* Chandan wants elements to be able to assert that their dependencies are tested as well as built.

* You can always have an element which depends on lots of other elements, as we might want to do if we want to collect the results of lots of tests, but we may not want to manually list them all in an element description; we could do with better means of specifying that.

* Public data provides a reasonable means for a build element to specify default testing commands for a separate test element, which removes the need to have an autotools test element, a Maven test element, a Bazel test element and so on.
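As a rough sketch of how that could look. The 'test' public data domain, the 'test-commands' key and the 'test' element kind below are all assumptions for illustration, not existing BuildStream features:

    # gcc.bst - the build element exports suggested test commands
    # through a project-defined public data domain ('test' and
    # 'test-commands' are hypothetical conventions)
    kind: autotools
    sources:
    - kind: tar
      url: https://ftp.gnu.org/gnu/gcc/gcc-8.2.0/gcc-8.2.0.tar.xz
    public:
      test:
        test-commands:
        - make check

    # gcc-test.bst - hypothetical 'test' element kind which stages
    # its dependencies, reads the commands exported above and runs
    # them; note DejaGnu is needed only here, as an ordinary
    # build-time dependency of the test element
    kind: test
    depends:
    - filename: gcc.bst
      type: build
    - filename: dejagnu.bst
      type: build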

Please correct me if I'm wrong.

----

Our discussions in London left us with several questions. I've listed these, with comments based on my reading of the previous thread.

* Can we retroactively invalidate an element? Once a test fails, we may want to stop everything that depends on that element from building. - There is no means of doing this at the moment, other than aborting the build altogether. And maybe you don't want to abort the whole build - consider a build which takes several days, set off on a Friday evening. Do you stop it because of a linter error?

* We have build dependencies and runtime dependencies. Can we also add test dependencies? For example, DejaGnu is necessary to test GCC, but not necessary to build or run it. - This becomes irrelevant if we have separate testing elements; test-only packages are then just normal build-time dependencies of the test element.

* Can we produce two artifacts per element? I believe this has been asked before and wasn't popular. It would allow us to have a stripped output (for future building & integration) and a debugging output (for testing). - Unlikely to happen, and fairly easy to emulate with two elements. Tristan has suggested having the test element pass through the supplied artifact; it could also potentially pass through a stripped artifact, with symbols needed only for testing removed.

* If we can't produce two artifacts per element, can we create two elements for each software package, one for the main build and one for testing it? - Yes, and I think we're fairly well agreed this is a good idea.

* We also discussed the possibility of having a dependency on something having been tested, which is distinct from a dependency on a package necessary to run tests (as above). Then you can say that for a package to be tested, you require some of its dependencies to be tested too. You can also have a high-level "everything is tested" element, which does nothing except aggregate the results of other tests (see the sketch after this list). - Tristan hinted at this with his 'test shadow': if you make a build-time dependency on an element, you naturally depend on all its build-time dependencies. But we don't currently have a way to say "all our build-time dependencies, recursively, and their tests", which I think is what the test shadow would be.

* Do tests of an element become invalid whenever any of its dependencies changes? - Say, for example, you have updated `bash` to a new version; do we have to re-test `apache`, which is several layers of dependencies away from `bash`?

* Can we instruct BuildStream on which tests we'd like to run? E.g. `bst build freedesktop-sdk.bst --test quick` vs `--test soak`? - This last one can probably be investigated independently of the other features we're discussing.
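As an aside, the "everything is tested" aggregation above could be sketched today with the existing stack element kind, which does nothing except depend on other elements; the individual *-test.bst elements are hypothetical:

    # all-tested.bst - succeeds only once every listed test
    # element has built, i.e. once all of their tests have passed
    kind: stack
    depends:
    - gcc-test.bst
    - bash-test.bst
    - apache-test.bst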

----

So, given the 'separate test element' strategy, we still have two approaches to testing:

1) Have the finished artifact depend on all its usual dependencies and on all the tests. This allows tests of one element and the builds of its dependents to occur in parallel.

2) Make each BuildElement depend on the tested result of its build dependencies. This avoids wasted time spent building things whose dependencies fail tests.
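To make the difference concrete, here is how the dependency wiring might look for a made-up app.bst which builds against lib.bst (all element names here are hypothetical):

    # Approach 1: app.bst depends on lib.bst directly; a separate
    # gating element collects the tests, so lib-test.bst and
    # app.bst can be scheduled in parallel
    # app.bst:
    kind: autotools
    depends:
    - filename: lib.bst
      type: build

    # release.bst:
    kind: stack
    depends:
    - app.bst
    - lib-test.bst
    - app-test.bst

    # Approach 2: app.bst depends on the tested result instead;
    # this assumes lib-test.bst passes lib's artifact through, and
    # means app.bst is never scheduled if lib's tests fail
    # app.bst:
    kind: autotools
    depends:
    - filename: lib-test.bst
      type: build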

What we really want is the best of both worlds: pipeline build and test to get a fast build, but also abort the build when a test fails. Since, as far as I'm aware, a failure of any element stops BuildStream from scheduling further jobs, we will already avoid a lot of wasted builds.

Things I'd like to try and discuss now:

* Does everyone agree on the broad strategy of making elements simple and numerous, and finding better ways to specify how elements are configured, rather than keeping our existing YAML as it is and making elements more complicated?

* The testing element type - is that uncontroversial enough that we might make a plan to implement it?

* Can we do anything more clever about aborting builds?

If I've missed anything out, or claimed there is agreement where there is still uncertainty, please do speak up - I'm not trying to bulldoze anything in!

Jim


