Re: [BuildStream] Proposal: Add support for running tests in BuildStream



It seems that most of the functionality needed to add testing support
into BuildStream is already in place. I'd like to attempt to then
summarise the remaining features needed to start supporting testing
within BuildStream. The good news is that none of the following should
be breaking changes!

As far as I'm aware, the following items would need to be completed
(which I'll elaborate on below):
1.) Add 'provides-conditions' and 'requires-conditions' fields to the
bst element file
2.) Create a new kind of dependency ('abstract dependencies')
3.) Add functionality to 'bst show' to check the 'conditions' field(s)
of a bst element
4.) Have some method of allowing a condition-providing element to use
another element's build tree


1.) Add 'provides-conditions' and 'requires-conditions' fields to the
bst element file
=======================================================================================
For the following dependency tree:

      [base.bst]
          |
          |
          v
      [lib.bst] --> [lib-test.bst]
          |
          |
          v
      [app.bst] --> [app-test.bst]
          |
          |
          v
    [app-pkg.bst]

Similar to what Tristan had suggested [1], if app-pkg.bst were to
require that app.bst is tested, I'd imagine the bst files of app,
app-test and app-pkg to look something like:

==== app.bst =====
  kind: import
  sources:
  - kind: git
    url: github: foo/app
    track: master
  depends:
  - lib.bst
  provides-conditions:
    tested:
    - app-test.bst
    packaged:
    - app-pkg.bst

==== app-test.bst ====
  kind: manual
  depends:
  - app.bst
  build-commands:
  - setup.py test

==== app-pkg.bst ====
  kind: manual
  depends:
  - app.bst
  requires-conditions:
  - tested
  build-commands:
  - setup.py sdist


2.) Create a new kind of dependency ('abstract dependencies')
=============================================================
Abstract dependencies would have the following characteristics:
- they do not care about the output of the element that is
'abstractly' depended on
- they function like 'normal' dependencies with regards to cache key calculation
- they are implied when an element specifies 'requires-conditions' on
its dependencies (and so do not need to be explicitly stated in the
bst files). To illustrate, when app-pkg.bst specifies that app.bst
must be tested, the following abstract dependencies are created:

                     [base.bst]
                         |
                         |
                         v
  [lib-test.bst] <-- [lib.bst]
         "               |
         "               |
         "               v
         "           [app.bst] --> [app-test.bst]
         "               |               "
         "               |               "
         "               v               "
         += = => [app-pkg.bst] <= = = = =+

where -----> are 'normal' dependencies
= = => are 'abstract' dependencies

(The above figure is taken from [2] but modified to distinguish
between 'normal' and 'abstract' dependencies)


3.) Add functionality to 'bst show' to check the 'conditions' field(s)
of a bst element
=======================================================================================
For the purposes of enforcing policy (without having to rely on
directly yaml parsing bst files), there must be a way to allow someone
to check if particular conditions (e.g. 'tested') have been specified
in an element's 'provides-conditions' field. Perhaps adding some
functionality to 'bst show' to support something like:

     bst show --world --format 'Name: %{name}\nProvides-conditions:
%{provides-conditions}'

which would then print out something like:

  > Name: base.bst
     Provides-conditions:
       None

     Name: lib.bst
     Provides-conditions:
       tested: lib-test.bst

     Name: app.bst
     Provides-conditions:
       tested: app-test.bst
       packaged: app-pkg.bst

or perhaps be able to print out yaml like so:

  > - Name: base.bst
       Provides-conditions: None

     - Name: lib.bst
       Provides-conditions:
         tested:
         - lib-test.bst

     - Name: app.bst
       Provides-conditions:
         tested:
         - app-test.bst
         packaged:
         - app-pkg.bst

This function would have to fail gracefully if it cannot find the
field specified in the format string. Similarly, it might be useful to
check if certain elements specify a 'requires-conditions' field:

     bst show app-pkg.bst --deps none --format 'Name:
%{name}\nRequires-conditions: %{requires-conditions}'

which would then print out something like:

  > Name: app-pkg.bst
     Requires-conditions: tested


4.) Have some method of allowing a condition-providing element to use
another element's build tree
==================================================================================================
Currently, the below configuration would not work since lib-test.bst
would only have the binary from lib.bst available to it and not the
whole build tree. For this to then work, some method must be
implemented to allow lib-test.bst to be built on top of lib.bst

======= lib.bst ========
build-commands:
- make install
===== lib-test.bst =====
build-commands:
- make test

Some possible solutions discussed with Will Salmon:
- introducing a new kind of build element which might use the previous
element's build tree
- use an existing element (the filter element seems like a good
candidate but I'm not too familiar with its functionality) -- this
could likely have big impacts on the cache
- rebuild lib in lib-test (although this would likely slow down the
build of app-pkg by a significant amount)


Cheers,
Chiara

[1]: https://mail.gnome.org/archives/buildstream-list/2018-October/msg00068.html
[2]: https://mail.gnome.org/archives/buildstream-list/2018-October/msg00069.html
On Wed, 31 Oct 2018 at 15:15, William Salmon via BuildStream-list
<buildstream-list gnome org> wrote:

I really enjoyed this discussion at the gathering and felt like it was one of the more productive,

Many thanks to Chandan for the great write up.

I would have replied to the end of the thread but some of the context i wanted to add on to has been 
removed.

On 31/10/2018 10:45, Sander Striker via BuildStream-list wrote:

Hi,

On Tue, Oct 30, 2018 at 1:33 PM Tristan Van Berkom via BuildStream-list <buildstream-list gnome org> wrote:

Hi Chandan,

This is a beautiful writeup and I feel it accurately describes what we
discussed in the gathering.

That said I am a bit biased by having been present for the discussion;
it is quite fairly possible that someone who was not present at the
discussion would have a hard time to understand this writeup :)

I will add some things to this inline to your email...

On Thu, 2018-10-25 at 14:56 +0100, Chandan Singh wrote:
Hi all,

After several long discussions in this thread, another thread [0], and in
person at the recent BuildStream Gathering [1], I think we have finally reached
some consensus on how testing should work in BuildStream. In this message, I
aim to summarize what we discussed in person. Unless anyone has any objections
with the high-level plan, I will summarize the implementation plan in a
following message.

Problem
=======

Let's start by doing a recap of the problem we are trying to solve.

As software integrators, we would like to run tests as part of BuildStream
builds and be able to validate (at a later stage in build) that the tests for a
given element have passed. Ideally, such tests should have the following
properties:

  1.) Tests should not block reverse dependency builds.
  2.) Strong association between an element and its tests.
  3.) Additional dependencies for testing an element should not be required
      while building it.
  4.) Elements should be able to assert that the tests for their dependencies
      have passed.
  5.) Test elements are not necessarily pass-through elements - they can have
      meaningful output such as coverage reports.

  6.) Tests are allowed to carry additional dependencies which are not
      necessarily needed at build or run time.

Note that having tests as separate elements can result in essentially doubling
the number of elements in a given project - this problem is orthogonal to
testing and also applies to things like packaging. So, let us acknowledge that
is important and deal with it separately.


While it is orthogonal, we are not talking about this in the abstract.  On the contrary, proposed solutions 
imply separate elements for testing.  I am also not sure this applies the same to packaging.  I can see 
packaging be more limited to leaf nodes, for example final applications, whereas testing would apply to 
every element, including for example every library dependency.
In short, let's treat this as orthogonal but let's also not forget the importance of this problem.

Proposed solution
=================

To ensure that all of us are talking about the same thing, let us consider the
following dependency graph as an example.

      [base.bst]
          |
          |
          v
      [lib.bst] --> [lib-test.bst]
          |
          |
          v
      [app.bst] --> [app-test.bst]
          |
          |
          v
    [app-pkg.bst]

This is a simple project with a single app and a single library, where the
final output is a package of the app. Other than the standard elements for the
base, lib and app; we also have a few additional elements:

  a.) lib-test.bst - a standard BuildStream element that will run tests for
      lib.bst as part of its build.
  b.) app-test.bst - same as lib-test.bst but will run tests for app.bst.
  c.) app-pkg.bst - a standard BuildStream element that will produce a package
      for the app, depends only on app.bst.

If this project is used as-is, there will be no way for app-pkg.bst to verify
that the tests have actually passed. Moreover, since lib-test.bst is not listed
as a dependency for anything in the plan for app-pkg.bst, it will not even get
built. To remedy that, we need to introduce the notion of "conditions".

XXX: Maybe we need a better word for "conditions" - suggestions welcome :)

Conditions
----------

Conditions will essentially be a mapping from a keyword to a set of elements.
The keyword can be anything as far as BuildStream is concerned but the projects
are free to enforce their own guidelines. The set of elements will usually be a
subset of the reverse dependencies of the given element.

Conditions can be things like: "tested", "packaged", "license verified" etc.

When listing dependencies, an element can require certain conditions to be
true for its dependencies. Doing so will implicitly add a dependency on the
corresponding set of elements.

Coming back to our example, lib.bst can declare its "tested" condition to be
the set of elements: [lib-test.bst]. Then, app-pkg.bst can require the "tested"
condition to be met for its dependencies. So, although it only explicitly
depends on app.bst, BuildStream would automatically add a dependency on
lib-test.bst and app-test.bst when building it.


That implies that app-pkg.bst will not actually proceed with building until all tests of the dependent 
elements complete?


I know Tristan addressed this but I think that the following is a different but nice side effect of this 
all.


This is a interesting one, the way we were thinking about it at the gathering was that if most "pieces of 
software" have 3 elements the "build", "test" and "package" and the packages have the testing "condition" 
then all the "build" element can get on and build in there normal way with no addition to there "best case 
chronological build time" and even if tests fail a user can have a debug in the bst shell of the final or 
any other element but that the "package" elements only get there dependencies satisfied when all the tests 
pass.


Whether you want to have the "package" element blocked by the test or have some other slight permutation 
this system seem supper flexible as we are just suppling a mechanism for not explicit dependencies it can 
be used for a multitude of problems, maybe at the same time, in different ways depending on the situation.


One thing that i like about this is that in my mind it dose not make scene to build a package if its 
dependencies can be packaged, but others had different ideas, so this system makes it easy to do ether (pkg 
element `relying on tst elements` or `relying on test and pkg elements`) or have extra elements that mean 
you can do even more complex requirements.


You can also set this up so that all the tests run whether or not some of them fail or have the tests rely 
on each other so they stop once one fails, ether way you only get the final package if all the tests pass.




It is also worth noting that since base.bst does not have a corresponding test
element and does not declare a "tested" condition, this condition does not
apply to it.

While this example only mentions the "tested" condition, it should be easy to
see how this might work with other conditions as well.

I'd like to illustrate a couple of interesting properties here,
starting with your graph above.

Consider the build graph below as Chandan described it, and consider
that lib-test.bst satisfies the "tested" condition of lib.bst.

Similarly app-test.bst satisfies the "tested" condition of app.bst.

                     [base.bst]
                         |
                         |
                         v
  [lib-test.bst] <-- [lib.bst]
                         |
                         |
                         v
                     [app.bst] --> [app-test.bst]
                         |
                         |
                         v
                   [app-pkg.bst]


Add to this, that app-pkg.bst makes the following statement:

  I require that all of my dependencies have the "tested" condition
  satisfied

Then the above dependency graph is transformed in the following way:

                     [base.bst]
                         |

                |
                         v
  [lib-test.bst] <--
[lib.bst]
         |               |
         |               |

        |               v
         |           [app.bst] --> [app-
test.bst]
         |               |               |
         |
    |               |
         |               v               |

+--------> [app-pkg.bst] <------+


Some things to note about this automatic transformation:

  * The two newly grown dependency relationships are not regular
    build or runtime dependencies, consequently their outputs are not
    staged during the app-pkg.bst build.

  * The app-pkg.bst cache key *does* however take the cache keys
    of the new "condition" dependencies into account.

    This has the interesting property of blocking the creation of the
    resulting artifact until the tests have successfully run.


:).


  * While the requirement of the "tested" condition imposed by
    app-pkg.bst is recursive in nature, inasmuch as it requires
    the "tested" condition recursively across it's explicitly
    declared dependencies, it is not recursive across implicit
    condition based boundaries.

    This is to say that:
    - app-pkg.bst requires the tested condition of app.bst AND lib.bst
    - app-pkg.bst does NOT require the tested condition of lib-test.bst

    This is because lib-test.bst was implicitly pulled into the graph
    due to the condition requirement.

    However, if lib-test.bst imposes other requirements on it's own
    dependencies, those will be treated in the same way as app-pkg.bst.


This could get rather complicated.


  * If an artifact for app-pkg.bst exists but the tests declared
    for lib-test.bst are changed, this will cause app-pkg.bst to have
    a new cache key, requiring the tests to be re-run.

    HOWEVER, following some plans which were discussed in artifact
    subcommand group proposal, around here (see earlier and later
    messages for better context):

      https://mail.gnome.org/archives/buildstream-list/2018-October/msg00006.html

    With "artifact aliasing" (also needs a better name), it will be
    possible to avoid the actual processing in rebuilding app-pkg.bst,
    since the actual staged inputs and build commands will remain
    unchanged.

    Ideally after implementing this orthogonal feature; the effect of
    re-building app-pkg.bst after having only changed a test, is that:

      - A new cache key is derived for app-pkg.bst
      - This new cache key requires the tests to pass for all deps,
        causing app-test.bst to be rebuilt
      - The actual package payload is not re-run, only a new artifact
        is created with the same files content as a previously existing
        artifact.

    Until this measure of build avoidance is implemented in a generic
    way, app-pkg.bst will be rebuilt every time the tests of it's
    dependencies are changed and require rebuilds (retests).


Thanks for resurfacing that part of the artifact command subgroup thread in the context of this 
conversation.

I do think there is another consideration missing.  Instead of declaring in app-pkg.bst that I want all of 
my (transient) dependencies to satisfy the 'tested' condition I want to be able to run "bst build app.bst" 
declaring that I want all elements that are built during my run to have the tested condition met at the end 
of my run.  And otherwise consider my build a failure.  I don't think it's satisfactory If the only way for 
me to do this is to "bst build app-pkg.bst".

Considerations
==============

Using this approach, we can satisfy the criteria listed above. Additionally, it
also provides the following advantages:

  - BuildStream does not require any implicit knowledge about testing and
    treats tests as regular elements.
  - Since the idea of conditions is generic, it can also be used for other
    purposes like "packaged", "license verified" etc.
  - It is easily enforceable by maintainers of a BuildStream project, either
    manually or automatically.


While I don't see why it couldn't work, I do think there is a balance we need to strike between keeping 
things generic (and simpler from an implementation/maintainablity PoV in BuildStream), and making things 
more specialized and convenient for users.

If we have project maintainers provide standardized build elements (kinds), can those also take care of 
testing in one go, or do I now need to create a -test.bst element that has a related kind?

At various points in the past, we have discussed other approaches. Here is a
short (and incomplete) list of approaches that we considered but realized were
not fit for purpose for some reason. I am listing them here in case anyone
thinks that we missed some obvious choices:

   - Running tests as part of the build - basically works but we lose the
     potential parallelism and block the reverse dependency builds.
   - Retroactively marking build as failed if tests fail: altering the artifact
     after it has been published seems wrong, and failing a test should not
     prevent the existence of the artifact for a successful build.
   - Having leaf elements explicitly depend on the test elements - basically
     works but not easy to enforce standards this way and it is easy to miss
     tests for certain dependencies.
   - Have additional cache keys for testing - requires BuildStream to be aware
     that tests are special and also suffers from the same problem as having an
     explicit dependency on test elements.


I have purposefully tried to keep the implementation details out of this
message, so that we can agree on the high-level plan first, and decide on the
details later. I will be very curious to hear what others think about
"conditions" :)

I'm also wary of the name here, let's consider what the YAML will look
like and what contexts in which the vocabulary will be used, using
"conditions", I think it will look like this:

======= lib.bst =======

# Specify that the lib-test.bst element provides its "tested" condition
condition-providers:
  tested:
  - lib-test.bst


======= app-pkg.bst =======

# Specify that this element requires the "tested" condition of its
# dependencies
condition-requirements:
- tested


I'm not sure I love "condition" as a word for this, it does make some
kind of sense when thinking about state chart terminology (e.g., the
built app-pkg.bst state cannot be reached until the conditions are all
satisfied), but clashes a bit with what we call "conditional
statements" in our YAML layer.

The word "state" is not much better.

Another approach would be to talk in terms of "extensions", as in the
lib.bst declares lib-test.bst as one of it's "tested extensions".

I'm not sure this is better either, and would also be interested to
hear other suggestions on the terminology to use.



To my mind these are "implicit dependencies" but thats not very catchy is it..  "declared implicit 
dependencies" and "required implicit dependencies" are also a bit of a mouth full.



Finally some points that i don't *think* have been mentioned from the gathering and while they don't seem 
to clash with anything in this or the longer thread i think it is good to have them written down so we keep 
them in mind as we start to move towards the implementation details.


1) The way that this is used is a matter of policy for a given project but this should make it possible for 
simple polices that should be easy to implement. And it should be easy to test that policy is being 
followed.


This should be easy to do for non bst experts and easy to test for when people add .bst elements to the 
".bst" repository


"Engineer load/over head" was a criticism of a lot of the other ideas we had and while this is not a zero 
effort proposal for the engineer adding .bst files to a big repository it is not too big and should not be 
allowed to grow to much as we polish/implement this idea.


I also think the current proposal is *just* simple enough that some sort of CI could be dreamed up around 
the repo that holds the .bst files.


2) this only effects the dependency graph.


Lots of the other ideas we have had for testing have needed changes in all sorts of parts of the code base, 
even effecting caching. But this idea keeps all changes up to the dependency tree. Once the front end has 
run and the dependency tree has been calculated there should be no further modification for this to work. 
Although some extra commands to make it easy to digest the information may be desired i think they would 
only be very minimal and not effect the core of the code.


While i did say a lot of things in the discussion these last two are not my original points but I didn't 
want them to get missed. Apologizes if they are already there i have tried to read most of this thread but 
i may have missed them.


Thanks

Will


_______________________________________________
BuildStream-list mailing list
BuildStream-list gnome org
https://mail.gnome.org/mailman/listinfo/buildstream-list


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]