I really enjoyed this discussion at the gathering and felt like
it was one of the more productive,
Many thanks to Chandan for the great write up.
I would have replied to the end of the thread but some of the
context i wanted to add on to has been removed.
On 31/10/2018 10:45, Sander Striker via
BuildStream-list wrote:
Hi,
Hi
Chandan,
This is a beautiful writeup and I feel it accurately
describes what we
discussed in the gathering.
That said I am a bit biased by having been present for the
discussion;
it is quite fairly possible that someone who was not present
at the
discussion would have a hard time to understand this writeup
:)
I will add some things to this inline to your email...
On Thu, 2018-10-25 at 14:56 +0100, Chandan Singh wrote:
> Hi all,
>
> After several long discussions in this thread, another
thread [0], and in
> person at the recent BuildStream Gathering [1], I think
we have finally reached
> some consensus on how testing should work in
BuildStream. In this message, I
> aim to summarize what we discussed in person. Unless
anyone has any objections
> with the high-level plan, I will summarize the
implementation plan in a
> following message.
>
> Problem
> =======
>
> Let's start by doing a recap of the problem we are
trying to solve.
>
> As software integrators, we would like to run tests as
part of BuildStream
> builds and be able to validate (at a later stage in
build) that the tests for a
> given element have passed. Ideally, such tests should
have the following
> properties:
>
> 1.) Tests should not block reverse dependency builds.
> 2.) Strong association between an element and its
tests.
> 3.) Additional dependencies for testing an element
should not be required
> while building it.
> 4.) Elements should be able to assert that the tests
for their dependencies
> have passed.
> 5.) Test elements are not necessarily pass-through
elements - they can have
> meaningful output such as coverage reports.
6.) Tests are allowed to carry additional dependencies
which are not
necessarily needed at build or run time.
> Note that having tests as separate elements can result
in essentially doubling
> the number of elements in a given project - this
problem is orthogonal to
> testing and also applies to things like packaging. So,
let us acknowledge that
> is important and deal with it separately.
While it is orthogonal, we are not talking about this in
the abstract. On the contrary, proposed solutions imply
separate elements for testing. I am also not sure this
applies the same to packaging. I can see packaging be more
limited to leaf nodes, for example final applications,
whereas testing would apply to every element, including for
example every library dependency.
In short, let's treat this as orthogonal but let's also
not forget the importance of this problem.
> Proposed solution
> =================
>
> To ensure that all of us are talking about the same
thing, let us consider the
> following dependency graph as an example.
>
> [base.bst]
> |
> |
> v
> [lib.bst] --> [lib-test.bst]
> |
> |
> v
> [app.bst] --> [app-test.bst]
> |
> |
> v
> [app-pkg.bst]
>
> This is a simple project with a single app and a single
library, where the
> final output is a package of the app. Other than the
standard elements for the
> base, lib and app; we also have a few additional
elements:
>
> a.) lib-test.bst - a standard BuildStream element
that will run tests for
> lib.bst as part of its build.
> b.) app-test.bst - same as lib-test.bst but will run
tests for app.bst.
> c.) app-pkg.bst - a standard BuildStream element that
will produce a package
> for the app, depends only on app.bst.
>
> If this project is used as-is, there will be no way for
app-pkg.bst to verify
> that the tests have actually passed. Moreover, since
lib-test.bst is not listed
> as a dependency for anything in the plan for
app-pkg.bst, it will not even get
> built. To remedy that, we need to introduce the notion
of "conditions".
>
> XXX: Maybe we need a better word for "conditions" -
suggestions welcome :)
>
> Conditions
> ----------
>
> Conditions will essentially be a mapping from a keyword
to a set of elements.
> The keyword can be anything as far as BuildStream is
concerned but the projects
> are free to enforce their own guidelines. The set of
elements will usually be a
> subset of the reverse dependencies of the given
element.
>
> Conditions can be things like: "tested", "packaged",
"license verified" etc.
>
> When listing dependencies, an element can require
certain conditions to be
> true for its dependencies. Doing so will implicitly add
a dependency on the
> corresponding set of elements.
>
> Coming back to our example, lib.bst can declare its
"tested" condition to be
> the set of elements: [lib-test.bst]. Then, app-pkg.bst
can require the "tested"
> condition to be met for its dependencies. So, although
it only explicitly
> depends on app.bst, BuildStream would automatically add
a dependency on
> lib-test.bst and app-test.bst when building it.
That implies that app-pkg.bst will not actually proceed
with building until all tests of the dependent elements
complete?
I know Tristan addressed this but I think that the following is a
different but nice side effect of this all.
This is a interesting one, the way we were thinking about it at
the gathering was that if most "pieces of software" have 3
elements the "build", "test" and "package" and the packages have
the testing "condition" then all the "build" element can get on
and build in there normal way with no addition to there "best case
chronological build time" and even if tests fail a user can have a
debug in the bst shell of the final or any other element but that
the "package" elements only get there dependencies satisfied when
all the tests pass.
Whether you want to have the "package" element blocked by the
test or have some other slight permutation this system seem supper
flexible as we are just suppling a mechanism for not explicit
dependencies it can be used for a multitude of problems, maybe at
the same time, in different ways depending on the situation.
One thing that i like about this is that in my mind it dose not
make scene to build a package if its dependencies can be packaged,
but others had different ideas, so this system makes it easy to do
ether (pkg element `relying on tst elements` or `relying on test
and pkg elements`) or have extra elements that mean you can do
even more complex requirements.
You can also set this up so that all the tests run whether or not
some of them fail or have the tests rely on each other so they
stop once one fails, ether way you only get the final package if
all the tests pass.
> It is also worth noting that since base.bst does not
have a corresponding test
> element and does not declare a "tested" condition, this
condition does not
> apply to it.
>
> While this example only mentions the "tested"
condition, it should be easy to
> see how this might work with other conditions as well.
I'd like to illustrate a couple of interesting properties
here,
starting with your graph above.
Consider the build graph below as Chandan described it, and
consider
that lib-test.bst satisfies the "tested" condition of
lib.bst.
Similarly app-test.bst satisfies the "tested" condition of
app.bst.
[base.bst]
|
|
v
[lib-test.bst] <-- [lib.bst]
|
|
v
[app.bst] --> [app-test.bst]
|
|
v
[app-pkg.bst]
Add to this, that app-pkg.bst makes the following statement:
I require that all of my dependencies have the "tested"
condition
satisfied
Then the above dependency graph is transformed in the
following way:
[base.bst]
|
|
v
[lib-test.bst] <--
[lib.bst]
| |
| |
| v
| [app.bst] --> [app-
test.bst]
| | |
|
| |
| v |
+--------> [app-pkg.bst] <------+
Some things to note about this automatic transformation:
* The two newly grown dependency relationships are not
regular
build or runtime dependencies, consequently their
outputs are not
staged during the app-pkg.bst build.
* The app-pkg.bst cache key *does* however take the cache
keys
of the new "condition" dependencies into account.
This has the interesting property of blocking the
creation of the
resulting artifact until the tests have successfully
run.
:).
* While the requirement of the "tested" condition imposed
by
app-pkg.bst is recursive in nature, inasmuch as it
requires
the "tested" condition recursively across it's
explicitly
declared dependencies, it is not recursive across
implicit
condition based boundaries.
This is to say that:
- app-pkg.bst requires the tested condition of app.bst
AND lib.bst
- app-pkg.bst does NOT require the tested condition of
lib-test.bst
This is because lib-test.bst was implicitly pulled into
the graph
due to the condition requirement.
However, if lib-test.bst imposes other requirements on
it's own
dependencies, those will be treated in the same way as
app-pkg.bst.
This could get rather complicated.
* If an artifact for app-pkg.bst exists but the tests
declared
for lib-test.bst are changed, this will cause
app-pkg.bst to have
a new cache key, requiring the tests to be re-run.
HOWEVER, following some plans which were discussed in
artifact
subcommand group proposal, around here (see earlier and
later
messages for better context):
https://mail.gnome.org/archives/buildstream-list/2018-October/msg00006.html
With "artifact aliasing" (also needs a better name), it
will be
possible to avoid the actual processing in rebuilding
app-pkg.bst,
since the actual staged inputs and build commands will
remain
unchanged.
Ideally after implementing this orthogonal feature; the
effect of
re-building app-pkg.bst after having only changed a
test, is that:
- A new cache key is derived for app-pkg.bst
- This new cache key requires the tests to pass for
all deps,
causing app-test.bst to be rebuilt
- The actual package payload is not re-run, only a new
artifact
is created with the same files content as a
previously existing
artifact.
Until this measure of build avoidance is implemented in
a generic
way, app-pkg.bst will be rebuilt every time the tests of
it's
dependencies are changed and require rebuilds (retests).
Thanks for resurfacing that part of the artifact command
subgroup thread in the context of this conversation.
I do think there is another consideration missing.
Instead of declaring in app-pkg.bst that I want all of my
(transient) dependencies to satisfy the 'tested' condition I
want to be able to run "bst build app.bst" declaring that I
want all elements that are built during my run to have the
tested condition met at the end of my run. And otherwise
consider my build a failure. I don't think it's
satisfactory If the only way for me to do this is to "bst
build app-pkg.bst".
> Considerations
> ==============
>
> Using this approach, we can satisfy the criteria listed
above. Additionally, it
> also provides the following advantages:
>
> - BuildStream does not require any implicit knowledge
about testing and
> treats tests as regular elements.
> - Since the idea of conditions is generic, it can
also be used for other
> purposes like "packaged", "license verified" etc.
> - It is easily enforceable by maintainers of a
BuildStream project, either
> manually or automatically.
While I don't see why it couldn't work, I do think there
is a balance we need to strike between keeping things
generic (and simpler from an implementation/maintainablity
PoV in BuildStream), and making things more specialized and
convenient for users.
If we have project maintainers provide standardized build
elements (kinds), can those also take care of testing in one
go, or do I now need to create a -test.bst element that has
a related kind?
> At various points in the past, we have discussed other
approaches. Here is a
> short (and incomplete) list of approaches that we
considered but realized were
> not fit for purpose for some reason. I am listing them
here in case anyone
> thinks that we missed some obvious choices:
>
> - Running tests as part of the build - basically
works but we lose the
> potential parallelism and block the reverse
dependency builds.
> - Retroactively marking build as failed if tests
fail: altering the artifact
> after it has been published seems wrong, and
failing a test should not
> prevent the existence of the artifact for a
successful build.
> - Having leaf elements explicitly depend on the test
elements - basically
> works but not easy to enforce standards this way
and it is easy to miss
> tests for certain dependencies.
> - Have additional cache keys for testing - requires
BuildStream to be aware
> that tests are special and also suffers from the
same problem as having an
> explicit dependency on test elements.
>
>
> I have purposefully tried to keep the implementation
details out of this
> message, so that we can agree on the high-level plan
first, and decide on the
> details later. I will be very curious to hear what
others think about
> "conditions" :)
I'm also wary of the name here, let's consider what the YAML
will look
like and what contexts in which the vocabulary will be used,
using
"conditions", I think it will look like this:
======= lib.bst =======
# Specify that the lib-test.bst element provides its
"tested" condition
condition-providers:
tested:
- lib-test.bst
======= app-pkg.bst =======
# Specify that this element requires the "tested" condition
of its
# dependencies
condition-requirements:
- tested
I'm not sure I love "condition" as a word for this, it does
make some
kind of sense when thinking about state chart terminology
(e.g., the
built app-pkg.bst state cannot be reached until the
conditions are all
satisfied), but clashes a bit with what we call "conditional
statements" in our YAML layer.
The word "state" is not much better.
Another approach would be to talk in terms of "extensions",
as in the
lib.bst declares lib-test.bst as one of it's "tested
extensions".
I'm not sure this is better either, and would also be
interested to
hear other suggestions on the terminology to use.
To my mind these are "implicit dependencies" but thats not very
catchy is it.. "declared implicit dependencies" and "required
implicit dependencies" are also a bit of a mouth full.
Finally some points that i don't *think* have been mentioned from
the gathering and while they don't seem to clash with anything in
this or the longer thread i think it is good to have them written
down so we keep them in mind as we start to move towards the
implementation details.
1) The way that this is used is a matter of policy for a given
project but this should make it possible for simple polices that
should be easy to implement. And it should be easy to test that
policy is being followed.
This should be easy to do for non bst experts and easy to test
for when people add .bst elements to the ".bst" repository
"Engineer load/over head" was a criticism of a lot of the other
ideas we had and while this is not a zero effort proposal for the
engineer adding .bst files to a big repository it is not too big
and should not be allowed to grow to much as we polish/implement
this idea.
I also think the current proposal is *just* simple enough that
some sort of CI could be dreamed up around the repo that holds the
.bst files.
2) this only effects the dependency graph.
Lots of the other ideas we have had for testing have needed
changes in all sorts of parts of the code base, even effecting
caching. But this idea keeps all changes up to the dependency
tree. Once the front end has run and the dependency tree has been
calculated there should be no further modification for this to
work. Although some extra commands to make it easy to digest the
information may be desired i think they would only be very minimal
and not effect the core of the code.
While i did say a lot of things in the discussion these last two
are not my original points but I didn't want them to get missed.
Apologizes if they are already there i have tried to read most of
this thread but i may have missed them.
Thanks
Will
|