Revisions Draft RFC [Re: Proposed public data fields for dpkg build and deploy elements]
- From: Tristan Van Berkom <tristan vanberkom codethink co uk>
- To: buildstream-list gnome org
- Subject: Revisions Draft RFC [Re: Proposed public data fields for dpkg build and deploy elements]
- Date: Fri, 07 Jul 2017 18:36:11 +0900
Forking this thread.
This started as a reply to Jonathan about how to handle format changes
and migrations for public data, and evolved into a complete draft of
how we should handle versioning information in general in BuildStream.
On Thu, 2017-07-06 at 17:49 +0100, Jonathan Maw wrote:
On 2017-07-06 12:32, Tristan Van Berkom wrote:
[...]
Okay, I have some concerns about how this format for packaging
scriptlets will work with other packaging systems, but to avoid
blocking your progress, let's try to address these concerns separately.
Instead, can you outline a plan for revisioning of this data ?
What I'm mainly concerned with, is if we eventually want to use this
data for another packaging system but share the same scriptlets to
deploy to multiple packaging systems, what happens then ?
A.) The dpkg-deploy element should probably understand what revision
the scriptlets were written for, and behave in a backwards
compatible way.
B.) A potential new package deployment element, say RPM, which cannot
handle the old version, should be able to bail out with an error
when detecting an incompatible 'package-scripts' syntax
I take it you're referring to versioning of the public data, so that we
can handle changes to the format sensibly?
I haven't given it much thought. I suppose there are two options:
1) Add a version number to the "bst" domain.
i.e. public.bst.version is defined in the defaults, and when we alter
the format for public data, that version goes up.
2) Add version numbers to the subdomains in "bst".
i.e. there'd be public.bst.split-rules.version,
public.bst.dpkg-data.version and public.bst.package-scripts.version
defined in the defaults.
I'd prefer version 2), since that way someone who wants to add something
completely separate doesn't have to comprehend the entire history of
public data formats.
So I've been giving this some thought and haven't arrived at any
conclusions yet, but I can see this as a potential repeating pattern,
also there are more than just format revisions to consider with
BuildStream.
Element 'config' sections will probably have to be revisioned; the base
buildstream format will have to be revisioned, and now it seems that
public data will have to be revisioned.
At what granularity we revision public data is difficult to say, but
at face value it seems to make sense to only revision public data at
the domain level but not at the subdomain level.
I think this activity constitutes more thought process than actual work
(enforcing and supporting revisions will be a handful of relatively
trivial patches), so I'm drafting an initial pass on revisions here.
BuildStream Software Version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I have not quite decided on when it is appropriate to bump the major
version, but every 6 months there should be a public release of a
stable BuildStream version with a minor point revision bump (e.g. 1.2
becomes 1.3 if a feature was added in a release cycle).
Starting from BuildStream 1.0, any API additions or keyword argument
additions in the public API should include an annotation of what
version of BuildStream they were added in (e.g. "Since: 1.2")
Source and Element plugins must advertise the minimal bound BuildStream
dependency - for convenience, any plugins residing in the BuildStream
source base will just advertise the current version of BuildStream as
their dependency.
After the release of BuildStream 1.0, all public python API surfaces
must be considered stable and remain backwards compatible moving
forward.
BuildStream Format Versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~
The format versions are different as they provide a guarantee of what
features are available to a project.
This is rendered a bit more complex by the fact that third party
plugins are allowed to exist, this means that the core BuildStream
format (i.e. conditional statements, dependency declarations, variants,
etc) needs to have one format revision, and all plugins need to have an
individual revision as well.
To simplify things; I would propose that we keep a single BuildStream
format revision and we have all plugins which are hosted in the
BuildStream source repository use the same revision number.
A project will be able to assert a minimal bound revision for
BuildStream and for any plugins it uses in the project.conf, if the
installation of BuildStream has an old revision for the overall format,
or for any of the loaded plugins; BuildStream will abort and tell the
user that they need a newer installation for the given project.
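As a rough sketch of the check described above (every name here is hypothetical and not existing BuildStream API, and revisions are assumed to be plain integers), the loader could do something like:

```python
# Hypothetical sketch: none of these names are actual BuildStream API.

class FormatVersionError(Exception):
    """Raised when the installation is too old for the project."""


def check_format_versions(required_core, required_plugins,
                          installed_core, installed_plugins):
    # The core format revision of the installation must be at least
    # the minimal bound asserted by the project.
    if installed_core < required_core:
        raise FormatVersionError(
            "Project requires format revision {} but this installation "
            "provides {}".format(required_core, installed_core))

    # Every plugin the project uses must satisfy its own minimal bound.
    for name, required in sorted(required_plugins.items()):
        installed = installed_plugins.get(name)
        if installed is None or installed < required:
            raise FormatVersionError(
                "Plugin '{}' requires format revision {} but this "
                "installation provides {}".format(name, required, installed))
```

Here required_core and required_plugins stand in for whatever the project.conf ends up declaring; the exact syntax for that is not decided in this draft.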
Again, from BuildStream 1.0; any additions and enhancements to the BuildStream format must remain backwards
compatible with older versions.
The policy for bumping the BuildStream format revision is, if any features have been added to the base
format, or to any of the plugins, or if a new plugin has been added over the course of a release cycle (which
should be 6 months following the GNOME release schedule), then the format revision must be bumped once
immediately; there should be at most one revision bump in a given release cycle.
Public Data
~~~~~~~~~~~
Public data is a bit tricky, but I think the most straightforward way of dealing with this is to say that:
o Public Data in the "bst" domain is revisioned with the main
BuildStream format revision
o If Public Data is not in the "bst" domain, then it is specific to
a given element type which consumes that data, and as such it is
revisioned with the format version of the given element plugin.
BuildStream Artifact Versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The artifact version is a different beast altogether, and is only ever bumped when the underlying code causes
the build output to differ for a given calculated cache key - OR - if the cache key calculation algorithm has
changed in any way.
Note that incrementing the BuildStream version itself, OR incrementing the format version, does not by itself
constitute bumping the artifact version.
This provides two separate guarantees:
A.) The same cache key for a given artifact will always produce the
same output (bit-for-bit identical ideally, if we have
sufficiently reproducible builds).
B.) Building the same project with a version of BuildStream that
carries the same artifact version, will produce the same
cache key.
Here again, unfortunately we have to consider third party plugins so it cannot be one single BuildStream
artifact version. This is because third party plugins loaded in the pipeline may change over time in ways
that can potentially affect either how the output is created or how a cache key is calculated for the
given element (via Element.get_unique_key()).
So, this means that the artifact version is actually a single master revision, plus a dictionary of plugin
artifact revisions.
There are two avenues we can follow regarding this revision:
o For convenience, only ever bump a single artifact version for any
and all BuildStream first class citizen plugins (plugins which are
maintained as a part of BuildStream).
In this case we can ignore the plugins which are maintained as a
part of BuildStream and have a more comprehensive artifact version.
o It is generally undesirable to bump the artifact version over time,
because it means you need to go get an old version of BuildStream
if you want to produce the same cache key and output for the same
project, years later.
In this light we could choose to revision the artifact version of
each plugin separately. If for example the cmake build element
artifact version is bumped, it need not have any effect on projects
which do not use cmake.
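To illustrate the second avenue, here is a minimal sketch which assumes the artifact version is represented as a master revision plus a dictionary of plugin revisions, and that only the plugins loaded for a project contribute to it. The function name and the plugin names are made up for the example:

```python
# Hypothetical sketch: these names are not BuildStream API.

def effective_artifact_version(master_revision, plugin_revisions, used_plugins):
    # Only the plugins actually loaded for the project contribute,
    # sorted by name so the result is deterministic.
    used = {name: plugin_revisions[name] for name in sorted(used_plugins)}
    return (master_revision, tuple(used.items()))


revisions_before = {"autotools": 1, "cmake": 1}
revisions_after = {"autotools": 1, "cmake": 2}   # cmake plugin bumped

# A project which only uses autotools is unaffected by the cmake bump.
assert (effective_artifact_version(0, revisions_before, {"autotools"}) ==
        effective_artifact_version(0, revisions_after, {"autotools"}))
```

A project which does use cmake would see a different effective version after the bump, which is the desired behaviour.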
Specifically Regarding Artifact Revisions in Plugins
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Another thing to consider is how we add features which affect cache key calculation to individual Element
plugins which provide the Element.get_unique_key() implementations.
If for example, an Element adds a new feature to its format, this constitutes a format bump, however it
still must remain backwards compatible with older formats.
So, if Element.get_unique_key() is written properly, the artifact version for this element need not be bumped,
as only *usage* of the new feature would cause the cache key to change; but projects which do not use the new
feature would still produce the same output and can still produce the same cache key.
An example here... of a bad Element.get_unique_key() implementation:
===============================
    return {
        "foo": self.get_foo(),
        "bar": self.get_bar(),
        "new-feature": self.get_new_feature_configuration()
    }
===============================
An example of a backward compatible Element.get_unique_key() implementation which may not require an artifact
revision bump:
===============================
    unique_key = {}
    unique_key["foo"] = self.get_foo()
    unique_key["bar"] = self.get_bar()

    # Only include the new feature in the key when it is actually used
    new_feature = self.get_new_feature_configuration()
    if new_feature is not None:
        unique_key["new-feature"] = new_feature
    return unique_key
===============================
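To make the difference concrete, here is a small standalone sketch comparing the two approaches. The cache_key() function is an assumption made purely for illustration (a SHA-256 over a canonical JSON dump of the dictionary), not the real cache key calculation, and the three *_unique_key() helpers are stand-ins for an element before and after the feature was added:

```python
import hashlib
import json


def cache_key(unique_key):
    # Assumed for illustration: hash a canonical JSON dump of the dict.
    data = json.dumps(unique_key, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def old_unique_key(foo, bar):
    # The element before the new feature existed
    return {"foo": foo, "bar": bar}


def bad_unique_key(foo, bar, new_feature=None):
    # Always adds the new entry, even when the feature is unused
    return {"foo": foo, "bar": bar, "new-feature": new_feature}


def compatible_unique_key(foo, bar, new_feature=None):
    # Adds the new entry only when the feature is actually used
    unique_key = {"foo": foo, "bar": bar}
    if new_feature is not None:
        unique_key["new-feature"] = new_feature
    return unique_key


old = cache_key(old_unique_key("a", "b"))

# The bad implementation changes the key for every existing project
assert cache_key(bad_unique_key("a", "b")) != old

# The compatible one only changes the key when the feature is used
assert cache_key(compatible_unique_key("a", "b")) == old
assert cache_key(compatible_unique_key("a", "b", "x")) != old
```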
Sorry for the long email, but there are a lot of details to consider as usual, any thoughts or comments
appreciated.
Cheers,
-Tristan