Re: [BuildStream] Execution environments



Hi Jürg,

On Thu, Nov 1, 2018 at 8:28 PM Jürg Billeter <j bitron ch> wrote:
Hi everyone,

With remote execution it will be easy to execute build commands in
environments that differ from the host. For example, the remote
worker may run on a different ISA or a different operating system.
Strictly speaking, this is not specific to remote execution (QEMU, for
instance, allows running ARM binaries on x86 systems), but with remote
execution we expect it to be much more common.

Projects need to be able to explicitly request an execution
environment. For local builds BuildStream will check whether the local
environment is a match and report a friendly error otherwise. With
remote execution this will make sure that a suitable worker is
assigned, and otherwise also report an error.

While the Bazel Remote Execution API has a mechanism for platform
properties, the details are not standardized yet. As we can't wait for
this to happen, I'm proposing minimal support for requesting the
execution environment, in a way that we can support and later extend.
Future extensions can hopefully be aligned with the upstream Remote
Execution API.

+1.  I noticed there are related topics on the bazel-dev list as well:
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/bazel-dev/wKacIE4MIRM/nukf2F7hBQAJ
 
The proposal is to extend the `sandbox` configuration for this purpose
with two keys: `os` and `arch`

    os
    ~~
    This is a string indicating the OS family / kernel. I think it's
    sensible to use the output of `uname -s` for this on POSIX systems.

    Example values: Linux, Darwin, FreeBSD, SunOS

    We may also need to be able to specify a minimum OS version,
    however, I don't think this is urgent and we can add this later.
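As a minimal sketch of the `os` check, assuming Python's standard `platform.system()` is an acceptable stand-in for `uname -s` on POSIX systems. The function names `host_os` and `check_os` are hypothetical and not part of BuildStream:

```python
import platform


def host_os() -> str:
    """Return the OS family / kernel name of the host.

    On POSIX systems platform.system() reports the same value as
    `uname -s`, e.g. 'Linux', 'Darwin', 'FreeBSD' or 'SunOS'.
    """
    return platform.system()


def check_os(requested: str) -> None:
    """Raise a readable error if the requested OS differs from the host.

    Hypothetical helper: only illustrates the 'friendly error' for
    local builds described in the proposal.
    """
    host = host_os()
    if requested != host:
        raise RuntimeError(
            f"This project requests os '{requested}', "
            f"but the local host runs '{host}'"
        )
```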


    arch
    ~~~~
    This is a string indicating the ISA family. There is a lot more
    variance here across systems/vendors, unfortunately. And on some
    systems `uname -m` conflates the ISA family with extensions or
    versions (e.g., on Linux: ARM extensions or i386 vs. i686), which
    I think we need to handle separately. I thus propose that we define
    our own list of values, but we should attempt to pick names that
    are as close to official, neutral, and consistent as possible. Bi-
    endian architectures require a suffix to indicate (non-default)
    endianness.

    As a start I propose the following list:

    * AArch32 (little endian)
    * AArch32-BE
    * AArch64 (little endian)
    * AArch64-BE
    * Power-ISA-BE
    * Power-ISA-LE
    * SPARC-V9 (big endian)
    * x86-32
    * x86-64

    We will likely need to support specifying minor architecture
    versions and ISA extensions (e.g. SSE4 for x86, and SVE for
    AArch64). For this it may be useful to allow a list for `arch`
    instead of a single string. Or it may be better to add separate
    keys for this. A list may also be useful to support specifying
    multiple ISAs, e.g. AArch32 and AArch64 for biarch builds. I don't
    think we need to decide this right now and would like to wait for
    the Remote Execution API upstream discussions for alignment.
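To illustrate how `uname -m` output could be mapped onto the proposed list, here is a rough sketch. The table is an assumption based on values commonly reported by Linux and the BSDs; it is neither complete nor an agreed mapping, and `normalize_arch` is a hypothetical name:

```python
import re

# Assumed mapping from common `uname -m` values to the proposed names.
_EXACT = {
    "x86_64": "x86-64",
    "amd64": "x86-64",
    "aarch64": "AArch64",
    "arm64": "AArch64",
    "aarch64_be": "AArch64-BE",
    "ppc64": "Power-ISA-BE",
    "ppc64le": "Power-ISA-LE",
    "sparc64": "SPARC-V9",
}


def normalize_arch(machine: str) -> str:
    """Map a raw `uname -m` string to one of the proposed arch names."""
    m = machine.lower()
    if m in _EXACT:
        return _EXACT[m]
    # i386 vs. i686 is a version difference, not a different ISA family.
    if re.fullmatch(r"i[3-6]86", m):
        return "x86-32"
    # Linux appends 'l' or 'b' to ARM names to indicate endianness,
    # e.g. armv7l (little endian) or armv7b (big endian).
    if re.fullmatch(r"armv\d+\w*l", m):
        return "AArch32"
    if re.fullmatch(r"armv\d+\w*b", m):
        return "AArch32-BE"
    raise ValueError(f"Unrecognized machine type: {machine!r}")
```

ISA extensions and minor versions are deliberately dropped here, since the proposal leaves their representation open.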

These two keys can be specified in `project.conf` as well as elements.
Conditional expressions can be used to allow user configuration via
options. If not specified, they default to the host environment.
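The defaulting rule can be sketched as follows. This assumes that element-level settings take precedence over `project.conf`, which the proposal does not state explicitly, and it treats both configurations as plain dictionaries. `resolve_environment` is a hypothetical name:

```python
from typing import Dict, Mapping, Optional


def resolve_environment(
    project_sandbox: Optional[Mapping[str, str]],
    element_sandbox: Optional[Mapping[str, str]],
    host_os: str,
    host_arch: str,
) -> Dict[str, str]:
    """Resolve the effective `os` and `arch` for one element.

    Start from the host environment, then apply project.conf, then the
    element (assumed precedence: element over project over host).
    """
    resolved = {"os": host_os, "arch": host_arch}
    for source in (project_sandbox or {}, element_sandbox or {}):
        for key in ("os", "arch"):
            if key in source:
                resolved[key] = source[key]
    return resolved
```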

These keys define the execution environment, i.e., what binaries can be
executed. Projects may cross-compile for other environments; however,
BuildStream itself has no need to know the target platform.

For local builds, an exact match with the host architecture is not
required. E.g. Linux x86-64 systems typically support x86-32 binaries
as well. However, the sandbox should in that case use
linux32/personality(2) such that `uname -m` and configure scripts in
the sandbox work as expected.
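A possible shape for the local compatibility check, as a sketch only. The compatibility table contains just the x86 example from the paragraph above. Prefixing the command with `linux32` (the `setarch` alias from util-linux) is one assumed way to set the personality, not necessarily how the sandbox would do it. `can_execute` and `wrap_command` are hypothetical names:

```python
from typing import Dict, List, Set, Tuple

Env = Tuple[str, str]  # (os, arch)

# Assumed table: host environment -> additional archs it can execute.
_ALSO_RUNS: Dict[Env, Set[str]] = {
    ("Linux", "x86-64"): {"x86-32"},
}


def can_execute(host: Env, requested: Env) -> bool:
    """Return True if the host can run binaries for the requested env."""
    if requested == host:
        return True
    host_os, _ = host
    req_os, req_arch = requested
    return req_os == host_os and req_arch in _ALSO_RUNS.get(host, set())


def wrap_command(host: Env, requested: Env, command: List[str]) -> List[str]:
    """Return the command to run, adjusted for the requested environment."""
    if not can_execute(host, requested):
        raise RuntimeError(
            f"Host {host} cannot execute binaries for {requested}"
        )
    if host == ("Linux", "x86-64") and requested == ("Linux", "x86-32"):
        # linux32 switches the personality so `uname -m` reports a
        # 32-bit machine inside the wrapped process.
        return ["linux32"] + command
    return list(command)
```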

Cache key: We already include `os` and `arch` in the cache key
calculation, currently using the value from the host. However, as the
existing `arch` is based on `uname -m` instead of the proposed list,
cache keys will still change.
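To see why the keys change, consider a toy cache key that hashes a few inputs including `os` and `arch`. This is an illustration only: the real BuildStream key covers many more inputs, and `cache_key` is a hypothetical function:

```python
import hashlib
import json


def cache_key(element_name: str, os_name: str, arch: str) -> str:
    """Toy cache key: SHA-256 over a canonical JSON encoding."""
    payload = json.dumps(
        {"element": element_name, "os": os_name, "arch": arch},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same host, but the arch string moves from the `uname -m` spelling to
# the proposed one, so the resulting digest differs.
old_key = cache_key("base.bst", "Linux", "x86_64")
new_key = cache_key("base.bst", "Linux", "x86-64")
```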

Open issue: The current `arch` project option type is documented and
implemented using `uname -m` as default. However, as described above,
`uname -m` varies across systems and is thus not suitable. Any
suggestions on how to handle this with minimal breakage of existing
projects?

This requires a cache key version bump, doesn't it?  If we only promise
backward compatibility in terms of consuming from existing caches, we
could generate the cache key with both the old and the new version.  We
first check whether an artifact exists under the new key; if not, we check
under the old key.  If it is present under neither, we build the artifact.
In every case we then store the artifact under the new version key.
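Roughly, in code, with a plain dictionary standing in for the artifact cache and `fetch_or_build` as a hypothetical helper name:

```python
from typing import Callable, Dict


def fetch_or_build(
    cache: Dict[str, bytes],
    old_key: str,
    new_key: str,
    build: Callable[[], bytes],
) -> bytes:
    """Look up an artifact under the new key, then the old key, else build.

    Whatever the source, the artifact ends up stored under the new key,
    so the old key is only ever read, never written.
    """
    artifact = cache.get(new_key)
    if artifact is None:
        artifact = cache.get(old_key)
    if artifact is None:
        artifact = build()
    cache[new_key] = artifact
    return artifact
```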
 
Any thoughts or comments?

Seems like a good starting point.  Thanks for raising!
 
Jürg

Cheers,

Sander

 
_______________________________________________
BuildStream-list mailing list
BuildStream-list gnome org
https://mail.gnome.org/mailman/listinfo/buildstream-list