Re: [BuildStream] Partial local CAS
- From: Jim MacArthur <jim.macarthur@codethink.co.uk>
- To: buildstream-list@gnome.org
- Subject: Re: [BuildStream] Partial local CAS
- Date: Mon, 26 Nov 2018 17:59:31 +0000
On 08/11/2018 22:01, Sander Striker via BuildStream-list wrote:
Hi,
After the exchange in the "Coping with partial artifacts" thread, I
realize that we haven't actually had a conversation on list about
partial local CAS, and by extension a partial local ArtifactCache.
Let me first explain what I mean by partial local CAS. Let's define
it as a CAS that contains Tree and Directory nodes, but not [all of]
the actual file content blobs.
I'll outline the context and importance of this concept. In remote
execution, builds do not run on the local machine. As such, to be
able to perform a build, it is important to be able to _describe_
the inputs to a build. When all of the input files are locally
available, this can be done. However, when the input files are not
locally available, should we then incur the cost of fetching them?
Is there another way?
To answer that question, let's review again how remote execution is
supposed to work in the context of BuildStream. To build an element:
1) Compose a merkle tree of all dependencies, and all sources
2) Create a Command and an Action message
3) FindMissingBlobs(command, action, blobs in the merkle tree)
4) Upload the missing blobs
5) Submit the request to the execution service
6) Wait for the request to complete
7) Download the result merkle tree
8) Construct a merkle tree for the Artifact (based on the result)
9) FindMissingBlobs(blobs in the artifact merkle tree)
10) Upload the missing blobs
11) Store a ref to the artifact merkle tree in ArtifactCache
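To make the shape of steps 2-6 concrete, here is a rough sketch in
terms of the REAPI protos. This is illustrative only: digest_of() is
a helper defined here, the upload step is elided, and the import
path assumes BuildStream's generated stubs.

    import hashlib

    from buildstream._protos.build.bazel.remote.execution.v2 import (
        remote_execution_pb2, remote_execution_pb2_grpc)

    def digest_of(blob):
        # REAPI digests are (sha256 hex, size in bytes)
        return remote_execution_pb2.Digest(
            hash=hashlib.sha256(blob).hexdigest(),
            size_bytes=len(blob))

    def submit_build(channel, input_root_digest, argv):
        # Step 2: the Command and Action messages
        command_blob = remote_execution_pb2.Command(
            arguments=argv).SerializeToString()
        action_blob = remote_execution_pb2.Action(
            command_digest=digest_of(command_blob),
            input_root_digest=input_root_digest).SerializeToString()

        # Steps 3-4 (FindMissingBlobs + upload) would go here.

        # Step 5: hand the Action digest to the execution service
        execution = remote_execution_pb2_grpc.ExecutionStub(channel)
        operations = execution.Execute(
            remote_execution_pb2.ExecuteRequest(
                action_digest=digest_of(action_blob)))

        # Step 6: the Operation stream ends with the result
        for operation in operations:
            if operation.done:
                return operation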
Let's dive in a bit and look where the inefficiencies are in the
current implementation.
Step 1 happens during staging. More specifically in
buildelement.py:stage(). We start with the dependencies. For
directories backed by CAS, we don't need to actually stage them on the
filesystem. We can import files between CAS directories by reference
(hash), without even needing the files locally. This isn't currently
implemented (_casbaseddirectory.py:import_files), but that should
change with CAS-to-CAS import (MR !911).
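For illustration, importing a whole directory by reference amounts
to recording its digest in the parent Directory message; none of the
file content needs to be present locally. A hypothetical helper (not
the actual MR !911 code) would be as small as this:

    from buildstream._protos.build.bazel.remote.execution.v2 import (
        remote_execution_pb2)

    def import_directory_by_reference(parent, name, subdir_digest):
        # parent is a remote_execution_pb2.Directory being built up;
        # the subtree's file blobs are never touched.
        node = parent.directories.add()
        node.name = name
        node.digest.CopyFrom(subdir_digest)

Re-serialising the parent then yields a new digest for it, and so on
up to the root of the merkle tree.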
After the dependencies are staged, we move on to the sources.
Currently this is still fairly clunky, as we are actually staging
sources on the filesystem and then importing them into our virtual
staging directory (element.py:_stage_sources_at). With SourceCache
this should become as efficient as staging dependencies for
non-modified elements.
Steps 2 through 11 all happen during _sandboxremote.py:run().

Steps 2-4 aren't currently implemented in this fashion; instead we
serially make a number of network RPCs. In _sandboxremote.py:run() a
call is made to cascache.push_directory(), which will push up any
missing directory nodes and any missing files.

In _sandboxremote:run_remote_command() we are using
cascache.push_message(), followed by cascache.verify_digest_pushed().
This results in a Write RPC followed by a FindMissingBlobs RPC, for
both the Command and the Action. In short, we could be eliminating a
couple of RPCs, and thus network round trips, here.
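A hedged sketch of what a batched version could look like, with
local_blobs standing in for the local CAS and the BatchUpdateBlobs
size limit ignored for brevity:

    from buildstream._protos.build.bazel.remote.execution.v2 import (
        remote_execution_pb2, remote_execution_pb2_grpc)

    def upload_missing(channel, local_blobs, digests):
        # One FindMissingBlobs for the Command, the Action and the
        # whole input tree, then one batched upload of the rest.
        cas = remote_execution_pb2_grpc.ContentAddressableStorageStub(
            channel)
        response = cas.FindMissingBlobs(
            remote_execution_pb2.FindMissingBlobsRequest(
                blob_digests=digests))
        update = remote_execution_pb2.BatchUpdateBlobsRequest()
        for digest in response.missing_blob_digests:
            request = update.requests.add()
            request.digest.CopyFrom(digest)
            # A KeyError here is the "partial CAS" error case below
            request.data = local_blobs[digest.hash]
        if update.requests:
            cas.BatchUpdateBlobs(update)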
I'll skip over steps 5-6 as these are not very interesting, although
it should be noted that _sandboxremote.py:run() is ignoring the build
logs from the execution response.
In step 7, which happens in _sandboxremote.py:process_job_output(),
we take a Tree digest that we received from the execution service and
use it in a call to cascache.pull_tree(). This will fetch all of the
file blobs in the tree that are not available locally. It will also
store all of the directory nodes that are referenced in the tree, and
return the root digest, which is used to construct the result virtual
directory of the sandbox.
In step 8 we go back to constructing a filesystem representation of
the artifact, instead of using a CAS-backed directory. This happens
in element.py:assemble() through a call to cascache.commit(), which
will do a local filesystem import of files, the majority of which we
exported in step 7. It will put an entry in the local ArtifactCache.
Step 9-11 happen during the push phase. Here we rely on
cascache.push() to ensure that the artifact is made available on the
remote CAS server.
Sidenote while we're here: apart from steps 9-11, we don't actually
make it clear to the scheduler which resources are needed. As far as
it is concerned, a remote build job currently takes up PROCESS
tokens.
If you made it all the way here, thank you :). I think we need to
eliminate the unneeded filesystem access first.

Then we can go further and support partial CAS by:
- erroring when FindMissingBlobs() calls return digests that we don't
actually have locally
- retrieving just the Tree, rather than all blobs, when we process
ActionResults (see the sketch below)
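The tree-only pull might look roughly like this; fetch_blob() and
store_blob() are hypothetical stand-ins for a ByteStream read and a
local CAS write:

    from buildstream._protos.build.bazel.remote.execution.v2 import (
        remote_execution_pb2)

    def pull_tree_only(fetch_blob, store_blob, tree_digest):
        tree = remote_execution_pb2.Tree()
        tree.ParseFromString(fetch_blob(tree_digest))
        # Store every Directory node; their digests are enough to
        # describe the artifact without the file content present.
        root_digest = store_blob(tree.root.SerializeToString())
        for child in tree.children:
            store_blob(child.SerializeToString())
        return root_digest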
Only when we actually want to use the artifact locally should we
fetch the actual file objects, for instance in the case of bst
[artifact] checkout, or bst shell. If we are not using the files,
there is no real point in downloading all this content, which takes
both time and disk space.
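On-demand fetching at that point could look something like the
following sketch, where have_blob() and store_blob() again stand in
for local CAS operations (batch size limits and recursion into
subdirectories are elided):

    from buildstream._protos.build.bazel.remote.execution.v2 import (
        remote_execution_pb2, remote_execution_pb2_grpc)

    def fetch_missing_files(channel, have_blob, store_blob, directory):
        # Pull only the file blobs this Directory needs and we lack
        wanted = [f.digest for f in directory.files
                  if not have_blob(f.digest)]
        if not wanted:
            return
        cas = remote_execution_pb2_grpc.ContentAddressableStorageStub(
            channel)
        response = cas.BatchReadBlobs(
            remote_execution_pb2.BatchReadBlobsRequest(digests=wanted))
        for entry in response.responses:
            store_blob(entry.digest, entry.data)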
Regardless of the current discussion on default behaviour and
whether we want finished artifacts to always be present on the `bst`
client machine, there clearly are some cases (build requirements) in
which we don't need to download artifacts to the local machine when
using remote execution. This is my understanding of the potential
optimisations:
In the simplest case, we could just check whether a build
requirement exists on the remote execution storage service first;
if it does, and we are using remote execution, we don't need to
download that artifact or source to the `bst` client machine. It's
quite probable that someone else sharing the remote execution
service will have uploaded that artifact already, or that a previous
run of our own will have done so.
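In REAPI terms that check is a single FindMissingBlobs call against
the remote execution CAS, along these lines (illustrative only):

    from buildstream._protos.build.bazel.remote.execution.v2 import (
        remote_execution_pb2, remote_execution_pb2_grpc)

    def remote_has_artifact(channel, artifact_digests):
        cas = remote_execution_pb2_grpc.ContentAddressableStorageStub(
            channel)
        response = cas.FindMissingBlobs(
            remote_execution_pb2.FindMissingBlobsRequest(
                blob_digests=artifact_digests))
        # Nothing missing: the build requirement never needs to
        # touch the bst client machine.
        return not response.missing_blob_digests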
The second case is that the remote execution service could be made
to find build requirements itself: if the remote service has access
to our artifact cache, or source cache when that appears, then it
can query those for missing artifacts or blobs before it answers its
own FindMissingBlobs request.
We can't require that the remote execution service does this, since
we have to use only the REAPI, nor can we explicitly tell the
service where to find artifacts. But if we are running our own
remote execution service, we can give it a list of remote storage to
query first. We also don't have any notion of addressing or location
for artifacts, AFAIK, so the remote service would have to query all
of its remotes.
Does that sound like a reasonable goal?
Jim