Re: [BuildStream] Scheduler optimization
- From: Jonathan Maw <jonathan maw codethink co uk>
- To: buildstream-list gnome org
- Subject: Re: [BuildStream] Scheduler optimization
- Date: Thu, 28 Mar 2019 16:39:24 +0000
On 2019-03-28 09:30, Benjamin Schubert via buildstream-list wrote:
Hey everyone,
TLDR: I want to refactor the queues and scheduler from their current
model to a push based model.
I would need feeback on the approach to see if I'm not missing
something.
I would also need to know if there is any objections to my plan since
this would be a
significant refactoring effort.
I'll explain why and how in the rest of the email
---
I've been looking at the scheduler and the queues and from what I can
see in
some benchmarks and profiles, they are becoming an important
bottleneck,
especially with a large number of workers.
A few issues have been raised over time about that:
- https://gitlab.com/BuildStream/buildstream/issues/824
- https://gitlab.com/BuildStream/buildstream/issues/943
Currently, the queues go over every element until they find enough
elements to
fill up all resources. While this works fine with few workers, since
the
probability of finding a job that is ready early in the queue is quite
high,
the performance degrades when more workers are added, as the
probability to have
many jobs ready early decreases.
The problem becomes even worse when we are using remote execution with
many
workers.
I therefore think we should be pushing items in the queue as soon as
they are
needed.
This is in details what I think should be done:
1) Don't store a list of all elements in each queue.
- Currently, every queue contains everything, and we need to go
over it
everytime.
2) Remove Queue.status() method, allowing to check if an element is
ready
- Currently, this method is called on every element of the queue,
in sequence,
until the queue has enough elements to use all the resources.
3) When setting an element for tracking, put it in the `track` queue.
4) When calling "set_required" on element, put the element in the
fetch_queue.
- Currently, what happens is that the element is already in the
queue, and when
"set_required" is called, a boolean is toggled, making the
status of the element
go from WAITING to READY, and allows the queue to pick it.
5) When an element is put in the build queue, if all its dependencies
are not ready,
add the element to a list in every dependency that is not ready.
Whenever an
element becomes done in this queue, go over this list and for every
ready
element put it in the queue.
- Currently, what happens is that the element is already in the
queue, and whenever
it's status is checked, it will check if all its dependencies
are available before
saying that it is ready.
All other queues should just get the elements from the previous queue
pushed
to them when ready, as they don't have dependencies on other elements.
One concern that was raised in the past was that we would lose the
ordering
of the build and that would decrease performance.
We could prevent this by using a priority queue and sort elements when
adding
them. We would therefore still keep the stable ordering we have now.
Does this plan seem sensible? Am I missing something? Any feedback is
welcome!
I plan to start working on this next week unless anyone has any
objections to the
plan.
Cheers,
Benjamin
Hi Benjamin,
The proposal looks good to me, though I'm wondering about how you're
adding the elements to the BuildQueue.
If I understand you correctly, you intend for every element that's done
being fetched to be added to the BuildQueue.
Currently, the fetch queue has an option to skip all cached elements,
which adds the element to the skipped_elements list for the frontend to
read. AIUI here you'd have to also directly add the element to the
BuildQueue.
Is there a particular reason you prefer this way of doing it to
deferring the logic of adding elements to the BuildQueue to
Element._schedule_assembly()?
Cheers,
Jonathan
--
Jonathan Maw, Software Engineer, Codethink Ltd.
Codethink privacy policy: https://www.codethink.co.uk/privacy.html
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]