Re: [BuildStream] Proposal: Moving UI into a subprocess
- From: Phil Dawson <phil dawson codethink com>
- To: Tristan Van Berkom <tristan vanberkom codethink co uk>, Benjamin Schubert <contact benschubert me>
- Cc: buildstream-list gnome org
- Subject: Re: [BuildStream] Proposal: Moving UI into a subprocess
- Date: Fri, 17 May 2019 17:15:34 +0100
Thanks both for the responses. I've replied inline.
On 17/05/2019 13:59, Benjamin Schubert wrote:
<snip>
------- Original Message -------
On Friday, 17 May 2019 10:31, Tristan Van Berkom via buildstream-list <buildstream-list gnome org> wrote:
<snip>
Let's determine if this is a problem
If you are raising this, I presume it might already be a problem
indeed, but please can we have some details?
[...]
> In Daniel's latest batch of profile analysis [1]. He noted that UI
> rendering was consuming around 50% of the straight line cpu time of the
> parent `bst` process. As such, it may make sense to delegate the UI
> rendering to a subprocess to avoid blocking the scheduling of useful
> work while rendering UI events.
I think this is a non sequitur: it is not necessarily true that:
* If the main process is spending half of its time in logging...
* It is a problem and we should fix it.
I can add more data here. I'm using BuildStream in docker through bash.exe
on a Windows machine. The IO and screen updates are slow, which means that
when I run bst, 40% of the time is spent writing to the screen. I usually
end up running something like:
`bst command 2>&1 | tee`
in order to get a quicker BuildStream run.
Doing so makes a difference to my build times that I can clearly see.
Both the problem in Daniel's profiles and mine would be alleviated by moving
the UI into its own process.
We need to know that this is in fact a problem before going to such
extremes as proposed in this mail.
Spending half of the main process's work on logging doesn't mean that
we are bottlenecking on logging; it only means that half of the work
that is being done is being done in logging.
Consider that we are currently improving how cache keys get calculated:
there is a plan in place to push resolved keys onto the reverse
dependencies so that cache key calculation becomes cumulative, with the
goal of eliminating the need to ever call Element.dependencies() in
cache key resolution algorithms entirely, making the algorithm nearly
linear if possible.
The result of this will mean that:
* The remaining logging being done in the main process will be more
than 50% of the work.
* The logging will be *less* of a bottleneck than it was before, even
if it constitutes more than half of the work being done, and even
without improving logging performance.
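As an illustration of that cumulative scheme (the names and data structures here are mine, not BuildStream's actual implementation), resolving keys in dependency order and pushing each resolved key forward onto reverse dependencies means no element ever needs to walk its dependency tree:

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class Element:
    # Hypothetical minimal element; BuildStream's real Element differs.
    name: str
    own_data: str
    reverse_deps: list = field(default_factory=list)
    collected_dep_keys: list = field(default_factory=list)
    cache_key: str = ""

def resolve_keys(elements):
    # `elements` is assumed to be topologically sorted, dependencies first.
    for element in elements:
        h = hashlib.sha256(element.own_data.encode())
        # Dependency keys were pushed here when *they* were resolved,
        # so this is a flat loop, not a recursive dependency walk.
        for dep_key in sorted(element.collected_dep_keys):
            h.update(dep_key.encode())
        element.cache_key = h.hexdigest()
        # Push our resolved key forward onto reverse dependencies.
        for rdep in element.reverse_deps:
            rdep.collected_dep_keys.append(element.cache_key)
```

With this shape the total work is linear in the number of edges in the dependency graph, rather than quadratic in the number of elements.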
Let's try some less aggressive measures first
When running BuildStream, the main process uses 100% of a processor, so
there is a real bottleneck here. The scheduler shouldn't need that much;
I'd be much happier if the scheduler could use 100% of its own process,
with the UI updated separately.
I also think that when we add Remote Execution to the picture, with many
more processes to orchestrate, we will still have this bottleneck, even
with all the other optimisations.
Thanks Benjamin for providing this extra detail.
Tristan, is this enough to convince you that having the UI in a separate
process to the scheduler is worth pursuing (implementation details aside)?
I already replied with some hints on how we can improve logging
performance here[0], I'm certain we can do better without resorting to
splitting up roles and responsibilities into multiple processes.
Thanks for those, I indeed think they should be implemented. I however don't
think this is enough.
+1 here.
First, consider - is it really that important that we update the status
bar and spit out log lines at a sub-second frequency ?
- A session with a lot of ongoing tasks naturally has a more
expensive status bar to update (more tasks in the status bar)
- Similarly, in a session with a lot of ongoing tasks we get a lot
of start/stop messages plus a lot of status messages; often we see
many messages appear in the same second.
Compounded together, the logging is of course going to get much more
expensive with a lot of ongoing tasks.
Instead of unconditionally updating the UI at every incoming message,
we can limit the frequency at which we update the terminal whilst
preserving the integrity and ordering of incoming messages.
Agreed, though this doesn't fix anything on its own; when I use
BuildStream, my UI already doesn't update more than once a second.
When the scheduler is already running, we already have a ticker which
we use to unconditionally update the status bar, that would be a good
time IMO to:
- Render all of the messages received since the last "tick"
- Update the status bar only once at the end
This would have the additional benefit of reducing a lot of the "flicker"
we sometimes experience when updating the UI too often while processing
a lot of messages (at those times, we definitely update the status bar
far more than is needed).
One thing to keep in mind with this approach is that we should still
handle error messages immediately upon receiving them, at least in the
case where we know they will result in pausing all ongoing tasks and
launching an interactive prompt.
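The tick-based batching described above could look roughly like this (a minimal sketch with illustrative names, not BuildStream's actual frontend code):

```python
import time

class BatchedRenderer:
    """Queue incoming messages and render them once per tick, updating
    the status bar only once per flush. Error messages bypass the timer
    so an interactive prompt can appear immediately."""

    def __init__(self, interval=1.0):
        self.interval = interval          # seconds between renders
        self.pending = []                 # messages since the last tick
        self.last_render = time.monotonic()

    def message(self, msg, is_error=False):
        self.pending.append(msg)
        # Flush immediately on errors, otherwise only when a full
        # interval has elapsed since the last render.
        if is_error or time.monotonic() - self.last_render >= self.interval:
            self.flush()

    def flush(self):
        # Render all messages received since the last tick, in order,
        # preserving the integrity of the log.
        for msg in self.pending:
            print(msg)
        self.pending.clear()
        self.render_status_bar()          # exactly once per flush
        self.last_render = time.monotonic()

    def render_status_bar(self):
        pass  # placeholder: redraw the status bar here
```

In the real scheduler, `flush()` would be driven by the existing ticker rather than checked inside `message()`, but the batching and single status-bar update per tick are the same idea.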
I think we should definitely improve this area regardless of other more
extreme measures we might take as it will improve the UI experience and
reduce overall work anyway, and then we should revisit my point above
and reevaluate again whether it is still a bottleneck.
Split up work in multiple processes
If:
* We have reduced the work in calculating state, allowing the main
process to breathe easier
* Reduced the redundant work we do in logging by logging on a fixed
timer rather than immediately with every incoming message, further
allowing the main process to breathe easier
* It is *still* demonstrably a bottleneck
Then we should consider splitting up the work into separate processes.
I don't think those are mutually exclusive and do think both are needed.
Again, thanks for the detailed suggestions. I agree that these would be
worth doing in addition to having the UI in a separate process.
However, the way I see it, putting logging into a subprocess is just
about the worst way I can imagine of going about this - running
`Scheduler.run()` in a subprocess, separated from frontend activities,
would make a lot more sense.
Completely agree here
I agree that this does seem like a better abstraction. Thanks for
calling me out here :)
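For the shape of that abstraction, one possibility is to keep the frontend in the parent process and have the scheduler subprocess report events over a queue. This is a hypothetical sketch (function names and the event format are mine, not BuildStream's API):

```python
import multiprocessing as mp

def scheduler_main(queue):
    # Stand-in for Scheduler.run(): performs the real work in a
    # subprocess and reports events back to the frontend.
    for i in range(3):
        queue.put(("log", f"built element {i}"))
    queue.put(("done", None))

def frontend_main():
    # The frontend stays in the parent process and only renders, so
    # slow terminal IO never blocks the scheduler.
    queue = mp.Queue()
    worker = mp.Process(target=scheduler_main, args=(queue,))
    worker.start()
    rendered = []
    while True:
        kind, payload = queue.get()
        if kind == "done":
            break
        rendered.append(payload)  # a real frontend would render here
    worker.join()
    return rendered
```

Interactive prompts complicate this picture: the frontend would also need a channel back to the scheduler to pause and resume ongoing tasks.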
<snip>
One thing to point out is that no matter how we slice things, we are
still going to have the issue of synchronizing element state across
processes, as state mutates over the course of a session, and the
logger needs to display state.
I would hope that the worker process could eventually give all the necessary
information to the frontend process; that would remove any problems arising
from this and avoid a lot of state synchronisation.
As far as state synchronisation goes, as far as I understand it, the
following bits of state are needed by the front-end:
1) The element lookup table. This will be populated after load time and
won't change during a scheduler run, so I don't think we'd need any
synchronisation here.
2) Element names. Again, these are available after load time and won't
change during a scheduler run, so I don't think we'd need to communicate
these back to the main process.
3) Element cache keys. I'm not sure if the cache keys as displayed by
the front-end can change during a scheduler run. I feel like they
probably do, but I'm not sure.
4) The status of the scheduler queues to be displayed on the status bar.
5) Cache usage.
Am I missing anything here?
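For items 3-5, which do mutate during a run, a small delta message from the scheduler process would probably suffice; items 1 and 2 are fixed after loading and never need to cross the process boundary. A hypothetical sketch (field names and message format are mine):

```python
import json

def make_state_update(changed_keys, queue_status, cache_usage):
    # Serialise only the mutable state: cache key deltas, queue
    # status for the status bar, and current cache usage.
    return json.dumps({
        "cache_keys": changed_keys,   # {element_name: new_key}, deltas only
        "queues": queue_status,       # e.g. {"build": {"active": 2, "queued": 5}}
        "cache_usage": cache_usage,   # bytes used in the artifact cache
    })

def apply_state_update(frontend_state, message):
    # The frontend merges key deltas and overwrites the snapshot fields.
    update = json.loads(message)
    frontend_state["cache_keys"].update(update["cache_keys"])
    frontend_state["queues"] = update["queues"]
    frontend_state["cache_usage"] = update["cache_usage"]
```

Sending only deltas for cache keys keeps the messages small even in a session with thousands of elements, since most keys do not change between ticks.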
Thanks again for taking the time to respond.
Cheers,
Phil
--
Phil Dawson, Software Engineer Codethink Ltd
https://www.codethink.co.uk/
We respect your privacy. See https://www.codethink.co.uk/privacy.html