On 01/04/2019 16:28, James Ennis via buildstream-list wrote:
Hello,
This week's performance update will be more verbose than previous
posts, as I'd like to take the time to talk about the changes we've
made to the project we use to generate the benchmark data, and to
mention some of the interesting trends and observations.
So, we have made a tweak to the Debian-like project [0]. The master
branch of this repo has remained unchanged: ~75,000 *local* import
elements whose dependency relationships reflect those of the source
and binary packages in Debian stretch. However, there is now a
branch, "jennis/use_remote_file" [1], which uses "remote" imports
(with file:// URLs) of uniquely generated files (each 4.0K in size
and located in the project's files/ directory). These files are
generated by the ./prepare_project.sh script on that branch.
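To make that concrete, here is a hypothetical sketch of what that
generation step might look like. This is not the actual
prepare_project.sh; the file names, directory layout and payload
generation are assumptions, but it shows the shape of an import
element using a "remote" source with a file:// URL:

    #!/usr/bin/env python3
    # Hypothetical sketch: generate a unique 4.0K file and a matching
    # "remote" import element. Names and layout are assumptions.
    import hashlib
    import os

    def make_element(index):
        os.makedirs("files", exist_ok=True)
        os.makedirs("elements", exist_ok=True)

        data = os.urandom(4096)                   # unique 4.0K payload
        file_path = os.path.abspath("files/file-%05d.dat" % index)
        with open(file_path, "wb") as f:
            f.write(data)

        # A "remote" source takes a url and the file's sha256 as its ref
        element = (
            "kind: import\n"
            "sources:\n"
            "- kind: remote\n"
            "  url: file://%s\n"
            "  ref: %s\n" % (file_path, hashlib.sha256(data).hexdigest())
        )
        with open("elements/import-%05d.bst" % index, "w") as f:
            f.write(element)

    for i in range(10):   # the real branch generates one per import element
        make_element(i)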
Note that we are still testing a subset of this project:
base-files/base-files.bst is the target, which produces a pipeline
of ~6000 elements.
We back-generated all the data for an altered project because
Raoul's source-cache MR (!1124) [2] reminded us that local imports
were not being fetched and staged into the local cache (pre-!1124),
so we saw a spike in build times when that landed. Therefore, in
order to more accurately simulate a 'real-world' project, we altered
the Debian-like project to use 'remote' file imports.
In addition to this, we are now also measuring the build times and
peak memory usage for 4, 8 and 12 builders.
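(For anyone wanting to reproduce a single data point locally,
something along these lines should approximate what the CI job
records. This is a rough sketch rather than a description of the
benchmarking scripts; the --builders option and the target are
assumptions here:)

    #!/usr/bin/env python3
    # Rough sketch: wall-clock build time and peak RSS of the bst child
    # processes, approximating (not reproducing) the benchmark CI job.
    import resource
    import subprocess
    import time

    def run_build(builders):
        start = time.monotonic()
        subprocess.run(
            ["bst", "--builders", str(builders),
             "build", "base-files/base-files.bst"],
            check=True,
        )
        elapsed = time.monotonic() - start
        # On Linux, ru_maxrss is reported in kilobytes
        peak_mb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss / 1024
        return elapsed, peak_mb

    # Note: you would want to clear the artifact cache between runs and
    # run each builder count in a fresh process, since ru_maxrss here
    # reflects the maximum over all children seen so far.
    for n in (4, 8, 12):
        secs, mem = run_build(n)
        print("%2d builders: %.2fs, %.0fM peak" % (n, secs, mem))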
-----
All results were obtained on the same hardware (a Codethink dev
machine configured to be a gitlab runner - not being used for
anything else), and the results files are being appended to by a
CI job. Specs as follows:
* Linux (Debian stable)
* x86_64
* 16GB RAM
* 500GB SSD
* Intel i7-3770
* 8 cores @ 3.40 GHz
-----
Now, bearing in mind that all the results have been completely
regenerated, here are the times and max-rss (peak memory usage)
results for a `bst show` of base-files/base-files.bst in the
Debian-like project (jennis/use_remote_file branch) [1]. (Note that
I've set the column width of the table to 72 in the hope that this
will wrap nicely for everyone.)
┌───────────┬─────────────────┬─────────────────┬─────────────────┐
│           │      Show       │ YAML cache show │ Show once built │
│   Dates   ├───────┬─────────┼───────┬─────────┼───────┬─────────┤
│           │Time(s)│Memory(M)│Time(s)│Memory(M)│Time(s)│Memory(M)│
├───────────┼───────┼─────────┼───────┼─────────┼───────┼─────────┤
│01/14-01/20│  49.23│     1089│  27.17│     1115│  32.82│     1117│
│01/21-01/27│  48.85│     1089│  27.35│     1115│  32.84│     1117│
│01/28-02/03│  49.50│     1088│  27.17│     1115│  32.76│     1117│
│02/04-02/10│  48.08│      561│  26.68│      588│  31.90│      589│
│02/11-02/17│  45.46│      408│  23.75│      435│  29.50│      437│
│02/18-02/24│  43.17│      417│  21.65│      443│  27.28│      445│
│02/25-03/03│  41.75│      493│  19.43│      519│  25.69│      625│
│03/04-03/10│  40.96│      517│  18.81│      543│  25.20│      696│
│03/11-03/17│  40.44│      519│  18.06│      545│  24.50│      698│
│03/18-03/24│  38.87│      521│  16.74│      547│  23.19│      696│
│03/25-03/31│  22.63│      422│  13.40│      433│  17.36│      563│
└───────────┴───────┴─────────┴───────┴─────────┴───────┴─────────┘
# Interesting observations
* On the whole, times and peak memory usage are decreasing.
* Our peak memory has not been decreasing consistently, though. This
is because we have made a couple of time vs memory trade-offs,
favouring a reduction in `show` time. The most notable culprit here
is !1164 ("Altering the behavior of the garbage collector - in favor
of a faster show").
  - This is very noticeable in the 'all_show_memory_results.png'
attachment: there is a spike in memory near the beginning of March.
  - This corresponds to a dip in the time taken in the
'all_show_times_results.png' attachment.
* Please do not be alarmed that the most recent `show` time is much
higher than the most recent `show with YAML cache` time, even though
we no longer support the YAML cache: YAML New World Order (!1257)
[3] only landed mid-week last week, so the average has been
calculated with results from pre-!1257.
* Our peak memory usage for `bst show` once the artifact cache is
hot is much higher than for all other shows.
-----
Here are the times and max-rss results for a `bst build` of
base-files/base-files.bst in the Debian-like project [1]:
┌───────────┬─────────────────┬─────────────────┬─────────────────┐
│           │   4 Builders    │   8 Builders    │   12 Builders   │
│   Date    ├───────┬─────────┼───────┬─────────┼───────┬─────────┤
│           │Time(s)│Memory(M)│Time(s)│Memory(M)│Time(s)│Memory(M)│
├───────────┼───────┼─────────┼───────┼─────────┼───────┼─────────┤
│01/14-01/20│1383.67│     1119│1078.76│     1119│ 735.41│     1119│
│01/21-01/27│ 561.12│     1117│ 602.20│     1117│ 680.10│     1117│
│01/28-02/03│  560.7│     1117│ 601.57│     1117│ 678.03│     1117│
│02/04-02/10│ 373.29│      590│ 425.09│      589│ 500.71│      590│
│02/11-02/17│ 308.75│      436│ 357.69│      436│ 412.90│      436│
│02/18-02/24│ 292.78│      445│ 350.70│      445│ 398.30│      445│
│02/25-03/03│ 342.28│      521│ 387.28│      521│ 436.51│      521│
│03/04-03/10│ 354.64│      544│ 400.00│      544│ 441.61│      544│
│03/11-03/17│ 352.95│      546│ 395.50│      546│ 442.52│      546│
│03/18-03/24│ 348.53│      549│ 383.93│      549│ 431.18│      549│
│03/25-03/31│ 286.67│      435│ 302.77│      435│ 332.87│      435│
└───────────┴───────┴─────────┴───────┴─────────┴───────┴─────────┘
# Interesting observations
* Both peak memory usage and time have decreased dramatically since
mid-January.
* From late February/early March, our build times increased again by
~60 seconds and our peak memory by ~100M. Upon investigation, we
found that !1164 ("Altering the behavior of the garbage collector -
in favor of a faster show") was responsible for this.
  - There is an obvious spike which shows this in the
'all_build_time_results.png' attachment.
  - This also has a corresponding spike in memory usage; see the
'all_build_memory_results.png' attachment.
  - So a discussion needs to be had about whether this change is
worth reverting.
* The most notable observation concerns mid-January, when our build
times were very high but 4 builders were slower than 8 builders,
which were slower than 12 builders, as one might expect. Then *all*
of the build times were heavily reduced by Tristan's change to the
scheduler: !1067 - Fix element processing order [4]. Now, rather
interestingly, all times are significantly lower, but 12 builders
are consistently slower than 8 builders, which are consistently
slower than 4 builders. So the behaviour has flipped, and is perhaps
now behaving counter-intuitively.
  - See the 'all_build_time_results.png' attachment for clarity.
  - This suggests that if I tried 24 builders, for example, this
patch would likely have made my build slower.
Given that this is an import project, you would expect that if the
scheduler got very fast then the project's build time would
eventually be bound by IO, and so adding more workers would just
slow the project down as more and more IO tasks fight each other.
That being said, do we have an estimate of the number of copy
operations this project has to perform and of how long they take, so
that we can use this as an absolute limit on the performance of this
project? I think that would give us much more interesting context
for the numbers.
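(A crude way to get that floor, independent of BuildStream, would be
to time the raw copies directly; a hypothetical sketch, with the
paths and the use of shutil assumed:)

    #!/usr/bin/env python3
    # Crude sketch: time copying every generated 4.0K file once, with
    # no scheduler or cache involved, to get a rough IO floor to
    # compare against the build times. Paths are assumptions.
    import glob
    import os
    import shutil
    import tempfile
    import time

    files = glob.glob("files/*")        # the project's generated files
    with tempfile.TemporaryDirectory() as dest:
        start = time.monotonic()
        for path in files:
            shutil.copy(path, os.path.join(dest, os.path.basename(path)))
        elapsed = time.monotonic() - start

    print("%d copies in %.2fs (%.2f ms per file)"
          % (len(files), elapsed, 1000 * elapsed / max(len(files), 1)))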
Another important factor to consider is that we have been focusing
on the scheduler, and rightly so. But do we have a measure of how
much time the scheduler takes vs the elements? Or indeed, does
adding workers affect the build time per element? I presume snakeviz
can tell us this, or does it get confused by multiple cores? On your
8-core box I would expect 4 builders to be faster than 8 only if the
task were not IO bound, and I would expect the 12-builder runs to be
slower as the builders fight for resources. However, the big deltas
suggest the scheduler is to blame rather than the workers. Given
that we have tools in bst-benchmarking, like
https://gitlab.com/BuildStream/benchmarks/merge_requests/14, that
could very easily be tweaked to give the min, max and average
element build time, it seems strange not to calculate this
information when it would be so easy to do so. If anything it may
well highlight how badly the scheduler performs, but it might also
give some interesting, less expected results that would be worth
knowing.
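(Once per-element durations are extracted by whatever means - the
benchmark tooling above, log timestamps, or a profiler - summarising
them is straightforward. A hypothetical sketch, with the extraction
itself assumed rather than shown:)

    # Hypothetical sketch: summarise per-element build durations
    # (seconds) to help separate element work from scheduler overhead.
    # Comparing sum(durations) / builders with the measured wall-clock
    # build time hints at how much time is spent outside the elements.
    from statistics import mean, median

    def summarise(durations, builders, wall_clock):
        total = sum(durations)
        return {
            "elements": len(durations),
            "min": min(durations),
            "max": max(durations),
            "mean": mean(durations),
            "median": median(durations),
            "ideal_parallel_time": total / builders,
            "overhead_estimate": wall_clock - total / builders,
        }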
For these very large jobs, where the only chance of building them in
a realistic time (if they had realistic build times) would be to use
remote execution (RE), would we be better off benchmarking builds
with a sleep but many, many workers? That way we are not confused by
the IO performance, and the total time should be dominated by (sleep
time) * (elements in critical path), which will be more
deterministic. Would this give us a much more real-world test case?
I would think this would highlight the inadequacies of the scheduler
even more starkly, or not.
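(That '(sleep time) * (elements in critical path)' floor is easy to
compute up front if the dependency graph is available; a minimal
sketch, with the graph format assumed:)

    # Minimal sketch: lower bound for a sleep-based build, assuming
    # every element takes SLEEP seconds and there are effectively
    # unlimited workers, so only the longest dependency chain matters.
    # `deps` maps element name -> list of dependency names (assumed).
    from functools import lru_cache

    SLEEP = 1.0

    deps = {
        "base.bst": [],
        "lib.bst": ["base.bst"],
        "app.bst": ["lib.bst", "base.bst"],
    }

    @lru_cache(maxsize=None)
    def chain_length(element):
        # Longest chain ending at `element`, counting the element itself
        return 1 + max((chain_length(d) for d in deps[element]), default=0)

    critical_path = max(chain_length(e) for e in deps)
    print("lower bound: %.1fs" % (critical_path * SLEEP))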
-----
As usual, the notebook displaying the results for last week's data
has been attached. Please see my previous email [5] if you're
unable to open this. If you're comparing this to previous weeks,
just bear in mind that the data has been regenerated. I've also
attached a compressed tarball that contains all of the results.
This might be of more interest.
I hope that you found this brief overview of these benchmarks
helpful.
It's great to see that, in general, we seem to be heavily
improving BuildStream's performance!
Thanks,
James
[0] https://gitlab.com/jennis/debian-stretch-bst
[1] https://gitlab.com/jennis/debian-stretch-bst/tree/jennis/use_remote_file
[2] https://gitlab.com/BuildStream/buildstream/merge_requests/1124
[3] https://gitlab.com/BuildStream/buildstream/merge_requests/1257
[4] https://gitlab.com/BuildStream/buildstream/merge_requests/1067
[5] https://mail.gnome.org/archives/buildstream-list/2019-February/msg00046.html