[Notes] [Git][BuildStream/buildstream][juerg/buffer-size] 10 commits: contributing: snakeviz replaces pyflame+flamegraph

Jürg Billeter pushed to branch juerg/buffer-size at BuildStream / buildstream

Commits:

9e3d3e05

by Angelos Evripiotis at 2019-02-11T07:14:50Z

contributing: snakeviz replaces pyflame+flamegraph

Replace the instructions for pyflame+flamegraph with simpler ones for
snakeviz. For our general use-case this seems to be easier and better.
Usage of this tool was demonstrated at the 2019 BuildStream Gathering in
January by Daniel Silverstone, when presenting the aggregate results of
profiling on many target environments.

Here is the relevant mailing list thread:

"Profiling before the gathering"
https://mail.gnome.org/archives/buildstream-list/2019-January/msg00057.html
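
For readers who have not used this workflow, the following is a minimal sketch of
producing a cProfile stats file of the kind snakeviz can open. The workload
`busy_work` and the file name `example.cprofile` are hypothetical stand-ins chosen
for illustration; they are not part of BuildStream.

```python
import cProfile
import os
import pstats
import tempfile


def busy_work(n):
    """Hypothetical workload standing in for a real command."""
    return sum(i * i for i in range(n))


def write_profile(path, n=10000):
    """Profile busy_work(n) and dump the raw cProfile stats to `path`."""
    profiler = cProfile.Profile()
    # runcall() enables the profiler only for the duration of the call.
    result = profiler.runcall(busy_work, n)
    profiler.dump_stats(path)
    return result


if __name__ == "__main__":
    # The file name is an assumption made for this example.
    out = os.path.join(tempfile.mkdtemp(), "example.cprofile")
    write_profile(out)
    # The same file can be inspected in the terminal with pstats.
    pstats.Stats(out).sort_stats("cumulative").print_stats(5)
    print("Now run: snakeviz", out)
```

The dumped file can then be passed to `snakeviz` on the command line, as in the
invocation shown in the CONTRIBUTING.rst change.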

bb6a692d

by Jürg Billeter at 2019-02-11T09:20:34Z

Merge branch 'snakeviz' into 'master'

contributing: snakeviz replaces pyflame+flamegraph

See merge request BuildStream/buildstream!1129

1ed63e54

by Angelos Evripiotis at 2019-02-11T09:24:48Z

_includes: better provenance on recursive include

Use the provenance of the include block, instead of the whole node.

02e48209

by Angelos Evripiotis at 2019-02-11T09:24:48Z

_includes: better error on missing include

Previously, a missing include would result in an error like this:

    Could not find file at not-a-file.include

Note that the file containing the include was not mentioned.

Now we get an error like this instead:

    element.bst [line 7 column 5]: Include block references a file that
    could not be found: 'not-a-file.include'.

4336e3bf

by Angelos Evripiotis at 2019-02-11T09:24:48Z

_includes: better error on including directory

Previously, including a directory resulted in an error like this:

    mydir is a directory. bst command expects a .bst file.

Note that the file containing the include was not mentioned.

Now we get an error like this instead:

    element.bst [line 12 column 0]: Include block references a
    directory instead of a file: 'mydir'.
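
The pattern behind these two commits, catching a low-level error and re-raising
it with the location of the include block prepended, can be sketched as follows.
This is a simplified illustration, not the BuildStream implementation: the
`LoadError` and `LoadErrorReason` classes below are minimal stand-ins that only
borrow the names used in the diff, and `load_file` and `load_include` are
hypothetical helpers.

```python
import enum
import os
import tempfile


class LoadErrorReason(enum.Enum):
    # Stand-in enum; only the member names mirror the diff.
    MISSING_FILE = 1
    LOADING_DIRECTORY = 2


class LoadError(Exception):
    # Stand-in exception carrying a machine-readable reason.
    def __init__(self, reason, message):
        super().__init__(message)
        self.reason = reason


def load_file(path):
    """Hypothetical low-level loader that knows nothing about includes."""
    if os.path.isdir(path):
        raise LoadError(LoadErrorReason.LOADING_DIRECTORY,
                        "{} is a directory.".format(path))
    if not os.path.exists(path):
        raise LoadError(LoadErrorReason.MISSING_FILE,
                        "Could not find file at {}".format(path))
    with open(path) as f:
        return f.read()


def load_include(path, provenance):
    """Load an include, adding the include block's provenance to errors."""
    try:
        return load_file(path)
    except LoadError as e:
        if e.reason == LoadErrorReason.MISSING_FILE:
            message = ("{}: Include block references a file that could not "
                       "be found: '{}'.".format(provenance, path))
        elif e.reason == LoadErrorReason.LOADING_DIRECTORY:
            message = ("{}: Include block references a directory instead "
                       "of a file: '{}'.".format(provenance, path))
        else:
            raise
        # "from e" keeps the original error available as __cause__.
        raise LoadError(e.reason, message) from e


if __name__ == "__main__":
    try:
        load_include(tempfile.mkdtemp(), "element.bst [line 12 column 0]")
    except LoadError as err:
        print(err)
```

Keeping the original reason on the new exception means callers that switch on
the reason continue to work, while the message now names the including file.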

3f6c5000

by Angelos Evripiotis at 2019-02-11T09:24:48Z

_includes: re-use file_path variable

Avoid an unnecessary call to os.path.join().

adde0c94

by Angelos Evripiotis at 2019-02-11T09:24:48Z

tests/format/include: remove unused tmpdir's

Don't create and remove temp dirs unnecessarily when they are not used.
It looks like these were just copy-pastes without intended side-effects.

a66f8379

by Jürg Billeter at 2019-02-11T13:52:54Z

Merge branch 'aevri/include-error' into 'master'

More user-friendly reporting on include errors

See merge request BuildStream/buildstream!891

aabdf1b9

by Jürg Billeter at 2019-02-11T14:31:39Z

utils.py: Increase buffer size in sha256sum()

Increasing the buffer size from 4 kB to 64 kB speeds up read() bandwidth by
a factor of 4, according to a very simple benchmark.

649736bc

by Jürg Billeter at 2019-02-11T14:31:39Z

_cas/cascache.py: Increase buffer size in add_object()

Increasing the buffer size from 4 kB to 64 kB speeds up read() bandwidth by
a factor of 4, according to a very simple benchmark.
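
The benchmark itself is not included in the commits. As a rough sketch of how
such a comparison might be made, the code below hashes a file in chunks with a
configurable buffer size, following the read loop shown in the diff, and times
plain chunked reads for both sizes. The `buffer_size` parameter, the `time_read`
helper and the 16 MiB test file are assumptions made for this example; measured
numbers depend heavily on the system and on page cache state, so none are
quoted here.

```python
import hashlib
import os
import tempfile
import time


def sha256sum(filename, buffer_size=65536):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(filename, "rb") as f:
        # iter() with a sentinel calls f.read() until it returns b"" (EOF).
        for chunk in iter(lambda: f.read(buffer_size), b""):
            h.update(chunk)
    return h.hexdigest()


def time_read(filename, buffer_size):
    """Return the seconds taken to read the whole file in chunks."""
    start = time.perf_counter()
    with open(filename, "rb") as f:
        for _ in iter(lambda: f.read(buffer_size), b""):
            pass
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(16 * 1024 * 1024))
    try:
        for size in (4096, 65536):
            elapsed = time_read(tmp.name, size)
            print("buffer {:>6} bytes: {:.4f} s".format(size, elapsed))
    finally:
        os.unlink(tmp.name)
```

The digest does not depend on the chunk size, so a change like this only
affects how many read() calls are made, not the result.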

Changes:

CONTRIBUTING.rst

@@ -1707,26 +1707,13 @@ You can then analyze the results interactively using the 'pstats' module:
  For more detailed documentation of cProfile and 'pstats', see:
  https://docs.python.org/3/library/profile.html.
 -For a richer visualisation of the callstack you can try `Pyflame
 -<https://github.com/uber/pyflame>`_. Once you have followed the instructions in
 -Pyflame's README to install the tool, you can profile `bst` commands as in the
 -following example:
 +For a richer and interactive visualisation of the `.cprofile` files, you can
 +try `snakeviz <http://jiffyclub.github.io/snakeviz/#interpreting-results>`_.
 +You can install it with `pip install snakeviz`. Here is an example invocation:
 -    pyflame --output bst.flame --trace bst --help
 -
 -You may see an `Unexpected ptrace(2) exception:` error. Note that the `bst`
 -operation will continue running in the background in this case, you will need
 -to wait for it to complete or kill it. Once this is done, rerun the above
 -command which appears to fix the issue.
 -
 -Once you have output from pyflame, you can use the ``flamegraph.pl`` script
 -from the `Flamegraph project <https://github.com/brendangregg/FlameGraph>`_
 -to generate an .svg image:
 -
 -    ./flamegraph.pl bst.flame > bst-flamegraph.svg
 -
 -The generated SVG file can then be viewed in your preferred web browser.
 +    snakeviz bst.cprofile
 +It will then start a webserver and launch a browser to the relevant page.
  Profiling specific parts of BuildStream with BST_PROFILE
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

buildstream/_cas/cascache.py

@@ -35,6 +35,8 @@ from .._exceptions import CASCacheError
  from .casremote import BlobNotFound, _CASBatchRead, _CASBatchUpdate
 +_BUFFER_SIZE = 65536
 +
  # A CASCache manages a CAS repository as specified in the Remote Execution API.
  #
@@ -371,7 +373,7 @@ class CASCache():
              with contextlib.ExitStack() as stack:
                  if path is not None and link_directly:
                      tmp = stack.enter_context(open(path, 'rb'))
 -                    for chunk in iter(lambda: tmp.read(4096), b""):
 +                    for chunk in iter(lambda: tmp.read(_BUFFER_SIZE), b""):
                          h.update(chunk)
                  else:
                      tmp = stack.enter_context(utils._tempnamedfile(dir=self.tmpdir))
@@ -380,7 +382,7 @@ class CASCache():
                      if path:
                          with open(path, 'rb') as f:
 -                            for chunk in iter(lambda: f.read(4096), b""):
 +                            for chunk in iter(lambda: f.read(_BUFFER_SIZE), b""):
                                  h.update(chunk)
                                  tmp.write(chunk)
                      else:

buildstream/_includes.py

@@ -40,19 +40,34 @@ class Includes:
              includes = [_yaml.node_get(node, str, '(@)')]
          else:
              includes = _yaml.node_get(node, list, '(@)', default_value=None)
 +
 +        include_provenance = None
          if '(@)' in node:
 +            include_provenance = _yaml.node_get_provenance(node, key='(@)')
              del node['(@)']
          if includes:
              for include in reversed(includes):
                  if only_local and ':' in include:
                      continue
 -                include_node, file_path, sub_loader = self._include_file(include,
 -                                                                         current_loader)
 +                try:
 +                    include_node, file_path, sub_loader = self._include_file(include,
 +                                                                             current_loader)
 +                except LoadError as e:
 +                    if e.reason == LoadErrorReason.MISSING_FILE:
 +                        message = "{}: Include block references a file that could not be found: '{}'.".format(
 +                            include_provenance, include)
 +                        raise LoadError(LoadErrorReason.MISSING_FILE, message) from e
 +                    elif e.reason == LoadErrorReason.LOADING_DIRECTORY:
 +                        message = "{}: Include block references a directory instead of a file: '{}'.".format(
 +                            include_provenance, include)
 +                        raise LoadError(LoadErrorReason.LOADING_DIRECTORY, message) from e
 +                    else:
 +                        raise
 +
                  if file_path in included:
 -                    provenance = _yaml.node_get_provenance(node)
                      raise LoadError(LoadErrorReason.RECURSIVE_INCLUDE,
 -                                    "{}: trying to recursively include {}". format(provenance,
 +                                    "{}: trying to recursively include {}". format(include_provenance,
                                                                                     file_path))
                  # Because the included node will be modified, we need
                  # to copy it so that we do not modify the toplevel
@@ -101,7 +116,7 @@ class Includes:
          file_path = os.path.join(directory, include)
          key = (current_loader, file_path)
          if key not in self._loaded:
 -            self._loaded[key] = _yaml.load(os.path.join(directory, include),
 +            self._loaded[key] = _yaml.load(file_path,
                                             shortname=shortname,
                                             project=project,
                                             copy_tree=self._copy_tree)

buildstream/utils.py

@@ -235,7 +235,7 @@ def sha256sum(filename):
      try:
          h = hashlib.sha256()
          with open(filename, "rb") as f:
 -            for chunk in iter(lambda: f.read(4096), b""):
 +            for chunk in iter(lambda: f.read(65536), b""):
                  h.update(chunk)
      except OSError as e:

tests/format/include.py

  import os
 +import textwrap
  import pytest
  from buildstream import _yaml
  from buildstream._exceptions import ErrorDomain, LoadErrorReason
@@ -27,6 +28,46 @@ def test_include_project_file(cli, datafiles):
      assert loaded['included'] == 'True'
 +def test_include_missing_file(cli, tmpdir):
 +    tmpdir.join('project.conf').write('{"name": "test"}')
 +    element = tmpdir.join('include_missing_file.bst')
 +
 +    # Normally we would use dicts and _yaml.dump to write such things, but here
 +    # we want to be sure of a stable line and column number.
 +    element.write(textwrap.dedent("""
 +        kind: manual
 +
 +        "(@)":
 +          - nosuch.yaml
 +    """).strip())
 +
 +    result = cli.run(project=str(tmpdir), args=['show', str(element.basename)])
 +    result.assert_main_error(ErrorDomain.LOAD, LoadErrorReason.MISSING_FILE)
 +    # Make sure the root cause provenance is in the output.
 +    assert 'line 4 column 2' in result.stderr
 +
 +
 +def test_include_dir(cli, tmpdir):
 +    tmpdir.join('project.conf').write('{"name": "test"}')
 +    tmpdir.mkdir('subdir')
 +    element = tmpdir.join('include_dir.bst')
 +
 +    # Normally we would use dicts and _yaml.dump to write such things, but here
 +    # we want to be sure of a stable line and column number.
 +    element.write(textwrap.dedent("""
 +        kind: manual
 +
 +        "(@)":
 +          - subdir/
 +    """).strip())
 +
 +    result = cli.run(project=str(tmpdir), args=['show', str(element.basename)])
 +    result.assert_main_error(
 +        ErrorDomain.LOAD, LoadErrorReason.LOADING_DIRECTORY)
 +    # Make sure the root cause provenance is in the output.
 +    assert 'line 4 column 2' in result.stderr
 +
 +
  @pytest.mark.datafiles(DATA_DIR)
  def test_include_junction_file(cli, tmpdir, datafiles):
      project = os.path.join(str(datafiles), 'junction')
@@ -47,7 +88,7 @@ def test_include_junction_file(cli, tmpdir, datafiles):
  @pytest.mark.datafiles(DATA_DIR)
 -def test_include_junction_options(cli, tmpdir, datafiles):
 +def test_include_junction_options(cli, datafiles):
      project = os.path.join(str(datafiles), 'options')
      result = cli.run(project=project, args=[
@@ -128,7 +169,7 @@ def test_junction_element_not_partial_project_file(cli, tmpdir, datafiles):
  @pytest.mark.datafiles(DATA_DIR)
 -def test_include_element_overrides(cli, tmpdir, datafiles):
 +def test_include_element_overrides(cli, datafiles):
      project = os.path.join(str(datafiles), 'overrides')
      result = cli.run(project=project, args=[
@@ -143,7 +184,7 @@ def test_include_element_overrides(cli, tmpdir, datafiles):
  @pytest.mark.datafiles(DATA_DIR)
 -def test_include_element_overrides_composition(cli, tmpdir, datafiles):
 +def test_include_element_overrides_composition(cli, datafiles):
      project = os.path.join(str(datafiles), 'overrides')
      result = cli.run(project=project, args=[
@@ -158,7 +199,7 @@ def test_include_element_overrides_composition(cli, tmpdir, datafiles):
  @pytest.mark.datafiles(DATA_DIR)
 -def test_include_element_overrides_sub_include(cli, tmpdir, datafiles):
 +def test_include_element_overrides_sub_include(cli, datafiles):
      project = os.path.join(str(datafiles), 'sub-include')
      result = cli.run(project=project, args=[
@@ -192,7 +233,7 @@ def test_junction_do_not_use_included_overrides(cli, tmpdir, datafiles):
  @pytest.mark.datafiles(DATA_DIR)
 -def test_conditional_in_fragment(cli, tmpdir, datafiles):
 +def test_conditional_in_fragment(cli, datafiles):
      project = os.path.join(str(datafiles), 'conditional')
      result = cli.run(project=project, args=[
@@ -222,7 +263,7 @@ def test_inner(cli, datafiles):
  @pytest.mark.datafiles(DATA_DIR)
 -def test_recusive_include(cli, tmpdir, datafiles):
 +def test_recursive_include(cli, datafiles):
      project = os.path.join(str(datafiles), 'recursive')
      result = cli.run(project=project, args=[
@@ -231,6 +272,7 @@ def test_recusive_include(cli, tmpdir, datafiles):
          '--format', '%{vars}',
          'element.bst'])
      result.assert_main_error(ErrorDomain.LOAD, LoadErrorReason.RECURSIVE_INCLUDE)
 +    assert 'line 2 column 2' in result.stderr
  @pytest.mark.datafiles(DATA_DIR)
