[BuildStream] Jupyter notebooks for performance analysis (later)



Hello list!

After some verbal conversations with folks recently, we thought it would be a
good idea to raise awareness of Jupyter notebooks on the list.

I think we could use Jupyter notebooks to great effect when later approaching
the problem of analysing BuildStream performance and visualizing the results.

If you're not familiar with them, they are a way to combine code, text, and
graphics in a web interface. They are like an executable 'notebook', which you
can share with people. They are repeatable, since they are executable, and lend
themselves to quickly remixing other people's analyses.

Here is an example of a typical notebook, from the marvellous Julia Evans:
https://nbviewer.jupyter.org/github/jvns/talks/blob/master/2013-04-mtlpy/pistes-cyclables.ipynb

I imagine we would do something like:

o Have a repository on GitLab to contain the notebooks and any helper scripts.
o Have a cloud machine which rebuilds our notebooks when there's a change,
  perhaps a free Heroku worker. The notebooks would look something like:
    o Shell out to 'wget' to download some CSV data from the performance run.
    o Use the 'pandas' Python module to load CSV/JSON/db etc as a 'dataframe'.
    o Ask the dataframe to calculate various views on the data, plotting them
       on the page with matplotlib.
    o Save the notebook and push to GitLab.
    o Send alerts if appropriate (raise GitLab issues?).
o View the latest version when we want with http://nbviewer.jupyter.org
o Play with the latest version online with https://mybinder.org/
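
To make the notebook steps above a bit more concrete, here is a rough sketch
of what the load / summarise / plot cells might look like. It is only an
illustration: the column names (commit, test, seconds) and the inline sample
data are made up, and a real notebook would read whatever file the
performance run actually produces instead of the embedded string.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen, so this also works on a headless worker
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical benchmark output, embedded so the sketch is self-contained.
# A real notebook would download this from the performance run instead.
CSV_DATA = """commit,test,seconds
abc123,build,12.0
abc123,build,14.0
abc123,show,1.0
def456,build,10.0
def456,build,11.0
def456,show,1.5
"""


def load_results(source):
    """Load benchmark results (path, URL or file-like object) as a dataframe."""
    return pd.read_csv(source)


def mean_seconds(df):
    """Mean seconds per test and commit: one row per test, one column per commit."""
    return df.pivot_table(
        index="test", columns="commit", values="seconds", aggfunc="mean"
    )


def plot_summary(summary):
    """Draw a grouped bar chart of the summary table and return the figure."""
    ax = summary.plot(kind="bar")
    ax.set_ylabel("mean seconds")
    return ax.get_figure()


results = load_results(io.StringIO(CSV_DATA))
summary = mean_seconds(results)
figure = plot_summary(summary)

# In a notebook the figure would simply be displayed inline; here it is
# rendered to an in-memory PNG to show that plotting works without a display.
png_buffer = io.BytesIO()
figure.savefig(png_buffer, format="png")
plt.close(figure)
```

In a notebook, each of those steps would typically live in its own cell, with
the 'summary' table and the chart shown inline underneath.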

Food for thought!

Cheers,
Angelos

P.S.

Here is Julia Evans' post with most of this information:
https://jvns.ca/blog/2017/11/12/binder--an-awesome-tool-for-hosting-jupyter-notebooks/

P.P.S.

Notebooks have become something of a big deal in the machine learning space.
If you're looking to see what all the fuss is about with ML and Jupyter, I'd
recommend checking this out: https://course.fast.ai/lessons/lesson1.html

