On Fri, 2017-09-15 at 15:53 +0100, Sam Thursfield wrote:
Hi Tristan,

In general I agree with your premises, and I think the proposal is workable. I don't have anything better to propose.

Option Declaration
------------------

A project declares which options are valid for the project in the project.conf. These options should have some metadata which can be used to declare the defaults, assert valid values of the options, and also a description string which the CLI can use to communicate the meaning of project options to buildstream users (not all users building a project wrote the project.conf).

Are you expecting to support only enum values, or freeform strings and integers too?
I certainly don't want to support only enums, although I was undecided on whether the data would simply be in string form and have conditional statements deal with typing, or to have typing encoded into the option metadata.

One idea I am liking more and more is the (contains list_opt "string") kind of conditional, where one could check for the presence of a word in a whitespace-separated word list, which one could use to whitelist elements for a feature or to test for a single feature in a list (like the compiler tuning example I made in my reply to Sander).

<slight deviation from topic>

As you've been looking into bootstrapping compilers with BuildStream, maybe you can shed some light on what we could do for this, because I feel my approach doesn't solve it perfectly.
From what I understand, currently we can only special-case symbolic machine names and make a huge list of full tuning flags depending on that symbolic name. This is an area I think Yocto excels at, and I would like to have a solution that allows enough flexibility for this (of course without being shell scripts which execute and source each other).

At the basic level, maybe this could be done by allowing a project to:

  A.) Define symbolic names, maybe they are "macros" or "presets"
  B.) The symbolic preset defines values for options
  C.) Write conditionals based on the options

It's just the first approach that comes to mind, but it would allow us to define feature lists associated with symbolic machine names, and then write conditional YAML fragments based on the resulting feature sets instead of having to special-case every machine name individually.

Any other ideas?

Implementing a solution to this should not block our landing of a project options feature; however, our approach should probably be informed by if/how we intend to address this kind of complex case.

</slight deviation from topic>
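To make the A/B/C idea concrete, here is a minimal Python sketch, not a proposed implementation. The preset table, the machine names, the "features" option and the helper names are all hypothetical; the only things taken from the discussion are the preset-defines-options idea and the whitespace-separated word list test behind (contains list_opt "string").

```python
# Hypothetical preset table: symbolic machine name -> option values.
# Names and feature words are illustrative, not from any real project.
PRESETS = {
    "machine-a": {"features": "neon vfpv4 hard-float"},
    "machine-b": {"features": "sse2"},
}


def resolve_options(preset, overrides=None):
    """Start from the preset's option values, then apply explicit overrides."""
    options = dict(PRESETS[preset])
    options.update(overrides or {})
    return options


def contains(options, list_opt, word):
    """True if `word` is one of the whitespace-separated words in `list_opt`."""
    return word in options.get(list_opt, "").split()


def tuning_flags(options):
    """Conditionals test features; no machine name is special-cased here."""
    flags = []
    if contains(options, "features", "neon"):
        flags.append("-mfpu=neon")
    if contains(options, "features", "hard-float"):
        flags.append("-mfloat-abi=hard")
    return flags


print(tuning_flags(resolve_options("machine-a")))
print(tuning_flags(resolve_options("machine-b")))
```

Adding a new machine in this sketch means adding one preset entry; the conditionals stay untouched as long as the machine is described by existing feature words.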
This sounds quite similar to Meson's option system[1], which is probably a good sign.

1. http://mesonbuild.com/Build-options.html

Format Enhancements
-------------------

I would propose we add some special tokens which can be used at any level of a buildstream element.bst file, or also in some specific parts of the project.conf (since project.conf is declaring options, we cannot conditionalize that part).

Below are my ideas for the '>>', '<<', '==', '??' and '!!' operators.

On one hand this is ugly as sin because YAML describes itself as a "data serialization standard", where self-modification shouldn't really be a thing. On the other hand, it already contains two special operators ('&' and '*', which are effectively "copy" and "paste") and in the interests of being "human friendly" there is definitely justification for allowing more syntax sugar.
Right, aside from this we already sort of break that rule, because the loaded YAML dictionaries are post-processed and composited multiple times.

As I've said elsewhere, I've migrated from:

    '??':
      condition: ...

to:

    (?):
      condition: ...

I feel like it will stand out more and I don't like the quotes. That said, I'm open to changing the conditionals to something more conforming, if we really expect that the result is going to be more legible.

For the other (>), (<) and (=) tokens, I don't see any way around it; it's already become an annoyance that you cannot extend arrays in 'split-rules' but are forced to override them. We need some format if we want to let the user decide about append/prepend/override (and of course, this removes the need for post/pre commands everywhere, as added sugar).

That said, it's not that evil; we are just deserializing dictionaries which bear meaning about how they are to be composited against other deserialized dictionaries.
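As a rough illustration of what compositing with such directives could look like, here is a small Python sketch. It assumes that "(>)" means append, "(<)" means prepend and "(=)" means override, and that a plain list keeps today's override behaviour; that mapping and the function name are assumptions of this sketch, not something settled in this thread.

```python
DIRECTIVES = {"(<)", "(>)", "(=)"}


def composite_list(base, override):
    """Composite `override` onto the list `base` and return a new list.

    A plain list replaces the base list. A dict may carry the directive
    keys "(<)" (prepend), "(>)" (append) and "(=)" (override); the meaning
    given to each token is an assumption of this sketch.
    """
    if isinstance(override, list):
        return list(override)
    unknown = set(override) - DIRECTIVES
    if unknown:
        raise ValueError("unknown directives: %s" % sorted(unknown))
    if "(=)" in override:
        return list(override["(=)"])
    prepend = list(override.get("(<)", []))
    append = list(override.get("(>)", []))
    return prepend + list(base) + append


# Hypothetical split-rules style list being extended rather than replaced.
base_rules = ["%{bindir}/*", "%{libdir}/lib*.so*"]
print(composite_list(base_rules, {"(>)": ["%{datadir}/locale/*"]}))
```

The base list is never modified in place, so the same defaults can be composited against several elements.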
I think the symbols you've chosen are pretty good. I like that they don't look anything like normal text, so just glancing at a .bst file should set off alarm bells of "this isn't just a list of dictionaries, there's extra processing being done" and hopefully the reader will head for the documentation.

It's worth a quick review of existing solutions in this area. There are some processing/filtering tools:

* jq -- https://stedolan.github.io/jq/ -- CLI tool for running filters on JSON-serialized data, which supports all kinds of manipulations, path-based access, and conditionals
* XSLT -- https://www.w3.org/standards/xml/transformation -- similar but for XML and is about as horrific to use as you would expect

And also formats that support variants / self-modification:

* Ansible Playbooks -- https://docs.ansible.com/ansible/latest/playbooks.html -- mixes Jinja2 templates with YAML, to provide variable substitution and conditionals using Jinja's expression syntax
* jsonnet -- http://jsonnet.org/ -- extends JSON to add an expression syntax based on JavaScript (although specified independently)
Nod, the conditionals themselves are a preprocessing step and it's indeed possible to instead generate YAML from "buildstream format", I've given that a small amount of thought... honestly not much.
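To show what "preprocessing step" could mean in practice, here is a minimal Python sketch that resolves (?) blocks in already-loaded data before anything else looks at it. The condition/then/else keys come from this thread; the function name, the pluggable `evaluate` callback and the stand-in condition format used in the example are assumptions made only for illustration.

```python
def resolve_conditionals(node, options, evaluate):
    """Return a copy of `node` with every "(?)" block replaced by its branch.

    `evaluate(condition, options)` decides which branch is taken; how the
    condition is written is deliberately left open in this sketch.
    """
    if isinstance(node, list):
        return [resolve_conditionals(item, options, evaluate) for item in node]
    if not isinstance(node, dict):
        return node
    resolved = {
        key: resolve_conditionals(value, options, evaluate)
        for key, value in node.items()
        if key != "(?)"
    }
    block = node.get("(?)")
    if block is not None:
        branch = "then" if evaluate(block["condition"], options) else "else"
        chosen = resolve_conditionals(block.get(branch, {}), options, evaluate)
        resolved.update(chosen)
    return resolved


def simple_ifeq(condition, options):
    """Stand-in condition check (hypothetical format): option equals value."""
    return options[condition["option"]] == condition["value"]


element = {
    "variables": {
        "(?)": {
            "condition": {"option": "debug", "value": "on"},
            "then": {"conf-extra": "--enable-debug"},
            "else": {"conf-extra": "--disable-debug"},
        }
    }
}
print(resolve_conditionals(element, {"debug": "on"}, simple_ifeq))
```

After this pass the data contains no (?) keys, so the rest of the loader only ever sees plain dictionaries.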
There are probably more things that I'm missing.

The '??' expression format
~~~~~~~~~~~~~~~~~~~~~~~~~~

So at first I was thinking what this would look like as a pure YAML format, but it looks like it will be way too verbose for expressing simple comparisons. Example:

    variables:
      '??':
        condition:
          kind: ifeq
          args:
            option: debug
            value: on
        then:
          conf-extra: --enable-debug
        else:
          conf-extra: --disable-debug

This could be abbreviated to be on one line:

    variables:
      '??':
        condition: { kind: ifeq, args: { option: debug, value: on } }
        then: "conf-extra: --enable-debug"
        else: "conf-extra: --disable-debug"

It's not super readable though.

Later I thought maybe we do our own parsing of strings like `ifeq(option, value)`, but that also becomes a little unwieldy, hard to maintain and extend to support compound expressions.

So what I'm leaning towards now is to create a simple expression format based on S-Expressions, this way the same expression above would just look like:

    '??':
      condition: (ifeq "debug" "on")
      then:
        conf-extra: --enable-debug
      else:
        conf-extra: --disable-debug

This is especially nice once you want to do anything a bit more complex, the following would be a lot more verbose to express if it were in YAML:

    '??':
      condition: |
        (and (ifeq "logging" "off") (ifeq "debug" "on"))
      then:
        ... value ...
      else:
        ... value ...

The S-Expressions are fairly easy to parse and there is a python library for that (http://sexpdata.readthedocs.io/en/latest/).

I have to admit I was surprised to see Greenspun's 10th rule borne out here :-)
As they say... nobody expects the Spanish Inquisition! Honestly though, there's probably some good reasons why half baked lisp-like syntaxes get reused a lot. In this case for instance the parser in python is ~600 loc (heavily docstringed) - not being full blown lisp or something is not a bad thing when what you need is much less; I can easily turn around and throw away / replace a dependency like this.
This could be workable, but the choice of S-Expressions is risky. I think those familiar with Lisp will be unhappy that our implementation doesn't match up with their preferred dialect of Lisp, and those unfamiliar with Lisp will think "what on earth am I looking at".
Ok, so frankly I'm not attached to parenthesis-nested lists aesthetically speaking. I am, however, quite attracted by the simplicity of it; we can probably achieve similar simplicity with something else.

I'm not convinced that a:

    condition: |
      %{foo} == "bar"

kind of notation is simple though; it tries to be very human friendly and programming-languagey, and then leaves us a bit blind if we want to later extend the operators. What would we use for the kind of word-in-list 'contains' conditional? (Maybe we do like Python sets and use C bitwise field test operators on them?)

This operator comparison expression approach is also more rigid and demanding; it would have to be done perfectly the first time. We don't have a nice relaxed namespace in which we can deprecate the 'ifeq' symbol for a new 'equals' symbol on the day that we figure out that comparisons should have been case insensitive; we might instead be in a corner, left with yucky workaround alternatives, advising users that they should use the '===' operator in new projects instead of the existing but botched and unrecommended '==' operator.

Just to illustrate the simplicity of the nested lists (which can be expressed with S-expressions, but certainly also other formats), I wrote up the following in really not too much time the other day:

    https://bpaste.net/show/553780fab83d

(attached as well as testsexp.py but pasted in case the list eats my attachments).

Without much code at all it already supports constructs such as:

    (or option1 option2 ...)
    (and option1 option2 ...)
    (ifeq option1 "pony")
    (ifeq option1 option2)
    (and option1 (ifeq option2 "pony"))

Any format that gives us structured data is plausible to implement the same way, and while I wouldn't mind the format being different, I wouldn't like to end up maintaining or depending on something more complex just because of some silly stigma attached to parenthesis nesting.

Cheers,
    -Tristan
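For readers without the attachment, the constructs listed above can be approximated in a few dozen lines of Python. This is an independent sketch and not the attached testsexp.py. It assumes that bare symbols name options, that quoted strings are literals, and that a bare option counts as false when its value is "off" or empty; those conventions and all function names are assumptions of this sketch.

```python
import re

# Tokens: parentheses, double-quoted strings, or bare symbols.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def parse(text):
    """Parse one S-expression into nested lists; literals become ("str", s)."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the ")"
            return items
        if token == ")":
            raise ValueError("unexpected closing parenthesis")
        if token.startswith('"'):
            return ("str", token[1:-1])
        return token  # bare symbol: an operator or an option name

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return tree


def value_of(node, options):
    """Value of an operand: literal text, nested result, or option lookup."""
    if isinstance(node, tuple):
        return node[1]
    if isinstance(node, list):
        return evaluate(node, options)
    return options[node]


def evaluate(node, options):
    """Evaluate a parsed expression to True or False."""
    if not isinstance(node, list):
        # Assumption: "off" and the empty string count as false.
        return value_of(node, options) not in (False, "", "off")
    if not node:
        raise ValueError("empty expression")
    op, *args = node
    if op == "and":
        return all(evaluate(arg, options) for arg in args)
    if op == "or":
        return any(evaluate(arg, options) for arg in args)
    if op == "ifeq":
        left, right = args
        return value_of(left, options) == value_of(right, options)
    raise ValueError("unknown operator: %r" % (op,))


options = {"option1": "on", "option2": "pony"}
print(evaluate(parse('(and option1 (ifeq option2 "pony"))'), options))
```

Adding a new operator such as a word-in-list "contains" test would be one more branch in `evaluate`, which is the kind of relaxed namespace argued for above.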
One option is to reuse Jinja2 expressions, as they are quite Python-like, and are already used in Ansible. The jinja2 library looks flexible enough that we could set up an execution environment and just evaluate Jinja expressions[2] to produce values, rather than using the full templating functionality.

2. http://jinja.pocoo.org/docs/2.9/templates/#expressions

This could get us something like this:

    condition: |
      %{logging} == "off" and %{debug} == "on"
    then:
      ... value ...
    else:
      ... value ...

I would prefer if we could use True and False for booleans rather than the strings "off" and "on". That way we could get to this:

    condition: %{logging} and %{debug}
    then:
      ... value ...
    else:
      ... value ...

Some rambling to finish ... does anyone remember BuilDj [3]? It was a project to replace Autotools/CMake/etc. with a declarative JSON/YAML format. It started out promisingly, but faltered before completely figuring out conditionals. Hopes for replacing Autotools and CMake then gravitated towards Meson, which is pretty much on track for success at this point.

Rather than using YAML, Meson defines a Python-esque DSL for build instructions which is deliberately not Turing complete (no loops or functions) and can be parsed in a few thousand lines of Python. I used to be disappointed that Meson didn't use a "declarative" approach, but I actually find it fine to work with now.

I like YAML because it's always possible to reliably parse it, unlike a Turing-complete programming language such as Shell or BitBake. But Meson's language also ticks that box. It is also apparently designed to be re-writable so that IDEs can make changes to hand-written meson.build files, although I've yet to see how well that works. If it does, I'd be interested in what BuildStream would look like if it abandoned YAML for a similar Python-like DSL. Of course, this is not at all BuildStream 1.0 territory, but something to think on :-)

[3] https://wiki.gnome.org/Attic/BuilDj

Sam
Attachment: testsexp.py (Description: Text Data)