Re: Reviewing GVariant



On Tue, 2009-10-27 at 21:03 -0400, Matthias Clasen wrote:
> Just to run down the concepts you meet while reading this chapter,
> mostly without introduction or explanation:
> - type classes
> - basic types, maybe types, container types
> - 'any' types
> - generic types
> - variant types and type strings
> - element types
> - generic tuple type
> - wildcard types and concrete types
> - type 'matching'

> I'd really like to call for some reduced excess and some clarity of exposition.

I agree.  There are a few key ideas here, and my inability to come up
with a good name for any of them has resulted in mixed usage throughout.
I appreciate suggestions here (without turning this into a bikeshed,
please).


In all cases here (as with all type systems) the core question is that
of separating things that match (or "are of") a certain type from those
that do not match (or "are not of") that type.  This motivates
"matching".

One potentially confusing pitfall here is that in addition to values
matching (or not matching) a given type, we also speak of types matching
another type.  A type matches another type if all instances of one are
also instances of another; think of the normal subclassing relationship.

One thing I don't like about the word "match" is that it's unspecific.
Does the pattern match the item or does the item match the pattern?
Maybe we could avoid this language entirely with "has_type()" for
GVariant and something along the lines of "is_subtype_of()" for
GVariantType.

The words "generic", "wildcard" and "any" all mean approximately the
same thing here -- a type that has subtypes.  As alluded to above, the
GVariant type system is capable of expressing non-specific types.  For
example, "array of anything: 'a*'" or "triplets where the first item is
an integer and the other two items are any type: '(i**)'".

In contrast, I use the word "concrete" for a type that is not a wildcard
type.

I'm thinking that I suddenly like the word "specific" for concrete
types.  I'm not sure why I didn't think of that earlier.

The thesaurus suggests some of these antonyms for "specific" (filtering
ones that are obviously useless): general, indefinite, uncertain, vague,
ambiguous, uncategorical.

We could speak of "definite" and "indefinite" types.  That has the
additional advantage of clear disjointness/completeness.





About the three different kinds of things (API wise) that you encounter
in GVariantType.  They all come in both "definite" and "indefinite"
forms:


First, a type string is just a string.  It has a particular format (as
outlined in the grammar provided in the docs).  DBus signature strings
describe multiple arguments to a method as a concatenation of the
individual arguments' type strings.  A GVariant type string is just
that: the type of an individual value.

  "i"    - a 32bit integer
  "as"   - an array of strings
  "(ii)" - a pair of 32bit integers

GVariant type strings can be made "indefinite" by the addition of
wildcard characters.  For example:

  "(i*)" - a pair where the first element is a 32bit integer and the
           second element has any valid type

Not all strings are valid GVariant type strings.  There are functions to
check for that.



Second, GVariantType.  Conceptually (and in implementation, actually --
but not officially) these are almost exactly the same as GVariant type
strings.  They have a one-to-one relationship -- any GVariantType can be
converted to a type string and back again.  Essentially, a GVariantType
is a "parsed out" value of a GVariant type string.

In reality (and not a fact that I want to document), a (GVariantType *)
is very little more than a casted (char *).  There are several important
reasons that I make the distinction, however.

  1: freedom to handle the memory management of GVariantType *
     differently from the ways typically used for strings

  2: an "input validation" argument.  By the "official" API, you have to
     go through guarded cast functions (G_VARIANT_TYPE ()) to turn a
     string into a GVariantType *.  This means that functions can take
     a GVariantType * and be certain that they are getting the real
     thing and not an invalidly formatted string.

  3: GVariantType * is not necessarily nul-terminated.  This allows the
     type iteration functions over structures to work without
     allocating additional memory.  For example, if you execute this
     code:

type = G_VARIANT_TYPE ("(si)");
first_type = g_variant_type_first (type);

then first_type will be equal to the first type of the pair -- a string.
It is a pointer to the "s" inside of the static string "(si)" with no
additional memory allocation.  When you call g_variant_type_next() it
will be moved on to point at the "i" in that same string.  Trying to
directly print a GVariantType* as if it were a string obviously won't
work for this reason -- you'd see "si)" instead of "s" and "i)" instead
of "i".



Finally, GVariantTypeClass.  This is essentially a small finite
enumeration of the "biggest" type of a value (or another type).  The
type corresponding to the type string "(ii)" has type class "tuple".
The type corresponding to the type string "a(ia{sv})" has type "array".
As a rule, you can determine the type class of a particular type
entirely by looking at the first character of the type string.

> - This whole thing would be a lot more digestible if you added a long
> introduction that explains the type system and introduced the
> terminology.

I agree.  The part up there ^^ might be something like a first draft :)

> - If you can't explain clearly in the docs why type classes are a
> useful thing to have, do away with them.

You can have switch statements over them.  This is critically important
when you're doing anything that has to consider the existence of all
types -- basically any kind of serialisation/conversion/etc.

> - Get your terminology straight. Do types 'match' other types or do
> they match values ? Are generic types the same as any types ? And
> maybe the same as wildcard types as well ?

TYPE_ANY, TYPE_ANY_TUPLE, TYPE_ANY_ARRAY, etc are "indefinite" types (as
per terminology introduced above).  TYPE_CLASS_ANY, TYPE_CLASS_TUPLE and
TYPE_CLASS_ARRAY are the type classes of those types.

There might be an argument for doing away with 'any' types as they
appear in GVariantTypeClass since you never need to switch() over
indefinite types.  Having them there is nice, though: it provides that
all GVariantTypes (including "*") have a class that they fall into.  I
can't currently think of any strong technical reason that we can't
restrict g_variant_type_get_class() to only work on "definite" types
other than that it wouldn't be "as nice" anymore.

> - Unexplained obscurities: Whats this business with defining tuple
> types via callbacks ?

ya.  that's crack.  I'm gonna take it out.

> On to the GVariant docs.
> 
> The terminological uncertainty continues. Does a GVariant _have_ a
> value, or _is_ it a value ? The introduction says it has a value,
> whereas the rest of the docs seem to be written assuming that a
> variant is a value.

Look to python:

I say "5" in that language and it's not *really* 5.  It's actually an
object (of type 'int') that contains the value 5 and has a bunch of
methods that allow you to explore its fiveness.  It's the same here: it
really depends on how much you squint your eyes.  I agree, though, that
the documentation itself should be consistent.  I'd say that it *is* 5.

> "'r' means exactly the same as 'r'" ?!

Victim of gtk-doc here:

 * '*' means exactly the same as '@*'. 'r' means exactly the same as
 * '@r'.  '?' means exactly the same as '@?'.

Seems like it eats the '@' at the start of the line.  Even the "'" won't
save it.

> "Nested '@' characters may not appear." - this is unclear. Do you mean
> "Consecutive '@' characters may not appear." ?

The documentation for this function is actually a lot worse than you
think.  It's incomplete and possibly inaccurate in places.  I need to
fix this.

> g_variant_print() should xref g_variant_markup_print().

noted.

> GVariantIter docs seem to confuse next() and next_value(). E.g. they
> talk about next() returning NULL when in reality it returns FALSE.

_next_value() used to be called _next() and _next() used to be called
g_variant_iterate().  I changed that for consistency, but missed a spot.

> The concepts of flattening and serialization could be explained in the
> introduction.

noted.
> 
> (I've not stopped to point out the numerous instances of parameter
> name confusion).

I need to comb over the docs a lot.


> So much for today, I'll look at the code tomorrow.

Thanks for the review so far.  A lot of work is cut out for me,
clearly :)


Cheers



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]