Re: Introspection



Very good summary, much clearer than what I would probably be able to
come up with. Also, I'm going to be a bit messy, it's 5AM ;)

On Mon, 10-01-2005 at 12:03 -0500, Matthias Clasen wrote:
> I'd like to get the Introspection effort going again, with the 
> goal of getting something ready for GLib 2.8. To spark the discussion, 
> here is a quick braindump on this topic. Note that I haven't
> looked in depth at the prototype work done by Rob Melby yet, so
> a lot of this may already be done or at least begun (I hope 
> Rob and Mathrick will jump into the discussion to clear this up).

Here I am :). I gave Rob's code a quick ride; unfortunately I'm not as
familiar with it as I'd like to be. I'll beat Rob into providing some
decent public repository for that code, and also try to start cleaning
it up if only I find some time.

> The Goal should be to allow people to write fully introspectable 
> objects (for the D-BUS bindings, for scriptable apps interfaces, etc.)
> and to allow language bindings to get rid of all the .defs
> and header scanning that is being done currently. There should be
> a way to programmatically invoke the described functions. We need to
> provide metadata for classes, interfaces, functions (both methods of 
> classes and freestanding functions)
> 
> We should probably not cover all irregularities of C apis (varargs come
> to mind).

One irregularity we would need is at least partial support for
arbitrary C structs. We should be able to denote that a struct is an
opaque blob and give its size, I think; maybe also some
constructor/destructor thingy à la GValue? Probably I'm thinking too
much. This is needed for interfacing with external APIs, and also with
some APIs that should use GObject but for some reason don't (like
Gnome-VFS, which has many custom structs).

> What information is needed ?
> 
> Functions

[snip]

> Classes/Interfaces

[snip]

> Boxed types

[snip]

> 
> Implementation ideas
> * The metadata should be stored as a binary blob in the library.
>   one idea for finding the metadata is to use a symbol name  
>   derived from the module name, e.g. the metadata for GTK+ would
>   be accessible via a symbol like _GTK_METADATA. Using a fixed
>   symbol name would be problematic when using dlsym() to find
>   the data. 

I think we should use a common interface for accessing that data, which
would then be implemented using specific mechanisms. For an ELF .so, we
could just use a private section or something like that. For Python it'd
be something created as we go through the code, and then merged with the
rest of the metadata. There still remains the question of the activation
method though, see below.
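
For the plain-C case, the symbol lookup could be as simple as deriving
the name from the module and handing it to dlsym(). A rough sketch,
assuming the _<MODULE>_METADATA scheme from the quoted proposal;
metadata_symbol_name() is a made-up helper, not an existing GLib
function:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Build the metadata symbol name for a module, e.g. "gtk" -> "_GTK_METADATA".
 * The naming scheme follows the quoted suggestion; the function itself is
 * hypothetical. Returns 0 on success, -1 if the buffer is too small. */
static int
metadata_symbol_name (const char *module, char *buf, size_t buflen)
{
  static const char suffix[] = "_METADATA";
  size_t len, i;

  assert (module != NULL && buf != NULL);

  len = strlen (module);
  /* leading '_' + module + suffix (sizeof includes the terminating NUL) */
  if (buflen < 1 + len + sizeof suffix)
    return -1;

  buf[0] = '_';
  for (i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) module[i];
      /* Anything that is not alphanumeric becomes '_'. */
      buf[1 + i] = isalnum (c) ? (char) toupper (c) : '_';
    }
  memcpy (buf + 1 + len, suffix, sizeof suffix);   /* copies the NUL too */
  return 0;
}
```

The resulting name would be passed to dlsym() on the handle that
dlopen() returned for the library. Other backends (Python, a private
ELF section) would sit behind the same interface and never see it.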

>   The binary format should be extensible. There are different
>   ways to achieve this, e.g. tagging all fields (allows to
>   skip unknown tags), or using a fixed layout and store size 
>   information for each entry (allows to skip unknown stuff
>   at the end of the entry).

Looking at the .NET assembly spec, they have some nice ideas. In
particular, attributes are stored in a "relational" way: there's a table
of all attributes in the assembly, together with a field linking back to
the symbol tagged by the given attribute instance. Neat way to keep the
basic metadata structure fixed. Attributes, btw., are a pet idea of mine,
we want them :)
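
To illustrate the "tagged fields" variant of extensibility: a reader
only needs the length of each field to step over tags it doesn't know.
A minimal sketch, assuming a made-up record layout (1-byte tag, 2-byte
little-endian length, payload) and made-up tag values; none of this is
a proposed format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TAG_NAME = 1, TAG_RETURN_TYPE = 2 };   /* hypothetical tags */

/* Find the first field carrying `tag` in a buffer of
 * [tag: 1 byte][length: 2 bytes, little endian][payload] records.
 * Fields with other tags are skipped using their length.
 * Returns a pointer to the payload and stores its length in *out_len,
 * or returns NULL if the tag is absent or the data is truncated. */
static const uint8_t *
find_field (const uint8_t *data, size_t size, uint8_t tag, size_t *out_len)
{
  size_t pos = 0;

  assert (data != NULL && out_len != NULL);

  while (size - pos >= 3)
    {
      uint8_t t = data[pos];
      size_t len = (size_t) data[pos + 1] | ((size_t) data[pos + 2] << 8);

      pos += 3;
      if (len > size - pos)
        return NULL;            /* truncated record */
      if (t == tag)
        {
          *out_len = len;
          return data + pos;
        }
      pos += len;               /* unknown or unwanted tag: skip it */
    }
  return NULL;
}
```

An old reader handed a blob with newer tags just never asks for them,
which is the whole point of the scheme.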

>   Strings should probably be kept in a string pool.
> 
> * The type information needed for the parameter and field 
>   descriptions goes beyond what gobject provides, since we
>   need to be able to say, e.g. GList of GtkWidget*. We probably
>   don't need a full recursive type system, though.

We want at least a somewhat recursive system. I haven't given it enough
thought yet, but I suspect going methodically through GStreamer's
GObject usage would reveal some new and interesting ways to encapsulate
data (gst is probably the biggest non-GTK+ user of GObject; I will need
to grep the source for worthy stuff, and I will poke Company and jdahlin
about it). Plus Michael is going to yell at us if we don't support
recursive types ;)
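
Just to show how little "recursive" costs: a type descriptor with one
optional element type already covers "GList of GtkWidget*" and nests
for free. A sketch only; TypeDesc and type_desc_format() are invented
names, and the Name<Element> notation is mine, purely for display:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct TypeDesc TypeDesc;
struct TypeDesc {
  const char     *name;        /* e.g. "GList", "GtkWidget" */
  int             is_pointer;  /* non-zero for "Foo*" */
  const TypeDesc *param;       /* element type for containers, else NULL */
};

/* Render a descriptor as text, recursing into the element type.
 * Returns what snprintf() returns. Nested descriptions longer than the
 * 127-character scratch buffer are silently truncated in this sketch. */
static int
type_desc_format (const TypeDesc *t, char *buf, size_t buflen)
{
  const char *star;
  char inner[128];

  assert (t != NULL && buf != NULL && buflen > 0);

  star = t->is_pointer ? "*" : "";
  if (t->param == NULL)
    return snprintf (buf, buflen, "%s%s", t->name, star);

  type_desc_format (t->param, inner, sizeof inner);
  return snprintf (buf, buflen, "%s<%s>%s", t->name, inner, star);
}
```

A two-parameter container like GHashTable would need an array of
element types instead of a single pointer, but the idea is the same.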

XXX: Does GStreamer's use of element factories pose a problem for our
type system? (You request a factory able to create elements of type
"avidemux", for example, which in terms of GType corresponds to
GstAviDemux. It is later used to create instances of GstElement which
are GstAviDemux.) Do we want to support it if it does? I remember
jdahlin changing some instantiation stuff in gst-python precisely
because element factories didn't fit Python's OO system quite right; I
will need to ask for specifics.

> * GType type ids are registered at runtime (except for a few
>   fixed ones), therefore type names must be stored in the on-disk
>   metadata. 
> 
>   In order to get around the problem that types can
>   not be looked up until they are registered, we need to store
>   the name of the get_type function in addition to the type name.
>   
>   To make cross-library references work, we must also store the
>   module name for types from other modules. Eg. the GTK+ metadata
>   blob should store the reference to the PangoLayout type as
>   "Pango", "PangoLayout", "pango_layout_get_type".

Activation and cross-references are going to be tricky, to say the
least. How do we activate referenced types written in Python? How far
are we going in supporting activation of unregistered types? Is
registering a type enough, provided we can find it in the current
metadata repo, or is loading the .so still in order? I suspect that for
this to stay sane and flexible enough, we will need to move quite a bit
of responsibility to the bindings, and hide that behind pretty
interfaces. Whatever we do, we should avoid solutions like hardcoding
sonames, if at all possible.
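
One possible shape for "responsibility moved to bindings": the
metadata only carries the (module, type name, get_type function)
triple, and activation is a chain of hooks supplied by whoever knows
how to load things. This is a sketch under my own assumptions; TypeId,
TypeRef, TypeResolver and both toy resolvers are invented for
illustration, and 42 is just a dummy id:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned long TypeId;   /* stand-in for GType; 0 = not resolved */

/* The triple from the quoted proposal. */
typedef struct {
  const char *module;           /* "Pango" */
  const char *type_name;        /* "PangoLayout" */
  const char *get_type_func;    /* "pango_layout_get_type" */
} TypeRef;

/* A binding-supplied activation hook; returns 0 if it cannot help. */
typedef TypeId (*TypeResolver) (const TypeRef *ref);

/* Toy resolver standing in for "the type is already registered". */
static TypeId
resolve_registered (const TypeRef *ref)
{
  return strcmp (ref->type_name, "PangoLayout") == 0 ? 42 : 0;
}

/* Toy resolver standing in for a backend that knows nothing useful. */
static TypeId
resolve_nothing (const TypeRef *ref)
{
  (void) ref;
  return 0;
}

/* Ask each resolver in turn; the first non-zero answer wins. */
static TypeId
activate_type (const TypeRef *ref, const TypeResolver *resolvers, size_t n)
{
  size_t i;

  assert (ref != NULL && resolvers != NULL);

  for (i = 0; i < n; i++)
    {
      TypeId id = resolvers[i] (ref);
      if (id != 0)
        return id;
    }
  return 0;
}
```

A real C resolver would look up get_type_func in the loaded module and
call it; a Python one would import the module instead. Neither needs a
soname stored in the blob.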

@Mike Hearn: how does COM do it?

> * It may not be necessary to store the struct offsets explicitly
>   in the metadata blob. If the members of classes/boxed types are
>   described in order, it is possible to write platform-specific
>   code to calculate the offsets. In any case, there should 
>   probably be a way to find out the struct offsets, so that 
>   language bindings don't have to implement the struct layout
>   calculations themselves.

Bindings are supposed to use libffi anyway. It's quite probable that
they will have to do things like generating thunks/marshallers for
vmethods on the fly, so that's an already existing dependency, and I
guess libffi is supposed to do things like creating struct layouts on
demand, right?
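
For reference, the layout calculation itself is small when members are
described in order. A minimal sketch, assuming the usual natural
alignment rule of common C ABIs (no bitfields, no packing pragmas);
MemberInfo and compute_layout() are made-up names, and this is
hand-rolled code, not what libffi does internally:

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
  size_t size;    /* sizeof the member */
  size_t align;   /* required alignment of the member */
} MemberInfo;

/* Round n up to the next multiple of align. */
static size_t
round_up (size_t n, size_t align)
{
  return (n + align - 1) / align * align;
}

/* Fill offsets[i] for each member in declaration order and return the
 * total struct size, including tail padding. */
static size_t
compute_layout (const MemberInfo *members, size_t n, size_t *offsets)
{
  size_t pos = 0, max_align = 1, i;

  assert (members != NULL && offsets != NULL);

  for (i = 0; i < n; i++)
    {
      assert (members[i].align > 0);
      pos = round_up (pos, members[i].align);   /* padding before member */
      offsets[i] = pos;
      pos += members[i].size;
      if (members[i].align > max_align)
        max_align = members[i].align;
    }
  /* The struct is padded to a multiple of its strictest alignment. */
  return round_up (pos, max_align);
}
```

The sizes and alignments are the platform-specific part; on a given
platform the result can be checked against offsetof() and sizeof for a
real struct.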

> * Metadata for classes/interfaces/boxed types should probably be
>   associated with the GType so that it can be looked up in that way,
>   but we also need a way to enumerate the complete metadata for
>   a module, e.g. to get access to metadata for plain functions.

Also of importance is proper namespacing data, to let bindings know that
gtk_widget_show() is in reality gtk.Widget.show(). This is going to be
fun in the case of enums and magic constants (like GtkAttachOptions), as
AFAIK each binding has its own idea of where best to put them.
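
To show why this data should be explicit rather than guessed, here is
roughly what a binding has to do today. A sketch with an invented
helper, binding_name(), which only works if the caller already knows
the namespace and class and the C prefix follows the CamelCase to
lower_case convention:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Map a C symbol to "ns.Class.method", given the namespace and class.
 * Returns 0 on success, -1 if the symbol does not belong to the class
 * or the buffer is too small. */
static int
binding_name (const char *symbol, const char *ns, const char *klass,
              char *buf, size_t buflen)
{
  char prefix[128];
  size_t p = 0, i;
  int n;

  assert (symbol != NULL && ns != NULL && klass != NULL && buf != NULL);

  /* Build e.g. "gtk_tree_view_" from ns "gtk" and class "TreeView". */
  for (i = 0; ns[i] != '\0' && p + 2 < sizeof prefix; i++)
    prefix[p++] = (char) tolower ((unsigned char) ns[i]);
  for (i = 0; klass[i] != '\0' && p + 3 < sizeof prefix; i++)
    {
      unsigned char c = (unsigned char) klass[i];
      if (isupper (c))
        prefix[p++] = '_';      /* word boundary in CamelCase */
      prefix[p++] = (char) tolower (c);
    }
  prefix[p++] = '_';
  prefix[p] = '\0';

  if (strncmp (symbol, prefix, p) != 0 || symbol[p] == '\0')
    return -1;

  n = snprintf (buf, buflen, "%s.%s.%s", ns, klass, symbol + p);
  return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

The heuristic already breaks on names like GtkHBox (it expects
gtk_h_box_ where the real prefix is gtk_hbox_), which is one more
argument for storing the mapping in the metadata.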

> Open questions
> * Should the property metadata which is already available in 
>   gobject be duplicated ?

Would be nice not to duplicate it.

> * Should we explode the binary metadata into structs or just 
>   navigate directly on the binary data ?

If we use abstract enough interfaces to navigate the data, it should
be a non-issue.

> * Should documentation be included in the metadata ?

YES! Everyone loves docstrings.

> * Should type information for parametrized container types
>   be provided on the gobject level, or should we do a simple
>   ad-hoc description for such types ?

How would such an ad-hoc description work? Hardcoded knowledge about
GList, GHashTable, etc. in each binding?

Cheers,
Maciej

-- 
Maciej Katafiasz <ml mathrick org>
