Re: gnome-vfs/GIOChannel for parsing



[ No need to CC me, BTW ]

On Wed, 2003-03-05 at 22:59, Jody Goldberg wrote:

> Storages contain streams in MS parlance.  In broad strokes there are
> two forms of storage.
>     - structured files : The containing stream has a format which
>       defines sub streams.  Each sub-stream can be read individually.
>       Things like tar, ar, and if you ignore compression zip.
> 
>     - file system in a file : An extension to structured files that
>       allows simultaneous reading/writing from sub-streams.
>       Equivalent to being able to open a sub-stream and write data
>       to it, having the implementation magically handle extending
>       the meta-data/allocation-tables/... to manage the content.

Ah, ok.  I can see that being very useful for certain applications.  

> Interfaces and implementations for these formats are important for
> applications as soon as we start combining multiple applications.
> If I want Gnumeric to contain some drawing managed by dia or
> sodipodi it is inconvenient to store all of them in the same xml
> blob.  Nor is it pleasant to store binary data (images, sound) as
> base64 encoded text in the xml.  We need to support structured files
> at a minimum.  It is an open question whether extending things to
> support file system in a file is necessary.  I am leaning towards
> no (which is why gsf is as it is) but there are some compelling
> arguments in favour.
>
> There are several other design issues that could be discussed
> before things go into glib.

How much of this are you advocating putting in glib?  Again, I can see
the above file-in-a-file stuff being very useful, but only for
relatively complex applications.  Do you really think all of it should
go into glib?

I just looked in the Debian package browser (aptitude) for programs
which depend on libgsf-1.  On my system, that's mrproject, gnumeric, and
an abiword plugin.  That's certainly not a large set; while my sample is
admittedly unscientific, I think it's fairly representative.

Personally, I think that structured file stuff should stay in libgsf. 
As you said earlier, it's gsf's raison d'être.  

And of course if it later turns out that a lot of applications want
structured file stuff, then we can move it into glib.  But I think
trying to do the entire thing at once (streams+structured
files+registry+...) will make it much more difficult for the glib
maintainers to accept.  

Now, ensuring that the stream stuff can cleanly support structured 
files is a good goal.

> > Fair enough.  In that case though you could just make the size method
> > return -1 or something if the size isn't known.
> 
> We considered this and rejected the notion because it would break
> several use cases (many already in existence).  

Ok.  Well, another alternative is to throw an exception (i.e. set a
GError) if it's unsupported.  But another interface is ok by me too.
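
To make the GError-style alternative concrete, here is a rough sketch in
plain C.  Everything in it is hypothetical: SketchError is only a stand-in
for GError, and SketchInput and sketch_input_get_size are made-up names,
not gsf or glib API.  The point is just that "size unknown" is reported
out of band instead of through a magic return value like -1.

```c
#include <stddef.h>

/* Hypothetical stand-in for GError; not the real glib type. */
typedef struct {
    int code;
    const char *message;
} SketchError;

enum { SKETCH_ERROR_NOT_SUPPORTED = 1 };

/* Hypothetical input stream; size_known is 0 for pipes, sockets, etc. */
typedef struct {
    int size_known;
    long size;
} SketchInput;

/* Returns 1 and stores the size in *size_out on success.  Returns 0 and
 * fills *err (when err is non-NULL) if the stream cannot report a size;
 * *size_out is left untouched in that case. */
static int sketch_input_get_size(const SketchInput *in, long *size_out,
                                 SketchError *err)
{
    if (!in->size_known) {
        if (err != NULL) {
            err->code = SKETCH_ERROR_NOT_SUPPORTED;
            err->message = "stream size is not known";
        }
        return 0;
    }
    *size_out = in->size;
    return 1;
}
```

A caller that needs the size checks the return value; one that doesn't
care never calls it, so existing use cases would not have to change.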

> Having an abstract interface in glib is clearly a good idea, along
> with several implementations for the basic operations.  However, I
> suspect that people would not want MS OLE2 parsers in glib proper.

See above.

> An important use of the stream interface would be to support the
> mime sniffing implementation being discussed elsewhere.

Do you have a link for this?

> They had
> better work together or they are pointless.  So you need to be able
> to sniff the type of a stream (with or without knowing its name) and
> potentially guess a type knowing only a name.  To support proper
> content based sniffing and intelligent mime type support it would be
> nice to support gdk-pixbuf style operations.  Those depend on an
> external registry of known types that can install themselves without
> significant startup cost to most apps. 

I think one thing we will probably want is something like a
GAutoStreamReader which looks at the stream buffer and attempts to guess
at the character set and the EOL convention.  
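
Roughly what I have in mind for the guessing part, as a sketch in plain C.
The names (sketch_guess_charset, sketch_guess_eol, SketchEol) are invented
for illustration, and the heuristics are deliberately simplistic: the
charset guess only looks for a byte-order mark (and ignores UTF-32), and
the EOL guess just reports the first line terminator it sees, which only
makes sense for ASCII-compatible encodings.

```c
#include <stddef.h>

typedef enum {
    SKETCH_EOL_UNKNOWN,
    SKETCH_EOL_LF,      /* Unix */
    SKETCH_EOL_CRLF,    /* DOS/Windows */
    SKETCH_EOL_CR       /* classic Mac */
} SketchEol;

/* Guess the charset from a byte-order mark at the start of the buffer.
 * Returns NULL when there is no BOM; a real reader would need a
 * fallback (locale charset, UTF-8 validation, ...). */
static const char *sketch_guess_charset(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return "UTF-8";
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return "UTF-16BE";
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return "UTF-16LE";
    return NULL;
}

/* Report the convention used by the first line terminator in buf. */
static SketchEol sketch_guess_eol(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\n')
            return SKETCH_EOL_LF;
        if (buf[i] == '\r') {
            if (i + 1 == len)
                return SKETCH_EOL_UNKNOWN; /* may be half of a CRLF */
            return buf[i + 1] == '\n' ? SKETCH_EOL_CRLF : SKETCH_EOL_CR;
        }
    }
    return SKETCH_EOL_UNKNOWN;
}
```

Both functions only peek at bytes already in the stream buffer, so they
wouldn't need any MIME database or external registry.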

But MIME stuff opens up a whole big can of worms...glib doesn't depend
on any kind of MIME database now.  Again, I really think this makes more
sense in a separate library like gsf.

I mean...it seems to me that almost all the applications which use glib 
will be doing some sort of file I/O, and having a stream library will be
useful for them.  Probably for most of them, FILE * is sufficient, but I
think the cost of a minimal stream library in glib is quite small.

> I'll try to be more explicit.  My meaning is that an api to support
> asynchronous operations is orthogonal to the stream interfaces.
> These are very central concepts and there is no rush.  Let's pick
> pieces and work on them.  I'm willing to use gsf as a test bed
> because there are no stable projects depending on them.  However,
> that will change soon.

It's somewhat orthogonal, I agree.  I'm not saying it's a core
requirement, but it will be needed.

> As a starting point would you like to work on some TextReader &
> StreamReader interfaces that wrap a GsfInput (String Reader seems
> pointless given GsfInputMemory and C strings).  The basic interface
> of TextReader seems similar to my intent with GsfInputTextline.

Heh; I think we've switched positions here.  When I started on this, I
thought it made sense to start coding, but now I am seeing that there is
a fairly large disparity between our viewpoints, and I'd like to resolve
that before we code more.

> I don't see a lot of use for some of the TextReader interfaces
>     - get/initialize LifetimeService
>     - hashing
>     - equality
>     - serializable

Most of this stuff is just inherited from the .NET object stuff I
believe.

>     - get as string (seems like a convenience routine on the stream)

Yeah, I agree.

> On further examination of the .net Stream interface it seems
> fundamentally different from my goals in gsf.  Its base 'Stream'
> interface mixes in and out.  There is no way to divide them.  I
> would be very much against using that as the primary abstraction in
> glib.

Didn't we already agree to separate it into InputStream and
OutputStream?
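
For reference, the split I'm thinking of looks something like the sketch
below, in plain C.  All the names (SketchInputStream, SketchOutputStream,
the memory-backed implementations, sketch_copy) are hypothetical and not
taken from gsf or any glib proposal; it only shows that the two halves can
be completely separate interfaces.

```c
#include <stddef.h>
#include <string.h>

/* Read-only interface: returns bytes read, 0 at end of stream. */
typedef struct SketchInputStream SketchInputStream;
struct SketchInputStream {
    size_t (*read)(SketchInputStream *self, void *buf, size_t len);
};

/* Write-only interface: returns bytes actually written. */
typedef struct SketchOutputStream SketchOutputStream;
struct SketchOutputStream {
    size_t (*write)(SketchOutputStream *self, const void *buf, size_t len);
};

/* Memory-backed input; base must be the first member so the cast in
 * memory_read is valid. */
typedef struct {
    SketchInputStream base;
    const unsigned char *data;
    size_t len, pos;
} SketchMemoryInput;

static size_t memory_read(SketchInputStream *self, void *buf, size_t len)
{
    SketchMemoryInput *m = (SketchMemoryInput *)self;
    size_t left = m->len - m->pos;

    if (len > left)
        len = left;
    memcpy(buf, m->data + m->pos, len);
    m->pos += len;
    return len;
}

/* Fixed-capacity memory-backed output; short writes when full. */
typedef struct {
    SketchOutputStream base;
    unsigned char *data;
    size_t cap, len;
} SketchMemoryOutput;

static size_t memory_write(SketchOutputStream *self, const void *buf, size_t len)
{
    SketchMemoryOutput *m = (SketchMemoryOutput *)self;
    size_t room = m->cap - m->len;

    if (len > room)
        len = room;
    memcpy(m->data + m->len, buf, len);
    m->len += len;
    return len;
}

/* Copies until the input is exhausted or the output stops accepting
 * data; returns the number of bytes written. */
static size_t sketch_copy(SketchInputStream *in, SketchOutputStream *out)
{
    unsigned char chunk[16];
    size_t total = 0, n;

    while ((n = in->read(in, chunk, sizeof chunk)) > 0) {
        size_t w = out->write(out, chunk, n);
        total += w;
        if (w < n)
            break;
    }
    return total;
}
```

Code like sketch_copy only ever sees the half it needs, which I think is
the property you want from the base abstraction.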



