Re: XML libs (was Re: gconf backend)



On Sat, 2003-09-27 at 19:51, Daniel Veillard wrote:
>   No and that is a big mistake you're making. Example: you don't have to 
> know about entities at the user level for those apps, you just let the
> parser do the work for you. You won't even know that they have been there.

Well sure, that's what I'm saying. My applications don't want to see
entities. Sure it's in the library internals somewhere.

>   You think in terms where you control the input and output. The
> error is that your next big client is gonna use an Oracle back-end for
> your XML data, and suddenly you don't control the production anymore,
> and if you use a non-conformant parser you made a promise that you just
> can't hold, and that kind of thing has serious long-term costs.

Slow down, I'm not advocating gmarkup. That's why I described my ideal
XML lib, and said it would be conformant.

gmarkup solved a simple problem: we needed markup in GtkLabel and
libxml2 (or even expat) was too much of a dependency to sell to the GTK+
team in the 2.0 timeframe. gmarkup is not intended to be an XML parser.
If it was ever made one, it would be by using expat or libxml2 on the
backend.

>    But as I pointed out, a non-compliant lib is likely to break even
> what you consider very simple processing. I'm pretty sure I can find 
> a very simple document for which gmarkup will work differently from
> expat and libxml2.

I'm also sure of that. gmarkup isn't the point; it clearly deviates from
the ideal I described.

>   This has nothing to do with web development versus data oriented
> development. You have a spec, either you're compliant or not. It's a
> contract. And all it costs you to comply to that contract is mostly
> to reuse correctly a compliant library instead of trying to roll your
> own.

But that isn't true. To be XML compliant in terms of handling the stuff
not found in the gmarkup-like subset, you not only have to use the
library, you have to use it properly. Or you have to let it do things
that probably break many apps.

Say I just cut metacity over to a library that handles includes and
dtds. Suddenly themes would probably be able to cause the WM to lock up
by creating unexpected I/O during theme loading. There are probably
security issues as well since themes can be untrusted. To switch to the
library then, I need a detailed understanding of what it is going to be
doing, and then I have to figure out how to turn off the I/O; but when I
turn it off, metacity's theme parser isn't XML compliant anymore, as I
understand it. The features of XML that require nonlocal or even local
I/O seem very browser-centric and problematic for a lot of apps.
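
To make the I/O concern concrete, here is a rough sketch in plain C of the kind of policy check an app such as a theme loader might want consulted before a parser fetches anything on its behalf. This is an illustration only: external_ref_allowed() and its parameters are hypothetical names, not part of libxml2, expat, or gmarkup, and a real implementation would need more careful path handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy check (not a libxml2, expat, or gmarkup API).
 * Returns 1 if the parser may load `system_id` on the app's behalf,
 * 0 if the reference must be refused. */
int
external_ref_allowed (const char *system_id, const char *allowed_dir)
{
  size_t dir_len;

  assert (system_id != NULL && allowed_dir != NULL);

  /* Anything with a URL scheme (http://, ftp://, file://) means I/O
   * the app did not ask for, so refuse it outright. */
  if (strstr (system_id, "://") != NULL)
    return 0;

  /* Conservatively refuse any ".." so a reference cannot climb out
   * of the allowed directory. */
  if (strstr (system_id, "..") != NULL)
    return 0;

  /* Only accept paths that start with allowed_dir. */
  dir_len = strlen (allowed_dir);
  if (dir_len == 0 || strncmp (system_id, allowed_dir, dir_len) != 0)
    return 0;

  /* Require a path separator right after the prefix, so that
   * "/themes-evil/x" does not match an allowed_dir of "/themes". */
  return allowed_dir[dir_len - 1] == '/' || system_id[dir_len] == '/';
}
```

With an allowed_dir of "/usr/share/themes", a DTD inside a theme directory would be accepted, while "http://example.com/theme.dtd" or a path under "/usr/share/themes-evil" would be refused. Note that refusing such references is exactly the point where the loader stops behaving like a fully conformant XML processor.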

>   Well libxml2 uses callbacks for errors, that's the model everybody
> uses and I'm not sure that was ever questioned by the relatively large
> user base. Since your model seems to impose an asynchronous processing
> I think this will need some discussion on the mailing-list. I cannot
> change radically to a new model without at least a bit of explanation.

Basically I want to write a function:

 MyAppDataStructure*  load_xml_file (const char *filename, GError **error);

So the question is how to do that. The problem is that functions such as
xmlLoadACatalog() (totally random example) don't return any explanation
of the error; you can look at errno, but you don't know if the errno is
for stat() or open() or read() or there could be a parse error or
out-of-memory and errno is junk. So the only possible error to display
to the user is "failed to load catalog" or something, with no further
diagnostic. Also, sometimes on failure it looks to me like
xmlGenericError was called and sometimes it wasn't.

What you want to display for a parse error is the line where the error
happened and a problem description; for an I/O error you want strerror
(errno). GError/DBusError/CORBA_Environment/C++ exceptions are all ways
to propagate this detailed information.
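
As a minimal sketch of that idea, using only the C standard library: the LoadError struct, check_markup_file() and set_parse_error() below are hypothetical names, loosely modeled on the GError pattern rather than taken from GLib or libxml2. The "parser" is a toy that only checks that '<' and '>' pair up; it stands in for a real XML parser so the two kinds of error can be shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error record; not the GLib or libxml2 API. */
typedef enum { LOAD_OK, LOAD_ERROR_IO, LOAD_ERROR_PARSE } LoadErrorKind;

typedef struct {
  LoadErrorKind kind;
  int line;            /* 1-based line of a parse error, 0 otherwise */
  char message[256];   /* text suitable for showing to the user */
} LoadError;

static void
set_parse_error (LoadError *error, int line, const char *what)
{
  error->kind = LOAD_ERROR_PARSE;
  error->line = line;
  snprintf (error->message, sizeof error->message,
            "Parse error on line %d: %s", line, what);
}

/* Returns 1 on success, 0 on failure with *error filled in. */
int
check_markup_file (const char *filename, LoadError *error)
{
  FILE *f;
  int c, line = 1;
  int tag_line = 0;    /* line of the currently open '<', 0 if none */

  assert (filename != NULL && error != NULL);
  error->kind = LOAD_OK;
  error->line = 0;
  error->message[0] = '\0';

  f = fopen (filename, "r");
  if (f == NULL)
    {
      /* Save errno at the failing call, while it still means something. */
      int saved_errno = errno;
      error->kind = LOAD_ERROR_IO;
      snprintf (error->message, sizeof error->message,
                "Failed to open \"%s\": %s", filename, strerror (saved_errno));
      return 0;
    }

  while (error->kind == LOAD_OK && (c = fgetc (f)) != EOF)
    {
      if (c == '\n')
        line++;
      else if (c == '<' && tag_line != 0)
        set_parse_error (error, line, "'<' inside a tag");
      else if (c == '<')
        tag_line = line;
      else if (c == '>' && tag_line == 0)
        set_parse_error (error, line, "'>' without a matching '<'");
      else if (c == '>')
        tag_line = 0;
    }

  if (error->kind == LOAD_OK && ferror (f))
    {
      int saved_errno = errno;
      error->kind = LOAD_ERROR_IO;
      snprintf (error->message, sizeof error->message,
                "Failed to read \"%s\": %s", filename, strerror (saved_errno));
    }
  else if (error->kind == LOAD_OK && tag_line != 0)
    set_parse_error (error, tag_line, "tag is never closed");

  fclose (f);
  return error->kind == LOAD_OK;
}
```

The caller can then branch on error.kind and show error.message directly, instead of guessing whether errno belongs to open(), read(), or nothing at all.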

Not that I really advocate doing this for libxml2; it seems like it
would basically double your API size by adding
xmlLoadACatalogWithError() and so forth. I _don't_ think this is a good
idea, for the record.

> > This is why if someone says they don't like metacity I don't get angry,
> > as long as they don't get personal about it. Nobody is making them use
> > metacity so they don't have any reason to yell if they don't like it,
> > just don't use it. I am a big fan of people having other WMs to use so I
> > can keep mine simple.
> 
>   The BIG difference is that metacity implements the specs related to
> the desktop behaviour (well, I assume so; it's your domain) but gmarkup
> does not adhere to the specs which drive XML parsing. You just can't
> compare non-compliant and compliant code bases.

Again, I'm not defending gmarkup as anything other than a pragmatic
hack.

> > Well, those parts aren't handled properly now; stuff breaks if you try
> > to use them, no matter what XML lib you're using. Apps just don't expect
> > XML to be more than a doctype, elements, attributes, content, and the
> > simple entities, and if the XML lib feeds them other stuff they just get
> > confused or ignore it.
> 
>   Yes, it matters what XML lib you use. Conformant libs will just do the
> same processing and deliver the same output. Non-conformant subset based
> ones will catch fire and burn, or more viciously corrupt data and generate
> wrong logic buried in code. As soon as the code on top starts using a 
> non-conformance deviation, the data and the code are toast.

When I say XML lib I'm saying a conformant lib.

Though I am still hoping a conformant lib can be small and avoid some of
the problematic things like doing I/O behind the app's back.

Havoc




