Re: alternative gmarkup parser



Owen Taylor <otaylor redhat com> writes: 
> This looks OK; I think you are right that it is probably as
> easy to use as the node thing and certainly easier to language-bind.
> 
> It's sort of crying out to be an object, but aside from the
> impossibility of that dependency, that wouldn't be very convenient
> in C. 
>

Yeah. It's bindable as-is, without GObject, by using the user_data for
the wrapper object. Not ideal but works.
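
As a rough sketch of the user_data approach: a binding keeps its wrapper object in user_data and registers a small C trampoline that forwards each callback to it. All names here (SketchParser, Wrapper, sketch_emit_start) are hypothetical stand-ins for illustration, not the actual GMarkup declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the parser's callback table;
 * not the actual GMarkup declaration. */
typedef struct {
  void (*start_element) (const char *name, void *user_data);
} SketchParser;

/* What a language binding might keep per parser: its wrapper object. */
typedef struct {
  int  elements_seen;
  char last_name[32];
} Wrapper;

/* C trampoline: recovers the wrapper from user_data and forwards to it. */
static void
wrapper_start_element (const char *name, void *user_data)
{
  Wrapper *w = user_data;

  w->elements_seen++;
  strncpy (w->last_name, name, sizeof w->last_name - 1);
  w->last_name[sizeof w->last_name - 1] = '\0';
}

/* Stand-in for the parser reporting one element; a NULL callback is skipped. */
static void
sketch_emit_start (const SketchParser *parser, const char *name, void *user_data)
{
  assert (parser != NULL);
  if (parser->start_element != NULL)
    parser->start_element (name, user_data);
}
```

The parser itself never needs to know what user_data points at, which is why no object system is required for this to work.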
 
> As far as I'm concerned, it's OK if you go ahead and commit. Though
> a test suite for glib/tests would probably be a good idea.
>

I'm a step ahead of you - in fact I have 35 test files that cause each
and every error condition. So error handling works. I'm not sure the
parsing works since I only have 2 or 3 tests for that, but hey! ;-)

> Very easy:
> 
> Step a) fix g_utf8_validate() to properly handle trailing incomplete characters
>         (I think I already did this in the tree where I was working
>         on utf-16 handling)
> Step b) Use g_utf8_validate()
>

OK, once you commit that I'll move the code over to use it.

> > typedef enum
> > {
> >   /* Hmm, can't think of any at the moment */
> >   G_MARKUP_FOO = 1 << 0
> >   
> > } GMarkupParseFlags;
> 
> :-) Are empty enumerations valid C?
>

Bring in the language lawyers... 

> Is there a reason to have both the error() callback and the error
> result here?  When would you use one or the other?
>

In general the error callback is useful when the code that supplies
the parser callbacks wants to do something on error, and the error
result is useful when the caller of the parser, or the user of some
code that employs the parser, wants to do something on error.

The error callback is useful so you can put 'fprintf (stderr, "blah")'
in only one place. Any of the callbacks can be NULL for apps where you
don't feel like using them.
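
A minimal sketch of how the two mechanisms could coexist, assuming a toy parser: on failure the optional error callback runs first, then the same error is returned to the caller through an out-parameter. SketchError, SketchParser, sketch_parse and log_error are hypothetical names for illustration, not the real GMarkup or GError API, and the "parsing" rule is deliberately trivial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; not the actual GError or GMarkup declarations. */
typedef struct {
  char message[128];
} SketchError;

typedef struct {
  /* May be NULL if the application does not want the callback. */
  void (*error) (const SketchError *err, void *user_data);
} SketchParser;

/* Returns 1 on success, 0 on failure. On failure the error callback
 * (if any) runs first, then the error is copied to err_out (if any). */
static int
sketch_parse (const SketchParser *parser, const char *text,
              void *user_data, SketchError *err_out)
{
  SketchError err;

  assert (parser != NULL && text != NULL);

  if (text[0] == '<')
    return 1;   /* toy rule: anything starting with '<' "parses" */

  snprintf (err.message, sizeof err.message,
            "Document must begin with an element on line 1 char 1");
  if (parser->error != NULL)
    parser->error (&err, user_data);
  if (err_out != NULL)
    *err_out = err;
  return 0;
}

/* The single place that reports errors; user_data counts the calls. */
static void
log_error (const SketchError *err, void *user_data)
{
  int *count = user_data;

  (*count)++;
  fprintf (stderr, "%s\n", err->message);
}
```

A caller that only wants the return value can pass NULL for both the callback and err_out; a caller that wants central logging sets only the callback.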

> (One thing that I might suggest if you are going to have both is that
> you should pass the character/line into the error() callback in
> machine-readable form.)
>                                                     

I've already added a function that gets the line/char number from the
ParseContext. This is mostly useful when constructing your own errors
inside the other callbacks though; inside the error callback, the
GError already contains the line/char in the message, so if we pass
them in, people will keep writing code that does this:
 fprintf (stderr, "Error on line %d char %d: %s", line, ch, error->message);

Resulting in messages like:
 Error on line 10 char 30: Unknown entity on line 10 char 30
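
To make the duplication concrete, here is a small sketch with two hypothetical formatting helpers (format_redundant and format_plain are illustrative names only). The first prepends the position to a message that already carries it; the second relies on the message alone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prepends line/char to a message, as the
 * fprintf above does, even though the message already has them. */
static void
format_redundant (char *buf, size_t len, int line, int ch, const char *message)
{
  snprintf (buf, len, "Error on line %d char %d: %s", line, ch, message);
}

/* Hypothetical helper: trusts the message to carry its own position. */
static void
format_plain (char *buf, size_t len, const char *message)
{
  snprintf (buf, len, "Error: %s", message);
}
```

Given the message "Unknown entity on line 10 char 30", the first helper repeats the position and the second does not.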

Havoc



