Re: [xml] Schema validity failure for valid document



On Mon, Jan 10, 2005 at 06:44:15PM +0100, Kasimier Buchcik wrote:
I tried an initial implementation.
There is a problem with negated namespaces in wildcards:
[...]
P:\libxml2-lab\tests\2005-01-10>xmllint --noout --schema errRep1.xsd errRep1_0.xml
DEBUG terminal: 0
DEBUG nbval: 3
Element 'x': This element is not expected. Expected is one of {
("http://FOO", "b"), ("http://BAR", any), ("http://FOO", "c") }.
errRep1_0.xml fails to validate

  Small suggestion: use {http://FOO}b as in the XPath REC, it's a shorter
and already well-known notation for namespaced names.
  Otherwise it looks cool! How to retrofit those new dynamic error codes
into the __xmlRaiseError framework may be a bit challenging; maybe separate
the translatable strings from the potential values, which are language
independent.

This is due to the fact that negated namespaces are built using
an automaton approach in xmlSchemaBuildAContentModel:

  deadEnd = xmlAutomataNewState(ctxt->am);
  ctxt->state = xmlAutomataNewTransition2(ctxt->am,
    start, deadEnd, BAD_CAST "*", wild->negNsSet->value, type);
  ctxt->state = xmlAutomataNewTransition2(ctxt->am,
    start, NULL, BAD_CAST "*", BAD_CAST "*", type);
  xmlAutomataNewEpsilon(ctxt->am, ctxt->state, end);

The namespace is let through and then caught with a dead-end.

Any ideas?

   The problem is that we reach that "dead state", and from there we can't
extract useful information.
   It sounds like the reverse of a reachable state. Basically, when you
construct the automata you can build a list of dead states, i.e. any state
from which you can't possibly transition to a final state. We could build
that list and then error out (or roll back if not deterministic) earlier.
   It might be cheap to do based on the epsilon transition elimination,
but this may not help getting an accurate error code or message out of the
regexp. For example, we could save "start" as the error state when going
through that transition to the dead state; then xmlRegExecErrInfo() would
extract the values from the state before going through the error transition.
This seems a more "formal" approach to solving that case of error
extraction problems, but would that really work for you?
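
   A minimal sketch of the dead-state list idea, on a toy automaton rather
than the real xmlAutomata structures. The ToyAutomaton type and the toy_*
functions are hypothetical names used only for illustration. A state is
treated as live if it is final or has an edge to a live state; everything
else is dead.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_STATES 16

/* Hypothetical toy automaton; this is NOT libxml2's xmlAutomata. */
typedef struct {
    int nstates;
    bool edge[TOY_MAX_STATES][TOY_MAX_STATES]; /* edge[i][j]: i -> j exists */
    bool is_final[TOY_MAX_STATES];
} ToyAutomaton;

void toy_init(ToyAutomaton *am, int nstates)
{
    memset(am, 0, sizeof(*am));
    am->nstates = nstates;
}

void toy_add_edge(ToyAutomaton *am, int from, int to)
{
    am->edge[from][to] = true;
}

/*
 * Set dead[s] to true for every state s from which no final state can be
 * reached. Liveness is propagated backwards from the final states until
 * nothing changes any more.
 */
void toy_find_dead_states(const ToyAutomaton *am, bool dead[TOY_MAX_STATES])
{
    bool live[TOY_MAX_STATES];
    bool changed = true;
    int i, j;

    for (i = 0; i < am->nstates; i++)
        live[i] = am->is_final[i];

    while (changed) {
        changed = false;
        for (i = 0; i < am->nstates; i++) {
            if (live[i])
                continue;
            for (j = 0; j < am->nstates; j++) {
                if (am->edge[i][j] && live[j]) {
                    live[i] = true;
                    changed = true;
                    break;
                }
            }
        }
    }

    for (i = 0; i < am->nstates; i++)
        dead[i] = !live[i];
}
```

   With states start, deadEnd, an intermediate state and end wired like the
negated wildcard above, only deadEnd should end up in the list.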

One workaround I see would be to add a special negating character to
the namespace name when calling xmlAutomataNewTransition2, and augment
xmlRegStrEqualWildcard to handle negations:
Example:
  "*|~http://FOO" - the tilde indicates a negation

But this would be a non-automaton approach, plus I don't know if it's
too hackish.
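
   A rough sketch of what such a tilde-aware comparison could look like. It
is a standalone illustration, not the actual xmlRegStrEqualWildcard code;
the toy_* function names and the assumption that both strings have the form
"local|namespace" are made up for this example.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Match one component of a hypothetical "local|namespace" pattern:
 *   "*"  matches any value,
 *   "~X" matches any value except X,
 *   anything else must be equal to the value.
 */
bool toy_component_matches(const char *pat, size_t patlen,
                           const char *val, size_t vallen)
{
    if (patlen == 1 && pat[0] == '*')
        return true;
    if (patlen > 0 && pat[0] == '~')
        return !(patlen - 1 == vallen &&
                 memcmp(pat + 1, val, vallen) == 0);
    return patlen == vallen && memcmp(pat, val, vallen) == 0;
}

/* Compare "local|namespace" strings, splitting both at the first '|'. */
bool toy_wildcard_matches(const char *pattern, const char *value)
{
    const char *psep = strchr(pattern, '|');
    const char *vsep = strchr(value, '|');

    if (psep == NULL || vsep == NULL)
        return false;
    return toy_component_matches(pattern, (size_t)(psep - pattern),
                                 value, (size_t)(vsep - value)) &&
           toy_component_matches(psep + 1, strlen(psep + 1),
                                 vsep + 1, strlen(vsep + 1));
}
```

   Here "*|~http://FOO" would accept "x|http://BAR" and reject
"x|http://FOO", without needing a dead-end state in the automaton.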

   I would prefer keeping an automata-based approach as much as possible.
   Maybe the simplest is to attach some error information to such dead
states when creating them, keep it in the xmlRegexp and extract it in the
xmlRegExecErrInfo() routine, but it may as well be an ugly workaround and
not formal enough.
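
   As an illustration only, here is how error information attached to a dead
state could be picked up at execution time, together with the state before
the failing transition. All types and names (ToyState, ToyExec,
toy_exec_push) are hypothetical and much simpler than the real xmlRegexp and
xmlRegExecCtxt structures.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_TRANS 8

typedef struct {
    const char *ns;  /* namespace label; "*" matches any namespace */
    int target;      /* index of the target state */
} ToyTrans;

typedef struct {
    ToyTrans trans[TOY_MAX_TRANS];
    int ntrans;
    const char *deadInfo; /* non-NULL marks a dead state and says why */
} ToyState;

typedef struct {
    int state;           /* current state */
    int errState;        /* state before the failing transition, or -1 */
    const char *errInfo; /* information recorded for the failure */
} ToyExec;

/*
 * Push one namespace token. Transitions are tried in the order they were
 * added. Returns 0 on success and -1 on error; on error the executor keeps
 * the state it was in before the failing transition, so the expected values
 * can still be computed from that state.
 */
int toy_exec_push(ToyExec *exec, const ToyState *states, const char *ns)
{
    const ToyState *cur = &states[exec->state];
    int i;

    for (i = 0; i < cur->ntrans; i++) {
        const ToyTrans *t = &cur->trans[i];

        if (strcmp(t->ns, "*") != 0 && strcmp(t->ns, ns) != 0)
            continue;
        if (states[t->target].deadInfo != NULL) {
            /* Do not enter the dead state; report from the state before. */
            exec->errState = exec->state;
            exec->errInfo = states[t->target].deadInfo;
            return -1;
        }
        exec->state = t->target;
        return 0;
    }
    exec->errState = exec->state;
    exec->errInfo = "no matching transition";
    return -1;
}
```

   The dead state created for the negated namespace would carry a string
such as "namespace excluded by wildcard", and an error reporting routine
could then read errState and errInfo instead of the dead state itself.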

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


