Re: [xml] xmllint - Newbie THINKS there may be a whitespace error in 2.6.23



Hi, 

-----Original Message-----
On behalf of John Navratil
Sent: Wednesday, 26 April 2006 15:52


[...]

The example schema and two example documents which do not appear to 
exhibit this behavior are:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified">
 <xs:element name="A">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="B">
    </xs:element>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>

<A>
 <B>
 </B>
</A>

<A>
 <B/>
</A>

The difference in the schema is that the entire structure of node 'B' is
removed.  If, however, the 'complexType' tag is restored, giving...

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified">
 <xs:element name="A">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="B">
     <xs:complexType>
     </xs:complexType>
    </xs:element>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>

the error returns.  From your description, I am prepared to accept this
as the null condition where the type of the node is mixed by virtue of
the 'complexType' tag even though it contains nothing, and in my initial
example contained only an attribute definition.

The observation you made is based on the following mechanisms
of XML Schema:
- The former schema: if no type is specified for the element
  declaration, the type defaults to xs:anyType, which accepts
  any content and any attributes.
- The latter schema: the spec says:
    "This case is understood as shorthand for complex content
     restricting the ·ur-type definition·"
  So this restricts xs:anyType. If a restriction does not
  define any particles, the result is an empty content model, so no
  content is allowed.
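
To make the difference between the two instance documents concrete, here
is a minimal sketch using Python's standard xml.etree.ElementTree (not
libxml2 or xmllint). The helper name has_character_content is made up for
illustration. It only shows what a parser reports inside "B"; it does not
perform schema validation. As far as I understand the spec, an empty
content model allows no character children at all, which is why the
whitespace-only text in the first document matters.

```python
import xml.etree.ElementTree as ET


def has_character_content(xml_text, tag="B"):
    """Return True if the first <tag> child of the root has character data
    directly inside it (whitespace counts), False otherwise."""
    elem = ET.fromstring(xml_text).find(tag)
    # elem.text is None for <B/> and "\n " for <B>\n </B>.
    return bool(elem.text)


# The two instance documents from the message above.
with_newline = "<A>\n <B>\n </B>\n</A>"
self_closed = "<A>\n <B/>\n</A>"

print(has_character_content(with_newline))
print(has_character_content(self_closed))
```

Under the former schema (xs:anyType) both documents are acceptable; under
the latter, only the document without character data inside "B" should be.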

I would like to explain the "real world" example which led to this for
your consideration.  We are using a message to transfer data from one
database to another.  We wished to establish an optional relation
between one entity in our message and another.  In our case a
customer might be a retail customer or one who purchases through a
distributor.  This led to the XML fragment...

   <xs:element name="Distributor" minOccurs="0">
    <xs:complexType>
     <xs:attribute name="ID" type="IDType" use="required"/>
    </xs:complexType>
   </xs:element>

so the user could code something like '<Distributor ID="1234" />'.
Notice that we have a node with a 'complexType' in order to provide the
attribute definition, but which has no sub-nodes (i.e. not very mixed).

The script which generated this code was designed to emit the start
node, then recursively render any content, then render the end node.
This led to the document fragment of the form...

<Distributor ID="1234">
</Distributor>
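
One way to avoid the stray whitespace at the source is to have the
generator emit an empty-element tag whenever a node has no content. The
original script is not shown, so the following is only a hypothetical
sketch in Python; the function name render and the (name, attrs,
children) tuple layout are assumptions made for illustration.

```python
from xml.sax.saxutils import quoteattr


def render(name, attrs=None, children=()):
    """Serialize one element. Each child is a (name, attrs, children) tuple
    (hypothetical layout, not taken from the original script)."""
    attr_text = "".join(
        f" {key}={quoteattr(str(value))}" for key, value in (attrs or {}).items()
    )
    if not children:
        # No content: emit an empty-element tag so that no whitespace
        # ends up between a start tag and an end tag.
        return f"<{name}{attr_text}/>"
    inner = "\n".join(render(*child) for child in children)
    return f"<{name}{attr_text}>\n{inner}\n</{name}>"


print(render("Distributor", {"ID": "1234"}))
```

With this approach the Distributor node is written as a single
empty-element tag, so the question of blanks inside it does not arise.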

This violates the (not entirely unreasonable) assumption

"that no one writes...
<B>
</B>
... if he doesn't want those space characters."

May I suggest that xmllint treat a complexType with nothing but
attributes as mixed-type for purposes of '--noblanks' processing.  Your
'--noblanksall' would work as well.  Perhaps something more specific
such as '--attrs-strip-blanks' is more appropriate.

The mechanisms of --noblanks and XML Schema validation are not related;
i.e., --noblanks is applied as a separate step before validation.

But maybe what you actually want is to define the different
characteristics of the content in the schema. The following
schema allows the character content of "B" to contain whitespace:
it uses xs:token as the content type, which has an
xs:whiteSpace facet of "collapse", so whitespace is removed
from the character content before the xs:length facet of "0"
is applied.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="qualified">

 <xs:simpleType name="foo">
  <xs:restriction base="xs:token">
   <xs:length value="0"/>
  </xs:restriction>
 </xs:simpleType>

 <xs:element name="A">
  <xs:complexType>
   <xs:sequence>

    <xs:element name="B">
     <xs:complexType>
      <xs:simpleContent>
       <xs:extension base="foo">
        <xs:attribute name="ID" type="xs:string"/>
       </xs:extension>
      </xs:simpleContent>
     </xs:complexType>
    </xs:element>

   </xs:sequence>
  </xs:complexType>
 </xs:element>

</xs:schema>
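
The order of operations the schema above relies on (collapse first, then
check the length) can be sketched in plain Python. This is not how
libxml2 implements it; the function names collapse and is_valid_foo are
invented for illustration, and only the four XML whitespace characters
(space, tab, LF, CR) are treated as whitespace.

```python
import re


def collapse(value):
    """Mimic xs:whiteSpace="collapse": turn each run of space, tab, LF or CR
    into one space, then strip leading and trailing spaces."""
    return re.sub(r"[ \t\n\r]+", " ", value).strip(" ")


def is_valid_foo(raw):
    """Hypothetical check mirroring type "foo": after collapsing, the value
    must satisfy xs:length value="0"."""
    return len(collapse(raw)) == 0


print(is_valid_foo("\n"))    # content of <B ID="1234">\n</B>
print(is_valid_foo(""))      # content of <B ID="1234"/>
print(is_valid_foo(" x "))   # real character content
```

So both the empty-element form and the form with only a line break
between the tags collapse to the empty string, while any non-whitespace
text does not.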


Regards,

Kasimier


