[xml] Decoding in attributes broken for libxml2-2.6.2



Hi,

I have a simple test XML document:

<?xml version="1.0" encoding="UTF-8"?>
<Document uri="http://local/?a=1&amp;b=2" />

with an escaped ampersand in an attribute value.

And I have a matching test program, which uses the SAX interface to parse the document (linked with g++):

#include <libxml/SAX.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

static int FD = 0;

static int myXmlInputReadCallback(void *arg, char *buffer, int len) {
  return read(FD, buffer, len);
}


static int myXmlInputCloseCallback(void *arg) {
  close(FD);
  return 0;
}


static void myStartElement(void *arg, const xmlChar *name, const xmlChar **atts) {
  int i;
  if (atts) {
    for (i = 0; atts[i]; i += 2)
      printf("attr name: %s value: %s\n", atts[i], atts[i + 1]);
  }
}

int main(int argc, char *argv[]) {
  xmlSAXHandler handler = {0};
  xmlParserCtxtPtr ctxt;

  /* Bail out if no file name was given or the file cannot be opened. */
  if (argc < 2 || (FD = open(argv[1], O_RDONLY)) < 0)
    return 1;
  handler.startElement = reinterpret_cast<startElementSAXFunc>(myStartElement);
  ctxt = xmlCreateIOParserCtxt(&handler, 0,
                               reinterpret_cast<xmlInputReadCallback>(myXmlInputReadCallback),
                               reinterpret_cast<xmlInputCloseCallback>(myXmlInputCloseCallback),
                               0, XML_CHAR_ENCODING_NONE);
  xmlParseDocument(ctxt);
  xmlFreeParserCtxt(ctxt);
  return 0;
}


If I link the program against, e.g., libxml2-2.5.10 (or an earlier version), I get the expected result:

attr name: uri value: http://local/?a=1&amp;b=2

If I link the program against libxml2-2.6.2, I get this unexpected result:

attr name: uri value: http://local/?a=1&#38;b=2

The URL is no longer valid as it stands, because the # sign introduces a fragment identifier (anchor). Is this new behavior intended, or is it a bug?
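
Until that is clarified, one possible stopgap is to normalize the attribute value inside the startElement callback, so that both spellings of the escaped ampersand end up as a plain "&". The following is only a sketch under that assumption: decodeAmpersand is a hypothetical helper, not a libxml2 function, and it handles only the ampersand forms "&amp;", "&#38;" and "&#x26;", leaving every other reference untouched.

```cpp
#include <string>

// Hypothetical helper (not part of libxml2): replaces the escaped ampersand
// forms "&amp;", "&#38;" and "&#x26;" with a literal '&'.
// The input is scanned once, so "&amp;amp;" becomes "&amp;" and is not
// decoded a second time. All other references (e.g. "&lt;") are kept as-is.
std::string decodeAmpersand(const std::string &in) {
  static const char *const forms[] = {"&amp;", "&#38;", "&#x26;"};
  const std::size_t formCount = sizeof(forms) / sizeof(forms[0]);

  std::string out;
  std::size_t i = 0;
  while (i < in.size()) {
    bool matched = false;
    if (in[i] == '&') {
      for (std::size_t k = 0; k < formCount; ++k) {
        const std::string form(forms[k]);
        // compare() clamps the length at the end of the string, so a
        // truncated reference simply fails to match.
        if (in.compare(i, form.size(), form) == 0) {
          out += '&';
          i += form.size();
          matched = true;
          break;
        }
      }
    }
    if (!matched)
      out += in[i++];
  }
  return out;
}
```

In myStartElement the value could then be passed through the helper before use, e.g. decodeAmpersand(reinterpret_cast<const char *>(atts[i + 1])), which would give the same string whether the parser reports "&amp;" or "&#38;".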

Ciao,
....................................................................
Marc Ewert              neofonie GmbH          Tel: +49.30.24627-241
Projektleitung          Robert-Koch-Platz 4    FAX: +49.30.24627-120
ewert neofonie de       D-10115 Berlin         Web: www.neofonie.de



