Hi Daniel, All,
In libxml2-2.4.x, calling xmlNodeDump produced a UTF-8 encoded
string. This seems to have changed in libxml2-2.6.x (or probably
during 2.5.x): instead of UTF-8, it now emits character references
for all non-ASCII characters.
I must say that this change is quite annoying. Why isn't the new
xmlNodeDump called xmlNodeDumpASCII or something at least for the sake
of backward compatibility?
Anyway, the milk was spilt, I guess I have to learn to live with
that. But I'd like at least to make XML::LibXML's $node->toString()
behavior consistent, so I have to find a suitable work-around.
Since xmlNodeDump doesn't provide any parameter for setting the
requested encoding (which would always be UTF-8 in our case), I
explored xmlsave.c and came up with the following code, which is
rather long and feels rather low-level (esp. the memset).
    xmlBufferPtr buffer;
    xmlOutputBufferPtr outbuf;

    buffer = xmlBufferCreate();
    outbuf = (xmlOutputBufferPtr) xmlMalloc(sizeof(xmlOutputBuffer));
    if (outbuf != NULL) {
        /* zero all fields: no encoder, no I/O callbacks */
        memset(outbuf, 0, (size_t) sizeof(xmlOutputBuffer));
        outbuf->buffer = buffer;
        xmlNodeDumpOutput(outbuf, doc, root_element, 0, 0, "UTF-8");
        xmlFree(outbuf);
        if ( xmlBufferLength(buffer) > 0 ) {
            /* the serialized node lives in the xmlBuffer */
            const xmlChar *ret = xmlBufferContent(buffer);
            printf("%s\n", (const char *) ret);
        }
    }
    xmlBufferFree(buffer);
I wonder, is there any shortcut for that?
Also, while this works, I was surprised that I got a UTF-8 encoded
result even when I changed the encoding parameter of xmlNodeDumpOutput
to "iso-8859-2" (Linux, iconv is compiled in). I won't do that in
XML::LibXML, but still... :-/
Thanks,
-- Petr