Re: [xml] Useless function calls in xmlSetProp()?
- From: Julien Charbon <jch 4js com>
- To: veillard redhat com
- Cc: xml gnome org
- Subject: Re: [xml] Useless function calls in xmlSetProp()?
- Date: Fri, 08 Feb 2008 17:17:31 +0100
Daniel Veillard wrote:
> On Mon, Jan 28, 2008 at 03:26:02PM +0100, Julien Charbon wrote:
>> Hum, sure, UTF-8 validation shall not be removed. Anyway, to evaluate
>> this extra complexity, I made a simple program [see below] that does
>> 1000 iterations of xmlSetProp(node, name, value) and calculates the sum
>> of all these calls with various 'value' parameters:
>
> okay, this makes a difference, agreed, is that really perceptible
> on a real application run ? I'm unsure ...

True. On our application, the performance improvement is between 0.5% and
0.7%. More than nothing, but certainly not a super fast revolution...

>> Patch applied to current libxml2 trunk:
>
> But I tend to like the patch for a few reasons:
> - it cleans things up and shows the actual process
> - it enforces the UTF-8 check in a clear manner
> - it apparently doesn't change the actual behaviour of the API
>
>> == tree.c.patch ==
>
> The patch looks fine to me, if you can provide the final version as an
> email attachment, I will try to apply it,

Seems fine and clear. Attached to this email is the "final" patch
against current trunk.
Note:
Setting doc->encoding to "ISO-8859-1" when the value is not valid UTF-8
is carried over from the previous xmlEncodeEntitiesReentrant() call; the
patch keeps that behaviour.
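For readers who want to see what kind of check is being enforced on the
attribute value, here is a minimal standalone sketch of a UTF-8
well-formedness test. It is only an illustration: is_valid_utf8() is a
hypothetical helper, not libxml2 code, and it may be stricter or looser
than what xmlCheckUTF8() actually accepts.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml2): returns 1 if the
 * NUL-terminated byte string s is well-formed UTF-8, 0 otherwise.
 */
static int is_valid_utf8(const unsigned char *s)
{
    assert(s != NULL);

    while (*s != 0) {
        unsigned char c = *s;
        size_t extra;      /* number of continuation bytes expected */
        unsigned long cp;  /* code point decoded so far */

        if (c < 0x80) {                    /* plain ASCII byte */
            s++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {   /* 2-byte sequence */
            extra = 1;
            cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {   /* 3-byte sequence */
            extra = 2;
            cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {   /* 4-byte sequence */
            extra = 3;
            cp = c & 0x07;
        } else {
            return 0;  /* stray continuation byte or invalid lead byte */
        }

        s++;
        for (size_t i = 0; i < extra; i++, s++) {
            /* A NUL here fails this test too, so we never read past the end. */
            if ((*s & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (unsigned long)(*s & 0x3F);
        }

        /* Reject overlong forms, UTF-16 surrogates and values above U+10FFFF. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) ||
            cp > 0x10FFFF)
            return 0;
    }
    return 1;
}
```

With a check of this kind, a Latin-1 string such as "caf\xE9" is rejected
while its UTF-8 form "caf\xC3\xA9" is accepted, which is the situation in
which the patch reports XML_TREE_NOT_UTF8 and labels the document as
ISO-8859-1.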
Thanks.
--
Julien
Index: include/libxml/xmlerror.h
===================================================================
--- include/libxml/xmlerror.h (revision 3690)
+++ include/libxml/xmlerror.h (working copy)
@@ -398,6 +398,7 @@
     XML_TREE_INVALID_HEX = 1300,
     XML_TREE_INVALID_DEC, /* 1301 */
     XML_TREE_UNTERMINATED_ENTITY, /* 1302 */
+    XML_TREE_NOT_UTF8, /* 1303 */
     XML_SAVE_NOT_UTF8 = 1400,
     XML_SAVE_CHAR_INVALID, /* 1401 */
     XML_SAVE_NO_DOCTYPE, /* 1402 */
Index: tree.c
===================================================================
--- tree.c (revision 3690)
+++ tree.c (working copy)
@@ -92,6 +92,9 @@
         case XML_TREE_UNTERMINATED_ENTITY:
             msg = "unterminated entity reference %15s\n";
             break;
+        case XML_TREE_NOT_UTF8:
+            msg = "string is not in UTF-8\n";
+            break;
         default:
             msg = "unexpected error number\n";
     }
@@ -1814,11 +1817,15 @@
         cur->name = name;
     if (value != NULL) {
-        xmlChar *buffer;
         xmlNodePtr tmp;
-        buffer = xmlEncodeEntitiesReentrant(doc, value);
-        cur->children = xmlStringGetNodeList(doc, buffer);
+        if(!xmlCheckUTF8(value)) {
+            xmlTreeErr(XML_TREE_NOT_UTF8, (xmlNodePtr) doc,
+                       NULL);
+            if (doc != NULL)
+                doc->encoding = xmlStrdup(BAD_CAST "ISO-8859-1");
+        }
+        cur->children = xmlNewDocText(doc, value);
         cur->last = NULL;
         tmp = cur->children;
         while (tmp != NULL) {
@@ -1827,7 +1834,6 @@
                 cur->last = tmp;
             tmp = tmp->next;
         }
-        xmlFree(buffer);
     }
     /*
@@ -6466,11 +6472,15 @@
         prop->last = NULL;
         prop->ns = ns;
         if (value != NULL) {
-            xmlChar *buffer;
             xmlNodePtr tmp;
-            buffer = xmlEncodeEntitiesReentrant(node->doc, value);
-            prop->children = xmlStringGetNodeList(node->doc, buffer);
+            if(!xmlCheckUTF8(value)) {
+                xmlTreeErr(XML_TREE_NOT_UTF8, (xmlNodePtr) node->doc,
+                           NULL);
+                if (node->doc != NULL)
+                    node->doc->encoding = xmlStrdup(BAD_CAST "ISO-8859-1");
+            }
+            prop->children = xmlNewDocText(node->doc, value);
             prop->last = NULL;
             tmp = prop->children;
             while (tmp != NULL) {
@@ -6479,7 +6489,6 @@
                     prop->last = tmp;
                 tmp = tmp->next;
             }
-            xmlFree(buffer);
         }
         if (prop->atype == XML_ATTRIBUTE_ID)
             xmlAddID(NULL, node->doc, value, prop);