It is also relatively trivial to do it yourself ... the offsets of the
nodes, in my experience, are always the same for the same document. So
you can keep in memory the offset from the base node (not the address
of the node, which is not constant). After the first read you would
then have direct access to any data you wanted in subsequent passes.

I would also question the need for this if (as it sounds) you would
only be reading the document twice. In the "old days" I agonized over
writing super-efficient code, and in more recent years I have found
that two parses, one after the other, are hardly noticeable even at
extreme volume, thanks to modern machines and caches. We tend to do a
lot of smaller documents with libxml2, but we process in and out about
50,000 documents per hour on a relatively modest computer (4-core IBM
AIX). Many are read over and over because it is lazier and easier to
code that way, and the pressure to get things working now exceeds the
pressure to write the ultimate performance code.

What I am saying is that you might cringe at, and instinctively hate
(as I do), the idea of just reading it twice, but you might want to run
some benchmarks and see whether you really care or not.

Eric

On 3/31/2013 12:00 AM, Liam R E Quin wrote:
On Sat, 2013-03-30 at 08:02 +0100, Martin B. wrote:
[...]

--
Eric S. Eberhard
VICS
PO Box 3661
Camp Verde, AZ 86322

928-567-3727 work
928-301-7537 cell

http://www.vicsmba.com/index.html (our work)
http://www.vicsmba.com/ourpics/index.html (fun pictures)
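
For anyone who wants to try the benchmark suggested above, here is a
minimal sketch of the measurement. It is written in Python against the
standard library's xml.etree.ElementTree rather than libxml2, purely so
that it runs without extra dependencies. The helper names make_document
and time_parses and the synthetic test document are made up for
illustration, and the numbers it prints say nothing about libxml2
itself.

```python
import time
import xml.etree.ElementTree as ET


def make_document(n_items):
    """Build a synthetic XML document (hypothetical test data)."""
    rows = "".join(
        f'<item id="{i}"><name>item {i}</name></item>' for i in range(n_items)
    )
    return f"<root>{rows}</root>".encode("utf-8")


def time_parses(data, passes, repeats=5):
    """Best wall-clock time for parsing `data` `passes` times in a row."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(passes):
            ET.fromstring(data)  # stand-in for the real parse call
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    data = make_document(10_000)
    once = time_parses(data, passes=1)
    twice = time_parses(data, passes=2)
    print(f"one parse:  {once * 1000:.2f} ms")
    print(f"two parses: {twice * 1000:.2f} ms")
    print(f"extra cost: {(twice - once) * 1000:.2f} ms per document")
```

To get figures that matter, swap in your real parse call and a
representative document, then compare the extra cost per document
against your actual hourly volume.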