On Mon, Sep 07, 2015 at 09:47:40AM +0200, Christian Ceelen wrote:
> Hi,
>
> Did anyone find the time to look into this or test this?
Sorry Christian,
not forgotten but not done yet, will try this week !
thanks,
Daniel
> Cheers,
> Christian
>
> On Mon, Aug 31, 2015 at 11:47 AM, Christian Ceelen <
> christian ceelen gmail com> wrote:
>
> > Hi,
> >
> > i think i no know what i did wrong while filling the hash table. And yes.
> > My bad ... :
> >
> > xslt.c: 5448
> > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI,
> > template);
> >
> > This adds a pointer to the xmlNode in the parsed stylesheet to the hash
> > table. For just detecting collisions that seems to be ok.
> > But it is utterly useless for using it in xsltFindTemplate in 'import.c'
> > as that expects a pointer to the compiled template struct 'xsltTemplate'.
> > So my initial patch for reusing it doing a bad pointer cast here:
> > --- a/libxslt/imports.c
> > +++ b/libxslt/imports.c
> > @@ -400,19 +400,12 @@ xsltFindTemplate(xsltTransformContextPtr ctxt, const
> > xmlChar *name,
> > return(NULL);
> > style = ctxt->style;
> > while (style != NULL) {
> > - cur = style->templates;
> > - while (cur != NULL) {
> > - if (xmlStrEqual(name, cur->name)) {
> > - if (((nameURI == NULL) && (cur->nameURI == NULL)) ||
> > - ((nameURI != NULL) && (cur->nameURI != NULL) &&
> > - (xmlStrEqual(nameURI, cur->nameURI)))) {
> > - return(cur);
> > - }
> > - }
> > - cur = cur->next;
> > - }
> > -
> > - style = xsltNextImport(style);
> > + if(style->namedTemplatesHash != NULL){
> > + cur = (xsltTemplatePtr)
> > xmlHashLookup2(style->namedTemplatesHash, name, nameURI);
> > + if(cur != NULL)
> > + return (cur);
> > + }
> > + style = xsltNextImport(style);
> > }
> > return(NULL);
> > }
> >
> > This explains why it was not working as i had hoped. The hash table
> > returns the xmlNodePtr instead of the xsltTemplatePtr expected by
> > xsltFindTemplate. The patch Daniel does the same, it expects
> > xsltTemplatePtr.
> >
> > So changing in xslt.c: 5448
> > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI,
> > template);
> > to
> > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI,
> > ret);
> > Makes it possible to reuse the hash table for the named templates also in
> > xsltFindTemplate.
> >
> > I pushed the updated changes to github and attached a clean patch to the
> > email. It is created against 1.1.28.
> >
> > The patch adds the hash table initialization and population to
> > xsltAddTemplate (pattern.c), sets the initial size to a more reasonable 8
> > entries, reuses it in xsltFindTemplate (imports.c), and adds a test case to
> > check that an error is emitted during compilation for template names that
> > are not unique.
> >
> > So far i haven't replaced the helper functions as Nico suggested, as i was
> > thinking of moving the look up code to imports.c and share part of a lookup
> > local to a single stylesheet between xsltFindTemplate and
> > xsltParseStylesheetTemplate. I am not sure if you rather go for a small
> > code duplication or for having small helper classes. In terms of
> > performance i doubt that there is a noticeable difference.
> >
> > Best regards,
> > Christian
> >
> >
> > On Mon, Aug 31, 2015 at 8:27 AM, Daniel Veillard <veillard redhat com>
> > wrote:
> >
> >> On Thu, Aug 20, 2015 at 07:21:50PM +0200, Christian Ceelen wrote:
> >> > Hi,
> >> >
> >> > Ok. I think i found the answer by trial and error myself. I was not
> >> able to
> >> > look up the named templates in the hash table for the (pre-) compiled
> >> > templates.
> >> >
> >> > So i followed Nick's suggest and came up with this first draft:
> >> > https://github.com/GNOME/libxslt/pull/1
> >>
> >> So I applied the patch, this looks reasonable, we need to add to the
> >> end of the structure, to minimize risk of ABI breakage. Should change
> >> the 1024 value that Nick pointed out.
> >>
> >> > I tested locally and didn't get any surprises. I also created a simple
> >> test
> >> > with 2 named templates that have the same name to test for collision
> >> > detection. The new code behaves in that case as the previous one. But so
> >> > far I have no idea yet where and how to add it to the tests in the test
> >> > suite as this is a test for failure detection.
> >> >
> >> > Looking at a second glance at the imports.c and xsltFindTemplate i
> >> think it
> >> > would be worthwhile to add the new function i created for lookup in the
> >> > hash table in the file imports.c and try to encapsulate the behavior
> >> there.
> >> > Do you think i should refactor the code in that direction?
> >>
> >> That's where things started to go wrong, I modified xsltFindTemplate()
> >> to look in the namedTemplatesHash instead of digging the templates list,
> >> patch is attached, and that raised a number of errors in the regression
> >> tests.
> >> So it seems to me that namedTemplatesHash isn't initialized with all the
> >> named templates from the stylesheet and hence the collision detection
> >> is also likely wrong even if it passed an initial simple test.
> >>
> >> So I think we need to give it more scrutiny, i.e. each time a template
> >> is
> >> added to style->templates we need to make sure we also add it to
> >> style->namedTemplatesHash.
> >>
> >>
> >> > Best regards,
> >> >
> >> > Christian
> >>
> >>
> >> > On Thu, Aug 20, 2015 at 10:28 AM, Christian Ceelen <
> >> > christian ceelen gmail com> wrote:
> >> >
> >> > > Hi Nick,
> >> > >
> >> > > Thank you very much for the answer. That sounds like a very good
> >> idea. Yet
> >> > > would the validation still be correct, if the hashtable is local to
> >> the
> >> > > stylesheet in which the template was defined? I was wondering what
> >> would
> >> > > happen if the same template name is defined in two files, where one
> >> of the
> >> > > files then imports or includes the other. Concerning the imports i
> >> think
> >> > > the standard is clear, the defined template has lower preference, so
> >> having
> >> > > two with the same name sounds legit to me. But what happens when the
> >> file
> >> > > is included?
> >> > > My current understanding is that real validation requires to walk the
> >> > > include tree for each template name. Would that be correctly
> >> replicating
> >> > > the existing behavior?
> >> > >
> >> > > Btw. In xsltInernals.h:L1514 i found in struct _xsltStylesheet:
> >> > > void *templatesHash;
> >> > > Is this table already containing what we want?
> >> > >
> >> > > Best regards,
> >> > >
> >> > > Christian
> >> > >
> >> > > On Wed, Aug 19, 2015 at 1:00 PM, Nick Wellnhofer <wellnhofer aevum de
> >> >
> >> > > wrote:
> >> > >
> >> > >> On 19/08/2015 10:47, Christian Ceelen wrote:
> >> > >>
> >> > >>> At the moment i do not have a deep understanding of the libXSLT
> >> code. My
> >> > >>> current guess is, that this point at which the template name is
> >> > >>> validated to
> >> > >>> be unique in the stylesheet. The current algorithm seems to be
> >> walking
> >> > >>> linearly through the list of all already created templates and is
> >> > >>> comparing
> >> > >>> the name with each. Given that the compilation process is walking
> >> > >>> through all
> >> > >>> templates this loop means that we have an O(n^2) algorithm (with n
> >> being
> >> > >>> the
> >> > >>> number of template instances in the stylesheet to compile).
> >> > >>> The huge number of templates in my XSLTs are just so far over the
> >> edge,
> >> > >>> that
> >> > >>> the compilation takes 35s. I ran a test in which i skipped the
> >> loop. This
> >> > >>> reduced the compilation time to below 2.5 s.
> >> > >>>
> >> > >>> Would anyone let me know if i have understood the code? What can i
> >> do to
> >> > >>> improve the code that would get easily accepted and released? I am
> >> open
> >> > >>> to any
> >> > >>> kind of suggestions on what to do to speed this validation step up
> >> with
> >> > >>> a data
> >> > >>> structure or mechanism already existing ?
> >> > >>>
> >> > >>
> >> > >> Yes, template names are verified by searching a linked list which is
> >> > >> O(n^2). Template lookup by name uses the same list:
> >> > >>
> >> > >>
> >> https://github.com/GNOME/libxslt/blob/master/libxslt/imports.c#L375
> >> > >>
> >> > >> It shouldn't be too hard to change the code to use an xmlHashTable:
> >> > >>
> >> > >>
> >> https://github.com/GNOME/libxml2/blob/master/include/libxml/hash.h
> >> > >>
> >> > >> Simply add a the hash table to struct _xsltStylesheet
> >> > >>
> >> > >>
> >> > >>
> >> https://github.com/GNOME/libxslt/blob/master/libxslt/xsltInternals.h#L1509
> >> > >>
> >> > >> add some initialization and cleanup, then use xmlHashAddEntry2 and
> >> > >> xmlHashLookup2 with the template's name and namespace URI. This
> >> should make
> >> > >> name verification O(n) overall and lookup of a single template by
> >> name O(1).
> >> > >>
> >> > >> What is now the developer or working repository for libxslt?
> >> > >>> I found the git repository on git.gnome.org <http://git.gnome.org>,
> >> the
> >> > >>> GNOME/libxslt on github, plus several forks which seem to be working
> >> > >>> repos.
> >> > >>> Can you point me please to the repo against which i should use for
> >> > >>> further
> >> > >>> investigation or patches?
> >> > >>>
> >> > >>
> >> > >> The official repo is the one at git.gnome.org:
> >> > >>
> >> > >> https://git.gnome.org/browse/libxslt/
> >> > >>
> >> > >> Nick
> >> > >>
> >> > >> _______________________________________________
> >> > >> xslt mailing list, project page http://xmlsoft.org/XSLT/
> >> > >> xslt gnome org
> >> > >> https://mail.gnome.org/mailman/listinfo/xslt
> >> > >>
> >> > >
> >> > >
> >>
> >> > _______________________________________________
> >> > xslt mailing list, project page http://xmlsoft.org/XSLT/
> >> > xslt gnome org
> >> > https://mail.gnome.org/mailman/listinfo/xslt
> >>
> >>
> >> --
> >> Daniel Veillard | Open Source and Standards, Red Hat
> >> veillard redhat com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
> >> http://veillard.com/ | virtualization library http://libvirt.org/
> >>
> >
> >
> _______________________________________________
> xslt mailing list, project page http://xmlsoft.org/XSLT/
> xslt gnome org
> https://mail.gnome.org/mailman/listinfo/xslt
--
Daniel Veillard | Open Source and Standards, Red Hat
veillard redhat com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | virtualization library http://libvirt.org/
_______________________________________________
xslt mailing list, project page http://xmlsoft.org/XSLT/
xslt gnome org
https://mail.gnome.org/mailman/listinfo/xslt