Re: [xml] entity resolver callback context not too useful for python



Daniel Veillard wrote:
On Mon, Nov 28, 2005 at 10:06:07PM -0500, Brad Clements wrote:

What appears to happen when processing document() during the transform:

1. libxml2 creates a new parsing context,
2. libxml2 calls the entity resolver with that new context,
3. libxml2 parses the returned data.

But since this is a new parsing context, I've never seen it before. Even if I could find my original XML/XSLT documents' parsing contexts, they wouldn't help, because they're unrelated to the freshly created context made during entity resolution.

It's a bit of a chicken-and-egg problem, because loading the original XSL file might also do entity lookups (xsl:include), so I could be doing lookups even before I get a document handle.


My hack-around solution was to use xmlReadDoc, and pass in a


  use xmlCtxtReadDoc with a context you created yourself

I think Brad has found the same problem I wrote about a couple of weeks ago on the libxslt list. You can use xmlCtxtReadDoc and recover _private in the external entity loader. That works fine. But when you are applying an XSL transformation and document() gets processed, the context pointer passed to the entity resolver is not something you can use to recover your own context: nothing in it refers back to the original document or the top-level stylesheet. Depending on which path leads to the external entity loader, the context pointer can contain an xsltStylesheetPtr, an xsltTransformContextPtr, or NULL. This is a fairly big problem if you want to do anything more complicated in a multithreaded environment than parsing a single-level XML document.

I hacked around this by finding all of the calls I make to the library that end up calling the external entity loader, then protecting them with a mutex. That lets me stash my context information for the thread that owns the mutex and recover it in the entity loader code. That's okay for me because I know my documents are small and the transformations are fast. I might be in trouble if I didn't control the parameters so closely (if I were running my code on a public server, for instance).
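
For anyone who wants to see the shape of that mutex workaround, here is a rough sketch in plain C with pthreads. It deliberately does not call libxml2 or libxslt: fake_library_call is a hypothetical stand-in for whichever library entry point ends up invoking the external entity loader, and app_context, entity_loader and guarded_call are names made up for the illustration, not anything from the real code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-request application data. */
struct app_context {
    const char *request_id;
    int lookups;              /* times the loader ran on our behalf */
};

/* One global lock serializes every call that can reach the loader. */
static pthread_mutex_t loader_lock = PTHREAD_MUTEX_INITIALIZER;

/* Only meaningful while loader_lock is held. */
static struct app_context *stashed_ctx = NULL;

/* Stand-in for the entity loader: the library hands it nothing usable,
 * so it reads the stashed pointer instead. */
static const char *entity_loader(const char *url)
{
    (void)url;
    if (stashed_ctx == NULL)
        return NULL;          /* called outside a guarded section */
    stashed_ctx->lookups++;
    return stashed_ctx->request_id;
}

/* Hypothetical stand-in for a library call that triggers the loader. */
static const char *fake_library_call(const char *url)
{
    return entity_loader(url);
}

/* Wrap every such library call: lock, stash, call, clear, unlock. */
static const char *guarded_call(struct app_context *ctx, const char *url)
{
    const char *result;

    pthread_mutex_lock(&loader_lock);
    stashed_ctx = ctx;
    result = fake_library_call(url);
    stashed_ctx = NULL;
    pthread_mutex_unlock(&loader_lock);
    return result;
}
```

The cost is the one mentioned above: every parse or transform that can reach the loader runs one at a time, so this only holds up while documents are small and transformations are fast.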



mangled URI with my own custom scheme, whose netloc can be dereferenced back to the originating 'http request context'.
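
A minimal sketch of that URI-mangling idea, in plain C and without any libxml2 calls. The scheme name "reqctx", the fixed-size slot table and the names request_ctx, mangle_uri and resolve_uri are all assumptions made for the illustration; the original code may well look different.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SCHEME "reqctx://"    /* hypothetical custom scheme */
#define MAX_REQUESTS 16

/* Hypothetical "http request context". */
struct request_ctx {
    const char *user;
};

/* Slot number in the URI's netloc -> live request context. */
static struct request_ctx *active[MAX_REQUESTS];

/* Build "reqctx://<slot>/<path>". Returns 0 on success, -1 if it
 * does not fit in the output buffer. */
static int mangle_uri(char *out, size_t outlen, unsigned slot,
                      const char *path)
{
    int n = snprintf(out, outlen, SCHEME "%u/%s", slot, path);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Called from the entity loader: recover the context from a mangled
 * URI. On success *path points at the part after "<slot>/". */
static struct request_ctx *resolve_uri(const char *uri, const char **path)
{
    size_t plen = strlen(SCHEME);
    unsigned long slot;
    char *end;

    if (strncmp(uri, SCHEME, plen) != 0)
        return NULL;                       /* not one of our URIs */
    if (uri[plen] < '0' || uri[plen] > '9')
        return NULL;                       /* netloc must be a number */
    slot = strtoul(uri + plen, &end, 10);
    if (*end != '/' || slot >= MAX_REQUESTS)
        return NULL;
    *path = end + 1;
    return active[slot];
}
```

The loader only ever receives the URI string, so this works no matter which context pointer the library passes in; the price is that every URI handed to the parser has to be rewritten first.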

It'd be handy if something like xmlReadDoc could be given some opaque value that would be passed to the entity loader. Further, that opaque value would have to be set on all documents loaded during the parsing process.


  At the C level there is a _private pointer for parsing contexts that
applications can use to associate application data with that specific processing.

It's the same answer here. This works for xmlCtxtReadDoc, and may work throughout libxml2 as long as you don't do XSLT, but it doesn't work when you try to use libxslt and apply XSL transformations.

- Rush


