Re: [xslt] UTF-8 escaping
- From: "Wesley W. Terpstra" <wesley terpstra ca>
- To: xslt gnome org
- Subject: Re: [xslt] UTF-8 escaping
- Date: Mon, 19 Aug 2002 18:51:55 +0200
On Mon, Aug 19, 2002 at 12:06:03PM -0400, Daniel Veillard wrote:
> xmlXPathRegisterFuncNS
> in include/libxml/xpathInternals.h
Eek! I was sort of hoping to just use xsltproc (that way users could use any
xslt engine by specifying an alternate command). :-)
Well, I suppose I should suck in my gut...
Since this option would allow me to work with older libxslt/libxml I may
pursue it. Along these lines, is there the equivalent of <xsl:fallback>
for XPath? I know there is a function-available() but is that ok?
e.g.:
<xsl:if test="function-available('crazy-extension')">
<xsl:value-of select="crazy-extension('blah')"/>
</xsl:if>
I am concerned that some compiling xslt engine out there will see my
crazy-extension and complain that it is unavailable.
> <pedantic>this is not a Recommendation, XQuery is a Working Draft
> and at the moment I would say that IETF rules the URI infrastructure
> not W3C, the RFCs are far more normative in this respect :-)</pedantic>
You are, of course, correct. ;-)
> > The function I implemented *will probably be* part of XPath.
> > It is listed for xpath 2.0:
> > http://www.w3.org/TR/xquery-operators/#func-escape-uri
>
> Hum, I don't claim to get XPath 2.0 compliance, and even the
> XQuery draft suggest to have it registered in their function namespace
>
> http://www.w3.org/2002/08/xquery-functions
I know you don't have XPath 2.0 compliance (since it isn't finished! :-); I
just wish you had that particular function... I simply cannot figure out any
other way to generate utf-8 in email headers.
> > I just thought it would be nicer to use the new function in XPath 2 than to
> > create yet another engine-specific extension.
>
> Well if you put it directly in the XPath core while it's not standardized
> then you make it an engine-specific extension, precisely :-)
True... But, when it becomes standardized, then it isn't anymore. From a
user's point of view (mine) this is slightly better: I use a function only
available on Y, but in a few years it will be available on *.
> Maybe XQuery extensions could be added to libxml2 but then I would really
> require them to be listed as XPath extensions anyway, seems to me it would
> fit your needs, right ?
Yes. OTOH, if you know of a pure xslt method (below) tell me!
This would be far preferred since I could have even IE6 render my xml.
> > PS. Is there any existing way in libxslt to do:
> > str->utf-8->hex
> > and get the answer back inside xsl?
>
> libxslt only manipulates strings in UTF8 internally, so the first step
> is somewhat trivial but the second part might be more challenging, I don't
> think there is an ad-hoc XSLT entry point for this, maybe EXSLT or this
> could be done on pure XSLT ...
I know you use utf-8 inside. Isn't it wonderful? :-)
(except for the variable length thing... )
The thing I don't know how to do though is to get the utf-8 exposed to me
within xsl. There doesn't seem to be a
pop-single-byte-numeric-value-from-string function. :-)
Even a convert-to-unicode-number-from-single-char-string function would
suffice. Then I could re-encode the unicode number to utf-8 myself, and
hexify it.
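
To make the "re-encode the unicode number to utf-8 myself, and hexify it"
step concrete, here is a minimal sketch in Python rather than XSLT, since the
char->integer step is exactly what XSLT 1.0 does not offer. The function name
utf8_hex and the %XX output style are assumptions chosen for illustration;
they are not part of libxslt or any XPath draft.

```python
def utf8_hex(ch: str) -> str:
    """Hand-encode one character to UTF-8 and return its octets as %XX."""
    cp = ord(ch)  # the char -> Unicode number step missing from XSLT 1.0
    if cp < 0x80:
        # 1 octet: 0xxxxxxx
        octets = [cp]
    elif cp < 0x800:
        # 2 octets: 110xxxxx 10xxxxxx
        octets = [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)]
    elif cp < 0x10000:
        # 3 octets: 1110xxxx 10xxxxxx 10xxxxxx
        octets = [0xE0 | (cp >> 12),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)]
    else:
        # 4 octets: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        octets = [0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)]
    return "".join("%%%02X" % o for o in octets)


def utf8_hex_string(text: str) -> str:
    """Apply utf8_hex to every character of a string."""
    return "".join(utf8_hex(ch) for ch in text)
```

The bit-shifting is only arithmetic, so it could be written in XSLT once a
character's Unicode number is available; obtaining that number is the missing
piece.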
However, xslt seems bent on keeping char->integer functions out of
programmers' hands! I understand the philosophy; there has already been so
much pain and suffering caused by charsets, but ... unicode! It's not the
be-all-and-end-all, but xml can only express values in it anyways.
I didn't see anything useful in EXSLT last time I looked. There is an
encode-uri function, but libxslt1 doesn't have it. (It is marked as 'other')
http://www.exslt.org/str/functions/encode-uri/index.html
... it also sucks for uri fragments since it doesn't escape ":/;?".
OTOH, it would work just fine for email since these are ok! :-)
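
The difference between leaving ":/;?" alone and escaping them can be
illustrated with Python's urllib.parse.quote. This is only a stand-in to show
the two behaviours; it is not the EXSLT function, and the sample string is
made up.

```python
from urllib.parse import quote

sample = "a:b/c;d?\u00e9"  # hypothetical input with reserved chars and one non-ASCII char

# Reserved characters left untouched, non-ASCII escaped as UTF-8 octets.
kept = quote(sample, safe=":/;?")

# Everything outside the unreserved set escaped, as a URI fragment would need.
escaped = quote(sample, safe="")

print(kept)
print(escaped)
```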
Plus, this would permanently spoil any hope of my xsl working on IE6 since I
doubt they will ever adopt exslt, but xpath 2 is another matter.
PS. example where I need this utf-8 ability:
http://www.terpstra.ca/lurker/message/20020819092938.GA22015%40dat.etsit.upm.es.html
the "(reply)" link should be using utf-8, but for now I just give up on
these chars and encode =3F (?).
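
For reference, a rough sketch of what the utf-8 version of such a header
value might look like, assuming the link carries an RFC 2047 "Q" encoded-word
(the =3F above suggests this, but that is an assumption). The helper name
q_encode is hypothetical, and the sketch ignores the 75-character limit on
encoded-words.

```python
def q_encode(text: str) -> str:
    """Encode text as a single RFC 2047 Q encoded-word in utf-8."""
    out = []
    for octet in text.encode("utf-8"):
        ch = chr(octet)
        if ch.isascii() and ch.isalnum():
            out.append(ch)               # letters and digits pass through
        elif ch == " ":
            out.append("_")              # Q encoding writes space as underscore
        else:
            out.append("=%02X" % octet)  # everything else as =XX, per octet
    return "=?utf-8?Q?" + "".join(out) + "?="
```

With this, "?" still becomes =3F, but a character such as U+00E9 becomes
=C3=A9 instead of being dropped.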
Thanks a lot for taking the time to respond to my concerns!
--
Wesley W. Terpstra <wesley@terpstra.ca>