Re: my worry about the recent libxml change
- From: Daniel Veillard <veillard redhat com>
- To: Darin Adler <darin eazel com>
- Cc: Gnome Hackers <gnome-hackers gnome org>
- Subject: Re: my worry about the recent libxml change
- Date: Fri, 23 Mar 2001 06:57:35 -0500
[sorry for the repost ...]
On Thu, Mar 22, 2001 at 06:28:00PM -0800, Darin Adler wrote:
> The current code in xml-i18n-tool, OAF, and Nautilus depends on the
> following property: Text in XML files, localized strings that come from
> gettext, file names, GTK widget labels, and other strings all use the same
> character set (the local one for each locale, often Latin-1).
> 
> Adding code to libxml to properly handle character sets when reading and
> writing XML will retroactively decide that existing files are in some
> particular character set, when they are actually in a mix of character sets.
  And hence no XML parser should ever get them back to you:
    http://www.w3.org/TR/REC-xml#charencoding
--------------
It is a fatal error when an XML processor encounters an entity with an
encoding that it is unable to process. It is a fatal error if an XML
entity is determined (via default, encoding declaration, or higher-level
protocol) to be in a certain encoding but contains octet sequences that
are not legal in that encoding. It is also a fatal error if an XML
entity contains no encoding declaration and its content is not legal
UTF-8 or UTF-16.
--------------
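To make the last rule of that quote concrete, here is a minimal sketch of a UTF-8 well-formedness check. It is not libxml code: the name is_valid_utf8 is hypothetical, and it assumes the modern four-byte limit on UTF-8 sequences rather than whatever a given parser version enforces.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libxml.
 * Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise.
 * Rejects overlong forms, surrogates and code points above U+10FFFF. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t n, k;        /* n = number of continuation bytes expected */
        unsigned long cp;   /* decoded code point */
        unsigned long min;  /* smallest code point legal for this length */

        if (c < 0x80) { i++; continue; }  /* plain ASCII */
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;  /* stray continuation byte or invalid lead byte */

        if (len - i <= n) return 0;  /* sequence truncated at end of buffer */

        for (k = 1; k <= n; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80) return 0;  /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min) return 0;                      /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogate range */
        if (cp > 0x10FFFF) return 0;                 /* beyond Unicode */

        i += n + 1;
    }
    return 1;
}
```

A Latin-1 file with accented characters and no encoding declaration fails this kind of check, which is exactly the case the spec calls a fatal error.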
  Again, I have been saying for a year that this is a libxml1 problem,
and urging people to switch to a standards-compliant parser like libxml2.
> Making libxml DOM trees in memory always use UTF-8 will break all the code
> that puts strings in and takes them out without doing any translation.
It won't break the code; it shows that the code is already broken, which is
a significant difference. I don't want to break your code, but I want it to
be fixed, and I will help people make the transition, as I said a year ago
when releasing the first libxml2 version. The point is that until recently
nobody gave a fuck about what I was saying. You get more pressure now, well,
I'm sorry ...
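For code that currently passes Latin-1 strings straight through, the missing translation step could look roughly like the sketch below. It is self-contained and does not use libxml's own conversion routines; the name latin1_to_utf8 and its signature are hypothetical, and it assumes the input really is ISO-8859-1.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libxml.
 * Converts len bytes of ISO-8859-1 from `in` into NUL-terminated UTF-8
 * in `out`. Returns the number of bytes written (excluding the NUL),
 * or (size_t)-1 if `out` is too small. 2 * len + 1 bytes always suffice. */
size_t latin1_to_utf8(const unsigned char *in, size_t len,
                      unsigned char *out, size_t outsize)
{
    size_t i, o = 0;

    for (i = 0; i < len; i++) {
        unsigned char c = in[i];

        if (c < 0x80) {
            /* ASCII maps to itself; keep one byte free for the NUL */
            if (o + 1 >= outsize) return (size_t)-1;
            out[o++] = c;
        } else {
            /* 0x80..0xFF become a two-byte sequence */
            if (o + 2 >= outsize) return (size_t)-1;
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }

    if (o >= outsize) return (size_t)-1;  /* no room for the terminator */
    out[o] = '\0';
    return o;
}
```

Strings would go through such a conversion before being put into the tree, with the reverse conversion applied when taking them out for display in a Latin-1 locale.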
> I don't see how to make a program that works compatibly with both the old
> and new versions of libxml. I have no idea how to address this issue in the
> code for the various packages.
  You can test xmlParserVersion, which both libxml1 and libxml2 export, and
apply the translation where needed, i.e. when the version string is greater
than "1.8.11".
[root rpmfind /root]# nm /usr/lib/libxml.so.1.8.11 | grep xmlParserVersion
0005528c D xmlParserVersion
[root orchis /root]# nm /usr/lib/libxml.so.1.8.12 | grep xmlParserVersion
0004d58c D xmlParserVersion
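A rough sketch of that test follows. The helper names compare_versions and needs_utf8_translation are hypothetical, and the sketch assumes the version string is a dotted sequence of numbers (it also copes with a single plain number); the exact format should be checked against the installed headers. It compares numerically because a plain strcmp would rank "1.8.9" above "1.8.11".

```c
#include <stdlib.h>

/* Hypothetical helper: compare two dotted version strings numerically,
 * component by component. Returns <0, 0 or >0, like strcmp. */
int compare_versions(const char *a, const char *b)
{
    while (*a != '\0' || *b != '\0') {
        char *end_a, *end_b;
        long na = strtol(a, &end_a, 10);  /* a missing component counts as 0 */
        long nb = strtol(b, &end_b, 10);

        if (na != nb)
            return (na < nb) ? -1 : 1;

        /* Neither side consumed a digit: non-numeric tail, stop here. */
        if (end_a == a && end_b == b)
            break;

        a = (*end_a == '.') ? end_a + 1 : end_a;
        b = (*end_b == '.') ? end_b + 1 : end_b;
    }
    return 0;
}

/* Hypothetical helper: returns 1 when the running parser is newer than
 * 1.8.11, i.e. when strings need translating to and from UTF-8. */
int needs_utf8_translation(const char *version)
{
    return compare_versions(version, "1.8.11") > 0;
}
```

In a real program the argument would be the xmlParserVersion string exported by whichever libxml is linked in, e.g. needs_utf8_translation(xmlParserVersion).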
> I hope someone can prove with testing or coding or both that I am wrong, and
> this change can be done compatibly.
  I hope it can too. But if the final question left is:
    "Should we prefer keeping the existing broken platform over
     adherence to the standard and better I18N support?"
  then my definitive answer is a resounding NO!
I just hope we won't end up being stuck at this question.
Daniel
-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/