At 10:13 03.06.02 +0400, Vitaly Lipatov wrote:
I tried the new 0.90RC3 version with my old dia files from version 0.88.1 and I have trouble with encoding. In the old files Russian letters look like <dia:string>#ëÏÎÔÒÏÌÌÅ ðþ#</dia:string>, while the new version uses UTF-8. It can't translate the old files correctly (for Russian letters). Can I convert files from the old format by hand? Any suggestions?
There is some code in lib/dia_xml.c (around line 134) which tries to be smart about the default encoding and valid UTF-8. If no bytes are found where the MSB is set, it assumes well-formed UTF-8. In your case this is plain wrong.

Placing the correct encoding, like:

    <?xml version="1.0" encoding="CP1252"?>

in your dia file should help. Beware: this encoding string works for German on win32. I don't know the correct encoding string for Russian on Linux ... But if you prepare a file which has the offending bits set (without an encoding definition), Dia will complain about the missing encoding and will show what it assumes to be the default.

You could also apply the attached patch, which checks not only for the MSB but also for '&'-encoded characters. Finally, it fixes the rewriting of the temporary file so that it includes your default encoding.

Hope this helps,
Hans
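The manual fix described above (adding an encoding to the XML declaration when bytes with the MSB set are present) could be scripted roughly as follows. This is only a sketch in Python, not code from Dia: the helper names `has_high_bytes` and `declare_encoding` are made up for illustration, the file is assumed to be uncompressed XML, and the default `KOI8-R` is a guess that merely appears consistent with the sample string in the question. Substitute whatever encoding the old files were really saved in.

```python
import re

# Matches an XML declaration at the very start of the data,
# e.g. <?xml version="1.0"?>
XML_DECL = re.compile(rb'<\?xml[^>]*\?>')


def has_high_bytes(data: bytes) -> bool:
    """Return True if any byte has its most significant bit set."""
    return any(b & 0x80 for b in data)


def declare_encoding(data: bytes, encoding: str = "KOI8-R") -> bytes:
    """Return data with an XML declaration that names `encoding`.

    Pure-ASCII data, and data whose declaration already names an
    encoding, are returned unchanged. The default encoding is an
    assumption, not something taken from Dia.
    """
    match = XML_DECL.match(data)
    if match and b"encoding=" in match.group(0):
        return data  # an encoding is already declared
    if not has_high_bytes(data):
        return data  # plain ASCII is valid UTF-8, nothing to do
    decl = b'<?xml version="1.0" encoding="' + encoding.encode("ascii") + b'"?>'
    if match:
        # Replace the existing declaration that lacks an encoding.
        return decl + data[match.end():]
    # No declaration at all: put one in front.
    return decl + b"\n" + data
```

To use it, read the old file in binary mode, pass the bytes through `declare_encoding`, and write the result to a new file, keeping the original as a backup.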
Attachment:
dia-2002-06-03-hb.diff
Description: Text document
-------- Hans "at" Breuer "dot" Org ----------- Tell me what you need, and I'll tell you how to get along without it. -- Dilbert