Akira, here are the details. The "audit" file lists all .[ch] files as of the beginning of summer 2001. Files with no tag have not yet been audited for UTF-8 correctness. Those tagged just "UTF8" are okay; those tagged "!UTF8 (done)" had non-compliant code, but it has since been sorted out. (Ignore the "LAZY" tags; they relate to the now-removed "lazyprop" style of handling properties.)

I strongly suggest you start with the files tagged "!UTF8 (done)" to see how it's done. The main complexity is that the program must still work even without HAVE_UNICODE (we will drop non-Unicode support one day, but not yet), and it must also work in the GTK_CHARSET_MISMATCH situations: GTK_DOESNT_TALK_UTF8_WE_DO, which was my target for gtk1.4/GNOME1, or GTK_TALKS_UTF8_WE_DONT, which is currently the case on Win32. The final target is, of course, GTK_TALKS_UTF8 with !GTK_CHARSET_MISMATCH (the GNOME2/gtk2 case). I'll be happy to give more detailed explanations on specific cases... lib/prop_text.[ch] is certainly a good place to look at for the interaction between Dia and GTK in the various states of UTF-8 awareness.
BTW, I have another question, about gnome-print support. Dia can print without gnome-print, but supporting CJK printing is complex, so Dia can't handle it: if the printer has no CJK fonts, the current implementation won't print at all, even when it should be possible. So I think we need something like gnome-print so that printing works in every environment; I mean, the fonts should be embedded in the PostScript output. This problem should occur on Windows too, for example, though I haven't tried running it there. Hasn't anyone else run into this?
The solution currently implemented is the following: there is a module in lib/ps-utf8, called the "PS Unicoder", which follows the approach taken by Microsoft in its PostScript drivers: character maps are built on the fly, using only the characters actually used by the text. It of course works great in 8859-1; I haven't got reliable and consistent reports for non-8859-1 (let's say it "sorta works"). I'm not running 8859-1 anymore myself, but I don't have the time to test (and for only two different characters...).

Of course, turning the encoding tables into something the printer understands is only 50% of the job. Currently, Dia depends on the printer (or Ghostscript) including a very strict set of fonts; this is the main reason there is not a lot of freedom in the choice of fonts. Downloading more fonts means being able to locate the font files or glyph outlines on the local system (or a potentially remote font server) and turning them into something the printer can swallow and display nicely. This is not my area of expertise; Lars Clausen had some work in progress there, so you'd probably better ask him (he'll probably comment if he reads this <grin/>).

Happy hacking!

-- Cyrille

-- Grumpf.
Attachment:
audit
Description: Text document