Re: sprintf and utf8



 * We are in fact assuming ASCII compatible locales. This 
   is basically safe because pretty much everybody else
   does too. So " " is " ", in ASCII, UTF-8, and the current
   locale.

 * We also assume the results of %d and %g are ASCII.
   This is not quite so safe as the previous assumption,
   but I'm not aware of any locales that violate it.
   ("Arabic-Indic" numerals are used in some contexts
   for Arabic, Farsi, etc., but the more familiar Arabic
   numerals are also understood and, I believe, generally
   used in technical contexts.)

   Again, a locale that broke this assumption would probably
   cause problems for a lot of other programs.

 * strftime() is clearly more of a problem; for dates,
   we have g_date_strftime() that handles UTF-8 properly.
   (Though I'd like to replace it with a g_date_format()
   that doesn't have the bizarre buffer handling of strftime())

   For other uses of strftime(), you are basically on your
   own for converting the format string to the locale
   encoding, and the results from the locale encoding to
   UTF-8; this is something we should take care of in GLib
   eventually, perhaps just as part of #50076, "Time API to
   go with Date API".

 * The one thing you have to watch out for is that 
   %.Ns is not UTF-8 safe for some C libraries. (GNU
   libc has the bizarre idea that, even though N is in
   bytes, the call should fail if the string is not
   chopped at an integral number of locale characters.)
   So, it needs to be avoided.
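
One way to sidestep %.Ns is to truncate the string yourself at
a UTF-8 character boundary and then print the copy with a plain
%s. The following is only a rough sketch, assuming the input is
valid, NUL-terminated UTF-8; utf8_prefix_len() and
utf8_copy_truncated() are hypothetical helper names, not GLib
functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GLib): returns the largest
 * length <= max_bytes at which s can be cut without splitting a
 * UTF-8 character. Assumes s is valid, NUL-terminated UTF-8. */
static size_t utf8_prefix_len(const char *s, size_t max_bytes)
{
    size_t len, cut;

    assert(s != NULL);
    len = strlen(s);
    if (len <= max_bytes)
        return len;

    cut = max_bytes;
    /* Continuation bytes have the form 10xxxxxx. Back up until
     * s[cut] is the first byte of a character. */
    while (cut > 0 && ((unsigned char)s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}

/* Hypothetical helper: copies at most max_bytes of s into dst and
 * NUL-terminates it. dst must have room for max_bytes + 1 bytes.
 * Returns the number of bytes copied, excluding the terminator. */
static size_t utf8_copy_truncated(char *dst, const char *s,
                                  size_t max_bytes)
{
    size_t n = utf8_prefix_len(s, max_bytes);

    memcpy(dst, s, n);
    dst[n] = '\0';
    return n;
}
```

The truncated copy can then go through an ordinary "%s", so the
C library never sees a precision and never has to decide where
a locale character ends.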

Regards,
                                        Owen

   


