Re: [gtk-list] Re: gtk bug or glibc locale bug?

From: Scott Stone <sstone ume pht co jp>
To: gtk-list redhat com
Subject: Re: [gtk-list] Re: gtk bug or glibc locale bug?
Date: Fri, 6 Nov 1998 09:58:30 +0900 (JST)

On 5 Nov 1998, Owen Taylor wrote:

> 
> Changwoo Ryu <cwryu@adam.kaist.ac.kr> writes:
> 
> > I recently upgraded several Debian packages---xfree86 with no
> > -DX_LOCALE, wcsmbs patch from Debian-JP and Korean locale for wcsmbs.
> > But after the upgrade, none of my GTK+ programs worked properly with
> > multibyte languages.  The GTK+ library was compiled from source.
> > 
> > I attached the fix.  The patch is a part of the GTK+ XIM improvement,
> > <http://arch.comp.kyutech.ac.jp/~matsu/my_products/gtk/xim-1998.09.16.patch>
> > 
> > The problem was, the result of mblen ("\xc0", MB_CUR_MAX) was -1 in
> > "C" locale.  But I believe it should be 1.  Is it glibc's (or
> > wcsmbs's) bug?
> >
> > Anyway, this patch fixes the problem.  If no one complains, I'll commit
> > this.

Are the glibc people and the wcsmbs people even talking to each other?  Or
is this another case of the Japanese developer community deciding to make
their own dev fork and not comply with the base standard yet again?

> 
> This is not really a correct patch.
> 
> The problem the code is supposed to detect is that under 
> stock glibc (I don't know about the Debian-modified glibc)
> the mb* functions always deal in UTF-8, which isn't useful
> for what GTK+ wants to do. 
> 
> "\xc0" is not a valid UTF-8 string (0xc0 is a lead byte with no
> continuation byte after it), hence the return of -1.
> This tells GTK+ - OK, the C library's multibyte functions
> aren't useful, so treat everything as 1 byte.
> 
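
The probe described above can be sketched on its own, outside GTK+. This is a rough illustration only: the function name is made up for this note, and the result depends entirely on which C library and locale data are installed, so no particular answer should be expected.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of GTK+): returns 1 if mblen() in the
 * "C" locale treats the lone byte 0xC0 as one complete character,
 * 0 otherwise.  The caller's LC_CTYPE setting is restored on return. */
static int c_locale_accepts_high_byte(void)
{
    const char *cur = setlocale(LC_CTYPE, NULL);
    char *saved = NULL;
    int result;

    /* setlocale() may reuse its buffer, so keep a private copy. */
    if (cur != NULL) {
        saved = malloc(strlen(cur) + 1);
        if (saved != NULL)
            strcpy(saved, cur);
    }

    setlocale(LC_CTYPE, "C");
    mblen(NULL, 0);                       /* reset any shift state */
    result = (mblen("\xc0", MB_CUR_MAX) == 1);

    if (saved != NULL) {
        setlocale(LC_CTYPE, saved);
        free(saved);
    }
    return result;
}
```

A return of 0 corresponds to the -1 that Changwoo saw; a return of 1 means the library passes 8-bit bytes through unchanged in the "C" locale.
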
> Your patch, unfortunately, breaks the 1-byte locales in stock glibc:
> encoded in UTF-8, a character from a 1-byte charset can take up to
> 2 bytes, so for a 1-byte locale MB_CUR_MAX == 2.
> 
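
To make the "1-byte character needs 2 bytes" point concrete, here is a small sketch of UTF-8 encoding for an 8-bit character. It assumes ISO-8859-1, where each byte value equals its code point; other 1-byte charsets would need a mapping table first. The function name is hypothetical.

```c
#include <stddef.h>

/* Encode one ISO-8859-1 byte as UTF-8 into out[] and return the number
 * of bytes written.  Values below 0x80 stay a single byte; values from
 * 0x80 to 0xFF become a two-byte sequence (110xxxxx 10xxxxxx). */
static size_t utf8_encode_latin1(unsigned char ch, unsigned char out[2])
{
    if (ch < 0x80) {
        out[0] = ch;
        return 1;
    }
    out[0] = (unsigned char)(0xC0 | (ch >> 6));    /* lead byte */
    out[1] = (unsigned char)(0x80 | (ch & 0x3F));  /* continuation byte */
    return 2;
}
```

So a library whose mb* functions work in UTF-8 has to report a maximum of 2 bytes per character even for a locale whose own charset is strictly 1 byte wide, which is why a test on MB_CUR_MAX != 1 cannot tell the two cases apart.
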
> I think the correct thing to do, in the short term, is
> to apply something like:
> 
>  ftp://ftp.gtk.org/pub/gtk/patches/gtk-a-higuti-980912-0.patch.gz
> 
> which switches over the Entry and Text widgets to using wide
> characters. Locale-dependent variable-width encodings are just not
> reliable. In the long term, Unicode is the right way to go.
> 
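
For anyone unfamiliar with the wide-character approach: the idea is to convert the locale's multibyte text to wchar_t once, then index and edit it one element per character. The following is only a sketch using the standard mbsrtowcs(); it is not taken from the patch above, and the helper name is invented for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a multibyte string in the current locale
 * to a newly allocated wide string.  Returns NULL on an invalid byte
 * sequence or allocation failure; stores the character count in
 * *len_out when len_out is non-NULL.  The caller frees the result. */
static wchar_t *to_wide(const char *mbs, size_t *len_out)
{
    mbstate_t st;
    const char *src = mbs;
    wchar_t *ws;
    size_t n;

    /* First pass: count characters without storing anything. */
    memset(&st, 0, sizeof st);
    n = mbsrtowcs(NULL, &src, 0, &st);
    if (n == (size_t)-1)
        return NULL;

    ws = malloc((n + 1) * sizeof *ws);
    if (ws == NULL)
        return NULL;

    /* Second pass: convert, including the terminating null. */
    memset(&st, 0, sizeof st);
    src = mbs;
    mbsrtowcs(ws, &src, n + 1, &st);

    if (len_out != NULL)
        *len_out = n;
    return ws;
}
```

Once the text is in this form, cursor movement and deletion work on array indices and never have to ask mblen() where a character ends.
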
>                                         Owen
>  
> > ----------------------------------------------------------------------
> > diff -u -r1.85 gtkmain.c
> > --- gtkmain.c	1998/10/25 19:30:02	1.85
> > +++ gtkmain.c	1998/11/05 13:36:49
> > @@ -405,15 +405,18 @@
> >    current_locale = g_strdup (setlocale (LC_CTYPE, NULL));
> >  
> >  #ifdef X_LOCALE
> > +  /* with X_LOCALE, MB_CUR_MAX is always 4 regardless of the locale */
> >    if ((strcmp (current_locale, "C")) && (strcmp (current_locale, "POSIX")))
> >      gtk_use_mb = TRUE;
> >    else
> > +    gtk_use_mb = FALSE;
> > +#else
> > +  if ((strcmp (current_locale, "C")) && (strcmp (current_locale, "POSIX"))
> > +      && MB_CUR_MAX != 1)
> > +    gtk_use_mb = TRUE;
> > +  else
> > +    gtk_use_mb = FALSE;
> >  #endif
> > -    {
> > -      setlocale (LC_CTYPE, "C");
> > -      gtk_use_mb = (mblen ("\xc0", MB_CUR_MAX) == 1);
> > -      setlocale (LC_CTYPE, current_locale);
> > -    }
> >  
> >    g_free (current_locale);
> 
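
The decision the quoted patch makes can be restated as a pure function, which makes it easier to see where Owen's objection bites. This is a paraphrase of the diff for discussion, not code from GTK+; the function name and the x_locale parameter (standing in for the X_LOCALE #ifdef) are made up here.

```c
#include <stddef.h>
#include <string.h>

/* Paraphrase of the patched logic: returns nonzero if multibyte
 * handling would be switched on.
 *   locale      - current LC_CTYPE locale name
 *   mb_cur_max  - value of MB_CUR_MAX in that locale
 *   x_locale    - nonzero to mimic a build with X_LOCALE defined */
static int use_multibyte(const char *locale, size_t mb_cur_max, int x_locale)
{
    int is_c = strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0;

    if (is_c)
        return 0;
    if (x_locale)
        return 1;               /* MB_CUR_MAX is ignored in this branch */
    return mb_cur_max != 1;
}
```

With a library that reports MB_CUR_MAX == 2 for a 1-byte locale, use_multibyte("de_DE", 2, 0) comes out true, and that is the breakage described above.
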

--------------------------------------------------
Scott M. Stone <sstone@pht.com, sstone@pht.co.jp>
Head of TurboLinux Development/Systems Administrator
Pacific HiTech, Inc (USA) / Pacific HiTech, KK (Japan)
