Re: Faster UTF-8 decoding in GLib
- From: Daniel Elstner <daniel kitta googlemail com>
- To: Behdad Esfahbod <behdad behdad org>
- Cc: gtk-devel-list gnome org
- Subject: Re: Faster UTF-8 decoding in GLib
- Date: Sat, 27 Mar 2010 22:21:30 +0100
Hi,
On Saturday, 27.03.2010 at 16:51 -0400, Behdad Esfahbod wrote:
> On 03/27/2010 04:27 PM, Daniel Elstner wrote:
> >
> > It is not meant to check for errors.
> 
> Good point.
> 
> > I think it is totally arbitrary to handle some potential errors but not
> > others.  And I think the current implementation does not do that check
> > either -- it will behave differently, but it is still undefined.
> 
> The current implementation definitely does the check:
[...]
OK, looks like I misremembered.  My bad.  However, it is not documented
as such:
/**
 * g_utf8_get_char:
 * @p: a pointer to Unicode character encoded as UTF-8
 * 
 * Converts a sequence of bytes encoded as UTF-8 to a Unicode character.
 * If @p does not point to a valid UTF-8 encoded character, results are
 * undefined. If you are not sure that the bytes are complete
 * valid Unicode characters, you should use g_utf8_get_char_validated()
 * instead.
 * 
 * Return value: the resulting character
 **/
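To illustrate the contract described in that comment, here is a minimal sketch in plain C. It is not GLib's implementation and not the construct discussed earlier in this thread. The names sketch_get_char, sketch_get_char_validated and SKETCH_INVALID are hypothetical, and the single error value is a simplification (as far as I recall, g_utf8_get_char_validated() additionally distinguishes a truncated sequence from an invalid one).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error value for this sketch only. */
#define SKETCH_INVALID ((uint32_t) -1)

/* Hypothetical helper, not GLib code: decodes one character and, in the
 * spirit of the documented g_utf8_get_char() contract, assumes that p
 * points to a complete, valid UTF-8 sequence.  Nothing is checked. */
static uint32_t
sketch_get_char (const unsigned char *p)
{
  uint32_t c = p[0];
  size_t len, i;

  if (c < 0x80)
    return c;                               /* ASCII, single byte */
  else if ((c & 0xe0) == 0xc0) { len = 2; c &= 0x1f; }
  else if ((c & 0xf0) == 0xe0) { len = 3; c &= 0x0f; }
  else { len = 4; c &= 0x07; }              /* bad lead bytes land here too */

  for (i = 1; i < len; i++)
    c = (c << 6) | (p[i] & 0x3f);           /* continuation bits not verified */
  return c;
}

/* Hypothetical checked counterpart: reads at most max_len bytes and
 * returns SKETCH_INVALID instead of producing an arbitrary value. */
static uint32_t
sketch_get_char_validated (const unsigned char *p, size_t max_len)
{
  /* Smallest code point that legitimately needs a sequence of each length. */
  static const uint32_t min_value[5] = { 0, 0, 0x80, 0x800, 0x10000 };
  uint32_t c;
  size_t len, i;

  if (max_len == 0)
    return SKETCH_INVALID;
  c = p[0];
  if (c < 0x80)
    return c;
  else if ((c & 0xe0) == 0xc0) { len = 2; c &= 0x1f; }
  else if ((c & 0xf0) == 0xe0) { len = 3; c &= 0x0f; }
  else if ((c & 0xf8) == 0xf0) { len = 4; c &= 0x07; }
  else
    return SKETCH_INVALID;                  /* stray continuation or bad lead */

  if (len > max_len)
    return SKETCH_INVALID;                  /* truncated sequence */

  for (i = 1; i < len; i++)
    {
      if ((p[i] & 0xc0) != 0x80)
        return SKETCH_INVALID;              /* not a 10xxxxxx byte */
      c = (c << 6) | (p[i] & 0x3f);
    }

  if (c < min_value[len] || c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff))
    return SKETCH_INVALID;                  /* overlong, out of range, surrogate */
  return c;
}
```

On valid input both functions return the same code point.  On invalid input the unchecked one silently returns some value, and may read bytes beyond the end of the data, which is what "results are undefined" permits.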
> Anyway.  Nice construct :).  For future reference, it must be used with 32bit
> ints only.  Otherwise it can go wrong.
Thanks! :-)
Well, I assume that ints are at least 32 bits wide on any platform
supported by GLib.  But if you meant to say that it would break with
larger ints, I don't see why.  As long as the type is unsigned, it
should be fine.
--Daniel