Re: Faster UTF-8 decoding in GLib
- From: Behdad Esfahbod <behdad behdad org>
- To: Daniel Elstner <daniel kitta googlemail com>
- Cc: gtk-devel-list gnome org
- Subject: Re: Faster UTF-8 decoding in GLib
- Date: Sat, 27 Mar 2010 16:51:11 -0400
On 03/27/2010 04:27 PM, Daniel Elstner wrote:
> Hi,
>
> Am Samstag, den 27.03.2010, 16:12 -0400 schrieb Behdad Esfahbod:
>
>> Err, you're right. My bad. It's still broken though since it doesn't check
>> that the fragment bytes all start with the bits 10. Missing error checking.
Looking at:
http://git.collabora.co.uk/?p=user/zabaluev/glib.git;a=commitdiff;h=9ace0f84dcbb7d95996c93c2236e0ec0253ee479
> It is not meant to check for errors.
Good point.
> I think it is totally arbitrary to handle some potential errors but not
> others. And I think the current implementation does not do that check
> either -- it will behave differently, but it is still undefined.
The current implementation definitely does the check:
-      for ((Count) = 1; (Count) < (Len); ++(Count)) \
-        { \
-          if (((Chars)[(Count)] & 0xc0) != 0x80) \
-            { \
-              (Result) = -1; \
-              break; \
-            } \
-          (Result) <<= 6; \
-          (Result) |= ((Chars)[(Count)] & 0x3f); \
-        }
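To make the point concrete, here is roughly what those removed lines do when written as a plain function instead of a macro. This is only a sketch: the name utf8_get_checked and its mask parameter are made up for illustration and are not GLib API, and I assume the caller has already derived the length and payload mask from the lead byte.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GLib.
 * Decodes one UTF-8 sequence of `len` bytes starting at `chars`.
 * `mask` selects the payload bits of the lead byte (for example 0x1f
 * for a 2-byte sequence, 0x0f for a 3-byte sequence).
 * Returns -1 if any continuation byte is not of the form 10xxxxxx. */
static int32_t
utf8_get_checked (const unsigned char *chars, int len, unsigned char mask)
{
  int32_t result = chars[0] & mask;
  int count;

  for (count = 1; count < len; ++count)
    {
      /* Every continuation byte must start with the bits 10. */
      if ((chars[count] & 0xc0) != 0x80)
        return -1;

      /* Append the six payload bits of this byte. */
      result <<= 6;
      result |= chars[count] & 0x3f;
    }

  return result;
}
```

Like the macro, this only checks the continuation bytes. It does not reject overlong forms or surrogates, so it is not a full validator.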
Anyway. Nice construct :). For future reference, it must be used with 32-bit
ints only; otherwise it can go wrong.
behdad
> --Daniel
>
>
>