Re: Faster UTF-8 decoding in GLib
- From: Behdad Esfahbod <behdad behdad org>
- To: Daniel Elstner <daniel kitta googlemail com>
- Cc: gtk-devel-list gnome org
- Subject: Re: Faster UTF-8 decoding in GLib
- Date: Sat, 27 Mar 2010 18:04:29 -0400
On 03/27/2010 05:49 PM, Daniel Elstner wrote:
> Hi,
>
> On Saturday, 27.03.2010, at 17:40 -0400, Behdad Esfahbod wrote:
>> On 03/27/2010 05:21 PM, Daniel Elstner wrote:
>>> Well, I assume that ints are at least 32 bit wide on any platform
>>> supported by GLib. But if you meant to say that it would break with
>>> larger ints, I don't see why. As long as the type is unsigned, it
>>> should be fine.
>>
>> If the utf8 byte has more than 6 leading 1 bits, [...]
>
> That's an oxymoron.
>
>> [...] and with a 64bit int, the
>> construct tries to consume 7 or 8 bytes. Right?
>
> Undefined behavior.
Sure, I wasn't referring to valid data. In valid UTF-8, there are no 5-byte or
6-byte sequences either.
b
> --Daniel
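
To make the point about lead bytes concrete, here is a minimal sketch in C. It is not GLib's code and not the construct under discussion in this thread; the names leading_ones and utf8_sequence_length are hypothetical. It only illustrates that a lead byte can carry up to 8 leading 1 bits (0xFE has 7, 0xFF has 8), so a decoder that derives the sequence length from that count has to reject anything above 4 explicitly rather than rely on the width of int.

```c
/* Hypothetical helper: count the leading 1 bits of one byte (result 0..8). */
static unsigned leading_ones(unsigned char byte)
{
    unsigned n = 0;

    /* Test bits from the most significant end until a 0 bit is found. */
    while (n < 8 && (byte & (0x80u >> n)))
        n++;
    return n;
}

/* Hypothetical helper: sequence length implied by a lead byte's bit pattern,
 * or 0 if the byte cannot start a sequence.  Only the leading-ones pattern
 * is checked; overlong lead bytes (0xC0, 0xC1) and out-of-range ones
 * (0xF5..0xF7) would need additional checks. */
static unsigned utf8_sequence_length(unsigned char lead)
{
    unsigned ones = leading_ones(lead);

    if (ones == 0)
        return 1;   /* 0xxxxxxx: single-byte (ASCII) */
    if (ones == 1)
        return 0;   /* 10xxxxxx: continuation byte, not a lead byte */
    if (ones > 4)
        return 0;   /* 5 or more leading ones: never valid UTF-8 */
    return ones;    /* 2, 3 or 4 byte sequence */
}
```

With this bound in place, bytes such as 0xF8, 0xFC, 0xFE and 0xFF are rejected up front, so the question of consuming 5 to 8 bytes never arises regardless of how wide int is.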