Re: Faster UTF-8 decoding in GLib

From: Behdad Esfahbod <behdad behdad org>
To: Daniel Elstner <daniel kitta googlemail com>
Cc: gtk-devel-list gnome org
Subject: Re: Faster UTF-8 decoding in GLib
Date: Sun, 28 Mar 2010 16:34:10 -0400

On 03/27/2010 06:57 PM, Daniel Elstner wrote:
> However, for other invalid conditions to result in defined behavior,
> explicit checks would be required in the code.  I see no reason to pay
> the cost for insufficient validation checks in light of the fact that
> the documentation explicitly states that the behavior is undefined if
> the input is not valid UTF-8.  It might be a different matter if it
> would write past the end of a buffer or something, but that's not the
> case here.

Well, there's a bit more to it.  Just because some bytes in a file are invalid
according to the spec doesn't mean your text editor should refuse to open the
file.  While g_utf8_get_char() and friends do assume valid UTF-8 data, it's an
unwritten assumption that for invalid bytes they simply skip the byte and
return -1.  And I want to keep it that way and perhaps even document it.  I
think I use that in Pango IIRC.
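
To make the "skip the byte and return -1" convention concrete, here is a minimal standalone sketch in plain C.  It is not GLib's code and does not claim to match what g_utf8_get_char() actually does on invalid input; the names lenient_utf8_next() and count_invalid_bytes() are hypothetical and exist only for this illustration.  It assumes the caller knows the buffer length, so no read goes past the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a GLib function.  Decodes one code point from
 * p, where len (>= 1) bytes are available.  On success, returns the code
 * point and stores the sequence length in *consumed.  On an invalid or
 * truncated sequence, returns -1 and sets *consumed to 1, so the caller
 * skips only the offending byte and keeps going. */
static int32_t
lenient_utf8_next (const unsigned char *p, size_t len, size_t *consumed)
{
  unsigned char c = p[0];
  int32_t cp, min;
  size_t need, i;

  *consumed = 1;

  if (c < 0x80)
    return c;                                   /* plain ASCII */
  else if ((c & 0xE0) == 0xC0)
    { need = 2; cp = c & 0x1F; min = 0x80; }
  else if ((c & 0xF0) == 0xE0)
    { need = 3; cp = c & 0x0F; min = 0x800; }
  else if ((c & 0xF8) == 0xF0)
    { need = 4; cp = c & 0x07; min = 0x10000; }
  else
    return -1;          /* stray continuation byte or invalid lead byte */

  if (need > len)
    return -1;                                  /* truncated sequence */

  for (i = 1; i < need; i++)
    {
      if ((p[i] & 0xC0) != 0x80)
        return -1;                              /* not a continuation byte */
      cp = (cp << 6) | (p[i] & 0x3F);
    }

  /* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return -1;

  *consumed = need;
  return cp;
}

/* Walks a whole buffer the way a tolerant text editor might, counting
 * the bytes that had to be skipped instead of refusing the file. */
static size_t
count_invalid_bytes (const unsigned char *buf, size_t len)
{
  size_t pos = 0, bad = 0, n;

  while (pos < len)
    {
      if (lenient_utf8_next (buf + pos, len - pos, &n) < 0)
        bad++;
      pos += n;
    }
  return bad;
}
```

Because a failed decode always advances by exactly one byte, a caller can render a placeholder for each -1 and resynchronize on the next valid lead byte.  The explicit checks here are the kind of per-character cost the quoted message argues against, so this only illustrates the behavior, not a fast path.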

Anyway, getting way off-topic here.

behdad
