Re: UTF-8 Functions
- From: Darin Adler <darin bentspoon com>
- To: Owen Taylor <otaylor redhat com>
- Cc: gtk-devel-list gnome org, gtk-i18n-list gnome org, trow ximian com
- Subject: Re: UTF-8 Functions
- Date: Mon, 2 Jul 2001 16:39:30 -0700
On Monday, July 2, 2001, at 04:18 PM, Owen Taylor wrote:
> The question, I guess, is whether it is worth adding:
>
>   g_utf8_collate_key_casefold (), which is currently
>
>   g_utf8_collate_key (g_utf8_casefold (string));
>
> But might eventually be implemented as:
>
>   g_utf8_collate_key_extended (string,
>                                G_COLLATE_SECONDARY,
>                                G_NORMALIZE_ALL_COMPOSE);
>
> [ There are issues of correctness here as well as efficiency ]
>
> It's certainly easy enough to do ... just a few lines of code. My
> main hesitation is whether we know yet whether that is the right part
> of the parameter space to give a special name.

Clear analysis as usual.
I think perhaps I want to retract my previous comment/request. If I
understand correctly, g_utf8_collate_key (without g_utf8_casefold) will
still typically sort strings in a way that is not unduly sensitive to case.
In other words, we get this kind of order:
A, a, B, b
not this kind:
A, B, a, b
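To make the two orders concrete, here is a minimal sketch in plain C. It is not GLib's collation algorithm: toy_collate, by_collation and by_code_point are hypothetical names, and the comparison handles ASCII only. It assumes a two-level scheme in which letters are compared without regard to case first and case is used only to break ties.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Toy two-level comparison for ASCII strings (an illustration only, not
 * GLib's algorithm): letters are compared ignoring case first, and case
 * is used only to break ties, with uppercase sorting first. */
int toy_collate(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    /* Primary level: compare case-insensitively. */
    for (size_t i = 0;; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);
        if (ca != cb)
            return ca < cb ? -1 : 1;
        if (ca == '\0')
            break;
    }

    /* Secondary level: the letters match, so let case decide.
     * In ASCII, 'A' < 'a', so strcmp puts uppercase first. */
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/* qsort adapter: collation-style order over an array of const char *. */
int by_collation(const void *x, const void *y)
{
    return toy_collate(*(const char *const *)x, *(const char *const *)y);
}

/* qsort adapter: plain byte-value order over an array of const char *. */
int by_code_point(const void *x, const void *y)
{
    return strcmp(*(const char *const *)x, *(const char *const *)y);
}
```

Sorting the four strings "b", "A", "a", "B" with qsort and by_collation gives the first order; sorting them with by_code_point gives the second.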
If that's so, then I think it's not particularly important to add the case
folding version. It's only needed when you want to "partly collate" things
and put a bunch of identical items into the same bucket. That's not the
usual case, I don't think.
It might be common to case fold and normalize and then use the resulting
string as a key. But I can't think of a case where you'd want to case fold
and normalize and then still want to collate in a locale-specific way.
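The bucketing use mentioned above might look roughly like the sketch below. It is plain C with an ASCII-only stand-in for case folding, and it skips normalization entirely; toy_fold_key and same_bucket are hypothetical names, not GLib functions. The point is only that the folded string is compared byte for byte as a key, with no locale-specific collation involved.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for case folding: lowercases ASCII only.
 * Writes at most n - 1 characters plus a terminating NUL into out. */
void toy_fold_key(const char *s, char *out, size_t n)
{
    assert(s != NULL && out != NULL && n > 0);

    size_t i = 0;
    for (; s[i] != '\0' && i + 1 < n; i++)
        out[i] = (char)tolower((unsigned char)s[i]);
    out[i] = '\0';
}

/* Two strings land in the same bucket when their folded keys are
 * byte-for-byte equal. Keys longer than 63 characters are truncated,
 * which is acceptable for a sketch but not for real use. */
int same_bucket(const char *a, const char *b)
{
    char ka[64], kb[64];

    toy_fold_key(a, ka, sizeof ka);
    toy_fold_key(b, kb, sizeof kb);
    return strcmp(ka, kb) == 0;
}
```

With this, "Apple" and "APPLE" fall into one bucket, while "Apple" and "Apply" do not.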
-- Darin