Re: Faster UTF-8 decoding in GLib
- From: Mikhail Zabaluev <mikhail zabaluev gmail com>
- To: Daniel Elstner <daniel kitta googlemail com>
- Cc: gtk-devel-list gnome org
- Date: Fri, 19 Mar 2010 12:12:08 +0200
Hi,
2010/3/16 Daniel Elstner <daniel kitta googlemail com>:
>
> Addendum: It's actually not longish at all, even though it may look like
> that in the C code. There are exactly two branches. I bet that many
> macros in GTK+ expand to more than that.
OK, here you go:
http://git.collabora.co.uk/?p=user/zabaluev/glib.git;a=shortlog;h=refs/heads/fast-utf8-elstner
Besides applying your code to the existing functions where it made a
measurable difference, plus a few opportunistic tweaks, the branch
introduces two new inlined functions, g_utf8_iterate() and
g_utf8_iterate_back().
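To illustrate what a forward/backward iterator pair of this kind does,
here is a minimal sketch. It is not the code from the branch: the
sketch_* names and signatures are made up for illustration, it assumes
valid UTF-8 input with no validation at all, and it does not attempt
the two-branch decoding discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the branch's g_utf8_iterate()):
 * decode one code point from VALID UTF-8 and advance *pp past it. */
static inline uint32_t
sketch_utf8_iterate (const unsigned char **pp)
{
  const unsigned char *p;
  uint32_t c;
  unsigned n, i;

  assert (pp != NULL && *pp != NULL);
  p = *pp;
  c = *p++;
  if (c >= 0x80)
    {
      /* Number of continuation bytes implied by the lead byte. */
      n = (c >= 0xF0) ? 3 : (c >= 0xE0) ? 2 : 1;
      /* Keep only the payload bits of the lead byte. */
      c &= 0x3Fu >> n;
      for (i = 0; i < n; i++)
        c = (c << 6) | (uint32_t) (*p++ & 0x3F);
    }
  *pp = p;
  return c;
}

/* Hypothetical helper: step *pp back over one code point of VALID
 * UTF-8 and return it. *pp must not point at the start of the buffer. */
static inline uint32_t
sketch_utf8_iterate_back (const unsigned char **pp)
{
  const unsigned char *p;
  uint32_t c = 0;
  unsigned shift = 0;
  unsigned char b;

  assert (pp != NULL && *pp != NULL);
  p = *pp;
  b = *--p;
  /* Collect continuation bytes (10xxxxxx), lowest bits first. */
  while ((b & 0xC0) == 0x80)
    {
      c |= (uint32_t) (b & 0x3F) << shift;
      shift += 6;
      b = *--p;
    }
  if (b < 0x80)
    c = b;                      /* plain ASCII byte */
  else
    c |= (uint32_t) (b & (0x3Fu >> (shift / 6))) << shift;
  *pp = p;
  return c;
}
```

A caller would keep a cursor such as `const unsigned char *p = buf;`
and call sketch_utf8_iterate (&p) in a loop until p reaches the end of
the buffer. Being static inline, the decode can be folded into that
loop by the compiler, which is the point of inlining such functions.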
Performance results for Intel Core 2 follow.
The mainline, tested with branch utf8-perftest:
GTest: run: /utf8/perf/get_char
(MAXPERF:ASCII: 164.1 MB/s)
(MAXPERF:Latin-1: 162.8 MB/s)
(MAXPERF:Cyrillic: 200.4 MB/s)
(MAXPERF:Chinese: 234.2 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/get_char-backwards
(MAXPERF:ASCII: 146.2 MB/s)
(MAXPERF:Latin-1: 136.3 MB/s)
(MAXPERF:Cyrillic: 142.7 MB/s)
(MAXPERF:Chinese: 181.0 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/get_char_validated
(MAXPERF:ASCII: 130.5 MB/s)
(MAXPERF:Latin-1: 121.1 MB/s)
(MAXPERF:Cyrillic: 141.7 MB/s)
(MAXPERF:Chinese: 195.1 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/utf8_to_ucs4
(MAXPERF:ASCII: 107.5 MB/s)
(MAXPERF:Latin-1: 95.8 MB/s)
(MAXPERF:Cyrillic: 127.4 MB/s)
(MAXPERF:Chinese: 148.4 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/utf8_to_ucs4_fast
(MAXPERF:ASCII: 125.7 MB/s)
(MAXPERF:Latin-1: 122.1 MB/s)
(MAXPERF:Cyrillic: 173.1 MB/s)
(MAXPERF:Chinese: 300.9 MB/s)
GTest: result: OK
The top of fast-utf8-elstner:
GTest: run: /utf8/perf/iterate
(MAXPERF:ASCII: 570.1 MB/s)
(MAXPERF:Latin-1: 449.5 MB/s)
(MAXPERF:Cyrillic: 395.9 MB/s)
(MAXPERF:Chinese: 561.3 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/iterate_back
(MAXPERF:ASCII: 384.6 MB/s)
(MAXPERF:Latin-1: 364.9 MB/s)
(MAXPERF:Cyrillic: 432.1 MB/s)
(MAXPERF:Chinese: 451.5 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/get_char
(MAXPERF:ASCII: 186.0 MB/s)
(MAXPERF:Latin-1: 171.4 MB/s)
(MAXPERF:Cyrillic: 248.5 MB/s)
(MAXPERF:Chinese: 398.6 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/get_char-backwards
(MAXPERF:ASCII: 138.2 MB/s)
(MAXPERF:Latin-1: 135.3 MB/s)
(MAXPERF:Cyrillic: 173.3 MB/s)
(MAXPERF:Chinese: 264.9 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/get_char_validated
(MAXPERF:ASCII: 128.7 MB/s)
(MAXPERF:Latin-1: 119.3 MB/s)
(MAXPERF:Cyrillic: 143.6 MB/s)
(MAXPERF:Chinese: 210.7 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/utf8_to_ucs4
(MAXPERF:ASCII: 62.7 MB/s)
(MAXPERF:Latin-1: 71.5 MB/s)
(MAXPERF:Cyrillic: 109.7 MB/s)
(MAXPERF:Chinese: 156.8 MB/s)
GTest: result: OK
GTest: run: /utf8/perf/utf8_to_ucs4_fast
(MAXPERF:ASCII: 153.4 MB/s)
(MAXPERF:Latin-1: 149.2 MB/s)
(MAXPERF:Cyrillic: 244.9 MB/s)
(MAXPERF:Chinese: 352.6 MB/s)
GTest: result: OK
Note the bad results for utf8_to_ucs4. They are caused by the wrong
pattern in which G_IMPLEMENT_INLINES is used in glib, and which I
reproduced in this new code: the source file that is made responsible
for emitting the functions for the non-inline API ends up calling the
non-inlined extern versions itself. A proper implementation would
create a dedicated source file to collect all non-inlined emissions
throughout glib, but that will wait for another branch.
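The dedicated-file idea can be sketched roughly as follows. All
SKETCH_* and sketch_* names are hypothetical and only loosely modelled
on how an emit-on-request macro like G_IMPLEMENT_INLINES works; this
is not glib's actual header layout.

```c
/* --- contents of a hypothetical header, sketch-inlines.h --- */

#ifdef SKETCH_IMPLEMENT_INLINES
/* Requested by exactly one source file: emit ordinary extern
 * definitions for callers that cannot inline. */
#  define SKETCH_INLINE_FUNC
#else
/* Every other source file gets a private inline copy. */
#  define SKETCH_INLINE_FUNC static inline
#endif

/* Byte length of a UTF-8 sequence, given a valid lead byte. */
SKETCH_INLINE_FUNC unsigned
sketch_utf8_len (unsigned char lead)
{
  if (lead < 0x80)
    return 1;
  if (lead < 0xE0)
    return 2;
  if (lead < 0xF0)
    return 3;
  return 4;
}

/* --- the dedicated emitter file would contain only this ---
 *
 *   #define SKETCH_IMPLEMENT_INLINES
 *   #include "sketch-inlines.h"
 *
 * Since no hot code lives in that file, nothing pays for seeing the
 * non-inlined extern versions. When a working source file defines the
 * macro instead, its own calls go to the extern versions too.
 */
```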
Without this wart, the performance is better:
GTest: run: /utf8/perf/utf8_to_ucs4
(MAXPERF:ASCII: 128.7 MB/s)
(MAXPERF:Latin-1: 120.8 MB/s)
(MAXPERF:Cyrillic: 151.9 MB/s)
(MAXPERF:Chinese: 211.1 MB/s)
GTest: result: OK
Enjoy,
Mikhail