Re: PROP: use of Perl Compatible Regular Expressions (long, sorry)
- From: Brian Stafford <brian stafford uklinux net>
- To: Albrecht Dreß <albrecht dress arcormail de>
- Cc: balsa-list gnome org
- Date: Tue, 17 Jul 2001 12:00:04 +0100
On Tue, 17 July 10:14 Albrecht Dreß wrote:
I'll give the patch a run later today!
> Problems:
> =========
>
> I made several tests with this lib on Linux/Intel and Linux/PowerPC and
> could not see problems yet. The performance is not very different from the
> posix stuff.
In my experience, PCRE performs about the same as GNU libc's regex.
It is *much* faster than the Henry Spencer RE code.
> Some caution is necessary as pcre can return empty matches.
> E.g. rewriting the expression above to "\\b[[:alpha:]']*\\b" may return an
> empty string (which is correct). The simple solution is to use
> "\\b[[:alpha:]']+\\b" instead.
Can't beat writing the correct RE :-)
> Note that PCRE does not resolve the problems with detecting the national
> characters ("Umlauts") in the [:alpha:] class as it relies on libc.
I wrote a simple test program as follows.
#include <stdio.h>
#include <locale.h>
#include <ctype.h>

/* Print the single-byte characters in each ctype class for the locale
   named on the command line (default: the "C" locale). */
int
main (int argc, char **argv)
{
  int c;

  if (argc > 1)
    setlocale (LC_CTYPE, argv[1]);

  printf ("[[:alpha:]]\n");
  for (c = 0; c < 256; c++)
    if (isalpha (c))
      putchar (c);
  putchar ('\n');

  printf ("[[:alnum:]]\n");
  for (c = 0; c < 256; c++)
    if (isalnum (c))
      putchar (c);
  putchar ('\n');

  printf ("[[:upper:]]\n");
  for (c = 0; c < 256; c++)
    if (isupper (c))
      putchar (c);
  putchar ('\n');

  printf ("[[:lower:]]\n");
  for (c = 0; c < 256; c++)
    if (islower (c))
      putchar (c);
  putchar ('\n');

  return 0;
}
Running this with a few different locales gave the following output:
1025 $ ./locale
[[:alpha:]]
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
[[:alnum:]]
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
[[:upper:]]
ABCDEFGHIJKLMNOPQRSTUVWXYZ
[[:lower:]]
abcdefghijklmnopqrstuvwxyz
1026 $ ./locale en_GB
[[:alpha:]]
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ
[[:alnum:]]
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ
[[:upper:]]
ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ
[[:lower:]]
abcdefghijklmnopqrstuvwxyzßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ
1027 $ ./locale C
[[:alpha:]]
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
[[:alnum:]]
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
[[:upper:]]
ABCDEFGHIJKLMNOPQRSTUVWXYZ
[[:lower:]]
abcdefghijklmnopqrstuvwxyz
1029 $ ./locale de_DE
[[:alpha:]]
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ
[[:alnum:]]
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ
[[:upper:]]
ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ
[[:lower:]]
abcdefghijklmnopqrstuvwxyzßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ
Given the above, I'd suspect the problem lies in Balsa's use of setlocale()
and not with GNU libc.
Brian Stafford