Re: Performance implications of GRegex structure



On Fri, March 16, 2007 08:28, Owen Taylor wrote:
>   char *
>   get_leading_digits(const char *str)
>   {
>      GStaticRegex regex = G_STATIC_REGEX_INIT("^\\d+", 0);
>      GMatcher *matcher;
>      char *result = NULL;
>
>      matcher = g_matcher_new_static(&regex, str, 0);
>      if (g_matcher_matches(matcher))
>          result = g_matcher_get(matcher, 0);
>
>      g_matcher_free(matcher);
>
>      return result;
>   }
>
> I'm not going to argue that the current GRegex API is unworkable,
> but I think it obscures the nature of the system - first you compile a
> regular expression, then you match against it - and that's going to make
> it harder for people to write correct, efficient code.

IMO the separation between the regex and the matcher is so obvious
(IIRC Java and Python do it) that it's not even worth discussing.

As for the examples, the last version seems the most readable.
And for the case where you're not interested in performance, it
can be simplified to:

  char *
  get_leading_digits(const char *str)
  {
     GStaticRegex regex = G_STATIC_REGEX_INIT("^\\d+", 0);
     return g_static_regex_matches(&regex, str);
  }

where g_static_regex_matches() is a helper defined as:

  char *
  g_static_regex_matches(const GStaticRegex *regex, const char *str)
  {
     GMatcher *matcher;
     char *result = NULL;

     /* regex is already a pointer here, so pass it through as-is */
     matcher = g_matcher_new_static(regex, str, 0);
     if (g_matcher_matches(matcher))
         result = g_matcher_get(matcher, 0);   /* fetch from the matcher */

     g_matcher_free(matcher);

     return result;
  }

IOW separating the compiled regex from the matcher does not have to
result in a more complicated usage pattern for people who don't care
about performance.
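
To make the same point with something that compiles today, here is a
rough sketch using POSIX <regex.h> as a stand-in for the API under
discussion (GStaticRegex and GMatcher are only proposals in this
thread). The names get_digits_regex() and regex_fetch_match() are made
up for illustration, and since POSIX EREs have no \d the pattern is
spelled "^[0-9]+". The compile step and the match step stay separate,
and a one-line wrapper hides the split from callers who don't care:

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Step 1: the compiled pattern. Compiled lazily on first use and then
 * reused. Note: this lazy init is not thread-safe; it is a sketch. */
static regex_t digits_re;
static int digits_re_ready = 0;

static const regex_t *
get_digits_regex(void)
{
    if (!digits_re_ready) {
        /* POSIX ERE has no \d, so spell out the character class. */
        if (regcomp(&digits_re, "^[0-9]+", REG_EXTENDED) != 0)
            return NULL;
        digits_re_ready = 1;
    }
    return &digits_re;
}

/* Step 2: one match of an already compiled pattern against one string.
 * Returns a newly allocated copy of the whole match, or NULL if there
 * is no match. The caller frees the result with free(). */
static char *
regex_fetch_match(const regex_t *re, const char *str)
{
    regmatch_t m;
    size_t len;
    char *result;

    if (re == NULL || regexec(re, str, 1, &m, 0) != 0)
        return NULL;

    len = (size_t)(m.rm_eo - m.rm_so);
    result = malloc(len + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, str + m.rm_so, len);
    result[len] = '\0';
    return result;
}

/* Convenience wrapper for callers who don't care about the split. */
char *
get_leading_digits(const char *str)
{
    return regex_fetch_match(get_digits_regex(), str);
}
```

Code that does care about performance can call get_digits_regex() once
and run regex_fetch_match() in a loop; everyone else just calls
get_leading_digits() and frees the result.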

-- 
Dimi Paun <dimi lattica com>
Lattica, Inc.




