Re: Performance implications of GRegex structure



Gustavo J. A. M. Carneiro wrote:
  I can't resist stating my opinion on this :P

  I think it's OK to have a single GRegex object, with no separate match
or matcher, IF g_regex_copy is basically a lightweight copy[1].
It is.
  I think this matches well with the rest of the GLib APIs wrt. thread
safety.  None[2] of the other GLib data structures are thread-safe. E.g.
you can't share a GList between threads, you have to protect it by a
mutex or have one copy for each thread.  So why should GRegex follow a
different pattern?
As far as I understand, it won't magically be made thread-safe.
You will have to keep a per-thread copy or protect it with a mutex
or something similar [1].
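
To make the "lightweight copy" idea concrete, here is a rough sketch of
how such an object could be laid out. It is not GRegex code: ToyPattern,
ToyRegex and the toy_regex_*() functions are hypothetical names made up
for illustration, and the "pattern" is only a literal substring. The
assumption is that the expensive compiled part is shared and never
modified, while each copy carries its own match state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared part: stands in for the compiled pattern. */
typedef struct {
    char *literal;   /* toy "pattern": a plain substring to search for */
    int   refcount;  /* NOT atomic: copy and free from one thread only */
} ToyPattern;

/* Hypothetical regex object: shared pattern plus private match state. */
typedef struct {
    ToyPattern *pattern;    /* shared between copies, never modified */
    long        match_pos;  /* offset of the last match, or -1 */
} ToyRegex;

ToyRegex *toy_regex_new(const char *literal)
{
    size_t n = strlen(literal) + 1;
    ToyPattern *p = malloc(sizeof *p);
    ToyRegex *r = malloc(sizeof *r);
    char *lit = malloc(n);

    if (!p || !r || !lit) {
        free(p); free(r); free(lit);
        return NULL;
    }
    memcpy(lit, literal, n);
    p->literal = lit;
    p->refcount = 1;
    r->pattern = p;
    r->match_pos = -1;
    return r;
}

/* Lightweight copy: share the pattern, start with fresh match state. */
ToyRegex *toy_regex_copy(const ToyRegex *src)
{
    ToyRegex *r;

    assert(src != NULL);
    r = malloc(sizeof *r);
    if (!r)
        return NULL;
    r->pattern = src->pattern;
    r->pattern->refcount++;
    r->match_pos = -1;
    return r;
}

/* Matching reads the shared pattern and writes only this object's state. */
int toy_regex_match(ToyRegex *r, const char *text)
{
    const char *hit = strstr(text, r->pattern->literal);

    r->match_pos = hit ? (long)(hit - text) : -1;
    return hit != NULL;
}

void toy_regex_free(ToyRegex *r)
{
    if (!r)
        return;
    if (--r->pattern->refcount == 0) {
        free(r->pattern->literal);
        free(r->pattern);
    }
    free(r);
}
```

With a layout like this, one thread would create the object, make one
toy_regex_copy() per worker thread, and free the copies after the workers
finish. The workers only read the shared pattern and write their own
match_pos, so matching itself needs no mutex. The plain int refcount is
the weak spot: copying or freeing from several threads at once would need
an atomic counter or a lock.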

The real problem is that g_regex_copy() is not what people think it is,
and people are used to object names which contain "match". That seems
reason enough to change the API so that its names reflect
what the objects and methods do.

I personally am fine with GRegex, and I'd prefer it to stay as it is,
but having to use Matcher in place of Regex doesn't seem
to be a high price for everybody being happy, so why not?

As to language bindings, IMO no public API which is intended to
be used from bindings should use GRegex. I can't say exactly why, but I
am pretty sure it will lead to trouble, or at least to unnecessary
work in bindings and in non-C code using those bindings. GRegex
properties or signal arguments are a bad idea, as with GHashTable.

Yevgen

[1] It's possible to make regex_match() return some Match (as opposed
to "Matcher") objects, like in Python, but then these objects would need
to keep state, not just the "result", and you would need to pass the Match
object to the next regex_match(), which would be totally weird.



