[gtksourceview/wip/chergert/pcre2: 2/2] pcre2: use JIT for PCRE2 matching
- From: Christian Hergert <chergert src gnome org>
- To: commits-list gnome org
- Cc:
- Subject: [gtksourceview/wip/chergert/pcre2: 2/2] pcre2: use JIT for PCRE2 matching
- Date: Wed, 30 Sep 2020 23:38:42 +0000 (UTC)
commit bde1ff0680a8ece56b0dcdd257e38b7b8e7e397b
Author: Christian Hergert <chergert redhat com>
Date: Wed Sep 30 12:41:49 2020 -0700
pcre2: use JIT for PCRE2 matching
If the JIT is supported, let's use it. This has some rather good performance
benefits, at the cost of a bit of memory overhead for the JIT'd code pages. In
particular, the cost of regex creation goes up by about 4x (using C as an
example language syntax). The average runtime of those regexes drops by about
the same magnitude (4x). However, the worst cases can drop significantly
more, with local tests showing an order of magnitude less (10x).
All our regexes are cached by the source language, so each one is run far
more often than it is created.
The goal here is to free up time during updates for other main loop work by
reducing how much time we spend in regex matching. It also means we can
support larger files without timing out (currently a 2 msec timeout per line
before hitting protections).
Some basic numbers comparing a non-JIT PCRE2 to a JIT PCRE2. We're comparing
PCRE2 in both cases, although it alone is an improvement over GRegex.
Creating regexes:
+-----------+-------+-------+------+
| Language  | Min   | Max   | Avg  |
+-----------+-------+-------+------+
| C         | .001  | .104  | .007 |
| C (JIT)   | .004  | .383  | .031 |
| CSS       | .001  | .711  | .022 |
| CSS (JIT) | .004  | 3.147 | .101 |
| JS        | .001  | .186  | .011 |
| JS (JIT)  | .003  | 4.474 | .050 |
+-----------+-------+-------+------+
Executing regexes:
+-----------+-------+-------+------+-------------------+
| Language  | Min   | Max   | Avg  | Notes             |
+-----------+-------+-------+------+-------------------+
| C         | .000  | .196  | .003 | gtktreeview.c     |
| C (JIT)   | .000  | .061  | .001 | gtktreeview.c     |
| CSS       | .000  | .211  | .022 | gtk-contained.css |
| CSS (JIT) | .000  | .068  | .001 | gtk-contained.css |
| JS        | .000  | .215  | .001 | windowManager.js  |
| JS (JIT)  | .000  | .108  | .000 | windowManager.js  |
+-----------+-------+-------+------+-------------------+
An important thing to remember here is how often these functions are called.
Regexes are created only a few times, while they are executed hundreds of
thousands to millions of times, even just to load a file.
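To make that trade-off concrete, here is a rough sketch of the break-even
arithmetic using only the Avg columns from the two tables above. The struct,
the helper name, and the scaling by 1000 (to keep the math in integers) are
illustrative assumptions and are not part of the patch. The table averages are
rounded to three decimals, so the results are ballpark figures at best.

```c
#include <assert.h>
#include <stddef.h>

/* Average costs copied from the tables above, multiplied by 1000.
 * The unit is whatever the tables use; the commit message does not say. */
typedef struct
{
  const char *language;
  int create_plain;  /* avg creation cost, no JIT */
  int create_jit;    /* avg creation cost, JIT    */
  int match_plain;   /* avg execution cost, no JIT */
  int match_jit;     /* avg execution cost, JIT    */
} RegexCosts;

/* Hypothetical helper (not in implregex.c): how many executions of a
 * regex are needed before the extra JIT creation cost is paid back.
 * Returns -1 if the averages show no per-execution saving. */
int
break_even_executions (const RegexCosts *costs)
{
  int extra_create;
  int saved_per_match;

  assert (costs != NULL);

  extra_create = costs->create_jit - costs->create_plain;
  saved_per_match = costs->match_plain - costs->match_jit;

  if (saved_per_match <= 0)
    return -1;

  if (extra_create <= 0)
    return 0;

  /* Round up, since only whole executions count. */
  return (extra_create + saved_per_match - 1) / saved_per_match;
}

/* Rows taken from the Avg columns of the two tables. */
const RegexCosts c_costs   = { "C",    7,  31,  3, 1 };
const RegexCosts css_costs = { "CSS", 22, 101, 22, 1 };
const RegexCosts js_costs  = { "JS",  11,  50,  1, 0 };
```

With these averages the helper returns 12 for C, 4 for CSS, and 39 for JS,
which is tiny next to the hundreds of thousands of executions involved in
loading a file.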
Since we need to highlight things between drawing of frames on the main loop,
anything we can do to reduce this overhead increases our ability to stay smooth
while drawing. Furthermore, it allows for highlighting files with longer lines
where we might otherwise trip up.
Fixes #158
gtksourceview/implregex.c | 31 ++++++++++++++++++++++++-------
1 file changed, 24 insertions(+), 7 deletions(-)
---
diff --git a/gtksourceview/implregex.c b/gtksourceview/implregex.c
index 2d5c9234..aaa26be2 100644
--- a/gtksourceview/implregex.c
+++ b/gtksourceview/implregex.c
@@ -42,6 +42,7 @@ struct _ImplRegex
PCRE2_SPTR name_table;
int name_count;
int name_entry_size;
+ guint has_jit : 1;
};
struct _ImplMatchInfo
@@ -190,6 +191,9 @@ impl_regex_new (const char *pattern,
&regex->name_table);
}
+ /* Now try to JIT the pattern for faster execution time */
+ regex->has_jit = pcre2_jit_compile (regex->code, PCRE2_JIT_COMPLETE) == 0;
+
#ifdef GTK_SOURCE_PROFILER_ENABLED
if (GTK_SOURCE_PROFILER_ACTIVE)
message = g_strdup_printf ("compile=%lx match=%lx pattern=%s",
@@ -545,13 +549,26 @@ again:
prev_begin = match_info->offsets[0];
prev_end = match_info->offsets[1];
- rc = pcre2_match (match_info->regex->code,
- (PCRE2_SPTR)match_info->string,
- match_info->string_len,
- match_info->start_pos,
- match_info->match_flags,
- match_info->match_data,
- NULL);
+ if (match_info->regex->has_jit)
+ {
+ rc = pcre2_jit_match (match_info->regex->code,
+ (PCRE2_SPTR)match_info->string,
+ match_info->string_len,
+ match_info->start_pos,
+ match_info->match_flags,
+ match_info->match_data,
+ NULL);
+ }
+ else
+ {
+ rc = pcre2_match (match_info->regex->code,
+ (PCRE2_SPTR)match_info->string,
+ match_info->string_len,
+ match_info->start_pos,
+ match_info->match_flags,
+ match_info->match_data,
+ NULL);
+ }
if (set_regex_error (error, rc))
{