[meld] matchers: Copy the passed-in text for mutability and speed
- From: Kai Willadsen <kaiw src gnome org>
- To: commits-list gnome org
- Cc:
- Subject: [meld] matchers: Copy the passed-in text for mutability and speed
- Date: Sun, 2 Oct 2016 00:12:45 +0000 (UTC)
commit c57c36f62b05d8fa75a75dd2d76ccc2443a6579f
Author: Kai Willadsen <kai willadsen gmail com>
Date: Mon Sep 26 06:56:28 2016 +1000
matchers: Copy the passed-in text for mutability and speed
The mutability argument here is pretty clear: we should take a copy of
the sequences, because we can't guarantee that they're not going to
change while we're running our comparison. I'm pretty sure that our
yield points actually happen to guarantee this anyway, but I'd much
prefer being explicit here.
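
As a minimal sketch of the mutability point (plain Python lists stand in for the sequences handed to the matcher; the variable names are illustrative and not taken from Meld), storing a reference tracks later edits while a slice copy does not:

```python
# Illustrative only: a plain list stands in for the sequence passed to the matcher.
lines = ["alpha", "beta", "gamma"]

alias = lines        # what `self.a = a` stores: the caller's own object
snapshot = lines[:]  # what `self.a = a[:]` stores: a new, independent list

# The caller edits the sequence after handing it over.
lines[1] = "BETA"
lines.append("delta")

print(alias)     # reflects both edits
print(snapshot)  # still the original three lines
```

The copy is shallow, which is enough when the elements are immutable strings.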
The speed argument is much weirder and more annoying. What this differ
almost always gets passed is a pair of MeldBufferLines instances, which
expose a Python-list-like interface over the lines of a GtkTextBuffer.
What this means in practice is that doing things like iterating over
MeldBufferLines results in half a dozen GTK+ API calls to e.g., get
the text iterator for a visual line, get the start and end of the line,
get the text from that line, clean it up... it's a nightmare and it's
super, super slow.
Doing the whole-buffer copy here does all of this, but only once.
Obviously we pay the memory penalty of copying the whole file, but
given the performance improvements I'm willing to take this as a peak
usage cost.
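
The cost argument above can be sketched with a hypothetical stand-in class. CountingLines below is not MeldBufferLines and makes no GTK+ calls; it only counts line fetches to model an expensive per-line lookup. It also assumes that slicing the wrapper returns a plain list, which is what makes the copy worthwhile:

```python
class CountingLines:
    """Hypothetical list-like view over text that counts every line fetch."""

    def __init__(self, text):
        self._lines = text.split("\n")
        self.fetches = 0  # number of simulated expensive lookups

    def __len__(self):
        return len(self._lines)

    def _fetch(self, i):
        # Stands in for the several toolkit calls needed to read one line.
        self.fetches += 1
        return self._lines[i]

    def __getitem__(self, key):
        if isinstance(key, slice):
            # A slice fetches each covered line once and returns a plain list.
            return [self._fetch(i) for i in range(*key.indices(len(self)))]
        return self._fetch(key)


def scan(seq, passes):
    """Read every element `passes` times, as a multi-pass algorithm might."""
    for _ in range(passes):
        for i in range(len(seq)):
            seq[i]


# Reading through the wrapper pays the lookup cost on every access.
view = CountingLines("a\nb\nc\nd")
scan(view, passes=3)
direct_fetches = view.fetches

# Copying first pays the lookup cost once per line; later reads hit the list.
view = CountingLines("a\nb\nc\nd")
copied = view[:]
scan(copied, passes=3)
copied_fetches = view.fetches

print(direct_fetches, copied_fetches)
```

With four lines and three passes, the wrapper is consulted twelve times in the first case and four times in the second, at the price of holding a second copy of the text in memory.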
meld/matchers.py | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)
---
diff --git a/meld/matchers.py b/meld/matchers.py
index af51fa9..160db12 100644
--- a/meld/matchers.py
+++ b/meld/matchers.py
@@ -82,8 +82,11 @@ class MyersSequenceMatcher(difflib.SequenceMatcher):
     def __init__(self, isjunk=None, a="", b=""):
         if isjunk is not None:
             raise NotImplementedError('isjunk is not supported yet')
-        self.a = a
-        self.b = b
+        # The sequences we're comparing must be considered immutable;
+        # calling e.g., GtkTextBuffer methods to retrieve these line-by-line
+        # isn't really a thing we can or should do.
+        self.a = a[:]
+        self.b = b[:]
         self.matching_blocks = self.opcodes = None
         self.aindex = []
         self.bindex = []