Re: Caching merged diffs



2009/7/11 Stephen Kennedy <stevek gnome org>:
>> In Differ, merged diffs are repeatedly calculated whenever we iterate
>> over changes. The merged versions of diffs can theoretically differ,
>> as the underlying text is passed in on each call. However, this text
>> always ends up being the raw file text, and we always update our diffs
>> whenever this text changes. As such, we can cache the merged version
>> of our diffs by disallowing the effectively unused texts argument.
>>
>> I'm attaching two patches, the first of which implements this caching
>> of merged diffs, and the second of which provides a short-circuit for
>> two-way comparisons. On two-way diffs, these patches provide a very
>> slight speed improvement; on three-way diffs, the speed-up is easily
>> measurable -- around 5% in a few quick tests based on scrolling
>> through a three-way source file comparison.
>
> That's a good idea. I've semi-deliberately not changed any of the diffing
> code in ages because I've always intended to rewrite it with an asynchronous
> interface. I think this is the only way to get acceptable performance on
> large files/large change blocks.
>
> It's reasonable then to put the diff computation & most of the I/O in a
> subprocess, which would make the UI much more responsive.

I couldn't agree more. This caching patch has the benefit that it
drops us to set_sequences_iter() and change_sequence() as the only
interface points that modify internal state. Turning these into
asynchronous methods shouldn't be *too* hard...
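
For anyone following along without the patches, here is a rough sketch of
the caching idea. It is not the attached patch and not Meld's actual Differ:
the class name CachingDiffer, the _merge() placeholder and the merge_count
counter are made up for illustration, and only the method names
set_sequences_iter() and change_sequence() come from the discussion above.
The point is that the merged result is computed once and dropped only at
the two places that modify internal state.

```python
import difflib


class CachingDiffer:
    """Hypothetical stand-in for the Differ class discussed in this thread."""

    def __init__(self):
        self._texts = ([], [])
        self._changes = []
        self._merged_cache = None  # None means "needs recomputing"
        self.merge_count = 0       # illustration only: counts recomputations

    def set_sequences_iter(self, a, b):
        # Mutation point 1: replace both texts and recompute the raw diff.
        self._texts = (list(a), list(b))
        self._recompute()

    def change_sequence(self, index, new_lines):
        # Mutation point 2 (simplified): replace one text wholesale.
        texts = list(self._texts)
        texts[index] = list(new_lines)
        self._texts = tuple(texts)
        self._recompute()

    def _recompute(self):
        matcher = difflib.SequenceMatcher(None, *self._texts)
        self._changes = [op for op in matcher.get_opcodes()
                         if op[0] != "equal"]
        self._merged_cache = None  # invalidate whenever the diffs change

    def all_changes(self):
        # Note: no texts argument; the cache is only valid because the
        # merged result depends solely on internal state.
        if self._merged_cache is None:
            self.merge_count += 1
            self._merged_cache = self._merge(self._changes)
        return iter(self._merged_cache)

    @staticmethod
    def _merge(changes):
        # Placeholder for the real (more expensive) merging step.
        return tuple(tuple(change) for change in changes)
```

Iterating over all_changes() repeatedly, as happens while scrolling, then
costs one merge per text change rather than one per iteration.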

Kai

