Re: Possible Glib BTree substitute
- From: Havoc Pennington <hp redhat com>
- To: gtk-devel-list redhat com
- Subject: Re: Possible Glib BTree substitute
- Date: 29 Feb 2000 21:32:02 -0500
Hi,
I'll let Josh or some other expert comment on the data structure in
general, but some notes with respect to my text widget:
- The BTree data structure and its nodes are entirely concealed
in the single file frootkxtbtree.c in my widget, so it's
certainly possible to experiment with skip lists while retaining
all the other parts of the widget. It might be nice to rename
the BTree object, since its API does not require a BTree.
But even the BTree object is not public. So basically you can
change my widget to use a BTree and keep it binary/source
compatible.
Anyway, I think it would be wrong to compare text widgets on the
basis of the data structure used, the data structure is not
by any means the only important code in a text widget.
- The tree nodes are not really a memory problem; they contain
9 32-bit values, and a 50,000-line buffer results in about
7,700 nodes; 36 bytes per node times 8000 is under 300K.
Of the 9 elements in the node, all 9 (I think) are also
required in a skip list node. So you can only improve on
memory by reducing the number of nodes, and even then the
nodes are a small percentage of total memory used once you
count lines, wrapped display lines, character data, and other
stuff that either data structure will have.
- The only user-visible speed problem with the BTree, with a
50,000-line buffer, is re-wrapping a lot of lines. gprof
reveals most time spent in gdk_char_width(), with the second
largest amount of time spent retrieving a list of tags
applied to a given point in the buffer.
Pango will increase the gdk_char_width() time. This is really
the huge bottleneck.
Finding the list of tags involves walking up the tree from a leaf,
and at each parent node adding up the number of tag toggles below
that node (if you look at the code, in frootkxtbtree.c, inc_count()
is the function that shows up most in the profiling). Basically
this operation is O(n) in the number of parent nodes a line has.
So, this will be faster if the skip list is shallower than the
tree.
- I think you'll find that most of the complexity of frootkxtbtree.c
is not the tree (all the tree complexity is in
froo_tkxt_btree_rebalance()), rather the complexity is in updating
all the index information at the nodes (char_count, line_count,
tag summaries, max width, total height). There is also
significant complexity in the lines and segments, but a skip list
would have lines and segments as well.
So a different data structure isn't going to help that much with
the complexity that exists in this application.
- The frootkxtbtree.c code could be substantially improved by
rewriting it with a generic tree data structure which the
widget uses, most likely.
Some thoughts. Anyway, basic points are:
- the memory problem has nothing to do with nodes
- the speed problem is in rewrapping lines; this involves
font metrics, and it involves figuring out which tags apply
to a given region of text. You can speed this up potentially
by reducing the depth of your data structure, but the
character metrics stuff will continue to be the main problem
(Pango makes it worse, too).
- the Tk widget isn't really tied to the BTree, in fact the BTree
and Node structures are not defined in any header file.
- the BTree code in the Tk-based widget could use a good cleanup,
and a generic glib-style container is likely a good start.
Havoc
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]