Re: opinion poll on text widget iterators



Guillaume Laurent <glaurent@worldnet.fr> writes:
> 
> Another annoying thing is the type name. :-)
> 

Yeah it will be changed. ;-)
 
> Actually, for C++, the first form is much preferable. An STL iterator
> is usually used as follows :
> 
> for(iter = begin(); iterm != end(); iter++) 
> {
>   /* do stuff with iter */
> }
> 
> So we need to have seperate operations for moving it forward, and
> checking if we've reached end or not. The unref part is in
> iter::operator++(), so no problem if it takes 3 lines, the programmer
> won't see it.
>

You can still do this with the static object I think, can't you? And
it will be faster, since you can assign/compare the iterator by value,
instead of doing a malloc/free for each ++; you can end up with
all-inline operator= and operator==.

end() is going to require a special-case cache to get O(1), this can
be done though.

If you want O(1) dereference and increment time, the static iterator
is looking a lot more attractive. The reason is that _any_ change to
the buffer _at all_ is going to make the O(1) iterator into a dangling
pointer. The O(n) iterator is robust against many types of buffer
change. I would rather the O(1) iterator disappear as soon as it
possibly can... More on this below.

I know that no mutations are allowed while using a C++ iterator (at
least for vector), so this is expected behavior.

> > Misgiving #2: these index objects are 90% of the time not actually
> >  used for iteration; it will be both easier and more efficient to use
> >  _foreach() API's that are also provided for iterating over the
> >  buffer.
> 
> Not in C++, we already have for_each(), so having one to wrap looks
> quite awkward. I wonder how Python or Perl manage this - my guess is
> that the problem is similar.
> 

Indeed, well, you would not have to wrap the foreach functions, they
are mostly for C. However I agree you need a nice O(1) iterator object
in order to avoid wrapping the foreach functions.
 
> > "Dereferencing" a FrooTkxtIndex is O(n) where n is the length
> >  of the line containing the index.
> 
> We've got a problem here... Why is that ?
>

OK I phrased this slightly incorrectly. It is actually O(n) where n is
the number of segments in the line, not the length of the
line. Briefly, a line is a linked list of segments; a segment can be:
 - toggle a tag on or off
 - character data
 - embedded pixmap data
 - embedded widget data
 - a mark
 
Segments have a size in bytes.

The FrooTkxtIndex currently contains:
 - a pointer to the BTree
 - a pointer to the line
 - a byte offset

So dereferencing the index is like:

FrooTkxtSegment* seg;
gint byte = 0;

seg = index->line->segments;
while (seg != NULL)
{
  if ((byte + seg->size) > index->byte)
    break;
 
  byte += seg->size;
  seg = seg->next;
}

If the line is nothing but a single character data segment, then this
loop is always a single iteration, but if the line contains text with
attributes, or pixmaps, etc. then you can get a longer search.

The thing is, it's not O(1) technically but it is _in effect_ very
fast, and it's fairly robust since segments can be changed without
breaking the iterator. The iterator only breaks if you delete or
shorten the line. 

To get "true" O(1) behavior your iterator would contain a pointer to
a segment. However, _any_ change to the btree can destroy the validity
of your segments. For example, if you move a mark or remove a tag, the
character data segments on either side of the mark/tag segment will be
merged, and the mark/tag segment deleted. Marks and tags have size 0,
so do not affect the current FrooTkxtIndex.
 
> But these "common use cases" are actually quite uncommon if you
> consider how an iterator is used in general.
> 

That's one question however, is FrooTkxtIndex really an iterator in
the STL sense. I don't think it is right now; so the options are to
create a second iterator type that is like STL, _or_ change
FrooTkxtIndex to be like STL. Currently FrooTkxtIndex would not be
used in the same cases you'd use an STL iterator.

Thanks for the response, I'm getting a clearer idea of the issues...

Havoc



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]