Re: Lacking of a ref-counted string.



> On Wed, 20 Aug 2008 21:07:39 -0400 "Havoc Pennington" wrote:
>> If we think of GLib features as either for C, or for language bindings
>> in general, or for vala, this particular feature seems like it would
>> be *only* for vala - refcounted strings would be pretty strange in
>> plain C, and just overhead for other language bindings that already
>> have native string types they have to convert gchar* to.
> I personally have quite often wanted refcounted GStrings in "normal"
> programs that I've written; i.e. entirely unrelated to VALA.

This issue comes up repeatedly, and each time the response is to ask for proof that it would make things better.  How about the opposite: where's the proof that it would make things worse?  Exactly how much slower would GTK get if it had to ref-count strings instead of copying them everywhere, and how much more memory would it consume if it shared a pointer instead of duplicating every string that gets passed into or out of a widget?  How much less stable would GTK be with proper string lifetime management built in?  How much more complicated is it for bindings (most of which use ref-counted strings themselves) to wrap a reference to a string instead of wrapping a whole new copy of it?

Strings can be sizeable, too, especially once you start writing them in scripts that need several bytes per character in UTF-8, and quite frequently GTK will copy a string simply so that the programmer doesn't have to concern themselves with managing its lifetime until GTK is finished with it.  Then when you ask for it back, sometimes you'll get a copy which you have to free when you're done with it.  And if in the meantime you give it to another widget, or even back to a different property of the same widget, yet another copy gets made.

And then there's g_strdup_printf(), which is about the only decently safe way to use the printf() family of functions.  There you end up creating a string, handing it off to a GTK widget which immediately makes its own copy, and then turning right around and g_free()ing it again.  The alternative is to pre-calculate and pre-allocate memory for it, but that opens you up to off-by-one errors, is generally fiddly and error-prone, and is exactly what g_strdup_printf() is supposed to make better.  And the result still ends up being copied.

A few times I've taken the source code to a widget, re-worked it to use ref-counted strings, and it certainly seemed an improvement to me.  I could create a string, pass it into several properties of the widget, and bring it back again, all without making a copy until I wanted to change it.  Ref-counted strings are only needed where the value of a string is going to be retained, so a function that throws away the value and only keeps some computed result from it doesn't need to bother with ref-counted strings at all; you can still pass it the raw C string from a ref-counted string.  And I'm pretty sure adding and dropping a reference is a lot cleaner than returning a string just to have it g_free()'d by the caller.  I can't very well do that with every widget I use, however, which is why I've been thinking of switching to C++.
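
To make the idea concrete, here is a minimal sketch in plain C of what such a re-worked widget might look like.  Everything in it is an assumption made for illustration: RcStr, the rcstr_* functions, and the Widget struct with its setters are hypothetical names, not GLib or GTK API, and the count is deliberately not thread-safe.  The point is only that setting two properties from one string shares the same buffer instead of copying it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ref-counted string; not a GLib type. */
typedef struct {
    unsigned refs;   /* number of current owners */
    size_t   len;    /* length in bytes, excluding the terminating NUL */
    char     data[]; /* NUL-terminated, so it can be passed as a raw C string */
} RcStr;

/* Allocate a new string holding a copy of src, owned by the caller (refs == 1). */
static RcStr *rcstr_new(const char *src)
{
    size_t len = strlen(src);
    RcStr *s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;
    s->refs = 1;
    s->len = len;
    memcpy(s->data, src, len + 1);
    return s;
}

/* Take an additional reference; no bytes are copied. */
static RcStr *rcstr_ref(RcStr *s)
{
    assert(s != NULL && s->refs > 0);
    s->refs++;
    return s;
}

/* Drop one reference; the buffer is freed when the last owner lets go. */
static void rcstr_unref(RcStr *s)
{
    assert(s != NULL && s->refs > 0);
    if (--s->refs == 0)
        free(s);
}

/* Hypothetical stand-in for a widget that retains two string properties. */
typedef struct {
    RcStr *label;
    RcStr *tooltip;
} Widget;

/* Store s in *slot: reference the new value first, then release the old one,
 * so that re-setting a property to its current value is safe. */
static void widget_set_string(RcStr **slot, RcStr *s)
{
    RcStr *old = *slot;
    *slot = rcstr_ref(s);
    if (old != NULL)
        rcstr_unref(old);
}

static void widget_set_label(Widget *w, RcStr *s)   { widget_set_string(&w->label, s); }
static void widget_set_tooltip(Widget *w, RcStr *s) { widget_set_string(&w->tooltip, s); }

/* Release everything the widget holds. */
static void widget_clear(Widget *w)
{
    if (w->label != NULL)   rcstr_unref(w->label);
    if (w->tooltip != NULL) rcstr_unref(w->tooltip);
    w->label = w->tooltip = NULL;
}
```

A caller would create the string once with rcstr_new(), hand it to both setters, and then rcstr_unref() its own reference; the widget keeps the single shared buffer alive until widget_clear().  A function that only reads the text can simply be given s->data.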

The last argument I often hear against ref-counted strings is thread-safety.  What does that have to do with GTK, where you do all your GUI work from a single thread and use other threads as workers only?  And just how is it any different from any other non-trivial GLib structure?  If you want it to be thread-safe, you need to do some extra work to make it so.  And of course you would use copy-on-write for any shared string, which makes it safe to pass around pretty much however you wish without having to worry.  Simply duplicate and unref before you change it in any way; that still seems a great deal better than having to duplicate the string several times whether you change it or not.
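
The "duplicate and unref before you change it" rule can be sketched in a few lines of plain C.  As before, this is only an illustration under stated assumptions: CowStr and the cowstr_* functions are hypothetical names, not GLib API, and the plain integer count assumes single-threaded use (a thread-safe version would need atomic operations on the count).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ref-counted string used to illustrate copy-on-write. */
typedef struct {
    unsigned refs;   /* number of current owners */
    size_t   len;    /* length in bytes, excluding the terminating NUL */
    char     data[]; /* NUL-terminated bytes */
} CowStr;

/* Allocate a new string holding a copy of src, with one reference. */
static CowStr *cowstr_new(const char *src)
{
    size_t len = strlen(src);
    CowStr *s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;
    s->refs = 1;
    s->len = len;
    memcpy(s->data, src, len + 1);
    return s;
}

/* Share the string: bump the count, copy nothing. */
static CowStr *cowstr_ref(CowStr *s)
{
    assert(s != NULL && s->refs > 0);
    s->refs++;
    return s;
}

/* Drop one reference and free the buffer when nobody owns it any more. */
static void cowstr_unref(CowStr *s)
{
    assert(s != NULL && s->refs > 0);
    if (--s->refs == 0)
        free(s);
}

/* Return a string the caller may modify in place.
 * Sole owner: the same pointer comes back and nothing is copied.
 * Shared: a private duplicate is returned and the caller's reference
 * to the shared original is dropped.
 * On allocation failure NULL is returned and the caller still holds s. */
static CowStr *cowstr_make_writable(CowStr *s)
{
    assert(s != NULL && s->refs > 0);
    if (s->refs == 1)
        return s;
    CowStr *copy = cowstr_new(s->data);
    if (copy != NULL)
        cowstr_unref(s);
    return copy;
}
```

The copy happens only at the moment a shared string is about to be changed; a string with a single owner is modified in place, and a string that is never modified is never duplicated at all.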

Maybe the GTK gurus don't need ref-counting and other tricks to be able to manipulate and pass around strings safely and efficiently, but the rest of us mere mortals could sure do with it.


Personally I'd be happy with a GString that has type, ref-count, length, and gchar* members, where type refers to a structure containing the standard set of string operations, including conversion into and from GTK's native UTF-8.  Having a different string type for I/O (including file names) than for strings passed around GTK protects us from encoding issues: GTK widgets can safely assume their strings are correct simply by feeding them through a g_string_to_native() function (it could even be a #define for efficiency) that duplicates and converts if needed, and passes the string through unchanged otherwise.  But even just ref-counting alone would help, and even if it hurts efficiency a little, I fully believe it would be worth it.


Fredderic

