Re: patch to change behavior of g_strsplit on empty string back to 1.2 behavior



Darin Adler <darin bentspoon com> writes:

> I did more testing, and found a buffer overrun problem in g_strsplit
> too in cases where the string ends in a delimiter. Here's the patch
> with that fixed (also attached to the bug report).
> 
> OK to commit?

Well, there is clearly one thing missing here ... the corresponding
change to the documentation (docs/glib/tmpl/string_utils.sgml) :-)

That needs a comment to the effect of:

 "The result of splitting an empty string is an empty vector"

since there is no way you could logically deduce this.
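
To make the special case concrete, here is a minimal sketch of the token
count the patched code would seem to give when max_tokens is unlimited.
count_tokens() is a hypothetical helper for illustration only, not a GLib
function, and it uses only the standard C library. Every non-empty string
yields one more token than it has delimiters, so the zero for the empty
string is an exception that a reader would not guess from the general rule:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GLib: number of tokens a split would
 * yield with no max_tokens limit, assuming the rule proposed in the patch. */
static size_t
count_tokens (const char *string, const char *delimiter)
{
  size_t n = 1;
  size_t delimiter_len;
  const char *s;

  assert (string != NULL);
  assert (delimiter != NULL && delimiter[0] != '\0');

  /* Special case: the empty string yields an empty vector. */
  if (string[0] == '\0')
    return 0;

  /* General case: each delimiter found adds one more token. */
  delimiter_len = strlen (delimiter);
  for (s = strstr (string, delimiter); s != NULL;
       s = strstr (s + delimiter_len, delimiter))
    n++;

  return n;
}
```

Note that "a," still counts as two tokens here ("a" and ""); only the
completely empty input is treated specially.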
 
> Index: glib/gstrfuncs.c
> ===================================================================
> RCS file: /cvs/gnome/glib/glib/gstrfuncs.c,v
> retrieving revision 1.61
> diff -p -u -r1.61 gstrfuncs.c
> --- glib/gstrfuncs.c	2001/07/12 09:23:38	1.61
> +++ glib/gstrfuncs.c	2001/07/17 16:16:11
> @@ -1569,15 +1569,18 @@ g_strsplit (const gchar *string,
>   {
>     GSList *string_list = NULL, *slist;
>     gchar **str_array, *s;
> -  guint n = 1;
> +  guint n = 0;
> +  const gchar *remainder;
> 
>     g_return_val_if_fail (string != NULL, NULL);
>     g_return_val_if_fail (delimiter != NULL, NULL);
> +  g_return_val_if_fail (delimiter[0] != '\0', NULL);
> 
>     if (max_tokens < 1)
>       max_tokens = G_MAXINT;
> 
> -  s = strstr (string, delimiter);
> +  remainder = string;
> +  s = strstr (remainder, delimiter);
>     if (s)
>       {
>         gsize delimiter_len = strlen (delimiter);
> @@ -1587,18 +1590,22 @@ g_strsplit (const gchar *string,
>   	  gsize len;
>   	  gchar *new_string;
> 
> -	  len = s - string;
> +	  len = s - remainder;
>   	  new_string = g_new (gchar, len + 1);
> -	  strncpy (new_string, string, len);
> +	  strncpy (new_string, remainder, len);
>   	  new_string[len] = 0;
>   	  string_list = g_slist_prepend (string_list, new_string);
>   	  n++;
> -	  string = s + delimiter_len;
> -	  s = strstr (string, delimiter);
> +	  remainder = s + delimiter_len;
> +	  s = strstr (remainder, delimiter);
>   	}
>         while (--max_tokens && s);
>       }
> -  string_list = g_slist_prepend (string_list, g_strdup (string));
> +  if (*string && max_tokens)
> +    {
> +      n++;
> +      string_list = g_slist_prepend (string_list, g_strdup (remainder));
> +    }

If you look at the docs, the documented behavior is that once max_tokens
is reached, the remainder of the string goes into the last token. That is,

  "w,x,y,z", ",", 2 => "w" "x,y,z" 

The code in GLib stable didn't do that either, however; what it produced was

  "w,x,y,z", ",", 2 => "w" "x" "y,z" 
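
For comparison, a rough sketch of the documented max_tokens behavior,
written against the standard C library rather than GLib so it stands on
its own. split_documented() and copy_n() are hypothetical names, and this
is not the code that was committed; it only illustrates "stop splitting
one token early and put the rest in the last token", together with the
empty-string rule discussed above:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy the first len bytes of src into a fresh NUL-terminated string. */
static char *
copy_n (const char *src, size_t len)
{
  char *dst = malloc (len + 1);
  assert (dst != NULL);
  memcpy (dst, src, len);
  dst[len] = '\0';
  return dst;
}

/* Hypothetical illustration, not GLib code.  Returns a NULL-terminated,
 * malloc'ed vector of malloc'ed strings.  max_tokens < 1 means no limit. */
static char **
split_documented (const char *string, const char *delimiter, int max_tokens)
{
  size_t delimiter_len, n = 0, capacity = 1;
  const char *remainder = string, *s;
  char **vec;

  assert (string != NULL);
  assert (delimiter != NULL && delimiter[0] != '\0');
  delimiter_len = strlen (delimiter);

  /* Upper bound on the token count: one more than the delimiter count. */
  for (s = strstr (string, delimiter); s != NULL;
       s = strstr (s + delimiter_len, delimiter))
    capacity++;
  vec = malloc ((capacity + 1) * sizeof *vec);
  assert (vec != NULL);

  if (*string != '\0')   /* an empty string gives an empty vector */
    {
      /* Split only while at least one more token is still allowed
       * after the one about to be produced. */
      while ((max_tokens < 1 || n + 1 < (size_t) max_tokens)
             && (s = strstr (remainder, delimiter)) != NULL)
        {
          vec[n++] = copy_n (remainder, (size_t) (s - remainder));
          remainder = s + delimiter_len;
        }
      /* Whatever is left, further delimiters included, is the last token. */
      vec[n++] = copy_n (remainder, strlen (remainder));
    }

  vec[n] = NULL;
  return vec;
}
```

The caller would free each element and then the vector itself, much as
g_strfreev() does for the real function.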

Regards,
                                        Owen



