Re: [Tracker] Tracker to do list

From: Laurent Aguerreche <laurent aguerreche free fr>
To: Jamie McCracken <jamiemcc blueyonder co uk>
Cc: Tracker List <tracker-list gnome org>
Subject: Re: [Tracker] Tracker to do list
Date: Thu, 07 Sep 2006 22:22:30 +0200

On Thursday, 07 September 2006 at 20:03 +0100, Jamie McCracken wrote:

Laurent Aguerreche wrote:

On Thursday, 07 September 2006 at 17:05 +0100, Jamie McCracken wrote:

Jamie McCracken wrote:

Laurent Aguerreche wrote:

I wonder whether the use of strlen() on UTF-8 is correct; it
shouldn't be... If I remember correctly, Unicode can use arrays filled
this way:
'\0' 'H' '\0' 'E' '\0' 'L' '\0' 'L' '\0' 'O'      ("HELLO")
where a '\0' can be replaced by a value to store characters on 2 bytes.
But I don't remember whether that happens with UTF-8. I'll have to check
what happens with strlen() and funky characters.

UTF-8 is not that two-byte layout (what you describe is UTF-16).

In UTF-8, ASCII is always 1 byte per character and is indistinguishable
from plain text/ASCII.

Non-ASCII is always 2-4 bytes per character (mostly 2 bytes though).

Also, a multibyte sequence cannot contain an ASCII character. (Every byte
of a multibyte character in UTF-8 has its most significant bit set to 1,
whereas ASCII is always less than 128 and so has an MSB of 0.) So a UTF-8
string never contains a zero byte before its terminator, and strlen()
returns its length in bytes (not in characters).
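
To make the byte-versus-character distinction concrete, here is a minimal
sketch in plain C. It assumes the input is valid, nul-terminated UTF-8;
utf8_char_count() is a hypothetical helper written for this illustration,
not a function from tracker. It counts characters by skipping continuation
bytes, which always have the bit pattern 10xxxxxx.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the characters (code points) in a
 * nul-terminated string that is assumed to be valid UTF-8.
 * Continuation bytes look like 10xxxxxx, so every byte that does NOT
 * match that pattern starts a new character. */
static size_t utf8_char_count(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; s++) {
        if (((unsigned char) *s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For pure ASCII such as "HELLO", strlen() and utf8_char_count() agree. For
a string holding one 2-byte character followed by four ASCII letters,
strlen() reports 6 bytes while the helper reports 5 characters. GLib also
provides g_utf8_strlen() for the same purpose.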

for ref: http://en.wikipedia.org/wiki/UTF-8


Ok, thank you.

So I introduced a bug in tracker-utils.c during my work on UTF8. :-)

In is_text_file(), I wrote:

    if (data_read) {
            char *s;

            s = g_locale_to_utf8 (buffer, 65565, NULL, NULL, NULL);

I propose this replacement:

    if (data_read) {
            char *s;

            s = g_locale_to_utf8 (buffer, -1, NULL, NULL, NULL);
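
A length of -1 tells g_locale_to_utf8() that the input is nul-terminated,
so the buffer has to be terminated after the read. The sketch below shows
one way to guarantee that, using only the C standard library.
read_text_chunk() is a hypothetical helper for illustration; the actual
read code in is_text_file() is not shown in this thread and may differ.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read at most cap-1 bytes from f into buf and
 * always nul-terminate the result. Returns the number of bytes read,
 * which can be passed as an explicit length instead of -1. */
static size_t read_text_chunk(FILE *f, char *buf, size_t cap)
{
    size_t n;

    if (cap == 0)
        return 0;

    n = fread(buf, 1, cap - 1, f);   /* leave room for the terminator */
    buf[n] = '\0';
    return n;
}
```

With the buffer filled this way, either the returned byte count or -1 is
a valid length argument. Passing the byte count is the safer choice if
the data might contain embedded zero bytes, since -1 stops at the first
one.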


Yes, thanks. It seems I missed that one when reviewing your work. The
buffer would need to be up to 4x the size of the input string to be
fully UTF-8 safe.

It might be worth checking whether that's the case elsewhere in tracker.



There is another bug now: tracker_db_save_file_contents() is called with
a directory as file_name... So, of course, fgets() blocks on it.
It seems I've found the reason: text_filter_file was sometimes wrongly
left at a non-NULL value (because it was never initialised) in
tracker_metadata_get_text_file().
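
For illustration only, here is a minimal sketch of the two defensive
habits involved: initialise the pointer to NULL before the code that may
or may not set it, and check that a path is a regular file before
reading it. lookup_text_filter(), choose_filter() and is_regular_file()
are hypothetical names, and the MIME type and filter name are made up;
this is not the code from the attached patch.

```c
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical stand-in: return a filter name for types that need one. */
static const char *lookup_text_filter(const char *mime)
{
    if (strcmp(mime, "application/pdf") == 0)
        return "pdftotext";
    return NULL;
}

/* The pointer is set only on some paths, so it must start out as NULL.
 * Without the initialiser, the caller could see a garbage non-NULL value. */
static const char *choose_filter(const char *mime)
{
    const char *text_filter_file = NULL;

    if (mime != NULL)
        text_filter_file = lookup_text_filter(mime);

    return text_filter_file;
}

/* Return 1 only if path names a regular file (POSIX stat()). */
static int is_regular_file(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

A caller could test is_regular_file() before handing a path to any
function that reads it with fgets(), so a directory is never opened as
text.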

I provide a patch.


Laurent.

Attachment: correct-tracker_metadata_get_text_file+variables.diff
Description: Text Data

