Re: Can someone please comment on this short program
- From: Matthias Kaeppler <matthias finitestate org>
- To: gnome-vfs-list gnome org
- Subject: Re: Can someone please comment on this short program
- Date: Wed, 14 Dec 2005 07:52:07 +0100
dannym wrote:
> So:
> std::string fnIso8859_15Foobar = dirhandle.read_next()->get_name();
> Glib::ustring u8Foobar = filename_to_utf8(fnIso8859_15Foobar);
> Uri uriFoobar = Uri::create(u8Foobar);
Well, except that you renamed the variables, this is what I do :)
That doesn't help solve the problem, however.
> Is filename_to_utf8 a glibmm wrapper or the original function? If it is
> a wrapper it should be made to have a signature like:
> Glib::ustring filename_to_utf8(std::string fnSource);
> (if it doesn't have already, that is)
Yes, that's right.
> > // error! Invalid byte sequence in conversion input
> > RefPtr<Uri> uri = Uri::create(filename);
> If the filename is some kind of whack local encoding, this is expected
> and actually good. Think if it didn't notice and you passed the url
> around to some unsuspecting innocent other machine...
I don't think users who use encodings other than UTF-8 (which are
still prominent, by the way) will feel like using my file manager if it
breaks each time it reads a filename not encoded in UTF-8. So no, this is
not a good thing :)
> > Uri::create() expects UTF-8, so you /have/ to convert the filename to
> > UTF-8 using the glib conversion functions:
> > RefPtr<Uri> uri = Uri::create(filename_to_utf8(filename));
> > No conversion error anymore, but now the Uri object is effectively
> > useless, because it doesn't point to an existing entity anymore; there
> > is no file on the system by the name the conversion yields.
> What do you mean? It should "point" to the right file...
No, it doesn't. That's because special characters such as umlauts are
encoded as different bytes in UTF-8 than in ISO-8859-1: "ä" is the single
byte 0xE4 in ISO-8859-1 but the two bytes 0xC3 0xA4 in UTF-8. I think my
initial post showed that this doesn't work:
File "täst":
Filename encoded in ISO-8859-1:
file:///home/matthias/t%E4st | exists: false
Filename encoded in UTF-8 (and THIS is actually pointing to something,
although it's the same "name"):
file:///home/matthias/t%C3%A4st | exists: true
So yes, it does make a difference. Both times the same file, but in
different encodings; for one the Uri says it doesn't exist, while in the
other encoding it says it does.
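To make the difference concrete, here is a minimal sketch in plain standard C++ (no glibmm involved) that reproduces the two escaped names above. The helpers percent_encode, latin1_to_utf8 and check_taest_example are hypothetical names made up for this illustration, not glib or gnome-vfs functions, and the conversion assumes the on-disk bytes really are ISO-8859-1.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: percent-encode every byte that is not an unreserved
// URI character (letters, digits, '-', '.', '_', '~'). '/' is kept as well
// so that path separators survive.
std::string percent_encode(const std::string& bytes)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : bytes) {
        const bool keep = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') ||
                          c == '-' || c == '.' || c == '_' || c == '~' ||
                          c == '/';
        if (keep) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: convert ISO-8859-1 bytes to UTF-8. Every Latin-1 byte
// maps to the code point of the same value, so bytes >= 0x80 become a
// two-byte UTF-8 sequence.
std::string latin1_to_utf8(const std::string& latin1)
{
    std::string out;
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// The name as stored on disk in ISO-8859-1, and what it turns into after
// conversion: two different byte strings, hence two different URIs.
void check_taest_example()
{
    const std::string on_disk = "t\xE4st";
    assert(percent_encode(on_disk) == "t%E4st");
    assert(percent_encode(latin1_to_utf8(on_disk)) == "t%C3%A4st");
}
```

The converted string is fine for display, but it is no longer the byte sequence the filesystem stores, which is why a lookup by the converted name can fail.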
> ... it just passes whatever bytes it finds. So how is glib finding out that
> this filename on that partition is "ISO-8859-1" encoded?
Only by looking at the environment or by the user telling it that
filenames are encoded in whatever G_FILENAME_ENCODING is holding.
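As a rough sketch of that lookup, the following standard C++ mimics the rule as the GLib documentation describes it: G_FILENAME_ENCODING is a comma-separated list in which "@locale" stands for the locale's charset; if it is unset but G_BROKEN_FILENAMES is set, the locale charset is used; otherwise UTF-8 is assumed. The function filename_charsets and its parameters are hypothetical and only approximate what glib does internally. The environment values are passed in as arguments rather than read with getenv so that the logic is easy to try out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical approximation of how the filename charset list is chosen.
// g_filename_encoding: value of G_FILENAME_ENCODING, or nullptr if unset.
// g_broken_filenames_set: whether G_BROKEN_FILENAMES is set.
// locale_charset: charset of the current locale, e.g. "ISO-8859-15".
std::vector<std::string> filename_charsets(const char* g_filename_encoding,
                                           bool g_broken_filenames_set,
                                           const std::string& locale_charset)
{
    std::vector<std::string> result;
    if (g_filename_encoding != nullptr && *g_filename_encoding != '\0') {
        const std::string list = g_filename_encoding;
        std::string::size_type start = 0;
        while (start <= list.size()) {
            std::string::size_type comma = list.find(',', start);
            if (comma == std::string::npos)
                comma = list.size();
            std::string item = list.substr(start, comma - start);
            if (item == "@locale")
                item = locale_charset;  // special token for the locale charset
            if (!item.empty())
                result.push_back(item);
            start = comma + 1;
        }
    } else if (g_broken_filenames_set) {
        result.push_back(locale_charset);
    } else {
        result.push_back("UTF-8");
    }
    return result;
}

// A few cases showing that the choice depends only on the environment,
// never on the bytes of the filename itself.
void check_charset_examples()
{
    assert(filename_charsets(nullptr, false, "ISO-8859-15").front() == "UTF-8");
    assert(filename_charsets(nullptr, true, "ISO-8859-15").front() == "ISO-8859-15");
    assert(filename_charsets("ISO-8859-1", false, "UTF-8").front() == "ISO-8859-1");
}
```

Nothing in this decision inspects the partition or the file, so a name stored in a different encoding than the environment claims is simply misinterpreted.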
> I really don't see the point in clinging to this anymore, we should just
> make _all_ the filesystem names UTF-8 (or at least look like that to
> userland for filesystems that have assumptions of encoding)...
I very much agree, but that still doesn't solve my problem :)
Regards,
Matthias