Re: Filename encodings and GLib



On Tue, 2003-10-14 at 04:30, Alexander Larsson wrote:
> On Mon, 2003-10-13 at 17:14, Owen Taylor wrote:
> > This is in reference to:
> > 
> >  http://bugzilla.gnome.org/show_bug.cgi?id=101792
> >  
> > Right now, the GLib model is that there are three forms for a filename:
> > 
> >  A) "System filename form" ... NUL terminated byte sequence,
> >     no interpretation for user display
> >  B) UTF-8 form. 
> >  C) URI form
> ...
> >  - We still haven't figured out whether URIs encode UTF-8
> >    filenames or system filenames. Nautilus and GLib, I believe,
> >    are inconsistent about this.
> 
> Yeah. Which is a bit unfortunate. The problem with URIs is of course
> that we can't rely on the encoding of the filename in general, since the
> file could be on e.g. a remote ftp server with unknown encoding. Another
> issue is also that nautilus must be able to handle misencoded filenames
> so they can be renamed to something correct.

Note that for ftp, http, etc., it's a non-issue. The encoding is whatever
the remote server picks; the extent to which we can interpret the
octets as a human-readable string will depend on the relevant RFCs.

Really, the only question is for URIs we are generating ourselves,
and in particular, for file:// URIs.

I don't think nautilus is unique in needing to handle
misencoded filenames. And if it *is* unique, that still doesn't mean
it can use a different file:// scheme than the rest of the desktop.

If we believe that the straight octet encoding is correct, then we
should:

 A) Fix GLib to do the same
 B) Push this as a mini-spec on freedesktop.org

The main problem with straight octet-encoding of filenames is that
at best you can only guess how to display them to the user as anything
other than the literal URI.
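
To make "straight octet encoding" concrete, here is a minimal sketch in
plain C. The function name octets_to_file_uri() is hypothetical (it is
not a GLib function), it assumes an absolute path with no hostname part,
and it escapes everything outside a conservative set of ASCII characters.
The bytes of the system filename are percent-escaped as-is, with no
charset conversion at all.

```c
#include <stdlib.h>
#include <string.h>

/* Nonzero for bytes left unescaped: ASCII letters, digits, a few
 * punctuation characters, and '/' as the path separator.
 * This set is an assumption chosen to be conservative. */
static int is_path_safe(unsigned char c)
{
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
        (c >= '0' && c <= '9'))
        return 1;
    return c == '-' || c == '.' || c == '_' || c == '~' || c == '/';
}

/* Hypothetical helper: build a file:// URI from a system filename by
 * escaping its raw octets. Assumes 'path' is absolute.
 * Returns a malloc()ed string, or NULL on allocation failure. */
char *octets_to_file_uri(const char *path)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len = strlen(path);
    size_t i;
    /* "file://" + at most three output bytes per input byte + NUL */
    char *uri = malloc(7 + 3 * len + 1);
    char *out;

    if (uri == NULL)
        return NULL;

    memcpy(uri, "file://", 7);
    out = uri + 7;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) path[i];
        if (is_path_safe(c)) {
            *out++ = (char) c;
        } else {
            *out++ = '%';
            *out++ = hex[c >> 4];
            *out++ = hex[c & 0x0F];
        }
    }
    *out = '\0';
    return uri;
}
```

With this scheme, a file named "café.txt" in /tmp comes out as
file:///tmp/caf%E9.txt on a Latin-1 system and as
file:///tmp/caf%C3%A9.txt on a UTF-8 system. Both round-trip to the
right bytes on the machine that produced them, but nothing in the URI
says which charset to use when displaying it.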

> > So, we'd need to provide wrapper functions for the C library functions
> > that took our system filename form. The C standard functions that take
> > filenames are, to my knowledge, remove(), rename(), tmpnam(),
> > fopen(), freopen().
> 
> There should probably be a g_win32_filename_to_codepage/widechar() or
> something, if you want to pass the filename to other Win32 native
> functions.

Something along those lines makes sense to me; certainly the version
to convert to the wide representation. (Though that's just
g_utf8_to_utf16 internally, I think.)

Regards,
					Owen




