Re: normalizing filenames and strings



On Wed, 2007-03-28 at 16:46 +0200, Alexander Larsson wrote:
> On Tue, 2007-03-27 at 13:15 -0400, Dr. Michael J. Chudobiak wrote:
> > > Filenames could also be NFC normalized when created, although that's
> > > not absolutely necessary.
> > 
> > It would be nice if gnome mandated a standard approach for 
> > normalization. Does everyone like NFC? (http://unicode.org/reports/tr15 
> > for info.)
> > 
> > > This could be fixed at a low level, in gtk filechooser for some cases
> > > or in apps. Gnome-vfs should handle that too.
> > 
> > It would be nice if gnome-vfs could handle this in the background, so 
> > coders don't have to worry about uri escaping and normalization at the 
> > same time. (The existing normalization functions have to be used on 
> > unescaped URIs. It's already tricky enough keeping track of gnome-vfs 
> > escaping issues...)
> 
> It's very hard and quite expensive to handle normalization automatically
> at the low level. You have to intercept every i/o operation, and it can
> introduce very strange behaviour (since we can't control what's already
> on the disk). We have to accept that unix filenames are strings of bytes
> and that we just cannot enforce any meaning on them (although we can do
> our best to try to make them some normalized form of utf8).
> 
> For uri escaping I'm doing my best to make it not an issue in the new
> GVFS API that is to replace gnome-vfs. (By not using uris much in the
> API.)
> 
> In practice I don't think there is an enormous problem. Most files are
> either selected in the fileselector/filemanager (so we don't care about
> normalization, just the filename bytestring that was selected) or, for
> new files, typed into the file selector. If the fileselector can do some
> normalization for typed-in names, we shouldn't, in normal use, create
> any "duplicate" unnormalized filenames.
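
To make the typed-in case concrete, here is a minimal Python sketch of
what "some normalization for typed-in names" could look like. It is only
an illustration: normalize_typed_name is a made-up helper, not a
gnome-vfs or GTK function, and it assumes filenames are stored as UTF-8.

```python
import unicodedata


def normalize_typed_name(name: str) -> bytes:
    """Return the UTF-8 bytes of the NFC form of a typed-in filename.

    Hypothetical helper for illustration; assumes UTF-8 on disk.
    """
    return unicodedata.normalize("NFC", name).encode("utf-8")


# The same visible name, typed in two canonically equivalent ways.
composed = "caf\u00e9.txt"      # e-acute as a single code point
decomposed = "cafe\u0301.txt"   # 'e' followed by a combining acute accent

# Unnormalized, the byte strings differ, so both could exist side by side.
assert composed.encode("utf-8") != decomposed.encode("utf-8")

# Normalized at the point of entry, both spellings give the same bytes.
assert normalize_typed_name(composed) == normalize_typed_name(decomposed)
```

Names that are selected rather than typed would bypass this and be used
as the exact bytestring found on disk, as described above.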

IMHO the only work needed to handle this is in the filename-selection
widgets, which should do completion based on similar Unicode names (like
the fileselector already does for names differing only by case).
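
A rough Python sketch of the kind of matching I mean. The names
match_key and complete are invented for the example, it is not how the
GTK fileselector is implemented, and it assumes the directory entries
have already been decoded from UTF-8.

```python
import unicodedata


def match_key(name: str) -> str:
    """Comparison key that ignores case and normalization form.

    Hypothetical helper: decompose, case-fold, then decompose again,
    since case folding can itself produce unnormalized text.
    """
    decomposed = unicodedata.normalize("NFD", name)
    return unicodedata.normalize("NFD", decomposed.casefold())


def complete(prefix: str, names: list[str]) -> list[str]:
    """Return the on-disk names whose key starts with the key of the prefix."""
    key = match_key(prefix)
    return [n for n in names if match_key(n).startswith(key)]


# Directory entries as found on disk, in mixed normalization forms.
entries = ["Re\u0301sume\u0301.txt", "r\u00e9sum\u00e9.odt", "readme"]

# The typed prefix uses the composed e-acute; both accented files match.
matches = complete("r\u00e9s", entries)
```

The completion returns the names exactly as they are on disk, so the
widget never has to rename or re-encode anything; only the comparison is
normalized.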

	Xav




