Re: normalizing filenames and strings



On Wed, 2007-03-28 at 21:40 +0200, Denis Jacquerye wrote:
> On 3/28/07, Alexander Larsson <alexl redhat com> wrote:
> > On Wed, 2007-03-28 at 19:43 +0200, Denis Jacquerye wrote:
> > > On 3/28/07, Alexander Larsson <alexl redhat com> wrote:
> > > > On Wed, 2007-03-28 at 11:50 -0500, Shaun McCance wrote:
> > > > > On Wed, 2007-03-28 at 16:55 +0200, Xavier Bestel wrote:
> > > > > Most applications that operate on files will accept file name
> > > > > arguments when invoked.  What are we supposed to do with these?
> > > > > Bear in mind that the argument isn't only used by shell junkies.
> > > > > It's also used when, for example, you double-click a JPG to open
> > > > > EOG.  Nautilus passes the file name to EOG.
> > > > >
> > > > > If we don't normalize, users might have a hard time opening
> > > > > files from the command line.
> > > >
> > > > Filenames on disk can *never* *ever* be changed. They are byte strings
> > > > and must be treated as such, otherwise you can't open or operate on the
> > > > file they reference.
> > > >
> > > > However, when creating a *new* file, given a utf8 string as filename, we
> > > > can normalize it before creating the file.
> > >
> > > For a name given on the command line or at invocation, applications
> > > could test for the requested name; if it does not exist, they should
> > > try the canonically equivalent filenames that do exist.
> >
> > No, it's never right to guess like this. It can lead to all sorts of
> > problems, and it is a performance drag. File names are exact
> > identifiers, not UI strings.
> 
> So how should it be done? If I have a file "é" (precomposed) and I
> type "é" (decomposed), how is the existing file going to be opened?

If the file already exists it's rarely a problem. The typeahead matching
in the file selector can do normalization, or you could click on the
filename in the file selector. In the shell you could use tab
completion.
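
To illustrate the typeahead idea, here is a rough Python sketch (not the
actual file selector code). It assumes the directory listing is already
available as a list of strings; the function name match_typed_name and
the choice of NFC are my own assumptions. Normalization is used only for
the comparison, and the name handed back is the on-disk name, unchanged.

```python
import unicodedata

def match_typed_name(typed, names_on_disk, form="NFC"):
    """Return the on-disk names whose normalized form starts with the
    normalized typed text. The returned names are never modified."""
    key = unicodedata.normalize(form, typed)
    return [
        name for name in names_on_disk
        if unicodedata.normalize(form, name).startswith(key)
    ]

# The file was stored precomposed (U+00E9); the user types "e" followed
# by a combining acute accent (U+0301).
names = ["\u00e9.txt", "readme.txt"]
hits = match_typed_name("e\u0301", names)

# Open the file using the exact name from the listing, not the typed text.
for name in hits:
    print(repr(name))
```

The point is that the guessing stays in the UI, where the user picks from
names that really exist, and never happens in the code that opens the file.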

Of course, there are some situations where things can be tricky in the
shell, like the case of an unnormalized single-character filename. But
there are many other similar cases, like files with newlines in them,
filenames that start with dashes ("-rf" comes to mind), or filenames
that are also a URI. The more we have applications try to do "magic"
with the passed-in filename (like guessing if it's a filename or a URI),
the more strange corner-case behaviours we get.
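
For the dash case, the usual way out is for the caller to make the name
unambiguous rather than for the receiving application to guess. A small
Python sketch of that convention follows; the helper name
unambiguous_arg is hypothetical and it only covers relative names that
begin with a dash.

```python
import os

def unambiguous_arg(filename):
    """Prefix a relative name that starts with '-' with the current
    directory, so a command-line parser cannot mistake it for an option.
    The result still refers to the same file."""
    if filename.startswith("-"):
        return os.path.join(os.curdir, filename)
    return filename

print(unambiguous_arg("-rf"))
print(unambiguous_arg("notes.txt"))
```

The file name itself is not changed; only the path used to reach it is
spelled differently.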

It's up to the filesystem to handle filenames in a consistent way. If the
filesystem normalizes, that is fine, but if it doesn't, we shouldn't try
to work around it.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl redhat com    alla lysator liu se 
He's a war-weary alcoholic senator with a secret. She's a high-kicking 
motormouth journalist who can talk to animals. They fight crime! 



