new mime detection approach



I read the mime detection threads and considered the opinions and
options. It seems that there is no perfect solution to this problem.
After some discussion with Dave I came up with a solution that seems to
work pretty well, although it's not perfect.

Basically, we use regexp (filename-based) detection by default, and if
that fails (e.g. because the file has no extension) we fall back to
magic sniffing. However, for important uses (such as file properties,
file launching etc.) we force sniffing. We also sniff files the first
time they are selected.
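
As a rough illustration of the policy above, here is a minimal Python
sketch. It is not the gnome-vfs code: the function names, the
`force_sniff` parameter and the tiny `MAGIC` table are all made up for
the example, and Python's standard `mimetypes` module stands in for the
regexp/extension lookup.

```python
import mimetypes

# Hypothetical, deliberately tiny magic table: (offset, signature, MIME type).
MAGIC = [
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"%PDF-", "application/pdf"),
    (0, b"GIF87a", "image/gif"),
    (0, b"GIF89a", "image/gif"),
]

FALLBACK_TYPE = "application/octet-stream"


def sniff(path):
    """Read the start of the file and match it against MAGIC (costs I/O)."""
    with open(path, "rb") as f:
        head = f.read(64)
    for offset, signature, mime in MAGIC:
        if head[offset:offset + len(signature)] == signature:
            return mime
    return None


def detect(path, force_sniff=False):
    """Use the name by default; sniff when forced or when the name gives nothing."""
    by_name, _encoding = mimetypes.guess_type(path)
    if by_name is not None and not force_sniff:
        return by_name  # cheap path: the file is never opened
    # Sniffed result wins; fall back to the name, then to a generic type.
    return sniff(path) or by_name or FALLBACK_TYPE
```

A directory listing would call `detect(path)`, while file properties
and launching would call `detect(path, force_sniff=True)`.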

This has a couple of interesting features:

* Allows "unix-style" files without extensions to work as before.
* Does minimal I/O for directories where files have extensions,
resulting in better performance and less HD wear.
* Never moves icons around without the user explicitly doing something
with them, meaning clicks are deterministic (i.e. the icon won't move
right before you click on it because a background sniff changed the type
of something and re-laid out the folder).
* Only sniffs files on some explicit action, minimizing e.g. network
bandwidth use, while still never being fooled by wrong extensions for
important operations.

It does introduce a strange behaviour though. Clicking on a file (or
otherwise selecting it) can change its type (and icon). This will,
however, only happen on "invalid" input, i.e. files with the wrong
extension. This is not common, and I think this behaviour is less
irritating for a novice user than not being able to open the file.
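
To make the "type changes on selection" behaviour concrete, here is a
small Python sketch. The `FileEntry` class and its `select` method are
hypothetical names for illustration only, and the sniffer is passed in
as a plain callable so the example stays self-contained.

```python
import mimetypes


class FileEntry:
    """Hypothetical view-side record for one file in a folder listing."""

    def __init__(self, path):
        self.path = path
        # The initial type comes from the name alone, so listing needs no file I/O.
        self.mime_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
        self.sniffed = False

    def select(self, sniffer):
        """Sniff once, on first selection. Return True if the type changed."""
        if self.sniffed:
            return False  # already sniffed, nothing can change any more
        self.sniffed = True
        sniffed_type = sniffer(self.path)
        if sniffed_type is None or sniffed_type == self.mime_type:
            return False  # correct extension: the user sees no change
        self.mime_type = sniffed_type  # a real view would swap the icon here
        return True
```

Only an entry whose extension disagrees with its contents ever returns
True, and only on its first selection.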

The one thing people have pointed out that this doesn't handle is the
case of mis-sniffed files being hard for users to correct, and
mime-magic in general only being modifiable by developers. However, I
think that, given some work, we can make the mime sniffing pretty damn
good. In the end I trust magic more than extensions, since we control
the magic and can fix any issues, while we can never control or fix the
users' filenames.

And for the very few cases where the worst happens with sniffing, we
just have to make sure that the "Open With" UI allows you to open any
type of file with any app. Then these cases shouldn't cause any
insurmountable problem for users.

I'm sure some people will dislike this behaviour, but I think it's a fair
compromise. I did some measurements of sniffing performance, and I think
that it is an actual problem. On my (pretty fast) machine reading a dir
with 500 mpeg files takes 3.5 seconds (just the gnome-vfs part) with
sniffing and 0.09 seconds without it. That's roughly 40 times slower. We
just have to accept that seek times and rotational delays make opening a
lot of files slow, and that it will keep getting slower relative to disk
bandwidth and cpu performance in the future. And latencies get even
worse with network filesystems.
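
For reference, the arithmetic behind those numbers, as a few lines of
Python. Only the three measured values come from the measurement above;
the per-file figures are simple averages derived from them.

```python
FILES = 500
WITH_SNIFFING_S = 3.5      # measured, gnome-vfs part only
WITHOUT_SNIFFING_S = 0.09  # measured, gnome-vfs part only

# Overall slowdown factor and average cost per file in milliseconds.
slowdown = WITH_SNIFFING_S / WITHOUT_SNIFFING_S
per_file_sniff_ms = WITH_SNIFFING_S / FILES * 1000
per_file_plain_ms = WITHOUT_SNIFFING_S / FILES * 1000

print(f"slowdown: {slowdown:.1f}x")
print(f"per file with sniffing: {per_file_sniff_ms:.2f} ms")
print(f"per file without sniffing: {per_file_plain_ms:.2f} ms")
```

That is about 7 ms per file with sniffing against about 0.18 ms
without, which is in the range you would expect if each sniff costs a
disk seek.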

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl redhat com    alla lysator liu se 
He's a Nobel prize-winning drug-addicted rock star haunted by an iconic dead 
American confidante She's a hard-bitten bisexual traffic cop who don't take no 
shit from nobody. They fight crime! 



