Re: new mime detection approach

On Tue, 20-01-2004 at 12:21, Brian Nitz wrote:
> Alexander,
>    I haven't looked at this gnome-vfs code in a while, but I suspect 
> that you have to use regex and filename hints along with sniffing to 
> differentiate some OpenOffice/StarOffice documents from zip files.
> Also I'm not sure it is possible to determine whether a .jar file is a 
> Java library, data, or an executable without unpacking and looking in 
> Do you have any suggestions on handling these cases?
> We will probably always need a mixture of sniffing and filetype/regex 
> for legacy documents but wouldn't it be possible to write meta-data into 
> a cache for all gnome documents written via gnome-vfs?
> A directory would contain a .vfs-meta file with something like this:
>     Number of files: 3
>     Number of files type-checked by gnome-vfs: 3
>     myfile.jpg    type:image/jpeg     launchwith:gimp
>                   type:video/mpeg4    launchwith:totem
>                   type:unknown        launchwith:undefined
> If the .vfs-meta file doesn't exist or the number of files in a 
> directory doesn't match the number typechecked by gnome-vfs, you'd know 
> you have to do some sniffing.  Wouldn't this speed up thumbnailing, 
> launching and directory browsing for all gnome-created files?
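
A rough sketch of the check described above, in Python for brevity. The
line layout follows the example listing; the names parse_meta and
needs_sniffing are made up for illustration, file names are assumed to
contain no whitespace, and nothing here is an actual gnome-vfs API.

```python
import os

META_NAME = ".vfs-meta"  # file name taken from the proposal above
CHECKED_PREFIX = "Number of files type-checked by gnome-vfs:"
TOTAL_PREFIX = "Number of files:"


def parse_meta(text):
    """Parse the illustrative .vfs-meta layout.

    Returns (checked_count, entries), where entries maps a file name to a
    dict such as {"type": "image/jpeg", "launchwith": "gimp"}.
    """
    checked = None
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(CHECKED_PREFIX):
            checked = int(line.split(":", 1)[1])
        elif line.startswith(TOTAL_PREFIX):
            continue  # informational only; the directory is counted directly
        else:
            # Assumption: "<name> key:value key:value", no spaces in <name>.
            fields = line.split()
            attrs = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
            entries[fields[0]] = attrs
    return checked, entries


def needs_sniffing(directory):
    """True if the cache is missing or its count disagrees with the directory."""
    meta_path = os.path.join(directory, META_NAME)
    if not os.path.exists(meta_path):
        return True
    with open(meta_path) as fh:
        checked, _ = parse_meta(fh.read())
    actual = sum(1 for name in os.listdir(directory) if name != META_NAME)
    return checked != actual
```

A caller would fall back to sniffing whenever needs_sniffing() returns
True, and otherwise trust the cached types.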

Yes.  This is also called EAs (when not done with standalone, hidden

> If only this had been done right back in 1980 :(

It was done right.  The Apple Mac did it right.  We're pretty much the
only ones still doing this.  Even Windows will get away from this thing
once they release WinFS.

> Alexander Larsson wrote:
> > On Thu, 2004-01-15 at 05:43, Peter Harvey wrote:
> > 
> > 
> >>I have no idea how the magic mime-type detection code works, but would
> >>it be made faster by using the extension of a file as a hint? i.e. if
> >>you are trying to determine the type of a file with .jpg extension,
> >>first check the JPEG magic entry? This should only involve reading the
> >>first few characters of the file.
> > 
> > 
> > This doesn't help. The slow part about sniffing is reading the first few
> > bytes of the file. You have to do several seeks on the hard drive, and
> > each seek is delayed by the HD seek time and on average half a
> > rotational delay. This adds up to about 10-15 msecs per file, and faster
> > hard drives don't really make this better. In fact, Alan Cox claimed that
> > opening a new file takes about the same time as reading another 8 MB
> > from a contiguous file (on modern disks).
> > 
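
To put rough numbers on the cost described above, here is a
back-of-the-envelope calculation in Python. The 7200 rpm spindle speed
and 8.5 ms average seek are assumptions picked for illustration, not
measurements from this thread, and the helper names are hypothetical.

```python
def access_time_ms(avg_seek_ms, rpm):
    """Average seek time plus half a rotation, in milliseconds."""
    rotation_ms = 60_000.0 / rpm  # time for one full platter rotation
    return avg_seek_ms + rotation_ms / 2.0


def sniff_cost_s(n_files, ms_per_file):
    """Total time to sniff n_files at a fixed per-file cost, in seconds."""
    return n_files * ms_per_file / 1000.0


if __name__ == "__main__":
    # Assumed disk parameters (illustrative only).
    per_access = access_time_ms(avg_seek_ms=8.5, rpm=7200)
    print(f"one random access: {per_access:.1f} ms")
    # Per-file range quoted in the message above.
    for ms in (10.0, 15.0):
        print(f"1000 files at {ms:.0f} ms each: {sniff_cost_s(1000, ms):.0f} s")
```

With those assumed figures a single random access comes to about 12.7 ms,
in the same range as the 10-15 msecs quoted, so a directory of 1000 files
costs on the order of 10 to 15 seconds to sniff no matter how few bytes
are read from each file.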
