Re: Suggestion for file type detection approach



I agree that extensions are much too crude for file association, though they may be useful as a clue to speed up sniffing (e.g. if I have a suspicion what the file type is, at least I'll know where to look in the content for proof). I also like Sean Middleditch's idea for an error message offering to correct the file extension (to improve compatibility with legacy OSes ;-)

I've been wondering about another possibility which could improve security:


1) When an application is installed, it registers a private key in the application registry and enters itself as capable of opening certain file types.

2) When the application creates a document, it signs the document with its private key.

3) When gnome-vfs is looking for a document handler, it can determine whether the file has a handler, and it can look further (if security is set to "high") to determine whether the file was signed by a trusted application.

4) If security is set to "high" and the document signature doesn't match a trusted application, a popup can ask what to do with untrusted content.

This should eliminate the possibility of *.doc.scr and *.jpg.reg worms. Obviously we would have to fall back on extension-clued content sniffing for existing and cross-platform docs, and warn the user that such content is untrusted.
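
To make the four steps a bit more concrete, here is a rough Python sketch. It is only an illustration under loud assumptions: the registry is a plain dictionary, all function names are made up, and it uses an HMAC with a per-application secret as a stand-in for a real private-key signature (the Python standard library has no public-key signing). An HMAC is symmetric, so anyone who can read the registered key could forge signatures; a real design would keep the private key with the application and register only a public key.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory stand-in for the application registry:
# application name -> signing key.
registry = {}


def register_application(name):
    """Step 1: the application registers a key when it is installed."""
    key = secrets.token_bytes(32)
    registry[name] = key
    return key


def sign_document(key, content):
    """Step 2: the application signs each document it creates."""
    return hmac.new(key, content, hashlib.sha256).digest()


def find_trusted_signer(content, signature):
    """Step 3: return the registered application that signed content, or None."""
    for name, key in registry.items():
        expected = hmac.new(key, content, hashlib.sha256).digest()
        if hmac.compare_digest(expected, signature):
            return name
    return None


def open_decision(content, signature, security="high"):
    """Step 4: open with the signer, ask the user, or fall back to sniffing."""
    signer = find_trusted_signer(content, signature) if signature else None
    if signer is not None:
        return ("open", signer)
    if security == "high":
        return ("ask-user", None)  # untrusted content: show the popup
    return ("sniff", None)         # lower security: extension-clued sniffing
```

A document whose content was altered after signing, or which carries no signature at all, ends up in the "ask-user" branch when security is "high".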


Also, AFAIK, OS X has two levels of file association:
  - The default view/open application for a file type
  - The creator of a file.

This means you can have two JPEG images which open with different applications: one will open with Photoshop (because it was created with Photoshop), while the other will open with iview (or whatever) because it has no creator. You could argue about how intuitive this behavior is; it's probably more useful to content creators than to content viewers.
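
As a rough sketch of that two-level lookup (not how OS X actually implements it; the table, the function name and the "installed" check are all assumptions for illustration):

```python
# Hypothetical default-handler table; names are illustrative only.
default_app_for_type = {"image/jpeg": "iview"}


def choose_application(file_type, creator=None, installed=()):
    """Prefer the file's creator if it is installed, else the type default."""
    if creator is not None and creator in installed:
        return creator
    return default_app_for_type.get(file_type)
```

With this, a JPEG whose creator is "photoshop" opens in Photoshop when it is installed, and a JPEG with no creator falls through to the default viewer for its type.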

Arvind Narayanan wrote:

My 2 cents:

* It is fairly common to get misnamed files. For instance, a webserver
has a cgi script that generates a pdf file on the fly and the browser
prompts the user to save it as .cgi. I have seen users become totally
confused by this. Had they been using nautilus they would have opened
the file without any problem.

* I'm not sure if it is "natural" for users to associate file type with
file extension. Even if it is, it's just not feasible on Unix. On
MS Windows it is OK because the OS has an idea about file types, but
on Unix it would definitely be an ugly hack.

* The speed depends greatly on the type of files. If they are mostly
folders, it is very fast but if they are executables or mp3s it is very
slow. The result is that performance is acceptable most of the time
but is very bad some of the time.

* Nothing is displayed before file type is completely determined for all
the files in the directory. IMHO changing this would greatly alleviate
the speed problem. Do file type detection only for files in the current
view.

* Caching would also lead to a big speed boost. At a minimum, sniffed
types for files in the current navigation stack should be cached, so
that back and forward are instantaneous.

* Another suggestion for speed: Use file type as a "clue" for sniffing.
What I mean is:
	# If the file ends in .tar:
		- First check if it has a tar header. If yes, declare it as a tar file.
		- If no, then do a full sniffing to check for each of the other types.
	# If the file is in /somepath/bin/:
		- First check if it is an ELF executable.
		- If not, do a full check.
	# etc.
This would make sniffing instantaneous for *most* files.

Cheers,
Arvind
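
Arvind's last suggestion (use the name or location as a clue, and only fall back to full sniffing when the clue fails) could be sketched in Python roughly as follows. This is not how Nautilus or gnome-vfs does it; the function names and the tiny table of checks are made up for illustration, and only ustar-format tar archives are recognised.

```python
import os


def is_tar(head):
    # POSIX ustar archives carry the magic "ustar" at byte offset 257.
    return head[257:262] == b"ustar"


def is_elf(head):
    return head[:4] == b"\x7fELF"


def is_pdf(head):
    return head.startswith(b"%PDF-")


# Hypothetical "full sniff": every known check, tried in order.
CHECKS = {"tar": is_tar, "elf": is_elf, "pdf": is_pdf}


def clue_from_path(path):
    """Return the type suggested by the file name or location, or None."""
    if path.endswith(".tar"):
        return "tar"
    if os.path.basename(os.path.dirname(path)) == "bin":
        return "elf"
    return None


def sniff(path, head):
    """Try the clued check first; fall back to the full list if it fails."""
    clue = clue_from_path(path)
    if clue is not None and CHECKS[clue](head):
        return clue
    for name, check in CHECKS.items():
        if name != clue and check(head):
            return name
    return "unknown"


def sniff_file(path):
    """Read only the first 512 bytes, enough for the checks above."""
    with open(path, "rb") as f:
        return sniff(path, f.read(512))
```

The clue never decides the type by itself, so a PDF saved as report.cgi or as something.tar is still identified from its content; the clue only changes which check runs first.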




