Re: Suggestion for file type detection approach
- From: Jeffrey Stedfast <fejj ximian com>
- To: Arvind Narayanan <arvindn meenakshi cs iitm ernet in>
- Cc: gnome-devel-list gnome org
- Subject: Re: Suggestion for file type detection approach
- Date: Sun, 04 Jan 2004 11:32:04 -0500
yes, the directory scan was already cached - pretending to test
first-run is bullshit because you can't know whether something else has
already scanned the directory and thus pulled it into the cache.
unless someone knows of a way to be sure that nothing is cached, a fair
first-run test is impossible.
anyways... about gnomevfs-ls:
- seems gnomevfs-ls stat()'s some of the mime-info files multiple times
even after it has opened/read them in.
- read()'s 4k from each file (4k is a pretty efficient size to read, but
if we don't *need* 4k, it might be overkill - file seems to read much
smaller chunks - 32 bytes at first, and more only if it needs it). I
guess what should be done is to decide what the max amount we ever need
is and simply read in that amount. if there are just a handful of types
that need more, or need a chunk from later in the file, perhaps encode
this somehow in the mime-info files?
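
A minimal sketch of that staged read, in Python rather than the C that gnome-vfs uses. The MAGIC table, the 32-byte SMALL_READ size and the sniff() name are all made up for illustration; this is not how gnome-vfs or file actually store their rules. The point is only that a rule can carry its own offset, so the extra read happens only for the few types that need it.

```python
import os

# Hypothetical magic table: (offset, magic bytes, mime type).
# Real mime-info data is richer; this only illustrates the idea.
MAGIC = [
    (0, b"\x7fELF", "application/x-executable"),
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"#!", "text/x-script"),
    (257, b"ustar", "application/x-tar"),  # needs a chunk later in the file
]

SMALL_READ = 32  # assumed size; covers every offset-0 rule above


def sniff(path):
    """Return a mime type guess, reading as few bytes as the rules need."""
    fd = os.open(path, os.O_RDONLY)
    try:
        head = os.read(fd, SMALL_READ)
        for offset, magic, mime in MAGIC:
            end = offset + len(magic)
            if end <= SMALL_READ:
                # rule fits inside the small read we already have
                if head[offset:end] == magic:
                    return mime
            else:
                # only now pay for an extra read at the rule's offset
                if os.pread(fd, len(magic), offset) == magic:
                    return mime
        return "application/octet-stream"
    finally:
        os.close(fd)
```

Most files are settled by the first 32-byte read; only a file that matches none of the offset-0 rules costs a second read.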
also to keep in mind when comparing gnomevfs-ls vs ls: gnomevfs-ls has
to load/parse/etc several mime-info files. you need to remember that
normally an application using gnome-vfs will only have to load those
mime-info files once for the entire session, and so other than that first
time nautilus loads a directory (which is basically just startup), it
will never need to load/parse those files again. whereas in a one-shot
ls vs gnomevfs-ls comparison, that load is paid on every run and is a
significant overhead.
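
To make the once-per-session point concrete, here is a small sketch in Python. The load_mime_info() name and the tab-separated "pattern, mime type" file layout are assumptions invented for the example, not the real mime-info format. The parse cost is paid on the first call only; later calls in the same process return the already-parsed table.

```python
import functools


@functools.lru_cache(maxsize=None)
def load_mime_info(path):
    """Parse a hypothetical 'pattern<TAB>mime' file, at most once per process."""
    table = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            pattern, mime = line.split("\t", 1)
            table[pattern] = mime
    # The same dict object is handed back on every later call,
    # so callers should treat it as read-only.
    return table
```

A short-lived command-line tool gets no benefit from this, since every run starts with an empty cache, which is why the one-shot comparison looks worse than a long-running application would.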
anyways, because pre-caching makes such a huge impact on sniffing, it
does suggest that the process is i/o bound (duh, makes sense). and so
caching these sniff results in EAs (extended attributes) or in some
separate cache file might well be a way to go.
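
A rough sketch of the "some other file" variant, in Python. The SniffCache class, its JSON layout and the mtime-plus-size validity check are assumptions made for illustration; nothing here is an existing gnome-vfs API. The sniffing function is passed in, so any implementation can sit behind it. A lookup still costs a stat(), but a hit avoids opening and reading the file.

```python
import json
import os


class SniffCache:
    """Remember sniff results per path, validated by mtime and size."""

    def __init__(self, cache_file, sniff_func):
        self.cache_file = cache_file
        self.sniff_func = sniff_func  # callable: path -> mime type string
        try:
            with open(cache_file, encoding="utf-8") as fh:
                self.entries = json.load(fh)
        except (FileNotFoundError, ValueError):
            self.entries = {}  # no cache yet, or unreadable: start empty

    def lookup(self, path):
        st = os.stat(path)
        stamp = [st.st_mtime_ns, st.st_size]
        entry = self.entries.get(path)
        if entry and entry["stamp"] == stamp:
            return entry["mime"]  # hit: file unchanged since last sniff
        mime = self.sniff_func(path)  # miss or stale: sniff for real
        self.entries[path] = {"stamp": stamp, "mime": mime}
        return mime

    def save(self):
        with open(self.cache_file, "w", encoding="utf-8") as fh:
            json.dump(self.entries, fh)
```

Storing the result in an extended attribute on the file itself would drop the separate cache file, at the price of needing a filesystem that supports them and write access to the file.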
Jeff
On Sun, 2004-01-04 at 10:28, Arvind Narayanan wrote:
> On Sun, Jan 04, 2004 at 10:30:47AM -0500, Jeffrey Stedfast wrote:
> >
> > just because the bulk of the time spent will be in i/o, does not mean
> > that profiling will be useless.
> >
> > anyways, just out of curiosity, I timed 'ls -l' and 'file' in /usr/bin
> > and here are the results I get:
> I tried this too. You're getting such quick times because the disk
> blocks are cached already. It would have been very slow the first time
> you tried it. Reboot and try file again if you like.
>
> This is what I get:
>
> $ time file /usr/bin/* > /dev/null # first time
>
> real 0m16.017s
> user 0m0.310s
> sys 0m0.480s
>
> $ time file /usr/bin/* > /dev/null
>
> real 0m0.455s
> user 0m0.290s
> sys 0m0.160s
>
> $ time file /usr/bin/* > /dev/null
>
> real 0m0.450s
> user 0m0.300s
> sys 0m0.150s
>
> >
> > `time /bin/ls -l /usr/bin`
> > real 0m2.783s
> > user 0m0.060s
> > sys 0m0.010s
> >
> > `time file /usr/bin/*`
> > real 0m2.788s
> > user 0m0.060s
> > sys 0m0.120s
> >
> > looks to me like "sniffing will suck performance-wise" is a hunk of
> > crap, at least assuming file-type sniffing is done efficiently.
> But you can't do it efficiently unless you can magically predict
> which directory the user is going to click on next. And you can't
> pre-cache the entire disk, can you?
>
> Arvind