Re: Suggestion for file type detection approach



just because the bulk of the time will be spent in I/O does not mean
that profiling will be useless.

anyways, just out of curiosity, I timed 'ls -l' and 'file' in /usr/bin
and here are the results I get:

time /bin/ls -l /usr/bin
real    0m2.783s
user    0m0.060s
sys     0m0.010s

time file /usr/bin/*
real    0m2.788s
user    0m0.060s
sys     0m0.120s

looks to me like "sniffing will suck performance-wise" is a hunk of
crap, at least assuming file-type sniffing is done efficiently.

note that I used ls -l so that both commands output roughly the same
amount of text, which does make a difference in speed, btw. of course,
one could argue that -l makes ls do extra work: an extra stat() call
per file, plus parsing /etc/passwd and /etc/group in order to map
uids/gids to strings. so, with plain ls and the output thrown away:

time file /usr/bin/* > /dev/null

real    0m0.200s
user    0m0.140s
sys     0m0.050s

time /bin/ls /usr/bin > /dev/null

real    0m0.027s
user    0m0.020s
sys     0m0.000s

so, done well, file type sniffing can be pretty damn fast (0.2s for all
of /usr/bin) - wouldn't you say? note that ls is not stat()ing each file
like the 'file' run has to (you can check this with strace). Just
forcing ls to stat() each file doubles the time necessary to list all
the files (you can check for yourself by looking at the time for ls vs
ls -l)
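
if you'd rather not dig through strace output, here's a rough sketch in
Python of the same comparison: one pass that only reads the directory,
and one that also does an lstat() per entry. the function names are
mine (not from ls or any other tool), and I'm assuming lstat() is a fair
stand-in for the extra work ls -l does per file:

```python
import os
import time

def list_names(directory):
    """Directory listing only, roughly what plain `ls` needs."""
    return sorted(os.listdir(directory))

def list_with_stat(directory):
    """Listing plus one lstat() per entry, roughly what `ls -l` adds."""
    result = []
    for name in sorted(os.listdir(directory)):
        info = os.lstat(os.path.join(directory, name))
        result.append((name, info.st_size))
    return result

def timed(func, directory):
    """Return the wall-clock seconds taken by func(directory)."""
    start = time.perf_counter()
    func(directory)
    return time.perf_counter() - start

if __name__ == "__main__":
    # /usr/bin matches the runs above; fall back to "." if it is missing.
    target = "/usr/bin" if os.path.isdir("/usr/bin") else "."
    print("names only:", timed(list_names, target))
    print("with lstat:", timed(list_with_stat, target))
```

run it more than once: the first pass over a directory may include
disk reads that later passes get from the cache, so only the repeat
runs are comparable.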

looking at the strace output of the file run, looks like file does a
bunch of small reads for each file - so it might even be possible to
speed file up by making it do 1 read() per file (or as few read()'s as
possible)
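
to make the "1 read() per file" idea concrete, here's a minimal sketch
in Python. this is not how file is implemented: the signature table is
a tiny illustrative sample, and HEADER_SIZE and sniff() are names I made
up, assuming the first 256 bytes are enough for the signatures listed:

```python
import os

# A few well-known magic numbers. The real database used by `file` is
# far larger; this table is illustrative only.
MAGIC = [
    (b"\x7fELF", "ELF binary"),
    (b"#!", "script"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\x1f\x8b", "gzip data"),
]

HEADER_SIZE = 256  # assumption: enough for every signature above

def sniff(path):
    """Classify a regular file from one open() and one read()."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Single read of the header; all matching is done in memory.
        header = os.read(fd, HEADER_SIZE)
    finally:
        os.close(fd)
    for signature, description in MAGIC:
        if header.startswith(signature):
            return description
    return "data" if header else "empty"
```

all the matching happens against the one buffer, so the per-file cost
is an open(), a read() and a close(). it only handles regular files;
directories and special files would need a stat() check first.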

I'll have to strace gnomevfs-ls once I build it, but if I recall
correctly - it matches file-types by using regex which is pretty slow
compared to the way file does it - so this might be one cause of
gnomevfs-ls slowness.

would also be interesting to see if gnomevfs-ls is making extra stat()
calls and the like. I'll have to check this...

Expect a follow-up from me once I get gnomevfs-ls built.

Jeff

On Sun, 2004-01-04 at 09:47, Soeren Sandmann wrote:
> Jeffrey Stedfast <fejj ximian com> writes:
> 
> > has anyone actually done any profiling? 
> 
> Do you know of a profiler that works for this problem? I think disk
> access is likely to be the bottleneck, and the various profilers
> generally measure CPU usage.
> 
> (A disk use profiler would be a great thing to have, but I am not sure
> it's possible to create one without kernel changes).
> 
> 
> Søren
> _______________________________________________
> gnome-devel-list mailing list
> gnome-devel-list gnome org
> http://mail.gnome.org/mailman/listinfo/gnome-devel-list
> 



