Nautilus/Medusa search index enhancements
- From: "Manuel Amador (Rudd-O)" <amadorm usm edu ec>
- To: desktop-devel-list gnome org
- Subject: Nautilus/Medusa search index enhancements
- Date: Mon, 17 Feb 2003 14:35:09 -0500
Medusa search/index improvement proposal
Hey there:
I had a couple of ideas that kept me restless, so I'm writing this very late in
the evening (or very early in the morning, choose the one you prefer; my bet is
number 2).
Goals
- to leverage existing code for local area networks
- to index removable devices and all kinds of volumes for creation of an offline
  searchable catalog
- to improve overall user experience
Medusa is both a file indexing daemon and a file search service. So far so
good. But Medusa can be extremely useful (a "slocate" on steroids) for many
other things beyond the "personal computer" use case.
Medusa can be leveraged to support two things I'd like to see:
- Removable media
- Storage/Local area networks
Right now, Medusa has no (AFAIK) real support for removable volumes. It can
index local and removable volumes alright, but not in the way I envision.
Coupled with Nautilus, this could mean that searching for a file containing
"user friendly" gives me results (e.g. in a multicolumn view) from which
double-clicking the file prompts me to insert the "Backups #2" volume into my
Hitachi 48X plus CD-ROM drive, after which Nautilus could simply resume regular
operations and show me the file's contents.
To add removable device support, three infrastructural areas need to be changed:
- database changes
- system interaction changes
- user interface changes
database
The full path name to a file should no longer be stored. Instead, Medusa should
store the volume's unique ID (GUID, whatever) and the volume name, along with
the base path (say a file lives under /usr/lib/gkrellm/plugins and /usr is a
separate filesystem; the base path here would be /lib/gkrellm/plugins).
I don't know if this is the way Medusa handles it right now, but, well, it
seems good. This gives us room to include CD-ROMs and floppies in the mix. If
you look carefully enough, perhaps storing the mount points or device files
isn't even needed, because you store the type of media (hard disk, CD-ROM,
etcetera, information readily available via standard system interfaces) along
with the volume information, so when requiring a file, you can reconstruct the
path independently from wherever it got indexed first (so you can pop your CD
into the Hitachi CD-ROM whenever you are listening to music in your Plextor
where you usually index your volumes).
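
As a rough sketch of the record layout described above (in Python, purely illustrative; the class name, field names and the "vol-1" style IDs are made up for this example and are not Medusa's actual schema), an entry could carry the volume identity plus a volume-relative path, and the absolute path would be rebuilt from wherever the volume happens to be mounted:

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Dict, Optional


@dataclass(frozen=True)
class IndexedFile:
    """One catalog entry: volume identity plus a volume-relative path."""
    volume_id: str    # hypothetical unique ID of the volume
    volume_name: str  # human-readable name, e.g. "Backups #2"
    media_type: str   # e.g. "hard disk", "CD-ROM"
    base_path: str    # path relative to the root of the volume


def make_entry(abs_path: str, mount_point: str, volume_id: str,
               volume_name: str, media_type: str) -> IndexedFile:
    """Strip the mount point so that only the base path is stored."""
    rel = str(PurePosixPath(abs_path).relative_to(mount_point))
    base_path = "/" if rel == "." else "/" + rel
    return IndexedFile(volume_id, volume_name, media_type, base_path)


def resolve(entry: IndexedFile, mounted: Dict[str, str]) -> Optional[str]:
    """Rebuild the absolute path from the current mount point.

    `mounted` maps volume IDs to their current mount points.
    Returns None when the volume is not mounted anywhere.
    """
    mount_point = mounted.get(entry.volume_id)
    if mount_point is None:
        return None
    return str(PurePosixPath(mount_point) / entry.base_path.lstrip("/"))
```

With this layout, an entry indexed while the volume was mounted on /usr resolves correctly even if the same volume later shows up under a different mount point, which is the property the paragraph above is after.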
System interaction
Medusa should be signaled by autorun whenever a drive is inserted, or
by 'dynamic' when a volume is hotplugged. The whole idea is that Medusa
should detect mounts and act accordingly.
Medusa should start indexing files if it's a new volume or begin monitoring
files there for changes.
Medusa should perhaps detect unmounts and immediately close all files being
indexed, to avoid the "busy: cannot umount" problem when umounting drives. And
it should be compatible with supermount.
It should also monitor for changes in files so as to avoid rescanning the
entire hard disk the braindead way slocate does nowadays (I think Medusa
already acts smart in this particular issue). And it should be 'nice' to system
resources (not be a hog when indexing).
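
To illustrate the mount detection part only, here is a minimal Python sketch that compares two snapshots of a mount table in the Linux /proc/mounts format (device, mount point, filesystem type, ...). It assumes polling, whereas the proposal above prefers being signaled by autorun or 'dynamic'; the function names are invented for the example:

```python
from typing import Dict, Set, Tuple


def parse_mounts(text: str) -> Dict[str, Tuple[str, str]]:
    """Map mount point -> (device, fstype) from /proc/mounts-style text."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fstype = fields[:3]
        mounts[mount_point] = (device, fstype)
    return mounts


def diff_mounts(before: Dict[str, Tuple[str, str]],
                after: Dict[str, Tuple[str, str]]) -> Tuple[Set[str], Set[str]]:
    """Return (newly mounted, no longer mounted) mount points."""
    added = set(after) - set(before)
    removed = set(before) - set(after)
    return added, removed
```

A daemon could read the mount table periodically and start indexing (or monitoring) every mount point in the first set. Note that this only notices an unmount after the fact, so it does not by itself solve the "busy: cannot umount" problem; that would need a notification before the unmount happens.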
User interface integration
Nautilus icons for drives should have a right click menu item that says "Index
this volume now" to signal Medusa that it should begin indexing the removable
volume.
Search results would include files in removable volumes (or at least an option
to include them!) and would index words in all files which have text.
Activating an entry in the search dialog should prompt me for the volume if
it's missing, and mount it, evidently only if this is possible at all.
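
The "prompt me for the volume if it's missing" behavior could look roughly like the following Python sketch. The two callbacks are hypothetical stand-ins: `lookup_mount` would ask the system where a volume is mounted, and `ask_for_volume` would be the Nautilus dialog asking for the disc; neither is an existing Nautilus or Medusa API.

```python
from typing import Callable, Optional


def open_result(volume_id: str, volume_name: str, base_path: str,
                lookup_mount: Callable[[str], Optional[str]],
                ask_for_volume: Callable[[str], bool],
                max_prompts: int = 3) -> Optional[str]:
    """Return the absolute path of a search hit, prompting for its volume.

    lookup_mount(volume_id) returns the current mount point or None.
    ask_for_volume(volume_name) returns False if the user cancels.
    Returns None if the volume never becomes available.
    """
    prompts = 0
    while True:
        mount_point = lookup_mount(volume_id)
        if mount_point is not None:
            return mount_point.rstrip("/") + "/" + base_path.lstrip("/")
        if prompts >= max_prompts or not ask_for_volume(volume_name):
            return None
        prompts += 1
```

The retry limit is an arbitrary choice for the sketch; the point is simply that the search result stays usable whether or not the volume is present at the time of the search.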
As you can see, with the proper elbow grease at the nautilus level, and the
proper steel framing at the system level, we now have a very powerful
cataloguing system. As file systems evolve and delve into the metadata thing,
the cataloguing system will get richer all by itself, with little future work.
And I could look for all music files sung by ATB in all my CDs, which would by
far surpass anything the Microsofties offer nowadays. All this using only
distro-provided Linux software, and at no user effort (except, perhaps, for
floppies, since all other types of removable media are either handled by
autorun or dynamic).
Network scenarios
Great, huh? Now imagine this gets extended to support NFS networks. Medusa could
be accessible via the network: a Medusa search service on my local machine
could, instead of indexing network mounts, delegate the search to the Medusa
search service on the machine that exports the network mount, much the way
SGI FAM does (*).
Medusa should also respect the /etc/exports conventions.
This way, we leverage the NFS networks' facilities, with zero extra
configuration, while still providing an extremely low-resource network search
facility. This could mean that a newbie corporate hire could look for every
document containing the word "policy", with nearly zero network overhead, on
every corporate file server, and have the network show him ONLY the files he
can see (by traditional UNIX security semantics, which both Medusa and NFS
respect - and slocate as well). And so our new hire can get to work quickly and
know all company policies instead of getting a two-hundred page book or
complicated instructions on how to "Connect to a network drive and access
folder XYZ".
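
One possible reading of "respect the /etc/exports conventions" is that the server-side search service only returns hits that live under an exported directory. The Python sketch below assumes the usual exports layout (one export per line, directory first, then clients, `#` for comments) and ignores line continuations, quoted paths and per-client rules; the function names are made up:

```python
from pathlib import PurePosixPath
from typing import List


def exported_dirs(exports_text: str) -> List[str]:
    """Collect the exported directories from /etc/exports-style text."""
    dirs = []
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        dirs.append(line.split()[0])  # first field is the directory
    return dirs


def is_exported(path: str, dirs: List[str]) -> bool:
    """True if `path` is an exported directory or lies below one."""
    p = PurePosixPath(path)
    return any(p == PurePosixPath(d) or PurePosixPath(d) in p.parents
               for d in dirs)
```

Filtering by the requesting user's UNIX permissions would be a separate step on top of this.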
(*) FAM delegates file monitoring to remote FAMs. When FAM cannot connect to a
remote NFS FAM server, it falls back to standard dnotify. This behavior can be
mimicked in Medusa-searchd. To prevent failed file accesses, removable media
search results wouldn't be returned to the client search service.
To work properly, this would evidently need autoconfiguration. This won't
succeed if the Medusa daemon needs to be configured in client or server
machines. Medusa has to work drop-in, out of the box, with current network
configuration.
- Medusa-searchd in the NFS server should accept remote connections if the NFS
  server is up.
- Medusa-searchd in the NFS server should respect the /etc/exports conventions
  and use the existing configuration files.
- Medusa-indexd in the client should never index NFS mounted volumes.
- Medusa-searchd in the client should always attempt first to connect to the NFS
  server, and failing that, use standard search methods or not show any results
  from the server at all. Many NFS networks would come down to their knees at the
  sole idea of traffic from every client hammering the entire exported share to
  index it.
- Medusa-indexd/-searchd code should be audited for possible vulnerabilities
  involving feeding purposefully corrupted files or search queries.
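
The client-side rules above could be sketched like this in Python. Everything here is hypothetical: `local_search` stands for a query against the local index, `remote_search` for a call to the search service on the NFS server (no such protocol exists yet), and the sketch takes the "show no results from the server" option when the remote service cannot be reached.

```python
from typing import Callable, Iterable, List, Tuple

Mount = Tuple[str, str, str]  # (mount point, fstype, source such as "host:/export")


def search_everywhere(query: str, mounts: Iterable[Mount],
                      local_search: Callable[[str, str], List[str]],
                      remote_search: Callable[[str, str, str], List[str]]) -> List[str]:
    """Search local volumes locally and delegate NFS mounts to their server."""
    results: List[str] = []
    for mount_point, fstype, source in mounts:
        if fstype == "nfs":
            host, _, export = source.partition(":")
            try:
                # Hits are assumed to be paths relative to the export.
                hits = remote_search(host, export, query)
            except OSError:
                continue  # no search service there: show nothing from it
            results.extend(mount_point.rstrip("/") + "/" + hit.lstrip("/")
                           for hit in hits)
        else:
            results.extend(local_search(mount_point, query))
    return results
```

The NFS share is never crawled from the client in this sketch, which is the whole point: the only network traffic is the query and its results.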
To boot, this could even be reused in a web project as a reusable search
service for intranets (to kill the need for htdig which doesn't really go
beyond HTML files).
I have the feeling that a couple of changes would make Medusa far more useful
than it currently promises to be. I bet that if this gets worked on, even the
KDE people would get around to using it. Remember how ugly and slow the search
box in KDE is; it's even slower than Windows' file search tool. KDE already
takes advantage of SGI FAM. Medusa could be the search service everyone has
been expecting.
good luck!
Rudd-O
===========================================================
UNIVERSIDAD TECNICA FEDERICO SANTA MARIA
CAMPUS GUAYAQUIL
CENTRO DE SERVICIOS INFORMATICOS
Mail sent via IMP-USM: http://www.usm.edu.ec/imp
===========================================================