Nautilus/Medusa improvement ideas



Hi there. I'm sorry about the cross post but I feel this concerns both mailing lists =) Good luck.

Medusa proposal


Hey there:

I had a couple of ideas that kept me restless, so I'm writing this
very late in the evening (or very early in the morning, choose the
one you prefer - my bet is number 2).

GOALS:
- to leverage existing code for local area networks
- to index removable devices and all kinds of volumes
- to improve the user experience

Medusa is both a file indexing daemon and a file search service.  So
far so good.  But Medusa can be extremely useful (a "slocate" on
steroids) for many other things beyond the "personal computer" use
case.

Medusa can be leveraged to support two things I'd like to see:
- Removable media
- Storage/Local area networks

Right now, Medusa has (AFAIK) no real support for removable
volumes.  It can index whatever happens to be mounted, local or
removable, but not in the way I envision.  Coupled with Nautilus,
proper support could mean that searching for a file containing "user
friendly" gives me results (e.g. in a multicolumn view), and that
double-clicking a result prompts me to insert the "Backups #2" volume
into my Hitachi 48X plus CD-ROM drive.  After that, Nautilus could
simply resume regular operations and show me the file's contents.

To add removable device support, these infrastructural areas need to be
changed:
- database changes
- system interaction changes
- user interface changes

For the database:
* The full path name to a file should no longer be stored.  Instead,
  Medusa should store the volume's unique ID (GUID, whatever) and the
  volume name, along with the path relative to the volume (say a
  file lives in /usr/lib/gkrellm/plugins/ and /usr is a filesystem;
  the base path here would be /lib/gkrellm/plugins).
  I don't know if this is the way Medusa handles it right now, but,
  well, it seems good.  This gives us room to include CD-ROMs and
  floppies in the mix.
* If you look carefully enough, perhaps storing the mount points or
  device files isn't even needed.  If the type of media (hard disk,
  CD-ROM, etcetera; information readily available via standard system
  interfaces) is stored along with the volume information, then when a
  file is requested the path can be reconstructed independently of
  wherever the volume was first indexed (so you can pop your CD into
  the Hitachi CD-ROM while you are listening to music in your
  Plextor, where you usually index your volumes).
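
A minimal sketch of the record layout described above, in Python.
This is not Medusa's actual schema; the names IndexedFile,
make_record and resolve are made up for illustration, and the
relative path is stored without a leading slash purely for
convenience when joining.

```python
import posixpath
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class IndexedFile:
    """One index entry, keyed by volume rather than by absolute path."""
    volume_id: str      # hypothetical unique volume ID (GUID, label hash, ...)
    volume_name: str    # human-readable name, e.g. "Backups #2"
    media_type: str     # "hard disk", "CD-ROM", ...
    relative_path: str  # path relative to the volume root


def make_record(full_path: str, mount_point: str, volume_id: str,
                volume_name: str, media_type: str) -> IndexedFile:
    """Strip the mount point so the record no longer depends on it."""
    rel = posixpath.relpath(full_path, mount_point)
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{full_path} is not under {mount_point}")
    return IndexedFile(volume_id, volume_name, media_type, rel)


def resolve(record: IndexedFile, mounted: Dict[str, str]) -> Optional[str]:
    """Rebuild an absolute path from wherever the volume is mounted now.

    `mounted` maps volume ID -> current mount point.  Returns None when
    the volume is not mounted, which is the caller's cue to prompt for it.
    """
    mount_point = mounted.get(record.volume_id)
    if mount_point is None:
        return None
    return posixpath.normpath(posixpath.join(mount_point, record.relative_path))
```

A None result from resolve() is the point where the front end would
ask for the volume by name (record.volume_name) instead of failing.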

System interaction:
* Medusa should be signaled by autorun whenever a disc is inserted,
  or by 'dynamic' when a volume is hotplugged - the whole idea being
  that Medusa should detect mounts and act accordingly.
* Medusa should start indexing files if it's a new volume, or begin
  monitoring the files there for changes if it's already known.
* Medusa should perhaps detect unmounts and immediately close all
  files being indexed, to avoid the "busy: cannot umount" problem
  when unmounting drives.  And it should be compatible with supermount.
* It should also monitor files for changes, so as to avoid
  rescanning the entire hard disk the braindead way slocate does
  nowadays (I think Medusa already acts smart on this particular
  issue).  And it should be 'nice' to system resources (not be a hog
  when indexing).
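
To illustrate the "detect mounts and act accordingly" part, here is
a rough Python sketch that compares two snapshots of a
/proc/mounts-style table.  It is only one possible approach (being
signaled by autorun or dynamic, as suggested above, would avoid the
polling), and parse_mounts and diff_mounts are hypothetical helper
names.

```python
from typing import Dict, Set, Tuple

MountTable = Dict[str, Tuple[str, str]]  # mount point -> (device, fstype)


def parse_mounts(text: str) -> MountTable:
    """Parse /proc/mounts-style text: 'device mountpoint fstype options ...'.

    Simplification: escaped characters in mount points (such as \\040 for
    a space) are left as-is rather than decoded.
    """
    mounts: MountTable = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fstype = fields[:3]
        mounts[mount_point] = (device, fstype)
    return mounts


def diff_mounts(before: MountTable, after: MountTable) -> Tuple[Set[str], Set[str]]:
    """Return (newly mounted, newly unmounted) mount points."""
    added = set(after) - set(before)
    removed = set(before) - set(after)
    return added, removed
```

A daemon could read /proc/mounts at intervals, start or resume
indexing for every mount point in the first set, and drop its open
file handles for every mount point in the second.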

Finally, for the user interface changes:
* Nautilus icons for drives should have a right click menu item that
  says "Index this volume now" to signal Medusa that it should begin
  indexing the removable volume.
* Search results would include files on removable volumes (or at
  least there would be an option to include them!), and the index
  would cover the words in all files that contain text.  Activating
  an entry in the search dialog should prompt me for the volume if
  it's missing, and mount it, obviously only if this is possible at
  all.

As you can see, with the proper elbow grease at the Nautilus level,
and the proper steel framing at the system level, we would have a very
powerful cataloguing system.  As file systems evolve and delve into
the metadata thing, the cataloguing system will get richer all by
itself, with little future work.  And I could look for all music files
by ATB across all my CDs, which would by far surpass anything the
Microsofties offer nowadays.  All of this using only distro-provided
Linux software, and at no user effort (except, perhaps, for floppies,
since all other types of removable media are handled by either autorun
or dynamic).

NETWORKS:

Great, huh?  Now imagine this gets extended to support NFS networks.
Medusa could be accessible via the network: instead of indexing
network mounts itself, the Medusa search service on my local machine
could delegate the search to the Medusa search service on the machine
that exports the network mount, much the way SGI FAM does (*).

Medusa should also respect the /etc/exports conventions.

This way, we leverage the NFS network's facilities with zero
extra configuration, while still providing an extremely low-resource
network search facility.  This could mean that a newbie corporate hire
could look for every document with the word "policy" in it, with
nearly zero network overhead, on every corporate file server, and have
the network show him ONLY the files he can see (by traditional UNIX
security semantics, which both Medusa and NFS respect - and slocate as
well).  And so our new hire can get to work quickly and learn all
company policies instead of getting a two-hundred page book or
complicated instructions on how to "Connect to a network drive and
access folder XYZ".
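
For what "ONLY the files he can see" could mean in practice, here is
a small Python sketch of the classic owner/group/other read check
applied to search hits.  It is an illustration only, not how Medusa
or slocate actually do it: Entry, can_read and filter_results are
hypothetical names, and the sketch ignores search permission on
parent directories, ACLs, and any special treatment of root.

```python
import stat
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Entry:
    """A search hit plus the ownership data the indexer recorded for it."""
    path: str
    mode: int  # permission bits as returned by stat()
    uid: int   # owning user
    gid: int   # owning group


def can_read(entry: Entry, uid: int, gids: FrozenSet[int]) -> bool:
    """Traditional UNIX check: exactly one class (owner, group, other) applies."""
    if entry.uid == uid:
        return bool(entry.mode & stat.S_IRUSR)
    if entry.gid in gids:
        return bool(entry.mode & stat.S_IRGRP)
    return bool(entry.mode & stat.S_IROTH)


def filter_results(entries: Iterable[Entry], uid: int,
                   gids: FrozenSet[int]) -> List[Entry]:
    """Keep only the hits the requesting user is allowed to read."""
    return [e for e in entries if can_read(e, uid, gids)]
```

Running the filter on the server side means a client never even
learns the names of files its user could not open.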

(*) FAM delegates file monitoring to remote FAMs.  When FAM cannot
connect to the FAM on a remote NFS server, it falls back to polling
the files itself.  This behavior can be mimicked in Medusa-searchd.
To prevent failed file accesses, search results located on the
server's removable media wouldn't be returned to the client search
service.

To work properly, this would obviously need autoconfiguration.  It
won't succeed if the Medusa daemon needs to be configured by hand on
client or server machines.

* Medusa-searchd on the NFS server should accept remote connections if
  the NFS server is up.
* Medusa-searchd on the NFS server should respect the /etc/exports
  conventions and use the existing configuration files.
* Medusa-indexd on the client should never index NFS-mounted volumes.
  Many NFS networks would be brought to their knees at the mere
  idea of traffic from every client hammering the entire exported
  share to index it.
* Medusa-searchd on the client should always attempt first to connect
  to the NFS server and, failing that, either use standard search
  methods or not show any results from the server at all.
* Medusa-indexd/-searchd code should be audited for possible
  vulnerabilities involving purposefully corrupted files or search
  queries being fed to it.
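
The client-side rule above (never index NFS mounts, ask the server
first, degrade quietly) could be sketched in Python roughly like
this.  No such remote protocol exists in Medusa as far as I know;
search_mount, split_nfs_device and the two callbacks are made up for
the example, and the set of NFS filesystem type names is an
assumption.

```python
from typing import Callable, List, Tuple

NFS_TYPES = {"nfs", "nfs4"}  # assumed fstype names for NFS mounts

LocalSearch = Callable[[str], List[str]]
RemoteSearch = Callable[[str, str, str], List[str]]  # (host, export, query)


def split_nfs_device(device: str) -> Tuple[str, str]:
    """Split 'host:/export/path' into ('host', '/export/path')."""
    host, sep, rest = device.partition(":/")
    if not sep:
        raise ValueError(f"not an NFS device string: {device!r}")
    return host, "/" + rest


def search_mount(query: str, device: str, fstype: str,
                 local_search: LocalSearch,
                 remote_search: RemoteSearch) -> List[str]:
    """Search one mounted filesystem, delegating NFS mounts to their server."""
    if fstype not in NFS_TYPES:
        return local_search(query)
    host, export = split_nfs_device(device)
    try:
        # Hypothetical call to the search daemon on the exporting server.
        return remote_search(host, export, query)
    except OSError:
        # Server-side daemon unreachable: show nothing for this mount
        # rather than crawling the share over the network.
        return []
```

Returning an empty list is the conservative choice of the two
fallbacks; a slower standard search over the mount could be plugged
into the except branch instead.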

To boot, this could even be reused in a web project as a
search service for intranets (to kill the need for htdig, which
doesn't really go beyond HTML files).

I have the feeling that a couple of changes in Medusa would make it
far more useful than it currently promises to be.  I bet that if this
gets worked on, even the KDE people would get around to using it.
Remember how ugly and slow the search box in KDE is; it's even slower
than Windows' file search tool.  KDE already takes advantage of SGI
FAM.  Medusa could be the search service everyone has been waiting
for.

good luck!


   Rudd-O



