[LONG] Roadmap, MC and component model



Hi!

The following is a "structured brain dump" on the subject of a
gnome-ish file system.  It relates to Miguel's recent post about the
roadmap for gnome, especially for mc, and the component/document model.

The file system I'm proposing here is based around a CORBAfied server.
It is a purely user-level system, in that it doesn't *require* kernel
support, and should be usable without super-user privileges.  The
server I envision will be like some of the other Gnome CORBA-based
servers, i.e. a standard user-level service, but there is nothing to
stop it being run as a shared system-level service instead (a
non-CORBA example of this approach is MC's `mcserv' daemon).

It started life as an idea for supporting a groupware server, with
some quite interesting ideas in it.  A few weeks of work went into the
design over the Christmas holiday (before being shelved due to lack of
time and support from the uni here), and what you see here is a
considered application of those ideas, so if they seem a bit wild in
places, it's probably more to do with me not explaining it well than
it being *way* off base.  There are areas of uncertainty, though, and
currently I'm struggling to remember how I had it all laid out.

The aim is to provide a powerful abstraction over standard file
systems which frees you from reliance on specific file system
features, like how locking is done.  "Observing" locks, changes to
files and directories, etc. should be possible, so that we could be
informed when a configuration file changes, for example.  Location
independence should be a simple(ish) matter using CORBA, by which I
mean that once you have a CORBA IOR, CORBA takes care of talking to
the server.  An extremely important aspect of the design is that it
moves away somewhat from the directory/file paradigm to a much more
generic model in which more or less any server can be activated on a
CORBA IOR.  Also any object can be "plugged in" at any appropriate
place in the directory tree, which is a bit like mounting a file
system.  More on this later.

To illustrate: we have obtained a CORBA IOR for a server.  This is
analogous to the root directory of a file system.  We can perform an
operation on this IOR, which sends a message to the server.  The
server (if necessary) activates a sub-server for this specific object
(or kind of object).  The "directory" server is activated, and we can
query it to find the names of other objects available from the server.
Each of these will be another CORBA IOR.  Let's say we get one of
these objects, that has the name "README", and we want to read its
contents.  We send a message to the server that owns it (CORBA takes
care of this), asking for the whole contents of the file to be
streamed over (we could also get it block by block).  The server
activates a "file" sub-server which provides the interface to do this.
The file is streamed over, and we read it.  There is also a
"directory" sub-server which works for nested directories.  So we have
a basic file system capability.
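
The walk-through above can be sketched in a few lines.  This is only an
in-process illustration, not CORBA: plain Python objects stand in for
IORs, and the class and method names (`DirectoryObject`, `FileObject`,
`list_names`, `resolve`, `read_all`, `read_blocks`) are made up for the
example rather than taken from any real IDL.

```python
class FileObject:
    """Stands in for the "file" sub-server: serves one object's contents."""

    def __init__(self, data: bytes):
        self._data = data

    def read_all(self) -> bytes:
        # Stream the whole contents in one go.
        return self._data

    def read_blocks(self, block_size: int):
        # Alternative: hand the contents over block by block.
        for offset in range(0, len(self._data), block_size):
            yield self._data[offset:offset + block_size]


class DirectoryObject:
    """Stands in for the "directory" sub-server: maps names to references."""

    def __init__(self, entries=None):
        self._entries = dict(entries or {})

    def list_names(self):
        return sorted(self._entries)

    def resolve(self, name):
        # In the real design this would return another CORBA IOR.
        return self._entries[name]


# The reference we start from, analogous to the root directory.
root = DirectoryObject({
    "README": FileObject(b"Hello from the server\n"),
    "docs": DirectoryObject({"ChangeLog": FileObject(b"first entry\n")}),
})

readme = root.resolve("README")
contents = readme.read_all()
```

Nested directories fall out for free, because `resolve` can return
either kind of object: `root.resolve("docs").resolve("ChangeLog")` is
just two messages in a row.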

This system is not limited to the interface for a standard filesystem
and can handle metadata as well.  It is up to the subserver for the
requested object to find a way to represent this metadata on the real
filesystem (hopefully, you'll see why in a minute).  For some
filesystems it might be a matter of talking to the OS-provided
metadata system.  On others, a matter of simulating the metadata
storage using a database, much as gnome does at the moment (aside: the
gnome metadata API would need to be implemented in terms of this CORBA
service so that gnome metadata would integrate correctly).  In some
cases, it might make sense to store metadata in the file itself, if it
is a document-like file (not sure about this).  From the point of view
of the programmer, there would be a C API that is implemented in terms
of CORBA IDL stubs (aside: it should also be a simple matter to have
these stubs access the local filesystem and simulate the features
that are not present - this is essentially what is being done at the
moment with gnome metadata, unless I'm mistaken).  The IDL would be
public so that other languages can use it directly or go through the
C API.  The subserver that is chosen to handle a particular object in
the filesystem is responsible for handling the metadata for that
object, but in the common case that either no special handling is
required or there is nothing to be gained by special handling, it
simply passes the request down to a next-level subserver.  The "file"
subserver is guaranteed to provide this service, and is likely to be
used as the basis for most other subservers to perform their local
file access.  This way we don't have to keep writing the same old
functionality in every subserver, plus there might be caching
benefits if one kind of subserver handles all requests for access to a
certain type of object.
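
A rough sketch of that pass-it-down behaviour, under the assumption
that every subserver exposes the same two metadata calls.  The names
`get_metadata` / `set_metadata`, the `DocumentSubserver` class and the
"title" key are all hypothetical, and a dict stands in for whatever
the "file" subserver would really use for storage.

```python
class FileSubserver:
    """Bottom layer: guaranteed to store metadata (here, in a plain dict)."""

    def __init__(self):
        self._metadata = {}

    def get_metadata(self, key):
        return self._metadata.get(key)

    def set_metadata(self, key, value):
        self._metadata[key] = value


class LayeredSubserver:
    """Default behaviour for any other subserver: pass requests down."""

    def __init__(self, lower):
        self.lower = lower

    def get_metadata(self, key):
        return self.lower.get_metadata(key)

    def set_metadata(self, key, value):
        self.lower.set_metadata(key, value)


class DocumentSubserver(LayeredSubserver):
    """Handles one key itself (a hypothetical "title"), delegates the rest."""

    def __init__(self, lower, title):
        super().__init__(lower)
        self._title = title

    def get_metadata(self, key):
        if key == "title":
            return self._title  # special handling pays off here
        return super().get_metadata(key)


base = FileSubserver()
doc = DocumentSubserver(base, title="Roadmap")
doc.set_metadata("icon", "text.png")  # ends up stored by the "file" layer
```

Only the subserver that gains something from special handling needs
any metadata code of its own; everything else is inherited.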

This system can handle MC-like virtual filesystems so we would not be
losing functionality there, either, although the mechanism is more
like GNU Hurd's translator system.  The trick here is to request that
a file object which happens to be a .tar.gz file is activated not
through the "file" subserver but through the "tar-gz" subserver,
which is itself implemented in terms of the "file" subserver.  The
activation of this server can also be performed by default by changing
the default server for the object via metadata.  This is analogous to
what happens in MC now.

MC, by the way, would become more of a "browser" type client of this
server, which seems to be in keeping with Miguel's suggested roadmap
for MC.  Gnome libraries would also become a client for metadata, and
possibly other things.  Access to local files' metadata could go
through the local server, could be short-circuited to go straight to
the filesystem, or maybe one day could use shared memory to
communicate with an ORBit server.  TBD.

Mounting directories and other objects can be done by simply serving
out an IOR for an object on a different server.  When your client
wants the contents of the object, it sends a message in the usual
CORBA way, and CORBA contacts the other server.  All this happens
"under the hood", and need not be seen by the application.  You
achieve a very generic mount-like mechanism but in a way that frees a
server from actually having to serve out objects it doesn't own.
Peer-to-peer type sharing (yuck!) would be quite easy to achieve.
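
To make the mount idea concrete, here is a toy model in which a
reference is just a (server name, object id) pair and a small `Orb`
class routes each call to the owning server.  None of this is real
CORBA or ORBit API; `Orb`, `Server`, `read_path` and the operation
names are invented for the illustration.

```python
class Orb:
    """Toy stand-in for the ORB: routes a call to the server owning the ref."""

    def __init__(self):
        self._servers = {}

    def register(self, server):
        self._servers[server.name] = server

    def invoke(self, ref, operation, *args):
        server_name, object_id = ref
        return self._servers[server_name].handle(object_id, operation, *args)


class Server:
    def __init__(self, name):
        self.name = name
        self._objects = {}
        self.requests_served = 0

    def add_directory(self, object_id, entries):
        self._objects[object_id] = ("dir", dict(entries))
        return (self.name, object_id)

    def add_file(self, object_id, data):
        self._objects[object_id] = ("file", data)
        return (self.name, object_id)

    def handle(self, object_id, operation, *args):
        self.requests_served += 1
        kind, payload = self._objects[object_id]
        if kind == "dir" and operation == "resolve":
            return payload[args[0]]  # may be a ref owned by another server
        if kind == "file" and operation == "read_all":
            return payload
        raise ValueError(f"{operation} not supported on a {kind} object")


def read_path(orb, root_ref, path):
    """Client code: it never looks at which server a reference belongs to."""
    ref = root_ref
    for name in path.split("/"):
        ref = orb.invoke(ref, "resolve", name)
    return orb.invoke(ref, "read_all")


orb = Orb()
home, project = Server("home"), Server("project")
orb.register(home)
orb.register(project)

notes = project.add_file("notes", b"meeting notes\n")
shared = project.add_directory("shared", {"notes.txt": notes})
# "Mounting": the home server simply hands out the other server's reference.
root = home.add_directory("root", {"shared": shared})

data = read_path(orb, root, "shared/notes.txt")
```

In this run the home server answers one request (the lookup of
"shared") and the project server answers the other two, so the
mounting server never has to serve out objects it doesn't own.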

So far I have described mainly file/directory based stuff - which is
ok - but this design actually started off as a groupware system, and
evolved towards a filesystem, so what else could it do?  The idea is
to allow any object on the server to be "activated" - i.e. for the
server to start a subserver to handle requests for the object - and
then accessed via CORBA.  There is of course no need to stick with
just a file-like interface for an activated object; it can be any
CORBA server, local or remote, program or shared library.  Thus, this
system could be used to store objects to be activated using the
component model.  (I should say here that, although I've looked at the
bonobo stuff, I can't yet see the bigger picture of how it is going to
work).  My plan is to allow the main server to decide, based on
metadata associated with the file (yes, I know that my metadata system
is looking a bit circular at the moment!) what subserver combination
to activate (e.g. "untar" on top of "ungzip" on top of "file", or
"gwp-doc" on top of "xml" on top of "file"), OR for the user to
specify what subservers they want, provided that the main server
agrees that the requested combination does not give them any
permission they would not have had with the standard combination of
subservers.  (I have not worked out how to do this, but have a
possible solution).
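
Here is one way the "untar" on top of "ungzip" on top of "file"
combination could look, using Python's standard `gzip` and `tarfile`
modules for the real work.  The `activate` function, the "subservers"
metadata key and the class names are assumptions made for the sketch,
and the permission check on user-requested combinations is left out
entirely since that part of the design is still open.

```python
import gzip
import io
import tarfile


class FileSubserver:
    """Bottom layer: hands back the raw bytes of the stored object."""

    def __init__(self, data: bytes):
        self._data = data

    def read_all(self) -> bytes:
        return self._data


class UngzipSubserver:
    """Presents the decompressed contents of whatever lies below it."""

    def __init__(self, lower):
        self._lower = lower

    def read_all(self) -> bytes:
        return gzip.decompress(self._lower.read_all())


class UntarSubserver:
    """Presents a tar archive as something directory-like."""

    def __init__(self, lower):
        self._lower = lower

    def _open(self):
        return tarfile.open(fileobj=io.BytesIO(self._lower.read_all()), mode="r:")

    def list_names(self):
        with self._open() as archive:
            return sorted(m.name for m in archive.getmembers() if m.isfile())

    def read_member(self, name: str) -> bytes:
        with self._open() as archive:
            return archive.extractfile(name).read()


LAYERS = {"ungzip": UngzipSubserver, "untar": UntarSubserver}


def activate(data, metadata, requested=None):
    """Build a subserver stack; the stack is listed top first.

    The default comes from the object's metadata (hypothetical key
    "subservers"); a caller may request a different stack instead.
    """
    stack = requested or metadata.get("subservers", ["file"])
    if stack[-1] != "file":
        raise ValueError("the bottom layer must be 'file'")
    server = FileSubserver(data)
    for name in reversed(stack[:-1]):
        server = LAYERS[name](server)
    return server


def make_example_archive() -> bytes:
    """Create a small .tar.gz in memory so the sketch is self-contained."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        payload = b"hello\n"
        info = tarfile.TarInfo(name="README")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


archive_bytes = make_example_archive()
metadata = {"subservers": ["untar", "ungzip", "file"]}

browsable = activate(archive_bytes, metadata)                 # default stack
raw = activate(archive_bytes, metadata, requested=["file"])   # user override
```

With the default stack the object behaves like a directory
(`browsable.list_names()`, `browsable.read_member("README")`); with
the override it is just a file full of compressed bytes.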

So access can be granted through a directory-like mechanism to objects
stored on the server.  Provided that appropriate permissions are held
(more later), some subserver gets activated that provides an interface
for accessing the object.  Unless we asked for a specific subserver,
we don't know which one we got.  The server returns to us the logical
identity (as a string) of the subserver that is serving the object.
This can be used to tell us what "browser" object to activate locally
to handle looking at the object we have been served, but all we really
need is something that knows what messages to send.  A good example of
this happening will one day be the component system, which knows what
local server to activate to view a remote document.  If this document
contains more objects, they are activated and shown embedded in the
document.  All the objects will probably be stored in one (real) file
on the server, and all the subservers for the document components will
read/write their data via the "xml" subserver, which I guess would
implement something like (or identical to) the DOM interfaces.  This
also allows hyperlinks in the document to other things, even on
different servers.

If two people activate the same object at the same time (well, their
"sessions" using it are concurrent), they should be able to perform
shared viewing/editing upon the object/document.  This is a simple
extension of the "document-view" model well-known from OOD: you have
one document (on the server) and two views looking at it (on the
clients).  When one person changes the document, the change in the
view is broadcast, and both parties see it.  In fact, there are issues
I know of which go considerably beyond this, but this is enough for
very simple groupware applications.  I don't want to get too heavily
into the groupware thing at the moment, for the sake of avoiding
confusion (mine as well as everyone else's), but it *should* be
supportable because I think we have a general enough mechanism.  One
day we will have shared working on Gnumeric.  A nice, fun test
application might be a shared canvas-based whiteboard system.  Maybe
then we will embed a Gnumeric view component into our whiteboard,
along with a chat "viewer" and do some boring maths stuff....
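
The simple document-view case is small enough to sketch directly.
This is the plain observer pattern with everything in one process;
`SharedDocument`, `View`, `attach`, `edit` and `refresh` are
illustrative names, and in the real thing the views would be remote
clients reached via CORBA rather than local objects.

```python
class View:
    """One client's view of the document."""

    def __init__(self, owner):
        self.owner = owner
        self.shown = None

    def refresh(self, text):
        self.shown = text


class SharedDocument:
    """The single document held on the server."""

    def __init__(self, text=""):
        self._text = text
        self._views = []

    def attach(self, view):
        self._views.append(view)
        view.refresh(self._text)  # a new view starts from the current state

    def edit(self, new_text):
        self._text = new_text
        # Broadcast the change so every attached view sees it.
        for view in list(self._views):
            view.refresh(new_text)


document = SharedDocument("draft")
alice, bob = View("alice"), View("bob")
document.attach(alice)
document.attach(bob)
document.edit("draft v2")  # both views now show the new text
```

This deliberately ignores the harder issues mentioned above, such as
two people editing at the same moment.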

There are also applications of the server as a general directory
service.  Replication/mirroring should be a possibility, via a
specialised subserver.  Someone who knows about this kind of stuff
should think about it and post their ideas.

On a second reading, one thing that I haven't made very clear in this
document is the distinction between "names" of objects (which are
identified by CORBA IORs and are analogous to filenames) and the
activated versions of objects themselves (which are also identified by
(different) CORBA IORs, and are analogous to open file descriptors).
Sorry about that.  Obviously, from a security perspective, giving
someone else your "open file descriptor" is a bad idea, but giving
them your "filename" is ok, because they have to use their permissions
to convert it to a reference to a subserver for an object.  One case
where handing over a "file descriptor" is ok is when you want someone
to fill up a file with data for you, but you don't want them to know
where that file is being put (or vice versa).  This is a bit like
pipes in unix but distributed.
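
The distinction can be shown with two small classes.  This is a
sketch only: a set of user names stands in for whatever permission
check the server would really do, and `StoredObject`, `Name` and
`Handle` are names invented for the example.

```python
class StoredObject:
    """The thing on the server, plus who is allowed to activate it."""

    def __init__(self, data, allowed_users):
        self.data = bytearray(data)
        self.allowed_users = set(allowed_users)


class Handle:
    """Activated object, like an open file descriptor: no further checks."""

    def __init__(self, stored):
        self._stored = stored

    def read_all(self):
        return bytes(self._stored.data)

    def append(self, chunk):
        self._stored.data.extend(chunk)


class Name:
    """Like a filename: safe to pass around, checked on every activation."""

    def __init__(self, stored):
        self._stored = stored

    def activate(self, user):
        if user not in self._stored.allowed_users:
            raise PermissionError(f"{user} may not activate this object")
        return Handle(self._stored)


report = StoredObject(b"", allowed_users={"alice"})
name = Name(report)

# Alice activates the object using her own permissions...
handle = name.activate("alice")
# ...and hands the handle to someone else, who can fill it with data
# without ever learning the name.
handle.append(b"data from bob\n")
```

Giving Bob the `name` instead would get him nowhere, because
`name.activate("bob")` raises `PermissionError`.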

The permission system I wanted is key-based.  That is, the server is
not responsible for working out who you are.  Instead, when you talk
to it, you hand over a set of keys you have.  Each key is, as usual, a
CORBA IOR, and each grants you permission to access an object in a
certain way, e.g. you might hold a key granting "access" privilege for
object "/docs" (which happens to be a directory), and another key
granting "view" and "edit" for "/docs/README" and "/docs/ChangeLog".  When
you want to give someone permission for something, you can give away
any permission you have, but no more, in the form of a key.  If
someone changes the permission associated with your key later, any
keys you have given away based upon it are similarly affected.  (A
particular problem here is how to stop people caching the "open file
descriptors" as a means to avoid getting caught by permission
changes).  You could also directly give your personal key away, in
which case you lose the ability to control the other agent's access.
The server itself could run a subserver for handing out keys based
upon an identification mechanism such as username/password pairs.  It
might be worth looking at what Kerberos does here.
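
A minimal sketch of the delegation rules, assuming a key is just a
table of object names and rights with a link back to the key it was
derived from.  The `Key` class and its `allows`, `delegate` and
`restrict` methods are hypothetical, and the sketch does nothing
about the cached "open file descriptor" problem mentioned above.

```python
class Key:
    def __init__(self, grants, parent=None):
        # grants maps an object name to the set of rights held on it.
        self._grants = {name: frozenset(rights) for name, rights in grants.items()}
        self._parent = parent

    def allows(self, name, right):
        if right not in self._grants.get(name, frozenset()):
            return False
        # A derived key is only as good as the key it came from.
        return self._parent is None or self._parent.allows(name, right)

    def delegate(self, grants):
        """Give away any permission you have, but no more."""
        for name, rights in grants.items():
            for right in rights:
                if not self.allows(name, right):
                    raise PermissionError(f"cannot grant {right!r} on {name!r}")
        return Key(grants, parent=self)

    def restrict(self, name, rights):
        """Change what this key grants; keys derived from it follow suit."""
        self._grants[name] = frozenset(rights)


mine = Key({
    "/docs": {"access"},
    "/docs/README": {"view", "edit"},
    "/docs/ChangeLog": {"view", "edit"},
})

# Hand a colleague read-only access to one file.
theirs = mine.delegate({"/docs/README": {"view"}})
```

Because every check walks back up the chain, a later
`mine.restrict("/docs/README", set())` also switches off `theirs`
without the server having to track who was given what.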

If you wanted to have embedded hyperlinks to the "names" of other
objects, but wanted people to access them as though they were you
(i.e. not have broken links), you could also provide, in the document,
a key you have created specially for this.  The key is managed
locally, mainly automatically.  This provides for a system that is
hypertextual in structure, rather than following the hierarchical
directory style (essentially this is used only for providing
well-known "names" for objects).  This is analogous to the differences
in the way HTTP- and FTP-based browsing works.

Finally, the good news: if there is a consensus that I haven't gone
raving mad and that this thing *might* actually work and could fit
into the way Gnome is evolving, I would be happy to start hacking on
it #include <standard_complaint_about_having_no_time.h>.  I would need
some advice in some areas, and I'd be delighted to hear (relevant)
suggestions for ideas to rip off, ways to do things, etc.  I would
also like to humbly propose myself as a coordinator for this, but I 
won't cry if someone else does it.

Any comments?  (I'm on gnome-devel-list but not gnome-hackers)

-- Duncan Pierce  (D.R.Pierce@soc.staffs.ac.uk)


