On Thu, 2002-05-16 at 23:04, Jonathan Bartlett wrote:
software community are interested in something like this. But unfortunately, to do this well, it needs to be a LOT lower down than gnome. I've come to the conclusion that it needs to be an OS feature, so filesystems, system utilities, etc can take advantage of (or perhaps more accurately, not break) metadata. Nautilus and other GNOME things would be great clients to the OS's metadata service, but they shouldn't be the service providers.
Actually, I disagree. I've thought about this a lot, and it seems that as an OS feature, it would really suck, especially for multiuser systems (each user needs their own copy of the meta-data, especially when they disagree). Also, for the OS to be involved with doing callbacks to create cached metadata like thumbnails would be insane.
Except that you then need to make all tools that deal with files of any sort aware of metadata. Which is an enormous undertaking. What happens if I move a file with CLI tools instead of Nautilus? If the metadata isn't attached to the file, it gets lost. Same with emailing someone the file, etc. Or, if I have metadata on another user's files, how can I track where it goes? These are really basic problems that need to be solved. Remember also that things like thumbnails are essentially an implementation detail. No one said that the OS needs to come with thumbnail generation. But it should be aware of metadata, otherwise your metadata database is going to break pretty fast. There are certainly userland parts to this problem, but there are definitely OS level parts.
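To make the "move it with CLI tools" problem concrete, here is a minimal sketch. It assumes a purely userland metadata database keyed by file path; the `PathKeyedStore` class and the file names are hypothetical and only illustrate the failure mode, they are not anything Nautilus or Medusa actually does.

```python
import os
import shutil
import tempfile


class PathKeyedStore:
    """Hypothetical userland metadata database keyed by file path."""

    def __init__(self):
        self._db = {}

    def tag(self, path, **meta):
        # Metadata lives in the store, not with the file itself.
        self._db.setdefault(os.path.abspath(path), {}).update(meta)

    def lookup(self, path):
        return self._db.get(os.path.abspath(path))


def demo():
    store = PathKeyedStore()
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "report.txt")
        new = os.path.join(d, "archive.txt")
        with open(old, "w") as f:
            f.write("quarterly numbers\n")
        store.tag(old, category="work")
        # A tool that knows nothing about the store moves the file,
        # standing in for `mv` on the command line.
        shutil.move(old, new)
        return store.lookup(old), store.lookup(new)


stale, moved = demo()
```

After the move, the store still holds an entry for a path that no longer exists, and has nothing for the file at its new location. Any fix has to either hook every tool that moves files or keep the metadata attached to the file itself.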
If you're interested in doing more research on this, I would suggest looking at the Semantic Web work that the W3C is doing, as well as efforts like Dublin Core and what the Library of Congress is doing.
I've looked into this a little, and I think they are good pointers to how the internals might work.
I wouldn't say that they are good for internals... but they are useful for understanding the issues, and how to approach them. They bring out a lot of the annoying little problems that are not necessarily obvious at first glance.
One tricky thing you end up with is where the MIME type doesn't really map to the actual "type" of the document. Like a JPEG that is a scan of a book. That cuts down on what you can do programmatically. The Open Knowledge Initiative, mainly spearheaded by MIT and Stanford, is also working on a meta-information management API... as those apis become published you may want to take a look.
Where does one find this project?
web.mit.edu/oki/ The metadata apis are not anywhere close to being published....probably not until this fall at best. We're still banging away on the apis that metadata requires. But metadata is a big part of the planned oki functionality.
Of course, the other two huge hurdles to something like this are how to deal with the network (how does one transmit all that metadata across the wire? especially with non-metadata-aware OSes, and filetypes that don't natively support metadata), and how to make the algorithms for generating accurate metadata not all that intensive computationally.
The issue for individuals managing their own metadata is inherently much simpler. It doesn't even require support from external entities (like websites) at all. For example, as a user, I could mark categories on my email, web page bookmarks, and documents. Then, when I use Nautilus to browse by category, it pulls up links to the relevant information.
Yes, but if I have my mp3 collection all nice and metadata-rich, and I give it to you via webdav or ftp, don't you want to have all the (objective) metadata, instead of having to re-create it all? Same with documents, presentations, etc in a workplace. Unless you get something for your effort, you're not going to add any metadata in the first place. What you are describing (just doing simple category-based searches on data) is basically what I was suggesting you work on with Medusa and emblems. I'm sure lots of people would be happy if you picked that project up and did some work on it.
Metadata is useless unless most of it is generated automagically, because users won't be bothered to add it in.
Although this might be true for home users, heavy users of information will probably think differently, like those who have to manage a thousand projects. If they can tag all of their relevant resources - emails, web pages, documents, with each project's tag, and also mark priorities, it becomes very useful.
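The per-user tagging idea above can be sketched in a few lines. This is only an illustration under stated assumptions: `Resource`, `TagIndex`, the project names, and the integer priority scale are all hypothetical, and nothing here reflects how Nautilus emblems or Medusa store their data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    kind: str      # e.g. "email", "web", "document"
    location: str  # wherever the client can find it again


class TagIndex:
    """Hypothetical per-user index: project tag -> {resource: priority}."""

    def __init__(self):
        self._by_tag = defaultdict(dict)

    def tag(self, resource, project, priority=0):
        self._by_tag[project][resource] = priority

    def browse(self, project):
        # Highest priority first; ties keep the order they were tagged in.
        items = self._by_tag.get(project, {})
        return sorted(items, key=lambda r: -items[r])


index = TagIndex()
spec = Resource("document", "/home/jon/specs/draft.txt")
mail = Resource("email", "inbox/1042")
page = Resource("web", "http://example.org/notes")
index.tag(spec, "project-a", priority=2)
index.tag(mail, "project-a", priority=5)
index.tag(page, "project-b")
ordered = index.browse("project-a")
```

Browsing `project-a` returns the email first and the document second, regardless of what kind of resource each one is. Every entry here had to be added by hand, which is exactly the user burden the thread is arguing about.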
Home users are becoming heavy users of information. ;-) However, you have to think about what the user burden is in adding all of this metadata. And what the programmer burden is if it can't be automatically taken care of by system calls. I have seen many projects that aim to make information management easier fail because they end up just creating more work for people. So people just stick to the old way because it is less of a pain in the ass (even if it is only in the short term). Anything that takes constant effort is probably not going to work too well. That is why generated metadata is a good idea.
Then you get into issues of "degrees of accuracy" with how well the computer can guess what the correct metadata is. So you lose the normal bivalence of computation, and have to deal with vagueness in logical operations (because most metadata stuff boils down to testing a logical proof's validity).
I think this is where people start becoming architectural astronauts, and trying to make a self-aware being through metadata. Real usage of metadata is much simpler.
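One way to picture the "degrees of accuracy" point is a query over guessed metadata where each guess carries a confidence between 0 and 1 instead of a plain true/false. The sketch below uses the common min/max fuzzy-logic operators as one possible choice; the thread does not name any particular logic, and the attribute names and confidence values are made up for illustration.

```python
def f_and(*degrees):
    # Fuzzy AND: a conjunction is only as certain as its weakest part.
    return min(degrees)


def f_or(*degrees):
    # Fuzzy OR: a disjunction is as certain as its strongest part.
    return max(degrees)


def f_not(degree):
    return 1.0 - degree


# Hypothetical machine-generated guesses about one JPEG file.
guessed = {"is_photo": 0.9, "is_book_scan": 0.7, "has_text": 0.6}


def degree(name):
    # Anything the generator never guessed counts as "no evidence".
    return guessed.get(name, 0.0)


# "A book scan that either has text or is not a photo."
score = f_and(
    degree("is_book_scan"),
    f_or(degree("has_text"), f_not(degree("is_photo"))),
)
```

The query no longer answers yes or no; it answers 0.6, and the client has to decide what threshold counts as a match. That decision is where the vagueness shows up in practice.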
In very limited contexts, yes. But there is nothing here about sentient computers. It is just about realizing what the actual problem scope is. But maybe I'm biased, as my day job is an IT architect, and I do research in logic and vagueness. ;-) There are two ways to think about the problem - you are essentially trying to create a userland database with a controlled language for categories, and the constraint that only certain clients can touch any files on your computer. While it is the faster way to go, I think it lacks robustness. But as I say, the Medusa thing would be a great way to get a lot of what you want, without all that much work. Something definitely feasible for gnome 2.x.
Anyway, would you happen to know who on the GNOME team would be most interested in this?
Joakim Ziegler has expressed interest in things like this, but I haven't seen much public gnome activity from him in a little while. Rebecca Schulman (I think that's her last name, I don't quite remember) wrote most of Medusa, and so might be a good person to talk to. Someone is also working on a thumbnailing standard, so that might be a good person to talk to. I have a feeling that others might get on board once you have some demonstrable code, but most of the core gnome people are pretty damn busy for the foreseeable future, it seems.
--Ryan
Jon
On Thu, 2002-05-16 at 21:25, Jonathan Bartlett wrote:
For anyone who's interested, I am writing a document describing how GNOME could assist users in keeping track of all their information, better than any other system I'm aware of. Please take a read at http://www.eskimo.com/~johnnyb/computers/MetaInformation.html and let me know what you think. I know now is probably not the appropriate time for major infrastructure changes, but it's an idea for something to do post-2.0.
Jonathan Bartlett
_______________________________________________
gnome-love mailing list
gnome-love gnome org
http://mail.gnome.org/mailman/listinfo/gnome-love