Re: proposal for MIME behavior in GNOME

On Fri, 2003-12-05 at 17:47, Daniel Veillard wrote:
> On Fri, Dec 05, 2003 at 11:54:29AM -0500, Jonathan Blandford wrote:
> > Murray Cumming Comneon com writes:
> > 
> > > > Presumably valid xml files will appear as text/xml, and not 
> > > > as text/plain.
> > > 
> > > And will that be handled the same as application/octet-stream files, or just
> > > opened silently with a default XML-editing application such as gedit?
> > 
> > What do you want it to do?  What's the 'right' behavior when you get an
> > unknown xml file.  Should it open in an application like conglomerate or
> > in a text editor.
>   Unfortunately Mime-Type is clearly inadequate now for specifying
> the default processing of a resource in the face of XML usage.
> One way is to try to detect namespaces in the document first kbyte
> or so, grepping for common strings might be sufficient. But it is
> outside purely Mime-Type handling...
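
A rough sketch of that kind of sniffing, to make the idea concrete. It
assumes a UTF-8 compatible encoding and a namespace declaration within
the first kilobyte; the namespace table, the labels, and the function
name are illustrative and not taken from any existing GNOME code.

```python
import re

# Hypothetical table: namespace URI -> label for the XML "type".
KNOWN_NAMESPACES = {
    "http://www.w3.org/1999/xhtml": "xhtml",
    "http://www.w3.org/2000/svg": "svg",
    "http://www.w3.org/1999/XSL/Transform": "xslt",
}

# Matches xmlns="..." and xmlns:prefix="..." with either quote style.
XMLNS_RE = re.compile(r"""xmlns(?::[\w.-]+)?\s*=\s*(["'])(.*?)\1""")


def sniff_namespace(data: bytes, limit: int = 1024):
    """Return a label for the first known namespace declared in the
    first `limit` bytes of the document, or None if none is found."""
    head = data[:limit].decode("utf-8", errors="replace")
    for match in XMLNS_RE.finditer(head):
        label = KNOWN_NAMESPACES.get(match.group(2))
        if label is not None:
            return label
    return None
```

This is plain string matching rather than XML parsing, so it can be
fooled by a namespace URI inside a comment, and it does nothing for
documents that use no namespaces at all.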

We need some way of:
(i) identifying the "type" of an arbitrary XML document, and
(ii) naming those types in a way that will integrate with the
application database.

How are people solving these problems?

In Conglomerate, for (i) we ignore the mime type of files we open and
have a set of "known" XML document types that we care about.  We look at
the public identifier and URI of a DOCTYPE if present to see if it
matches one we know about; failing that we look at the toplevel element
and see which of our doctypes have one with the same name.  Messy, and
with some (known) problems.  Our registered doctypes also can provide
the app with filename extensions, though IIRC this isn't yet used for
type sniffing.
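
The lookup order described above can be sketched roughly as follows.
This is not Conglomerate's code (which is C); it is a minimal Python
illustration using the standard expat bindings, and the KNOWN_TYPES
table and the function names are made up for the example.

```python
import xml.parsers.expat

# Hypothetical registry of "known" document types.
KNOWN_TYPES = [
    {
        "name": "docbook",
        "public_ids": {"-//OASIS//DTD DocBook XML V4.2//EN"},
        "system_ids": {"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"},
        "roots": {"book", "article"},
    },
    {
        "name": "xhtml",
        "public_ids": {"-//W3C//DTD XHTML 1.0 Strict//EN"},
        "system_ids": {"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"},
        "roots": {"html"},
    },
]


class _RootFound(Exception):
    """Raised to stop parsing once the toplevel element has been seen."""


def read_doctype_and_root(data: bytes):
    """Collect the DOCTYPE identifiers (if any) and the toplevel element name."""
    info = {"public_id": None, "system_id": None, "root": None}
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["public_id"] = public_id
        info["system_id"] = system_id

    def on_start(name, attrs):
        info["root"] = name
        raise _RootFound

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    try:
        parser.Parse(data, True)
    except (_RootFound, xml.parsers.expat.ExpatError):
        # Either we have what we need, or the input is malformed and we
        # keep whatever was collected before the error.
        pass
    return info


def identify(data: bytes):
    """Try the DOCTYPE identifiers first, then fall back to the root name."""
    info = read_doctype_and_root(data)
    for doctype in KNOWN_TYPES:
        if (info["public_id"] in doctype["public_ids"]
                or info["system_id"] in doctype["system_ids"]):
            return doctype["name"]
    for doctype in KNOWN_TYPES:
        if info["root"] in doctype["roots"]:
            return doctype["name"]
    return None
```

In this sketch the fallback returns the first registered type whose
root name matches, so two types sharing a toplevel element name cannot
be told apart without a DOCTYPE.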

We also have importer plugins that do look at the mime-type and decide
if they are appropriate to that file, though these are intended for
importing non-XML files into some XML format (for example, plain text
into DocBook).

For (ii) we don't bother naming things in the MIME system; we simply
store a pointer to the document type's "display specification" file
which holds interesting app-specific data about the XML subtype.


James Clark has written an Emacs mode for XML that uses Relax NG
schemas, and he needed a way to decide which schema to use for a given
XML file.

He's invented a mini-language for doing XML type calculations; see this

It lets you write short "locatingRules" schemas that can do things like
compare DTD URIs, DOCTYPE public identifiers, top-level element names,
etc.  This solves problem (i), although someone has to write all of the
locatingRules files.

All you get as the output is the URI of a RELAX NG schema, which solves
his problem, but doesn't necessarily address (ii).

Does anyone know of any other examples?

> Daniel
David Malcolm
