Re: [ui-dev] Q: How to implement XAccessibleText interface in formula editor ?



Thanks to Peter for asking my opinion. I'm not sure I know all the answers, but maybe I can help some. I agree with Peter that the returned text should be the Unicode character. When in doubt, always use the most efficient representation, but keep it rigorous. Your other questions have to do with the representation of markup in equations. This is indeed something I've thought a lot about.

I know nothing about the formula editor, but the internal representation is probably fairly straightforward, with syntax similar to either LaTeX or MathML. If it is for display and not computation, then it's probably more like LaTeX. So one possibility for returning the markup is just to use LaTeX math markup. It's rigorous but not very compact.

I have developed a set of special markup characters that permit one to represent equations in LaTeX syntax but very compactly. Several blind students are using this as a very compact linear math representation that can be read in speech or in various kinds of tactile hard copy. These characters are defined in a Windows font but not yet in Unicode. The set is complete enough to represent anything representable in LaTeX. In case you are lost, maybe it would help to say that this font has characters for such things as the triple fraction start, middle, stop; subscript indicator, superscript indicator, overscript indicator, underscript indicator, font indicators... So the answer to Thomas's question about the vector is that the symbol is followed by an overscript indicator and an arrow character. Peter's integral representation could be integral, underscript, something, overscript, something. This kind of representation could be read directly by a screen reader that knows Unicode, or translated very easily by existing math braille translation engines.
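
As a rough illustration of the linear form described above, here is a minimal C++ sketch. The tokens "[under]" and "[over]" are placeholders I made up for the underscript and overscript indicator characters (the real ones live in a font and have no code points to quote here), and the function name linearize is hypothetical. It assumes each script is a single short item, so no end-of-script marker is needed.

```cpp
#include <cassert>
#include <string>

// Placeholder tokens standing in for the indicator characters (assumed names,
// not the actual characters from the font).
const std::string UNDERSCRIPT = "[under]";
const std::string OVERSCRIPT  = "[over]";

// Linearize a symbol with optional scripts: the symbol comes first, then the
// underscript indicator and its content, then the overscript indicator and
// its content. Empty strings mean "no script in that position".
std::string linearize(const std::string& symbol,
                      const std::string& under,
                      const std::string& over)
{
    assert(!symbol.empty());  // a script needs something to attach to
    std::string out = symbol;
    if (!under.empty()) out += UNDERSCRIPT + under;
    if (!over.empty())  out += OVERSCRIPT + over;
    return out;
}
```

With these placeholders, a vector V comes out as the symbol, the overscript indicator, and the arrow, and an integral from m to n as integral, underscript m, overscript n.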

Does this help or just muddy the water?

John


At 09:16 PM 5/15/02 -0700, Peter Korn wrote:
Hi Thomas,

I've cross-posted to the GNOME Accessibility list, the Mozilla
accessibility list, and the Java Accessibility list, as I think folks in
those communities may have some good suggestions as well.  If nothing else,
there are communities that should be thinking about these issues.  I think
our ATV friends in the community are especially important to get feedback
from on these questions (BAUM, University of Toronto, Freedom Scientific, &
Ai Squared).  I've also directly cc-ed John Gardner, a professor of
Physics at Oregon State University in the U.S.  John has spent many years
looking at ways of representing mathematical equations to the blind.

For the GNOME and Mozilla communities: the question Thomas raises is how
should special characters (that are multi-character strings as they are
stored within a document), and how should characters in formulas, be
represented in the AccessibleText interface.  Thomas' question is specific
to the OpenOffice UNO Accessibility API (and the XAccessibleText
interface), but the problem faces us in GNOME, Netscape, and Java land as
well.

I'm appending my personal thoughts at the end of Thomas' original e-mail
below.

Thomas Lange wrote:
>
> Hi,
>
> How should I implement the functions
>   virtual awt::Rectangle SAL_CALL getCharacterBounds( sal_Int32 nIndex )
>   virtual sal_Int32 SAL_CALL getIndexAtPoint( const awt::Point& aPoint )
> from the XAccessibleText interface in the formula editor ?
>
> Especially considering that some symbols in the graphical display are
> represented as a single character (for example the Greek Sigma
> character) where the character will be represented by a 'word' (here
> '%SIGMA') in the text.
>
> If for example the formula text is "%ALPHA + %BETA equals %KAPPA" which
> would result in a graphical representation like "A+B=K" which has only 5
> characters.
>
> Should I return the bounding rectangle of "A" for the indices 0-5
> (%ALPHA) when calling getCharacterBounds ?
> And when calling getIndexAtPoint for a point of "A" should I return 0 as
> index ?
>
> Or should I expand the symbolic names to the unicode character they are
> representing and thus having the
>   rtl::OUString SAL_CALL getText()
> function return the string "A+B=K" also where the symbolic names will
> have been replaced with the corresponding unicode character.
> This approach would make the implementations of  getCharacterBounds  and
> getIndexAtPoint  quite obvious.
> However many of the unicode characters used in the formula editor come
> from within the private area of the unicode character set. Will this be
> a problem ?
>
> The next problem will be how should I implement these functions for
> formatted text ?
> For example when a text or character has super- and subscripts in its
> graphical representation ?
>
>    n
>   AAA  x
>    m
>
> Which should be the order of assigning the indices to the text ?
> Should index 0 be the 'n', 1 'AAA' and 2 'm' ?
>
> How about some unusual attributed text like symbols with a (single)
> vector arrow above ?
>  -->
>   V
>
> Regards,
> Thomas

I think for symbols like "%ALPHA", they should be seen as a single (Unicode
or UTF8) character wherever possible.  Failing that, I think the characters
of that "word" should share the same bounding rectangle.  But a single
character would, I think, be better.
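
To make the two options concrete, here is a minimal C++ sketch, not the formula editor's actual code. It expands "%NAME" words into single characters (what getText could return) and also records, for every index of the original text, which displayed character it belongs to, so that all characters of a "word" can share one bounding rectangle. The names Expanded and expandSymbols, the three-entry table, and the choice to map whitespace to -1 are all assumptions for illustration; keywords such as "equals" are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Expanded {
    std::vector<std::string> glyphs;  // one UTF-8 string per displayed character
    std::vector<int> glyphOfSource;   // per source index: glyph index, or -1 for whitespace
};

Expanded expandSymbols(const std::string& src)
{
    // Hypothetical lookup table with UTF-8 bytes for Greek capitals.
    static const std::map<std::string, std::string> table = {
        {"ALPHA", "\xCE\x91"}, {"BETA", "\xCE\x92"}, {"KAPPA", "\xCE\x9A"}};

    Expanded out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src[i] == '%') {
            // Collect the upper-case name that follows the '%'.
            std::size_t j = i + 1;
            while (j < src.size() && std::isupper(static_cast<unsigned char>(src[j]))) ++j;
            auto it = table.find(src.substr(i + 1, j - i - 1));
            if (it != table.end()) {
                // Every source character of the word maps to the same glyph.
                out.glyphOfSource.insert(out.glyphOfSource.end(), j - i,
                                         static_cast<int>(out.glyphs.size()));
                out.glyphs.push_back(it->second);
                i = j;
                continue;
            }
            // Unknown name: fall through and treat '%' as an ordinary character.
        }
        if (std::isspace(static_cast<unsigned char>(src[i]))) {
            out.glyphOfSource.push_back(-1);  // whitespace is not displayed
        } else {
            out.glyphOfSource.push_back(static_cast<int>(out.glyphs.size()));
            out.glyphs.push_back(std::string(1, src[i]));
        }
        ++i;
    }
    assert(out.glyphOfSource.size() == src.size());
    return out;
}
```

For "%ALPHA + %BETA", source indices 0-5 all map to glyph 0, so a getCharacterBounds built on this table would return the same rectangle for each of them, while the glyphs list alone gives the three-character expanded text.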

For more complex formulas, like a summation (or integral) from m to n, I
think you'll have to pick a canonical order (and it should be one we all
agree to so it's standard throughout all accessible representations of math
formulas).  We could also introduce a bunch of new AccessibleRelations to
try to cover formulas, making each of these characters (or collections of
characters) their own object with relations to other objects.
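
As a sketch of what a canonical order could look like in practice, the C++ below assigns text indices to the parts of a scripted symbol in one fixed order. The order used here (base, then lower limit, then upper limit) is only an assumption for the example, not an agreed standard, and the names Part, Span and assignIndices are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Part { std::string role; std::string text; };
struct Span { std::string role; int start; int length; };

// Assign consecutive text indices in an assumed order:
// base first, then the lower script, then the upper script.
std::vector<Span> assignIndices(const std::string& base,
                                const std::string& lower,
                                const std::string& upper)
{
    assert(!base.empty());  // scripts without a base are not handled here
    const Part parts[] = {{"base", base}, {"lower", lower}, {"upper", upper}};
    std::vector<Span> spans;
    int next = 0;
    for (const Part& p : parts) {
        if (p.text.empty()) continue;  // skip missing scripts
        const int len = static_cast<int>(p.text.size());
        spans.push_back({p.role, next, len});
        next += len;
    }
    return spans;
}
```

For Thomas' example with "AAA", a lower "m" and an upper "n", this order gives "AAA" indices 0-2, "m" index 3 and "n" index 4. Any other fixed order would work the same way; the point is only that everyone uses the same one.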

Unfortunately, none of these approaches is really all that satisfying.


A longer-term approach we should consider is finding a way to represent the
entire formula as its own object, and create an AccessibleDescription for
the formula that explains what it is (so something like "Integral from m to
n of the function f(x)").  Also, we should look toward standards like
MathML to represent the formulas, and expose the MathML directly for a
screen access product to parse as it chooses.  If there is a MIME type for
MathML, we could use the AccessibleStreamable interface (in GNOME, not yet
in the Java Accessibility API) to indicate there is a MathML stream which
can be parsed for this object.
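
To illustrate the longer-term idea, here is a small C++ sketch that takes one formula object and produces both a spoken-style description and a MathML fragment for it. The DefiniteIntegral struct and the two function names are hypothetical, it covers only a definite integral of a single-letter function, and it does not touch any real accessibility interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal model of "integral from lower to upper of function(variable)".
struct DefiniteIntegral {
    std::string lower;
    std::string upper;
    std::string function;
    std::string variable;
};

// Text that could serve as the description of the whole formula object.
std::string describe(const DefiniteIntegral& f)
{
    return "Integral from " + f.lower + " to " + f.upper +
           " of the function " + f.function + "(" + f.variable + ")";
}

// The same formula as presentation MathML, with the limits attached to the
// integral sign (U+222B) through msubsup.
std::string toMathML(const DefiniteIntegral& f)
{
    assert(!f.lower.empty() && !f.upper.empty());  // both limits are required here
    return "<math xmlns=\"http://www.w3.org/1998/Math/MathML\"><mrow>"
           "<msubsup><mo>&#x222B;</mo><mi>" + f.lower + "</mi><mi>" + f.upper +
           "</mi></msubsup>"
           "<mi>" + f.function + "</mi><mo>(</mo><mi>" + f.variable +
           "</mi><mo>)</mo></mrow></math>";
}
```

The description is for a user who just wants the formula read out; the MathML string is what a screen access product could parse on its own if it were exposed as a stream.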


Regards,

Peter Korn
Sun Accessibility team

John Gardner
Professor and Director, Science Access Project
Department of Physics
Oregon State University
Corvallis, OR 97331-6507
tel: (541) 737 3278
FAX: (541) 737 1683
e-mail: John Gardner orst edu
URL: http://dots.physics.orst.edu




