Re: [g-a-devel]Another draft of gnome-speech IDL
- From: Bill Haneman <bill.haneman@sun.com>
- To: Marc Mulcahy <marc.mulcahy@sun.com>
- Cc: gnome-accessibility-devel@gnome.org, Philip Kwok <Philip.Kwok@sun.com>, Paul Lamere <Paul.Lamere@sun.com>, william.walker@sun.com
- Subject: Re: [g-a-devel]Another draft of gnome-speech IDL
- Date: 01 Jul 2002 15:50:12 +0100
Hi All:
Thanks for the new IDL, Marc, and thanks for your comments, Michael.
I may agree with Michael about the initialize stuff; however,
initialization *is* expensive. The difficulty is: what useful things,
if any, can the user do with, or find out about, an uninitialized
driver? If the answer is basically 'nothing', then I agree with Michael
that having explicit initialize/deinitialize methods doesn't serve
much purpose, and it requires almost everything to throw those
annoying DriverNotInitialized exceptions.
However, I think Michael is right when he wonders about the 'modal'
nature of "setCurrentVoice". Though I agree with Marc that Draghi's
original "speaker-based" proposal seemed very hard to implement, I think
that the concept of a speaker or voice within a driver makes a lot of
sense. So my suggestion would be to add
getDefaultSpeaker ()
setDefaultSpeaker (Speaker s)
to SynthesisDriver, and define Speaker to take over many of the speech
methods, along with a createSpeaker () that takes a "voice" string.
(I omit the "raises" specification from my examples below)
interface SynthesisDriver : Bonobo::Unknown {
...
Speaker getDefaultSpeaker ();
void setDefaultSpeaker (in Speaker s);
Speaker createSpeaker (in string voiceName);
void freeSpeaker (in Speaker speaker);
}
interface Speaker : Bonobo::Unknown {
ParameterList getSupportedParameters ();
string getParameterValue (in string name);
void setParameterValue (in string name, in any value);
string getParameterValueDescription (in string name, in any value);
void say (in string text);
void sayURI (in string uri);
void stop ();
void pause ();
void resume ();
boolean isSpeaking ();
void wait ();
void registerSpeechEventListener (in SpeechEventListener l);
}
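For concreteness: under the standard CORBA C mapping, the Speaker
methods above would come out as stub prototypes roughly like the
following (a sketch only; the exact typedefs depend on the ORB and
the generated headers):

/* Sketch of the stubs the IDL compiler would generate for Speaker
 * under the CORBA C mapping; illustrative, not definitive. */
void GNOME_Speech_Speaker_say (GNOME_Speech_Speaker obj,
                               const CORBA_char *text,
                               CORBA_Environment *ev);
void GNOME_Speech_Speaker_setParameterValue (GNOME_Speech_Speaker obj,
                                             const CORBA_char *name,
                                             const CORBA_any *value,
                                             CORBA_Environment *ev);
CORBA_boolean GNOME_Speech_Speaker_isSpeaking (GNOME_Speech_Speaker obj,
                                               CORBA_Environment *ev);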
This would still mean that clients would have to activate gnome-speech
on a per-driver basis, but clients could define and interact with
multiple "speakers" from a given driver (or from separate drivers, of
course). Speakers would be bound to a given driver, but their settings
would persist during a session, so that, for instance, a client wishing
to output two different strings with different voices would not need to
sequentially interleave calls to "setCurrentVoice",
"setParameterValue", etc.
I think this could be implemented in phases fairly easily. For
instance, a TTS service like festival could provide its voice list, a
client could manipulate parameters on a "speaker" based on one of those
voices, and then use the "speakers" as persistent objects. In the case
of a driver like festival, which doesn't currently persist settings
per-voice, the parameters would be cached in gnome-speech driver
structures to produce the effect of concurrently available multiple
voices/speakers.
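The per-speaker cache inside such a driver could be as simple as
something like this (a sketch only; the struct and field names are
assumptions, not anything in the current drivers):

#include <glib.h>

/* Hypothetical per-speaker state a gnome-speech driver could keep for
 * an engine (such as festival) that has no per-voice persistence. */
typedef struct {
    char       *voice_name;   /* engine voice this speaker is bound to */
    GHashTable *parameters;   /* parameter name -> last value set */
} SpeakerData;

/* Before each say(), the driver would select voice_name and replay the
 * cached parameters, so each speaker appears to have persistent state. */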
The simplest clients could still activate a gnome-speech service by
getting an instance of GNOME/Speech/SynthesisDriver and calling
getDefaultSpeaker () or createSpeaker ("voice"), then proceeding
thus (using C++/Java-like shorthand rather than the C bindings):
speaker = driver.getDefaultSpeaker ();
speaker.say ("hello");
/* lots more calls, until client is done */
speaker.unref ();
Of course, in C the code would actually look more like:
GNOME_Speech_Speaker speaker =
GNOME_Speech_SynthesisDriver_createSpeaker (driver,
"kaldiphone", &ev);
GNOME_Speech_Speaker_say (speaker,
"hello", &ev);
/* many more calls to the Speech service */
GNOME_Speech_Speaker_unref (speaker, &ev);
best regards,
Bill