Hi.

After studying the (uncommented) festival driver source code, I came to
some conclusions:

 - the Italian speech synthesis doesn't like speaking nothingness:

     $ festival
     Festival Speech Synthesis System 1.4.3:release Jan 2003
     Copyright (C) University of Edinburgh, 1996-2003.
     All rights reserved.
     For details type `(festival_warranty)'
     festival> (SayText "")
     #<Utterance 0xa72bbcc8>
     festival> (SayText " ")
     #<Utterance 0xa72d3d08>
     festival> (voice_lp_diphone)
     lp_diphone
     festival> (SayText " ")
     SIOD ERROR: wrong type of argument to get_c_val
     festival> (SayText "")
     SIOD ERROR: wrong type of argument to get_c_val
     festival>

 - the festival driver sends LOTS of nothingness:

     festivalsynthesisdriver.c:945:
         festival_synthesis_driver_say_raw (d, "(SayText \"");
         festival_synthesis_driver_say_raw (d, escaped_string);
         festival_synthesis_driver_say_raw (d, "\")\r\n");
         festival_synthesis_driver_say_raw (d, "(SayText \"\")\r\n");

   Basically, after every string that is sent, an empty string is sent.
   Why?  No idea.  A comment explaining why would really have helped.

 - festival_synthesis_driver_is_speaking is broken: when festival has
   only one wave in the audio spooler, it says that the queue is empty.
   I enabled debugging info in the driver, and every single time it
   queried the audio queue, the queue was reported empty, even while
   festival was actually speaking.

 - I tried simplifying the interaction a bit: I got rid of all the
   (useless) audio queue enquiries and I simplified the way text is
   sent, sending everything in a single command:

     /* Worst case: every character needs a backslash in front of it */
     escaped_string = g_malloc (strlen (text)*2+1+20);
     strcpy(escaped_string, "(SayText \"");
     ptr1 = text;
     ptr2 = escaped_string + strlen(escaped_string);
     while (ptr1 && *ptr1)
     {
         /* Escape double quotes and backslashes */
         if (*ptr1 == '\"' || *ptr1 == '\\')
             *ptr2++ = '\\';
         *ptr2++ = *ptr1++;
     }
     *ptr2++ = '"';
     *ptr2++ = ')';
     *ptr2++ = '\r';
     *ptr2++ = '\n';
     *ptr2 = 0;
     [...]
     festival_synthesis_driver_say_raw (d, escaped_string);

   Oh, and I also escaped \ characters, which weren't escaped before
   (security risk?  I didn't investigate).

The result was that things were a little bit more stable, but not much.
Sound would still stop (reliably reproducible by hitting ALT+F1 to go to
the panel menu), but switching windows with ALT+Tab would usually bring
it back.  However, sometimes gnopernicus wouldn't read its own menu
entries unless one played with ALT+Tab a bit more.  So, it seems that
window switching has a therapeutic influence here.

When sound stops, gnopernicus doesn't send data to the speech driver at
all.  I suspect that the driver's status reporting is confusing
gnopernicus somehow.

I tried to rewrite the festival driver using the festival C++ API
instead of the pipe to a festival server, but got stuck with the audio
output: the festival audio scheduler has unreliable status reporting,
and I'd have to implement a queryable and interruptible audio scheduler,
which is something I'd spend days doing because I'm not familiar with
glib event loops and esd/gstreamer programming.

So, problems identified so far:

 - Italian voices hate empty/blank strings
 - the driver sends lots of empty strings.  The errors should be
   ignored by the server, though.
 - the is_speaking report is unreliable
 - gnopernicus tends to stop sending data to the festival driver,
   possibly because it gets confused by the driver's status reports.
   Switching windows seems to shake gnopernicus back to normal.
 - the way SayText strings are constructed should be improved, see the
   code snippet above.

I'm sorry I didn't pinpoint the problems better, but the festival
driver's code is full of enqueuing callbacks into lists and glib event
queues, and is hard for me to follow.  I would have liked to write to
the author and work on it together, but it's basically anonymous (it
just says "Sun Microsystem").
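
As an illustration of two of the points in the list above (blank strings
and SayText construction), here is a minimal, self-contained sketch in
plain C.  It is not code from gnome-speech: the names is_blank and
build_saytext_command are hypothetical, it uses malloc instead of
g_malloc so that it compiles without glib, and returning NULL for
empty/blank text is only one possible driver-side workaround, untested
against gnopernicus.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if text is NULL, empty, or made only of whitespace. */
static int is_blank(const char *text)
{
    if (text == NULL)
        return 1;
    for (; *text != '\0'; text++)
        if (!isspace((unsigned char)*text))
            return 0;
    return 1;
}

/*
 * Hypothetical helper, not part of gnome-speech.
 * Builds the single command  (SayText "...")\r\n  with every double
 * quote and backslash in text preceded by a backslash.
 * Returns a malloc'd string that the caller must free, or NULL when
 * text is blank (nothing worth sending) or memory runs out.
 */
char *build_saytext_command(const char *text)
{
    static const char prefix[] = "(SayText \"";
    static const char suffix[] = "\")\r\n";
    size_t cap;
    char *cmd, *out;

    if (is_blank(text))
        return NULL;

    /* Worst case: every character is escaped and takes two bytes. */
    cap = strlen(prefix) + 2 * strlen(text) + strlen(suffix) + 1;
    cmd = malloc(cap);
    if (cmd == NULL)
        return NULL;

    strcpy(cmd, prefix);
    out = cmd + strlen(prefix);
    for (; *text != '\0'; text++) {
        if (*text == '"' || *text == '\\')
            *out++ = '\\';
        *out++ = *text;
    }
    strcpy(out, suffix);
    out += strlen(suffix);

    /* The terminating NUL must still fit inside the buffer. */
    assert((size_t)(out - cmd) < cap);
    return cmd;
}
```

A caller would pass the returned string to
festival_synthesis_driver_say_raw in one call and then free it, and
would simply send nothing when NULL comes back.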

I tried with orca as well, which seems to be more reliable in sending
data to the speech driver, until at some point it caused my whole
desktop to hang, to the point of needing a CTRL+ALT+Backspace to
restart X.

I'll now try to work out how to make the festival voices work fine with
empty/blank strings.  I would be happy if someone could tell me which
door to knock on regarding the gnome-speech festival driver internals.


Ciao,

Enrico

-- 
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico debian org>