Re: open translations database



> Dear Stephen & All

It's Stefan! ;-)

> My name is Aoife Dunne and I am the project manager responsible 
> for the GNOME Localisation at Sun.  
> 
> I am writing this mail in the hope that I can take Stephen's 
> suggestions one step further, helping the open source community in 
> providing localised product versions of GNOME and similar open 
> source products thereafter.

Good thing.

> I work for Sun Microsystems who are 
> planning on shipping GNOME with the next marketing release of 
> Solaris, therefore I am writing this mail with the GNOME project 
> in mind. 

> However, we want any solution to be for general benefit 
> of free and open source software and I would be very interested in 
> offering our team assistance across all localised open source 
> software.

You have a few good points here already. Let me be so bold as to say that Sun could hire professional translators for GNOME. I wouldn't be against that (hint ;-), but it would be a short-term solution, and I guess you realise that. The long-term solution would be to get the translation community self-organised. All bigger (successful) projects have some kind of organisation, e.g. the Linux module system and the GIMP plugin system. Translation, which I think is a big project, doesn't have such a unifying structure yet.


> How can Sun help:
> Stefan mentioned it would be nice to have a web-accessible  
> "database" (or just a simple file) which would contain one or more 
> sets of standard English and associated translations for standard 
> words/terms.  Developers and translators of software and 
> documentation could use the terminology listings as reference.  
> 
> Terming Tool 
> ------------
> We have a script, which extracts terms from the English software  
> files, providing suitable terms for the initial database/file.  A 
> term is defined as no more than one or two words.  This script 
> extracts terms from the strings, removes duplications, ignores 
> terms such as "the, is, numbers etc.".  It is not possible to 
> extract the associated translated terms, so it would require 
> translators to provide the translated terms.   Once this is done, 
> the terminology listings can be posted to a web site, where they can 
> be updated/modified as development of applications progresses.  It 
> is preferred that the suite of applications within a product use 
> the same terminology, ensuring consistency; however, by defining the 
> application it is possible to use different terms when 
> appropriate.  
> 
> Sample
> English Term  English Definition    Translated    Application
> Print
> Save
> Save-To
> 
> Initially it may not be possible for me to supply the source of 
> the terming tool due to licensing problems, however I can help 
> immediately by supplying a simple text file with the English 
> terms.  Would this be of help?
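
As a rough illustration of the kind of extraction described above (explicitly not Sun's script, which has not been shared), here is a minimal Python sketch. The function name extract_terms, the stop-word list, and the rule that a stop word or a number ends a candidate term are all assumptions made for this example.

```python
import re

# Hypothetical stop-word list; the list used by the real script is not known.
STOP_WORDS = {"the", "is", "a", "an", "of", "and", "or"}


def extract_terms(strings, max_words=2):
    """Return the sorted, de-duplicated candidate terms found in strings.

    A term is a run of at most max_words consecutive words that contains
    no stop word and no number.
    """
    terms = set()  # a set removes duplicates for us
    for s in strings:
        run = []
        # Tokens are words or numbers; the trailing "" flushes the last run.
        for tok in re.findall(r"[A-Za-z]+|\d+", s) + [""]:
            word = tok.lower()
            if word and word not in STOP_WORDS and not word.isdigit():
                run.append(word)
                continue
            # A stop word, a number, or the end of the string ends the run.
            for n in range(1, max_words + 1):
                for i in range(len(run) - n + 1):
                    terms.add(" ".join(run[i:i + n]))
            run = []
    return sorted(terms)


if __name__ == "__main__":
    for term in extract_terms(["Print", "Save", "Print preview"]):
        print(term)
```

The output of such a script would only fill the "English Term" column; the definition, translation and application columns would still have to be supplied by people.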

> Translation Memory 
> ------------------

[snip: a very interesting story about the TM]
 
> The TM system is still in development but is coming close to 
> completion. We may be able to help by providing you with a .po 
> file parser.  However, we would need to look into possible 
> licensing issues.

You describe the system I had in mind, so I can't help being very enthusiastic about the idea.

There are a few questions here. The first one, which you politely touched on, is how to make this useful for Free Software folks.

> 
> 
> Style Guides 
> ------------
> We have some localised versions of our style guidelines.  These 
> guidelines are used to aid the translators.  For example, in 
> France, how the date and time formats should be localised.  In many 
> countries such data is correct in many formats; however, style 
> guides decide on the preferred format for the sake of 
> consistency.   Our style guides could be used as reference and 
> updated to create a GNOME specific style guide for all languages.  
> Let me know if you are interested and I will send you a copy of 
> our country specific style guides.
> 
> 
> How else can Sun help,
> 
> * possibly act as the host for the translation memory database, 
> populating it with newly translated products. 
> 
> * provide linguistic quality assurance feedback and implement 
> linguistic changes if necessary checking for grammar, spelling, 
> inconsistencies etc.
> 
> If any of the above suggestions would be of help and if you have 
> any other suggestion on what I can bring to the table, please let 
> me know.  Looking forward to getting any feedback.
> 
> Best Regards
> Aoife
> 
> > X-Unix-From: StefanRieken@SoftHome.net  Thu Oct 12 18:56:35 2000
> > Delivered-To: gnome-i18n@gnome.org
> > Subject: open translations database
> > From: Stefan Rieken <StefanRieken@SoftHome.net>
> > To: whampton@staffnet.com, gnome-i18n@gnome.org
> > Date: 12 Oct 2000 16:55:26 -0100
> > Mime-Version: 1.0
> > X-BeenThere: gnome-i18n@gnome.org
> > X-Loop: gnome-i18n@gnome.org
> > X-Mailman-Version: 2.0beta5
> > List-Id: Internationalization (I18N) of GNOME <gnome-i18n.gnome.org>
> > 
> > To the folks at openstandards.org and the gnome-i18n mailing list.
> > 
> > Hello,
> > 
> > This mail was sent out to give space to an idea that I developed only
> > today. This idea is rough, unimplemented and untested. Nevertheless, I
> > hope that it is of interest to you. This mail was sent to the addresses
> > mentioned above, just because I didn't know any better place to start.
> > If you believe I shouldn't have sent it to you or your list, I
> > apologise. If you believe I missed someone out, you are free to forward
> > this. (But I must warn you in advance that this idea is too young for me
> > to know if it will survive my busy schedule.)
> > 
> > Problem:
> > 
> > The current translation of open source software suffers from a lack of
> > manpower. This usually doesn't result in a lack of translations, but in
> > bad translations. Half of the time translation engines such as Babelfish
> > are being used. These engines often can't produce correct translations
> > of small strings because of a lack of context (e.g.: the title of the
> > window I am writing this message in says, directly translated back to
> > English: "is composing a new message" instead of "Compose a new
> > message"). They also don't care about the size of the translated string,
> > which can be important when used in a program. Translation by
> > individuals can often also cause errors. These vary from inconsistencies
> > to overlooking spelling caveats common for the target language.
> > 
> > It would be helpful to have one or more sets of standard translations
> > for standard words and strings. Translators of software would benefit
> > from this, but also translators of larger documents that contain
> > standard words and strings (such as "radio button"; you'll be surprised
> > to know how hard it is in some languages to come up with a good default
> > translation for it).
> > 
> > Context:
> > 
> > I am writing this with the GNOME project in mind, because I am familiar
> > with it. However, I want my solution to be for the general benefit of
> > free and open source software.
> > 
> > There are a lot of standard strings in applications. Many GUI standards
> > define which ones you can use. Desktop projects such as GNOME often have
> > a set of these standard strings, and their translations, included. They
> > cannot, however, provide translations for less commonly used strings.
> > Another problem arises when standard strings are part of bigger strings
> > (e.g. when "show toolbar" is standard, and a string like "show main
> > toolbar" is being used). Most open source projects don't really care
> > about documenting their use of standard strings, as the implementation
> > should be clear enough.
> > 
> > In the past, I have done some minor translation work for ATO. This is an
> > international organisation of translators of Amiga software (the Amiga
> > Translation Organisation). They were pretty well organised (but being an
> > Internet development newbie, it took me some time to get acquainted with
> > the organisation). One of the best parts of the organisation (of the
> > Dutch division anyway) was a document that described the translation
> > process, and also contained a list of common Amiga terms and their
> > translations.
> > 
> > Because I want my solution to be global, and not e.g. Amiga-specific, I
> > think it is not a good idea to provide a procedure for the translation
> > process. Different projects may have different standards. I also don't
> > think that a small list of common terms will do the trick. Again, these
> > terms may vary slightly from one project to another, and if we are going
> > to sum up only a few general words, the result wouldn't be really
> > useful.
> > 
> > Solution:
> > 
> > I was thinking that it would be nice to have a web-accessible database
> > set up to tackle this problem. The "database" (or just a simple
> > file) would initially be empty, but it would be available for
> > modification through a CGI script. This service should be neutral, so
> > that we wouldn't get duplicate attempts to solve this global problem.
> > (E.g. hosting it at gnome.org wouldn't make it very neutral to KDE folks
> > ;-).
> > 
> > The interesting part is how the database should look and behave. I have
> > given this part only a little attention so far. There are, however, a
> > few schemes one could follow, and I imagine that one of these schemes
> > would be more or less ideal.
> > 
> > The Economy Scheme:
> > Simply feed the database a list of words and their translations, per
> > language. This would be the scheme of preference if it turns out that my
> > time, help and knowledge are really limited.
> > 
> > The Business Scheme:
> > Same as above, but now with even more features! ;-), including:
> > 
> > - an argument-based history of the translation. Example:
> >  
> >   "English: 'file', Dutch: 'bestand'
> >    Previous translation 'bestant' is wrong because of a misspelling
> >    Previous translation 'document' is inaccurate"
> > 
> > - a project-specific translation. Example:
> > 
> >    "English: 'edit', Dutch:
> >   'Bewerken' (KDE standard)
> >   'Bewerk' (GNOME standard)"
> > 
> > - per-project tips and guidelines. Example:
> > 
> >   "English: 'Are you sure you want to ...',
> >    KDE tip: doubting the user is not friendly. Please use 'Please
> >    confirm ...' instead."
> > 
> > - per-language (and per-project?) tips. Example:
> > 
> >   "English: edit, Dutch: bewerk
> >   Dutch language tip (GNOME): always use infinitive[*]"
> > 
> > - automatic parsing of your .po files??
> > - automatic updating of a few registered .po files??
> > 
> > So this is my plan for a "translation bazaar". As said, the idea is that
> > it is empty at start, and then maybe someone would dump a few GNOME and
> > KDE .po files into this database, and the initial revision process can
> > kick off. But the real idea is that folks supply their own strings they
> > want to have translated, and the database would slowly get filled, while
> > translations grow to be more accurate over time because of revisions.
> > 
> > But actually I've no idea if this would become a success. I know that I
> > myself have only little time and resources, so I'd be happy already if I
> > only managed to get the Economy scheme. I also never worked with .po
> > files and stuff. I did do some CGI and Perl stuff recently, but then
> > again I can't say that I have a good cgi-bin place to put this. It would
> > be really cool if folks could just file their (not too specific) .po or
> > similar files into the system, and the system automatically kept
> > these files translated and up to date. But as said, I don't know
> > anything of this .po stuff, so that really is beyond my abilities. But
> > if someone thinks "yeah, this is a really neat idea, and I can do it!",
> > I would be delighted to form some kind of team, of course. It may also
> > take some not-me expertise to support languages with different
> > alphabets.
> > 
> > So in fact, it will kind of depend on what you guys think of this idea.
> > Can it succeed? Will it be popular? Will this system become a standard
> > part of e.g. the rules for GNOME translation, if it works? Do you feel
> > like working on it? Do you have a good CGI space?
> > 
> > I must say, I don't know if this is a good idea, or if it is only a nice
> > theory with no practical value. So I really look forward to any
> > feedback.
> > 
> > Greets,
> > 
> > Stefan
> > 
> > [*] I'm not sure if this is the correct term because it's been a while
> > since I had to learn it. But the problem Dutch translators have to face
> > is that in English, in "I edit", the word "edit" is the same as in "to
> > edit" and "you edit", while in Dutch it is not. So when translating to
> > Dutch, you need to know which one to choose.
> > 
> > 
> > _______________________________________________
> > gnome-i18n mailing list
> > gnome-i18n@gnome.org
> > http://mail.gnome.org/mailman/listinfo/gnome-i18n
> 
> Aoife Dunne
> Program Manager
> European Localisation Centre
> Sun Microsystems Ireland Ltd
> Hamilton House
> East Point Business Park
> Dublin 3
> Ireland
> Tel.:         +353-1-8199-266
> Fax:          +353-1-8199-261
> Email:        aoife.dunne@Ireland.Sun.COM
> 
> 



-- 
"As for systems that are not like Unix, such as MSDOS, Windows, the
Macintosh, VMS, and MVS, supporting them is usually so much work that
it is better if you don't." -- Richard Stallman, GNU Coding Standards




