Proposal for declinations in gettext
- From: Danilo Segan <dsegan gmx net>
- To: translation iro umontreal ca
- Cc: linux-utf8 nl linux org, translation-i18n lists sourceforge net, GNOME I18N List <gnome-i18n gnome org>
- Subject: Proposal for declinations in gettext
- Date: Fri, 13 Jun 2003 22:14:25 +0200
Hi,
first, sorry for cross-posting (some of you will receive multiple
messages :-().
I'd like to propose a simple gettext extension which would work at least
for Serbian, but I hope it would work for many other languages.
*Background:*
The Serbian language has 7 declension forms (cases) of a word (nouns,
pronouns, and similar words); in recent discussions on the gnome-i18n
list I found out that Finnish has 15, etc. This becomes a major problem
when translating "composed" strings, as in "move %s", where "%s" might
be any of "queen", "king", ...
The usual scenario is this (Serbian Latin transliteration used for
examples):
msgid "queen"
msgstr "kraljica"
msgid "king"
msgstr "kralj"
msgid "move %s"
msgstr "premesti %s"
msgid "go with %s"
msgstr "idi sa %s"
It's unfortunate (or is it?) that composing these gives "premesti
kraljica", which is incorrect (it ought to be "premesti kraljicu"), or
"idi sa kralj" instead of "idi sa kraljem".
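To make the failure concrete, here is a minimal Python sketch of what a program effectively does today. The dictionary catalog and the helpers `translate` and `compose` are hypothetical stand-ins for illustration, not the gettext API:

```python
# Hypothetical miniature catalog: one translation per msgid, mirroring
# the PO entries above.
CATALOG = {
    "queen": "kraljica",
    "king": "kralj",
    "move %s": "premesti %s",
    "go with %s": "idi sa %s",
}


def translate(msgid):
    """Return the translation, or the msgid itself if none exists."""
    return CATALOG.get(msgid, msgid)


def compose(template_msgid, noun_msgid):
    """Translate the template and the noun separately, then substitute."""
    return translate(template_msgid) % translate(noun_msgid)


print(compose("move %s", "queen"))    # premesti kraljica (wanted: premesti kraljicu)
print(compose("go with %s", "king"))  # idi sa kralj (wanted: idi sa kraljem)
```

The noun is always inserted in its base form, because the catalog has no way to say which form the surrounding sentence needs.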
The solution is simple, and I guess that it will work for at least all
Slavic languages, but probably many more.
*Solution:*
# in the header, 7 is a sample for Serbian
"PO-Number-of-noun-forms: 7\n"
msgid "queen"
msgstr<0> "kraljica"
msgstr<3> "kraljicu"
msgstr<5> "kraljicom"
msgid "king"
msgstr<0> "kralj"
msgstr<3> "kralja"
msgstr<5> "kraljem"
msgid "move %s"
msgstr "premesti %<3>s"
msgid "go with %s"
msgstr "idi sa %<5>s"
<i>, where i = 0 .. (PO-Number-of-noun-forms)-1, is the index of the form
required, and it depends on the sentence construction. It is determined
by the verb, or perhaps by words like "with", "whom", ... Some of the
msgstr<i> entries can be omitted if they are known not to be used in
composition (most are highly unlikely ever to be used in translations,
like the "vocative" form in "hey %s").
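A rough Python sketch of how the library side might resolve such entries follows. Everything in it is an assumption made for illustration: the dictionary layout, the names `lookup` and `compose`, and in particular the fallback to form 0 when a requested form was omitted, which the proposal itself does not specify.

```python
import re

# Value of the proposed "PO-Number-of-noun-forms" header (7 for Serbian).
NOUN_FORMS = 7

# Hypothetical in-memory catalog: msgid -> {form index: translation}.
# A plain msgstr is stored as form 0.
CATALOG = {
    "queen": {0: "kraljica", 3: "kraljicu", 5: "kraljicom"},
    "king": {0: "kralj", 3: "kralja", 5: "kraljem"},
    "move %s": {0: "premesti %<3>s"},
    "go with %s": {0: "idi sa %<5>s"},
}

# Matches "%s" and "%<i>s"; group 1 holds the form index if present.
PLACEHOLDER = re.compile(r"%(?:<(\d+)>)?s")


def lookup(msgid, form=0):
    """Return the requested form of msgid's translation.

    Assumption: an omitted form falls back to form 0, and an
    untranslated msgid is returned unchanged.
    """
    forms = CATALOG.get(msgid)
    if forms is None:
        return msgid
    return forms.get(form, forms.get(0, msgid))


def compose(template_msgid, *noun_msgids):
    """Translate the template, then fill each placeholder with the
    noun form that the translated template asks for."""
    template = lookup(template_msgid)
    nouns = iter(noun_msgids)

    def substitute(match):
        form = int(match.group(1) or 0)
        if not 0 <= form < NOUN_FORMS:
            raise ValueError(f"form index {form} out of range")
        return lookup(next(nouns), form)

    return PLACEHOLDER.sub(substitute, template)


print(compose("move %s", "queen"))    # premesti kraljicu
print(compose("go with %s", "king"))  # idi sa kraljem
```

Note that the caller still passes only the untranslated msgids; the choice of form lives entirely in the translated template.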
The good side of this approach (the syntactic elements are arbitrary,
don't comment on those) is that programs that use gettext for l10n would
need no change: everything would be done on the gettext library side and
by translators (it's even better than plural-forms in that respect). Of
course, care should be taken to also allow combining these with plural
forms, as in:
msgid "king"
msgid_plural "kings"
msgstr[0]<0> "kralj"
msgstr[0]<5> "kraljem"
msgstr[2]<0> "kraljevi"
msgstr[2]<5> "kraljevima"
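As a sketch of how the two indices could be combined at lookup time, the Python below selects first by plural index and then by noun form. The catalog layout, the name `nlookup`, and the fallback rules are hypothetical; `plural_index` follows the three-form plural rule commonly used for Serbian in gettext, which is assumed here rather than taken from this proposal.

```python
# Hypothetical catalog: (msgid, msgid_plural) -> {(plural, form): text},
# mirroring the msgstr[p]<i> entries above.
CATALOG = {
    ("king", "kings"): {
        (0, 0): "kralj",
        (0, 5): "kraljem",
        (2, 0): "kraljevi",
        (2, 5): "kraljevima",
    },
}


def plural_index(n):
    """Three-form plural rule assumed for Serbian."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not 10 <= n % 100 < 20:
        return 1
    return 2


def nlookup(msgid, msgid_plural, n, form=0):
    """Pick the translation for count n in the given noun form.

    Assumption: a missing noun form falls back to form 0 of the same
    plural index, and a missing entry falls back to the English strings.
    """
    untranslated = msgid if n == 1 else msgid_plural
    entry = CATALOG.get((msgid, msgid_plural))
    if entry is None:
        return untranslated
    p = plural_index(n)
    for key in ((p, form), (p, 0)):
        if key in entry:
            return entry[key]
    return untranslated


print(nlookup("king", "kings", 1, form=5))  # kraljem
print(nlookup("king", "kings", 5, form=5))  # kraljevima
```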
Before diving into gettext code, it'd be nice to hear if this kind of
approach would work for any language other than Serbian (I repeat, I
find it likely to work for Slavic languages, and German, those being the
languages I'm at least a bit familiar with).
In any case, looking forward to hearing from all of you.
Again, sorry for cross-posting, but I just wanted to reach the widest
possible audience, so as to get some *real* insight into the problem.
Cheers,
Danilo