Statistics for each GNOME translator's work
- From: Simos Xenitellis <simos lists googlemail com>
- To: Gnome i18n <gnome-i18n gnome org>
- Subject: Statistics for each GNOME translator's work
- Date: Tue, 20 Jul 2010 01:35:56 +0300
Hi All,
A few months ago there was a discussion on gnome-i18n about
having translation statistics, or some way to easily see how much work
each translator is doing.
Such a thing would be useful if, for example, you want to make an announcement
of the localisation of GNOME 3.0 to your language and you want to show how
much work each translator did.
I am working on such a tool and it is available at
http://github.com/simos/gnome-l10n-translator-stats
Here is a sample run for 'yelp' only (/tmp/GIT/ has only 'yelp'),
> ./gnome-l10n-translator-stats stats --language el --startdate "2009/01/01" --enddate "2010/06/30" --release gnome-2-30 --repositories /tmp/GIT/
Release : gnome-2-30 retrieved release: gnome-2-30
Language : el retrieved language: el
Repositories : /tmp/GIT/
Start date : Thu Jan 1 00:00:00 2009
End date : Wed Jun 30 00:00:00 2010
Thanos Lefteris <xxx gmail com> 165 16
Kostas Papadimas <xxx gnome org> 257 45
Simos Xenitellis <xxx gnome org> 0 2
> _
The first column is the number of translated words and the second is the
number of 'changed messages' (translation fixes/updates).
What's still missing is a better algorithm to count the work done
when a translation is 'updated'.
When a translation is added for the first time, it's simple to
count the translated words.
The current algorithm is
1. Obtain the before and after versions of a PO file.
2. Use 'pocount' to count the translated strings in both, and note down
the difference.
3. Use 'podiff' to count the 'changed' messages (message updates).
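To make the three steps concrete, here is a rough Python sketch. It does not call 'pocount' or 'podiff'; it uses a deliberately simplified PO parser of its own (no plural forms, no msgctxt, no fuzzy flags, no escape handling), and it counts words in the msgstr text, which may not match how 'pocount' counts. The function names are made up for this illustration and are not taken from the tool.

```python
def _unquote(fragment):
    """Strip the surrounding double quotes from a PO string fragment."""
    fragment = fragment.strip()
    return fragment[1:-1] if len(fragment) >= 2 else ""


def parse_po(text):
    """Return {msgid: msgstr} for simple entries (simplified, see assumptions)."""
    entries = {}
    msgid, msgstr, field = None, "", None
    # The extra "" guarantees the last entry is flushed.
    for raw in text.splitlines() + [""]:
        line = raw.strip()
        if line.startswith("msgid "):
            if msgid:
                entries[msgid] = msgstr
            msgid, msgstr, field = _unquote(line[6:]), "", "msgid"
        elif line.startswith("msgstr "):
            msgstr, field = _unquote(line[7:]), "msgstr"
        elif line.startswith('"') and field == "msgid":
            msgid += _unquote(line)       # continuation line of msgid
        elif line.startswith('"') and field == "msgstr":
            msgstr += _unquote(line)      # continuation line of msgstr
        else:
            # Blank line, comment, or unsupported keyword: close the entry.
            # The header entry (empty msgid) is skipped on purpose.
            if msgid:
                entries[msgid] = msgstr
            msgid, msgstr, field = None, "", None
    return entries


def translated_words(entries):
    """Step 2 stand-in: count words in all non-empty translations."""
    return sum(len(s.split()) for s in entries.values() if s)


def changed_messages(before, after):
    """Step 3 stand-in: messages translated before whose translation changed."""
    return sum(
        1
        for key, old in before.items()
        if old and after.get(key) and after[key] != old
    )


def revision_stats(before_text, after_text):
    """Step 1 gives the two texts; return (word difference, changed messages)."""
    before, after = parse_po(before_text), parse_po(after_text)
    return translated_words(after) - translated_words(before), changed_messages(before, after)
```

Calling revision_stats() with the old and new contents of a PO file gives the two numbers for one commit. The word difference can come out negative, just as described for 'pocount' below.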
Disadvantages
a. 'pocount' sometimes shows fewer messages in the newer PO, so the
difference is negative.
Currently these negative numbers are not counted.
b. I did not establish the significance of the podiff changed messages.
c. The figures are rough statistics. It will take several revisions
before the statistics are exact.
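As an illustration of point (a), this is roughly how per-translator totals could be accumulated while leaving negative differences out. It is only a sketch: the function name, the input format and the sample figures are invented for the example and are not taken from the tool.

```python
from collections import defaultdict


def accumulate(revisions):
    """Sum per-translator stats from (translator, word_difference, changed) tuples.

    Negative word differences are skipped rather than subtracted,
    while changed messages are always added.
    """
    totals = defaultdict(lambda: [0, 0])
    for translator, word_difference, changed in revisions:
        if word_difference > 0:
            totals[translator][0] += word_difference
        totals[translator][1] += changed
    return {name: tuple(values) for name, values in totals.items()}


# Hypothetical input, one tuple per commit that touched the PO file.
sample = [
    ("Translator A", 120, 3),
    ("Translator B", -15, 2),   # negative difference: not counted in
    ("Translator A", 45, 0),
]
print(accumulate(sample))
```

A translator whose commits only produced negative differences still shows up, with 0 words and the number of changed messages.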
Features/Advantages
a. The stats for a full GNOME release (gnome-2-30) and a language (el)
take about 20 minutes to complete.
b. You can specify which release to use or even a specific module.
c. I wrote my own class in place of 'python-git', which was erratic. Many
commit messages are messy: for example, they contain non-UTF8 text,
e-mails written as 'me at gmail dot com', or a 'Merge:' field.
d. Can show text colors.
e. Allows extending the statistics to before April 2009, when GNOME still
used SVN. Back then, the actual translator's name was in the commit comment.
With this tool you can correctly attribute the work to
the actual translator.
f. The tool works by creating a temporary branch and then
removing commits from it,
so that it can find the different versions of the PO file during the
specified time period.
Once the stats have been calculated, the temporary branch is deleted and the
repository switches back to 'master'.
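To illustrate the messy commit data mentioned in (c), here is a small Python sketch that parses the header of one commit as printed by plain 'git log'. It assumes the default 'commit' / 'Merge:' / 'Author:' / 'Date:' layout; the function names are hypothetical and this is not the class used in the tool.

```python
import re

# Matches e.g. "Author: Some Name <me at gmail dot com>"
_AUTHOR = re.compile(r"^Author:\s*(?P<name>.*?)\s*<(?P<email>[^>]*)>\s*$")


def normalize_email(raw):
    """Turn 'me at gmail dot com' into 'me@gmail.com'."""
    email = re.sub(r"\s+at\s+", "@", raw.strip(), count=1, flags=re.IGNORECASE)
    email = re.sub(r"\s+dot\s+", ".", email, flags=re.IGNORECASE)
    return email.lower()


def parse_commit_header(raw_bytes):
    """Parse the header lines of one 'git log' entry given as bytes."""
    # Non-UTF8 bytes become U+FFFD instead of raising an exception.
    text = raw_bytes.decode("utf-8", errors="replace")
    info = {"commit": None, "merge": None, "name": None, "email": None}
    for line in text.splitlines():
        if not line.strip():
            break  # the header ends at the first blank line
        if line.startswith("commit "):
            info["commit"] = line.split()[1]
        elif line.startswith("Merge:"):
            info["merge"] = line.split()[1:]  # parent hashes, merges only
        else:
            match = _AUTHOR.match(line)
            if match:
                info["name"] = match.group("name")
                info["email"] = normalize_email(match.group("email"))
    return info
```

Decoding with errors="replace" keeps the parser running on commits with broken encodings, at the cost of a replacement character in the name, so such authors may still need to be merged by hand.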
Cheers,
Simos