Re: Low memory hacks



Hi Simos,

Yesterday at 15:02, Simos Xenitellis wrote:

>> I'd like to see some real numbers on the memory usage instead of
>> numbers being thrown around.
>
> In Ubuntu 7.10, the MO files for en_GB are
> $ du -h /usr/share/locale/en_GB/LC_MESSAGES /usr/share/locale-langpack/en_GB/LC_MESSAGES/
> 2.3M    /usr/share/locale/en_GB/LC_MESSAGES
> 17M     /usr/share/locale-langpack/en_GB/LC_MESSAGES/
> $_ 
>
> In Ubuntu 8.04 (alpha 6), the MO files for en_GB are
> $ du -h /usr/share/locale/en_GB/LC_MESSAGES /usr/share/locale-langpack/en_GB/LC_MESSAGES/
> 84K    /usr/share/locale/en_GB/LC_MESSAGES
> 2.2M     /usr/share/locale-langpack/en_GB/LC_MESSAGES/
> $_ 
>
> What I am missing here is that I do not know when/how Ubuntu adds this
> functionality. It would benefit other distros as well. Did Debian
> introduce this feature? Danilo, any links?

I am not handling Ubuntu packaging stuff, so it'd be worth checking with
the Ubuntu guys instead.  Martin Pitt is probably the right person to ask
about it, but looking at the language pack source package should give a
clue as well.

However, I'd note that en_GB is not really the right locale to do
the metrics on.

> From the 2.3M + 17M MO files in Ubuntu 7.10, a typical GNOME session
> loads up a subset of the MO files,
>
> # lsof | grep \.mo\$ | awk '{print $7,$9}' | sort -n | uniq
>
> At this moment, my 7.10 is a bit messed up (I have en_GB.UTF-8 but most
> apps have en_US?!?). The figures for 8.04 with el_GR should be
> comparable to what you get now with 7.10 and en_GB:

They wouldn't be. Most el_GR text probably uses two-byte UTF-8
sequences, while en_GB would mostly use single-byte UTF-8 sequences
(i.e. ASCII).
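
To make the size difference concrete, here is a minimal Python sketch
that compares character and byte counts for one English and one Greek
string. The two sample strings and the helper name utf8_size are made
up for illustration; they are not taken from any real MO file, so
treat the ratio as indicative only.

```python
# Hypothetical sample strings, not taken from any real MO file.
english = "Open file"
greek = "Άνοιγμα αρχείου"


def utf8_size(text):
    """Return (characters, bytes) for text when encoded as UTF-8."""
    return len(text), len(text.encode("utf-8"))


for label, text in (("en_GB", english), ("el_GR", greek)):
    chars, size = utf8_size(text)
    print(f"{label}: {chars} characters, {size} bytes")
```

ASCII text needs one byte per character, while each Greek letter needs
two, so a Greek catalog with the same number of messages is expected
to be noticeably larger on disk.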

> # lsof | grep \.mo\$ | awk '{print $7,$9}' | sort -n | uniq | awk '{printf "%d+",$1}' > /tmp/bc_sums
>
> Using "bc" with /tmp/bc_sums gives the figure
> 3.6M (3624412) for a standard session. This figure is a bit
> conservative, because en_GB probably did more work than el.
>
> With Ubuntu 8.04 (alpha6) and en_GB, the figure for the MO files is
> less than 600K (585375).
> Bastien, could you provide the proper figure for your system?
>
> That is a saving of at least 3M in memory.

As Bastien explained, mmap() doesn't read the entire file into memory,
but only reads it as needed.
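
A rough Python sketch of what that means in practice: a file is mapped
in full, but only a small region of it is ever touched. The temporary
file is a hypothetical stand-in for a large MO file, and read_slice is
a name I made up. The script cannot show page residency by itself
(that would need something like mincore(2)); it only illustrates that
the mapped size and the amount actually read are different things, so
summed file sizes are at best an upper bound.

```python
import mmap
import os
import tempfile


def read_slice(path, offset, length):
    """Map the whole file read-only, but touch only one small region."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


# Hypothetical stand-in for a large MO file: 4 MiB of filler plus a marker.
marker = b"msgstr-near-the-end"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (4 * 1024 * 1024))
    tmp.write(marker)
    path = tmp.name

try:
    size = os.path.getsize(path)
    tail = read_slice(path, size - len(marker), len(marker))
    print(f"mapped {size} bytes, read {len(tail)} bytes: {tail!r}")
finally:
    os.unlink(path)
```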

> The stripping of "unneeded" messages is good, and should happen at the
> package generation level (not in GNOME, or when creating tarballs). 

Technically, I've opposed introducing this in intltool because of
one incompatible difference:

  current gettext("Something") != gettext("Something") after stripping

i.e. if "Something" was (un)translated as "Something" in the MO file,
gettext would return a static pointer to the string "Something".  If
the message was missing from the MO file (as it would be after
stripping), it would return the passed pointer.

That can be, and has been, used to detect whether there is a
translation in some programs (I've seen it done), so, until gains are
proven to be big enough to warrant breaking a few programs in strange
ways, I wouldn't do it at packaging/build time.
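
In C the check is a plain pointer comparison, gettext(msgid) == msgid.
The same thing can be sketched from Python by calling the C library's
gettext() through ctypes. This assumes a Linux system whose C library
exports gettext() (glibc does); the function name looks_translated is
mine, and no text domain is bound, so nothing is ever found here.

```python
import ctypes

# Assumption: Linux with a C library that exports gettext() (e.g. glibc).
libc = ctypes.CDLL(None)
libc.gettext.argtypes = [ctypes.c_void_p]
libc.gettext.restype = ctypes.c_void_p


def looks_translated(msgid):
    """Mimic the C idiom `gettext(msgid) != msgid` (pointer comparison)."""
    buf = ctypes.create_string_buffer(msgid)  # NUL-terminated C copy
    passed = ctypes.addressof(buf)
    returned = libc.gettext(passed)
    # gettext() hands back the very pointer it was given when it finds
    # no entry for msgid in the catalog.
    return returned != passed


print(looks_translated(b"Something"))
```

With no catalog loaded the check reports "not translated". A program
relying on it would get a different answer for "Something" depending on
whether the identical translation was kept in the MO file or stripped
from it, which is the incompatibility described above.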

Of course, providing numbers to show what the gains are would help
make the decision.

Cheers,
Danilo

