Re: Trivial patch reducing fp mults in pango-cairo



On 12/13/06, Behdad Esfahbod <behdad behdad org> wrote:
On Mon, 2006-12-11 at 12:56 -0800, Daniel Amelang wrote:
> What's left is the conversion from fixed->double, which can be done
> w/out the __mul or the __floatsidf. Basically, the number of leading
> zeros in the fixed point number can be used to determine the exponent
> value of the target double, and since the number is in fixed point,
> you'll need to use a bias that is adjusted for the size of the
> fractional part of the fixed point number. After shifting the fixed
> point the proper amount (based on the number of leading zeros again),
> you'll have your exponent and mantissa all set to pack into a union.
> Copy the double from the union into the cairo_glyph coordinate, and
> you're done. Need to watch out for some special cases, but I think
> that the approach is sound.

Well, this is kinda hitting the limit.  You are basically rewriting soft
float routines.  First, I'm not sure it's much faster (ok, you can skip
some details, so it's got to be faster), second, you are mostly shifting
time from __mul to library functions.  I'd rather leave these to the
compiler.  Has anyone tested compiling recent pango+cairo with
softfloats on small systems?

I'm going to guess that you haven't looked over the softfloat source
code very carefully :) What I'm proposing is so much simpler, and
will pipeline so much better that to say that I'm "basically rewriting
soft float routines" is a stretch. This is pretty similar to what I
did with cairo_lround, and I saw a 5x speedup on ARM for that function
alone after I converted it to use an approach similar to the one
above. Usually, you get a bunch of simple integer instructions w/ few
little branches, if any, which is really fast on most systems. Either
way, we can't say for sure until someone codes it up :)
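
For illustration only, here is a minimal sketch of the kind of
fixed->double conversion described above. It is not code from pango or
cairo. The name fixed_to_double is hypothetical, FRAC_BITS is assumed
to be 10 (matching PANGO_SCALE = 1024), doubles are assumed to be IEEE
754 binary64 stored in the same byte order as 64-bit integers, and
__builtin_clz assumes GCC or Clang.

```c
#include <stdint.h>

/* Assumption: 10 fractional bits, as with PANGO_SCALE == 1024. */
#define FRAC_BITS 10

/* Hypothetical helper: convert a signed 32-bit fixed-point value to a
 * double using only integer operations. Assumes IEEE 754 binary64 with
 * the same endianness as uint64_t. */
double fixed_to_double(int32_t fixed)
{
    union { uint64_t u; double d; } pack;
    uint64_t sign, exponent, mantissa;
    uint32_t mag;
    int msb;

    /* Special case: zero has no leading one bit to normalize on. */
    if (fixed == 0)
        return 0.0;

    /* Split into sign and magnitude (unsigned negate is safe for INT32_MIN). */
    sign = fixed < 0 ? (uint64_t)1 << 63 : 0;
    mag  = fixed < 0 ? 0u - (uint32_t)fixed : (uint32_t)fixed;

    /* Position of the highest set bit, from the number of leading zeros. */
    msb = 31 - __builtin_clz(mag);

    /* Exponent: the usual bias of 1023, adjusted for the fractional bits. */
    exponent = (uint64_t)(1023 + msb - FRAC_BITS);

    /* Shift the leading one up to bit 52, then mask it off (it is implicit). */
    mantissa = ((uint64_t)mag << (52 - msb)) & (((uint64_t)1 << 52) - 1);

    pack.u = sign | (exponent << 52) | mantissa;
    return pack.d;
}
```

Since a 32-bit magnitude always fits in the 52-bit mantissa, the
conversion in this sketch is exact and needs no rounding, which is
where most of the branches in a general int->double routine come from.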

> Once that is done, pangocairo should be pretty much FP free for the
> typical code paths that I would expect to see on the 770. On
> timetext.c or the torturer's GtkTextView, I don't think you'll see
> _that_ much improvement (percentage-wise) from this change until you
> get Xan's XRender glyph optimization into cairo, as that is a bigger
> bottleneck ATM, I think.

Yeah, if you compare the overall profiles with pangocairo ones,
pangocairo is taking like less than 5% of the time (possibly much less).
Nothing to be gained here.

Here, I totally agree with you. This is why I haven't bothered to code
it up yet. But since Jorn was looking into eliminating FP from
pangocairo, I thought I'd share what I think is the best way to do so,
given that's what you want.

Dan


