Re: New tile unref logic and various leak fixes



On Tue, 2010-03-23 at 01:56 +0100, Jiří Techet wrote:
> On Fri, Mar 19, 2010 at 16:54, Pierre-Luc Beaudoin
> <pierre-luc pierlux com> wrote:
> > Please reinstate view->priv instead of GET_PRIVATE in
> > aa04b92d25592044c31775f16027f7c10349fa7e. Accessing a pointer is faster
> > than the G_TYPE_INSTANCE_GET_PRIVATE macro.  The priv pointer cannot be
> > read outside of champlain-view.c because the definition of the struct is
> > undefied outsite it (see the typedefs).
> > Inheriting from ClutterGroup is not a good solution: the Clutter devs
> > have said we should rather implement a ClutterActor that implements the
> > ClutterContainer interface that does the layouting.
> > The rest of the changes in this commit are good.
> 
> Yes, I agree the complete removal of priv was unnecessary and I've
> fixed that. I have also removed the last of the "public private"
> members (in ChamplainBaseMarker), which were originally the main
> reason for the change (it's evil!).
> 
> Concerning ClutterGroup inheritance I agree in principle - when we
> inherit from it, we inherit also all the container methods, which,
> however, shouldn't be used by the user. On the other hand, it's quite
> a lot of work to do and I'd postpone the correct implementation for
> 0.8. And in fact, I haven't introduced this to libchamplain - it was
> already there - the following classes were derived from ChamplainGroup
> before my changes:
> 
> ChamplainBaseMarker
> ChamplainView
> ChamplainLayer
> 
> So I'll change my classes after you fix your classes ;-)

I never made the changes because there were IMHO no clear documentation
on how do do it at the time.  The "Implementing a new actor" section of
the Clutter Reference Documentation seems to have all it takes.

Let's merge your changes and make those fix layer.

> > in 7cdde727495633976ffd210c45903ab5f9aeb7c2, the coding style we use
> > (Telepathy's) define headers this way:
> > +static void
> > +fade_in_completed (ClutterAnimation *animation,
> > +    ChamplainTile *self)
> > This may be odd, but that's this way for now every where in the code :)
> >
> 
> You know how to bore someone to death - you are responsible for the
> most boring hour in my life fixing that.

Sorry ;-)

> > In 2efa6379de623c173d825bc404e92b4ceaac1c58, if you are removeing the
> > {}, you need to move the var declaration at the top of the function :) I
> > lazily accepted the commit as is because it had the curly brackets :)
> >
> 
> Done.
> 
> > In 98ab7c94afd0edcc854102d75db57e2d0cdacf43: cool addition! :D  It'd be
> > even nicer with a go_to ;-)
> 
> Not really, I have tried it. the reason is that the speed of go_to()
> depends on the distance where you go - and with gps, the distance on
> the map is usually small, so the speed is low as well. This means that
> the point starts moving from the center until it gets far enough to
> reach equilibrium so the speed equals the speed of the point - that's
> not very nice though. We'd need something like go_to_with_speed() to
> make it look good. For GPS navigation I also wanted to avoid problems
> when you don't have signal for a while so the point gets stuck on one
> place and then, when you get the signal again 100km far from the place
> where you lost the signal with go_to _all_ the tiles in between would
> have to be filled - a bit problematic.

go_to's math should be revisited.  The intention was that if the
distance is far, it doesn't take an hour to get there.  Somehow the time
should be passed to the function, leaving that for the caller to decide.

> > Regarding the comment in ad4686924e08d024c0208652706fa38e176867a5:
> > viewport_pos_changed_cb has a view_load_visible_tiles, it should
> > download the tiles.  But I like your change anyway, it's cleaner.
> 
> I tested it and the original version didn't work - but maybe it was
> due to some changes I made in between.
> 
> >
> > In 0f32705faafff12c439c5b4bd3eb83504ac6e98d, this condition was meant to
> > save a lot of looping.  This callback is called really often with the
> > same values.  I think it should stay.
> 
> Ah, no, you missed my point (or I didn't explain it clearly enough).
> Look at the original code:
> 
>   if (x == priv->viewport_size.x &&
>       y == priv->viewport_size.y)
>       return;
> 
>   if (fabs (x - priv->viewport_size.x) < 100 &&
>       fabs (y - priv->viewport_size.y) < 100)
>       return;

You're right.

> Do you see it? The second condition catches also the case when x ==
> priv->viewport_size.x because 0 < 100. I just removed the unnecessary
> condition that was handled by the more general condition below.
> 
> (by the way, the alignment of return should be only 2 spaces here ;-)
> 
> >
> > In 2942103542c53a93ac0351183cb93250f81ce05c, these calls will make
> > features like "show the currently cached tiles on zoom out" hard to
> > do :)
> 
> Not only these, I have more comments below.
> 
> >
> > In 26fa47ea332c5e3261671778f2d860eb0c6eca80, I believe a more static
> > array for tile_map would save a lot of CPU.  It could be resized when
> > needed, but not reallocated on every run of that function, which is
> > quite often.
> 
> Well, I don't think it will make a noticeable difference. If you
> consider that you do this allocation once and inside the loop you have
> to dynamically create multiple tiles, for each of the loaded tiles you
> have to create (quite a huge) buffer into which the data is loaded
> either from hard disk or network (which adds noticeable slowdown by
> itself) and then pass the buffer into the graphics pipeline, the
> allocation of small amount of memory is totally insignificant compared
> to this and the optimization wouldn't bring any speed up.
> 
> >
> > In 1bd883f7c19432f59818e1b63927f03409e0030b, it used to be in a idle
> > callback until the animation-marker demo was created.  The animations
> > seems to eat the idle's time and tiles were sometimes not loaded.  Did
> > you experience that?
> 
> I experienced that clutter operates somewhere between NORMAL and HIGH
> priority. When I set the idle function priority to normal, the tiles
> were loaded only after the panning was completed. But after setting it
> to high, I haven't experienced this behaviour any more.
> 
> >
> > In 9ef262a7c43e47cc8d9b66435e57762bfc9770f1, it is indeed a wtf
> > moment... It seems that TidyViewport returns invalid values when we
> > update the Adjustments.  Could be worth investigating. But your later
> > commits that block the callback might be a solution, did you retry to
> > remove that fix again after?
> 
> I think the reason is quite clear now ( = get prepared for a
> complicated explanation):
> 
> 1. You call champlain_view_set_zoom_level()
> 2. champlain_view_set_zoom_level() calls resize_viewport()
> 3. resize_viewport() calls
> 
>   g_object_set (hadjust, "lower", lower, "upper", upper,
>       "page-size", 1.0, "step-increment", 1.0, "elastic", TRUE, NULL);
> 
> with updated values.
> 4. This causes that the viewport needs to be updated and because we
> are connected to the signal, viewport_pos_changed_cb() is executed.
> 5. viewport_pos_changed_cb() executes update_viewport()
> 6. In this case, viewport relocation is needed so the if-condition is
> satisfied and the contents of the block inside executed. Before my
> change we weren't blocking the signal here and there was return at the
> end of the if-block. Based on the relocation difference you update the
> viewport origin.
> 7. Updating viewport origin causes that viewport_pos_changed_cb() is
> invoked again. The viewport origin is set correctly but you forgot to
> update priv->viewport_size.x and priv->viewport_size.y based on the
> new anchor so these two still have the old values.
> 8. update_viewport() is called again, the if-block is not executed
> this time. The latitude and longitude are updated using
> viewport_get_current_longitude() and viewport_get_current_latitude().
> Look at their implementation:
> 
> static gdouble
> viewport_get_current_longitude (ChamplainViewPrivate *priv)
> {
>   return viewport_get_longitude_at (priv, priv->anchor.x +
>       priv->viewport_size.x + priv->viewport_size.width / 2.0);
> }
> 
> Clear now, isn't it? You are combining the updated anchor with the old
> priv->viewport_size.x and the result is crap.
> 
> 9. After all this is finished (update_viewport() terminates and
> update_viewport() that called it terminates too), resize_viewport()
> can resume and after it terminates we have the wrong values of
> latitude and longitude.
> 
> OK, we could fix it by setting riv->viewport_size.x and
> priv->viewport_size.y inside the if-block in update_viewport() - but
> do we want these crazy "invisible" calls? I think that disabling the
> signal makes it much easier to reason about the code - and speeds up
> things too.

I'm all good with disabling the signal.

> > All in all, those a huge improvements! And the widget really feels
> > faster and more responsive. I've looked into fixing those memory fixes
> > many many times, throwing hours at it but I was locked in my mental
> > model of the problem. I am still experiencing a memory leak, but it
> > seems to be much more contained than before (and as previously thought,
> > it could be a leak in my 3d driver). You've also neatly simplified the
> > code! :)
> 
> Yes, what I was trying was to demystify the code. I think even I can
> understand it a little now :-). For 0.8 I'd like to (at least partly)
> hide the anchor and relocation thing into a class (I was thinking
> about inheriting from TidyViewport and putting it there). I've also
> tried avoiding it completely since clutter now uses floats for tile
> coordinates. No luck. At higher zoom levels gaps start appearing
> between tiles. After thinking about it, it's actually not much
> surprising. For coordinates we need at zoom level 18 with tile size
> 256 values in 2^18 * 2^8 = 2^26, but float has only 23 bits for
> significant values. Bad luck here :-(.

Yes, if you look at the revision history from 0.2.10 up until 0.3.9x,
libchamplain had no more anchor but had ugly spacing between the tiles
at high zoom levels.  I had to revert back to the anchor hack.

> By the way, do you think it would be possible to put some traces into
> ChamplainView? What I did was that I put printf("%s", __FUNCTION__);
> at the beginning of every function there - it's very helpful for
> debugging - especially with signals and problems like the complicated
> one above. I did this about five times now - inserting this trace for
> every function and then removing for the final commit - a little
> boring. Having some macro permanently present in every function would
> be nice.

I wouldn't mind it if it's only built in DEBUG :)

> I still wonder about the memory problem. One thing is clear - it's not
> because the tile isn't deallocated. I've added one more optimization
> (based on oprofile output), where view_update_state() consumed too
> much time because it had to go through all the tiles every time tile
> changed its state. I'm now just keeping a counter of tiles that are
> being loaded and when their number drops to 0, the state changes to
> DONE. I do this by connecting to the tile's destroy signal so as long
> as the state indication works, we can be sure that all tiles are
> deallocated - quite a good leak test as well :-).
> 
> If we are not leaking tiles, we might still be leaking something else.
> I'll make some simple test program that checks whether these leaks
> appear only when using clutter textures. Personally I suspect that
> this:
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=23530
> 
> didn't get to Karmic, which I use. Will try after Jaunty is out. What
> graphics card and linux distribution do you use?

I have Karmic right now.

> > My only complain about the widget now is something that I introduced in
> > 0.4.3: before that, there never used to be a "black out" between zoom
> > levels.  It'd be nice if the previously visible tiles were still visible
> > while the new ones are loaded.  But that can be fixed after we merged
> > all this.
> 
> This one is quite simple (I promise, it'll be short now). You did use
> a kind of double buffer before that worked about this way:
> 
> new_screen <- view_load_visible_tiles()
> swap new and old screen
> 
> This worked when filling tiles was synchronous - after calling
> view_load_visible_tiles() you could be sure that all the tiles are
> loaded and swapping the old screen with the new screen did what you
> expected. With asynchronous tile loading it's different. When
> view_load_visible_tiles() finishes, _no_ tiles have been loaded yet so
> new_screen is empty when you swap the screens and you get the "black
> out". I don't see any simple solution how it fix it with asynchronous
> tile loading (which speeds up things a lot, so I wouldn't sacrifice it
> in order to get the original behaviour). Well, actually I do. Use
> white (or other light) background instead of black background like
> launcher demo does. Instead of "black out" you get "white out", which
> is much less disturbing because the maps are light too. Is it possible
> to change the background colour in Champlain-gtk? Would be nice to
> have.

But we could still delay the removal of already loaded tiles until the
new ones have been displayed. Sure there would be an initial black out
on start up, but as you said we could use another colour such as the Gtk
+ background colour for widgets.

I reviewed all of your changes and I only have one comment about the new
fade in code: it should be possible to disable it for Maemo devices.
Fade-in is really CPU intensive on the N900 and I disable it in the
builds.  I'll make that change official using the MAEMO defines.

Thanks for your changes.  Make sure to continue to work from the latest
version (with your changes!).  If you do not oppose, I'd make a 0.5.2
release with your changes this week.

Pierre-Luc

Attachment: signature.asc
Description: This is a digitally signed message part



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]