Re: GSK review and ideas

Hi Alex;

On 15 December 2016 at 15:26, Alexander Larsson <alexl redhat com> wrote:
> I just read through the wip/otte/rendermode branch, and I must say I
> really like it. It makes it very explicit that the render nodes are
> immutable. For instance, there is no "append" operation or any setters
> for any data (other than the mostly-debug name).

The mutator API was indeed a remnant of the initial design, before it
settled towards immutable nodes, and should have been removed before
freezing it.

> However, I don't think the branch goes far enough. There are several
> more changes I'd like to make:
>
> First of all, I think GtkSnapshot as-is doesn't make sense in gtk. It
> is a natural companion to GskRenderNode, and should live next to it
> (modulo some tiny helpers for css rendering that can stay in
> gtk+). I'm thinking we can call it GskRenderTreeBuilder or something,
> because it's very similar to the String / StringBuilder pattern.

I agree (and said this to Benjamin on IRC already :-)).

The "snapshot" API makes sense as a transitional API to migrate from
an immediate mode rendering (Cairo) to a retained one (GSK).
Nevertheless, having a "render tree builder" API in GSK is much more
appropriate, especially since I'm trying to get the "layers" canvas
API integrated into it, for applications that do not need widgets.

I know that Benjamin wants to have a "unique" naming policy, but at
this point I much prefer explicit — albeit common — API; a class that
builds a tree of render nodes is a RenderTreeBuilder, not a
Frobnicator.
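
To make the analogy concrete, a builder along these lines could look
like this; every name below is hypothetical, and only meant to
illustrate the String / StringBuilder parallel:

  /* Hypothetical sketch: none of these builder names exist today. */
  static GskRenderNode *
  build_frame (GskTexture              *texture,
               const graphene_rect_t   *bounds,
               const graphene_matrix_t *transform)
  {
    GskRenderTreeBuilder *builder = gsk_render_tree_builder_new ();

    /* Like g_string_append(): each call records an immutable node. */
    gsk_render_tree_builder_push_transform (builder, transform);
    gsk_render_tree_builder_add_texture (builder, texture, bounds);
    gsk_render_tree_builder_pop (builder);

    /* Like g_string_free (str, FALSE): finishing hands back the
     * immutable tree and consumes the builder itself. */
    return gsk_render_tree_builder_free_to_node (builder);
  }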

> gsk is pretty tiny now, and it is really quite tied to gdk, in that
> GskRenderer is basically the one way you're supposed to draw to a
> toplevel GdkWindow. I don't really get why it has to have its own
> namespace. I think we should just move all the gsk stuff to gdk. There
> are no pre-existing naming conflicts, and I think it will make things
> clearer.

I honestly like the idea of having a separate namespace. GDK provides
integration with the windowing system surfaces and event handling; GSK
provides a rendering and mid-level canvas API; GTK provides a rich
widget API. Ideally, you could use the GDK API and then render on it
directly with GL, similarly to how you'd use SDL.

All in all, I'd rather have GDK become smaller and smaller, instead of
merging GSK and GDK. After all, if we're really in a merging mood, we
could merge GDK, GSK, and GTK into the same namespace, and rename
GdkWindow into GtkSurface.

> I think the many layers of rendering are a bit confusing; there are so
> many different _begin_frame() calls. One obvious cleanup here would be
> to break out the cairo fallback handling in GdkWindow into a
> GdkDrawContext we could use instead of the GL one, then we can get rid
> of code like:
>
>   if (draw_context)
>     gdk_draw_context_begin_frame (draw_context, real_region);
>   else
>     gdk_window_begin_paint_internal (window, real_region);

This would make sense as a fallback.
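
Purely as a sketch, the call site would then collapse to a single
path; gdk_cairo_draw_context_new() below is a made-up constructor for
such a Cairo-based GdkDrawContext:

  /* Sketch: gdk_cairo_draw_context_new() is hypothetical; the point
   * is that the Cairo fallback becomes just another GdkDrawContext. */
  if (draw_context == NULL)
    draw_context = gdk_cairo_draw_context_new (window);

  gdk_draw_context_begin_frame (draw_context, real_region);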

> The only thing you use GskRenderNode objects for now is to create a
> tree of them and hand that off to a GskRenderer for rendering of a
> frame. All the other operations on the tree (other than the debug
> stuff in the inspector) are handled purely inside gsk/gsk. For
> instance, there is no reason to hang on to a render node for any other
> lifetime than the rendering of the frame. Having all this code doing
> atomic operations to refcount the nodes during construction, then
> nothing, then atomic refs for every node after the frame is done seems
> quite unnecessary.

One of the original goals for GskRenderNode and GskRenderer was the
ability to hand over a render tree to a separate thread — hence
the immutability at render time. Atomic reference counting was a way
to ensure that a renderer could deal with the ownership of the tree
across thread barriers.

Of course, a renderer could also take the render tree, convert it into
its own internal data structure, and then hand over that to a separate
thread.
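
A rough sketch of that second option, using a plain GLib GAsyncQueue
for the handoff (RenderOpList and convert_to_ops() are hypothetical
renderer-private constructs):

  #include <glib.h>

  static GAsyncQueue *render_queue;  /* consumed by the render thread */

  static void
  submit_frame (GskRenderNode *root)
  {
    /* Translate the immutable node tree into a renderer-private
     * command list; the tree itself can be dropped right after. */
    RenderOpList *ops = convert_to_ops (root);

    /* Hand ownership of the command list across the thread barrier;
     * no per-node atomic refcounting is needed at this point. */
    g_async_queue_push (render_queue, ops);

    gsk_render_node_unref (root);
  }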

There's also the issue of the inspector being able to record the
render tree for separate frames; or even serialize the tree and send
it across process barriers to a profiler like sysprof.

In general, it's true that once committed to the renderer, a render
tree for a frame should be dropped; but there are cases where keeping
it around is beneficial for additional tools.

> Instead I propose we add a new object GskRenderTree, which creates and
> owns all the render nodes that are part of the frame. The nodes would
> be tied to a render tree forever and have the same lifetime as the
> tree. This would allow us to very, very efficiently allocate and free
> render nodes. You'd just allocate a chunk of memory for the tree and
> hand out references to parts of it as we create new nodes. Then
> we can free this with a few calls to free() at the end. Most of the
> data in the current render node implementations is either inline or
> could easily be allocated from the same chunk of memory during
> creation. The only exceptions are really cairo_surface_t and
> GskTexture, but we can easily make the RenderTree own them. Then we
> can also drop the finalize vfunc for the render nodes.
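
A minimal sketch of the bump allocation this implies could look like
the following; all of the names are hypothetical:

  #include <glib.h>

  /* Hypothetical sketch of a per-frame bump allocator. */
  typedef struct {
    guint8 *block;  /* one large allocation for the whole frame */
    gsize   size;   /* total size of the block */
    gsize   used;   /* high-water mark */
  } GskRenderTree;

  static gpointer
  gsk_render_tree_alloc (GskRenderTree *tree,
                         gsize          size)
  {
    gpointer ptr;

    /* Keep every node pointer-aligned; a real implementation would
     * grow or chain blocks instead of asserting. */
    size = (size + G_MEM_ALIGN - 1) & ~((gsize) G_MEM_ALIGN - 1);
    g_assert (tree->used + size <= tree->size);

    ptr = tree->block + tree->used;
    tree->used += size;

    return ptr;
  }

  /* Releasing every node in the frame is then a couple of calls. */
  static void
  gsk_render_tree_free (GskRenderTree *tree)
  {
    g_free (tree->block);
    g_free (tree);
  }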

I'd probably add more ancillary data to this structure and call it a
GskRenderFrame, which is composed of:

 - the render tree
 - the metadata attached to the target surface, like its geometry and
   color space
 - the age of the frame
 - the clip region for the frame

Then we would make GskRenderer a collection and an allocator of frames.
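
In struct form, and purely as a sketch restating the list above, that
would be something like (GdkColorSpace in particular is not an
existing type):

  /* Hypothetical sketch, restating the fields listed above. */
  typedef struct {
    GskRenderTree   *tree;         /* the render tree for the frame */
    graphene_rect_t  geometry;     /* geometry of the target surface */
    GdkColorSpace   *color_space;  /* color space of the target surface */
    int              age;          /* the age of the frame */
    cairo_region_t  *clip;         /* the clip region for the frame */
  } GskRenderFrame;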

> This, combined with the fact that OpenGL makes damage-style partial
> updates of the front buffer very hard, flickery, and generally poorly
> supported, means we should consider always updating the entire
> toplevel each time we paint. This is what games do, and we need to
> make that fast anyway for e.g. the resize case, so we might as well
> always do it. That means we can further simplify clipping in general,
> because then we always start with a rectangle and only clip by rects,
> which I think means we can do the clipping on the GPU.

Ideally, we can use GLX and EGL extensions to swap regions of the back
buffer. GNOME Shell uses them already, and the main reason why you
should always prefer that is to reduce the bandwidth usage, which in
turn keeps things powered up only for the smallest amount of time
possible. Pushing GB/s over the GPU channel at 60 frames per second
in order to animate a 64x64 pixel spinner is not advisable when you
are on battery power. :-)

Sadly, as we found out during the last 10 years of Clutter
development, drivers aren't always good; things like
glXCopySubBuffer() not being vsync-locked ended up adding tearing,
instead of removing it. Newer drivers and GPUs are better at that,
though; and in a Wayland-oriented future, the compositor is definitely
better suited at dealing with this stuff for us.

At a lower level, we should use scissoring to deal with the bounding
box of the clip region as much as possible, to let the GL driver drop
unnecessary geometry for us; after that, using things like
EGL_EXT_swap_buffers_with_damage to specify the damaged region would
be the best option to reduce the amount of pixels being pushed to the
GPU.
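
Roughly, assuming a cairo_region_t clip and an EGL surface, and with
eglSwapBuffersWithDamageEXT() resolved via eglGetProcAddress() after
checking for the extension, the combination looks like this:

  cairo_rectangle_int_t extents;

  /* Scissor to the bounding box of the clip region, so the driver
   * can discard geometry that falls outside of it. */
  cairo_region_get_extents (clip_region, &extents);

  glEnable (GL_SCISSOR_TEST);
  /* GL uses a bottom-left origin, so flip the Y coordinate. */
  glScissor (extents.x,
             surface_height - extents.y - extents.height,
             extents.width,
             extents.height);

  /* ... render the frame ... */

  /* Report the damaged area when swapping, so only the pixels that
   * actually changed get pushed; damage rects use the same
   * bottom-left origin. */
  EGLint damage[4] = {
    extents.x,
    surface_height - extents.y - extents.height,
    extents.width,
    extents.height
  };

  eglSwapBuffersWithDamageEXT (egl_display, egl_surface, damage, 1);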

Of course, we *really* need to avoid emitting that geometry in the
first place — and that's where culling at the GTK level comes into
play.

> Of course, such culling relies on the fact that the children don't
> draw outside their allocation. Culling seems quite important to be
> able to handle 10000s of rows in a listbox. So maybe we need to
> actually clip every widget.

We need to ensure that widgets do not draw outside their clip (which
usually matches the allocation, but not always); if they do,
you'll get artefacts when doing things like transformation — which
usually gets noticed pretty quickly.

Rendering is a matter of two things, in the GSK world:

 - attaching rendering operations as nodes
 - rasterizing content via Cairo

Currently, GSK is pretty much all about the latter — and thus there
can never be rendering outside of the render node bounds, since the
Cairo drawing surface is created with clipping information based on
the bounds rectangle.
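
For reference, that bounds-based clipping boils down to a few Cairo
calls before a node's draw function runs (assuming the node carries a
graphene_rect_t bounds field):

  /* Clip the cairo context to the node's bounds before calling its
   * draw function, so nothing can land outside of them. */
  cairo_save (cr);
  cairo_rectangle (cr,
                   node->bounds.origin.x, node->bounds.origin.y,
                   node->bounds.size.width, node->bounds.size.height);
  cairo_clip (cr);

  /* ... node-specific drawing on cr ... */

  cairo_restore (cr);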

Given that rendering operations come with their own geometry, we can
also check the transformed boundaries of each sub-tree, and we can
cull the nodes that fall outside of the visible frustum. This would be
slightly less efficient than not creating those nodes in the first
place, but we can point that out to the app developer pretty easily in
our tools.
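
A sketch of that culling test, using the graphene types GSK already
relies on:

  #include <glib.h>
  #include <graphene.h>

  /* Returns TRUE if the node's transformed bounds intersect the
   * viewport and the sub-tree therefore needs to be rendered. */
  static gboolean
  node_is_visible (const graphene_matrix_t *transform,
                   const graphene_rect_t   *node_bounds,
                   const graphene_rect_t   *viewport)
  {
    graphene_rect_t world_bounds;

    /* Project the node's bounds into the viewport's coordinate
     * space. */
    graphene_matrix_transform_bounds (transform, node_bounds,
                                      &world_bounds);

    /* Anything that does not intersect the viewport can be culled,
     * along with its whole sub-tree. */
    return graphene_rect_intersection (viewport, &world_bounds, NULL);
  }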

Ciao,
 Emmanuele.

-- 
https://www.bassi.io
[@] ebassi [@gmail.com]

