Re: Doubts about GPeriodic



Hi,

On Fri, Oct 22, 2010 at 4:48 PM, Owen Taylor <otaylor redhat com> wrote:
> I think we're largely agreeing on the big picture here - that priorities
> don't work so there has to be arbitration between painting and certain
> types of processing.

Right, good. The rest is really just details - there are various ways
it could work.

As I wrote this email I realized I'm not 100% clear on how you propose
the 50/50 would work, so maybe it's something to spell out more
explicitly. There's no way to know in advance how long painting will
take, right? So is it a rule for the "other stuff" half? Do you just
mean an alternative way to compute the max time spent on non-painting
tasks (half the frame length, instead of 5ms or "until frame-complete
comes back")?

> But pathological or not, I think it's also common. This is where my
> suggestion of a 50% rule comes in. It's a compromise. If we're lucky
> repainting is cheap, we're hitting the full frame rate, and we're also
> using 75% of the cpu to make progress. But when drawing takes more time,
> when there is real competition going on, then we don't do worse than
> halve the frame rate.
>
> (This continues to hold in the extreme - if redrawing is *really* slow -
> if redrawing takes 1s, then certainly we don't want to redraw for 1s, do
> 5ms of work, redraw for another 1s, and so forth. Better to slow down
> from 1 fps to 0.5fps than to turn a 1s computation into a 3 minute
> computation.)

Let me think about this in terms of litl shell, which is the
real-world example I'm thinking of, and maybe we can see how
gnome-shell or some other GL-using apps differ which could be
instructive.

Just to go over what's in there: you have your compositor; the most
intensive painting it does would usually be repainting video. But it
also has various GL eye-candy animations and a lot of clutter actors
on some screens, so that takes time. Since lots of the UI is in the
shell itself, compositing is not always what we're doing; sometimes
it's really just a regular Clutter app. Then crammed into this
process, for better or for worse:

 - "UI chrome" for out-of-process browser (toolbar in-process, actual
web page out)
 - global UI chrome (settings, switching between apps, etc.)
 - video chat stuff mixed in to the overall UI (the video chat engine
and video/audio playback is out of process, but has a lot of chit-chat
with the shell, it isn't an "app" but mixed into the global UI)
 - photos app that downloads photos from the web and then does GL-ish
stuff to display them (this is in-process for the bad reason that
drivers are broken and out-of-process GL is/was a failure)
 - playing audio bleeps and bings in response to UI interaction
 - chatter with a sqlite thread
 - chatter with the litl servers (over dbus, via the browser process)
 - chatter with gio threads
 - chatter over dbus about NetworkManager and whatnot
 - chatter over dbus to GTK widgets that are out of process (don't ask
but guess why I'm interested in certain GTK work ;-))
 - "misc"

I guess gnome-shell is similar, except with less "stuff".

As you say in the followup mail, at some point multiple
processes/threads exist for a reason.

Agreed, but in litl shell there's only one thing in-process that I
think shouldn't be, which is the photo app, and one thing out of
process that shouldn't be (the GTK widgets). It's just a complex app
talking to a lot of other processes. All the main shell really does is
coordinate processes and paint an assemblage of the other processes. I
don't know, I would think it's basically the same deal as the "main"
Chrome process with every tab out of process, or as the main Eclipse
process where Eclipse has lots of threads, or whatever. The main shell
doesn't do blocking IO or long computations. It does have loads of IPC
queues to talk to all the threads and processes. I almost feel like
the threads and processes are the whole reason we have queue-based
GSource.

It almost seems like this is my prototypical case, where there *isn't*
any computation in the main thread, just lots of queues to dispatch,
and the case you're worried about most is where there *is* ... ?

On Radeon hardware with indirect rendering, litl shell paint takes in
the 7-9ms area. So at 60fps (16.6ms per frame) you have about 5ms per
frame left over, with a little headroom. At 50fps you have more
headroom.

I'm not sure exactly what you're suggesting with the 50% rule; if it
strictly said 8ms instead of 5ms for the non-paint half, then that
sort of moves the practical numbers from "60fps with a bit to spare"
to "dropping frames", right? Most likely it isn't genuinely that big a
deal, because most frames don't hit the 5ms max, even fewer would hit
an 8ms max, and we can start painting once there's nothing to do. But
there is a qualitative difference between 5 and 8, which is whether
the painting still fits in the frame or finishes too late.

Ignoring the specifics, takeaways could be:
 * there's a cliff in the chosen time to spend not-painting where we
make ourselves miss the vsync
 * there's a cliff in the theoretical framerate your app can achieve,
where if you take longer than a frame to paint you must drop frames,
and if you take less than a frame you can in theory avoid dropping
frames - as long as you limit non-painting activities "enough"
 * both cliffs depend on the hardware, its refresh rate, and on how
long the app takes to paint on the hardware

I kinda think most apps would prefer to achieve the best framerate
possible as limited by their painting, possibly slowing down
non-painting activities a fair amount "within reason." That is, it
kinda makes sense to avoid the cliffs.

Unless it doesn't; I mean, in your treeview example the expander
animation is unimportant enough that maybe the user would rather fill
up the treeview faster.

But if we're watching a video or playing a game, probably the user
would rather it look as nice as possible, even if some background task
is going on.

If we err on the side of a small non-paint time, then we optimize for
framerate; if we err on the side of a large one, we optimize for other
stuff, I guess. I don't know how to solve this generically :-/

I feel like I know the answer for litl shell, which is that I'd rather
do 3-4ms of "non-paint" than go up to 8ms, just to be sure we hit
60fps on the hardware we care about... and as far as I know, there's
no compression that's essential, or other work that simply must be
done on this frame rather than the next. I mean, we even chop off
event queue processing after 5ms and skip that compression too.
However, I don't think 8ms would be so bad in practice. The main thing
is to have some kind of limit that's below a frame length, so we
aren't dropping multiple frames at a time. That, and to fix any bugs
that mean we don't get a good framerate on average.

It varies by case even within the shell, I guess... if the current
animation is just a spinner, sure, drop that to 30fps and finish the
operation quicker... but tuning the spinner is less important than the
cases where we really want the 60fps (e.g. video or more prominent
animations).

Is it helpful that for framerate there are only a couple of logical
options? One is 50-60fps (the actual refresh rate) and the other is
24-30fps (the usual video framerate, the lower limit to look decent,
skipping every other vsync). Possibly, given that there are 2 instead
of N sane framerates, there's some way to pick the best one ...
everything below 24-30 is ugly and basically would be a bug,
everything above 60 is a waste of effort and basically a bug. Maybe we
care about 120Hz monitors? I don't know. Anyway, then it's 3 instead
of 2.

There are also only so many reasonable amounts of time to spend on
painting or non-painting. Any paint speed that isn't going to get,
say, 24fps, or at least 15, is not really reasonable. Any non-painting
timeslice that can't dispatch the whole event queue plus a bit of
other work on every frame is not really reasonable. Any non-painting
timeslice that doesn't leave a good few ms for painting is not really
reasonable.

Anyhow I don't know exactly the right answer. The main thing I was/am
trying to argue is for some sort of arbitration here to avoid starving
either thing.

Maybe 50/50 is the best we can do. The thing that kinda gives me pause
there is that a hand-tuned 5/11 split is pretty clearly a bit better
than 8/8 for litl shell on the specific hardware in question. I feel
like an ideal solution would get a bit closer to the hand-tuning
somehow.

> The things that are slow in the main thread for most apps that I've ever
> profiled are layout and painting. Everything else going is just setting
> that up.

For the litl shell this is true too, so our only real goal is to keep
the other stuff from inadvertently messing up the painting because it
gets clumped together badly. Painting is using most of the CPU.

> "The queue" is largely not in your control - it's whatever GIO or gdbus
> does. If they run the full queue or schedule a separate source for each
> callback, then painting won't happen until that's done. If they only do
> a fixed amount of work before yielding then compression breaks.

I'd propose that generic things like GIO and gdbus "should" be
splitty, dispatching as little as possible per iteration. The
alternative (always running the whole queue) just doesn't seem OK.

Havoc
