Re: Collaboration on standard Wayland protocol extensions



On 2016-03-28 2:13 PM, Carsten Haitzler wrote:
> yes but you need permission and that is handled at kernel level on a specific
> file. not so here. compositor runs as a specific user and so you can't do that.
> you'd have to do in-compositor security client-by-client.

It is different, but we should still find a way to do it. After all,
we're going to be in a similar situation eventually where we're running
sandboxed applications and the compositor is granting rights from the
same level of privilege as the kernel provides to root users (in this
case, the role is almost that of a hypervisor and a guest).

> you wouldn't recreate ffmpeg. ffmpeg produces libraries like avcodec. like a
> reasonable developer we'd just use their libraries to do the encoding - we'd
> capture frames and then hand off to avcodec (ffmpeg) library routines to do the
> rest. ffmpeg doesn't need to know how to capture - just to do what 99% of its
> code is devoted to doing - encode/decode. :) that's rather simple. already we
> have decoding wrapped - we sit on top of either gstreamer, vlc or xine as the
> codec engine and just glue in output and control APIs and events. encoding is
> just the same but in reverse. :) the encapsulation is simple.

True, most of the work is in avcodec. However, there's more to
it than that. The entire command line interface of ffmpeg would be
nearly impossible to build into the compositor effectively. With ffmpeg
I can capture X, flip it, paint it sepia, add a logo to the corner, and
mux it with my microphone and a capture of the speakers (thanks,
pulseaudio) and add a subtitle track while I'm at it. Read the ffmpeg
man pages. ffmpeg-all(1) is 23,191 lines long on my terminal (that's
just the command line interface, not avcodec). There's no way in hell
all of the compositors/DEs are going to be able to fulfill all of its
use cases, nor do I think we should be trying to.
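To make that concrete, here's a hedged sketch of the kind of one-liner ffmpeg supports today. The device names, geometry, and file names are illustrative placeholders, not a recipe for any particular system:

```shell
# Capture X11, flip it, apply a sepia color matrix, overlay a logo,
# mix the microphone with the speaker monitor, and mux in subtitles.
# Inputs: 0=x11grab, 1=mic, 2=speaker monitor, 3=subtitles, 4=logo.
ffmpeg -f x11grab -video_size 1920x1080 -i :0.0 \
       -f pulse -i default \
       -f pulse -i alsa_output.analog-stereo.monitor \
       -i subtitles.srt -i logo.png \
       -filter_complex "[0:v]hflip,colorchannelmixer=.393:.769:.189:0:.349:.686:.168:0:.272:.534:.131[s];[s][4:v]overlay=W-w-10:H-h-10[v];[1:a][2:a]amix=inputs=2[a]" \
       -map "[v]" -map "[a]" -map 3:s \
       capture.mkv
```

None of this logic belongs in a compositor; the compositor's only irreplaceable contribution is the pixels coming out of `x11grab` (or its Wayland equivalent).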

Look at things like OBS. It lets you specify detailed encoding options
and composites a scene from multiple video sources and audio sources,
as well as letting the user switch between different scenes with
configurable transitions. It even lets you embed a web browser into the
final result! All of this with a nice GUI to top it off. Again, we can't
possibly hope to effectively implement all of this in the compositor/DE,
or the features of the other software that we haven't even thought of.
 
> the expectation is there won't be generic tools but desktop specific ones. the
> CURRENT ecosystem of tools exists because that is the way X was designed to
> work. thus the state of software matches its design. wayland is different. thus
> tools and ecosystem will adapt.

That expectation is misguided. I like being able to write a script to
configure my desktop layout between several presets. Here's an example -
a while ago, I used a laptop at work that could be plugged into a
docking station. I would close the lid and use external displays at my
desk. I wanted to automatically change the screen layout when I came and
went, so I wrote a script that used xrandr to do it. It detected when
there were new outputs plugged in, then disabled the laptop screen and
enabled+configured the two new screens in the correct position and
resolution. This was easy for me to configure to behave the way I wanted
because the tooling was flexible and cross-desktop. Sure, we could make
each compositor support it, but each one is going to do it differently
and in their own subtly buggy ways and with their own subset of the
total possible features and use-cases, and none of them are going to
address every possible scenario.
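For reference, that kind of script is only a few lines. This is a minimal sketch; the output names (eDP1, DP1, DP2) and modes vary by driver and machine and are assumptions here:

```shell
#!/bin/sh
# Toggle between docked and mobile layouts. Output names are examples;
# run `xrandr --query` on your own machine for the real ones.
if xrandr --query | grep -q '^DP1 connected'; then
    # Docked: laptop panel off, two externals side by side.
    xrandr --output eDP1 --off \
           --output DP1 --mode 1920x1080 --pos 0x0 \
           --output DP2 --mode 1920x1080 --pos 1920x0
else
    # Mobile: laptop panel only.
    xrandr --output eDP1 --auto --output DP1 --off --output DP2 --off
fi
```

The point is that this works unchanged on any X desktop; a common Wayland protocol would let the same kind of tool exist there.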

> as for output config - why would the desktops that already have their own tools
> then want to support OTHER tools too? their tools integrate with their settings
> panels and look and feel right and support THEIR policies.

Base your desktop's tools on the common protocol, of course. Gnome
settings, KDE settings, arandr, xrandr, nvidia-settings, and so on, all
seem to work fine configuring your outputs with the same protocol today.
Yes, the protocol is meh and the implementation is a mess, but the
clients of that protocol aren't bad by any stretch of the imagination.

> let me give you an example:
>
> http://devs.enlightenment.org/~raster/ssetup.png
>
> [snip]

This is a very interesting screenshot, and I hadn't considered this. I
don't think it's an unsolvable problem, though - we can make the
protocol flexible enough to allow compositor-specific metadata to be
added and configurable. These are the sorts of requirements I want to be
gathering to design this protocol with.

> no - we don't have to implement it as a protocol. enlightenment needs zero
> protocol. it's done by the compositor. the compositor's own tool is simply a
> settings dialog inside the compositor itself. no protocol. not even a tool.
> it's the same as edit/tools -> preferences in most gui apps. it's just a dialog
> the app shows to configure itself.

I currently do several things in different processes/binaries that
enlightenment does in the compositor, things like the bar and the
wallpaper. I don't want to make an output configuration GUI tool nested
into the compositor; it's out of scope.

> chances are gnome likely will do this via dbus (they love dbus :)). kde - i
> don't know. but not everyone is implementing a wayland protocol at all so
> assuming they are and saying "do it the same way" is not necessarily saving any
> work.

We're all writing wayland compositors here. We may not all have dbus or
whatever else in common, but we do have the wayland protocol in common,
and it can support this use-case. It makes sense to use it.

> then intents are only a way of deciding where a surface is to be displayed -
> rather than on the current desktop/screen.
>
> so simply mark a surface as "for presentation" and the compositor will put it
> on the non-internal display (chosen maybe by physical size reported in edid as
> the larger one, or by elimination - it's on the screen OTHER than the
> internal... maybe user simply marks/checkboxes that screen as "use this
> screen for presenting" and all apps that want to present get their content
> there etc.)

Man, this is going to get really complicated. How do you decide what
display is "internal" or not? What if the user wants to present on their
primary display? What about applications that use the entire output for
things other than presentations? What if the application wants to use
several outputs, and for different purposes? What language are you going
to use to describe these settings to the user in a way that makes more
sense than the clients describing for themselves why they need to use a
particular output?

> so what you are saying is it's better to duplicate all this logic of screen
> configuration inside every app that wants to present things (media players -
> play movie on presentation screen, ppt/impress/whatever show presentation there,
> etc. etc.) and how to configure the screen etc. etc., rather than have a simple
> tag/intent and let your de/wm/compositor "deal with it" universally for all
> such apps in a consistent way?

No. Applications want to be full screen or they don't want to be. If
they want to pick a particular output, we can easily let them do so.

> > Cool. Suggestions for what sort of capability this protocol should
> > have, what kind of surface roles we will be looking at? We should
> > consider a few things. Normal windows, of course, which on compositors
> > like Sway would be tiled. Then there's floating windows, like

> ummm what's the difference between floating and normal? apps like gnome
> calculator just open ... normal windows.

Gnome calculator doesn't like being tiled: https://sr.ht/Ai5N.png

There are probably some other applications that would very much like to
be shown at a particular aspect ratio or resolution.

> xdg shell should be handling these already - except dmenu. dmenu is almost a
> special desktop component. like a shelf/panel/bar thing.

dmenu isn't the only one, though, that may want to arrange itself in
special ways. Lemonbar and rofi also come to mind.

> > [input is] something that many of Sway's users are asking for.

> they are going to have to deal with this then. already gnome and kde and e will
> all configure mouse accel/left/right mouse on their own based on settings. yes
> - i can RUN xset and set it back later but its FIGHTING with your DE. wayland
> is the same. use the desktop tools for this :) yes - it'll change between
> compositors. :) at least in wayland you can't fight with the compositor here.
> for sway - you are going to have to write this yourself. eg - write tools that
> talk to sway or sway reads a cfg file you edit or whatever. :)

I've already written this into sway, fwiw, in your config file. I think
this is fine, too, and I intend to keep supporting configuring outputs
like that. But consider the use case of Krita, or video games like Osu!
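The sway config-file approach mentioned above looks something like this. This is a hedged sketch from memory; the device identifier is an example and the exact option names should be checked against `man 5 sway`:

```
# Input device settings, keyed by the device identifier:
input "2:14:SynPS/2_Synaptics_TouchPad" {
    tap enabled
    natural_scroll enabled
    pointer_accel 0.5
}

# Output layout:
output HDMI-A-1 resolution 1920x1080 position 1920,0
```

This works fine for static user preferences, but it doesn't help an application like Krita that needs to negotiate device behavior at runtime.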

However, beyond detailed input device configuration, there are some
other things that we should consider. Some applications (games, vnc,
etc) will want to capture the mouse and there should be a protocol for
them to indicate this with (perhaps again associated with special
permissions). Some applications (like Krita) may want to do things like
take control of your entire drawing tablet.

> as i said. can of worms. :)

It's a can of worms we should deal with, and one that I don't think is
hard to deal with. libinput lets you configure a handful of details
about input devices. Let's expose these things in a protocol.

> you have no idea how many non-security-sensitive things need fixing first
> before addressing the can-of-worms problems. hell nvidia just released drivers
> that require compositors to re-do how they talk to egl/kms/drm to work that's
> not compatible with existing drm dmabuf buffers etc. etc.

Why do those things need to be dealt with first? Sway is at a good spot
where I can start thinking about these sorts of things. There are
enough people involved to work on multiple things at once. Plus,
everyone thinks nvidia's design is bad and we're hopefully going to see
something from them that avoids vendor-specific code.

I don't see these problems as a can of worms. I see them as problems
that are solvable and necessary to solve, and now is a good time to
solve them. My compositor is coming up on version 1.0. Supporting the
APIs is the driver's problem; we've described the spec, and as soon as
they implement it, it will Just Work(tm).

> even clients and decorations. tiled wm's will not want clients to add
> decorations with shadows etc. - currently clients will do csd because csd is
> what weston chose and gnome has followed and enlightenment too. kde do not want
> to do csd. i think that's wrong.

What is a can of worms is the argument over whether or not we should use
CSD or SSD. I fall in the latter camp, but I don't think we need to
fight over it now. We should be able to agree that a protocol for
negotiating whether or not borders are drawn would be reasonable. Is it
a GTK app that does nothing interesting with its titlebar? Well, if the
compositor wants to draw its borders, then let it do so. Does it do
fancy GTK stuff with the borders? Well, no, mister compositor, I want to
do fancy things. Easy enough.

> it adds complexity to wayland just to "not follow the convention". but
> for tiling i see the point of at least removing the shadows. clients
> may choose to slap a title bar there still because it's useful for
> displaying state. but advertising this info from the compositor is not
> standardized. what do you advertise to clients? where/when? at connect
> time? at surface creation time? what negotiation is it? it easily
> could be that 1 screen or desktop is tiled and another is not and you
> don't know what to tell the client until it has created a surface and
> you know where that surface would go. perhaps this might be part of a
> larger set of negotiation like "i am a mobile app so please stick me
> on the mobile screen" or "i'm a desktop app - desktop please" then
> with the compositor saying where it decided to allocate you (no mobile
> screen available - you are on desktop) and app is expected to adapt...

In Wayland you create a surface, then assign it a role. Extra details
can go in between, or go in the call that gives it a role. Right now
most applications are creating their surface and then making it a shell
surface. The compositor can negotiate based on its own internal state
over whether a given output is tiled or not, or in cases like AwesomeWM,
whether a given workspace is tiled or not. And I don't think the
decision has to be final. If the window is moved to another output or
really if any of the circumstances change, they can renegotiate and the
surface can start drawing its own decorations.
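As a sketch of what that negotiation could look like in protocol XML — the interface, request, and event names here are invented for illustration, not an existing protocol:

```xml
<interface name="zexample_decoration" version="1">
  <!-- The client states a preference; the compositor answers with the
       mode actually in effect, and may send configure again if the
       surface later moves to a tiled output or workspace. -->
  <enum name="mode">
    <entry name="client_side" value="1"/>
    <entry name="server_side" value="2"/>
  </enum>
  <request name="set_preferred_mode">
    <arg name="mode" type="uint" enum="mode"/>
  </request>
  <event name="configure">
    <arg name="mode" type="uint" enum="mode"/>
  </event>
</interface>
```

The configure event is what makes renegotiation possible: the decision is never final, and the client redraws with or without decorations as circumstances change.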

> there's SIMPLE stuff like - what happens when compositor crashes? how do we
> handle this? do you really want to lose all your apps when compositors crash?
> what should clients do? how do we ensure clients are restored to the same place
> and state? crash recovery is important because it is always what allows
> updates/upgrades without losing everything. THIS stuff is still "unsolved".
> i'm totally not concerned about screen casting or vnc etc. etc. until all of
> these other nigglies are well solved first.

I'm still not on board with all of this "first" stuff. I don't see any
reason why we have to order ourselves like this. It all needs to get
done at some point. Right now we haven't standardized anything, and each
compositor is using its own unique, incompatible way of taking
screenshots and recording videos, and each is probably introducing some
kind of security problem.

> apps can show their own content for their own bug reporting. for system-wide
> reporting this will be DE integrated anyway. supporting video capture is a
> can of worms. as i said - single buffer? multiple with metadata? who does
> conversion/scaling/transforms? what is the security model? and as i said - this
> has major implications for the rendering back-end of a compositor.

The compositor hands RGBA (or ARGB, whatever, I don't care, we just pick
one) data to the client that's recording. This problem doesn't have to
be complicated. As for the "major implications"...

> there's a difference. when it's an internal detail it can be changed and
> adapted to how the compositor and its rendering subsystem work. when it's a
> protocol you HAVE to support THAT protocol and the way THAT protocol defines
> things to work or apps break.

You STILL have to get the pixels into the encoder on the compositor
side. You will ALWAYS have to do that if you want to support video
captures, regardless of who's doing it. At some point you're going to
have to get the pixels you're rendering and hand them off to someone, be
that libavcodec or a privileged client.
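As a sketch of how simple the privileged-client route could be — `wcap` here is a hypothetical capture client that receives RGBA frames from the compositor over such a protocol and writes them as raw video to stdout; everything downstream is stock ffmpeg:

```shell
# Hypothetical: "wcap" speaks a capture protocol to the compositor and
# emits raw RGBA frames on stdout. ffmpeg handles all of the encoding,
# filtering, and muxing from there. Output name and geometry are examples.
wcap --output HDMI-A-1 |
    ffmpeg -f rawvideo -pixel_format rgba -video_size 1920x1080 \
           -framerate 60 -i - -c:v libx264 screencast.mkv
```

The compositor's side of this is exactly the pixel handoff it would need for in-compositor encoding anyway; the protocol just moves everything after that handoff out of the compositor.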

We can make Wayland support use-cases that are important to our users or
we can watch them stay on xorg perpetually and end up maintaining two
graphical stacks forever.

> priorities. there are other issues that should be solved first before worrying
> about the pandora's box ones.

These are not pandora's box. These are small, necessary features.

--
Drew DeVault

