Re: Collaboration on standard Wayland protocol extensions

From: Carsten Haitzler (The Rasterman) <raster rasterman com>
To: Drew DeVault <sir cmpwn com>
Cc: "Kwin, NET API, kwin styles API, kwin modules API" <kwin kde org>, desktop-devel-list gnome org, wayland-devel lists freedesktop org
Subject: Re: Collaboration on standard Wayland protocol extensions
Date: Mon, 28 Mar 2016 23:03:00 +0900

On Mon, 28 Mar 2016 09:00:34 -0400 Drew DeVault <sir cmpwn com> said:

On 2016-03-28  2:13 PM, Carsten Haitzler wrote:

yes but you need permission and that is handled at kernel level on a
specific file. not so here. compositor runs as a specific user and so you
cant do that. you'd have to do in-compositor security client-by-client.


It is different, but we should still find a way to do it. After all,
we're going to be in a similar situation eventually where we're running
sandboxed applications and the compositor is granting rights from the
same level of privilege as the kernel provides to root users (in this
case, the role is almost of a hypervisor and a guest).


should we? is it right to create yet another security model in userspace
"quickly" just to solve things that dont NEED solving at least at this point.

you wouldn't recreate ffmpeg. ffmpeg produces libraries like avcodec. like a
reasonable developer we'd just use their libraries to do the encoding - we'd
capture frames and then hand off to avcodec (ffmpeg) library routines to do
the rest. ffmpeg doesnt need to know how to capture - just to do what 99%
of its code is devoted to doing - encode/decode. :) that's rather simple.
already we have decoding wrapped - we sit on top of either gstreamer, vlc
or xine as the codec engine and just glue in output and control api's and
events. encoding is just the same but in reverse. :) the encapsulation is
simple.


True, that most of the work is in the avcodec. However, there's more to
it than that. The entire command line interface of ffmpeg would be
nearly impossible to build into the compositor effectively. With ffmpeg
I can capture x, flip it, paint it sepia, add a logo to the corner, and
mux it with my microphone and a capture of the speakers (thanks,
pulseaudio) and add a subtitle track while I'm at it. Read the ffmpeg
man pages. ffmpeg-all(1) is 23,191 lines long on my terminal (that's
just the command line interface, not avcodec). There's no way in hell
all of the compositors/DEs are going to be able to fulfill all of its
use cases, nor do I think we should be trying to.

Look at things like OBS. It lets you specify detailed encoding options
and composites a scene from multiple video sources and audio sources,
as well as letting the user switch between different scenes with
configurable transitions. It even lets you embed a web browser into the
final result! All of this with a nice GUI to top it off. Again, we can't
possibly hope to effectively implement all of this in the compositor/DE,
or the features of the other software that we haven't even thought of.


adding watermarks can be done after encoding as another pass (encode in high
quality). hell watermarks can just be a WINDOW (surface) on the screen. you
don't need options. as for audio - not too hard to do along with it. just
offer to record an input device - and choose (input can be current mixed output
or a mic ... or both).

the expectation is there won't be generic tools but desktop specific ones.
the CURRENT ecosystem of tools exist because that is the way x was designed
to work. thus the state of software matches its design. wayland is
different. thus tools and ecosystem will adapt.


That expectation is misguided. I like being able to write a script to
configure my desktop layout between several presets. Here's an example -
a while ago, I used a laptop at work that could be plugged into a
docking station. I would close the lid and use external displays at my
desk. I wanted to automatically change the screen layout when I came and
went, so I wrote a script that used xrandr to do it. It detected when
there were new outputs plugged in, then disabled the laptop screen and
enabled+configured the two new screens in the correct position and
resolution. This was easy for me to configure to behave the way I wanted
because the tooling was flexible and cross-desktop. Sure, we could make
each compositor support it, but each one is going to do it differently
and in their own subtly buggy ways and with their own subset of the
total possible features and use-cases, and none of them are going to
address every possible scenario.
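
A minimal sketch of the kind of docking script described above, in Python. The output names (eDP1, DP1, DP2) and the mode are hypothetical; the xrandr options used (--output, --off, --mode, --pos) are the standard ones. The function only builds the command line and does not run it.

```python
def dock_layout_cmd(internal, externals, mode):
    """Build an xrandr command that turns the internal panel off and
    lays the external outputs side by side, left to right.

    internal, externals: output names (hypothetical, e.g. "eDP1", "DP1").
    mode: a "WIDTHxHEIGHT" string applied to every external output.
    """
    width = int(mode.split("x")[0])
    cmd = ["xrandr", "--output", internal, "--off"]
    for index, name in enumerate(externals):
        # Each external output is placed directly to the right of the previous one.
        cmd += ["--output", name, "--mode", mode, "--pos", f"{index * width}x0"]
    return cmd


if __name__ == "__main__":
    print(" ".join(dock_layout_cmd("eDP1", ["DP1", "DP2"], "1920x1080")))
```

Passing the returned list to subprocess.run() would apply the layout on an X11 session; detecting the dock event itself is left out of the sketch.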


exactly what you describe is how e works out of the box. no scripts needed.
requiring people write scripts to do their screen configuration is just wrong.
taking the position of "well i give up and won't bother and will just make my
users write scripts instead" is sticking your head in the sand and not solving
the problem. you are now asking everyone ELSE who writes a compositor to
implement a protocol because YOU wont solve a problem that others have solved
in a user friendly manner.

i've been doing x11 wm's since 1996. i've seen the bad, the ugly and the
horrible. there is no way i want any kind of protocol for configuring the
screen. not after having seen just how much it is abused when there and what a
horrible state things are left in when it's there.

as for output config - why would the desktops that already have their own
tools then want to support OTHER tools too? their tools integrate with
their settings panels and look and feel right and support THEIR policies.


Base your desktop's tools on the common protocol, of course. Gnome
settings, KDE settings, arandr, xrandr, nvidia-settings, and so on, all
seem to work fine configuring your outputs with the same protocol today.
Yes, the protocol is meh and the implementation is a mess, but the
clients of that protocol aren't bad by any stretch of the imagination.


no tools. why do it? it's built in. in order for screen config "magic" to
work a set of metadata is attached to screens. you can set priority (screens get
numbers from highest to lowest priority at any given time allowing behaviour
like your "primary" screen to migrate to an external one then migrate back when
external monitor is attached etc.) sure we can start having that metadata
separate but then ALTERNATE TOOLS won't be able to configure it thus breaking
the desktop environment not providing metadata and other settings associated
with a display. this breaks functionality for users who then complain about
things not working right AND then the compositor has to now deal with these
"error cases" too because a foreign tool will be messing with its data/setup.
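
A rough sketch of the per-screen priority idea, in Python. This is not enlightenment's code; the Screen type, its fields and the "lower number wins" rule are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Screen:
    name: str               # hypothetical screen name
    priority: int           # assumption: lower number = higher priority
    connected: bool = True


def primary_screen(screens):
    """Return the connected screen with the highest priority, or None."""
    connected = [s for s in screens if s.connected]
    return min(connected, key=lambda s: s.priority, default=None)


if __name__ == "__main__":
    screens = [Screen("internal", 2), Screen("external", 1, connected=False)]
    print(primary_screen(screens).name)   # internal panel is primary
    screens[1].connected = True           # external monitor plugged in
    print(primary_screen(screens).name)   # primary migrates to the external one
```

The point of the sketch is that "primary" is derived from metadata the compositor owns, so a foreign tool that changes outputs without updating that metadata would leave the two out of sync.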

let me give you an example:

http://devs.enlightenment.org/~raster/ssetup.png

[snip]


This is a very interesting screenshot, and I hadn't considered this. I
don't think it's an unsolvable problem, though - we can make the
protocol flexible enough to allow compositor-specific metadata to be
added and configurable. These are the sorts of requirements I want to be
gathering to design this protocol with.


as above. i have seen screen configuration used and abused over the years where
i just do not want to have a protocol for messing around with it for any
client. give them an inch and they'll take a mile.

no - we don't have to implement it as a protocol. enlightenment needs zero
protocol. it's done by the compositor. the compositors own tool is simply a
settings dialog inside the compositor itself. no protocol. not even a tool.
it's the same as edit/tools -> preferences in most gui apps. its just a
dialog the app shows to configure itself.


I currently do several things in different processes/binaries that
enlightenment does in the compositor, things like the bar and the
wallpaper. I don't want to make an output configuration GUI tool nested
into the compositor, it's out of scope.


and that's perfectly fine - that is your choice. do not force your choice on
other compositors. you can implement all the protocol you want in any way you
want for your wm's tools.

chances are gnome likely will do this via dbus (they love dbus :)). kde - i
don't know. but not everyone is implementing a wayland protocol at all so
assuming they are and saying "do it the same way" is not necessarily saving
any work.


We're all writing wayland compositors here. We may not all have dbus or
whatever else in common, but we do have the wayland protocol in common,
and it can support this use-case. It makes sense to use it.


gnome does almost everything with dbus. they love dbus. a lot of gnome is
centred around dbus. they likely will choose dbus to do this. likely. i
personally wouldn't choose to use dbus.

then intents are only a way of deciding where a surface is to be displayed -
rather than on the current desktop/screen.

so simply mark a surface as "for presentation" and the compositor will put
it on the non-internal display (chosen maybe by physical size reported in
edid as the larger one, or by elimination - its on the screen OTHER than the
internal... maybe user simply marks/checkboxes that screen as "use this
screen for presenting" and all apps that want so present get their content
there etc.)
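
A small sketch of how a compositor might resolve such an intent, in Python. The dictionary keys ("internal", "presentation") and the intent string are hypothetical; no such protocol exists in the thread, this only illustrates the selection rules listed above.

```python
def pick_output(outputs, intent, current):
    """Choose the output name for a surface.

    outputs: list of dicts with hypothetical keys
             'name', 'internal' (bool) and 'presentation' (bool, user-marked).
    intent:  hint supplied by the client, e.g. "presentation" or None.
    current: name of the output the surface would otherwise use.
    """
    if intent != "presentation":
        return current
    # 1. A screen the user explicitly marked for presenting wins.
    for output in outputs:
        if output["presentation"]:
            return output["name"]
    # 2. Otherwise pick by elimination: the screen that is not the internal one.
    for output in outputs:
        if not output["internal"]:
            return output["name"]
    # 3. Nothing suitable: stay where the surface already is.
    return current


if __name__ == "__main__":
    outputs = [
        {"name": "laptop", "internal": True, "presentation": False},
        {"name": "projector", "internal": False, "presentation": False},
    ]
    print(pick_output(outputs, "presentation", "laptop"))
```

The client only states a purpose; which screen that maps to stays a compositor decision.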


Man, this is going to get really complicated. How do you decide what
display is "internal" or not? What if the user wants to present on their


at least e already knows this. its screen management subsystem is perfectly
aware of this. :)

primary display? What about applications that use the entire output for


the app can simply not request to present on their "presentation" screen... or
the user would mark their primary screen (internal on laptop maybe) AS their
presentation screen - more metadata to be held by compositor.

now ALL presentation tools behave the same -  you dont have to reconfigure each
one separately and deal with the difference and lack or otherwise of features.
it's done in 1 place - compositor, and then all apps that want to do a
similar thing follow and work "as expected". far better than just ignoring the
issue. you yourself already talked about extra tags/hints/whatever - this is
one of those.

things other than presentations? What if the application wants to use
several outputs, and for different purposes? What language are you going
to use to describe these settings to the user in a way that makes more
sense than the clients describing for themselves why they need to use a
particular output?


because this requires clients DEFINING screen layout. wayland was specifically
designed to HIDE THIS. if the compositor displayed a screen wrapped around a
sphere in real life in a room - then it doesn't have rectangles... how will an
app deal with that? what if the compositor is literally a VR world with
surfaces wrapped around spheres and cubes - the point of wayland's design was
to hide this info from clients completely so the compositor decides based on
environment, not each and every client. this was a basic premise/design in
wayland from the get go and it was a good one. letting apps break this
abstraction breaks this design.

so what you are saying is it's better to duplicate all this logic of screen
configuration inside every app that wants to present things (media players -
play movie on presentation screen, ppt/impress/whatever show presentation
there, etc. etc.) and how to configure the screen etc. etc., rather than
have a simple tag/intent and let your de/wm/compositor "deal with it"
universally for all such apps in a consistent way?


No. Applications want to be full screen or they don't want to be. If
they want to pick a particular output, we can easily let them do so.


i don't know about you.. but fullscreen to enlightenment means you use up ONE
SCREEN. not all screens. and from user response.. they LOVE IT. it is correct.
it's the right way. so when an app asks to be fullscreen it gets to use the
screen its on - not all. so no. fullscreen does NOT mean they would want to span
all screens (you imply that) and then just draw different areas of their
massive window to correspond to screens (and control those screens,
resolutions, geometries etc.).

what makes sense is an app hints at the purpose of its window and opens n
windows (surfaces). it can ask for fullscreen for each. the hints would allow
the compositor to choose which screen the window/surface is assigned to.

Cool. Suggestions for what sort of capability this protocol should
have, what kind of surface roles we will be looking at? We should
consider a few things. Normal windows, of course, which on compositors
like Sway would be tiled. Then there's floating windows, like


ummm whats the difference between floating and normal? apps like gnome
calculator just open ... normal windows.


Gnome calculator doesn't like being tiled: https://sr.ht/Ai5N.png


i think the problem is you are not handling min/max sizing of clients
properly. :) you need to fix sway. gnome calculator is not sizing up its buffer
on surface size. that is a message "i can't be bigger than this - this is my
biggest size. deal with it". you need to deal with it. eg - pad it and make it
sized AT the buffer size :)
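
A sketch of the "pad it" approach for a tiling compositor, in Python. The function name and the choice to centre the client are assumptions; the only input taken from the discussion is that the client's buffer may be smaller than the tile it was offered.

```python
def place_buffer(tile_w, tile_h, buf_w, buf_h):
    """Place a client buffer inside a tile without stretching it.

    Returns (x, y, w, h) relative to the tile's top-left corner.
    The surface is shown at the buffer size (cropped to the tile if the
    buffer is larger) and centred; the rest of the tile is padding.
    """
    w = min(tile_w, buf_w)
    h = min(tile_h, buf_h)
    x = (tile_w - w) // 2
    y = (tile_h - h) // 2
    return x, y, w, h


if __name__ == "__main__":
    # A 960x1080 tile offered to a client that only draws 400x500.
    print(place_buffer(960, 1080, 400, 500))
```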

There are probably some other applications that would very much like to
be shown at a particular aspect ratio or resolution.


as above. buffer size tells you that.

xdg shell should be handling these already - except dmenu. dmenu is almost a
special desktop component. like a shelf/panel/bar thing.


dmenu isn't the only one, though, that may want to arrange itself in
special ways. Lemonbar and rofi also come to mind.


all of these basically are "desktop components" ala
taskbars/shelves/panels/whatever - i know that for e we don't want to support
such apps. these are built in. i don't know what gnome or kde think but these
go against their design as an integrated desktop environment. YOU need these
because your compositor has no such feature itself. the bigger desktops don't
need it. they MAY support it - may not. i know i don't want to. :)

[input is] something that many of Sway's users are asking for.


they are going to have to deal with this then. already gnome and kde and e
will all configure mouse accel/left/right mouse on their own based on
settings. yes - i can RUN xset and set it back later but its FIGHTING with
your DE. wayland is the same. use the desktop tools for this :) yes - it'll
change between compositors. :) at least in wayland you cant fight with the
compositor here. for sway - you are going to have to write this yourself.
eg - write tools that talk to sway or sway reads a cfg file you edit or
whatever. :)


I've already written this into sway, fwiw, in your config file. I think
this is fine, too, and I intend to keep supporting configuring outputs
like that. But consider the use case of Krita, or video games like Osu!


i don't know osu - but i see no reason krita needs to configure a tablet. it
can just deal with input from it. :)

However, beyond detailed input device configuration, there are some
other things that we should consider. Some applications (games, vnc,
etc) will want to capture the mouse and there should be a protocol for
them to indicate this with (perhaps again associated with special
permissions). Some applications (like Krita) may want to do things like
take control of your entire drawing tablet.


as i said. can of worms. :)


It's a can of worms we should deal with, and one that I don't think it's
hard to deal with. libinput lets you configure a handful of details
about input devices. Let's expose these things in a protocol.


input is very sensitive. having done this for years and watched how games like
to turn off key repeat then leave it off when they crash... or change mouse
accel then you find its changed everywhere and have to "fix it" etc. etc. - i'd
be loath to do this. give them TOO much config ability and it can become a
security issue.

you have no idea how many non-security-sensitive things need fixing first
before addressing the can-of-worms problems. hell nvidia just released
drivers that require compositors to re-do how they talk to egl/kms/drm to
work that's not compatible with existing drm dmabuf buffers etc. etc.


Why do those things need to be dealt with first? Sway is at a good spot
where I can start thinking about these sorts of things. There are
enough people involved to work on multiple things at once. Plus,
everyone thinks nvidia's design is bad and we're hopefully going to see
something from them that avoids vendor-specific code.


because these imho are far more important. you might be surprised at how few
people are involved.

I don't see these problems as a can of worms. I see them as problems
that are solvable and necessary to solve, and now is a good time to
solve them. My compositor is coming up on version 1.0. Supporting the
APIs is the driver's problem, we've described the spec and as soon as
they implement it, it will Just Work(tm).

even clients and decorations. tiled wm's will not want clients to add
decorations with shadows etc. - currently clients will do csd because csd is
what weston chose and gnome has followed and enlightenment too. kde do not
want to do csd. i think that's wrong.


What is a can of worms is the argument over whether or not we should use
CSD or SSD. I fall in the latter camp, but I don't think we need to
fight over it now. We should be able to agree that a protocol for
negotiating whether or not borders are drawn would be reasonable. Is it
a GTK app that does nothing interesting with its titlebar? Well, if the
compositor wants to draw its borders, then let it do so. Does it do
fancy GTK stuff with the borders? Well, no, mister compositor, I want to
do fancy things. Easy enough.


not so simple. with more of the ui of an app being moved INTO the border
(titlebar etc.) this is not a simple thing to just turn it off. you then turn
OFF necessary parts of the ui or have to push the problem out to the app to
"fallback". only having CSD solves all that complexity and is more efficient
than SSD when it comes to things like assigning hw layers or avoiding copies of
vast amounts of pixels. i was against CSD to start with too but i see their
major benefits.

of course the shadow padding area is something i do see as optional and
something to hint at that would be useful. i can't see gnome dropping CSD
especially given how integrated to the ui it's becoming. i can tell you that i'm
strongly considering going the same way and fully integrating into CSD for many
good reasons that go far beyond just a desktop.

it adds complexity to wayland just to "not follow the convention". but
for tiling i see the point of at least removing the shadows. clients
may choose to slap a title bar there still because it's useful
displaying state. but advertising this info from the compositor is not
standardized. what do you advertise to clients? where/when? at connect
time? at surface creation time? what negotiation is it? it easily
could be that 1 screen or desktop is tiled and another is not and you
dont know what to tell the client until it has created a surface and
you know where that surface would go. perhaps this might be part of a
larger set of negotiation like "i am a mobile app so please stick me
on the mobile screen" or "i'm a desktop app - desktop please" then
with the compositor saying where it decided to allocate you (no mobile
screen available - you are on desktop) and app is expected to adapt...


In Wayland you create a surface, then assign it a role. Extra details
can go in between, or go in the call that gives it a role. Right now
most applications are creating their surface and then making it a shell
surface. The compositor can negotiate based on its own internal state
over whether a given output is tiled or not, or in cases like AwesomeWM,
whether a given workspace is tiled or not. And I don't think the
decision has to be final. If the window is moved to another output or
really if any of the circumstances change, they can renegotiate and the
surface can start drawing its own decorations.
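
A sketch of that renegotiation, in Python. No such protocol exists at the time of this thread; the Surface class, the "client"/"server" mode names and the policy (compositor draws borders on tiled outputs unless the client needs its own titlebar) are all assumptions for illustration.

```python
class Surface:
    """Hypothetical surface that renegotiates who draws its decorations."""

    def __init__(self, needs_own_titlebar):
        # True for clients that put real UI into the titlebar.
        self.needs_own_titlebar = needs_own_titlebar
        self.mode = None

    def negotiate(self, output_is_tiled):
        """Pick 'client' or 'server' decorations for the current output."""
        if self.needs_own_titlebar or not output_is_tiled:
            self.mode = "client"
        else:
            self.mode = "server"
        return self.mode


if __name__ == "__main__":
    plain = Surface(needs_own_titlebar=False)
    print(plain.negotiate(output_is_tiled=True))    # compositor draws borders
    print(plain.negotiate(output_is_tiled=False))   # moved to a floating output
```

Calling negotiate() again whenever the surface changes output or workspace is what makes the decision non-final.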


yup. but this signalling/negotiation has to exist. currently it doesnt. :)

there's SIMPLE stuff like - what happens when compositor crashes? how do we
handle this? do you really want to lose all your apps when compositors
crash? what should clients do? how do we ensure clients are restored to the
same place and state? crash recovery is important because it is always what
allows updates/upgrades without losing everything. THIS stuff is still
"unsolved". i'm totally not concerned about screen casting or vnc etc. etc.
until all of these other nigglies are well solved first.


I'm still not on board with all of this "first" stuff. I don't see any
reason why we have to order ourselves like this. It all needs to get
done at some point. Right now we haven't standardized anything, and each
compositor is using its own unique, incompatible way of taking
screenshots and recording videos, and each is probably introducing some
kind of security problem.


you aren't going to talk me into implementing something that is important for
you and not a priority for e until such a time as i'm satisfied that the other
issues are solved. you are free to do what you want, but standardizing things
takes a looong time and a lot of experimentation, discussion, and repeating
this. we have resources on wayland and nothing you described is a priority for
them. there are far more important things to do that are actual business
requirements and so the people working need to prioritize what is such a
requirement as opposed to what is not. resources are not infinite and free.

apps can show their own content for their own bug reporting. for system-wide
reporting this will be DE integrated anyway. supporting video capture is a
can of worms. as i said - single buffer? multiple with metadata? who does
conversion/scaling/transforms? what is the security model? and as i said -
this has major implications to the rendering back-end of a compositor.


The compositor hands RGBA (or ARGB, whatever, I don't care, we just pick
one) data to the client that's recording. This problem doesn't have to
be complicated. As for the "major implications"...


let me complicate it for you. let's say i'm playing a video fullscreen. you now
have to convert argb to yuv then encode when it would have been far more
efficient to get access directly to the yuv buffer before it was even scaled to
screen size... :) so you have just specified a protocol that is by design
inefficient when it could be more efficient.
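
To make the extra step concrete, here is a per-pixel sketch of the argb to yuv conversion, in Python. It assumes 0xAARRGGBB packing and the full-range BT.601 (JFIF) coefficients; a real encoder path would do this on whole frames, usually on the gpu or with simd, and might use a different matrix.

```python
def _clamp(value):
    """Clamp a rounded channel value into the 0..255 range."""
    return max(0, min(255, value))


def argb_to_yuv(pixel):
    """Convert one 0xAARRGGBB pixel to full-range BT.601 (Y, U, V).

    The alpha channel is ignored.
    """
    r = (pixel >> 16) & 0xFF
    g = (pixel >> 8) & 0xFF
    b = pixel & 0xFF
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    u = round(-0.168736 * r - 0.331264 * g + 0.5 * b + 128)
    v = round(0.5 * r - 0.418688 * g - 0.081312 * b + 128)
    return _clamp(y), _clamp(u), _clamp(v)


if __name__ == "__main__":
    print(argb_to_yuv(0xFFFFFFFF))  # opaque white
    print(argb_to_yuv(0xFF000000))  # opaque black
```

This work has to be done for every pixel of every captured frame if the protocol only hands out argb, even when the source was already a yuv video buffer.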

there's a difference. when its an internal detail it can be changed and
adapted to how the compositor and its rendering subsystem work. when its a
protocol you HAVE to support THAT protocol and the way THAT protocol defines
things to work or apps break.


You STILL have to get the pixels into the encoder on the compositor
side. You will ALWAYS have to do that if you want to support video
captures, regardless of who's doing it. At some point you're going to
have to get the pixels you're rendering and hand them off to someone, be
that libavcodec or a privileged client.


yes - but when, how often and via what mechanisms pixels get there is a very
delicate thing.

We can make Wayland support use-cases that are important to our users or
we can watch them stay on xorg perpetually and end up maintaining two
graphical stacks forever.


priorities. there are other issues that should be solved first before
worrying about the pandoras box ones.


These are not pandora's box. These are small, necessary features.


i disagree. i've been doing graphics for long enough to smell the nasties from
a mile off. it's not trivial. the decisions that are made now will haunt us
for a lifetime. they are not internal details that can be fixed easily. even
internal details are hard to fix once enough code relies on them...

so far we don't exactly have a lot of inter-desktop co-operation happening.
it's pretty much everyone for themselves except for a smallish core protocol.
do NOT try and solve security sensitive AND performance sensitive AND design
limiting/dictating things first and definitely don't do it without everyone on
the same page.


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster rasterman com
