Wayland was designed from the point of view of theoretical purists. It's basically "how would a display server work in an ideal world". Unfortunately, that design also turns out to be impractical and straight-up hostile to developers and users.
I would at least like to understand the idea of 'pureness' this API tries to aspire to.
It's definitely not Unix-like, since file handles and writes and epoll, and mmap for IPC are nowhere to be found. Instead you have 'objects' with these lifecycle methods that create/release resources (probably committing the design sin of having these for things which should be pure data, like descriptors).
What's with these XML headers? It's standard UNIX practice to ship a C header that declares a library's API, which a makefile can then just consume. There's a standard way of supplying, finding and consuming them. Even binding generators are probably more comfortable with C headers than with this XML thing.
And what's with the callbacks for everything, like screen resolution queries? In Win32, you can do it with a single synchronous API call that returns a struct that has all the info. It's not like you have to touch the disk or network to get this. In cases where you do, you usually have a call that dispatches a message to another window (which you can also dispatch yourself), and you have to listen to the response.
I did some X11 programming as part of work, and it's entirely reasonable and conventional compared to this, much more like Win32 (even maybe a bit more pleasant, but I'm no expert on it).
The API sounds awful (and I've had ChatGPT generate me some example programs, and it's somehow even worse than the author describes), and not only that, the requirement that 'everything be an object', with chains and trees of objects being created, introduces a huge source of bugs and bookkeeping performance overhead on the application side.
Yes, you do have to do something like this with some things under Windows, but the reason for this is that these objects have duplicates in the Windows kernel.
But here it looks like this is just to satisfy the sensibilities of the designer.
Honestly this sounds like the most epic case of NIH syndrome. Like these guys wanted to write their own OS and userland and break with existing conventions.
Not impossible, it just needs to be implemented at a different layer. The compositor needs to expose some API for global hotkeys. For example, I found this with ~2 minutes of Googling: https://wayland.app/protocols/hyprland-global-shortcuts-v1
Not only that. A11y is also quite hard. Tools that are simple to implement thanks to good a11y APIs (for example rcmd or homerow on macOS) are super hard to do in Wayland.
wdotool exists, and global hotkeys are a thing under Wayland, but support is desktop-dependent. KDE allows it by default; GNOME can be made to do it as well with an extension.
I agree that the lack of standardization around the "insecure" things is a bad idea. Insecure operations don't have to be available by default, or even universally supported, but a central registry of interfaces for e.g. retrieving all windows on a desktop would certainly help prevent fragmentation.
At the same time, most of this post really is just a rant essentially saying that a low-level library is so flexible that using it directly results in code so verbose it can hardly be read. Yes, that's how good low-level designs always are.
You can turn a generic portable asynchronous ANSI C interface into a simple, blocking and platform-specific one with an abstraction layer. You can integrate it with all sorts of existing event loops and programming frameworks. You can customize it all you like but using it directly in an application will cost you a lot of patience. At the same time, you can't go in the opposite direction; from a "simple" blocking black-box interface to something that can reasonably host a complex GUI toolkit. If you're after simplicity, go higher-level.
Seems like complaining about how difficult to use Win32 and COM are. And they are if you use them directly! You don't do that - you use libraries that others have sweated over, as you did with raylib.
>and I still don't know what's the difference between them (wl_display_roundtrip() & wl_display_dispatch()) and in what order to call them on
I struggled with this initially as well; it's pretty poorly explained in the docs. Short explanation:
The wayland-client library implements queues over the socket. So to get it, you have to think about when the socket is read from and written to, and when the queues are pulled from or pushed to.
There is always a default queue, but for example EGL+OpenGL creates its own queue, which makes it even more confusing.
- wl_display_dispatch_pending() only dispatches messages already in the default queue to their callbacks
- wl_display_dispatch() also tries to do a blocking read on the socket if no messages are in the queue
- quite recently wl_display_dispatch_queue_timeout() was finally added, so you can do a non-blocking read from the socket. Earlier you had to hack the function yourself
- wl_display_flush() writes the enqueued outgoing messages to the socket
- wl_display_roundtrip() sends a ping message (wl_display.sync) and does a blocking wait for the response. The purpose is that you also send all enqueued requests and receive and process all responses. For example, during init you call it to create the registry and enumerate the objects, and you call it a second time to enumerate further protocol objects that got registered in the registry callback, such as the seat
- eglSwapBuffers() operates on its own queue, but reading from the socket also enqueues to the default queue, so you should always call wl_display_dispatch_pending() (on the default queue) afterwards
There is also a way to get around being stuck in eglSwapBuffers() during window inhibition: disable the blocking with eglSwapInterval(0) and use the wl_surface_frame() callback, and you get notified in the callback when you can redraw and swap again. But you can't do blocking reads with wl_display_dispatch() anymore; you have to use the timeout variant.
After using it this way, you can also easily manage multiple vsynced windows independently on the same thread, and even use wayland socket in epoll event loop. None of this is documented of course.
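To make the "queues over a socket" picture concrete, here is a toy model in plain POSIX C. It is not libwayland and does not speak the Wayland wire protocol: every `toy_*` name is hypothetical, messages are single bytes, and the peer end of a `socketpair()` stands in for the compositor. It only tries to show the distinction described above between queuing a request, flushing it to the socket, reading events off the socket, and dispatching them to callbacks.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

#define TOY_CAP 64

/* Hypothetical stand-in for a display connection: one socket, two queues. */
struct toy_conn {
    int fd;
    unsigned char out[TOY_CAP]; size_t out_len; /* requests not yet written   */
    unsigned char in[TOY_CAP];  size_t in_len;  /* events read, not dispatched */
};

typedef void (*toy_handler)(unsigned char event, void *data);

/* Create a connected pair; *peer_fd plays the role of the compositor. */
static int toy_open_pair(struct toy_conn *c, int *peer_fd) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    *c = (struct toy_conn){ .fd = sv[0] };
    *peer_fd = sv[1];
    return 0;
}

/* Queue a request locally; nothing touches the socket yet. */
static int toy_request(struct toy_conn *c, unsigned char op) {
    if (c->out_len == TOY_CAP) return -1;
    c->out[c->out_len++] = op;
    return 0;
}

/* Rough analogue of wl_display_flush(): write queued requests to the socket. */
static int toy_flush(struct toy_conn *c) {
    size_t off = 0;
    while (off < c->out_len) {
        ssize_t n = write(c->fd, c->out + off, c->out_len - off);
        if (n < 0) return -1;
        off += (size_t)n;
    }
    c->out_len = 0;
    return 0;
}

/* Wait up to timeout_ms for socket data and append it to the event queue.
 * No callbacks run here. Returns events read, 0 on timeout, -1 on error. */
static int toy_read_events(struct toy_conn *c, int timeout_ms) {
    struct pollfd pfd = { .fd = c->fd, .events = POLLIN };
    int r = poll(&pfd, 1, timeout_ms);
    if (r <= 0) return r;
    if (c->in_len == TOY_CAP) return -1; /* toy queue is full */
    ssize_t n = read(c->fd, c->in + c->in_len, TOY_CAP - c->in_len);
    if (n < 0) return -1;
    c->in_len += (size_t)n;
    return (int)n;
}

/* Rough analogue of wl_display_dispatch_pending(): run the callback for
 * events already queued; never touches the socket. Returns events handled. */
static int toy_dispatch_pending(struct toy_conn *c, toy_handler cb, void *data) {
    assert(c != NULL && cb != NULL);
    int handled = (int)c->in_len;
    for (size_t i = 0; i < c->in_len; i++) cb(c->in[i], data);
    c->in_len = 0;
    return handled;
}

/* Example handler: counts events in the int that data points to. */
static void toy_count(unsigned char event, void *data) {
    (void)event;
    ++*(int *)data;
}
```

In this model, a blocking "dispatch" would be `toy_read_events(c, -1)` followed by `toy_dispatch_pending()`, and the timeout variant is the same thing with a finite timeout. Because reading and dispatching are separate steps, the socket fd can sit in a poll/epoll loop next to other fds.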
The clipboard interface is definitely compromised a bit by being shared with drag-and-drop events, but it's not that complicated. There is also a pitfall: when you copy-paste within your own application and don't use any async event loop, you can deadlock, because you are expected to write and read on the same file descriptor at the same time.
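The self-paste pitfall can be sketched without any Wayland code, assuming the transfer goes through an ordinary pipe whose two ends both sit in the same process. A blocking `write()` of more data than the pipe buffer holds would never return, because the only reader is the same thread. The sketch below (the `self_transfer` and `set_nonblocking` names are made up for illustration) avoids that by making both ends non-blocking and interleaving writes and reads under `poll()`.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    return flags < 0 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Copy len bytes from src to dst through a pipe whose both ends we own.
 * Writes and reads are interleaved so neither side can stall the other.
 * Returns 0 on success, -1 on error. */
static int self_transfer(const char *src, char *dst, size_t len) {
    assert(src != NULL && dst != NULL);
    int fds[2];
    if (pipe(fds) != 0) return -1;
    int rfd = fds[0], wfd = fds[1];
    size_t sent = 0, got = 0;
    int ok = set_nonblocking(rfd) == 0 && set_nonblocking(wfd) == 0;

    while (ok && got < len) {
        struct pollfd p[2] = {
            { .fd = rfd, .events = POLLIN },
            { .fd = wfd, .events = POLLOUT }, /* poll ignores a negative fd */
        };
        if (poll(p, 2, -1) < 0) {
            if (errno == EINTR) continue;
            ok = 0;
            break;
        }
        if (wfd >= 0 && (p[1].revents & POLLOUT)) {
            /* May be a partial write; we simply continue next iteration. */
            ssize_t n = write(wfd, src + sent, len - sent);
            if (n > 0) sent += (size_t)n;
            else if (n < 0 && errno != EAGAIN && errno != EINTR) ok = 0;
            if (sent == len) { close(wfd); wfd = -1; } /* done writing */
        }
        if (p[0].revents & POLLIN) {
            ssize_t n = read(rfd, dst + got, len - got);
            if (n > 0) got += (size_t)n;
            else if (n < 0 && errno != EAGAIN && errno != EINTR) ok = 0;
        }
    }
    if (wfd >= 0) close(wfd);
    close(rfd);
    return (ok && got == len) ? 0 : -1;
}
```

In a real client the two halves would live in the application's event loop rather than in one helper, but the idea is the same: never block on one end of the transfer while you are also responsible for servicing the other.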
The separate process for clipboard: yep... I'm having to do this to be able to get the cursor position myself in Wayland... (This is for a screen recorder app)
I'd like to see some code to understand what it takes to write a functioning Wayland application, a bit like David Rosenthal did in his paper "A Simple X11 Client Program -or- How hard can it really be to write ‘Hello, World’?" (USENIX 1988 Winter Proceedings).
Anyway, I was already persuaded that Wayland has a rather backwards design (here are my reasons: https://news.ycombinator.com/item?id=47477083); now I have confirmation that its philosophy is something like "put surfaces on the screen and distribute events to the clients, all the other stuff is not my business", and that exploring alternative approaches to window management is still worth it.
Having applications that manage all their resources (canvases, events, decorations) is not bad per se (for example video games), but not all of them need to.
I have used quite a bit of Gtk and Qt, and have had to touch X11 or Wayland very little directly, EXCEPT for one case where I wanted to provide a global hotkey...
As a user, I like wayland. X11 was a security disaster. Wayland is much better about tearing.
What scares me though are all the responsibilities passed to compositors, because what ends up happening is that each compositor may reimplement what should be common functionality in annoying ways. This is especially true for input things, like key remapping. This ultimately fragments Linux desktop experiences even more than before.
Reminds me somewhat of Vulkan. I think the trend of making the actual specification of something lower level and less convenient is rather logical. Why burden implementers with a load of convenience functions when that could be left up to libraries?
I sidestep by using neovim as my environment for pretty much everything and you can bridge the SPICE virtio clipboard channel to Wayland. You can get clipboard sharing to work natively on wlroots compositors.
https://github.com/ReimuNotMoe/ydotool
The extra security means many automation tasks need to be done as extensions at the compositor level, making this even worse.
> how would a display server work in an ideal world
When designed by committee.
With conflicting interests.
And Veto Powers.
> Make easy things easy. Make hard things doable.
is generally unachievable. Instead, pick one:
- easy things easy, hard things impossible
- easy things tedious, hard things possible
(Unless you want to maintain two sets of interfaces in parallel.)
https://www.youtube.com/watch?v=HMKaM3FdsgY
No you don't need to reinvent the wheel thank you.
It's getting a bit boring, especially since none really does more than complain.
The API feels like a hardcore OOP/C++ developer's first C interface.
So I feel your pain. I did hear programming for Wayland is harder than X11, but I never did either so I have no idea if that is true.
wlroots in their googling! You’re not supposed to be solving these issues yourself.
It satisfies the requirement to "make easy things easy, make hard things doable" and it also gets you cross platform support.