On the Importance of On-Screen Keyboards

This is a guest post by Dorota, developer on Purism’s mobile team.

A “Continental Standard” QWERTY typewriter keyboard. Picture from Wikipedia

The role of keyboards cannot be overstated. They originated long before computers, and survive in the smartphone era. Millions of people text their friends by tapping away on their shiny pocket computers using the venerable QWERTY layout dating back to 1873.

It is hard to imagine a phone without a way to enter text. Some of us are dreaming about Minority Report-style gesturing, but the Librem 5 continues the keyboard tradition.

Over the past several weeks, I have become the team’s keyboard specialist. I spent the time making sure that all the necessary pieces fit together nicely, and proving that with a working prototype. The prototype can be seen in the video below. It was made using rootston (wlroots), GTK and weston-keyboard—remember those names, because they will be important later!

The task took me on an interesting and educational journey. The Wayland train took me via input methods to Asia, through protocols, to FLOSS communities. I will try to describe my story for you.

“How hard can it be, really?”

On the surface of it, a keyboard is a simple device. It is an assortment of buttons attached to a computer. Press a button, and the computer performs a corresponding action. Placing a keyboard on a screen should not be too difficult: just set aside a portion of the screen for keys. Tap: action. Does it sound like your phone’s screen keyboard though? It does not appear on the screen at all times. And it corrects mistyped words. Surely, there are other differences?

In fact, on-screen keyboards are rarely considered only keyboards in the computing landscape. They are an example of a wider category, called “input methods”. Input methods can have more responsibilities than just letting someone press keys. I could use one for correcting a typo I just made, or choose the correct spelling from the dictionary, find alternative words, or even type whole phrases I use a lot, like “ahoy matey!” In Asian languages, an input method can be practically required in addition to a keyboard!

Typing in Japanese using the Anthy input method. Picture from Wikipedia

While input methods in general give users a way to enter text, there is a great variety in how they can do it. They can pop up selections, suggest corrections, present phrases, recognize handwriting, or even construct a user interface that differs almost entirely from the traditional keyboard to help people with disabilities (like Dasher does).
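To make the difference between a plain keyboard and an input method more concrete, here is a minimal sketch in Python. It is only an illustration: the class name ToyInputMethod, the tiny correction table and the "ahm" shortcut are all made up for this example, and real input methods are far more sophisticated. The idea it shows is that the input method holds on to the word being typed and only hands over text once it has had a chance to fix it up.

```python
from dataclasses import dataclass, field


@dataclass
class ToyInputMethod:
    """Hypothetical input method: buffers a word, then commits a fixed-up version."""

    # Both tables are invented for this example.
    corrections: dict = field(default_factory=lambda: {"teh": "the", "freind": "friend"})
    phrases: dict = field(default_factory=lambda: {"ahm": "ahoy matey!"})
    preedit: str = ""    # the word still being composed
    committed: str = ""  # text already handed over to the application

    def press(self, key: str) -> None:
        """Handle one tapped key."""
        if key == " ":
            self._commit_word()
            self.committed += " "
        else:
            self.preedit += key

    def finish(self) -> str:
        """Commit whatever is still being composed and return all the text."""
        self._commit_word()
        return self.committed

    def _commit_word(self) -> None:
        word = self.preedit
        # Phrase shortcuts win over typo fixes; unknown words pass through unchanged.
        word = self.phrases.get(word, self.corrections.get(word, word))
        self.committed += word
        self.preedit = ""


def type_text(keys: str) -> str:
    """Feed a string to a fresh input method, one key at a time."""
    im = ToyInputMethod()
    for key in keys:
        im.press(key)
    return im.finish()
```

With these made-up tables, typing "teh freind says ahm" comes out as "the friend says ahoy matey!". A plain keyboard would have passed every key straight through, typos included.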

Enter Wayland

For me, though, the basic task was to see what the story looks like for plain old English text entered by tapping buttons on a screen. The Librem 5 will be using native Wayland applications exclusively (thanks to GTK3 and Qt5), and that is important when designing the keyboard experience. Wayland deals with human interface devices, like keyboards, mice, touch screens, and input methods. A Wayland compositor acts as an intermediary between the devices and applications. It defines how applications share those devices among themselves. Imagine what a mess it would be if every application listened to your keyboard at the same time!


Here’s where the technical part starts. Wayland consists of a set of protocols. There is one, wl_keyboard, defining how applications listen to keyboard presses. Similarly, another one (“text-input”) defines how applications listen to our input method. Yet another, called “input-method”, defines how the input method app communicates with the compositor. This gives us 3 classes of software that need to work together to make on-screen keyboards possible: applications, compositors and keyboards themselves.
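The three roles can be sketched as a toy model in Python. This is not real Wayland code: the class and method names below are invented for illustration, and the actual protocols carry much more state (serials, surrounding text, pre-edit strings and so on). The sketch only shows who talks to whom, and that the compositor forwards input to the focused application alone.

```python
class App:
    """Stand-in for an application with a text field."""

    def __init__(self, name):
        self.name = name
        self.text = ""

    def on_key(self, key):
        # Roughly the role wl_keyboard plays: raw key presses reach the app.
        self.text += key

    def on_commit(self, string):
        # Roughly the role "text-input" plays: finished text reaches the app.
        self.text += string


class Compositor:
    """Intermediary: forwards input only to the focused application."""

    def __init__(self):
        self.focused = None

    def focus(self, app):
        self.focused = app

    def hardware_key(self, key):
        if self.focused is not None:
            self.focused.on_key(key)

    def input_method_commit(self, string):
        # Roughly the role "input-method" plays: the keyboard app talks
        # to the compositor, never directly to the application.
        if self.focused is not None:
            self.focused.on_commit(string)


class OnScreenKeyboard:
    """Stand-in for the input method application."""

    def __init__(self, compositor):
        self.compositor = compositor

    def tap(self, label):
        self.compositor.input_method_commit(label)


compositor = Compositor()
chat, notes = App("chat"), App("notes")
keyboard = OnScreenKeyboard(compositor)

compositor.focus(chat)
for label in "hi":
    keyboard.tap(label)      # on-screen taps go to the focused chat app

compositor.focus(notes)
compositor.hardware_key("x")  # a physical key goes to notes, not to chat

print(chat.text, notes.text)
```

Note that the on-screen keyboard never learns which application is focused. That decision stays in the compositor, which is what keeps applications from all listening to the same input at once.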

As a first impulse, I looked around to see how well supported they are in existing projects. The centerpiece here was the wlroots project, upon which we’re building Librem 5’s compositor. It didn’t support either protocol, and my intention to implement them was met with warm approval from the maintainers. Not everything was so rosy though. The only available keyboard implementing “input-method” is “weston-keyboard”, which is arguably not designed for a phone. The “text-input” situation is even more interesting. See, I lied a bit: there are really 3 protocols being used for this purpose. Qt and KWin mostly settled on “_v2”, and GTK with gnome-shell use “gtk_input_method” (as a side note, the maliit project uses “zwp_text_input_unstable_v1”). That’s bad news, because every compositor would have to support more than one of them in order to communicate text input to both Qt and GTK apps.

The reasonable way to solve this would be to convince everyone to standardize on a single protocol. Before diving in the deep end of the community politics pool, I decided to get my hands dirty to get a better view of the details.

The work

I grabbed rootston, GTK 3.22, and weston-keyboard and set off implementing the necessary support in wlroots. Soon after the first code landed, “gtk_input_method” was submitted to become the new official “zwp_text_input_unstable_v3”. Meanwhile, I submitted a pull request to wlroots. The submission sparked a long series of discussions with wlroots developers. The conclusions were twofold: “text_input_v3” was very close to what the community needed, and “input-method” was half-abandoned, incomplete and duplicating the functionality of other protocols. In short, both protocols needed some work.

The good news is that the discussions were very in-depth and involved people from other projects, clarifying what needs to be done to fix them. At the moment, I’m working on new proposals for both of them, which will be subject to relentless scrutiny, after which they will hopefully be clean enough for an official blessing and inclusion in wayland-protocols.

Having good protocols that do their jobs should convince projects like GTK, Qt and Maliit to standardize on a single way of doing things and avoid unnecessary duplication of work.

You may ask: “You already have a working implementation, why do you worry about other projects?” We at the Librem 5 team realize how important a place this project will take in the Wayland world. Once the phone is released, it will be the reference for other projects to follow. Splitting off and doing things our way would lock some projects to the Librem, while others would follow the rest of the world. Doing things right and in lockstep will make interoperability between Librem phones and everyone else effortless, and make things work well for everyone.

What’s next?

The work on protocols will continue, and their implementations will be submitted for upstreaming into crucial libraries. At some point, weston-keyboard will have to go or get radically transformed, meaning more work for the design team. The protocols should be very close to standardized by the time the development boards find themselves in the hands of the public. I am very curious what input methods you can come up with: the ideas I’ve heard during discussions nearly melted my brain! And I hope that people from Asia will share their experience with input methods 😉

— Dorota