Discoverability in the Age of Touchscreens

When I was first getting started with computers, in the late 1970s, user interfaces looked like this:

VisiCalc, the first spreadsheet program, required users to learn commands. Image: Wikipedia (https://en.wikipedia.org/wiki/VisiCalc)

Getting the computer to do anything required learning arcane incantations and typing them into a command line. While some — such as LOAD and LIST — were common English words, others weren’t. Moving to a new computer often meant learning new sets of incantations.

As with all software, operating systems and applications based on command line interfaces implement conceptual models. However, these conceptual models are not obvious; the user must learn them either by trial and error — entering arbitrary (and potentially destructive) commands — or reading the manual. (An activity so despised it’s acquired a vulgar acronym: RTFM.) Studying manuals is time-consuming, cognitively taxing, and somewhat scary. This held back the potential of computers for a long time.

In the mid-1980s, an important innovation came along: the WIMP (windows, icons, menus, and pointers) paradigm. Instead of requiring that the user memorize text commands, WIMP interfaces list out the commands on the screen. The conceptual model is laid out in navigation hierarchies (menus, toolbars) that allow the user to discover the application’s functionality without having to guess or RTFM. The Macintosh wasn’t the first computer to implement a WIMP user interface, but it was the first to popularize it. As a result, it’s remembered as a major milestone towards popularizing computers — and it was.

An early version of Microsoft Word for the Mac. The user can see commands laid out in menus and button bars. Image: betalogue.com (http://www.betalogue.com/2011/05/24/word51-nostalgia/)

Modern touchscreen-based interfaces, such as the iPhone’s, are widely thought to be the next step towards making computers easier to use. When Steve Jobs introduced the iPhone in 2007, he presented the phone’s touchscreen-based UI as an advance comparable to the WIMP paradigm, and it was. The ability to reach out to the screen with your fingers removes one layer of abstraction from the interaction. With a touchscreen, you no longer rely on manipulating information with a pointer on the screen. Instead of having to use your finger to move a mouse on the table to move a pointer on the screen, your finger is the pointer.

Touchscreen-based UIs have undoubtedly made computers easier to use. This ability to “reach out and touch” information has fostered user interfaces that simulate artifacts we’re accustomed to dealing with in physical space. The result: more people use personal computers today than ever before. (In the form of smartphones, mostly.)

GarageBand on iOS. Image: Ask.audio (https://ask.audio/articles/record-and-edit-your-own-samples-in-garageband-for-ipad)

That said, this interaction paradigm has started becoming more complicated. Fingers are less precise than mouse-driven pointers. That, combined with the fact most touchscreen-based devices are relatively small, has led to a UI paradigm that relies on hidden variations of the basic action of touching the screen.

For example, the first version of the iPhone’s operating system didn’t allow copying and pasting. In WIMP-based UIs, this basic task can be accomplished by three means:

  1. by selecting commands from a dropdown menu,
  2. by clicking on a button, or
  3. by pressing the right mouse button to bring up a contextual menu.

Smartphone screens are much too small to accommodate so much “interface”; designers want users to focus on content, not on buttons. As a result, many basic global OS tasks such as copy-and-paste are not visually exposed in the UI. (In iOS, you copy and paste by tapping the object you want to copy and holding it there until a selection bar appears. But you mustn’t press too hard, lest you activate the system’s “3D Touch” feature. The iPad, with its bigger screen, does have copy-paste affordances, but only in some apps. It’s getting complicated!)

Apple presents the most recent iPhone, the iPhone X, as a vision of the future of the platform. One of the major “innovations” of this new phone is that its screen goes all the way to the edges of the device. This decision required that the team do away with one of the most distinctive (and useful) features all iPhones have had up to now: the Home button. While it’s somewhat prosaic, this button is a very important part of what makes the iPhone so easy to use. Wherever you are in the system at any given moment, pressing Home allows you to get back to the beginning. It’s a lifejacket for new users of the system.

So what does the new iPhone replace this button with? A “gesture”: you swipe your thumb up from the bottom of the display. I put the word “gesture” in quotation marks because gestures are supposed to be obvious movements we make with our bodies to express meaning. Swiping up from the bottom of the screen is not obvious and expresses no inherent meaning; it’s something we must learn to do, much like we learned to type LOAD into those command lines of yore.

But, you may ask, we needed to learn about the Home button as well. What’s different? The difference is the number of interactions that now depend on such gestures. Over the past ten years, touchscreen-based pocket computers have become more capable, and designers have resisted the urge to add more buttons — physical or otherwise — to their UIs. As a result, there are now a lot of gestures to learn if you want to get the most out of your computer. (The Home button itself has become overloaded with gestures: pressing once quickly does one thing, pressing and holding another, double-pressing another, triple-pressing still another, and so on.)

Here’s a non-exhaustive list of basic iOS interactions that require non-obvious gestures:

  • Seeing the list of notifications
  • Deleting apps
  • Rearranging apps
  • Switching apps
  • Searching
  • Copying/cutting content
  • Undo
  • Going to Control Center
  • And now, returning to Springboard (iOS’s “Home” screen)

It’s not that learning these things is difficult; all it takes is a few minutes for someone to show you. Still, you must have them shown; you can’t discover them otherwise. (Except accidentally, which can be frustrating.) Once you know what they are, you must also remember them. (Was it swipe up, or down? Do I swipe down from the area to the left of the notch or the area to the right?) The user interface itself offers no hints that these are things you can do with the system.

To make matters worse, some of these gestures only work in some parts of the environment and not others. For example, undo in iOS (triggered by shaking the phone) works when entering content in some applications, such as Mail.app, but won’t work when rearranging apps on the home screen. System search (triggered by swiping down from the middle of the screen) only works on the home screen and in some apps, but not others. And so it goes. (This is not just an issue of discoverability, but also of accessibility. I have no direct experience with this, but can only wonder: How do users with motor control difficulties deal with these gestural interfaces?)

To complicate things even further, each major operating system is developing its own vocabulary of gestures. An iOS user switching to Android will find some gestures familiar, but many others unfamiliar. I spent almost two years using a Galaxy Note II, and could never remember how to take screenshots with the device. Windows also implements its own set of touchscreen gestures that are different from those in iOS and Android. (I recall my first experience of using Windows 8: the OS welcomed me by showing me instructions for the various swipes that were required for basic operations such as switching apps.)

In short, this new interaction paradigm, which holds so much promise, is becoming more difficult to use. Gestures are not discoverable; they must be learned. We can eventually develop the muscle memory to use them without much thought, but the process is less obvious than it was during the WIMP days. A macOS user could develop competence by simply interacting with the system. Not so with modern touchscreen UIs; external help is required.

An aphorism attributed to Albert Einstein says “everything should be made as simple as possible, but not simpler.” We are nowhere near the bad old days of command-line-driven interfaces; today’s phones and tablets are much easier to use than computers of the 1970s. That said, the drive to “simplify” applications and operating systems by removing interaction elements that expose their conceptual models is starting to tilt things back in the opposite direction. Having to learn a non-obvious gesture to do something as essential as going back home is a sign that we’ve crossed the “not simpler” boundary.