Designing for Voice: The next phase of UX design

February 14, 2018

The next wave of user experience (UX) has arrived: ambient computing. Enter a room, tell Alexa to turn on the light; ask Google Home what your meetings are for the day. No keys, no clicks, just voice. Watching the Alexa v. Google Home smackdown at the Consumer Electronics Show (CES) this year, it struck me that the real innovation wasn't the technology – it was the interaction, how we will interact with our devices. While the CES show floor tends to be filled with hardware, it was clear to me that UX design has moved from the early web, to mobile, to the next stage: voice computing.

Companies are going to fail at their first attempt at voice

In some respects, “voice-first” is overtaking “mobile-first”. The reality for many organizations is that voice is now a viable channel through which they can deliver their services. Right now, entertainment is leading the voice space, but with the launch of Alexa for Business and voice assistants being integrated into many environments, voice is diffusing across industries and technologies.
The interesting point, however, is that many organizations’ first attempts at voice will fail for the same reasons the early ‘jumps’ to mobile failed – because the temptation is to simply port from one channel or medium to another. Failures will happen because users don’t have the same expectations for a voice user interface (UI) that they do for screens. Voice interaction (i.e., dialog) is already something users are experts at, so when they can’t do something they are not going to blame themselves – they will blame the interface.

Voice interaction is also about what people can remember and produce accurately, rather than what they can recognize and control on a screen. The challenge for organizations is to design a natural-feeling voice interaction that adapts with the customer.


Considerations when designing for voice

We find that while natural language understanding has gotten remarkably better, people still talk to machines differently than they do to other people. Users treat voice assistants as command-driven systems, issuing imperative statements (e.g., “Turn on the light”). With command-driven interaction, the design is a linear path. However, most human dialogs are not like that; they are give and take – turn-taking with a set of implicit rules that machines do not know.
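To make the contrast concrete, here is a minimal sketch – all function names and phrasings are my own illustration, not any assistant’s actual API – of a linear command path versus a turn-taking exchange where the system asks back before acting:

```python
def handle_command(utterance: str) -> str:
    """Linear, command-driven path: one imperative in, one action out."""
    if utterance == "turn on the light":
        return "OK, light on."
    return "Sorry, I can't do that."


def turn_taking_dialog(turns):
    """Turn-taking sketch: the system may take a turn to ask a
    clarifying question before it can act."""
    state = {}
    replies = []
    for utterance in turns:
        if "light" in utterance and "room" not in state:
            # Not enough information yet -- take a turn and ask back
            replies.append("Which room?")
            state["pending"] = "room"
        elif state.get("pending") == "room":
            state["room"] = utterance
            state.pop("pending")
            replies.append(f"OK, turning on the {utterance} light.")
    return replies
```

The second version already needs state and implicit rules (when to ask, when to act) that the linear path never deals with – which is exactly where naive command-driven designs fall short.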

There is also ambiguity and variability in dialogs. With a drop-down list, the items are pretty much the same every time. But if Alexa responded with the same phrases every time, it would be very artificial – you would know you’re talking to a machine. For instance, when I ask Alexa to add something to the shopping list sometimes she says, “Carrots added to your shopping list” and sometimes she says, “I’ve put carrots on your shopping list”. These little variances are more indicative of human response and make for a richer interaction.
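A common way to get that kind of variance is to pick randomly from a small pool of response templates. A hedged sketch in Python (the template wording is invented for illustration):

```python
import random

# Pool of equivalent confirmations; varying the surface form makes the
# assistant sound less mechanical than repeating one phrase verbatim.
CONFIRMATIONS = [
    "{item} added to your shopping list.",
    "I've put {item} on your shopping list.",
    "OK, {item} is on the list.",
]


def confirm_added(item: str) -> str:
    """Return one of several equivalent confirmation phrasings."""
    return random.choice(CONFIRMATIONS).format(item=item)
```

Real assistants do something far more sophisticated, but even this trivial variation avoids the “talking to a machine” tell of identical responses.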

This is where an understanding of how dialogs work is critical. We must retain information from prior parts of the conversation, know the context (in linguistic terms ‘pragmatics’) and know the goal of the dialog.
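As an illustration of retaining prior context, here is a toy sketch – the `DialogContext` class and its rules are hypothetical – where a follow-up pronoun like “it” is resolved against what the previous turn was about:

```python
class DialogContext:
    """Toy context store: remembers what the last turn talked about."""

    def __init__(self):
        self.last_entity = None  # e.g., "thermostat"

    def interpret(self, utterance: str) -> str:
        words = utterance.lower().split()
        if "thermostat" in words:
            self.last_entity = "thermostat"
            return "Thermostat set."
        if "it" in words and self.last_entity:
            # Pragmatics: resolve the pronoun from the prior turn
            return f"Adjusting the {self.last_entity}."
        return "What would you like me to adjust?"
```

Without the retained `last_entity`, “make it warmer” is unanswerable – which is the point: dialog design is state design.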

The whole point is to design the voice interaction so it is frictionless. Back to the notion of ambient computing – I don’t need glasses or even light or fingers to interact. I should be able to just ask Alexa to turn up the thermostat and it’s done. But, of course, it’s not that easy – it requires a lot of hard work to design the voice UI.


The next stage in UX design

The current market uptake, and the splash that voice tech made at CES, reflects that the technology is ready for the next wave of UX design. Organizations should start simply and answer the question, “What can I build that adds value to customers?” Why do people call the call center? Why do people send emails? Unpack the most frequent things customers do, and design voice services around those. Then look at where interactions go wrong: design the error conditions, the input conditions and better output conditions; work out how to provide feedback; and understand when the quantity of information is sufficient before worrying about its quality.
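One of those error conditions – recognition failures – is commonly handled with escalating reprompts: a gentle re-ask first, then a hint at what the user can say, then a fallback. A sketch under those assumptions (the prompt wording and function name are my own invention):

```python
# Escalating reprompt ladder: each consecutive failure gets a more
# helpful prompt, ending in a fallback such as a human handoff.
REPROMPTS = [
    "Sorry, I didn't catch that. What would you like to do?",
    "You can say things like 'check my balance' or 'pay a bill'.",
    "Let me transfer you to an agent.",
]


def next_reprompt(failure_count: int) -> str:
    """Return the reprompt for the given number of consecutive
    failures, capped at the final fallback."""
    return REPROMPTS[min(failure_count, len(REPROMPTS) - 1)]
```

The design choice worth noting is the cap: past a couple of failures, repeating the same error prompt only adds friction, so the ladder ends in an escape hatch rather than a loop.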

Getting to a frictionless voice UI does not require a radically different kind of UX research from other UIs: it is still about the people, the environment and the tasks. But UX researchers do need to understand the idiosyncrasies of the voice channel. What we learn, and how those insights are translated into a voice interface design, is the new challenge for UX design.