Starting with the earliest TabletPC enhancements to Windows, we have
been working on “on-screen keyboards.” With Windows 8, we started fresh
and took a "first principles" approach to developing the touch keyboard.
Given the amount of experience many of us have with touch keyboards for
phones, and the myriad of touch devices we interact with these days, we
set a very high bar for the quality of the experience and effectiveness of input with the new Windows 8 touch keyboard. In this post, Kip Knox, a member of the Windows User Experience program management team, details this work.
When we began planning how touch and new types of PCs might work on
Windows 8, we recognized the need to provide an effective method for
text entry on tablets and other touch screen PCs. Since Windows XP SP1,
which had Tablet PC features built in, Windows has included a touchable
on-screen keyboard. But those features were designed as extensions to
the desktop experience. For Windows 8, we set out to improve on that
model and introduce text input support that meets people’s needs,
matches our design principles, and works well with the form factors we
see today and expect to see in the future.
I’m writing this blog post on our Windows 8 touch keyboard using the
standard QWERTY layout in English. As I look at it, the keyboard seems
very simple and sort of obvious. This comes partly from having worked on
it for a while, but also because keyboards are familiar to us. But
there is more here than meets the eye (or, fingertips).
We started planning this feature area with no preconceived notions.
As we do with all our features, we began the text input design project
with a set of principles or goals. On a Windows 8 PC using touch, we
want people to be able to:
- Enter text quickly, reasonably close to the speed with which they type on a physical keyboard
- Avoid errors, and be able to easily correct mistakes
- Enter text comfortably, in terms of posture, interaction with the device, and social setting
You might note that none of those goals explicitly assumes a
keyboard. And when we started the project, we cast a broad net across
possible approaches to text input. We found that of all the methods of
text input we considered, none met the goals above as well as a
keyboard. The majority of people are simply faster, more accurate, and
more comfortable typing than they are writing any other way. Windows has
highly accurate handwriting recognition in several languages, as well
as advanced speech recognition, for example. But without a great touch
keyboard, we were not going to be able to fulfill people’s needs and
expectations for touch-screen devices running Windows. So we set out to
create the best touch keyboard on any device.
Optimizing for comfort and posture
There are many ways to imagine touch keyboards on a tablet, and we
sketched a lot of them—large keyboards, tiny keyboards, floating
keyboards, circular keyboards, swipe keyboards. But our initial design
process was grounded in research we did into the ways that people
interact with tablets. Our researchers conducted an in-depth study in
which they observed people “living with” tablets over a period of time.
Through these observations and interviews, we saw a set of three
postures that are most common among people using tablets:
- One hand holding the device, with one hand interacting with the user interface
- Two hands holding the device, with thumbs interacting
- Resting the device on a table, lap, or stand, and interacting with both hands
In these postures, people felt most natural and most likely to use
the tablet for longer periods of time. We’ve made many design decisions
in Windows 8 to optimize for these postures, and that includes how
people intuitively input text. When typing on a tablet, most people
either set it on their lap or a table and multi-finger type, or hold it
in their hands and type with their thumbs, or hold it with one hand and
“hunt and peck.”
Our standard touch keyboard layout is optimized for laying the tablet
down and multi-finger typing, and also works well for typing with one
hand. We also introduced a new layout we call the thumb keyboard (which
we showed for the first time at our very first preview of Windows 8
about a year ago), which is designed for holding the tablet with two
hands and typing with your thumbs. This keyboard is adjustable in size,
to accommodate different hand sizes. An interesting observation from our
posture research is that people frequently switch postures, and that
posture switch is often seen as a positive thing, as we move about to
remain comfortable. So in our keyboard layouts we also considered what
it would be like to type for a period of time—say, an email to your
mom—and switch postures while you do it. You might start by typing with
the tablet lying on the coffee table, for example, but then you might
tire of that posture and pick up the tablet, lie back on the couch, and
interact with two thumbs.
Further research into posture and comfort helped us to understand how
people hold tablets, and how far our thumbs typically reach. In a
follow-up study, we had a wide selection of people with different hand
sizes use a tablet with sensors that would indicate where their thumbs
could reach most comfortably, where they could extend to, and where
reach was just uncomfortable. These results helped us optimize the use
of the system with thumbs, and helped shape the thumb keyboard layout.
Typing on glass
The next challenge we considered was the experience of typing on the
glass display of a tablet. At least one of the key postures—laying the
tablet down—is analogous to typing on a physical keyboard. So unlike
typing text on a phone, we were faced with direct comparisons with the
physical keyboard experience. When you type on your laptop or desktop,
you enjoy some real benefits. You get a lot of sensory feedback as you
type. First, you can position your hands quickly on your home keys, and
most keyboards have small bumps on the J and F keys (in English QWERTY
keyboards) to confirm that position. Then, as you type, the shape of the
keys reinforces where your fingers are as they move about. The keys
have “travel,” or small up-and-down movement, which confirms that you
struck them. And because the keyboard is mechanical, there is a tapping
sound that confirms your key strikes (perhaps to your chagrin, if your
colleagues are checking email during meetings ☺).
If you lay down a piece of glass and type on it, you get no feedback;
there is no indication for where to position your hands, and there is
no indication of whether you’ve hit a target or not. Recognizing this,
we made a few decisions. We needed to provide some type of feedback, and
we needed to recognize that people will be more “sloppy” when typing on
a touch keyboard. But we also observed that a touch keyboard can do
things that a physical keyboard can’t, and we should bring those
functions out.
The feedback you see in the touch keyboard comes in two forms—the
keys change color when you touch them, and they trigger a subtle sound.
This is similar to what you see on most phone touch keyboards. We
considered other forms of feedback, but ruled them out as too disruptive
or unnatural. For example, we explored haptic feedback (a vibration of
the device based on input) which you also find on many phones. But most
people find the current state-of-the-art haptics somewhat irritating
when typing pieces of any length and a buzz can feel as much like a
punishment as a reassurance.
Our two forms of feedback—visual key changes and sounds—are not
without controversy either. Visual key changes are not always ideal when
you are entering a password, for example, and for that reason we enable
you to suppress feedback in these cases. Some people have argued that
key press sounds are irritating and artificial. But user testing
confirmed our assumption that people clearly find the sounds reassuring
and confidence-inspiring when typing on glass. The specific sounds we
use (which are very similar to those on the Windows Phone) are designed
to be “residual,” where you quickly forget that they are there, but
would notice if they were turned off.
Both forms of feedback may be used more when people are first getting
used to the experience. We have done eye-tracking studies in the lab,
which showed that as people become more proficient with the touch
keyboard, they spend more time looking at the input field, and less time
looking at the keyboard itself. So the appearance of each character
becomes the best feedback when you are typing efficiently. I’ll tell you
a little more about these eye-tracking studies later in this post.
But even when you “get good” at typing on a touch keyboard on glass,
you will still be sloppier and slower than you would be with a physical
keyboard. The Windows 8 touch keyboard has some special accommodations
to address this reality. The most interesting one is what we call the
“touch model.”
When you tap a key on the touch keyboard, we detect the coordinates
of your touch, and we can map it to the geometry of the keys. But as
your fingers move about across the glass, your press is likely to
migrate outside the boundaries of the key you intended to touch. If we
relied simply on the geometry mapping of the keys, you would see a lot
of errors. To account for this, the key press is first compared against a
model that assesses the likelihood that you intended to strike that key
or a key near it. This processing is informed by two things. First, we
use data from many people typing pangrams (phrases that use
every letter of the alphabet), recording trends where people bias their
touch away from the intended target. For example, they might intend to
type a p, but often strike the o, because most people’s
fingers curve inward. Based on a set of characteristics, including
typing speed, the model weights the likelihood that you intended to type
one key over another. Secondly, we use lexical data representing
letters and words that are likely to be strung together in writing. This
is the same system that enables spelling correction—the system “knows”
what you probably intended to type even if you made a mistake.
Based on the touch model, the keyboard is often able to quietly correct cases where you intended to type a p for example, but inadvertently struck the o, on a QWERTY layout. Or consider the example where you are typing the word “the.” If you type t then h and then touch between the e and w but slightly more on the w, the touch model adjudicates this, knows that t-h-e is the common character combination in English rather than t-h-w, and appropriately outputs the e. But if you touch the w
fully, the keyboard respects that input and assumes you know best. This
all happens while you are typing, so the right character goes into the
input field and doesn’t require further correction. When this works
best, you don’t realize it’s even happening, increasing your confidence
in typing on glass.
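To make the idea concrete, here is a minimal sketch of how a touch model of this general kind could combine where you touched with what you were likely to type. It is an illustration only, not the actual Windows 8 implementation: the key coordinates, the Gaussian spread `SIGMA_MM`, and the letter probabilities in `NEXT_AFTER_TH` are all made-up assumptions, and the real model also weighs characteristics such as typing speed.

```python
import math

# Hypothetical key centres in millimetres for part of the top row,
# laid out on a 19 mm pitch (the key width mentioned in this post).
KEY_CENTERS = {"q": (9.5, 9.5), "w": (28.5, 9.5), "e": (47.5, 9.5), "r": (66.5, 9.5)}

# Hypothetical lexical prior: probability of each letter following "th".
# These numbers are invented for illustration.
NEXT_AFTER_TH = {"e": 0.60, "r": 0.05, "w": 0.01, "q": 0.001}

SIGMA_MM = 6.0  # assumed spread of touches around the intended key centre


def geometric_key(touch):
    """Pure geometry: pick the key whose centre is nearest the touch point."""
    return min(KEY_CENTERS, key=lambda k: math.dist(touch, KEY_CENTERS[k]))


def spatial_likelihood(touch, key):
    """Unnormalised Gaussian likelihood that `key` was the intended target."""
    d = math.dist(touch, KEY_CENTERS[key])
    return math.exp(-(d * d) / (2 * SIGMA_MM ** 2))


def resolve_key(touch, prior):
    """Weight the spatial likelihood by the lexical prior and pick the best key."""
    return max(KEY_CENTERS, key=lambda k: spatial_likelihood(touch, k) * prior.get(k, 1e-6))


between_w_and_e = (36.0, 9.5)   # slightly on the w side of the w/e boundary
fully_on_w = (28.5, 9.5)        # dead centre of the w key

print(geometric_key(between_w_and_e))               # geometry alone says "w"
print(resolve_key(between_w_and_e, NEXT_AFTER_TH))  # the model says "e"
print(resolve_key(fully_on_w, NEXT_AFTER_TH))       # a clear hit on "w" is respected
```

With these assumed numbers, an ambiguous touch after t-h resolves to e, while a touch squarely on w still produces w, which mirrors the behavior described above.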
typing
Once we accounted for feedback and provided “guard rails” for
inevitable mistakes, we still had to determine the specific keyboard
layouts—what keys go where. Key positions have a big influence over
typing speed and accuracy, and people have very strong—and often
conflicting—opinions about keys. But the design problem broke down
logically, based on our observations of interaction and some physical
realities. For example, we confirmed our assumptions that:
- Most people have developed very strong habits based on the conventions of physical keyboards. When you break these conventions, it slows their typing down appreciably. This even applies to very young folks or dedicated T9 typists, for example, as most of us learn to touch-type in some form at a young age.
- There are optimal targetable sizes of keys. The extensive research Microsoft has done into physical keyboards applied here too. For example, the letter keys on our touch keyboard are 19mm wide, the same as on most physical keyboards, because people showed faster typing speeds with targets of that size (rather than smaller or larger).
- The more keys you include, the more likely people are to make mistakes. This is partly because more keys mean the keys need to be smaller and there’s a greater likelihood of hitting a key you didn’t intend. More keys also create visual clutter and distraction and slow your ability to scan and find a key.
- You don’t want to obscure more than half the display with a keyboard. A too-large keyboard creates a claustrophobic experience and you lose context. However, there is a counter rule that says obscuring about half the display works fine. This is because entering text is most often a “modal” activity, where your focus is very much on typing something and not on the periphery. Your area of focus outside the keyboard is relatively small, and directed toward the characters you’re typing. Our eye-tracking studies, illustrated in this post, demonstrate this.
- People use some keys more than others. We deduce this from analyzing passages of text written in real-world circumstances. There are clear patterns of frequency in the use of letters and symbols.
- People will learn to do new things—and learn quickly—if they don’t interfere with habits.
So in the end, the layout of a touch keyboard in any language becomes
a balancing act of the different factors. You want to reduce the number
of keys in the default layout, for example, but if you remove a key
people rely on in typing every day, you will frustrate them. The layout
needs to be big enough to support accuracy, but not so big it obscures
the application.
There was one more overall rule or principle that we applied to the keyboard layouts specifically: They must be great for typing.
That seems obvious but it’s clarifying when you recognize that
keyboards are used for a lot of things other than writing
words—shortcuts to UI, for example, or sending commands, or entering
codes. Our keyboard is optimized for typing, because that is its primary
purpose and it must do it well above all other things. Let’s take a
look at a few of the decisions we made that fit within these parameters.
Numbers
We get a lot of questions about why we don’t include a number row in
the default keyboard layout. We use numbers frequently in our jobs, and
we’re used to finding number keys on the top of our physical keyboard.
The Windows 7 on-screen keyboard has a number row, for example. This is
consistent with the overall design of that keyboard—it is essentially a
software emulation of a physical keyboard. It has not been optimized for
a world of touch.
Some of our early designs and prototypes had a number row too. But
when we brought these designs in front of people, the feedback was
strong that the keyboard felt “cramped” compared to what they were used
to. We observed frequent errors and accidental invocation of keys,
especially around the perimeter of the layout. This resulted in a number
of changes, and it confirmed the decision to not include a number row.
Here’s why: Including a number row meant adding a fourth row of
character keys. When we optimize for keys with a targetable size, that
means the keyboard must be that much higher. On a typical tablet device
(say with a screen size of 10.6 inches) adding a number row would mean
that more than half of the display would be covered by the keyboard.
When we combined this with the observation that numbers are typed less
frequently than most letters and common symbols, and you recognize that
the extra keys are causing accidental key presses, we settled on
including numbers on the separate number and symbol view.
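The arithmetic behind that trade-off can be sketched roughly as follows. Two inputs here are assumptions and not figures from our design: a 16:9 landscape display, and a row height of 15 mm (this post only states the 19 mm key width). The default layout is counted as four rows, three of letters plus the bottom row.

```python
import math

MM_PER_INCH = 25.4


def display_height_mm(diagonal_in, aspect_w=16, aspect_h=9):
    """Height of a landscape display, derived from its diagonal and aspect ratio."""
    return diagonal_in * aspect_h / math.hypot(aspect_w, aspect_h) * MM_PER_INCH


def keyboard_fraction(rows, row_height_mm, diagonal_in):
    """Fraction of the display height covered by a keyboard with `rows` rows."""
    return rows * row_height_mm / display_height_mm(diagonal_in)


ROW_HEIGHT_MM = 15.0  # assumed row height, for illustration only

default_layout = keyboard_fraction(4, ROW_HEIGHT_MM, 10.6)   # three letter rows + bottom row
with_number_row = keyboard_fraction(5, ROW_HEIGHT_MM, 10.6)  # same, plus a number row

print(f"default layout:  {default_layout:.0%} of the display")
print(f"with number row: {with_number_row:.0%} of the display")
```

Under these assumptions the default layout stays under half of the display and the extra row pushes it over. Different row heights shift the exact numbers, but each added row costs the same fixed slice of a small screen.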
That settled, we still had debates about whether to display numbers
as a row across the top of the numbers and symbols view, or to display
it as a numeric pad. We chose the numeric pad for a few reasons:
- People often enter multiple numbers at once.
- It’s easier to scan an organized group than a long row.
- People type number sequences much faster when the numbers are clustered.
We also decided to include the numbers in 1,2,3 order from the top,
rather than 7,8,9, as it appears on many extended computer keyboards or
cash registers. This is an interesting case where the physical keyboard
convention didn’t matter as much, because people have become familiar
and very comfortable with the order of number pads on phones, ATMs,
remote controls, and other modern devices. 1,2,3 order is simply easier
for the eyes to scan and the brain to process than any other order.
Tab key
The tab key has a similar story. It’s a key we use a lot—for
formatting documents, but also for things like navigating input fields
on a webpage. For that reason, we included it in one of our early
touch-optimized layouts, after we had removed a lot of other keys
typically found on physical keyboards. It looked like this.
You might observe that on the right and the left, there are borders
of keys that aren’t letters or symbols. This layout yielded the results
described above—people experienced a cramped feeling. And worse than
that, they frequently missed character keys and inadvertently touched
one of the border keys. When we removed them, people raved about the
openness and comfort of the layout, their errors went down, and their
speed went up. With the Tab key on the numbers and symbols view, it was
harder to reach—but the keyboard was better for typing, and so the Tab
key’s peregrinations were over.
Downshift: a mistake to learn from
The last example we’ll share involves a feature we had in the product
and have subsequently cut. This is a feature inspired by our desire to
make punctuation easier to get to, without a complete view switch. In
this design, the left shift key acted as the shift key does today—it
enabled capital letters and access to alternate symbols from the default
view. We used the right shift key differently—it provided a “peek” into
frequently-used symbols or punctuation. The idea was that you would
“downshift” briefly to select punctuation, for example, but not lose the
context of the main view, and thus be faster. We theorized that this
was a place where we could deviate from convention and provide value you
could only get with software. Here’s a picture of the “downshift”
keyboard.
Suffice it to say this prototype did not succeed in the lab.
Participants continually struck the right shift key for the usual
reasons you’d use a shift key. And when the keyboard showed the “peek”
to symbols, they were confused and their typing came to a halt. So this
was a case where we had to stick with the convention of a physical
keyboard.
There is an interesting counter example in press-and-hold behavior.
On a physical keyboard, when you press and hold a character, it repeats.
On our touch keyboard when you press and hold, we show alternate
characters or symbols. This is something a touch keyboard can do well
and a physical keyboard can’t. If you don’t know the specific key
combination to show ñ or é or š, for example, it’s painful to type on a
physical keyboard. It’s easy to find on the touch keyboard. Practically
no one has complained about this departure from convention. We built on
it, in fact. You might discover that you can simply swipe from a key in
the direction of the secondary key, and that character will be entered,
without an explicit selection from the menu. So if you use accented
characters a lot, you can get pretty fast with this. Try it out!
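One way to picture the swipe shortcut is as a lookup from a base key and a swipe direction to an alternate character. The sketch below is hypothetical: the `ALTERNATES` table, the direction each character sits in, and the `SWIPE_THRESHOLD_MM` value are invented for illustration and do not describe the shipping layout.

```python
import math

# Hypothetical alternates, keyed by the direction they sit in relative to the base key.
ALTERNATES = {
    "n": {"up": "ñ"},
    "e": {"up": "é", "left": "è", "right": "ê"},
    "s": {"up": "š"},
}

SWIPE_THRESHOLD_MM = 5.0  # assumed minimum travel before a touch counts as a swipe


def direction(dx, dy):
    """Classify a displacement into one of four directions (screen y grows downward)."""
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


def resolve_gesture(key, dx, dy):
    """Return the character entered for a touch that started on `key` and moved (dx, dy) mm."""
    if math.hypot(dx, dy) < SWIPE_THRESHOLD_MM:
        return key  # too short to be a swipe: treat it as a plain tap
    # Fall back to the base key when no alternate lives in that direction.
    return ALTERNATES.get(key, {}).get(direction(dx, dy), key)


print(resolve_gesture("n", 0, -10))  # swipe up from n
print(resolve_gesture("e", -12, 2))  # swipe left from e
print(resolve_gesture("n", 1, 0))    # barely moved: plain n
```

Falling back to the base key when nothing matches keeps the gesture forgiving: a stray swipe never produces a character you did not ask for.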
Testing and validating
We’ve been conducting a series of eye-tracking studies, where cameras
record the direction of the participants’ gaze as they are interacting
with the system. These studies help us determine a few things: Where do
people look when typing on a touch keyboard? Does visual gaze change
over time? Are these patterns consistent across different views or
layouts? And is visual gaze correlated to speed of typing?
We’ve found very consistently that people primarily look at the text
field where their characters appear, and they look at the keyboard. This
is so consistent that we designed our text suggestion experience to
optimize for this tendency. Text suggestions (words that are predicted
as you type) appear right by the cursor in the text field, and you
insert them by touching the “Insert” key on the touch keyboard. This is
optimized for where we saw people putting their attention as they typed.
It is notably different, for example, from text suggestion UI you see
on many phones, where there is a band of possible words that run across
the top of the keyboard. On a PC with a full-sized keyboard, people just
don’t look there, and they don’t want to stop typing and change their
posture to select these words.
We also found that our gaze does change over time, and as the gaze
changes, we type faster. You can see this very clearly in the gaze plots
of the eye-tracking studies. A full range of people show this
tendency—from slow typists unfamiliar with tablets to skilled typists
who spend a lot of time with tablets. In all cases, at first, there is
more attention on the keyboard, and the speed is slower. Over time—say,
about 90 minutes over a few days—there is markedly less attention paid
to the keyboard, more to the text field, and words per minute go up
significantly.
Continued refinement
Lastly, below is a picture of the current English QWERTY layout,
which we have in the Windows 8 Release Preview. It is intentionally
spare and open, and the keys that remain are there for explicit reasons.
Each of these has its own story, but we can call out a few highlights:
- The backspace key is there because it’s used very frequently on physical keyboards and touch keyboards. If we removed it, you would find your finger groping for it repeatedly.
- The mode switch key is essential to moving between views and languages and for hiding the keyboard. IME users will find that this is how you switch to Windows IMEs, which also feature touch-optimized keyboard layouts.
- The CTRL key and the right and left arrow keys are intended for text editing operations. You can move your input cursor and cut, copy, and paste without moving your hands from the keyboard. (Note that the CTRL key works just as it does on a physical keyboard, so any supported combination will work. We include labels for things like cut, copy, paste, and bold, because they are related to text editing.) The touch keyboard is not intended for “commanding,” which is why you don’t see things like the Windows key or function keys. That is a deliberate decision to stay focused on the goal of being really great for typing.
- The space bar is centered and wide. Physical keyboard research shows that about 80% of strikes on the space bar occur on the right (if you look at older keyboards, you will notice the wear on that side). This holds for touch keyboards too, where people will miss the spacebar if it’s not ample-sized, and this creates errors that are hard to recover from.
- The “emoji” or emoticon key switches you to emoji view, where we support a full set of Unicode-based emoji characters. The use of emoji continues to grow worldwide, and has become a part of how people write and express themselves.
- We also include an option for a standard keyboard layout, which can be useful on a PC without a keyboard when using desktop software that requires function keys or other extended keys. This is easily enabled from the settings Charm, in the General Settings section of PC Settings.
As you use the keyboard, we hope you also discover some extra
features we’ve added to make things easier. For example, if you hold
down the &123 key, you can select symbols or numbers with
your other hand, and when you release, you return to your original view.
The team calls this “multi-touch view peek.”
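The peek behavior can be modeled as a very small state machine. This is a sketch under assumptions, not our implementation: the `ViewPeek` class and its method names are hypothetical, and the rule that a plain tap on &123 (with no key typed while it is held) leaves you in the symbols view is an assumption added to make the example complete.

```python
class ViewPeek:
    """Toy model of holding the &123 key to peek at the numbers and symbols view."""

    def __init__(self):
        self.view = "letters"
        self._return_to = None           # view to restore when the hold ends
        self._typed_during_hold = False

    def mode_key_down(self):
        """The &123 key is pressed: show symbols and remember where we came from."""
        self._return_to = self.view
        self.view = "symbols"
        self._typed_during_hold = False

    def key_tap(self, char):
        """Another finger taps a key, possibly while &123 is still held."""
        if self._return_to is not None:
            self._typed_during_hold = True
        return char

    def mode_key_up(self):
        """The &123 key is released: return to the original view if this was a peek."""
        if self._typed_during_hold:
            self.view = self._return_to
        self._return_to = None


kb = ViewPeek()
kb.mode_key_down()   # hold &123 with one hand
kb.key_tap("7")      # type a number with the other
kb.mode_key_up()     # release
print(kb.view)       # back in the letters view
```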
These optimizations apply across the input languages we have in Windows,
as we support a touch-optimized typing experience worldwide. We expect
to make a few more improvements to the typing experience, and we are
really grateful and delighted by the feedback we’ve received so far.
Thanks!