Why Johnniac Can Read
WHAT’S THE BEST WAY TO CONVERT ONE’S THOUGHTS INTO data? For millennia, the answer was writing and drawing by hand. But when computers came along, this time-honored method was supplanted by keyboards, punch cards, and other intermediaries. Today handwritten input to computers, while still far from perfect, is becoming increasingly common—for handheld devices, Asian languages with thousands of characters, design software that combines words and images, and signature verification, among other uses. As new as these applications are, however, the technology behind them goes back more than 40 years.
It began with a visionary research group at the RAND Corporation, the think tank based in Santa Monica, California. In 1964 RAND’s “digital tablet” allowed users to write with a penlike stylus and have the writing converted into electronic text. They could also draw simple graphics and edit their work. While limited in its capabilities and far too expensive for commercial use, the RAND Tablet nonetheless showed the way for the Palm Pilots and Tablet PCs of today.
The RAND Tablet was part of a larger effort called the GRAIL Project, which investigated the fundamentals of communication between people and machines. (GRAIL was short for Graphical Input Language, which the system used for image interpretation.) How could such interactions be made as seamless as possible? What would help users focus their attention on the work they were using the computer for, without being distracted by the computer itself? After noting the possibilities that had been opened up by video displays, a 1969 report said: “The flexibility of the output channel … suggests that the input channel should have the same freedom.”
The RAND Tablet was not the first device of its sort, but it was the first to be practical. In 1957 Tom Dimond of Bell Labs invented a system for handwritten input that required users to carefully draw their letters and numbers around a printed set of dots, but it never advanced beyond the prototype stage. Another system had a pen that emitted tiny sparks as it moved across the tablet. Microphones situated along the edges of the tablet picked up the sound of the sparks and used the time delays to calculate the pen’s position. Besides the obvious problems associated with emitting a large volume of sparks, the resolution was too poor to make these tablets useful for writing.
RAND’s invention, developed under the direction of Thomas O. Ellis and Malcolm Davis, was elegant by comparison. The tablet contained an underlying array of tiny printed circuits, 100 per inch in each direction. Each circuit had a digital code assigned to it. These circuits, which exploited recent advances in fine-line photo etching, were printed directly on both sides of a copper-clad Mylar sheet 1/2000 of an inch thick. The sheet was a little more than 10 inches square and had 1,024 lines in each direction, allowing the position codes to be expressed with 10 bits. The circuits were connected to the computer with extremely fine wires, one per row or column.
Touching the tablet with normal writing pressure tripped a pressure-sensitive switch in the pen, and as the pen moved across the surface, each circuit it passed over emitted a sequence of pulses identifying its position. These pulses were sensed by a device in the pen’s point and transmitted through the pen to a computer—at first a Johnniac, based on a design by the mathematician John von Neumann, and later an IBM 360/40. The computer processed the input and displayed it as lines of “ink” on a cathode-ray tube (CRT) screen.
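The arithmetic behind those position codes is worth a moment: a grid of 1,024 lines in each direction is exactly 2^10, so each axis fits in 10 bits. The short sketch below is purely illustrative; the constants come from the description above, but packing the row and column codes into a single word is an assumption for demonstration, not RAND’s actual pulse scheme.

```python
# Illustrative only: how a 1,024 x 1,024 grid at 100 lines per inch lets a
# pen position be described by a pair of 10-bit codes (20 bits in all).
LINES_PER_INCH = 100      # printed-circuit lines in each direction
LINES_PER_AXIS = 1024     # a little more than 10 inches of active area
BITS_PER_AXIS = 10        # 2**10 = 1,024 distinct codes per axis


def encode_position(col: int, row: int) -> int:
    """Pack a (column, row) pair of line indices into one 20-bit word."""
    assert 0 <= col < LINES_PER_AXIS and 0 <= row < LINES_PER_AXIS
    return (row << BITS_PER_AXIS) | col


def decode_position(word: int) -> tuple[float, float]:
    """Recover the line indices and convert them to inches from the origin."""
    col = word & (LINES_PER_AXIS - 1)
    row = word >> BITS_PER_AXIS
    return col / LINES_PER_INCH, row / LINES_PER_INCH


print(decode_position(encode_position(col=512, row=256)))  # (5.12, 2.56)
```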
When the pen was lifted, indicating that a character was finished, the computer “read” the character and converted it to a standard typeface and size (approximately 3/16 inch, or 13 points). If further input changed the character to a different one—for example, turning an O into a Q or an F into an E—the new character replaced the old one. Boxes, which were automatically squared off into neat rectangles, and other simple shapes could also be input, and text could be placed inside them. This capability made it easy to draw flow charts, which were commonly used at RAND and elsewhere for mapping out computer programs.
The RAND Tablet did require some concessions from users. They had to write in all uppercase, in a specific size, and they had to lift the pen from the surface between characters and avoid overlapping. They also had to modify some characters, putting a diagonal line through the letter O to avoid confusion with the number 0, a horizontal line through the diagonal part of a Z to distinguish it from a 2, and so forth. On the other hand, the writer could correct a character by just writing directly over it and could erase a character or even a whole line with a simple “scrubbing” motion.
At first, developers wondered if users would be able to handle the spatial disconnect between what the hand wrote on the tablet and what the eye saw on the screen. They were surprised to learn that this was not even an issue, though the temporal disconnect (the delay between a stroke of the pen and the “ink” showing up on the screen) took some getting used to.
The software that allowed the tablet to read handwriting was created by the RAND researcher Gabriel Groner. It could interpret 53 symbols: the 26 letters of the alphabet, the numerals 0 through 9, and an assortment of punctuation marks and other special characters. The program stored a pattern for each character in the form of a time series of points. As the user wrote, the computer compared each time series received from the pen with its library of characters and selected the one it most closely resembled.
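Groner’s program itself is long gone, but the idea it rested on, comparing an incoming time series of pen points against a stored library of character templates and picking the closest match, can be hedged into a short modern sketch. The resampling step, the distance measure, and the function names below are all illustrative assumptions, not a reconstruction of his code.

```python
import math

# A pen stroke is a time series of (x, y) points sampled as the user writes.
Point = tuple[float, float]


def resample(stroke: list[Point], n: int = 16) -> list[Point]:
    """Reduce a stroke to n evenly spaced samples so that strokes of
    different lengths can be compared point by point."""
    step = (len(stroke) - 1) / (n - 1)
    return [stroke[round(i * step)] for i in range(n)]


def stroke_distance(a: list[Point], b: list[Point]) -> float:
    """Total point-to-point distance between two resampled strokes."""
    return sum(math.dist(p, q) for p, q in zip(a, b))


def recognize(stroke: list[Point], templates: dict[str, list[Point]]) -> str:
    """Return the library character whose stored template the stroke
    most closely resembles."""
    sampled = resample(stroke)
    return min(templates,
               key=lambda ch: stroke_distance(sampled, resample(templates[ch])))
```

A working recognizer for all 53 of Groner’s symbols would need richer features than raw point positions, but nearest-template distance is the skeleton of the approach.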
Before making this comparison, the computer condensed and smoothed the data by averaging groups of points. This step was the result of Groner’s work with neural nets, which had taught him that more data does not necessarily lead to better results. In interpreting the curvature of a line, for example, he used only four directions (up, down, left, and right) instead of the eight that many other computer-graphics researchers were using at the time. This reduced curves to zigzags, but it also made the computer’s analysis a lot simpler.
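The two data-reduction tricks described here, averaging groups of raw points and collapsing pen movement into just four directions, are easy to mimic. In the sketch below the group size and the direction labels are illustrative choices, not Groner’s actual parameters.

```python
Point = tuple[float, float]


def smooth(points: list[Point], group: int = 4) -> list[Point]:
    """Condense the raw pen samples by averaging each group of points."""
    condensed = []
    for i in range(0, len(points), group):
        chunk = points[i:i + group]
        condensed.append((sum(x for x, _ in chunk) / len(chunk),
                          sum(y for _, y in chunk) / len(chunk)))
    return condensed


def four_directions(points: list[Point]) -> list[str]:
    """Label each move between consecutive samples with whichever of the
    four directions (up, down, left, right) dominates it, turning smooth
    curves into simple zigzag descriptions."""
    labels = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if abs(dx) >= abs(dy):
            labels.append("right" if dx >= 0 else "left")
        else:
            labels.append("up" if dy > 0 else "down")
    return labels
```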
The RAND Tablet’s success showed that direct input of handwritten text and hand-drawn images was feasible, but it was hardly practical. As Willis Ware, a central force in RAND’s early computer research, pointed out recently, the biggest problem was the absence of a hugely popular and commercially successful “killer application” that would sell thousands or millions of units. In the mainframe era, when computers were big and expensive and usually operated by a few specialists, no such application was possible.
Into the mid-1970s researchers in various laboratories continued to experiment with handwritten input. Some bought RAND Tablets and adapted them to their systems, while others developed tablets or other schemes of their own. Notable research efforts took place at Sylvania and at MIT’s Lincoln Laboratory, among other places. As people grew accustomed to computers, however, keyboarding came to feel just as “natural” to many users as writing. For this reason, as well as the difficulty of reading natural handwriting in all its complexity, research in the field experienced a lull in the late 1970s and early 1980s. Until recently, pen-oriented approaches to computer interface were notoriously unsuccessful in the commercial marketplace. But the current proliferation of handheld devices, which were inconceivable when a typical computer was the size of several refrigerators, has led to a revival.
Leonard Kitainik, a developer of Apple’s ill-fated Newton handwriting-recognition project and now the general manager of a company called Pen & Internet, insists that the technology remains essential because handwriting is “a powerful part of the culture.” Today’s pen-and-tablet input devices use a variety of methods to go from pen stroke to screen image, some similar to RAND’s grid of wires and some using different principles. Penlike devices currently in development can input text and graphics using plain paper or any other surface, without a special pad.
Character-recognition algorithms have progressed greatly as well. With varying degrees of success, programs now interpret printing or script using a combination of methods, such as comparing the input with standard letter shapes, consulting a prior sample of the subject’s writing, looking up doubtful words in a dictionary, and using linguistic clues that suggest which letters, words, and prefixes or suffixes make the most sense in the context.
These developments could not have occurred without enormous increases in computing power and the present-day ubiquity of computers, which have made mass production of tablets possible (RAND Tablets were mostly handmade and cost $18,000 apiece to build). Gabriel Groner, who has worked chiefly in voice-recognition research since his work on the RAND Tablet, concedes that “the RAND Tablet depended on hardware that was not economical.” But after four decades, he says, “I keep being surprised that the world is still catching up.”