September 26, 2002 4:00 AM PDT

Microsoft looks to decipher your scribblings

News analysis: Microsoft is looking for a better way to make sense of your chicken scratch.

The current version of its Tablet PC operating system, which lets computer users control their machines with a penlike device instead of just a mouse and keyboard, checks people's scrawls against the company's own database of handwriting samples. The idea is to ensure that the unlimited variety of T's that human hands can produce will, in fact, be recognized as T's.

Even though the software, due to appear in devices Nov. 7, can try to make a match with the many different types of handwriting in its database, it can't adapt to an individual user's style of writing and learn, for example, that the person never crosses his or her T's.

That's been a source of some contention within the company. Chairman Bill Gates, who is also chief software architect, is pushing to let the operating system begin to recognize an individual's handwriting quirks and expand the database on that user's machine. But others within Microsoft say such a capability could do more harm than good, and the company would be better off just updating the operating system's master database periodically.

"There's a big debate about it," said Jeff Raikes, Microsoft group vice president. "On the one hand, it seems obvious, (but) there is a question of if you (let users) put new samples into (their) database, are you going to improve the recognition, or are you going to degrade the recognition?"

By maintaining central control of the database, Raikes said, Microsoft is able to judge whether a particular addition actually improves the accuracy of the handwriting recognition--an ability the company loses if the software adapts on an individual basis.

Another issue on the table is how strictly to adhere to the software's internal dictionary, the collection of words it uses to identify what someone has written. The current version is good at recognizing words it already knows but has trouble extrapolating when it encounters others, such as names. The question is whether to allow the OS to throw out the dictionary in these cases and focus instead on individual letters and combinations of letters.
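To make the trade-off concrete, here is a minimal sketch of the idea, not Microsoft's actual code. It assumes a hypothetical recognizer that reports, for each handwritten letter, a set of candidate characters with probabilities. The names DICTIONARY, INK, word_score and recognize are invented for illustration. On the first try the sketch picks the best-scoring dictionary word; on a retry it ignores the dictionary and keeps the likeliest letter in each position.

```python
# Hypothetical sketch: these names and numbers are illustrative assumptions,
# not part of the Tablet PC software.
from math import prod

DICTIONARY = {"cat", "car", "cot"}

# Assumed recognizer output for the handwritten name "Cal":
# one dict of candidate letters and probabilities per position.
INK = [
    {"c": 0.9, "e": 0.1},
    {"a": 0.8, "o": 0.2},
    {"l": 0.6, "t": 0.3, "r": 0.1},
]

def word_score(word, letter_probs):
    """Product of per-position letter probabilities; 0 if lengths differ."""
    if len(word) != len(letter_probs):
        return 0.0
    return prod(p.get(ch, 0.0) for ch, p in zip(word, letter_probs))

def recognize(letter_probs, attempt=1):
    """First attempt: best dictionary word. Later attempts: best letters."""
    if attempt == 1:
        best = max(DICTIONARY, key=lambda w: word_score(w, letter_probs))
        if word_score(best, letter_probs) > 0:
            return best
    # Relaxed mode: ignore the dictionary, take the likeliest letter per position.
    return "".join(max(p, key=p.get) for p in letter_probs)

print(recognize(INK, attempt=1))  # dictionary-bound guess
print(recognize(INK, attempt=2))  # letter-by-letter guess
```

With this made-up input, the dictionary-bound pass settles on "cat" because "cal" is not a word it knows, while the relaxed pass returns "cal". The cost of relaxing is that the letter-by-letter pass can also produce nonsense when the ink is sloppy, which is why the dictionary is there in the first place.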

"If I ink something and (the OS) doesn't correctly recognize it, we should relax the dictionary relationship on my second attempt," Raikes said. "I think a large percentage of the time where it didn't get it, is because something I was inking wasn't in the dictionary."

The software giant hopes to resolve the various issues at a meeting in December, Raikes said.

Search for a Rosetta Stone
Microsoft is not the first to grapple with the question of how best to deal with handwriting recognition. Earlier pen-based devices have taken various approaches to making sense of what people write.

Apple Computer's 1990s-era pen-based handheld, the Newton, included both handwriting-recognition software and a program that scanned block letters. The initial effort wasn't that good, though, and made Apple the butt of several jokes, including a memorable Doonesbury cartoon in which the gadget produced ludicrous translations. Although fans say the technology improved greatly in subsequent generations, the enduring impression in consumers' minds was that handwriting recognition was not ready for prime time.

Although the Newton was canceled, Apple has continued to work with the underlying handwriting recognition, which has been re-dubbed Inkwell and is alive again as part of the latest version of the Mac OS X operating system.

One of the greatest commercial successes in the field has been Graffiti, a specialized sort of handwriting developed by Palm founder Jeff Hawkins.

Graffiti is based on block English letters, but with some tweaks designed to make the letters easier to recognize and to minimize the number of strokes required to enter each one. The language requires some learning, but it lets the relatively minimal processor included in Palm's handhelds do a pretty decent job of recognizing letters.

Despite the initial popularity of Graffiti, Hawkins' current company, Handspring, is largely moving away from pen-based input in favor of tiny keyboards, an indication of the challenges that remain for handwriting recognition.

Microsoft has itself used a variety of handwriting-recognition methods in its handheld computers. The current generation of Pocket PC-based devices supports Graffiti, as well as another type of character recognition and a cruder version of the handwriting recognition built into Tablet PC.

Poor marks in penmanship
Peter Glaskowsky, Microprocessor Report's editor-in-chief and an avid Newton fan who still uses one of Apple's later Newton devices, said today's computers should be able to do better than the discontinued handheld, but often don't.

Handwriting software should take greater advantage of the processing power of today's machines, much as the best voice-recognition software does, he said.

"My feeling is that they have to both learn and be capable of excellent recognition without learning," Glaskowsky said.

But people who want a computer to do a good job of recognizing their handwriting should be prepared to adapt as well.

"I wasn't very effective with the Newton until I changed a few things about how I write," Glaskowsky said. "There is a give and take. It can learn something from you, but you can learn something from it."

Still, where you are in the world has a lot to do with whether you find today's handwriting recognition adequate, IBM researcher Jay Subrahmonia said. In a good number of Asian languages, it takes many strokes of a keyboard to enter a single character, a fact that makes the quirks of recognition software more palatable.

"People (writing Asian languages) are willing to write more neatly," Subrahmonia said.

Handwriting recognition is also better suited to specialized tasks, such as filling out forms, than it is to general note-taking. The software can be made more accurate when it has a better idea what type of answer to expect.

IBM strongly believes that creating a standard for digitized handwriting could allow the whole industry to collaborate better and improve recognition. The company is actively pushing a format known as InkXML.

"Different engines could work together," Subrahmonia said. "The field as a whole would benefit."

One conclusion that many companies, including Microsoft, haven't ruled out is that maybe it's best not to try to convert handwritten notes at all.

"People will learn that a lot of the value is just having the ink," said Microsoft's Raikes, referring to digitized handwritten notes that haven't been converted to text. Raikes said he has his last three months' worth of such notes stored on his Tablet PC-based device.

IBM, which has shipped handwriting recognition software with many products--including its discontinued TransNote laptop--said its thinking has also evolved. Initially, the company shipped software focused on learning and recognizing everything that was written. More recently, the company has focused on understanding just a few key words to index and archive written documents, Subrahmonia said.

"Our whole thinking shifted to keeping ink as ink," Subrahmonia said.

While the debate continues over how to improve recognition--or, indeed, whether to bother with it at all--Microsoft is unified in its efforts to tackle multiple languages.

The initial version of the Tablet PC operating system will recognize writing in English, French, German, Korean, Japanese and both simplified and complex Chinese. That leaves it blind to all others, including Spanish, one of the world's most common languages, spoken by close to half a billion people.

"We are aggressively investing in more languages now for future releases," said Alexandra Loeb, general manager of Microsoft's Tablet PC effort.

 
