He's still saying that. In an interview with CNET News.com, Gates talks about some of the ways that speech recognition has already made inroads and discusses some of the places it will eventually go.
Following the launch of Microsoft's new corporate telephony software, Gates discussed why the business phone remained the same for so long and how much it can change once it is made part of the same network as the PC. Gates also talked about the possibilities of touch-screen computing, noting how popular the idea of multitouch has been, both on Microsoft's tabletop computer, Surface, and on the iPhone.
Although he plans to shift to part-time work at Microsoft next year, Gates has said he will keep a few key projects under his purview and suggested the natural language interface push is one he'll probably keep working on. Search and the future of Office are also on the short list.
Q: When did you really first see the possibilities of voice? Was there a real early demo you saw years ago that sort of--you saw it and could really see the possibilities?
Gates: Well, certainly the idea that computers should deal with voice has been around a long time. It's kind of a natural way to communicate. In the 1970s, DARPA was funding people, including people at Harvard, to do speech recognition. And so people kind of thought, hey, this should be easy to do. The dream of computers understanding voice goes way back. And the dream that the data network and the voice network would be one and the same goes way back as well.
Microsoft early on took it that, hey, the magic of software would come to bear on both--not just data networks, but also voice networks and video networks, and we got very involved in that. The real surprise to us, frankly, was that because that world was essentially satisfactory, people were so unwilling to take a risk to move, particularly (moving) business phone calls over onto a new platform.
These PBXs (the private branch exchange systems businesses use to manage phone calls) that are really--they're just computers--have existed alongside the normal infrastructure. Their wiring has stayed there, their directory, their server piece. And so we've been patiently sort of investing in this. In fact, in 1999 we got our first large-scale voice, PBX-type work under way.
Q: And so I assume at that point you thought it was going to happen sooner?
Gates: As we take the magic of software to new things, it's OK to be too early. We don't want to be in too late. And so we saw that the pieces were starting to come together. And so it made sense for us to invest. We wanted to be there, particularly as Exchange and Outlook and Office had gotten so strong, you know, people used us to do everything but the telephony piece. The idea that, OK, now we should encompass telephony and do that kind of sat there as a clear, big opportunity for us.
The thing that's happened over the last eight years is this willingness that we now have enough customers who have had very good experiences using Internet transport, bringing the PC into the picture.
Q: With speech recognition, one of the ideas is that there are some applications where it can pay off, even if it is not getting 100 percent recognition. Is finding some of those areas one of the keys to speech recognition becoming mainstream?
Gates: That's right. Remember, the stuff we're doing with unified communications, speech recognition is not actually a very key element of what goes on. There are some aspects of it. For example, when you're doing audio conferencing in our world, we can tell you who's speaking. And that's very frustrating today in traditional audio conferencing that you don't know who's come and gone, and somebody can speak up and you don't know who that is.
Or with RoundTable (Microsoft's 360-degree video conferencing camera), we use video and audio clues to tell who's speaking and bringing the focus on that. And you always have the full room view at the bottom, but you have that zoomed-in view as well. And so, you know, if it gets it slightly wrong, you can look at the full-room view and see exactly what's going on. And just like if the cameraman was focusing on something different you were interested in, well, the wide view takes care of that.
When you want to search something (in a meeting) if a word sounds like one of three things, for the search case, you can just index all three. And the fact that you might get some false positives, that is, when you do a search, you might get some part of the speech where a similar sounding word was being used, it's not that big a deal. You'll just look at it, skip past it. And so not being perfect is not a huge problem.
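To make the "index all three" idea concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how Microsoft's software works: the function names, the timestamps, and the sample words are all hypothetical, and the candidate lists simply stand in for whatever alternatives a recognizer might report when it cannot tell similar-sounding words apart.

```python
from collections import defaultdict


def build_index(segments):
    """Map every candidate word to the timestamps where it may have been spoken.

    `segments` is a list of (start_seconds, [candidate words]) pairs. The
    candidate lists are an assumed stand-in for a recognizer's alternatives.
    """
    index = defaultdict(list)
    for start, candidates in segments:
        for word in candidates:
            # Index every alternative, not just the recognizer's top guess.
            index[word.lower()].append(start)
    return index


def search(index, term):
    """Return every timestamp where `term` was one of the candidates."""
    return sorted(index.get(term.lower(), []))


# Hypothetical meeting audio: the slot at 47.5 seconds is ambiguous.
segments = [
    (12.0, ["budget"]),
    (47.5, ["sales", "sails", "sells"]),
    (90.2, ["sails"]),
]

index = build_index(segments)
print(search(index, "sales"))  # [47.5]
print(search(index, "sails"))  # [47.5, 90.2]
```

A search for "sails" returns the ambiguous moment at 47.5 seconds even if the speaker actually said "sales." That is the kind of false positive Gates describes: the listener looks at it and skips past it.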