If you've ever been immersed in a book and felt that some of its literary references were sailing over your head, a proposed software standard unveiled on Tuesday could make you a whole lot more erudite.
Known as Smartwords, the would-be standard is spearheaded by a start-up called Wordnik, and supported by media big shots like The New York Times, Forbes, the Huffington Post, O'Reilly Media, Vook, Ibis, Scribd, and the Internet Archive. The goal? To build a map of the English language, said Wordnik CEO Erin McKean.
Officially, the company refers to Smartwords as "a lightweight, easy-to-use standard for retrieving and publishing real-time, contextually-aware information about words."
Already, the Burlingame, Calif.-based Wordnik has been generating buzz for its early work, a new, partly crowdsourced take on how to build a dictionary. That work, McKean said, has resulted in the company compiling information on more than 8 million English-language words. Good magazine called the efforts "the steady editorial hand of a traditional dictionary with the user-supplied chaos of the Web."
But now, with Smartwords, Wordnik and its partners are aiming to bring deep levels of context to any kind of electronic text--be it in e-books on readers like the iPad, Kindle, or Nook, or on computers or mobile devices--by examining words and the words around them and linking readers to potentially vast amounts of information about them.
"A lot of people say you can know what kind of person someone is by looking at their friends," said McKean. "It's the same with words. And in the same way that people can get a pretty good idea of a person's demographic--are they the kind of person that goes to Dunkin' Donuts or to Starbucks--you can say the same thing about words. Are you the kind of word that appears in (a popular entertainment) magazine or are you the kind of word that appears in The New Yorker?"
The first stabs at Smartwords should roll out this summer, McKean said, and will be geared toward helping people get a much more personalized experience out of their digital reading. There's no way to know, of course, whether it will catch on, but with the support of Wordnik's partners and a likely surge in interest in e-readers, Wordnik clearly hopes it is hitting a sweet spot at precisely the right time.
One of the most important utilities of Smartwords is expected to be a system that will highlight words a reader isn't familiar with, offering them a definition and other contextual information.
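To make that idea concrete, here is a minimal Python sketch of one way such highlighting might work. It is not the Smartwords API, which had not been published at the time of writing. The function names (tokenize, known_vocabulary, words_to_highlight) and the sample sentences are hypothetical, and the sketch assumes the device keeps the plain text of everything a reader has already finished.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def known_vocabulary(books_read):
    """Collect every word that appears in texts the reader has already finished."""
    vocab = set()
    for book in books_read:
        vocab.update(tokenize(book))
    return vocab


def words_to_highlight(new_text, vocab):
    """Return words in new_text that are absent from vocab, in order of first appearance."""
    seen = set()
    result = []
    for word in tokenize(new_text):
        if word not in vocab and word not in seen:
            seen.add(word)
            result.append(word)
    return result


# Hypothetical reading history and a new passage (illustrative data only).
books_read = [
    "The whale surfaced near the ship.",
    "The ship sailed at dawn.",
]
new_text = "The leviathan surfaced near the ship at dusk."

highlighted = words_to_highlight(new_text, known_vocabulary(books_read))
```

A real system would presumably go well beyond exact word matching, for instance by treating "sail" and "sailed" as the same word or by weighing how often a reader has seen a term, but a set lookup captures the basic mechanism.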
"Right now, your e-reading device is as dumb as a book shelf [which] doesn't know what books are on it," she said. "But why doesn't your e-reader know that you've already read tons and tons of books about [one topic or another] and when you start reading a new one, it only highlights the words it thinks you haven't seen before."
Another important tool, one aimed at improving people's contextual understanding of what they're reading, could offer readers historical, literary, scientific, or other kinds of references for phrases or terms they're not familiar with. So, McKean explained, if an e-book contains some sort of allusion to Shakespeare or the Bible, a reader could be alerted--and might gain more insight into the text than they would had they not understood the reference.
Similarly, McKean imagines Smartwords being used to alert readers that a concept called one thing by an author might be called something else entirely by everyone else who has written about it. Smartwords would be able to highlight that connection, she said, by analyzing the relationships among the words used to illustrate the concept and comparing them with other texts. Smartwords could also help people understand, for example, which Henry they're reading about in a book on British history.
For Wordnik, launching Smartwords this summer is clearly crucial to getting applications like these onto digital devices everywhere. Once people have the standard in their hands, McKean said, anyone involved in putting out words digitally, from major book publishers to small bloggers, could implement the Smartwords application programming interfaces to provide more information to readers, and could, in theory, come up with new applications extending the standard's usefulness.
It might not seem like the most intuitive thing, but McKean argued that a very important part of Smartwords will be its social components. Basically, she said, readers will be encouraged to share "snippets" of text. They will be able to highlight a word or phrase, and Smartwords will find its context and share that on Facebook, Twitter, or other social media.
Our society may not seem like one where people would be eager to share things like words, but McKean said that when you pay attention to people's behavior, you begin to notice that people are reading everywhere you look, whether it's a book or magazine on the train, news articles on their iPhone or e-mails on their computers.
Smartwords could also share information about where readers are in certain kinds of texts, or where they've quit reading, she said. That could be immensely valuable to publishers trying to gain more information about how people use their content. And while such a system raises obvious privacy concerns, McKean said that addressing those concerns will be one of the goals of the Smartwords advisory board, which includes the Internet Archive, a nonprofit well known for looking out for privacy.
One might also ask how a company trafficking in information about words plans on making money. McKean said Wordnik is essentially planning to roll out Smartwords using a traditional API model and therefore will be able to make money from things like freemium services and metered use. "The best way to monetize is to make stuff that people can't live without," McKean said.