Amid speculation that Microsoft is looking to make an acquisition, Powerset launched a public beta of its Wikipedia search engine. Using natural language query processing, it brings a rich new semantic dimension to Wikipedia that greatly improves the search and reading experience.
The company calls it a first step in changing the way users search and consume Web content. "It's a complete shift. You see this and you want to experience all content in this way," Barney Pell, co-founder and CTO of Powerset, told me. "And, as an introduction, it will drive huge investment in semantic and linguistic technology, just as investments were made in information retrieval and scalable databases in the past. People working in this space will be very marketable."
Users can enter keywords, phrases, or simple questions in Powerset's search box. Like many Web startups, Powerset is currently free of advertising.
Powerset's natural language search technology is based on patents licensed exclusively from PARC and its own proprietary indexing. Powerset's engine has read 2.5 million Wikipedia pages and extracted "meaning" from the sentences, creating a navigation and semantic layer on top of the popular Web encyclopedia. Following is a pictorial tour of Powerset features:
Powerset has also indexed Freebase, Metaweb's evolving, open database of structured information. The search result page presents Factz, a summary of key information extracted from Wikipedia pages.
Factz can be expanded to display more of the extracted verbs and their associated words and concepts.
Powerset creates a summary of information, or Dossier, on the right side of the page with Freebase and Wikipedia to give users a quick outline view about a topic. Clicking on an item takes the user to the location in the article and highlights the reference.
Powerset generates a summary of the key Factz to create a kind of Cliff's Notes version of a Wikipedia article. Clicking on a summary item takes the user to the reference location in the article and highlights the key words. Powerset also includes a page for disambiguation of queries.
Powerset also shows a tag cloud of things and actions found by its linguistic analysis engine on the page. Clicking on a word shows related Factz in the outline.
Powerset can provide direct answers to queries from its Wikipedia and Freebase index, and highlight the most relevant search results based on the meaning of the query. Hakia, another semantic search engine, and Google can also surface the date Picasso was born at the top of their results pages.
Powerset's Wikipedia search engine isn't going to slow down Google in the near term, but it will raise the bar on the search experience for all players. "There are implications beyond Wikipedia," Pell said. "Search is not done. You can see the emerging Semantic Web with our integration of Wikipedia and Freebase. We will add other components with structured data and ways to answer questions."
Powerset has said that its longer-term plan is to read, linguistically analyze, and index 20 billion documents on the Web, which will be a costly and ambitious undertaking. (Getting acquired by Microsoft would be helpful for that project. Powerset received $12.5 million in Series A funding from Foundation Capital, Founders Fund, and angel investors in 2006.)
While Powerset is preparing for the public rollout of its unique, semantic search engine, Microsoft may be interested in acquiring the start-up, according to sources.
I asked Barney Pell, Powerset co-founder and CTO, whether there was any truth to the Microsoft-Powerset deal rumors. He said, "No comment," and noted his policy of not commenting on rumors. Microsoft also declined to comment on rumors.
Powerset co-founder and CTO Barney Pell (Credit: Dan Farber)
Bringing Powerset, which has no revenue and a tiny user base at this point, into the fold would be spare change for Microsoft compared with spending $45 billion to $50 billion on Yahoo. But it could bring something useful to Microsoft--and Yahoo, if their union were consummated--in the battle for search users with archrival Google.
Based on a preview that I had of the service last month, Powerset raises the bar on search. Powerset differs from Google in that it extracts and indexes concepts, relationships, and meaning, rather than keywords. In some cases it is able to create connections and pivot in ways that elude Google's proficient engine, which favors more of a statistical approach.
Powerset uses a sophisticated natural language parser (licensed from Xerox PARC) to find subjects, verbs, objects, synonyms, and other elements for indexing.
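To make the contrast with keyword indexing more concrete, here is a minimal Python sketch of what indexing subjects, verbs, and objects might look like. It is an illustration only, not Powerset's or PARC's actual implementation: the triples are written by hand rather than produced by a parser, and the names TripleIndex, VERB_SYNONYMS, and canonical are hypothetical.

```python
from collections import defaultdict

# Hypothetical synonym table: folds verb variants into one canonical form.
VERB_SYNONYMS = {
    "painted": "paint",
    "paints": "paint",
    "married": "marry",
    "wed": "marry",
}


def canonical(verb):
    """Lowercase a verb and fold it into its canonical form, if one is known."""
    v = verb.lower()
    return VERB_SYNONYMS.get(v, v)


class TripleIndex:
    """Toy index over (subject, verb, object) triples instead of raw keywords."""

    def __init__(self):
        self.triples = []
        self.by_subject = defaultdict(set)
        self.by_verb = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, subject, verb, obj):
        # Store the triple once and record its id under each role.
        tid = len(self.triples)
        self.triples.append((subject, verb, obj))
        self.by_subject[subject.lower()].add(tid)
        self.by_verb[canonical(verb)].add(tid)
        self.by_object[obj.lower()].add(tid)

    def query(self, subject=None, verb=None, obj=None):
        # Intersect the id sets for every role the caller specified.
        ids = set(range(len(self.triples)))
        if subject is not None:
            ids &= self.by_subject.get(subject.lower(), set())
        if verb is not None:
            ids &= self.by_verb.get(canonical(verb), set())
        if obj is not None:
            ids &= self.by_object.get(obj.lower(), set())
        return [self.triples[i] for i in sorted(ids)]


# Hand-written triples standing in for what a parser would extract.
index = TripleIndex()
index.add("Picasso", "painted", "Guernica")
index.add("Picasso", "married", "Olga Khokhlova")
index.add("Dali", "painted", "The Persistence of Memory")

# "What did Picasso paint?" matches on roles, not on the literal word "paints".
print(index.query(subject="Picasso", verb="paints"))
# prints [('Picasso', 'painted', 'Guernica')]
```

Because the verb is folded to a canonical form and each word is stored by its role in the sentence, a query for "paints" or "wed" can match sentences that used "painted" or "married", which a plain keyword match would miss.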
Initially, Powerset is performing its magic on the 3 million pages of Wikipedia content, enabling a new kind of search and navigation experience on the popular information resource.
A next step would be to index the Web, which would be of great interest to Google rivals. Powerset has garnered $12.5 million in Series A funding from Foundation Capital, Founders Fund, and angel investors. Given the cost to scale up a semantically rich index of 20 billion Web pages, Microsoft would be a good match for Powerset. Then again, so would Google. Stay tuned...