(Credit: Wolfram Research)
Stephen Wolfram has a track record of scientific breakthroughs and some controversy. He received his Ph.D. in theoretical physics from Caltech in 1979 when he was 20 and has focused most of his career on probing complex systems. In 1988 he launched Mathematica, powerful computational software that has become the gold standard in its field. In 2002, Wolfram produced a 1,280-page tome, A New Kind of Science, based on a decade of exploration in cellular automata and complex systems. The book stirred up a lot of debate in scientific circles. Legendary physicist Freeman Dyson described the tome as "a case of style over substance." (See Steven Levy's Wired profile of Wolfram).
In May, Wolfram will unveil his latest creation, now called Wolfram Alpha. It applies his work with Mathematica and NKS (A New Kind of Science) to Web search. "All one needs to be able to do is to take questions people ask in natural language, and represent them in a precise form that fits into the computations one can do," Wolfram said in a recent blog post. "I'm happy to say that with a mixture of many clever algorithms and heuristics, lots of linguistic discovery and linguistic curation, and what probably amount to some serious theoretical breakthroughs, we're actually managing to make it work...It's going to be a website: www.wolframalpha.com. With one simple input field that gives access to a huge system, with trillions of pieces of curated data and millions of lines of algorithms," he added.
It follows the Google principle, with a simple input box, but takes a different approach to rendering search results. Nova Spivack, CEO of Radar Networks, which developed Twine, an ambitious "interest network" Web application based on semantic Web technologies, said that Wolfram Alpha may be as "important for the Web (and the world) as Google, but for a different purpose."
Spivack shared his initial impressions of Wolfram Alpha based on a two-hour conversation with Wolfram.
"Wolfram Alpha is like plugging into a vast electronic brain. It provides extremely impressive and thorough answers to a wide range of questions asked in many different ways, and it computes answers, it doesn't merely look them up in a big database."
"In this respect it is vastly smarter than (and different from) Google. Google simply retrieves documents based on keyword searches. Google doesn't understand the question or the answer, and doesn't compute answers based on models of various fields of human knowledge."
Spivack gave some insight as to how Wolfram's search engine works:
Wolfram Alpha is a system for computing the answers to questions. To accomplish this it uses built-in models of fields of knowledge, complete with data and algorithms, that represent real-world knowledge.
For example, it contains formal models of much of what we know about science -- massive amounts of data about various physical laws and properties, as well as data about the physical world.
Based on this you can ask it scientific questions and it can compute the answers for you. Even if it has not been programmed explicitly to answer each question you might ask it.
But science is just one of the domains it knows about--it also knows about technology, geography, weather, cooking, business, travel, people, music, and more.
It also has a natural language interface for asking it questions. This interface allows you to ask questions in plain language, or even in various forms of abbreviated notation, and then provides detailed answers.
The vision seems to be to create a system which can do for formal knowledge (all the formally definable systems, heuristics, algorithms, rules, methods, theorems, and facts in the world) what search engines have done for informal knowledge (all the text and documents in various forms of media).
Wolfram's engine isn't going to replace Google, according to Spivack, although he suggests Google would like to own it.
"You would probably not use Wolfram Alpha to shop for a new car, find blog posts about a topic, or to choose a resort for your honeymoon. It is not a system that will understand the nuances of what you consider to be the perfect romantic getaway, for example--there is still no substitute for manual human-guided search for that. Where it appears to excel is when you want facts about something, or when you need to compute a factual answer to some set of questions about factual data."
For now, we'll have to wait until May to see whether the Web and scientific worlds embrace Wolfram's Alpha as a major mathematical and engineering breakthrough.
Read Nova Spivack's "Wolfram Alpha is Coming -- and It Could be as Important as Google"
See also: VentureBeat: Wolfram Alpha -- it's like plugging into an electronic brain
For the last few years AdaptiveBlue has offered a semantically rich Web application that understands things such as books, movies, and music. Clicking on text, such as a company or movie name, brings up a context-sensitive menu of related links. The company is taking its technology a step further, adding a social dimension and renaming the product, "Glue." Along with Radar Networks' Twine and Powerset's Wikipedia search engine (acquired by Microsoft), Glue offers a compelling glimpse into how the Semantic Web will add a new, powerful level of intelligence to the Internet.
Rather than just connect things to related data and services, it also connects things to people and people to people and their things. For example, when a Glue user visits a site with things the software recognizes, such as a movie, artist, wine, book, restaurant, or stock quote, a bar appears at the top of the screen with a list of friends and other people in the Glue network who looked at that object. Users can leave brief comments to share an opinion with others.
Glue allows users in its social network to discover which friends share interests with them without going to a central site.
"Glue works as a contextual filter," said Alex Iskold, founder and CEO of AdaptiveBlue. "We show relevant information from friends about the things they visit. They don't have to sift through lengthy lifestreams. For example, if you have 100 friends in FriendFeed, you are a human filter trying to sift through it and the information is completely out of context. The idea is to get the useful information 'chunked' contextually on the pages you visit. We are not asking people to change their habits."
The people surfaced in the Glue bar could have seen the object, such as a movie title, on a variety of sites. "People look at movies at different times and places, but the core semantic technology can understand the same thing and correlate it. As a movie fan, you just want to know what your friends think. It doesn't matter when or where the user visits things; Glue automatically connects them. There is no Glue destination site--the network is the user's context across the Web," Iskold said.
Glue allows users to add comments and indicate a "like" or favorite.
Glue also taps into existing social networks, such as Facebook and Twitter, to add friends, or to "follow" other people. The Glue Navigator allows users to browse the network of people and things, and what friends have identified as a "like" and what they have to say about objects. Glue can display all the music that a friend has viewed and drill down, offering contextual shortcuts to find out more, such as reviews and shopping links, about things on the Web. Glue remembers only the last 20 things visited, and the things "liked" or commented upon.
Each user has a profile page that shows likes and the number of followers and who the user is following. "It's a way of cross-pollinating interests. You can see what I am interested in and perhaps it is the same books or wine with which you have an interest," Iskold said. "Glue also allows you to claim pages that represent you, such as a blog, FriendFeed, or Twitter. It's an outlet where people know where to find and connect with you. For example, other Glue users could see what you are up to recently on your personal blog."
Glue impressed investors at RRE Ventures and Union Square Ventures (Series A Lead) enough to fund a $4.5 million series B round recently. The company has a good chance of making it through the meltdown.
After less than a year, Radar Networks is going from beta to version 1.0 of its Twine "interest network" Web application based on Semantic Web technologies. "We are not spending four years in beta," said Radar Networks CEO Nova Spivack. "We have a minimal set of features ready for prime time."
The minimal set of interest network features allows Twine users to track and discover relevant organizations, products, people, places, tags, and items, such as photos, documents, and recipes, that match their interests. Twine has a social dimension in the way it leverages the wisdom of its members, via bookmarking, tagging, and shared connections. Underlying Semantic Web technologies provide concept mapping (such as interrelationships between topics or people) and more relevant and structured search results.
In the last six months of beta testing, 500,000 users visited Twine and 50,000 remain active, Spivack said. Half of the Twines created are public and members have added about one million items to the database. "The most interesting statistic is time spent--which has risen in the last month to 12 minutes per session and continues to trend up. This is more than tracking and discovery sites like delicious, Digg and StumbleUpon receive," he claimed.
In order to keep the 50,000 active users and grow its base, the key change from the beta in version 1.0 is a simplification of the user experience. "When we started beta, Twine was about collecting, organizing, discovering, and sharing, and all were equal. It turns out that tracking interests is the most important, so that is what we are emphasizing," Spivack said. Among the more consumer friendly enhancements, the site navigation is streamlined, site performance is faster, moderation features are improved, users can invite people from their e-mail address books, and the recommendation engine explains why an item was recommended and allows a user to opt out.
The Semantic Web aspect of Twine, which was touted when it first launched, has been relegated to the background.
"When we first launched, semantic technology was the story," Spivack said. "It was novel then, but now we have to show the value, and to do that we can't emphasize the semantic technologies in Twine. It's under the hood and that is where it belongs. We surface the value of semantic in lots of ways, such as with the recommendations. Next year users will be able to create their own data types and build an ontology without knowing it's an ontology."
In about three weeks, an update to Twine 1.0 will add a more advanced bookmarking tool and natural language crawling to improve relevancy. "Every page added to Twine will use natural language processing to determine what is the content versus ads, navigation, and other elements. We'll put the full text in our search index and generate tags and create a summary and then crawl every link in the text one hop out and bring that content in as well," Spivack said.
Next year Twine will unleash more of its semantic power, with richer support for structured data and a two-way API for getting data in and out of Twine that will attract application developers, Spivack said. In addition, Twine will introduce a new monetization scheme. "Twine will be to marketing what Google was to advertising," he boasted.
"Advertising is pull-based, passive, and on the side of pages. Marketing is sponsored and highly relevant content that is targeted and delivered to someone in their interest feeds. It will be clearly marked and users have the option to accept or reject it." The concept is similar to what Facebook has attempted to do with its Beacon program. Spivack said that the company has filed patents for its monetization concepts, including a way for users to market semantic objects. "We can provide interesting socio-economics around the content that people collect, share, and buy, and build a one-to-one channel between marketers and users."
Radar Networks is in good shape to weather the economic storm. The company raised $13 million in Series B funding from Velocity Interactive, Draper Fisher Jurvetson, and Vulcan Capital earlier this year.
Below is a video from Radar Networks outlining the new features of Twine 1.0:
Twine Overview from Twine Official on Vimeo
In the midst of the financial meltdown and a contentious upcoming election, you might think the U.S. government and taxpayers are just funding wars, bank bailouts, and bridges to nowhere or somewhere. But this is the same government that funded the Internet way back when and is also funding the next generation of technologies that will make the current Internet seem like a Model-T.
Over the last several years, the U.S. government--via DARPA (Defense Advanced Research Projects Agency) grants--has invested hundreds of millions of dollars in PAL, an acronym for "Personalized Assistant that Learns." Smarter software and networks and augmenting human intelligence are useful in times of war and peace.
As part of the PAL project, more than $200 million of DARPA money has been poured into CALO (Cognitive Assistant that Learns and Organizes) over the last five years. CALO has been run out of SRI International with the assistance of 25 research organizations and 400 researchers.
At this point, Siri's management is being secretive about what the company is developing. The elevator pitch goes something like, "Users' online lives are becoming more complicated and getting out of control for mainstream users. What if there was an easy way for normal users (non-power users) to ask the Internet to help them?"
According to the Siri PR pitch, the product is "a new interaction paradigm for the consumer Internet experience that applies intelligence at the interface." The company expects to release a beta version of its initial product in the first half of 2009, according to Dag Kittlaus, a former Telenor Mobile and Motorola executive who is a co-founder and CEO of the company.
"We have to be careful at this stage," Kittlaus told me. "We don't like to play these games, but we need to keep a tight lid on what we are specifically doing. We have some original ideas of what the product is going to do, but we don't want to spark ideas among potential competitors." Those competitors would likely be masters of the Internet with large Internet footprints and research prowess like Google, Microsoft, and Yahoo.
Kittlaus did allow that Siri has more than a dozen partners, presumably large, well-established distribution players that can help build a consumer market for Siri's product. Unlike most Web start-ups, Siri has a business model, Kittlaus claimed. "We have good business models, both existing and emerging. We think CPA (cost per action) is the future, and this specific application is good for CPA and we are partnering on that."
He also touted the pedigree of the company's current cadre of 19 employees. "They are mostly engineers from Yahoo, Google, SRI, NASA, and Xerox PARC," he said. The chief architect of the CALO project, Adam Cheyer, is a co-founder and vice president of engineering at Siri, and Tom Gruber, a well-known artificial intelligence and semantic Web expert, is a co-founder and CTO.
Cheyer described CALO as a superset of what Siri is developing. "The CALO project is building an automated assistant to help manage and improve your life. The technology spans all aspects of interaction--natural language processing, speech recognition, and planning and reasoning capabilities--and interfaces with all kinds of systems, such as email and contacts," he said.
(Credit: SRI International)
"Learning in the wild is core focus," he continued. "We want it to improve over time and learn from users with no coaching and without changing any code. We are taking the key elements from the project to commercialize it in a form that will delight users. We are not building systems that do things but that learn how to do things."
CALO sounds like a representation of the famous Apple Knowledge Navigator video from 1987.
"Siri is a subset of that concept," Cheyer said. "We have to keep in mind existing user behavior. It will feel like something close to what people use a lot. We will add speech recognition and other features as we go. We don't want to take such a leap that people cannot identify with it. We'll do things similar to but more advanced than what we do now. The longer term vision is the Knowledge Navigator, although it is an early chapter now and it might look different than that."
According to Gruber, intelligence at the interface allows the computers to make recommendations, like a personal assistant:
The interfaces we use to interact with the world's information are getting smarter. Web portals gave us someone else's idea of the content we should see. Then came search engines, which let us tell the system what we want, one query at a time. We are about to see the next wave -- intelligence at the interface -- in which the system knows about us, our information, and our physical environment. With knowledge about our context, an intelligent system can make recommendations and act on our behalf.
(Credit: Tom Gruber)
Siri may be working on more intelligent Web interfaces that can make inferences based on a wide variety of user activities (the "lifestream"), learning over time on its own, and then taking actions on behalf of users. For example, if you are booking travel or looking for a restaurant, Siri would know your preferences and about travel sites or restaurants, integrating data and context from multiple sources to deliver personal assistance. This could be especially useful in mobile scenarios where you don't want to wade through pages of search results or deal with complex interactions.
Tom Gruber: "If we want our technology to have world-changing impact, bring it to the interface: get useful knowledge from all those intelligent people on the Internet and give the benefit of this knowledge to everyone."
(Credit: Tom Gruber)
We'll have to wait for next year, if the company stays on schedule, to see whether Siri can really define a new paradigm for experiencing the Web.
In March, Radar Networks launched Twine, an application that organizes information and connects people, places, companies, products, Web pages, videos, and photos. Along with Metaweb's Freebase, Powerset (sold to Microsoft), Hakia, Reuters' Calais, AdaptiveBlue and a few other start-ups, Radar Networks is trying to crack the code on building a piece of the semantic Web.
In a Times Online article, Web creator Tim Berners-Lee gave an example of how the semantic Web would work:
"Imagine if two completely separate things--your bank statements and your calendar--spoke the same language and could share information with one another. You could drag one on top of the other and a whole bunch of dots would appear showing you when you spent your money."
Twine won't provide that futuristic capability but it attempts to build a "semantic graph" of relationships between content, tags, people and Twines (the collection of items of an individual or group on the service). Each piece of content is a "semantic object," Radar Networks CEO Nova Spivack said, using Twine's underlying ontology and database, which applies semantic technologies such as RDF for storing data.
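To make the "semantic graph" idea more concrete, here is a minimal sketch in Python of RDF-style subject-predicate-object triples with wildcard matching. The class name, predicate names, and sample data are all hypothetical illustrations. Nothing here reflects Twine's actual ontology or storage code.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleGraph:
    """Toy in-memory graph of RDF-style triples (illustration only)."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for s, p, o in sorted(self._triples):
            if ((subject is None or s == subject)
                    and (predicate is None or p == predicate)
                    and (obj is None or o == obj)):
                yield (s, p, o)


# Hypothetical sample data: users, tags, and predicates are made up.
graph = TripleGraph()
graph.add("alice", "memberOf", "twine:semantic-web")
graph.add("bob", "memberOf", "twine:semantic-web")
graph.add("bookmark:42", "taggedWith", "rdf")
graph.add("bookmark:42", "addedBy", "alice")
graph.add("bookmark:42", "postedTo", "twine:semantic-web")

# Who belongs to the "semantic-web" Twine?
members = [s for s, _, _ in
           graph.match(predicate="memberOf", obj="twine:semantic-web")]
```

Because people, tags, content, and Twines all live in the same set of triples, one pattern query can walk from a Twine to its members or from a bookmark to its tags. That is roughly what treating each item as a "semantic object" implies.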
Spivack told me that public Twines are now visible to visitors to the site and to search engines. So far in the beta phase nearly 15,000 Twines have been created and 354,000 pieces of user-contributed content have been added into the system. More than 50,000 users signed up (34,000 are active) for the service, spending 13 to 15 minutes per session on the site, he said.
A major new release of the Twine platform is slated for release in the fall to address shortcomings and introduce new features. "We have worked on a lot of simplification, reducing the clutter, and we still need to reduce more. Twine has a lot of powerful features nobody uses, so we are moving some of the advanced features out of the way," Spivack said. "The fall release will bring more intelligence and semantics to the surface. For example, we will let anyone define a new type of thing, such as a recipe or baseball team form, to author. It's more like what Freebase does, and we will also likely integrate with Freebase over time."
In addition, performance improvements and algorithms to improve search as well as mining and crawling content are in the works. "A major focus of our work is on personalization and recommendations," Spivack said. "Ultimately, Twine is about 'interest networking' and is a content distribution network. People declare their interests, add content, join Twines and connect with people. As users work with the system it learns about their interests, using artificial intelligence and semantic Web technologies to provide more relevance. We are not attempting to index the whole Web, just the best stuff of interest to users. Ninety-nine percent of what's on the Web is not interesting to a user, so it's more about high signal to noise."
On the business front, Spivack believes that Twine can be an intermediary for users, delivering more targeted marketing messages in addition to content. It's similar to the way Facebook is creating a new kind of environment for advertising based on knowing member interests and their social or semantic graph. "The goal for Twine is to be the place on the Web that best understands your interests and represents them to others. The key is to give users control and privacy," Spivack said.
Twine is a work in progress. It's ambitious and has the potential to demonstrate how a more semantic Web could benefit users. The biggest challenge will be scaling the back-end infrastructure and attracting users, which means Twine will have to become far easier to configure and use. We'll see in the coming months whether the forthcoming changes to Twine help open the floodgates.
Updated numbers on users and usage, 6:30 AM PST, August 1
On this week's EIC Squared podcast, ZDNet's Larry Dignan and I discuss this week's big stories. It was a busy week on the search front. Adobe is providing Google and Yahoo with Flash Player technology that allows their search engine crawlers to find and index SWF content, including Flash "gadgets" such as buttons or menus and self-contained Flash Web sites. It's good to make more information accessible via search engines. However, Microsoft has been silent on whether Live Search would index Flash content.
In addition, Microsoft bought Powerset for about $100 million to enhance its search platforms. It's not a substitute for acquiring market share via Yahoo Search, but it provides a foundation for making the search experience far more compelling and precise in fewer clicks.
Of course, the Microhoo drama continues this week with the latest rumors. Larry is ready for this opera to be finished.
Finally, we discuss a judge's ruling in Viacom's $1 billion copyright infringement suit against Google and YouTube.
U.S. District Judge Louis L. Stanton ruled that records of every video watched by YouTube users, including login names and IP addresses, should be given to Viacom's lawyers. Larry said it was like combining the worst aspects of a fishing expedition and a witch hunt. Viacom is maintaining that it won't look at personal data and Google is asking for time to anonymize the information. If Judge Stanton's ruling stands, the last shreds of personal privacy on the Web could be thrown out the window.
As expected (see previous reports), Microsoft scooped up Powerset to buttress its search efforts.
Barney Pell, Powerset co-founder and CTO
(Credit: Dan Farber)
It's not a replacement for increasing market share by acquiring Yahoo Search, but it gives Microsoft some differentiated search technology and top engineers for less than $100 million. Ramez Naam, group program manager of Live Search, said the Powerset negotiations happened in parallel with the Yahoo talks over the last few months. Google and Yahoo may also have been interested in Powerset, but no one is talking.
Whether Microsoft can leapfrog Google over the long term with this semantic engine remains to be seen.
Powerset had done a good job of creating a rich semantic layer on top of Wikipedia, but bringing natural language and slick semantic-based interfaces to the entire Web is a long-term and very costly endeavor.
"With an existing search infrastructure, incredible capital resources, unlimited data, a leading search team, and clear mission to revolutionize the search landscape, Microsoft can rapidly accelerate our progress in building semantic search technology and bringing it to full Web scale," Powerset's Mark Johnson said in a blog post about the acquisition.
Powerset can provide direct answers to queries from its Wikipedia and Freebase index and highlight the most relevant search results based on the meaning of the query.
According to a blog post from Satya Nadella, Microsoft's senior vice president of Search, Portal, and Advertising, Powerset's engineers will join the Search Relevance team and remain in San Francisco.
Back to the leapfrogging Google question. Much of what Powerset has enabled with its technology is a superior user experience for searching. Powerset's Wikipedia search, which surfaces concepts, meanings, and relationships (like subject, verbs, and objects in a language), is the very small tip of the iceberg.
If Microsoft can succeed in extending Powerset's technology to key parts of the Web corpus, Google will have to figure out a way to match the quality and user experience. And, there is little doubt that if Google decided that what Powerset and Microsoft are doing as one is important, the company dedicated to dominating search through its engineering prowess will circle the wagons.
A few months ago, Powerset co-founder and CTO Barney Pell told me that his start-up company's software was a first step in changing the way users search and consume Web content. "It's a complete shift. You see this and you want to experience all content in this way. And, as an introduction, it will drive huge investment in semantic and linguistic technology, just as investments were made in information retrieval and scalable databases in the past," he said.
During a conversation after the announcement, Pell told me, "Natural language search will be the center of innovation for the next 20 years." It will likely take 20 years to engineer the semantic, natural language Web that Tim Berners-Lee envisioned in his 2001 essay in Scientific American.
Beet.tv's Andy Plesser has an interesting interview with Michael Zimbalist, vice president of R&D at The New York Times Co. He describes how the newspaper company is finding ways to link print and online in the mobile arena and how rich annotation of content will lead to more personalized delivery of information and the Semantic Web.
Amid speculation that Microsoft is looking to make an acquisition, Powerset launched a public beta of its Wikipedia search engine. It brings a new, rich semantic dimension via natural language query processing to Wikipedia that greatly improves the search and reading experience.
The company calls it a first step in changing the way users search and consume Web content. "It's a complete shift. You see this and you want to experience all content in this way," Barney Pell, co-founder and CTO of Powerset, told me. "And, as an introduction, it will drive huge investment in semantic and linguistic technology, just as investments were made in information retrieval and scalable databases in the past. People working in this space will be very marketable."
Users can enter keywords, phrases, or simple questions in Powerset's search box. Like many Web startups, Powerset is currently free of advertising.
Powerset's natural language search technology is based on patents licensed exclusively from PARC and its own proprietary indexing. Powerset's engine has read 2.5 million Wikipedia pages and extracted "meaning" from the sentences, creating a navigation and semantic layer on top of the popular Web encyclopedia. Following is a pictorial tour of Powerset features:
Powerset has also indexed Freebase, Metaweb's evolving, open database of structured information. The search result page presents Factz, a summary of key information extracted from Wikipedia pages.
Factz can be expanded to display more of the extracted verbs and their associated words and concepts.
Powerset creates a summary of information, or Dossier, on the right side of the page with Freebase and Wikipedia to give users a quick outline view about a topic. Clicking on an item takes the user to the location in the article and highlights the reference.
Powerset generates a summary of the key Factz to create a kind of Cliff's Notes version of a Wikipedia article. Clicking on a summary item takes the user to the reference location in the article and highlights the key words. Powerset also includes a page for disambiguation of queries.
Powerset also shows a tag cloud of things and actions found by its linguistic analysis engine on the page. Clicking on a word shows related Factz in the outline.
Powerset can provide direct answers to queries from its Wikipedia and Freebase index, and highlight the most relevant search results based on the meaning of the query. Hakia, another semantic search engine, as well as Google can also surface the date Picasso was born at the top of their results pages.
Powerset's Wikipedia search engine isn't going to slow down Google in the near term, but it will raise the bar on the search experience for all players. "There are implications beyond Wikipedia," Pell said. "Search is not done. You can see the emerging Semantic Web with our integration of Wikipedia and Freebase. We will add other components with structured data and ways to answer questions."
Powerset has said that the longer term plan is to read, linguistically analyze and index 20 billion documents on the Web, which will be a costly and ambitious undertaking. (Getting acquired by Microsoft would be helpful for that project. Powerset received $12.5 million in Series A funding from Foundation Capital, Founders Fund, and angel investors in 2006.)
The Semantic Web has been just around the corner for a few years. It turns out that bringing a semantic layer of metadata to the Internet is like climbing a mountain in flip-flops.
Tuesday night, Semantic Web mountain climbers Powerset, Radar Networks, and Metaweb participated in a salon at Powerset's San Francisco office, where I talked with them about their product plans.
Powerset gives wings to Wikipedia
I got a preview of Powerset's search engine, which is due to go into beta in the coming weeks, according to co-founder and CTO Barney Pell and as reported by TechCrunch.
Powerset differs from Google and other mainstream search engines in that it linguistically parses sentences, finding subjects, verbs, objects, synonyms, and other elements using a highly sophisticated, language-independent parser licensed from Xerox PARC.
Powerset then extracts and indexes concepts, relationships, and meanings, rather than keywords. (I wrote about Powerset when it first came out of stealth mode, in June 2007.)
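The difference between indexing keywords and indexing relationships can be sketched in a few lines of Python. This is only a toy illustration: the (subject, verb, object) facts are written by hand to stand in for a linguistic parser's output, and the function names are hypothetical. It does not reflect Powerset's actual parser or index.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Fact = Tuple[str, str, str]  # (subject, verb, object)

# Assumed parser output, hand-written here for illustration.
PARSED_FACTS: List[Fact] = [
    ("Picasso", "paint", "Guernica"),
    ("Guernica", "depict", "bombing"),
    ("Picasso", "co-found", "Cubism"),
]


def build_keyword_index(facts: List[Fact]) -> Dict[str, Set[int]]:
    """Map each lowercased word to the positions of facts that mention it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for position, fact in enumerate(facts):
        for word in fact:
            index[word.lower()].add(position)
    return index


def build_relation_index(facts: List[Fact]) -> Dict[Tuple[str, str], Set[str]]:
    """Map (verb, object) to the subjects that stand in that relationship."""
    index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for subject, verb, obj in facts:
        index[(verb, obj)].add(subject)
    return index


keyword_index = build_keyword_index(PARSED_FACTS)
relation_index = build_relation_index(PARSED_FACTS)

# Keyword lookup: every fact that mentions "Guernica", in any role.
keyword_hits = sorted(keyword_index["guernica"])

# Relationship lookup: who painted Guernica?
who_painted = sorted(relation_index[("paint", "Guernica")])
```

The keyword index can only say which facts mention a word. The relationship index keeps track of who did what to whom, so it can answer the question directly.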
Rather than trying to boil the search ocean, compete with Google, and deal with spam and 20 billion documents, Powerset has focused its initial efforts on giving wings to the 3 million pages of Wikipedia.
Hakia's semantic search engine also indexes Wikipedia and other sources. However, Powerset returns a more comprehensive dossier of results for queries, based on deep analysis of Wikipedia pages and other content, and also provides new ways to navigate and discover facts on the individual Wikipedia pages. More details to come when Powerset officially launches its public beta version.
Powerset plans to index the Web at some point (at a significant cost, in terms of servers and bandwidth). For now--or more precisely, when the company allows the public access to its technology--Wikipedia users will be the beneficiaries of a powerful semantic index and user experience.
True Knowledge
I also got a look at True Knowledge's search engine. Company CEO William Tunstall-Pedoe said the search engine is in private beta for now, with about 7,000 users.
Unlike Powerset and other search engines, Cambridge, England-based True Knowledge is building its own knowledge base. Users input facts, as in Wikipedia, but in a more structured manner. In addition, True Knowledge imports data from sources, including Wikipedia, in the form of discrete facts, such as "Sacramento is the capital of California."
Queries, including those in natural language, are parsed for machine reading, and they access the repository of facts accumulated. True Knowledge can make inferences, such as in the following example.
(Credit: True Knowledge)
The capability to infer truths based on the data repository would be a welcome feature for Wikipedia, which doesn't have an automated method for dealing with contradictions.
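As a rough illustration of inferring new facts from a repository of discrete ones, here is a minimal forward-chaining sketch in Python. The two rules, the relation names, and the second sample fact are assumptions made up for this example. True Knowledge's actual inference engine is not described here and may work quite differently.

```python
from typing import Set, Tuple

Fact = Tuple[str, str, str]  # (subject, relation, object)


def infer(facts: Set[Fact]) -> Set[Fact]:
    """Apply two hand-written rules repeatedly until no new facts appear."""
    known = set(facts)
    while True:
        new: Set[Fact] = set()
        for s, rel, o in known:
            # Rule 1 (assumed): a capital is located in the place it governs.
            if rel == "is_capital_of":
                new.add((s, "is_located_in", o))
            # Rule 2 (assumed): "is_located_in" is transitive.
            if rel == "is_located_in":
                for s2, rel2, o2 in known:
                    if rel2 == "is_located_in" and s2 == o:
                        new.add((s, "is_located_in", o2))
        if new <= known:  # nothing new was derived, so stop
            return known
        known |= new


facts = {
    ("Sacramento", "is_capital_of", "California"),    # fact quoted in the article
    ("California", "is_located_in", "United States"),  # added for illustration
}
derived = infer(facts)
```

Neither input fact says that Sacramento is in the United States, yet `derived` contains that statement because the two rules chain together. A system that can derive facts this way can also check a newly entered fact against what it already knows.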
Barney Pell (Powerset), William Tunstall-Pedoe (True Knowledge), Nova Spivack (Radar Networks), Paul Davison (Metaweb)
(Credit: Dan Farber/CNET News)
Metaweb
Another San Francisco Semantic Web start-up, Metaweb, was also a participant in the salon. The company's Freebase is more similar to True Knowledge than Powerset.
Freebase is a community-built database with a large corpus of open data sets, including Wikipedia and MusicBrainz. Powerset includes some Freebase-structured content in its index, and True Knowledge could add Freebase data to its knowledge repository.
Radar Networks' Twine
I also chatted with Nova Spivack, co-founder and CEO of Radar Networks. His company created Twine, an application combining bookmarking, blogging, and RSS reading, with an underlying semantic engine to tie the pieces of data together.
Spivack said Twine has about 7,000 users in private beta, as well as 40,000 standing in line for access. Half of the users have created private Twines, with corporations and closed communities of interest using the service for collaboration.
Major enhancements are planned for the summer and fall, including allowing for complete customization of the user interface. "We have only surfaced a bit of the platform so far. Twine as a platform will integrate with other applications, such as blogs, catalogs, social communities, and corporate sites," he told me.
"It's an enormous multiyear project," Spivack said. "It's not like a Google beta or a 1.0 version masquerading as a beta." The same could be said of the other Semantic Web services in the room. It's going to be a very long beta cycle.






