Steve Wozniak (Credit: Stephen Shankland/CNET News)
Apple co-founder and "Dancing With the Stars" celebrity du jour Steve Wozniak has joined the advisory board of DeepDyve, a "deep Web" search company that aims to discover hard-to-find information on the Internet that mainstream search engines overlook.
"The deep Web holds an almost limitless wealth of data, yet most of that information is collecting dust because nobody's come up with a way to mine the data in a way that's useful to researchers and consumers," Wozniak said in a statement Tuesday.
DeepDyve, formerly named Infovell, specializes in searches for biotechnology, pharmaceuticals, patents, physical-science areas, and Wikipedia.
"Steve's place in the history of computing is already well established. But what sets him apart is his passion for technology and his commitment to mentoring and fostering the next generation of technology companies," DeepDyve Chief Executive William Park said in a statement.
A search start-up called Infovell has renamed itself DeepDyve, begun offering a free "deep Web" search tool, and expanded its search technology to the domains of computing, clean tech, and energy.
Infovell announced its search business plans in September, with search technology for the domains of biotechnology, pharmaceuticals, patents, and Wikipedia. Now the search site has begun expanding into physical-science areas.
DeepDyve is designed to reach areas of the Internet not indexed by Google, Yahoo, and other major search engines, the company said. The company has indexed 500 million pages so far and hopes to expand to 1 billion by the end of the year.
The free DeepDyve technology requires registration, and the more elaborate premium product, which offers more complicated visualization and filtering of search results, costs $45 monthly per user.
Google's ever-active search bots, which scour the Web constantly for new pages, have begun a new, more active phase of their indexing jobs.
In a blog post Friday, Jayant Madhavan and Alon Halevy of Google's crawling and indexing team said the company has begun an experiment in which its indexing software enters text in Web site forms to see what previously undiscovered pages may appear.
"In the past few months, we have been exploring some HTML forms to try to discover new Web pages and URLs that we otherwise couldn't find and index for users who search on Google," they wrote. "This experiment is part of Google's broader effort to increase its coverage of the Web. In fact, HTML forms have long been thought to be the gateway to large volumes of data beyond the normal scope of search engines."
The new Google indexing practice involves only "high quality" Web sites and doesn't run on sites that use "robots.txt" files or other standard mechanisms to ward off indexing software.
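For readers curious how a crawler might honor a "robots.txt" file before touching a form, here is a minimal sketch using Python's standard library. It is not Google's implementation; the function name `may_crawl`, the `ExampleBot` user agent, and the sample rules are all assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from text already in hand, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: every crawler is asked to stay away from /search.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /search
"""

if __name__ == "__main__":
    print(may_crawl(EXAMPLE_ROBOTS, "ExampleBot", "https://example.com/search?q=patents"))
    print(may_crawl(EXAMPLE_ROBOTS, "ExampleBot", "https://example.com/about"))
```

A crawler built along these lines would skip form submission entirely whenever the check comes back negative for the form's target URL.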
To decide what words to "type" into the forms, the indexing software samples from among words on the Web page with the form, Google said.
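The word-sampling idea can be sketched roughly as follows. This is only an illustration of the general approach Google describes, under the assumption that candidate query terms are drawn at random from the visible text of the page that hosts the form; the names `candidate_words`, `k`, `min_len`, and `seed` are hypothetical and do not come from Google.

```python
import random
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def candidate_words(html, k=3, min_len=4, seed=None):
    """Sample up to k distinct words from the page text to try in a form."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    words = re.findall(r"[A-Za-z]+", " ".join(extractor.chunks).lower())
    # Drop very short words and duplicates; sort so a fixed seed is repeatable.
    unique = sorted({w for w in words if len(w) >= min_len})
    rng = random.Random(seed)
    return rng.sample(unique, min(k, len(unique)))


if __name__ == "__main__":
    page = (
        "<html><body><h1>Patent search</h1>"
        "<form><input name='q'></form>"
        "<p>Biotechnology patents and pharmaceuticals</p></body></html>"
    )
    print(candidate_words(page, k=2, seed=0))
```

Each sampled word would then be submitted through the form, and any result pages that turn up become candidates for indexing.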
The technology appears to be related to Transformic, a company that Google acquired, according to a blog post by Anand Rajaraman, who was involved with the technology earlier in his career, while working for Halevy.