
Cutting Edge

August 26, 2009 9:25 AM PDT

E-paper sales expected to hit $9.6 billion in '18

by Lance Whitney
  • 6 comments

Electronic paper is stacking up to be a high-growth market, according to a new report.

Sales of e-paper displays are projected to soar from $431 million this year to $9.6 billion in 2018, market researcher DisplaySearch said Wednesday.

The number of units sold is forecast to grow from 22 million this year to 1.8 billion in 2018.
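To put the forecast in perspective, the figures above can be turned into an implied annual growth rate. This is only a rough sketch: it assumes a nine-year span (2009 to 2018) and steady compound growth, neither of which DisplaySearch states, and the function and variable names are made up for illustration.

```python
def cagr(start, end, years):
    """Compound annual growth rate implied by growing from start to end."""
    return (end / start) ** (1 / years) - 1

# Assumption: "this year" is 2009, so the forecast spans nine years.
YEARS = 2018 - 2009

revenue_cagr = cagr(431e6, 9.6e9, YEARS)   # dollars, from the report
units_cagr = cagr(22e6, 1.8e9, YEARS)      # units sold, from the report

# Average revenue per display at each end of the forecast.
price_2009 = 431e6 / 22e6
price_2018 = 9.6e9 / 1.8e9

print(f"Revenue growth: {revenue_cagr:.1%} per year")
print(f"Unit growth:    {units_cagr:.1%} per year")
print(f"Average price:  ${price_2009:.2f} -> ${price_2018:.2f}")
```

Under those assumptions, units grow faster than revenue, which implies the average price per display falls sharply over the period.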

E-books are currently the main use and sales driver for e-paper. Most e-book readers, such as the Amazon Kindle and Sony Reader, use the electrophoretic display technology from E Ink. A few e-readers, such as Fujitsu's Flepia, use a different technology called cholesteric LCD. Fujitsu's device offers a color display but is more expensive than the Kindle or Sony Reader.

"E-paper displays are taking off with consumers due to their low power consumption and ease of reading, especially in sunlight," said Jennifer Colegrove, director of display technologies at DisplaySearch. "In addition, e-paper displays are 'green' because they reduce paper consumption."

The number of e-book readers on the market has risen steadily, starting with one model in 2003, three in 2006, five in 2007, and around 20 this year, notes the report.

Despite the visual appeal of Fujitsu's color Flepia e-book reader, DisplaySearch asserts that the high price and technical challenges of color e-books will limit their sales volume until 2011. The more popular electrophoretic display technology is likely to continue to lead the market and generate sales of $5.8 billion in 2018.

But other display technologies are poised for growth, the report said. Electrochromic displays, most commonly used in windows and other glass products, will target the market for smart labels and card displays. By 2013, electrochromic displays will be the leading technology for e-paper displays, DisplaySearch is forecasting.

Another competing technology called MEMS (micro-electro mechanical system) is expected to shift its market from cell phone displays to color and medium-sized e-books over the next few years.

Originally posted at Digital Media
Lance Whitney wears a few different technology hats--journalist, Web developer, and software trainer. He's a contributing editor for Microsoft TechNet Magazine and writes for other computer publications and Web sites. You can follow Lance on Twitter at @lancewhit. Lance is a member of the CNET Blog Network, and he is not an employee of CNET.
June 11, 2009 9:24 AM PDT

The many ways to access Wolfram Alpha

by Lance Whitney
  • 5 comments

Say what you will about Wolfram Alpha, the creators are hard at work trying to drum up interest in the site.

On Tuesday, the WA crew launched a number of updates to its service, some of which I tested. Now the team's Thursday blog points you to the many "cool tools" you can use to access the site--buttons, widgets, gadgets, and more. You can grab them from the Wolfram Alpha download page, where you'll find the tools organized by operating system and browser. I took them all for a spin to see how they fared.

Toolbars
Wolfram Alpha toolbars are available for Firefox 2 and 3 and Internet Explorer 6 or higher. I installed separate toolbars on both Firefox 3 and IE 8. After setup, the toolbar popped up displaying a text field where I could type my search term directly.

Since Wolfram Alpha's forte is mathematical questions, I asked the question: "What is the value of pi?" (I'm sure we all remember from high school that pi is the ratio of the circumference of a circle to its diameter.) As expected, the traditional WA search told me it was 3.14 followed by more digits than I cared to count. So far, so good.

Then I experimented with the other toolbar buttons and learned that each one pointed me to different results on the same question. One button points to a site called the Wolfram Demonstrations Project, which illustrates mathematical concepts. Here I could see the value of pi in action by watching a 3D globe with changing dimensions. A button for the Mathematica Documentation Center showed me links to complex equations involving pi. Buttons for Wolfram MathWorld and Wolfram Research linked to mathematical and scientific articles on pi. More than I'd ever want to know about pi, but I know it'd make my old algebra teacher smile.

Wolfram Alpha helps me with pi.

Windows Deskband
The Deskband installed a Wolfram Alpha search tool on my Windows taskbar. Here the same options were available as with the browser toolbar but conveniently accessible from my desktop. Another handy tool.

Windows Desktop gadget
I next tried the Windows Vista Desktop gadget, which plopped a Wolfram Alpha search field on my Vista sidebar. This came without links to the other sources that were accessible from the toolbar, so I didn't find it quite as useful.

Search engine add-ins
This tool added Wolfram Alpha to my browser's list of default search providers. Quick and easy to install, and it worked well in both Firefox and IE.

iGoogle gadget
I use iGoogle as my personalized home page and rely on all of the gadgets available, so I liked this one. The Wolfram Alpha gadget is similar to the sidebar gadget--displaying a single text field for my query.

Internet Explorer 8 accelerator
I installed the Wolfram Alpha Accelerator for Internet Explorer 8. Accelerators let you select text on a Web page to quickly search on it using different sources. Most of the text I found on a typical Web site didn't lend itself to a Wolfram Alpha computational search, so I found little value here.

Mac OS X Dashboard widget
I couldn't test the OS X Dashboard widget because I don't yet have a Mac. (No comments from Mac users please; it's on my shopping list.) But it should work similarly to its Windows counterpart. I did get the screenshot below from my colleague Stephen Shankland.

Wolfram Alpha Mac widget

Some of the tools did help me see more value in Wolfram Alpha. Of course, they also serve to promote the site, but that's okay by me as long as they work. Of all the downloads, the WA toolbar and deskband demonstrated more of the scope and versatility of Wolfram Alpha by pointing me to different sources--something I didn't know about just from using the Web site.

June 9, 2009 9:36 AM PDT

Wolfram Alpha rolls out core updates

by Lance Whitney
  • 7 comments

Though only three weeks old, Wolfram Alpha is already showing growth spurts.

On Monday, the project unleashed a variety of updates to its computational search engine, according to the latest blog from the Wolfram Alpha team. The updates include 1,850 changes to its code base and 1.1 million updates to its data.

In one sense, Wolfram Alpha is in constant update mode, since new data is flowing into the system all the time. But this is the company's first major release of so many core updates in one shot.

The blog post lists about 20 of the many updates. Some of the descriptions are abstract, such as "Additional linguistic forms for many types of data and questions." But several of the updates intrigued me, so I took them for a test drive to see how they fared with my own questions.

Here's a list of the updates I tried, along with my results:

More "self-aware" questions answered (e.g. "how old are you")
When I asked Wolfram Alpha its age, it told me it was 24 days, 10 hours, 21 minutes, and 38 seconds old. When I asked how old President Obama is, it said he was 47 years, 10 months, and 4 days old. Simple enough.
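The age answer is simple calendar arithmetic, and a minimal sketch of it is below. It is not how Wolfram Alpha computes ages, which the site doesn't disclose. Two inputs are assumptions that don't appear in the post: Obama's birth date (August 4, 1961, from the public record) and a query date of Monday, June 8, 2009. The helper names are invented for illustration.

```python
import calendar
from datetime import date

def add_months(d, n):
    """Return d shifted by n months, clamping the day to the month's length."""
    year, month_index = divmod(d.year * 12 + (d.month - 1) + n, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(d.day, last_day))

def age_ymd(born, today):
    """Return the elapsed time between two dates as (years, months, days)."""
    months = (today.year - born.year) * 12 + (today.month - born.month)
    if add_months(born, months) > today:
        months -= 1  # the current month's "birthday" hasn't arrived yet
    anchor = add_months(born, months)
    return months // 12, months % 12, (today - anchor).days

# Assumed inputs: birth date from the public record, query date inferred.
obama_age = age_ymd(date(1961, 8, 4), date(2009, 6, 8))
print(obama_age)
```

With those two dates the function returns 47 years, 10 months, and 4 days, which matches the answer the site gave.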

Improved linguistic handling for many foods (e.g. "love apple")
I typed "Big Mac," and Wolfram Alpha displayed a nutritional label, breaking down the amount of fat, cholesterol, sodium, and other nasty items in the burger. Definitely makes me want to stick with a salad.

Big Mac Facts from Wolfram Alpha

Combined time series plots of different quantities (e.g. "germany gdp vs population")
I entered "US population vs. oil prices." The site displayed a single chart tracking and comparing the rise in the number of people in the U.S. with the jump in the price of oil since 1970. I can see this as a useful capability to compare two distinct but possibly related items.

Additional support for stock prices with explicit dates
I typed "IBM June 1 1980," and Wolfram Alpha popped up a chart tracking IBM's stock price for the four-year period from 1979 to 1982, with the June 1, 1980, price highlighted. But Wolfram Alpha's stock data only goes back so far. When I entered a date for IBM stock earlier than June 1970, no chart or data was found. Other stocks also came up empty for data prior to the '70s.

Support for planet-to-planet distances and "nearest planet", etc.
Typing "Earth Saturn" brought up detailed facts and figures for both planets, including mass, radius, number of moons, distance from the sun, and distance from each other. This proved a quick, easy way to get all those facts in one shot.

More comparisons of composite properties (e.g. "US military vs. UK")
I entered "US unemployment vs Spain." The site told me the U.S. jobless rate is 7.2 percent, but Spain's is 13.9 percent. To double-check, I ran Google searches for the two countries' jobless rates. I found the current U.S. rate is around 9 percent (sourced to the U.S. Bureau of Labor Statistics, and I double-checked on the BLS site), while Spain's is 18 percent. Checking further, I discovered that Wolfram Alpha was showing me data from the last three months of 2008 rather than the latest figures.

When I searched Wolfram Alpha for "US unemployment" and "Spain unemployment" by themselves, I got the same 2008 figures, but this time Wolfram Alpha told me the numbers were for 2008, a fact omitted in my first search results. Finally, I searched for "US unemployment 2009," but the site simply told me no data was available.

Based on my limited searches, my impression of Wolfram Alpha was mixed. I was intrigued by some of the obscure data it conjured up. I also liked the convenience of finding all relevant facts on one page.

But I was disappointed by the gaps in its knowledge. Why no stock information prior to 1970? (I'd like to check stock prices from the crash of 1929.) Why no unemployment data later than 2008? And why the lack of consistency in telling me the jobless numbers were from last year? I can more easily Google "unemployment rates" to grab the latest results.

Granted, Wolfram Alpha isn't like a traditional search engine, serving up links to other sites. It relies on its own research and database to deliver results. The service has even said it's trying something different here. And at only three weeks old, the site is still experiencing growing pains.

The latest updates can help track down certain types of data. But Wolfram Alpha may have a way to go before it offers truly complete and up-to-date information.

May 15, 2009 6:03 PM PDT

Wolfram Alpha launches amid glitches

by Tom Krazit
  • 11 comments

This post was updated at 9:10 p.m. PDT to note that Wolfram Alpha is now up and running.

Wolfram Research founder Stephen Wolfram (blue shirt, center) convenes a meeting Friday night live on the Internet to discuss launch problems with Wolfram Alpha.

(Credit: Screenshot by Tom Krazit/CNET)

Wolfram Alpha struggled to get up and rolling Friday evening under difficult conditions, as the company scaled back expectations for its performance this weekend.

The new search engine attempted to make its debut literally in the middle of the perfect storm: a tornado watch had engineers on edge in Champaign, Ill., where Wolfram Research attempted to bring the service online. However, networking and database problems also prevented the engine from launching as of 6 p.m. PDT, an hour after the company said it would go live to the world.

And to top it all off, uplink problems with the Justin.tv service prevented Wolfram from explaining exactly what was going on for nearly half an hour, while commenters in the chat room mercilessly heckled the company with the 21st century Bronx cheer: EPIC FAIL. Eventually, around 5:30 p.m. PDT, Wolfram founder Stephen Wolfram appeared on camera to explain that glitches were holding up the launch, as claps of thunder sounded in the background.

After initially claiming that the service would go live Friday evening, Wolfram lowered expectations by only promising a test launch over the weekend, with full service expected by Monday. Searches could not be conducted through the main home page as of 6 p.m., but it was possible to get in through a back door posted on Twitter and in the Justin.tv chat room and start searches, although performance was spotty.

Wolfram Research was forced to air videos, taped earlier in the day, showing off the data center servers and power-redundancy systems while Stephen Wolfram convened an emergency meeting to figure out what was going on. The broadcast was somewhat less than polished, with audio engineers talking over Wolfram's initial address to the audience and rendering much of his speech incoherent.

Please let us know if you're able to get into the search engine Friday night, and we'll update our coverage over the weekend if Wolfram is able to bring the service online.

Update at 9:10 p.m. PDT: Wolfram Alpha is now up and running and seems to be working just fine. A spokesman for Wolfram Research called at 7:30 p.m. PDT to say it was about to launch, but its exact launch time is unclear.

(Credit: Screenshot by Michelle Meyers/CNET)
May 15, 2009 2:29 PM PDT

Wolfram Alpha encounters 'snag,' launch could be delayed

by Michelle Meyers
  • 10 comments

This post was updated at 4:08 p.m. PDT with information from a Wolfram Alpha blog post and again at 5 p.m. PDT with info from a Wolfram spokesman.

Wolfram Alpha, the new "computational knowledge engine" set to debut publicly Friday, has hit a technical snag that could delay its launch, a spokesman for Wolfram Research confirmed.

The online tool--which some say could give Google a run for its money--supplies answers to factual, data-intensive questions but also does math in the process. It was set to go live to the public at 5 p.m. PDT.

However, in an interview with the Los Angeles Times (and confirmed by Wolfram public relations director John Ekizian), Stephen Wolfram said a large-scale traffic simulation test had failed. "We ran into a small snag, which hopefully won't turn into a big snag," Wolfram said.

Here's more on how Wolfram explained it to the Times:

We have several supercomputer-class compute clusters. One of our tests was to use one cluster to simulate traffic and run it against the other cluster. And when we did that last night, we found that the through-put we got degraded horribly when we increased the amount of traffic that we were pushing from one cluster to the other.

Ekizian said, for now, he could only confirm the accuracy of the Times' story. But stay tuned for expected updates related to the planned 5 p.m. launch.

Update at 4:08 p.m. PDT: A Friday afternoon post on the Wolfram Alpha blog about the countdown to launch doesn't offer any specifics on time frame. Rather, it says the team has "been switching on more and more compute capacity, with the expectation of having full capacity available on Monday."

As for the snag, it only says the team is in "full-court-press resolving network infrastructure, database, and all sorts of other challenges, made particularly interesting by the sheer scale and complexity of launching two supercomputer-class clusters along with three other locations."

The post concluded with a reference to a tornado watch in Illinois where Wolfram Research is based: "It will be an interesting evening!"

Update at 5 p.m. PDT: Ekizian confirmed that Wolfram expects Wolfram Alpha to go live between 5 p.m. and 8 p.m. PDT Friday, "for some time." He warned not to be surprised if the service is up and down over the weekend, but it should be at "full throttle" by Monday morning.

May 12, 2009 12:17 PM PDT

Wolfram Alpha gets supercomputer boost

by Stephen Shankland
  • 3 comments

One of my concerns about the public launch of Wolfram Alpha later this month is whether it can withstand the crushing load the Internet can impose. But Wolfram Research revealed Tuesday it's building the service on the world's 66th-fastest supercomputer.

The machine, built out of Dell hardware by a company called R Systems, can sustain performance of 39.6 trillion mathematical operations per second, according to the November 2008 list of the top 500 supercomputers. That muscle will come in handy for Alpha, which I think of as a combination of a graphing calculator, search engine, and reference library that not only supplies some answers to factual, data-intensive questions but also does math in the process.

"There is no way to know exactly how much traffic to expect, especially during the initial period immediately following our launch, but we're working hard to put reasonable capacity in place. Will we have enough computing power to provide computable knowledge for everyone who visits? We hope so," Wolfram Research said on its Wolfram Alpha blog Tuesday.

The system, called R Smarr, has 4,608 processor cores using 576 quad-core "Harpertown" Xeon machines, 65,536GB of memory, and high-speed InfiniBand data-transfer connections, according to the Top500 site and a Dell case study on the system (PDF). It also uses both the Red Hat Enterprise Linux and Microsoft Windows HPC Server operating systems, according to the Dell paper.
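For a sense of scale, the quoted figures can be broken down per machine and per core. The sketch below uses only the numbers given above. The constant names are made up for illustration, and the even split across cores is a simplifying assumption, not something the Top500 list reports.

```python
# Figures quoted above from the Top500 list and the Dell case study.
SUSTAINED_OPS_PER_SEC = 39.6e12   # 39.6 trillion operations per second
CORES = 4608
MACHINES = 576

# Assumption: performance is divided evenly across all cores.
cores_per_machine = CORES / MACHINES
gflops_per_core = SUSTAINED_OPS_PER_SEC / CORES / 1e9

print(f"Cores per machine: {cores_per_machine:g}")
print(f"Sustained billions of operations per second, per core: {gflops_per_core:.2f}")
```

The division works out to eight cores per machine, which suggests each machine holds two quad-core chips. That is an inference from the arithmetic, not something stated in the post.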

Alpha requests will be served from five co-location facilities, Wolfram Research said. There actually are two supercomputers in the project, with nearly 10,000 processor cores total and hundreds of terabytes of hard drives.

R Smarr doesn't use ordinary Dell servers. Instead, custom-made machines were ordered through Dell's Data Center Solutions division. "We evaluated the standard Dell PowerEdge servers, but at the time, those systems did not offer a server board that could deliver the high memory bandwidth necessary for our client," said Brian Kucic, R Systems' vice president of business development, in the Dell paper.

Part of what goes on behind the Alpha covers is Wolfram Research's Mathematica software, which can perform a wide variety of mathematical and graphical operations. A massive number-crunching utility freely available over the Web sounds like a recipe for disaster, but in my tour of a preview version of Wolfram Alpha, I encountered a timeout limit of about 5 seconds when searching for all occurrences of a particular sequence in the human genome, so it looks to me like Wolfram has the ability to throttle usage.

May 5, 2009 12:04 PM PDT

Wolfram Alpha shows data in a way Google can't

by Stephen Shankland and Rafe Needleman
  • 19 comments

Wolfram Alpha is like a cross between a research library, a graphing calculator, and a search engine. But does Wolfram Research's "computational knowledge engine," set to debut publicly later this month, live up to its hype as a Web site that Google needs to be afraid of?

Wolfram Alpha creator Stephen Wolfram on Tuesday gave a demo of the service to a crowd of online reporters. Few have access to the private test version of the service itself, but we got access Monday night. We found it compelling, if limited.

We're eager to see this site develop. It does things with online information that Google does not. Here are our impressions of the current version of Wolfram Alpha.


Who's it for?

CNET reporter Stephen Shankland: Today at least, Wolfram Alpha is for the tech crowd--the kind of people who want to dig into the data. It's a great exploration tool to find out whether somebody who's 5 feet 5 inches and 160 pounds is overweight, the chemical properties of boron, and whether you're going to get a full moon during the evening of September 4 in Buenos Aires when you want to propose to your fiancee.
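The overweight question is a good example of the kind of calculation involved. Here is a minimal sketch using the standard body mass index formula and the usual adult cutoffs. Whether Wolfram Alpha uses exactly this method is an assumption, and the function names are invented for illustration.

```python
LB_TO_KG = 0.45359237   # exact definition of the pound in kilograms
IN_TO_M = 0.0254        # exact definition of the inch in meters

def bmi(weight_lb, height_in):
    """Body mass index: weight in kilograms over height in meters squared."""
    return (weight_lb * LB_TO_KG) / (height_in * IN_TO_M) ** 2

def category(value):
    """Map a BMI value to the standard adult bands."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    return "obese"

height_in = 5 * 12 + 5          # 5 feet 5 inches
value = bmi(160, height_in)     # 160 pounds
print(f"BMI {value:.1f}: {category(value)}")
```

For the example in the text, the index comes out at about 26.6, which falls in the overweight band.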

Wolfram Alpha will show you when the next eclipse will occur over San Francisco.

(Credit: Screenshot by Stephen Shankland/CNET)

It'll tell you the family, genus, species, and caloric value of an apple, and it'll forecast Apple's stock price, but it won't give you apple pie recipes. It'll tell you the box office take of the first "Star Trek" movie, but it won't tell you the theater where you can see the newest "Star Trek" movie.

But a technical audience is still big. This could unlock a lot of data that students, research assistants, lawyers, marketing managers, financial analysts, and scientists might not have readily available. And those folks are important, too--just the kind of influential folks people with Web sites like to reach.

CNET Editor Rafe Needleman: I wouldn't dream of pointing my parents at this. It's too picky about syntax and not intuitive to get into. When I saw Stephen Wolfram give a demo of the system I was blown away. He ran through dozens of demos from weather to genetics to calculus to finance, each resulting in beautiful and informative results. But when I tried the service I'd say maybe only 10 or 20 percent of my queries actually worked.

Shankland: On the other hand, my dad has a Ph.D. and I most definitely will point him at it. He bought Wolfram Research's all-purpose computation software, Mathematica, though, so for him Wolfram Alpha is like preaching to the choir.

Needleman: My dad has a Ph.D., too, but in philosophy. There is no Wolfram Alpha for that.

Shankland: Yet. Alpha handles numeric data well, but loosey-goosey stuff like art or philosophy is tough. But maybe in some glorious future Alpha will be able to chart the trains of thought from the Enlightenment to the present.


Is it easy to use?

Needleman: You need a clear mind to take advantage of this service. Again, it's picky about syntax, and in the pre-release version we tried, if you got a query wrong--if it didn't return what you were looking for--it wouldn't offer you much in the way of help to refine the query. I kept trying to figure out how to correlate weather with earthquakes in San Francisco. I can get the data for weather. I can get it for earthquakes. So I know that Alpha has the information. But I can't figure out how to show them together.

Curious about how closely NetApp's stock price has correlated with EMC's? Wolfram Alpha will tell you.

(Credit: Screenshot by Stephen Shankland/CNET)

What the system does know is beautifully presented. Type in the name of a city, for example, and it will give you some fun stats on a clean and clear Web page. But from that sort of page you'll probably want to start exploring the data available: Maybe you want to know about population growth, economic information, or weather trends. Alpha doesn't give you hints as to what's available, nor a good way to drill into data. You have to take stabs at re-typing your query. I tried a variety of queries like "test scores san francisco schools" and "population of portland by year" and got, respectively, no result and a pointless result (533,429 person years: what is that?). The system that interprets Wolfram Alpha queries needs a little bit of help. It may be improved by the time the system is opened to the public, later this month, but I think that this will be the product's Achilles heel.

Originally posted at Webware
May 4, 2009 12:31 PM PDT

Patent reveals Google's book-scanning advantage

by Stephen Shankland
  • 14 comments

Sometimes overlooked in the Sturm und Drang about Google Book Search is any consideration of the mechanics of economically scanning the books in the first place, but a patent awarded to Google gives insight into how the search behemoth accomplishes the task.

In short, Google has come up with a system that uses two cameras and infrared light to automatically correct for the curvature of pages in a book. By constructing a 3D model of each page and then "de-warping" it afterward, Google can present flat-looking pages online without having to slice books up or mash them onto a flatbed scanner.

This diagram shows patented Google technology for correcting for curved pages while scanning books.

(Credit: Google)
... Read more
April 28, 2009 1:45 PM PDT

Google crashes Wolfram Alpha debut party

by Stephen Shankland
  • 9 comments

Updated at 3:12 p.m. PDT with further detail.

Wolfram Research founder Stephen Wolfram publicly debuted his company's forthcoming online "computational knowledge engine" Tuesday--but search Goliath Google launched a service of its own that bears significant resemblance.

Wolfram Research CEO Stephen Wolfram

(Credit: Stephen Wolfram)

The Wolfram Alpha engine is a Web service designed to process data from controlled, vetted sources of data--many not on the Web--then present the results in a way that lets people dig deeper into the subject. It's something of a cross between a graphing calculator, repositories of scientific data, and a system to interpret questions posed in human terms.

"Like interacting with an expert, it'll understand what you're talking about, do the computation, and present the results in such a way you'll be able to understand what the consequences are," Wolfram said in a talk at Harvard's Berkman Center for Internet and Society Tuesday.

For example, people can ask about the molecular weight of caffeine, about the location of a gene in the human genome, the number of people named Andrew born in a particular year, the amount of fish produced in France, the life expectancy of 40-year-olds, and the performance of Microsoft stock--and then dig into the results. The height of Mt. Everest can be expressed in terms of the length of the Golden Gate Bridge.

Wolfram has deep technical chops. He's a MacArthur "genius grant" recipient who got his Ph.D. in theoretical physics at age 20 and founded Wolfram Research to commercialize Mathematica, mathematics software that can perform a wide variety of computational and graphing chores. He also spent a good portion of the 1990s writing "A New Kind of Science," a 1,200-page tome (also available online) that seeks to transform science by presenting a computational view of physics.

The Alpha site will be publicly available "in a few weeks," with free access to all users supported by sponsors and subscriptions for heavy-duty users who want the system to process their own data, Wolfram said.

Gatecrashing Google
But another similar service is available today: a Google feature that can search public data and present the results graphically.

"We just launched a new search feature that makes it easy to find and compare public data," Ola Rosling said of the service in a blog post. "The data we're including in this first launch represents just a small fraction of all the interesting public data available on the web. There are statistics for prices of cookies, CO2 emissions, asthma frequency, high school graduation rates, bakers' salaries, number of wildfires, and the list goes on."

The service is based on Google's 2007 acquisition of Trendalyzer, Rosling said.

Google now lets people search public data sets.

(Credit: Screenshot by Stephen Shankland/CNET)

One example: "When comparing Santa Clara county data to the national unemployment rate, it becomes clear not only that Santa Clara's peak during 2002-2003 was really dramatic, but also that the recent increase is a bit more drastic than the national rate," he said.

Thus far, Google's service includes data only from the U.S. Bureau of Labor Statistics and the U.S. Census Bureau's Population Division.

"We hope people will find this search feature helpful, whether it's used in the classroom, the boardroom or around the kitchen table. We also hope that this will pave the way for public data to take a more central role in informed public conversations," he said.

Google didn't immediately comment about whether the timing of its launch was coincidental, and Wolfram Research didn't immediately comment on the Google product.

Alpha's underpinnings
Alpha has four main components, Wolfram said.

• Data curation. Wolfram Alpha uses public and licensed proprietary data sources, and the company uses automated processes and human choices to prepare the data. "At some point you need a human domain expert in front of it," Wolfram said.

• Algorithms. Alpha must pick the right computational processes to present its results. "Inside Wolfram Alpha are 5 million to 6 million lines of Mathematica code that implement all those methods and models," he said.

• Linguistic analysis to understand what a person typed. "I thought one of many things that could have gone wrong was that short, lazy things would (have) huge amounts of ambiguity," for example figuring out whether "50 cent" had to do with musical artists or money. "That turned out to be not nearly as much of a problem as we expected."

• Presentation. "There are tens of thousands of possible graphs. What do you want to show people?" Wolfram asked.

Wolfram hopes the tool will help researchers perform scientific chores that before were possible but not necessarily worth their time.

"What's the angle of sun at particular moment? Given 20 minutes, I could compute it and get it right, but I probably wouldn't bother," Wolfram said. "What Wolfram Alpha does is take that piece of scientific knowledge and make it immediately accessible to everybody."

October 10, 2008 4:00 AM PDT

Academics sink teeth into Yahoo search service

by Stephen Shankland
  • 6 comments

SUNNYVALE, Calif.--It only took a few years for the science of information retrieval to move from an obscure academic niche to the secretive research departments at the heart of multibillion-dollar Internet companies.

But one of those companies, Yahoo, is trying to give a little more power back to the professors and grad students through a program called BOSS (Build Your Own Search Service). The service lets academics and start-ups build their own search sites around Yahoo's search engine for free, manipulating results however they want.

Two dozen researchers and students from Stanford, the Massachusetts Institute of Technology, Purdue, and other universities met here at Yahoo for a day in September to hear the company's BOSS pitch, show off some ideas they've had for how to use it, and try to coax Yahoo into sharing even more information through BOSS. Overall, their response to Yahoo's program was favorable.

MIT's Harr Chen would love even more data from Yahoo.

(Credit: Stephen Shankland/CNET News)

"It enables a lot of research that we wouldn't otherwise be able to do," said Harr Chen, an MIT researcher at the event.

If it works out as hoped, Yahoo will make some money out of the program: corporate users who reach large scale with BOSS will have to show Yahoo's search ads. The academic side is a step removed from direct revenue, instead giving Yahoo some prominence with potentially influential thinkers in a market Google dominates. Piquing the interest of researchers at universities with a reputation for incubating the next big ideas is smart, though, and Yahoo and Google themselves both grew out of Stanford.

And honestly, with Google hogging 63 percent of the U.S. search market to Yahoo's 19.6 percent, what does Yahoo have to lose?

"We're not a market leader," said Prabhakar Raghavan, chief strategist for Yahoo Search. "From a strategic standpoint, it does make sense to let other people innovate on top of us. If the pie grows, our share of the pie grows at the expense of somebody else."

The ultimate hope is that BOSS will mean money, too.

Yahoo has made the investment in a massive infrastructure that constantly scans and re-indexes the Web, filters out some of the dreck, interprets search queries, and provides search results in high volume in very short order. This infrastructure is prohibitively expensive for start-ups, just as it is for academic researchers, so Yahoo is letting companies use BOSS as well. Those operating on a small scale may use BOSS for free, but Yahoo requires larger efforts to either show ads or sign a custom revenue-sharing deal.

Mashing up Yahoo results
One possibility for BOSS is that Yahoo's search results can be combined with other data sets. "Other parties may have more info about their users," said BOSS engineer Vik Singh. For example, a social-networking site can track movies or the activities of friends that could be useful in shaping search results. "This is stuff we may or may not have," Singh said.

Prabhakar Raghavan, chief strategist for Yahoo Search

(Credit: Stephen Shankland/CNET News)

Chengxiang Zhai and Bin Tan of the University of Illinois at Urbana-Champaign showed one example of BOSS in action that uses this idea of modifying Yahoo's search results. Their application steered Yahoo's search engine in particular directions based on the data stored on a user's own computer.

In the example, the computer was able to discern which jaguar the user was more likely to be looking for--the cat, not the car or the version of Mac OS X--based on evidence on the computer.

"We believe the client side of personalization has a few advantages over the server side," Zhai said. "It can alleviate concern over privacy and it can provide more information about user activity. And it can naturally distribute computation," so a search company's machines share work with the user's own computer.
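The idea of steering results with data that never leaves the user's machine can be illustrated with a minimal sketch. This is not Zhai and Tan's application, whose internals the article does not describe; it is a simple word-overlap re-ranker, and the result records, snippets, and local documents below are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_profile(local_documents):
    """Count how often each word appears in the user's own files."""
    profile = Counter()
    for doc in local_documents:
        profile.update(tokenize(doc))
    return profile


def rerank(results, profile):
    """Order results by how much each snippet overlaps the local profile.

    sorted() is stable, so results with equal scores keep the order
    in which the search engine returned them.
    """
    def score(result):
        return sum(profile[word] for word in set(tokenize(result["snippet"])))

    return sorted(results, key=score, reverse=True)


# Hypothetical files on the user's computer (assumption for illustration).
local_documents = [
    "Field notes: big cat habitat in the rainforest",
    "Zoo trip photos, jaguar and leopard enclosure",
]

# Hypothetical search results for the query "jaguar", in engine order.
results = [
    {"title": "Jaguar cars", "snippet": "Luxury car models and dealer prices"},
    {"title": "Mac OS X Jaguar", "snippet": "Apple operating system release"},
    {"title": "Jaguar (animal)", "snippet": "The jaguar is a big cat of the rainforest"},
]

ranked = rerank(results, build_profile(local_documents))
print([r["title"] for r in ranked])
```

Because the profile is built and applied locally, only the original query would need to reach the search engine in a setup like this, which is the privacy point Zhai raises.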

Qualitatively different
Researchers could investigate search and related technologies such as natural-language processing (NLP) without BOSS. But with it, that research is vaulted into a different domain. It isn't just a matter of taking more time; with BOSS's vast index of the Web, the possibilities are qualitatively different.

"You gain enormously from access to the data. There are all sorts of things you can do with tons of data" that you can't with a smaller set, said Stanford's Christopher Manning.

Manning works in the active field of natural-language processing, technology that aims to let computers discern the meaning of real human speech or text and that's behind search technology from search start-up Hakia and Microsoft-acquired PowerSet. NLP benefits tremendously from having large-scale data sources, Manning said.

"To understand what words mean, you look at how they're used. We do that on a large scale, (examining) usage and context to learn about meaning," Manning said.
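A toy version of "look at how words are used" might count which words appear near each other and compare those counts. The sketch below is a generic co-occurrence example, not Manning's method, and its three-sentence corpus is invented for illustration; real systems rely on vastly more text, which is why large-scale data matters.

```python
from collections import Counter, defaultdict


def context_counts(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    contexts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    contexts[word][words[j]] += 1
    return contexts


def overlap(contexts, a, b):
    """Count context occurrences that two words share (sum of minimum counts)."""
    return sum((contexts[a] & contexts[b]).values())


# Invented corpus (assumption for illustration).
corpus = [
    "the jaguar hunts in the forest",
    "the leopard hunts in the forest",
    "the sedan drives on the highway",
]

contexts = context_counts(corpus)
print("jaguar/leopard:", overlap(contexts, "jaguar", "leopard"))
print("jaguar/sedan:", overlap(contexts, "jaguar", "sedan"))
```

Words used in similar surroundings score higher, so "jaguar" lands closer to "leopard" than to "sedan" in this tiny corpus.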

Please, sir, I want some more
It also was clear the researchers' appetites were whetted by BOSS. Nobody sounded ungrateful, but heck, as long as Yahoo is sharing some important data, why not share a little more?

Yahoo is headed in that direction. On the research day, it opened up access to another slice of search-related "prisma" data.

Vik Singh, an engineer behind Yahoo BOSS

(Credit: Stephen Shankland/CNET News)

Prisma powers Yahoo's search assist feature, which suggests searches based on what people have begun to type into the search box. That can make searching more convenient for users, but for researchers trying to build more technology atop Yahoo search results, prisma data is bigger than that. For example, it can show a search term's variations, its membership in categories such as place names, movies, and government, and the likelihood that people search for the term by itself or as part of a larger query.

"That's got a lot of potential," said Dan Ramage, a natural-language processing Ph.D. candidate at Stanford. Ramage said BOSS is useful for his research, which focuses on determining the various relationships that can connect a pair of words, but he'd like it better if he had more control over the snippets of text Yahoo shows with its search results.

Yes, Yahoo will share more
Yahoo plans to release more. "Over time you'll see we'll offer a lot more ingredients, a lot more power," said Ashim Chhabra, senior product manager with the BOSS project.

Some researchers are hungry for as much as they can get. Chen, for example, hoped Yahoo could become an engine to run software supplied by researchers that plumbs its entire Web index.

"We give you a little code, you run that code on every document, then you give us a number," Chen suggested. It would be useful, for example, "to track evolution of themes and memes on the Web, different buzz trends."
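Chen's proposal amounts to applying a small researcher-supplied function to every document and combining the results into a number. A minimal sketch of that pattern follows; Yahoo offers no such interface in the article, so the function names, document records, and the "lolcat" phrase are all hypothetical.

```python
from collections import defaultdict


def run_over_index(documents, per_document, combine=sum):
    """Apply researcher-supplied code to every document and reduce to one number."""
    return combine(per_document(doc) for doc in documents)


def mentions_meme(doc, phrase="lolcat"):
    """The 'little code': return 1 if the document mentions the phrase, else 0."""
    return 1 if phrase in doc["text"].lower() else 0


def trend(documents, per_document):
    """Group documents by month and compute one number per month."""
    by_month = defaultdict(list)
    for doc in documents:
        by_month[doc["month"]].append(doc)
    return {
        month: run_over_index(docs, per_document)
        for month, docs in sorted(by_month.items())
    }


# Hypothetical stand-in for a Web index (assumption for illustration).
documents = [
    {"month": "2008-08", "text": "A lolcat picture went around the office"},
    {"month": "2008-09", "text": "Everyone is posting LOLcat captions"},
    {"month": "2008-09", "text": "Quarterly earnings report"},
    {"month": "2008-09", "text": "Another lolcat roundup"},
]

monthly_mentions = trend(documents, mentions_meme)
print(monthly_mentions)
```

In this arrangement the index owner runs the code and returns only the aggregate, so the researcher never receives the documents themselves.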

Graham Mudd, product marketing manager for Yahoo search, said the idea is "not as crazy as you think," though he also gave the impression that researchers shouldn't hold their breath for that level of access. But Yahoo clearly wants to offer what it can.

When it comes to search research, "The pool of talent is divided between a half a dozen companies," Raghavan said. "We think it behooves us to open up."
