July 25, 2008 4:00 AM PDT

Microsoft tries to one-up Google PageRank

by Stephen Shankland

Though it trails Google in a distant third place, Microsoft thinks it can teach its rival a thing or two about searching the Internet.

A big part of Google's rise to search engine leadership was an algorithm called PageRank that assesses a specific page's importance by how many other Web pages link to it and by the importance of those linking pages. Microsoft researchers and academic collaborators, though, detailed an idea this week they call BrowseRank that seeks to bring more of a human touch to that assessment.
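To make the link-counting idea concrete, here is a minimal sketch of a PageRank-style calculation. It is an illustration only, not Google's production code: the function name `pagerank`, the toy site names, and the fixed iteration count are assumptions made for this example, and the damping factor of 0.85 is the conventional value from the original PageRank literature rather than anything stated in this article.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from a link graph.

    links maps each page to the list of pages it links to.
    (Hypothetical helper for illustration only.)
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with importance spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its importance among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its importance evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy four-page Web (made-up names): nothing links to "orphan".
toy_web = {
    "home": ["news", "blog"],
    "news": ["home"],
    "blog": ["home", "news"],
    "orphan": ["home"],
}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph, "home" comes out on top because every other page links to it, while "orphan" sits at the bottom because no page links to it.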

Microsoft likes the results of BrowseRank, which assigns Web page priority based on how people actually use the site.

(Credit: Microsoft Research Asia)

Essentially, the researchers tested out a system that replaces PageRank's link graph--a mathematical model of the hyperlinked connections of the Internet--with what they call a user browsing graph that ranks Web pages by people's behavior.

"The more visits of the page made by the users and the longer time periods spent by the users on the page, the more likely the page is important. We can leverage hundreds of millions of users' implicit voting on page importance," the researchers said in BrowseRank: Letting Web Users Vote for Page Importance, a paper from the SIGIR (Special Interest Group on Information Retrieval) conference this week in Singapore. Authors are Bin Gao, Tie-Yan Liu, and Hang Li from Microsoft Research Asia and Ying Zhang of Nankai University, Zhiming Ma of the Chinese Academy of Sciences, and Shuyuan He of Peking University.
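The "implicit voting" the researchers describe can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the algorithm from the BrowseRank paper: it scores each page by its share of all time users spent browsing, which rewards both frequent visits and long stays. The function name `browsing_scores`, the log format, and the sample data are assumptions made for this illustration.

```python
from collections import defaultdict


def browsing_scores(visits):
    """Score pages by their share of total time spent by users.

    visits is an iterable of (page, seconds_on_page) records.
    (Hypothetical log format, for illustration only.)
    """
    total_time = defaultdict(float)
    for page, seconds in visits:
        # Summing dwell time counts both factors at once:
        # total time = number of visits * average time per visit.
        total_time[page] += seconds

    grand_total = sum(total_time.values())
    if grand_total == 0:
        return {}
    return {page: seconds / grand_total for page, seconds in total_time.items()}


# Made-up browsing log: "landing" gets the most visits, but users leave quickly.
sample_log = [
    ("guide", 120), ("guide", 90), ("guide", 150),
    ("landing", 5), ("landing", 8), ("landing", 4), ("landing", 3),
    ("faq", 60),
]
browse_scores = browsing_scores(sample_log)
for page, score in sorted(browse_scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

Here the "guide" page outranks "landing" even though it has fewer visits, because readers stay on it far longer. A real system would also need defenses against automated traffic that inflates visit counts and session lengths.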

Search is of tremendous importance to the Internet for many reasons. For one thing, search engines are highly influential middlemen that steer users to Web sites they may not be able to find on their own. For another, queries typed into search engines can be powerful--and in Google's case highly profitable--indications of what type of advertisement to place next to the search results.

But Microsoft lags leader Google and No. 2 Yahoo in search. It's trying hard to catch up, for example with unsuccessful proposals to acquire Yahoo or its search business that would have cost the company billions of dollars. And Microsoft just bought search start-up Powerset.

Google isn't putting all its eggs in the PageRank basket, though.

"It's important to keep in mind that PageRank is just one of more than 200 signals we use to determine the ranking of a Web site," the company said in a statement. "Search remains at the core of everything Google does, and we are always working to improve it."

PageRank shortcomings
The Microsoft researchers argue that PageRank has a number of problems. For one thing, people can game the system by building bogus Web sites called link farms. Those sites feature hyperlinks that point to a Web page whose importance a person wants to inflate so it appears higher in search results. Another PageRank issue is that the indexing process doesn't take into account the time a user spends on a particular site.

But user behavior, monitored in anonymous form by Web servers and Web browser plug-ins, can be better, the authors argue.

"Experimental results show that BrowseRank can achieve better performance than existing methods, including PageRank...in important page finding, spam page fighting, and relevance ranking," the researchers wrote.

The researchers gathered their data from "an extremely large group of users under legal agreements with them," according to the paper.

There's no denying PageRank is useful, though, and such algorithms could be added into a larger formula for determining which sites come out on top of search results.

"It is also possible to combine link graph and user behavior data to compute page importance," the researchers said. "We will not discuss more about this possibility in this paper, and simply leave it as future work."

Bringing research to fruition
It can be a long time before research comes to fruition, but funding a group of researchers can be much less expensive than acquiring other companies. No doubt Microsoft, especially after years of effort and its thwarted overtures to Yahoo, would like to see its in-house search efforts bring Google to its knees.

When accused of being dominant, Google representatives often argue the company could lose its search dominance if somebody else builds a better mousetrap and Internet users divert their path to that other door. "If Microsoft or Yahoo are successful in providing similar or better web search results or more relevant advertisements, or in leveraging their platforms or products to make their Web search or advertising services easier to access, we could experience a significant decline in user traffic or the size of the Google (ad) Network," it said in its most recent quarterly report.

The top players are a moving target, though. Yahoo is hoping to improve search with three efforts: BOSS (build your own search service), which lets others employ Yahoo search results along with its search ads; SearchMonkey, which lets content publishers build elaborate mini-Web pages into search results; and Glue Pages, which present a smorgasbord of related content alongside search results.

And Google invests heavily, too. Its biggest research team is devoted to search, and the company updated its search formula more than 100 times in the second quarter. And researchers have huge infrastructure at their disposal to try new ideas.

"My group at Google has at its disposal many thousands of machines, with storage measured in petabytes," Udi Manber, head of Google's search quality, said of Google's search research infrastructure in a June talk. And, he added, engineers are empowered to try their results, with meetings once or twice a week to see how well they worked: "There is no separation of research and development. Everyone does both."

Stephen Shankland writes about a wide range of technology and products, but has a particular focus on browsers and digital photography. He joined CNET News in 1998 and since then also has covered Google, Yahoo, servers, supercomputing, Linux and open-source software, and science. E-mail Stephen, or follow him on Twitter at http://www.twitter.com/stshank.
Comments (38; showing page 1 of 2)
by tomkinite July 25, 2008 6:38 AM PDT
Awesome. So now I could write a bot that visits my site several times a day while retaining a long session and get my page rank up that way.

Thanks, Microsoft. You did it again.
by Gaurav_k July 28, 2008 3:04 AM PDT
what human aspects are considered in providing the human touch?
by Fil0403 September 8, 2008 7:27 AM PDT
Awesome. So for years I can write a webpage with many links to my site without having to write any bot nor retain a long session and get my page rank up that way.

Thanks, Google. You did it again.
by Fil0403 September 8, 2008 7:29 AM PDT
@ Gaurav_k: time spent in the site (in case you don't know how to read).
by jamalystic July 25, 2008 7:04 AM PDT
I'm glad that Microsoft is regaining its senses. Innovation, not acquisition, is the only way MSFT could increase its online relevance. As for this BrowseRank, I won't rule it out as of yet, since the human element in search is now being given enough attention: Will Human-Powered Search Be the Google Killer? (http://www.internetevolution.com/author.asp?section_id=466&doc_id=148885&F_src=flftwo)
by NWLB July 25, 2008 7:59 AM PDT
Wow, they created a report that says "if we do it better than Google, we'll make money."

I wonder if the report was titled "Like Duh."

Now if they could apply the same thinking to their own products!

To be fair and serious, the web is a fickle thing. If they could really create something people actually get value out of, sure, they could topple Google, as much as Google could then return the favor.

What MS needs to get into its head, is that as more and more younger users mature and enter the marketplace, they are coming to it with deeply rooted and increasingly negative feelings about the company. MS isn't cool. It doesn't do cool things. It's "your daddy's OS," the evil empire, that which must be distrusted, that which won't do a good job, that whose products are always buggy out of the box. MS has to stop trying to fix MSN, MS, or Windows to every bloody thing it does. It copied that marketing tool years ago trying to be "cool," and it just plain sucks. Right now, exactly what they don't want is to instantly brand everything with a name a lot of folks either hate, don't like, distrust, don't use, or only use because they have to.

If I could get a decent native client for my game software on a Mac, or a Linux system, I'd drop MS entirely.
by r3f3rindia May 13, 2009 7:53 AM PDT
You are right.
PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important".
I have already improved my page rank on my site :
(http://www.r3f3r.com)
by CoalJim July 25, 2008 8:20 AM PDT
I sincerely hope that the unreleased information regarding this 'breakthrough' in search engine algorithm has more to offer. While Google pagerank may not be perfect, it is far more complex, and intuitive than described here. By the way has anyone mentioned to microsoft that link farms are no longer viable link partners, and that if they find anything like that they should submit to Google the offending sites so that their ranking is adjusted accordingly. I want my 2 minutes back.
by r3f3rindia May 13, 2009 7:56 AM PDT
YES YOU ARE RIGHT

(http://www.r3f3r.com)
by ctbcctbc July 25, 2008 8:28 AM PDT
Re: What MS needs to get into its head, is that as more and more younger users mature and enter the marketplace, they are coming to it with deeply rooted and increasingly negative feelings about the company. MS isn't cool.

I would argue that Microsoft's XBOX division doesn't have that image. In fact, there's lots of good hype for XBOX-exclusive games (e.g. Halo 3, Gears 1 and 2, etc.). Lots of kids are into these games. XBOX is a long-term investment by Microsoft to enter the living room and compete with (boot?) Sony. Search is in the same area: long term investment.
by masonx July 25, 2008 8:37 AM PDT
Clearly Google Search has peaked and declined in its usefulness in recent years and something better is drastically needed for serious and professional search efforts. Over recent years GS's increasingly inaccurate search returns and lack of specificity have become more than frustrating to people who use it as a constant research tool. I'm not exactly sure how a site-popularity-based ranking tool will improve specificity - I can see how it will sell advertising - lots of commercial spam. It will likely only show topical popularity and the searcher will still not have the more specific information they were seeking. More than likely we are heading to two separate types of search engines - one for very specific information existence and one for social searching to see what Britney or Paris were (or weren't) wearing yesterday. Somehow the clutter has to be reduced. Google Scholar seems to be a more specific effort toward greater accuracy to some degree, but basically it's just a search of professional journals and misses entire fields of information and is no more specific mechanically than Google's regular search.
by r3f3rindia May 13, 2009 7:43 AM PDT
Yes, Really masonx,

You are right.
PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important".
I have already improved my page rank on my site: http://www.r3f3r.com
by cohaver July 25, 2008 8:49 AM PDT
The Security problems in the DNS is in Windows live Search Look at the History in your Systems UPDATES. Note we should all look at making all types of Search services safer from Phishing and spam type Code.
by the_piano_man July 25, 2008 9:15 AM PDT
Anytime ms comes out with some "new" idea there is an overpowering miasma that is inherent in the announcement - much like the beginning and end of many of Edgar Allan Poe's stories - they often start out very cheerful but end up grotesque and horrifying. Anyone who knows the real history of Google and ms, knows that Google exists as it is through hard work, good programming, and innovation; whereas ms exists as it is (and got its reputation) through hard work, good lawyers, and copying code and then calling it original - big difference. Get a clue, ms - in spite of all your wonderful deeds - you still resemble (in a somewhat lame way) the main character in the "Cask of Amontillado" - you know the one who built the brick wall . . . read the story - Google won't drink your wine - fools.
by joeynp July 26, 2008 9:19 PM PDT
Give me a break, man. You talk as if you run both companies and really know the difference.
by whiterabbit--2008 July 25, 2008 10:02 AM PDT
I'm surprised no one else has mentioned it, this BrowseRank system requires relatively personal data to work with. Could this be a privacy concern?
by miroslodki July 25, 2008 10:16 AM PDT
I agree with masonx

search has become too diffuse
I would rather have a search that returns 10 pages of relevant stuff
than the millions of pages that google returns - especially when one is trying to find content, information as opposed to the latest deal on whatever widget or store

who digs deeper than 10 pages anyway
what if you took some of that cycle time to refine the search
- or even taken 2 more seconds??
what's the rush guys

at the same time I dont think google will fall asleep at the wheel

there is no reason that the google landing page couldn't have a variety of search vehicles
and let people select the one most appropriate

also not sure whether behavior targeting ads will eclipse the gamesmanship of search rankings, key words etc...

great article nonetheless
Miro
http://miroslodki.wordpress.com
by Tadpole667 July 25, 2008 10:30 AM PDT
I have heard a lot about how Google came in and took over the search industry with its wonderful PageRank system; oddly I've heard nothing about the fact that they do it quickly. There's no argument that the PageRank system works well, but if you look back into the time when Google appeared and then started to take over and try to think about the reasons they began to dominate a fairly wide arena of players, you might realize the other major factor...response times in "(0.18 seconds)". If others are going to break into this industry, they'd better learn to do it FAST as well...
by kimbalm July 25, 2008 10:46 AM PDT
Very interesting, although this has been tried before. DirectHit had a search engine built entirely on clickstream data (Acquired by Ask.com in 2000). They got the data from ISPs in those days. The end-result is really not that much better than Page-Rank.

We at Me.dium on the other hand (http://me.dium.com/search) are processing our users' clickstream data in real-time to create a different lens based on what's going on now. e.g. do a search for John Edwards on Google or Live, and you get johnedwards.com and wiki/johnedwards. Do the same search on Me.dium and you learn that today people care about his love child, pictures of his mistress, etc.

The difference is real-time (what people are browsing now) vs. historical (what they browsed in the past). Social vs. Old School. Check it out and let us know your thoughts. http://me.dium.com/search.
by cptpizza July 25, 2008 12:37 PM PDT
I for one use google, as opposed to Yahoo, MSN, or any other search engine for one reason. It has nothing to do with pagerank. It's simply because Google's site sticks to the K.I.S.S. method of web pages.

When I want to search, I don't want to wait for 20 minutes for the multitude of advertisements, news, and other miscellaneous garbage to load. When I want news, I will go to a news site. I think this is a contributing factor to why Google is #1. In terms of search results, unless I am searching for something completely obscure, I typically get similar results from all three of the major search engines, so I go to google for the reasons mentioned above.

Also, just because people go to the site a lot, doesn't make the information accurate or relevant. I don't feel this will give Microsoft any more of a leg-up on Google than they have now. If they believe that this is the holy grail of searching, then I have a few copies of Vista to sell them....
by edwardbeckettx July 25, 2008 2:48 PM PDT
BrowseRank? Wait a minute ... Isn't Browse the name of Larry Page's Brother? ... Hmmm ...
by kenstech_com July 25, 2008 3:34 PM PDT
Microsoft is so damned slow off the mark. How many years has PageRank been around, 8, 9 years? And now they are finally coming up with something to compete with. It's about time, but probably too late.

The reason PageRank has always been such a popular thing is that it gives people a single, simple number to judge a site by. I KNOW that is overly simplistic and inadequate to judge page popularity by, and I know Google has more than the PageRank algorithm to go by, but that is beside the point. It is a matter of human psychology. People want that single metric, and that is what PR has done for Google.

This highlights the basic problem that Microsoft has, not that they don't have sufficient technology to compete with Google, they don't have a sufficient grasp of their market and people in general.

-Ken
www.kenstech.com
by paullee357 July 25, 2008 3:49 PM PDT
Just as Google understands that these link farms try to GoogleBomb their way to a higher rank, Google constantly comes up with ways to defeat it.

Now we are going to see Microsoft tricked by link-and-browse farms that are going to inflate a page's "Browse Rank".
by digiprod--2008 July 25, 2008 4:02 PM PDT
Who cares! With MS search being close to last, they are not relevant and neither is the Page Rank so-called killer!
by zenflow1 July 26, 2008 6:23 AM PDT
The FAROO P2P Search Engine has been doing this for some time already.
http://www.faroo.com/english/technology/architecture.html

FAROO's "If users spend a long time on a page, visit it often, put it to bookmarks or print it out, this page goes up in ranking."
http://altsearchengines.com/2007/10/02/great-debate-peer-to-peer-p2p-search-part-i/
sounds very familiar to Microsoft's
"The more visits of the page made by the users and the longer time periods spent by the users on the page, the more likely the page is important."
http://research.microsoft.com/users/tyliu/files/fp032-Liu.pdf
doesn't it?

A very significant difference is though, that FAROO maintains the privacy of the user because it calculates the PeerRank in a decentralized manner, while Microsoft would collect all click streams of all users in a central server.

It's great to see that Microsoft research paper confirms that attention based ranking is able to outperform PageRank both for relevancy and for spam suppression.
by July 28, 2008 12:04 PM PDT
DirectHit tried to pull this off years ago and the SEO community was there, ready and waiting, with large networks of clickbots that made DirectHit think some sites were more popular than they actually were.

How does Microsoft propose to solve this problem, since those networks are still around today (and there are more than ever)? Ask announced it had added the DirectHit technology into its Edison algorithm last year and the change doesn't seem to have helped them.

Speaking of search engine market share, it's unfortunate that you're using flawed data based on number of queries performed. Traffic estimates from sites like Quantcast and Compete indicate that Microsoft receives more search traffic than Yahoo! and that Google still controls less than 40% of the search market.

It's important to understand that search is not about the number of queries performed (many queries at Google and Yahoo! are automated queries intended to check rankings and backlinks) but rather about the number of people who use the search engine.
by seositus July 31, 2008 9:19 AM PDT
i think the BrowseRank and PageRank would be a good combination!
http://www.seositus.com