Microsoft tries to one-up Google PageRank
Though it runs a distant third in search, Microsoft thinks it can teach leader Google a thing or two about searching the Internet.
A big part of Google's rise to search engine leadership was an algorithm called PageRank that assesses a specific page's importance by how many other Web pages link to it and by the importance of those linking pages. Microsoft researchers and academic collaborators, though, detailed an idea this week they call BrowseRank that seeks to bring more of a human touch to that assessment.
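To make the link-counting idea concrete, here is a minimal sketch of a PageRank-style calculation in Python. It is a textbook simplification, not Google's production system: the function name `pagerank`, the damping factor of 0.85, the iteration count, and the four-page toy graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to,
                # so a link from an important page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy Web: A and B both link to C, C links to D, D links to A.
toy_links = {"A": ["C"], "B": ["C"], "C": ["D"], "D": ["A"]}
ranks = pagerank(toy_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

In this toy graph, page C collects links from two pages and ends up on top, while page B, which nobody links to, lands at the bottom.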
Microsoft likes the results of BrowseRank, which assigns Web page priority based on how people actually use the site.
(Credit: Microsoft Research Asia)
Essentially, the researchers tested out a system that replaces PageRank's link graph--a mathematical model of the hyperlinked connections of the Internet--with what they call a user browsing graph that ranks Web pages by people's behavior.
"The more visits of the page made by the users and the longer time periods spent by the users on the page, the more likely the page is important. We can leverage hundreds of millions of users' implicit voting on page importance," the researchers said in BrowseRank: Letting Web Users Vote for Page Importance, a paper from the SIGIR (Special Interest Group on Information Retrieval) conference this week in Singapore. Authors are Bin Gao, Tie-Yan Liu, and Hang Li from Microsoft Research Asia and Ying Zhang of Nankai University, Zhiming Ma of the Chinese Academy of Sciences, and Shuyuan He of Peking University.
Search is of tremendous importance to the Internet for many reasons. For one thing, search engines are highly influential middlemen that steer users to Web sites they may not be able to find on their own. For another, queries typed into search engines can be powerful--and in Google's case highly profitable--indications of what type of advertisement to place next to the search results.
But Microsoft lags leader Google and No. 2 Yahoo in search. It's trying hard to catch up, for example with unsuccessful proposals to acquire Yahoo or its search business that would have cost the company billions of dollars. And Microsoft just bought search start-up Powerset.
Google isn't putting all its eggs in the PageRank basket, though.
"It's important to keep in mind that PageRank is just one of more than 200 signals we use to determine the ranking of a Web site," the company said in a statement. "Search remains at the core of everything Google does, and we are always working to improve it."
PageRank shortcomings
The Microsoft researchers argue that PageRank has a number of problems. For one thing, people can game the system by building bogus Web sites called link farms. Those sites feature hyperlinks that point to a Web page whose importance a person wants to inflate so it appears higher in search results. Another PageRank issue is that the indexing process doesn't take into account the time a user spends on a particular site.
But user behavior, monitored in anonymous form by Web servers and Web browser plug-ins, can be better, the authors argue.
"Experimental results show that BrowseRank can achieve better performance than existing methods, including PageRank...in important page finding, spam page fighting, and relevance ranking.
The researchers gathered their data from "an extremely large group of users under legal agreements with them," according to the paper.
There's no denying PageRank is useful, though, and such algorithms could be added into a larger formula for determining which sites come out on top of search results.
"It is also possible to combine link graph and user behavior data to compute page importance," the researchers said. "We will not discuss more about this possibility in this paper, and simply leave it as future work."
Bringing research to fruition
It can be a long time before research comes to fruition, but funding a group of researchers can be much less expensive than acquiring other companies. No doubt Microsoft, especially after years of effort and its thwarted overtures to Yahoo, would like to see its in-house search efforts bring Google to its knees.
When accused of being dominant, Google representatives often argue the company could lose its search dominance if somebody else builds a better mousetrap and Internet users divert their path to that other door. "If Microsoft or Yahoo are successful in providing similar or better web search results or more relevant advertisements, or in leveraging their platforms or products to make their Web search or advertising services easier to access, we could experience a significant decline in user traffic or the size of the Google (ad) Network," it said in its most recent quarterly report.
The top players are a moving target, though. Yahoo is hoping to improve search with three efforts: BOSS (build your own search service), which lets others employ Yahoo search results along with its search ads; SearchMonkey, which lets content publishers build elaborate mini-Web pages into search results; and Glue Pages, which present a smorgasbord of related content alongside search results.
And Google invests heavily, too. Its biggest research team is devoted to search, and the company updated its search formula more than 100 times in the second quarter. And researchers have huge infrastructure at their disposal to try new ideas.
"My group at Google has at its disposal many thousands of machines, with storage measured in petabytes," Udi Manber, head of Google's search quality, said of Google's search research infrastructure in a June talk. And, he added, engineers are empowered to try their results, with meetings once or twice a week to see how well they worked: "There is no separation of research and development. Everyone does both."
Stephen Shankland writes about a wide range of technology and products, but has a particular focus on browsers and digital photography. He joined CNET News in 1998 and since then also has covered Google, Yahoo, servers, supercomputing, Linux and open-source software, and science. E-mail Stephen, or follow him on Twitter at http://www.twitter.com/stshank. 





Thanks, Microsoft. You did it again.
Thanks, Google. You did it again.
I wonder if the report was titled "Like Duh."
Now if they could apply the same thinking to their own products!
To be fair and serious, the web is a fickle thing. If they could really create something people actually get value out of, sure, they could topple Google, as much as Google could then return the favor.
What MS needs to get into its head is that as more and more younger users mature and enter the marketplace, they are coming to it with deeply rooted and increasingly negative feelings about the company. MS isn't cool. It doesn't do cool things. It's "your daddy's OS," the evil empire, that which must be distrusted, that which won't do a good job, that whose products are always buggy out of the box. MS has to stop trying to affix MSN, MS, or Windows to every bloody thing it does. It copied that marketing tool years ago trying to be "cool," and it just plain sucks. Right now, exactly what they don't want is to instantly brand everything with a name a lot of folks either hate, don't like, distrust, don't use, or only use because they have to.
If I could get a decent native client for my game software on a Mac, or a Linux system, I'd drop MS entirely.
PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important".
I have already improved my page rank on my site: http://www.r3f3r.com
I would argue that Microsoft's XBOX division doesn't have that image. In fact, there's lots of good hype for XBOX-exclusive games (e.g. Halo 3, Gears 1 and 2, etc.). Lots of kids are into these games. XBOX is a long-term investment by Microsoft to enter the living room and compete with (boot?) Sony. Search is in the same area: long-term investment.
You are right.
search has become too diffuse
I would rather have a search that returns 10 pages of relevant stuff
than the millions of pages that google returns - especially when one is trying to find content, information as opposed to the latest deal on whatever widget or store
who digs deeper than 10 pages anyway
what if you took some of that cycle time to refine the search
- or even took 2 more seconds??
what's the rush guys
at the same time I don't think google will fall asleep at the wheel
there is no reason that the google landing page couldn't have a variety of search vehicles
and let people select the one most appropriate
also not sure whether behavior targeting ads will eclipse the gamesmanship of search rankings, key words etc...
great article nonetheless
Miro
http://miroslodki.wordpress.com
We at Me.dium on the other hand (http://me.dium.com/search) are processing our users' clickstream data in real-time to create a different lens based on what's going on now. For example, do a search for John Edwards on Google or Live, and you get johnedwards.com and wiki/johnedwards. Do the same search on Me.dium and you learn that today people care about his love child, pictures of his mistress, etc.
The difference is real-time (what people are browsing now) vs. historical (what they browsed in the past). Social vs. Old School. Check it out and let us know your thoughts. http://me.dium.com/search.
When I want to search, I don't want to wait for 20 minutes for the multitude of advertisements, news, and other miscellaneous garbage to load. When I want news, I will go to a news site. I think this is a contributing factor to why Google is #1. In terms of search results, unless I am searching for something completely obscure, I typically get similar results from all three of the major search engines, so I go to google for the reasons mentioned above.
Also, just because people go to the site a lot, doesn't make the information accurate or relevant. I don't feel this will give Microsoft any more of a leg-up on Google than they have now. If they believe that this is the holy grail of searching, then I have a few copies of Vista to sell them....
The reason PageRank has always been such a popular thing is that it gives people a single, simple number to judge a site by. I KNOW that is overly simplistic and inadequate to judge page popularity by, and I know Google has more than the PageRank algorithm to go by, but that is beside the point. It is a matter of human psychology. People want that single metric, and that is what PR has done for Google.
This highlights the basic problem that Microsoft has: it's not that they don't have sufficient technology to compete with Google, it's that they don't have a sufficient grasp of their market and people in general.
-Ken
www.kenstech.com
Now we are going to see Microsoft tricked by link-and-browse farms that are going to inflate a page's "Browse Rank".
http://www.faroo.com/english/technology/architecture.html
FAROO's "If users spend a long time on a page, visit it often, put it to bookmarks or print it out, this page goes up in ranking."
http://altsearchengines.com/2007/10/02/great-debate-peer-to-peer-p2p-search-part-i/
sounds very similar to Microsoft's
"The more visits of the page made by the users and the longer time periods spent by the users on the page, the more likely the page is important."
http://research.microsoft.com/users/tyliu/files/fp032-Liu.pdf
doesn't it?
A very significant difference, though, is that FAROO maintains the privacy of the user because it calculates the PeerRank in a decentralized manner, while Microsoft would collect all click streams of all users in a central server.
It's great to see that the Microsoft research paper confirms that attention-based ranking is able to outperform PageRank both for relevancy and for spam suppression.
How does Microsoft propose to solve this problem, since those networks are still around today (and there are more than ever)? Ask announced it had added the DirectHit technology into its Edison algorithm last year and the change doesn't seem to have helped them.
Speaking of search engine market share, it's unfortunate that you're using flawed data based on number of queries performed. Traffic estimates from sites like Quantcast and Compete indicate that Microsoft receives more search traffic than Yahoo! and that Google still controls less than 40% of the search market.
It's important to understand that search is not about the number of queries performed (many queries at Google and Yahoo! are automated queries intended to check rankings and backlinks) but rather about the number of people who use the search engine.
I think BrowseRank and PageRank would be a good combination!
- seositus, July 31, 2008 9:19 AM PDT
http://www.seositus.com