News Blog

May 31, 2008 12:56 PM PDT

Be unique to avoid duplicate content

by Brian R. Brown

Web site owners might be amazed to learn that one of the biggest sources of duplicate content isn't external, but rather internal.

Certainly, popular sites and blogs that syndicate a lot of content have to deal with external duplication. But as I have already touched on external duplicate content, we know that there are steps to minimize those challenges and to establish your site as the canonical source.

Internal, or on-site, content duplication tends to arise in a few key ways. The first is within the key page elements. The second is from the content itself: much as e-commerce sites reuse stock product copy, you may be using your own copy over and over again on your site. The third is simply having too little differentiated copy.

...
Originally posted at Searchlight
April 7, 2008 7:33 AM PDT

Duplicate content: Separating the penalty from the filter

by Brian R. Brown

Several weeks ago at SMX West I had the pleasure of meeting and having lunch with Brian White from Google. White works on Matt Cutts' Web spam team, tirelessly working to make Google's search results the best they can be, ensuring the best user experience. Quite a hefty task indeed.

You'd think that someone who spends his days fighting the never-ending battle that is Web spam might be a bit negative or jaded. If that is the case, he does an amazing job hiding it. Instead, he was upbeat and you could feel the excitement in his voice as he spoke. Here's a guy who loves what he's doing and truly wants not only to improve the searchers' experience on Google, but also to make the Web a better place. You can't help but like a guy who's fighting the good fight.

...
Originally posted at Searchlight
September 17, 2007 8:34 AM PDT

Don't 'Print This'

by Stephan Spencer

Printing Web pages can often be an exercise in frustration. It's amazing how the most important information often gets cut off along the right side of the page.

Web designers and makers of content management systems (CMS) have tried to ease that pain by creating printer-friendly versions of pages to make sure that site visitors get the goods.

Unfortunately, printer-friendly doesn't always equate to search engine-friendly. These printer-friendly pages often create duplicate content, possibly even a complete duplicate of the entire Web site. Web site owners have been relieved to learn that search engines don't treat duplicate content as grounds for a penalty; rather, they apply a filter to identify which version of a page they feel is most correct to return in search results. But that doesn't mean that this content duplication carries no negative impact.

This is one of those subtle areas in which good design and SEO best practices intersect. If these printer pages are created as entirely separate pages or through appended URLs, they can dilute a site's PageRank and waste crawl equity as spiders crawl the duplicate pages. You can often spot these by looking for a link on the page that says something like "printer friendly" or "print this."
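As a rough illustration of spotting those links, the sketch below scans a page's HTML for anchors whose text contains one of the phrases mentioned above. It uses only Python's standard library. The sample markup, the URLs in it, and the helper names (`PrintLinkFinder`, `find_print_links`) are hypothetical and exist only for this example; a real site may label its print links differently.

```python
from html.parser import HTMLParser

# Link phrases taken from the article; extend the tuple for your own site.
PRINT_PHRASES = ("printer friendly", "printer-friendly", "print this")


class PrintLinkFinder(HTMLParser):
    """Collects href values of <a> tags whose text looks like a print link."""

    def __init__(self):
        super().__init__()
        self._href = None      # href of the <a> currently open, if any
        self._text = []        # text fragments seen inside that <a>
        self.print_links = []  # hrefs whose link text matched a phrase

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Lowercase and collapse whitespace before matching.
            label = " ".join("".join(self._text).lower().split())
            if any(phrase in label for phrase in PRINT_PHRASES):
                self.print_links.append(self._href)
            self._href = None


def find_print_links(html):
    """Return the hrefs of all links in `html` that look like print links."""
    finder = PrintLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.print_links


# Hypothetical sample page with one appended URL and one separate print page.
sample = """
<p><a href="/article-7">Read the article</a></p>
<p><a href="/article-7?print=1">Print this</a></p>
<p><a href="/print/article-7.html">Printer  Friendly version</a></p>
"""
print(find_print_links(sample))
```

Run against the sample, this reports the appended URL and the separate print page, the two patterns described above. Matching on link text alone will miss icon-only print buttons, so treat the result as a starting point rather than a complete audit.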

For example, let's say that you have a Web site that has 1,000 pages, a small to moderate-size site, depending on your perspective. Now, because you've taken advantage of your CMS' ability to automatically create a "print this" link on each page to a printer-friendly version, for all practical purposes, your site just doubled to 2,000 pages. But what if your PageRank isn't high enough to warrant very rapid spidering? It could take a lot longer for all your pages to get indexed.

Some of your "good" pages may not get indexed, where they would have otherwise, or they may end up in Google's supplemental index instead of the main index. Not to mention the wasted bandwidth of crawling these duplicate pages. What if your site instead has 10,000 or 100,000 pages? As you can see, there is more at stake here than just duplicate content being filtered out.
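To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes, purely for illustration, that a spider fetches a fixed number of URLs per day and treats printer-friendly URLs no differently from regular ones. The rate of 200 URLs per day is a made-up figure, not a measurement of any search engine, and `days_to_crawl` is a hypothetical helper.

```python
import math


def days_to_crawl(unique_pages, urls_per_day, has_print_versions):
    """Estimate the days needed to fetch every URL once at a fixed crawl rate.

    Assumes each unique page has exactly one printer-friendly duplicate
    when has_print_versions is True, which doubles the URL count.
    """
    total_urls = unique_pages * 2 if has_print_versions else unique_pages
    return math.ceil(total_urls / urls_per_day)


ASSUMED_URLS_PER_DAY = 200  # hypothetical crawl rate, for illustration only

for pages in (1_000, 10_000, 100_000):
    without = days_to_crawl(pages, ASSUMED_URLS_PER_DAY, False)
    with_print = days_to_crawl(pages, ASSUMED_URLS_PER_DAY, True)
    print(f"{pages:>7} pages: {without} days without print URLs, "
          f"{with_print} days with them")
```

Under these assumptions the crawl time simply doubles at every site size. Real spiders do not crawl at a fixed rate, so the absolute numbers mean little; the point is that every duplicate URL competes with a "good" page for the same limited attention.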

Printer-friendly pages are less of a maintenance issue on dynamic Web sites, where they are generated from a database using the same content as the regular pages. They can be an even bigger issue on sites where the regular and printer-friendly versions are actually two separate pages that each need to be maintained. It doesn't take long for these pages to get out of sync.

By no means should you run out and remove the printer-friendly functionality on your site, because this is arguably a valuable feature for your visitors. There are, however, alternatives that can be explored.

One method is to use JavaScript-based links to these pages, which search engine spiders generally don't follow. However, this may present issues for anyone who has chosen to turn off JavaScript in their browser, though that is probably a small number of users.

A better method is to use CSS (Cascading Style Sheets) to create a separate printer style sheet. An added benefit is that you get to remove the extra link from your pages. When visitors choose to print one of your pages, the browser renders that page using the printer style sheet rather than the one used for onscreen viewing. Visitors can even preview a page to see how it will look printed.

While printer style sheets still present challenges, designers with CSS experience should be able to create one for most sites. Implementing this method will mean that you don't have to worry about duplicate content issues, appended URLs, or any other issues created by having separate URLs or pages for your printer-friendly pages. Your regular pages are also your printer-friendly pages; it's no longer about URLs or pages, but rather presentation.
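For a sense of what a printer style sheet involves, the sketch below uses a small Python script to generate one, along with the `<link>` element that attaches it. The `media="print"` attribute is standard HTML and tells the browser to apply the sheet only when printing. Everything else is an assumption for illustration: the selectors (`#nav`, `#sidebar`, `.ad`, `#footer`), the file path, and the helper names are hypothetical and would need to match your own templates.

```python
# Hypothetical selectors for screen-only page elements; adjust to your markup.
SCREEN_ONLY_SELECTORS = ["#nav", "#sidebar", ".ad", "#footer"]


def build_print_css(hidden_selectors):
    """Return CSS text for a print style sheet that hides screen-only elements."""
    # Hide navigation, ads, and similar elements that are useless on paper.
    hide_rule = ", ".join(hidden_selectors) + " { display: none; }"
    # Let the content use the full page width so text is not cut off
    # along the right side, and print black on white.
    body_rule = "body { width: auto; margin: 0; color: #000; background: #fff; }"
    return "\n".join([hide_rule, body_rule]) + "\n"


def build_link_tag(href):
    """Return a <link> element that applies the sheet only when printing."""
    return f'<link rel="stylesheet" type="text/css" media="print" href="{href}">'


print(build_print_css(SCREEN_ONLY_SELECTORS))
print(build_link_tag("/css/print.css"))  # hypothetical path
```

The generated tag goes in each page's head next to the regular style sheet. Because the same URL serves both the screen and print presentations, there is no second page for a spider to find.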

Originally posted at Searchlight

