Thursday, January 28, 2010

Duplicate Content

This is a summary of Google’s guidelines on duplicate content. For more information, search the rest of the SEO Blog.

Examples of duplicate content include:

Duplicate pages for mobile devices
E-commerce items available via multiple URLs
Printer-only versions of web pages
Multiple pages with mostly identical content

Google wants its search results to show pages with different information, which is why it can end up dropping your page from its SERPs if the page appears to be a duplicate.

To avoid problems with duplicate content, Google suggests the following:

Use 301 redirects if you have re-organised the pages in your website, so that each old URL redirects to its new version. Hopefully the PageRank will also transfer. However, my experience is that this can take considerable time, and in some cases the PR just does not transfer, so think carefully before you decide to restructure your website. On Apache servers, 301 redirects are usually set up in your .htaccess file.
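As a rough illustration of the 301 point, here is a small Python sketch that turns a list of old and new URLs into Redirect lines you could paste into an .htaccess file. It assumes an Apache server with mod_alias enabled; the function name htaccess_rules and the example paths are made up for this post and are not from Google's guidelines.

```python
def htaccess_rules(moved_pages):
    """Build Apache 'Redirect 301' lines from {old_path: new_url}.

    Assumption: the server is Apache with mod_alias available.
    Old paths must start with '/', as the Redirect directive expects.
    """
    rules = []
    for old_path, new_url in moved_pages.items():
        if not old_path.startswith("/"):
            raise ValueError("old path must start with '/': " + old_path)
        rules.append("Redirect 301 {} {}".format(old_path, new_url))
    return rules


# Hypothetical example: one page moved during a restructure.
moved = {"/old-page.htm": "http://www.site.co.uk/new-page/"}
for line in htaccess_rules(moved):
    print(line)
```

Keeping the old-to-new list in one place like this also gives you a record of what was moved, which is handy if rankings take a while to follow.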

Link consistently within your website. If you use the link http://www.site.co.uk/page/ then use that form throughout your site, i.e. do not also use http://www.site.co.uk/page and http://www.site.co.uk/page/index.htm when they all lead to the same page.
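If your internal links are generated by a script, one way to keep them consistent is to pass every URL through a single function first. The sketch below is only an assumption about how you might do that: preferred_url and the INDEX_FILES list are invented names, and it assumes you have settled on the trailing-slash form of the URL.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the index file names your server uses.
INDEX_FILES = ("index.htm", "index.html")


def preferred_url(url):
    """Rewrite a URL to the trailing-slash form used across the site."""
    parts = urlsplit(url)
    path = parts.path
    head, _, last = path.rpartition("/")
    if last in INDEX_FILES:
        # /page/index.htm becomes /page/
        path = head + "/"
    elif "." not in last and not path.endswith("/"):
        # /page becomes /page/ (real files such as /file.pdf are left alone)
        path = path + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


for link in ("http://www.site.co.uk/page",
             "http://www.site.co.uk/page/index.htm",
             "http://www.site.co.uk/page/"):
    print(preferred_url(link))
```

All three example links come out as http://www.site.co.uk/page/, so only one version ever appears in your pages.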

Use the appropriate top-level domain, i.e. if you are a UK company use .co.uk rather than .com.

If you syndicate your articles, ensure that each website showing your content includes a link back to your original article, and ask the site to add a noindex meta tag so that search engines do not index the syndicated version.

Use Webmaster Tools to tell Google your preferred domain for indexing, i.e. the www. or the non-www. version.

If the same text or legal notice appears on many pages of your website, put the full text on one page only and link to it from all of the other pages.

If you are setting up a new page on your website, put noindex in the robots meta tag until the page has content on it.
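The noindex meta tag mentioned above for syndicated copies and unfinished pages is easy to forget, or to leave in place after a page goes live. As a quick check, something like the following Python sketch could tell you whether a page's HTML carries a robots noindex directive. The names RobotsMetaParser and has_noindex are my own, and the sketch only looks at the robots meta tag, not at HTTP headers or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


placeholder_page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(has_noindex(placeholder_page))
```

You would feed it the HTML of the syndicated copy or the placeholder page; how you fetch that HTML is left out here.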

Google says:

Google no longer recommends blocking crawler access to duplicate content on your website, whether with a robots.txt file or other methods. If search engines can’t crawl pages with duplicate content, they can’t automatically detect that these URLs point to the same content and will therefore effectively have to treat them as separate, unique pages. A better solution is to allow search engines to crawl these URLs, but mark them as duplicates by using the rel="canonical" link element, the URL parameter handling tool, or 301 redirects. In cases where duplicate content leads to us crawling too much of your website, you can also adjust the crawl rate setting in Webmaster Tools.

My post on rel=canonical
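For anyone building pages from templates, here is a minimal Python sketch of producing the rel="canonical" link element that Google mentions. The function name canonical_link is hypothetical, and the sketch assumes you already know which URL is the preferred version of the page.

```python
from html import escape


def canonical_link(preferred):
    """Return the link element to place in the <head> of each duplicate page."""
    # escape() protects characters such as & and " inside the href value.
    return '<link rel="canonical" href="{}">'.format(escape(preferred, quote=True))


print(canonical_link("http://www.site.co.uk/page/"))
```

The same element would go in the head of every duplicate version (the printer-only page, the extra e-commerce URL and so on), each pointing at the one preferred URL.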

If for some reason your website is removed from the SERPs, check Webmaster Tools to find out why, then make the necessary changes to the site and submit it to Google for reconsideration.
