Duplicate Web Content: Google Penalty Myth

This video by Greg Grothaus, a Google Search Quality Engineer, was created on August 12, 2009 - so this is current information on the Duplicate Content Issue coming directly from the source.

The video is part of what they call "webmaster outreach" which they use to reach out to the webmaster community to explain how search quality works...

The first issue that Greg Grothaus discusses is the common myth about the Duplicate Content Penalty. He explains how they create a set of results for any given search query, and explains that there is actually no penalty.

They simply determine which of the duplicate pieces of content is most relevant to the actual search query, and omit the others. Those that are omitted for one search query may very well show up for another, more relevant search query. Example: a web page, and the print version of that same web page.
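To make the "per query" idea concrete, here is a toy sketch in Python. It is not Google's algorithm. The `content_id` field (a stand-in for "these pages have the same content"), the title-matching `relevance` score, and the example.com URLs are all assumptions made up for illustration.

```python
def relevance(page, query):
    """Toy score: how many query words appear in the page title."""
    title = page["title"].lower()
    return sum(1 for word in query.lower().split() if word in title)


def pick_per_query(pages, query):
    """Keep one page per group of duplicates: the best match for this query."""
    best = {}
    for page in pages:
        key = page["content_id"]  # hypothetical duplicate-group marker
        if key not in best or relevance(page, query) > relevance(best[key], query):
            best[key] = page
    return list(best.values())


# Hypothetical pages: an article and the print version of the same article.
pages = [
    {"url": "http://example.com/article",
     "title": "Widget guide", "content_id": "a1"},
    {"url": "http://example.com/article?print=1",
     "title": "Widget guide (print version)", "content_id": "a1"},
]

for query in ("widget guide", "widget guide print version"):
    shown = pick_per_query(pages, query)
    print(query, "->", [page["url"] for page in shown])
```

Neither page is removed or marked down. The version left out for one query is simply the one shown for the other.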

Greg says, "We recognize that most duplicate content is not deceptive in origin, so as a result we're not trying to penalize it, we're just trying to show in our search results content that is distinct and offer the searcher a variety of results. This is very much a per query thing." (paraphrased) He recommends that you read the Duplicate Content help file on Google, which states:

Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.

And which also says:

If you find that another site is duplicating your content by scraping (misappropriating and republishing) it, it's unlikely that this will negatively impact your site's ranking in Google search results pages. If you do spot a case that's particularly frustrating, you are welcome to file a DMCA request to claim ownership of the content and request removal of the other site from Google's index.

The exception is what they consider spam, and Greg says that this is still not a penalty for Duplicate Content but rather a penalty for spam. This is defined as a web page where someone has intentionally copied content and marked it up for the purpose of manipulating the search results, which they will omit from their index and/or give a much lower ranking. The example given was a case where someone copies the entire exact content from a Wikipedia page, then publishes and optimizes it on a page of their own site.

Greg goes on to explain exactly what Duplicate Content is...

The first is a common problem: multiple URLs which all point to the exact same page or content. Examples would be: url.com vs. url.com/index.htm, or http://url.com vs. http://www.url.com. All four of those URL structures pull up the exact same home page. He explains why this is a problem, and again states that there is no penalty associated with this issue.
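As a rough illustration of why those four URLs are "the same page", here is a small Python sketch that collapses them into one form. The choice of the www version as the preferred host, the list of default document names, and the `canonicalize` helper itself are assumptions for this example, not anything stated in the video.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.url.com"              # assumption: the www version is preferred
KNOWN_HOSTS = ("url.com", "www.url.com")    # host names that serve the same site
DEFAULT_DOCS = ("index.htm", "index.html")  # assumption: default document names


def canonicalize(url):
    """Map the different spellings of a URL onto one canonical form."""
    if "://" not in url:
        url = "http://" + url  # treat a bare "url.com" as an http URL
    scheme, host, path, query, _fragment = urlsplit(url)
    host = host.lower()
    if host in KNOWN_HOSTS:
        host = CANONICAL_HOST
    # Drop a trailing default document such as /index.htm.
    head, _sep, tail = path.rpartition("/")
    if tail in DEFAULT_DOCS:
        path = head + "/"
    if path == "":
        path = "/"
    return urlunsplit((scheme, host, path, query, ""))


variants = ["url.com", "url.com/index.htm", "http://url.com", "http://www.url.com"]
print({canonicalize(v) for v in variants})
```

All four variants collapse to a single URL, which is the one you would link to everywhere.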

The real issues in a case like that - again, not penalties - start with the fact that you are diluting your Link Popularity. The solution is to use only one instance of any given URL (link) and create a 301 redirect for any other instances (with www or without, for example).
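For readers curious what a 301 redirect looks like in practice, below is a minimal sketch using Python's built-in `http.server` module. Most sites would set this up in their web server or hosting control panel instead, so treat it purely as an illustration. The `CANONICAL_HOST` value and the `redirect_target` helper are assumptions for the example.

```python
from http.server import BaseHTTPRequestHandler

CANONICAL_HOST = "www.url.com"  # assumption: the www version is the one you chose


def redirect_target(host, path):
    """Return the URL to redirect to, or None if the host is already canonical."""
    host = (host or "").lower().split(":")[0]  # ignore any port number
    if host == CANONICAL_HOST:
        return None
    return "http://" + CANONICAL_HOST + path


class CanonicalRedirectHandler(BaseHTTPRequestHandler):
    """Answers non-canonical hosts with a 301 pointing at the canonical URL."""

    def do_GET(self):
        target = redirect_target(self.headers.get("Host"), self.path)
        if target is not None:
            self.send_response(301)  # 301 = moved permanently
            self.send_header("Location", target)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"canonical page\n")


# To try it locally, pass CanonicalRedirectHandler to http.server.HTTPServer
# and call serve_forever() on the resulting server object.
print(redirect_target("url.com", "/about.htm"))
```

A request for url.com/about.htm is answered with a permanent redirect to the www address, while a request that already uses the www host is served normally.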

Greg also says that these cases cause inefficient crawling of your entire website, which could cause some of your new content to be missed.

How to Fix These Common Duplicate Content Issues

The solution is to understand what they call "the canonical". This refers to the simplest and most significant form of your content - or the URL that you want to show for any given page of content. The URL you choose (example: with www or without) is considered your canonical URL.

Once you've picked your canonical URL there are several ways you can let Google know to use this URL, including:

  • Link to your web pages consistently
  • Use a 301 Redirect for all non-canonical URLs
  • Go into Google's Webmaster Tools and specify www vs non-www
  • New option: use the rel=canonical HTML tag

Greg goes on to explain how to use the canonical tag, and explains its similarity to the 301 redirect and your option to use either.
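The canonical tag itself is a single line of HTML that goes in the head section of each duplicate page. As a small sketch, here is a Python helper that builds that line. The `canonical_link_tag` function name and the example URL are assumptions for illustration.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Build the rel=canonical tag to place inside a page's <head> section."""
    # escape() keeps characters such as & and " from breaking the HTML attribute.
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url, quote=True))


print(canonical_link_tag("http://www.url.com/page.htm"))
```

Every variation of the page (the print version, a version with tracking parameters, and so on) would carry the same tag pointing at the one URL you picked as canonical.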

The last bit of the video covers Multiple Site Issues, such as different URLs for different audiences (by country or language) - a .co.uk version of the same .com site, for example, or French and German versions of your site.

The statement made here was "Google thinks Multiple Domains are OK". Your only real concern is diluting Link Reputation. Google will choose the best page for any given query - not necessarily all of your domains.

An example is given of two domains with the same content targeting different countries, an Australian version (.com.au) and a British version (.co.uk) - obviously both in the English language. Google will attempt to serve the correct domain to the appropriate searcher, based on their location.

Greg suggests you help Google out by logging in to Google Webmaster Tools and setting a particular domain for a particular locale. But again, there is no penalty even though both domains contain the exact same content. In fact, they encourage it because users prefer to read content in their own language, and even on country-specific domains that relate specifically to their location.

I hope this helps clear up the "duplicate content scare" and gives you a better idea of what Google wants and expects in regards to your site structure, and your use of duplicate web content.

Best,

Video shared by Christopher Hooper

About Lynn Terry

Lynn Terry is a full-time Internet Marketer with over 15 years' experience in online business. Subscribe to ClickNewz for the latest Internet Marketing trends & strategies, Lynn's unique case studies, creative marketing ideas, and candid reviews...

Discussion

  1. Great overview Lynne. Now I can stop checking out where people are copying my stuff to and worrying about it.

  2. Lynn,

    Thanks for posting this. Very valuable information from the source.

  3. Jeffery Wood says:

    Thank you so very much for posting this Lynn. I kept hearing one guru worry about duplicate content and another claim it doesn't exist, so it's nice to hear it right from the horse's mouth...er...so to speak. :)

    - Jeffery

  4. Hi Lynn, Thanks for this great info on a topic that seems to keep going around in circles!

    I've recently seen the following message on the bottom of a couple of blog posts "cross-posted on xxxxx" - hyperlinked to the blog in question. What's your take on this? Is having this type of 'disclaimer' enough to stop the same post on different blogs being regarded as duplicate content?

    Thanks for providing such a great resource that I only found a few weeks ago!

    • They actually ARE duplicate content, and so they'll be "regarded" as such appropriately. That said, as the Google employee stated - there is no actual penalty for this. They will simply try to determine which of the instances of that content are the best match for the results for any particular query.

      Since queries can be unique, one query may call up one of the instances, while a slightly different query may call up the other.

  5. Dream House says:

    nice info. but the myth is still exist among the webmasters especially in SEO contest.

  6. Black and White says:

    he-he, nice information about "penalty". but its only for people who make black SEO!

    • Are you referring to black hat? If so, I disagree - this information is relevant to all of us. Particularly in regards to article marketing, sharing news stories and press releases, and many other aspects of white hat content marketing.

  7. Are you a professional journalist? You write very well.

  8. Frank Dickinson says:

    Great info. Lynn.

    As always - you are on top of the important stuff!

    You are appreciated.

  9. I've noticed in the past in a tiny local niche where a competitor have several websites with exact same content but with different url. Search on google will show one or the other site depending on the keyword used. So, I think if big G penalizes the site, one of them won't show up but they do! Your article confirmed what I saw.

  10. Hello Lynn,

    First of all Thanks for informative msg. Really you resolved my problem. I always thinking about people are copying my content. Now i need not worry about it.

  11. I never believed the icky duplicate content penalty myth, yet it seems to be something you hear everywhere you go online. People trying to make us afraid to market and use articles etc... Glad to have a video from Google saying it's a bunch of bunk. Now when people look at us like I have two heads, we can link them to this video LOL

    • LOL, true! I really appreciated how he spelled everything out in detail within the video, nice job on that, so it does make for a great resource to share when that discussion comes up.

  12. Joe at Mens Snowboard Jacket says:

    Lynn, its good to see a high profile SEOer help bust this myth. What people really need to understand is that blatant plagiarism just doesn't work at least not in the long run. You are not being "penalyzed" but Google wants original information and will rank those pages higher.

    • That's somewhat true, but there is a lot more to Duplicate Content than just blatant plagiarism. Take news for example - you usually get the same news across various networks or television channels. And as another example, articles and press releases are made to be distributed across the internet to reach the broadest market possible. It's nice to know there's no real concern with republishing articles with reprint rights, particularly if you optimize them for a slightly different keyword phrase and/or add your thoughts via introduction and conclusion before & after the piece...

  13. Glad that they are working on duplicated content. For sam suntouch, the only problem here is what if the other person is much better writer than you? LOL Imagine, he/she get all your ideas then leaving you behind.

  14. I wonder - how really they are going to implement it. Looks like a pretty much impossible task to perform. How are they going to understand where is a original and where is a replica?

    • They don't - they serve the most relevant result based on the individual search query. You might want to watch the video again to get a better understanding.

  15. Thanks for the great post.

    You Rock!
    I love IMTW too!

  16. Bookmarking Submissions says:

    I would comment that Duplicate Content has become a huge topic of discussion lately, thanks to the new filters that search engines have implemented. This article will help you understand why you might be caught in the filter, and ways to avoid it. We'll also show you how you can determine if your pages have duplicate content, and what to do to fix it. Search engine spam is any deceitful attempts to deliberately trick the search engine into returning inappropriate, redundant, or poor-quality search results. Many times this behavior is seen in pages that are exact replicas of other pages which are created to receive better results in the search engine.

    • Right. I like how he explained the difference between harmless duplicate content, and obvious manipulation of search results. Because there IS a difference - and I thought he did a good job of explaining that in the video.

  17. Ebiz Graphics says:

    Google is concerned about revolves around affiliate programs. It has been common practice for high traffic websites to establish an affiliate program. Affiliate programs themselves don’t worry Google. What it doesn’t like though, is for an affiliate program to take a template and then offer it to its base of affiliates to use. Some of the higher traffic websites end up with thousands upon thousands of duplicate websites all promoting the very same things and, according to Google, not offering any real value to the internet community. A website offering this type of cookie cutter website can easily find themselves de-listed by Google as happened to Template Monster a while back.

  18. DeAnna Troupe says:

    This is great information, Lynn. I'm going to post this video to my blog. Thanks for sharing.

  19. Well I was looking for info on duplicate content and since I read this blog regularly I'm glad I stopped by to get your take on the situation. Not really keen on code tinkering, canonical? oh oh. here's hoping . . . haha

  20. I was wondering what would happen with duplicate content. I have found a few spammy style sites that are using my articles but stripping the link back to my site. My articles are getting indexed first but I was worried that the spammy sites would affect on me. Thanks

  21. Lynn I have a question. I do article marketing in order to get backlinks. When I create a well written article, am I doing something wrong by submitting it to different article directories? Will I get credit for the links back to my site- or am I supposed to "spin" the article like so many suggest? I understand that I have to alter anchor text, but do I have to alter the wording of the article as well? Thank you!

    • I don't use article spinners. You can submit the same article to multiple directories. There's no harm in that. And yes, the links will count.

      I often change the title and the anchor text for different article directories, to target more long tail keyword phrases. Just changing the title can help you reach more of your market if you use a slightly different phrase in each title.
