Plagiarism is the willful, and sometimes accidental, copying of another person’s idea or work and presenting it as your own. It is a kind of literary theft that involves stealing another author’s work, and it occurs most frequently in academia and journalism. Stealing another author’s ideas and words is a very old practice in the literary world, and over the years many notable incidents have occurred in journalism as well. Plagiarism differs from copyright infringement: infringement is always illegal, whereas plagiarism is not always illegal, though it is always unethical to steal from another writer. Only material protected by copyright law can lead to legal proceedings against the plagiarist. Nowadays, detecting and removing copied passages has become easy with online plagiarism checking tools.
Using other writers’ work has become a very common practice in the web industry too. Because content is the most important factor in search engine ranking systems, webmasters believe that having a large volume of it, updated regularly, will earn their website a higher position on the search engine results page. Producing unique, original content is expensive, so some webmasters resort to copying content from other websites instead of creating their own.
Such practices were very popular in the past because, in the early days of search engines, ranking algorithms were lenient about duplicate content. Over the years, however, search engines such as Google closed these loopholes and developed strict policies against such behavior.
Still, content can be copied in a variety of ways, and some forms of copied content are not easy to detect and can be overlooked by search engines. It is worth discussing the various forms of such content in order to avoid using them.
Kinds of Plagiarized Content
There is more than one way plagiarism can occur, depending on how much of the other author’s content is copied, and some forms are more severe than others. Verbatim reproduction and paraphrasing are the two most common techniques involved. Below are some of the most frequently encountered kinds.
Exact Copy
This kind involves copying the whole content of another webpage so that the content on the two pages is identical: the text is reproduced verbatim, in its entirety, with nothing added or deleted. This type of duplication is the easiest to detect, and various online tools can find it. It is the laziest approach to scraping and the most severely punished type when it is discovered.
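Because an exact copy matches its source word for word, detection can be as simple as comparing fingerprints of normalized text. The sketch below is illustrative only, not how any particular search engine or checker works, and the normalization rules (lowercasing, collapsing whitespace) are an assumption:

```python
import hashlib

def normalize(text: str) -> str:
    # Assumed normalization: lowercase and collapse whitespace so
    # trivial formatting differences don't hide an exact copy.
    return " ".join(text.lower().split())

def fingerprint(text: str) -> str:
    # Identical normalized text always yields an identical hash,
    # so equal fingerprints flag an exact duplicate.
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()

original = "Content is the most important ranking factor."
scraped = "Content  is the most important\nranking factor."
print(fingerprint(original) == fingerprint(scraped))  # prints True
```

A crawler can store one fingerprint per page and find every exact duplicate with a simple lookup, which is why this form of copying is caught so quickly.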
Partial Copy
This form uses the same technique as the one above, verbatim copying, but differs in extent: only a section of the content is copied rather than the complete text. Additions are then made, which can be original material or content copied from yet another source. This type also constitutes a severe offense, and because the copied portion matches its source word for word, it can easily be found with software or an online application.
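When only a section is lifted verbatim, whole-page fingerprints no longer match, but overlapping word sequences still do. A common textbook approach, sketched here under the assumption of simple whitespace tokenization, is to compare k-word shingles and measure how much of the source reappears in the suspect page:

```python
def shingles(text, k=5):
    # Every run of k consecutive words in the text.
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def containment(source, suspect, k=5):
    # Fraction of the source's shingles that reappear verbatim in
    # the suspect; values near 1.0 suggest a lifted passage.
    src = shingles(source, k)
    return len(src & shingles(suspect, k)) / len(src) if src else 0.0
```

A copied paragraph padded with original filler still scores high on containment against its source, which is why partial verbatim copying remains easy for software to find.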
Paraphrased Content
Among all the different types, this one is the most common. As the name suggests, the content is not copied word for word; instead, a paraphrased version of the source is used. This involves substituting synonyms and making similar minor changes, and specific facts and details are converted into more generalized forms. As a result, the original text gets a new wording while the idea it contains is retained. Paraphrasing, then, is about copying ideas rather than the words the original author used to express them; reusing an idea that is common knowledge, however, does not count. This type is also known as patchwriting. Paraphrased content is very difficult for a machine to detect, since the same idea has been given a new makeover, though humans can spot it easily. Search engines, too, are becoming more sophisticated every day, and their ability to detect paraphrased content has improved significantly thanks to Latent Semantic Indexing.
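Paraphrasing defeats verbatim matching, but reordering sentences still leaves the vocabulary largely intact. A crude bag-of-words cosine similarity, shown below purely as a sketch (real systems such as Latent Semantic Indexing go much further, mapping synonyms into a shared concept space), already scores reordered text as near-identical:

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    # Compare term-frequency vectors: shared vocabulary scores
    # high even when the word order is completely different.
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

original = "duplicate content offers readers no added value"
reordered = "no added value duplicate content offers readers"
print(round(cosine_similarity(original, reordered), 6))  # prints 1.0
```

True paraphrasing with synonyms would lower this score, which is exactly the gap that semantic techniques like LSI were developed to close.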
Internal Duplicate Content
It is not only content copied from another website that counts as scraped content: if two pages on your own website contain the same content, the search engine will treat it as duplicate content as well. In addition to copying from others, repeating your own content is therefore also considered copying.
Accidental Duplication
Every writer has a unique style, with patterns that repeat again and again: the same words, phrases, and sometimes entire sentences. In addition, a professional copywriter who writes on a daily basis may simply not remember having written a similar article in the past. This type of copying happens mostly by accident.
Google Scrape Content Policy
Now that we know what plagiarized content is and have looked at its various forms, it is time to discuss how it affects content quality and, in turn, a website’s Google ranking.
Google has clearly laid down its policy on copied content in its webmaster guidelines and advises webmasters to avoid this technique. The guidelines contain a section on scraped content that defines it, offers examples of this type of content, and highlights the possible consequences of the practice.
Why does Google dislike this type of content?
Duplicate content means the same material is repeated with no added value. Google wants to serve its users the search results that best answer their query, and result pages cluttered with webpages containing identical content are not helpful. If one webpage has failed to answer a user’s query, another webpage with the same content will fail too, while also annoying the user. The end result is a dissatisfied user, so webmasters should avoid duplication and instead focus on providing unique content to their users.