If you are in a hurry and cannot wait to read the rest of the article: no, they do not.
What are automatic content generators?
Automatic content generators are programs that claim to create content, in the form of articles, automatically. Writing articles without human intervention might seem an amazing capability, but in practice the software rewrites existing articles to create new ones.
There are three main ways in which these content generators work.
- Scraping. The software takes parts of articles from different places and joins them together to create a new article.
- Thesaurus substitution. Words in the original article are replaced with synonyms to create the new article.
- Markov chains. A statistical model of an existing article is built, and the new article is generated from that model.
Of these methods, Markov chains hold the most promise, as their output is the hardest to detect.
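To make the thesaurus method concrete, here is a minimal sketch in Python. The synonym table and the function name `substitute_synonyms` are made up for illustration; a real tool would use a full thesaurus, and this is not taken from any particular product.

```python
import re

# Hypothetical synonym table, for illustration only.
SYNONYMS = {
    "quick": "fast",
    "create": "produce",
    "article": "post",
}

def substitute_synonyms(text, synonyms=SYNONYMS):
    """Replace every word found in the table with its synonym."""
    def swap(match):
        word = match.group(0)
        replacement = synonyms.get(word.lower())
        if replacement is None:
            return word  # word not in the table: leave it unchanged
        # Keep a leading capital letter if the original word had one.
        return replacement.capitalize() if word[0].isupper() else replacement
    return re.sub(r"[A-Za-z]+", swap, text)

print(substitute_synonyms("Create a quick article."))
```

Because the sentence structure is untouched, the result stays very close to the original, which is one reason this method is easy to spot.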
So what are Markov chains?
Apart from being a lot of bullshit in computer science, they are a tool for creating pseudo-random text from a statistical model of another text. Since the model is built from non-random text, the output will follow the rules of English grammar most of the time. Given a large enough source text to build the model from, it will generate text which can sometimes pass the scrutiny of humans.
A Markov chain records which words follow a given sequence of words in the source text. The new text is created from this data.
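The idea can be sketched in a few lines of Python. This is a minimal illustration and not the software used in the experiment; the names `build_model` and `generate` and the default of two words per key are assumptions made for the example.

```python
import random
from collections import defaultdict

def build_model(text, order=2):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        model[key].append(words[i + order])
    return model

def generate(model, length=50, seed=None):
    """Walk the model, picking each next word at random from the followers."""
    if not model:
        return ""
    rng = random.Random(seed)
    key = rng.choice(list(model.keys()))  # random starting point
    output = list(key)
    while len(output) < length:
        followers = model.get(key)
        if not followers:
            break  # dead end: this run of words only occurs at the end of the source
        output.append(rng.choice(followers))
        key = tuple(output[-len(key):])
    return " ".join(output)

source = "the cat sat on the mat and the cat ran"
print(generate(build_model(source, order=1), length=8))
```

Words that follow a key more often appear more often in its list, so they are picked more often. A larger `order` gives more grammatical output but copies longer stretches of the source verbatim.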
My experiments with Markov chains.
Most black hat SEO techniques leave some footprint which the search engines (SEs) use to identify an article as automatically generated. This leaves commercial automatic content generators vulnerable. I wanted to check whether the SEs are able to identify Markov chain content. For this purpose I wrote my own software. I tried to remove other signs which might flag the content as automatically generated. In particular, I changed the size of the files, removed trailing sentences which ended abruptly, and introduced paragraph breaks.
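The clean-up steps could look roughly like the sketch below. It is an illustration only, not the code from the experiment: the function names are hypothetical, and breaking paragraphs after a fixed number of sentences is an assumption, since any other rule for placing the breaks would fit the description just as well.

```python
import re

def drop_unfinished_tail(text):
    """Cut everything after the last '.', '!' or '?'."""
    end = max(text.rfind(mark) for mark in ".!?")
    return text[:end + 1]  # empty string if no sentence ever finishes

def add_paragraph_breaks(text, sentences_per_paragraph=3):
    """Group sentences into paragraphs separated by a blank line.

    Text after the last sentence terminator is dropped.
    """
    sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]+", text)]
    n = sentences_per_paragraph
    paragraphs = [" ".join(sentences[i:i + n]) for i in range(0, len(sentences), n)]
    return "\n\n".join(paragraphs)

raw = "One is here. Two is there! Three is where? Four is near. Five is cut"
print(add_paragraph_breaks(drop_unfinished_tail(raw), sentences_per_paragraph=2))
```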
A site was created with such content and hosted on Tripod. It was given a link from a PR 3 page. I checked the position of the web pages in the search engines from time to time. After four months, no references to the automatically created web pages were found.
So the final words.
Since the web pages were not included in the search engines' indexes, the value of creating such web pages is very limited. There are some commercial software packages which claim to create articles automatically. I have tried only one of them, so I cannot make claims about their effectiveness. But basically they all use the same algorithms, so the results should hold for the others as well.
References.
- URLs of the created web pages: listed at http://seo-experiments.blogspot.com/
- Source and binaries of the software used to create the web pages: http://www.fileshack.us/files/1058/MarkovSeo.zip