For the last year and a half, I have been “working” on one of my largest sites. It has passed half a million words and started off as a no-link-building experiment. I use the word “working” loosely, since for about half a year I have been focusing on other projects. Despite not having much time for this project, it has been slowly ranking and making more money every month.
Every once in a while, I take a look at Ahrefs and check out the backlink profile to see if I have gotten any more organic links. A week or two ago I noticed something strange. What I saw was a site that I had never heard of or seen before so of course I visited the URL to see what they were all about.
What I saw though was a bit of a shock and something I have not dealt with before. The site in question “linking back to me” was a 100% clone of my website! Same template, same link structure, same header, same everything! The really weird thing is that all the affiliate links were still carrying my affiliate code. What then would be the point of them doing this? My best guess at this point is an attempt at negative SEO by duplicate content.
What Has Happened To Date
The first thing I did was try to find out who was doing this. If this is happening to you, the easiest way to do this is to run a whois query on the domain. You can do this here: http://whois.domaintools.com/.
As you can see from the screenshot above, the website is being routed through Cloudflare. Cloudflare is essentially a proxy to the webhost so right off the bat, I was a bit stuck. Since they also have privacy on the domain, I was not able to see who had registered it.
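If you would rather script the lookup than use a web tool, here is a minimal Python sketch. It is only an illustration under some assumptions: the default server `whois.verisign-grs.com` only covers .com and .net domains, the function names are made up for this example, and the test for Cloudflare simply looks for name servers ending in `.ns.cloudflare.com`.

```python
import socket


def whois_query(domain: str, server: str = "whois.verisign-grs.com", port: int = 43) -> str:
    """Send a raw WHOIS request (RFC 3912) and return the text reply.

    The default server is an assumption: it only answers for .com/.net.
    """
    with socket.create_connection((server, port), timeout=10) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def summarize_whois(text: str) -> dict:
    """Pull the registrar and name server lines out of a WHOIS reply."""
    summary = {"registrar": None, "name_servers": []}
    for line in text.splitlines():
        key, _, value = line.strip().partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "registrar" and value and summary["registrar"] is None:
            summary["registrar"] = value
        elif key == "name server" and value:
            summary["name_servers"].append(value.lower())
    return summary


def behind_cloudflare(summary: dict) -> bool:
    """Heuristic: Cloudflare-managed DNS uses *.ns.cloudflare.com name servers."""
    return any(ns.endswith(".ns.cloudflare.com") for ns in summary["name_servers"])
```

Something like `summarize_whois(whois_query("sitename.com"))` gives you the registrar to complain to, even when the owner's details are hidden behind privacy protection.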
The next thing I did was immediately check Google to see if any of the clone's pages were indexed. You can do this by typing this into the search bar: site:sitename.com. This should show the URLs that Google has indexed for that domain.
Dammit! While they only have 81 results indexed, and my site has around 500 articles, this is still not good. Duplicate content is indexed. My site gets indexed almost immediately, so I was wondering how it was possible they were getting MY content indexed on “their” site. I then ran the domain through Semrush.
Well, that would explain it then! They have spammed their website in order to get it indexed.
The last thing I had to figure out was whether they had simply copied my website by hand or were proxying to my domain. To test this, I made an edit to one of my articles to see if it immediately showed up on the clone website. It did! I have never dealt with this before, so I knew my first step was filing DMCA notices.
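The same test can be scripted. This is a rough Python sketch rather than what I actually ran: you generate a unique marker, paste it into one article by hand, then fetch that article from both domains and check whether the marker shows up on the clone. The function names, the `clonecheck-` prefix, and the user agent string are all made up for the example.

```python
import secrets
import urllib.request


def make_marker() -> str:
    """Create a unique string to paste into one article (for example in an HTML comment)."""
    return "clonecheck-" + secrets.token_hex(8)


def fetch(url: str) -> str:
    """Download a page as text. The User-Agent value is an arbitrary placeholder."""
    request = urllib.request.Request(url, headers={"User-Agent": "clone-check/0.1"})
    with urllib.request.urlopen(request, timeout=15) as response:
        return response.read().decode("utf-8", errors="replace")


def is_live_proxy(original_html: str, clone_html: str, marker: str) -> bool:
    """True when a marker that was just added to the original also appears on the clone.

    A copy scraped earlier would not contain the fresh marker,
    so a match suggests the clone is serving the original live.
    """
    return marker in original_html and marker in clone_html
```

After saving the marker into an article, compare `fetch()` of that article on your own domain with `fetch()` of the same path on the clone domain. Caching on either side can delay the result, so a miss on the first try is not conclusive.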
Filing the DMCA
Since this clone site is using Cloudflare, I needed to find out where the site was actually hosted. To do this, you need to fill out Cloudflare’s abuse form, which can be found here: https://www.cloudflare.com/abuse/form. They should respond fairly quickly with the information for the host. You can see their response below.
So with that filled out, I moved on. When you do a whois on the domain, even though the information we really want is hidden, you can still see the registrar, which in this case was NameSilo. I filed a complaint with them here: https://www.namesilo.com/report_abuse.php.
With that done, and while waiting to hear back from NameSilo (I had already found the host from the Cloudflare email above), it was time to write a DMCA notice. Believe it or not, in the last 4-5 years of doing online marketing, I have never sent one of these to another website, so I had to do a bit of research. Below is the DMCA template that I used, found at ipwatchdog.com.
Sample DMCA
My name is INSERT NAME and I am the INSERT TITLE of INSERT COMPANY NAME. A website that your company hosts (according to WHOIS information) is infringing on at least one copyright owned by my company.
An article was copied onto your servers without permission. The original ARTICLE/PHOTO, to which we own the exclusive copyrights, can be found at:
PROVIDE WEBSITE URL
The unauthorized and infringing copy can be found at:
PROVIDE WEBSITE URL
This letter is official notification under Section 512(c) of the Digital Millennium Copyright Act (“DMCA”), and I seek the removal of the aforementioned infringing material from your servers. I request that you immediately notify the infringer of this notice and inform them of their duty to remove the infringing material immediately, and notify them to cease any further posting of infringing material to your server in the future.
Please also be advised that law requires you, as a service provider, to remove or disable access to the infringing materials upon receiving this notice. Under US law a service provider, such as yourself, enjoys immunity from a copyright lawsuit provided that you act with deliberate speed to investigate and rectify ongoing copyright infringement. If service providers do not investigate and remove or disable the infringing material this immunity is lost. Therefore, in order for you to remain immune from a copyright infringement action you will need to investigate and ultimately remove or otherwise disable the infringing material from your servers with all due speed should the direct infringer, your client, not comply immediately.
I am providing this notice in good faith and with the reasonable belief that rights my company owns are being infringed. Under penalty of perjury I certify that the information contained in the notification is both true and accurate, and I have the authority to act on behalf of the owner of the copyright(s) involved.
Should you wish to discuss this with me please contact me directly.
Thank you.
/s/YOUR NAME
Address
City, State Zip
Phone
Now that you have your new DMCA filled out, it is time to send them out. I sent this to the host, and moved on to the next step.
Asking Google To Remove Indexed Results
Now that I had sent the DMCA to the host, it was time to try to get the clone deindexed so I would not face a possible duplicate content problem in the future.
First visit this page: https://support.google.com/legal/troubleshooter/1114905?hl=en#ts=1115655.
Click on “I have a legal issue that is not mentioned above” and then “I have found content that may violate my copyright” and follow the directions to fill out the form. I copied and pasted the DMCA notice here.
Once I had everything filled out, it was just a waiting game to get emails back.
Things Get Weird
Then things started to get weird. Remember the DMCA I sent to the web host listed in the Cloudflare email? Well, that bounced right back to me! I will be the first person to say I know next to nothing about how servers work. I am good with content and making money, but I have no idea how this proxy thing works. I recently moved this site to an unmanaged VPS that a good friend manages for me. We were both confused at first as to why this bounced back to us.
Then, Google responds…
What the hell?
Not really sure what they were getting at, I sent them this as a response:
Surely that should take care of it right? I have friends who own software/plugins who send these out on the daily and Google is always removing results for them without question. Why would this be a problem? Then they responded again!
At this point, I am just lost for words. This entire clone site has been running for almost a month now, has 81 pages of indexed content, and they are unable to locate it? I know Google is known for their piss poor customer service in all aspects of their company unless you have an incredibly large ad spend on AdWords, but really?
Wrapping It Up
I started Passive Marketing hoping to keep it real and share my wins as well as my failures with the internet marketing community. So here it is! This issue is still unresolved and I am at a loss as to what to do moving forward. If anyone has any ideas on how to deal with this, drop a comment and let me know!
Until then, keep at it!
Dude, that truly sucks, hope you get it sorted!
Damn, sorry to hear that Neil. Hopefully you get it all resolved but I guess it goes to show you the importance of having multiple income streams when working IM. Good thing you have a few!
Add canonical tags with absolute URLs – if it’s scraped it won’t matter much. Google is pretty good at knowing what is copied though. When you publish info – fetch as google in GSC first, you’ll get 1st priority.
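A small Python sketch of the absolute canonical tag idea, assuming the site root is `https://example.com/` (a placeholder) and using a made-up helper name. The point of the absolute URL is that a relative one would resolve against whatever domain the page is served from, including a clone, while an absolute one always names the original.

```python
from html import escape
from urllib.parse import urljoin


def canonical_tag(site_root: str, path: str) -> str:
    """Build a <link rel="canonical"> tag that always points at the original domain."""
    absolute_url = urljoin(site_root, path)
    return f'<link rel="canonical" href="{escape(absolute_url, quote=True)}">'


# Example with a placeholder domain and path.
print(canonical_tag("https://example.com/", "/my-article/"))
```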
Did you have a peek at the source code on offending site? Sounds like they’re masking your site on another domain, not scraping it. If that’s the case, can’t you just ban the entire Cloudflare ip range they’re using, and problem solved within 24 hours?
I ended up having a server issue and switching back to my old host and that rendered their entire site 403. ^^I like that idea of banning the entire Cloudflare IP range though. Still a bit curious on how they managed to do that.
Easy! You mentioned a managed VPS. That VPS has a unique IP address that is linked to your domain, but if the VPS hasn’t been set up properly, anyone can reach your site using the IP alone. The blackhat guy registered a domain, set it to point to Cloudflare, and then in the Cloudflare DNS set an A record with your VPS IP.
So the only thing different between the sites was the domain; the copy site was your site on your server. When you changed hosts, your IP changed and the domain was set up correctly again, so even if they found your new IP and pointed to it, it would give an error.
Next time you use a VPS, make sure to set it up properly, and you should always use Cloudflare since it masks your IP, protecting the server from attacks and from the duplicate content “hack” you were a victim of.
Why they did this? To get the domain ranking higher and sell it or use it for a future site.
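To make “set it up properly” concrete, here is a minimal Python sketch of the idea: the server answers only when the request's Host header names one of your own domains, and returns 403 for anything else, including a bare IP or a stranger's domain pointed at your IP. This is an illustration written as WSGI middleware, not the actual configuration of the server in question; in practice the same rule usually lives in the web server's default virtual host. The domain names are placeholders and the function names are made up.

```python
def normalize_host(host_header: str) -> str:
    """Lowercase the Host header and drop any port and trailing dot."""
    host = host_header.strip().lower()
    return host.split(":")[0].rstrip(".")


def host_allowlist(app, allowed_hosts):
    """Wrap a WSGI app so requests for unknown hostnames get a 403."""
    allowed = {h.lower() for h in allowed_hosts}

    def middleware(environ, start_response):
        if normalize_host(environ.get("HTTP_HOST", "")) not in allowed:
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)

    return middleware
```

Wrapping the site with `host_allowlist(app, {"example.com", "www.example.com"})` means a request that arrives with the clone's domain in the Host header never reaches the site, which matches the 403 the clone started returning after the server move.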
This is a perfect explanation of what I seem to have missed. Thank you! I have since moved that site to another server, but you learn something new every day.
So namesilo did ignore your complaint?
I never got a reply from them at all…
Neil, I’ve sent many DMCA takedowns over the years, and here’s the strategy that has gotten a 100% takedown rate: First send the Google DMCA complaint and get the infringing page de-indexed. If that’s the second notice you send, the infringing page may already be down. If Google gets too many complaints about one website, they’ll eventually de-index the site. Secondly, in the case of Cloudflare hosted sites, send your DMCA takedown notice to Cloudflare. They should respond within a day or two with the name of the actual host. Third, send your notice to that host. It’s crucial that you include all details in the first takedown notice. Too many details is better than too few, because by law you cannot go back and amend a notice that you file with the host. I always include a statement to the effect that I possess the original documents or image files with embedded EXIF data proving that I am the originator of the content. This is especially effective when I have created an image in Photoshop and have the layered files that the infringer does not have access to. Your original photos contain embedded EXIF data that show you originated those, as well. Another option is to spend the money to register your copyright with the US Copyright Office. I believe it’s about $150 per item, but depending on the situation, it may be worthwhile to do that. You’ll have my email address with this comment, so please feel free to contact me if you want more info or help.
Really appreciate the response! I actually ended up switching hosts and that cleared everything up. Was a very confusing thing to see happen.