Drupal, Pathologic and Corrupted URLs
I’ve been having some technical issues with this site lately. Strange links to content within the site have been appearing at random. I would insert a hyperlink to an archived post and then, days later, I would come back to see that the URL had been rewritten with a random sub-domain prefix. My domain would appear as www.wqw.robertgomez.org or similar.
I am not entirely sure what was going on but I think bots and the Drupal Pathologic module are to blame. Pathologic is a great module that will convert any internal site link into a standardized absolute URL. In my code I would create a link with an href of “node/1098” and Pathologic would convert that href to “http://www.robertgomez.org/blog/2014/03/17/drupal-my-list-essential-modules”. However, I suspect that when various bots crawled my site they used weird sub-domain prefixes in hopes of doing… I don’t know what?! Occasionally, one of these bots must have triggered a cron job, and my links were rewritten with the phony sub-domain. Seems feasible, right? If there is a real reason why this was happening, let me know in the comments.
The bad links could be fixed by opening each affected node and re-saving it. I used the Resave Nodes module to bulk save everything again. However, by the next morning the bad links had returned. The phony sub-domains were still being crawled. The next step was to use a rewrite rule in my .htaccess file that would force all sub-domain traffic to the non-“www” URL of the site. I then re-saved everything once more, and so far things are back to normal. Again, if you know what’s going on, shoot me a comment.
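A rewrite rule of this kind might look like the sketch below. It assumes Apache with mod_rewrite enabled, plain http, and robertgomez.org as the canonical host; it is an illustration of the approach, not necessarily the exact rule used on this site.

```apache
<IfModule mod_rewrite.c>
  RewriteEngine On

  # Assumption: the bare domain is the only host name that should be served.
  # Match any request whose Host header is anything else
  # (www.robertgomez.org, www.wqw.robertgomez.org, and so on).
  RewriteCond %{HTTP_HOST} !^robertgomez\.org$ [NC]

  # Permanently redirect to the same path on the bare domain.
  RewriteRule ^ http://robertgomez.org%{REQUEST_URI} [L,R=301]
</IfModule>
```

On a Drupal site the RewriteCond and RewriteRule lines would normally go inside the existing mod_rewrite block in .htaccess, ahead of the rule that routes requests to index.php, so that the redirect happens before Drupal ever sees the phony host name.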