Monday, June 7, 2010...11:21 pm
Comment spam raises its game
As a – fairly – regular blogger, I have to deal with my share of comment spam.
The WordPress-owned spam filter Akismet does a pretty good job of filtering out spam comments – despite some reservations by others in the blogging fraternity. So, generally I’m not that bothered by it.
But when I was emptying the spam queue for my stop-motion animation blog I came across a couple of comments that seemed totally unspammy. So much so that I was surprised to see them there.
Having duly designated them as ‘not spam’, I trashed the rest and went onto the site to check they were OK. I confess I was even a little worried that the eager authors might have been disappointed with the delay in publication and given up on the blog in disgust. (It’s not a heavily trafficked site – this would hurt.)
Not so. Because when I saw the comments in place, I realised they were simply copies of existing comments. Someone had clearly just harvested existing comments, attached their own spam details to them and reposted them to the site.
Frankly, I’m surprised it took me so long to recognise them – given the relatively small number of commenters I have, and how much I treasure and pore over their submissions to me (go on – become part of the family). I’ve just found the same here on Freelance Unbound, too – which I’d never seen before.
Is this actually new? I’ve certainly never come across it before. I did worry that it would be a game changer in the battle between publishers and spammers.
The vast majority of spam is easy to spot, even to the untrained reader, as it is either [a] garbled and/or filled with references to pornography or drugs; or [b] completely anodyne and unrelated to the post it’s attached to (“Great post! I will read your blog forever!”). But stealing real comments and using them as a spammer’s Trojan Horse is another matter.
Luckily, Akismet can easily spot repeated comments from Freelance Unbound. After all, it’s easy to search the site’s existing database for repeated content.
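To show how simple that kind of check can be, here is a rough sketch in Python. It is not how Akismet actually works (I have no idea what it does internally), and the function names `fingerprint` and `is_repeat` are made up for illustration. It just assumes you have the site's existing comments as a list of strings, and treats a new comment as a repeat if it matches an old one after ignoring case and spacing.

```python
import hashlib
import re


def fingerprint(comment):
    """Hash a comment after lowercasing it and collapsing whitespace.

    Hypothetical helper: the normalisation rules are an assumption,
    chosen so trivial changes in case or spacing don't hide a copy.
    """
    normalised = re.sub(r"\s+", " ", comment.strip().lower())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def is_repeat(new_comment, existing_comments):
    """Return True if new_comment duplicates any existing comment."""
    seen = {fingerprint(c) for c in existing_comments}
    return fingerprint(new_comment) in seen


existing = ["Love the clay figures in this one!", "How long did this take?"]
print(is_repeat("love the clay   figures in this one!", existing))
print(is_repeat("A brand new thought", existing))
```

A real site would presumably store the fingerprints alongside the comments rather than recomputing them each time, but the principle is the same.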
But what if spammers harvest real, meaningful comments from other people’s sites and submit them here? Can a spam filter monitor the whole of the web for repeated content? And what about software that can subtly rework content so it passes a plagiarism test?
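Subtly reworked content is harder to catch than an exact copy, but not impossible. One common approach to near-duplicate detection is to break each text into overlapping runs of words ("shingles") and measure how many the two texts share. The sketch below is only an illustration of that idea, not a description of any real spam filter; the names `shingles` and `similarity` and the three-word shingle size are my own assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word runs of length `size` in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Too short to shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "I really love the way you animated the clay figures in this short film"
reworked = "I really love the way you animated the clay models in this short film"
print(similarity(original, reworked))
```

Swapping a single word still leaves most of the shingles intact, so the reworked comment scores well above an unrelated one. Where to set the cut-off is a judgment call, and software that rewrites heavily enough would still slip under it.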
If any bloggers have more experience with this kind of thing, I’d be more than eager to hear about it…