Comment spam and ants

So we’ve been dealing with a pest problem at home (ants), and everyone has been dealing with comment spammers lately (including me, at home _and_ at work). I don’t mean to compare the two problems, as that would be rude — after all, ants are industrious, make cool trails, and are kind of cute in their own way.

With regard to the nofollow proposal, I think it is interesting, but mainly for the fact that competing search engines are actually collaborating with each other and web software makers to implement it. John Battelle and others have expressed concerns about nofollow, which don’t worry me quite as much. I think the concern is essentially rooted in a feeling that a beautiful, simple, and _objective_ system (what they believe PageRank and its relations to be) will be messed with and turned into more of a subjective and human-engineered system. I disagree with this concern mainly, I think, because I agree with Doug Cutting that people often ascribe “objectivity” to algorithms without thinking very hard about it. Algorithms implement policies, whether intentionally or not, and I don’t see any reason why one that pays attention to nofollow labels is inherently less objective than one that does not.

Secondly, I’m sorry, but search engines _will_ address spam one way or another (since any clear and simple algorithm can be gamed). The only unique thing about the nofollow proposal is that it is enlisting the dev and blog communities to provide more information, rather than having search engines make all the discriminations themselves, less transparently. The main virtue of this particular version of the proposal is that it’s simple: nofollow very simply makes a binary judgement saying that the author of the software (or the page) does not fully trust the link in question. To my mind, that is worlds better than the hairier schemes involving numerical weights, multiple dimensions of trust, etc. Those seem like huge opportunities to introduce noise and to have everyone in the world trying to crank their links up to 11.
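To make the binary judgement concrete, here is a minimal sketch of how blog software might tag every link in a user-submitted comment with rel="nofollow". The function name `add_nofollow` and the regex-based approach are assumptions made for illustration, not how any particular blog package does it; production code would be safer using a real HTML parser and sanitizer.

```python
import re

# Matches an opening <a ...> tag and captures its attribute text.
# The \b keeps tags such as <abbr> from matching.
_A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)

# Matches a quoted rel attribute (but not e.g. data-rel).
_REL_ATTR = re.compile(r"""(?<![\w-])rel\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Return comment_html with rel="nofollow" on every opening <a> tag.

    Hypothetical helper: a sketch, not a full HTML sanitizer.
    """

    def rewrite(match: re.Match) -> str:
        attrs = match.group(1)
        rel = _REL_ATTR.search(attrs)
        if rel is None:
            # No rel attribute yet: append one.
            return f'<a{attrs} rel="nofollow">'
        # A rel attribute exists: keep its values and add nofollow if missing.
        values = rel.group(2).split()
        if "nofollow" not in [v.lower() for v in values]:
            values.append("nofollow")
        joined = " ".join(values)
        new_rel = f'rel="{joined}"'
        return "<a" + attrs[: rel.start()] + new_rel + attrs[rel.end():] + ">"

    return _A_TAG.sub(rewrite, comment_html)
```

The point of the sketch is how little there is to decide: a link either carries the label or it does not, and existing rel values are preserved rather than weighed against each other.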

My own more personal battle with spam took an annoying turn lately. I protected comment-posting on this blog with homegrown captchas (like the one you’ll have to solve below to comment). Or at least I _thought_ I protected it … turns out I left a rather gaping hole, which spammers jumped through. Posting via the regular web form was blocked, but due to a code bug anyone with a mind to could POST directly to the submission handler and ignore the captcha completely. Anyway, it’s fixed now, so I’d like to issue a challenge to anyone out there with white-hat inclinations: can you comment on this blog without solving a captcha? If so, please do and tell me how you did it. (There is a straightforward vulnerability, but I’m wondering how easy it is to exploit.)
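A hole like the one described above generally means the captcha was checked where the form is rendered rather than where the POST is accepted. The following is a minimal sketch of the server-side check, not the code this blog actually runs: `CaptchaStore`, `handle_comment_post`, and the field names `captcha_token`, `captcha_response`, and `body` are all hypothetical, and the in-memory dictionary stands in for whatever session or database storage a real handler would use.

```python
import hmac
import secrets


class CaptchaStore:
    """Hypothetical in-memory store mapping one-time tokens to expected answers."""

    def __init__(self) -> None:
        self._answers: dict[str, str] = {}

    def issue(self, answer: str) -> str:
        """Record the expected answer and return a token to embed in the form."""
        token = secrets.token_urlsafe(16)
        self._answers[token] = answer
        return token

    def redeem(self, token: str, response: str) -> bool:
        """Check a response; the token is consumed whether or not it matches."""
        expected = self._answers.pop(token, None)
        if expected is None:
            return False
        # Constant-time comparison of the expected and submitted answers.
        return hmac.compare_digest(expected.encode(), response.strip().encode())


def handle_comment_post(form: dict, store: CaptchaStore, comments: list) -> int:
    """Accept a comment only if the POST itself carries a valid captcha.

    Returns an HTTP-style status code: 200 on success, 403 otherwise.
    """
    token = form.get("captcha_token", "")
    response = form.get("captcha_response", "")
    if not store.redeem(token, response):
        return 403  # direct POSTs without a solved captcha stop here
    comments.append(form.get("body", ""))
    return 200
```

Because the check lives in the submission handler, it no longer matters whether the request came from the regular web form or was hand-built; and because each token is removed on first use, a solved captcha cannot be replayed for a second comment.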

The nofollow proposal is hardly a silver bullet, but we shouldn’t really expect one. With ants, there are lots of different things you can do, and they’ll all help: keep your house neat, don’t leave food around, dispense with the ones you see, block them where they’re getting in, and (best of all) use those cool new ant bombs that supposedly have the ants taking the poison back to the hive. Nofollow and captchas are roughly analogous to not leaving food around and blocking them where they get in, respectively. I’ll leave the other analogies to the reader, but I think we (meaning search engines and software makers) will need to do all of these things at the same time.
