Math Jazz — Mathias Bynens’s shizzle, y’all




Obfuscating email addresses

John linked Stu Nicholls’s email address hiding method through CSS a couple of days ago, and I was amazed. Amazed by how few bytes of code it takes to keep your email address from being spotted by some silly spam bot.

Alliterations aside, I had already written a PHP function that obfuscates your email address in the old-fashioned way — by “encrypting” it through the use of funky HTML entities. It does exactly the same as most online obfuscator tools, but I decided to functionize it for my own pleasure.

The PHP

After seeing Stu’s CSS method, I thought the function could use some enhancements. I threw in some extra arguments, and this is what I came up with:

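<?php
function obfuscate_email($addy, $reverse = 0, $before = '<span class="email">', $after = '</span>') {
 $encr = ''; // start empty so the non-reversed case never appends to an undefined variable
 if ($reverse) {
  $addy = strrev($addy); // I put my thang down, flip it and reverse it
  $encr = $before; // the wrapper is only needed for the CSS method
 }
 $length = strlen($addy);
 for ($i = 0; $i < $length; $i++)
  $encr .= '&#' . ord($addy[$i]) . ';'; // each character becomes a decimal HTML entity
 if ($reverse)
  $encr .= $after;
 return $encr;
}
?>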

That demands a bit of explaining.

$addy
Contains the email address in all its unprotectedness.
$reverse
If this parameter is set to TRUE, the email address will be reversed before the encoding process starts off, and the result is wrapped in $before and $after. Enable if you want to use the ultra-cool CSS method.
$before
Contains a string which will be returned before the obfuscated email address. Only used when $reverse is enabled. Defaults to <span class="email">.
$after
Contains a string which will be returned after the obfuscated email address. Only used when $reverse is enabled. Defaults to </span>.
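
To make the parameters concrete, here is a minimal sketch of how the calls might look. It uses a renamed, stand-alone copy of the function (obfuscate_email_sketch is a hypothetical name, not part of the original post) and the made-up address a@b.c; the custom wrapper tags in the last call are just for illustration.

```php
<?php
// Hypothetical stand-alone copy of the function, renamed so it cannot clash
// with the original. The wrapper strings are only used when $reverse is on.
function obfuscate_email_sketch($addy, $reverse = false, $before = '<span class="email">', $after = '</span>') {
    if ($reverse) {
        $addy = strrev($addy);
    }
    $encr = $reverse ? $before : '';
    $length = strlen($addy);
    for ($i = 0; $i < $length; $i++) {
        // Each character becomes a decimal HTML entity.
        $encr .= '&#' . ord($addy[$i]) . ';';
    }
    return $reverse ? $encr . $after : $encr;
}

// a@b.c is a placeholder address.
$plain    = obfuscate_email_sketch('a@b.c');
$reversed = obfuscate_email_sketch('a@b.c', true);
$custom   = obfuscate_email_sketch('a@b.c', true, '<em class="x">', '</em>');

echo $plain, "\n";    // &#97;&#64;&#98;&#46;&#99;
echo $reversed, "\n"; // <span class="email">&#99;&#46;&#98;&#64;&#97;</span>
echo $custom, "\n";   // <em class="x">&#99;&#46;&#98;&#64;&#97;</em>
```

Without $reverse the output is just the entity string, so it can be dropped into existing markup such as a mailto link as-is.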

The CSS

In case you want to use the CSS reversion thing as well, you’ll have to add some extra CSS code to your stylesheet:

.email {
 unicode-bidi: bidi-override;
 direction: rtl;
 }

Note: The reason why I’m not using <span class="backwards"> (like Stu does) is that I’m afraid spam bots might end up with a built-in reverser for text with a CSS class of backwards applied to it. Of course, the same goes for <span class="email">, but then again this is just an example. Be warned: use the funkiest of class names.
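
One way to keep a funky class name in sync between the stylesheet and the markup is to generate both from the same string. The following is only a sketch under assumptions: reversed_email_markup is a hypothetical helper, and qx7 and a@b.c are placeholders.

```php
<?php
// Hypothetical helper, not from the original post: returns the CSS rule and
// the reversed, entity-encoded markup for one class name.
function reversed_email_markup($addy, $class) {
    // Assumption: only plain class names are accepted, so the same string is
    // safe both in the selector and in the class attribute.
    if (!preg_match('/^[A-Za-z][A-Za-z0-9_-]*$/', $class)) {
        throw new InvalidArgumentException('Unsupported class name: ' . $class);
    }
    $flipped = strrev($addy);
    $encoded = '';
    $length = strlen($flipped);
    for ($i = 0; $i < $length; $i++) {
        $encoded .= '&#' . ord($flipped[$i]) . ';';
    }
    return array(
        'css'  => '.' . $class . ' { unicode-bidi: bidi-override; direction: rtl; }',
        'html' => '<span class="' . $class . '">' . $encoded . '</span>',
    );
}

// "qx7" and a@b.c are placeholders.
$parts = reversed_email_markup('a@b.c', 'qx7');
echo $parts['css'], "\n";  // .qx7 { unicode-bidi: bidi-override; direction: rtl; }
echo $parts['html'], "\n"; // <span class="qx7">&#99;&#46;&#98;&#64;&#97;</span>
```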

The examples

View this page’s source to see the raw HTML output.
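
If viewing source is not an option, a round trip in PHP shows the same thing. This sketch assumes the made-up address a@b.c and the default wrapper; it encodes the address the way the function does with $reverse enabled, then decodes it again with standard PHP functions.

```php
<?php
$original = 'a@b.c'; // made-up address

// Encode: reverse, turn every character into a decimal entity, wrap in the span.
$flipped = strrev($original);
$encoded = '';
$length = strlen($flipped);
for ($i = 0; $i < $length; $i++) {
    $encoded .= '&#' . ord($flipped[$i]) . ';';
}
$markup = '<span class="email">' . $encoded . '</span>';

// Decode: drop the tags, resolve the entities, reverse back.
$decoded = strrev(html_entity_decode(strip_tags($markup), ENT_QUOTES, 'UTF-8'));

echo $markup, "\n";                // <span class="email">&#99;&#46;&#98;&#64;&#97;</span>
var_dump($decoded === $original);  // bool(true)
```

The decoding step is short, which is worth keeping in mind: the protection relies on harvesters not bothering to do it, a caveat also raised in the comments.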

I’m convinced using this function in all its aspects (i.e. the encryption/reversion combination) is your best bet to fight spambots. Enjoy!

Update: By, erm… “request” of Indranil, I created another one of those email obfuscator tools. It should be bug-free, but if you’ve got any suggestions (regarding form layout or whatever), they’re more than welcome.

Filed under PHP, CSS, XHTML · February 26th, 2005

Comments (19)

Listed below are the responses for this entry.

  1. Turnip:

    My problem with Stu’s method is that if someone has an old browser or a screenreader, and thus isn’t using CSS then they’ll just get a garbled pile of shit and find a different web designer. (Or worse, they’d be naive enough to think it’s a real email address and try to mail you with it).

    I also have an opinion on your note about class names. People who go to the length of obfuscating their email addresses are generally savvy enough to have spam filters installed too, and even if they don’t, they are NOT going to respond to any of the spam you send them. Therefore, I don’t think it would be in any way profitable for spammers to spend their time detecting and decoding obfuscations, so I don’t think it’s a worry.

    Comment posted on February 26th, 2005 @ 5:28 pm
  2. Mathias:

    My problem with Stu’s method is that if someone has an old browser or a screenreader, and thus isn’t using CSS then they’ll just get a garbled pile of shit and find a different web designer. (Or worse, they’d be naive enough to think it’s a real email address and try to mail you with it).

    Good point, Jon, I was just thinking about the very same thing over dinner. On every page Stu’s CSS method is used, it should thus say <span class="hide"><strong>In order to prevent spambots from picking up my email address, it has been reversed. Sorry for the inconvenience.</strong></span>. (All of this with .hide { display: none; }, of course.)

    People who go to the length of obfuscating their email addresses are generally savvy enough to have spam filters installed too, and even if they don’t, they are NOT going to respond to any of the spam you send them. Therefore, I don’t think it would be in any way profitable for spammers to spend their time detecting and decoding obfuscations, so I don’t think it’s a worry.

    Nice thinking. I hope you’re right :)

    Comment posted on February 26th, 2005 @ 5:58 pm
  3. Jonas Rabbe:

    Now that you have been so nice to comment on my site I thought I’d return the favour.

    There are a number of problems associated with email addresses that look correct in browsers. The first is of course the problem mentioned above, where I might add Safari doesn’t fare very well either (and I’ve got the screenshot to prove it). However, the main problem, which I also talked about in response to Joen’s (Almost) unspammable email addresses, is that any hiding or other obfuscation which relies on technology currently available in web browsers will be broken if enough people use it. There are already spam robots that can read HTML entities and have built-in JavaScript interpreters; it may not be long before they have CSS handlers too if Stu’s approach catches on. From my comment at Joen’s sidenote:

    Encoding your email address is security through obscurity which is, if not problematic, at least controversial. The wiki entry for security through obscurity likens it to hiding a spare key under your doormat. In theory anyone can enter your house using the spare key, but you rely on the hiding place being secret. Encoding your email address as HTML entities is very much the same. Just as the doormat is one of the first places a burglar would look, HTML entities are easily recognized and converted to reveal the email address. The same is in fact the case for JavaScript functions. Any JavaScript interpreter will be able to interpret the JavaScript and reveal the email address.

    And this is also the case for CSS: a CSS handler would be able to apply the styles and resolve the email address.

    Just my 2¢.

    Comment posted on February 28th, 2005 @ 3:39 pm
  4. logtar:

    The bad thing is that spammers seem to always find a way… I am sure that now bots just collect text, then they are washed out by an engine that does something to rectify e-mail addresses. The problem has to be solved by just stopping spammers, not by hiding our e-mail addresses.

    Comment posted on February 28th, 2005 @ 3:49 pm
  5. Aankhen:

    In case you want to use the CSS reversion thing as well, you’ll have to add some extra CSS code to your stylesheet […]

    “Reversion”? Is that a new word? O_O

    Comment posted on March 2nd, 2005 @ 7:37 am
  6. Mathias:

    However, the main problem […] is that any hiding or other obfuscation which relies on technology currently available in web browsers will be broken if enough people use it.

    This is indeed a problem, Jonas. However, please note that I’m not claiming this double protection to be fully bulletproof — it’s just a combination of two of the best email obfuscating methods currently available. However, following Jon’s logic, I do have the idea spam bots won’t evolve like this. At least, I hope so.

    The problem has to be solved by just stopping spammers, not by hiding our e-mail addresses.

    Good point, John. But then again, there is no way to stop spammers. Generally, I don’t like it when people try to avoid spam by letting stuff on their site interfere with the activities of their users. And indeed, basically this email obfuscation method is an example of this.

    “Reversion”? Is that a new word? O_O

    Maybe so, Aankhen — I make up a lot of words whilst trying to write in English — though Google knows what I’m talking about: turning in the opposite direction.

    Comment posted on March 2nd, 2005 @ 12:47 pm
  7. Aankhen:

    Maybe so, Aankhen — I make up a lot of words whilst trying to write in English — though Google knows what I’m talking about: turning in the opposite direction.

    I saw the Google definition too; was the first thing I checked. Didn’t seem quite right to me though.

    Oh well, you live and learn. :-)

    Comment posted on March 2nd, 2005 @ 6:25 pm
  8. porges:

    Correct me if I’m wrong, but if you’re going along the path of your email looking normal, except for the people using old browsers, and it being un-copy-and-pasteable, you may as well just draw a picture or Flash animation of your email address and get it over with. :)

    Comment posted on March 3rd, 2005 @ 3:55 am
  9. Mathias:

    Indeed, but an image or a Flash animation usually sucks up more bandwidth than some XHTML. Thus, I am convinced this is the best of those three methods (picture vs. Flash animation vs. CSS + XHTML obfuscation).

    Comment posted on March 7th, 2005 @ 1:38 pm
  10. BoBB:

    I think the best protection against spam is a good spam filter; as was already stated, we cannot stop the spammers. I use Bayesian filtering for my mail and do not worry about protecting my email address. I have never once gotten a false positive and have only gotten, at last count, 132 false negatives in the year that I have been using this filtering method. The more spam they send me, the better I get at avoiding it.

    Unfortunately, this solution is not easily implemented and therefore not ideal for most users. But then this method isn’t easily implemented by not-so-computer-savvy users either. Looks like we are back to survival of the fittest. Darwin was right! heh

    Comment posted on March 8th, 2005 @ 1:39 pm
  11. Indranil:

    I’d still use the encoder at automaticlabs. It’s very easy to use.

    Comment posted on March 10th, 2005 @ 4:14 pm
  12. Mathias:

    Well Indranil, especially for you, I uploaded my email obfuscator tool thing. Suggestions are more than welcome.

    Comment posted on March 30th, 2005 @ 9:50 pm
  13. Indranil:

    Nice, really nice. Automated obfuscating method. Nicely done.

    Comment posted on March 31st, 2005 @ 5:31 pm
  14. Rik Janssen:

    This is interesting too…

    In addition to replacing the @s (&#64;) in email addresses in order to avoid spambots, you can also replace any other character with its corresponding ASCII code.

    Comment posted on June 7th, 2005 @ 10:52 am
  15. Mathias:

    Indeed it is. That’s in fact one of the things the obfuscate_email() function does — converting the unencoded address to HTML entities. Try it out for yourself. (Oh, and in the future, try to at least read the post. :P)

    Comment posted on June 7th, 2005 @ 1:06 pm
  16. Rik Janssen:

    *grin* ;) I just remembered you had a blog post about it with some PHP in it ;)
    So I thought, let’s post this link ;)

    Comment posted on June 7th, 2005 @ 7:29 pm
  17. Aaron Gyes:

    I’m in the process of writing something a bit more sneaky. I’m totally paranoid of some smart spam bots figuring out this easy stuff. What I want is a function that’ll return stuff like this from someguy5@domain.com:

    • You can email me at “s o m e g u y 5 @DONTSPAMME d o m a i n . c o m”, remove the string DONTSPAMME.
    • You can email me at “moc.naimod@5yugemos”, reverse the text
    • You can email me at “someguy[2+3] at domain dot com”, do the math inside the brackets, remove the brackets, and exchange the “at” and “dot” strings with their respective symbols.

    This should keep my email addresses safe until real AI is invented. While some of those may be confusing, it’s a feature that it will also filter morons out.

    Comment posted on June 29th, 2005 @ 3:57 am
  18. Alan Hogan:

    To eliminate the need to insert the CSS into a stylesheet, I’ve made the simple change from <span class="email"> to <span style="unicode-bidi: bidi-override; direction: rtl;">.

    Comment posted on July 12th, 2005 @ 10:46 pm
  19. Mardeg:

    If you’re extremely paranoid, visit my link that creates the illusion of an image of your email address by generating a random-text logo and using CSS to shrink it down so each character is the size of one pixel.
    The generator is written in JavaScript, so I’m not sure how easy it would be to convert to PHP.

    Comment posted on October 25th, 2005 @ 10:08 am

Trackbacks & Pingbacks (1)

Listed below are resources on the web that mention this article.

  1. Ian’s Blog: Obfuscating Email Addresses by Math Jazz:

    Obfuscating Email Addresses by Math Jazz
    […] Obfuscating Email Addresses […]

    Pingback made on May 28th, 2005 @ 10:14 pm