an ASP.NET, C# technical blog, by Gianni Tropiano

Anti-spam-crawler e-mail control

by CodeGolem 8. July 2009 16:02

UPDATE: I've also posted a jQuery way to hide e-mail addresses from web crawlers. Take a look.

Every time we display a valid e-mail address on our websites, we expose it to spammers' crawlers and robots.

Crawlers can scan our pages, find valid e-mail addresses, and add them to their spam databases.
No wonder, then, that we receive special Cialis offers even if we never used our e-mail address to register anywhere...

A simple technique keeps crawlers from harvesting e-mail addresses from our pages while still displaying them to our users and keeping them clickable, with mailto-like behavior.

We usually display e-mail addresses on our pages using simple HTML anchors, or the equivalent ASP.NET HyperLink controls, like this:


<a href="mailto:myaddress@mydomain.com">myaddress@mydomain.com</a>


<asp:HyperLink runat="server"
    NavigateUrl="mailto:myaddress@mydomain.com"
    Text="myaddress@mydomain.com" />

Both of them expose the e-mail address to crawlers, since they render the whole address on the final page.
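To see why, here is a minimal sketch of what a naive harvester might do with the rendered markup. The pattern and the name harvestAddresses are assumptions made for illustration, not taken from any real crawler:

```javascript
// Hypothetical harvester sketch: a naive pattern for strings that look like e-mail addresses.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Returns the distinct address-like strings found in a chunk of markup.
function harvestAddresses(html) {
    const matches = html.match(EMAIL_PATTERN) || [];
    return Array.from(new Set(matches));
}

// The markup a plain anchor (or HyperLink control) ends up sending to the browser.
const plainAnchor =
    '<a href="mailto:myaddress@mydomain.com">myaddress@mydomain.com</a>';

// The address appears twice in the markup, so it is found and reported once.
console.log(harvestAddresses(plainAnchor));
```

A single regular expression is enough here, which is why any technique that keeps the full address out of the rendered page already stops this kind of crawler.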

We can use the Dynamic Text Image Control from a previous post to display the address like this:


<my:DynamicImage
    runat="server"
    Text="myaddress@mydomain.com"
    FontName="Verdana"
    FontSize="10"
    ForeColor="Black"
    BackColor="White"
    />

This displays the e-mail address to users as an image.
Human users will still be able to read the address.
A crawler would need some OCR functionality to read it... I don't think spammers are smart enough to use OCR crawlers ;)

A simple JavaScript function can be used to make the address clickable:


<script type="text/javascript">
function mailto(user, domain)
{
    window.location = 'mailto:' + user + '@' + domain;
}
</script>
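If we want to check the composed string without navigating anywhere, one possible variant separates building the URL from assigning it to window.location. This is only a sketch: buildMailtoUrl is a hypothetical helper name, and the empty-value check is an added assumption.

```javascript
// Hypothetical helper: builds the mailto URL from its two halves without navigating.
function buildMailtoUrl(user, domain) {
    if (!user || !domain) {
        throw new Error('Both user and domain are required');
    }
    return 'mailto:' + user + '@' + domain;
}

// Same behavior as the function above; it needs a browser, since it touches window.location.
function mailto(user, domain) {
    window.location = buildMailtoUrl(user, domain);
}
```

Calling buildMailtoUrl('myaddress', 'mydomain.com') returns 'mailto:myaddress@mydomain.com', which exists only at click time and never in the page source.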

As you can see, the complete e-mail address is composed on-the-fly. A simple page crawler would not be able to catch valid addresses from a page like this:


<a href="javascript:mailto('myaddress', 'mydomain.com')">
    <my:DynamicImage
        runat="server"
        Text="myaddress@mydomain.com"
        FontName="Verdana"
        FontSize="10"
        ForeColor="Black"
        BackColor="White"
        />
</a>

Well... we are not done yet: the Dynamic Text Image Control would render a URL like this: "image.ashx?text=myaddress@mydomain.com&fontName= ..."

This could still be caught by a crawler.
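A quick sketch shows the difference between the two URLs on the page. The pattern and the name harvestFromUrl are assumptions for illustration; the image URL is cut at the text parameter because the remaining query string is not shown here, and the percent-encoded variant is an assumption about how the address might be encoded in practice.

```javascript
// Hypothetical harvester sketch: a naive pattern for strings that look like e-mail addresses.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Decodes percent-escapes first (so %40 becomes @), then looks for addresses.
function harvestFromUrl(url) {
    let decoded;
    try {
        decoded = decodeURIComponent(url);
    } catch (e) {
        decoded = url; // malformed escape sequence: scan the raw string instead
    }
    return decoded.match(EMAIL_PATTERN) || [];
}

// Image URL carrying the whole address in its query string.
const imageUrl = 'image.ashx?text=myaddress@mydomain.com';
// Assumed variant with the @ percent-encoded.
const encodedImageUrl = 'image.ashx?text=myaddress%40mydomain.com';
// The script link from the anchor, which never contains the joined address.
const scriptHref = "javascript:mailto('myaddress', 'mydomain.com')";

console.log(harvestFromUrl(imageUrl));        // finds the address
console.log(harvestFromUrl(encodedImageUrl)); // finds it too, after decoding
console.log(harvestFromUrl(scriptHref));      // finds nothing
```

The script link is safe because the two halves are never joined in the markup, while the image URL gives the address away in one piece.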

So, the final solution could be to derive a specialized control that takes two separate parameters and composes the e-mail address on the fly:


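using System;
using System.Web.UI;

namespace CodeGolem
{
    // Image control that receives the address in two halves
    // and joins them on the server.
    [ToolboxData("<{0}:DynamicEmail runat=server></{0}:DynamicEmail>")]
    public class DynamicEmail : DynamicImage
    {
        public string User { get; set; }
        public string Domain { get; set; }

        protected override void OnLoad(EventArgs e)
        {
            // Compose the text to draw before the base control loads.
            Text = User + "@" + Domain;

            base.OnLoad(e);
        }
    }
}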

Finally, we can encapsulate the whole thing in a reusable control that inherits from the HyperLink server control:


using System;
using System.Web.UI;
using System.Web.UI.WebControls;

namespace CodeGolem
{
    [ToolboxData("<{0}:Email runat=server></{0}:Email>")]
    public class Email : HyperLink
    {
        public string Address { get; set; }
        public string FontName { get; set; }
        public int FontSize { get; set; }

        protected override void OnLoad(EventArgs e)
        {
            base.OnLoad(e);

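            // Split the address into its user and domain halves.
            string[] email = Address.Split('@');

            // Child image control that draws the address.
            DynamicEmail dynamicEmail = new DynamicEmail();
            dynamicEmail.User = email[0];
            dynamicEmail.Domain = email[1];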

            dynamicEmail.FontName = FontName;
            dynamicEmail.FontSize = FontSize;
            dynamicEmail.ForeColor = ForeColor;
            dynamicEmail.BackColor = BackColor;

            Controls.Add(dynamicEmail);

            NavigateUrl = string.Format("javascript:mailto('{0}', '{1}')", email[0], email[1]);
        }
    }
}
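Note that the control above assumes Address always contains an "@": without one, email[1] does not exist and the page fails at load time. The splitting step and the NavigateUrl format can be mirrored in a few lines of JavaScript to show one possible guard. The names splitAddress and buildNavigateUrl are hypothetical, and the validation rule is an assumption, not part of the control:

```javascript
// Hypothetical helper: splits an address into its two halves, rejecting malformed input.
function splitAddress(address) {
    const parts = String(address).split('@');
    if (parts.length !== 2 || parts[0] === '' || parts[1] === '') {
        throw new Error('Expected an address of the form user@domain');
    }
    return { user: parts[0], domain: parts[1] };
}

// Builds the same kind of link the control assigns to NavigateUrl.
function buildNavigateUrl(address) {
    const { user, domain } = splitAddress(address);
    return "javascript:mailto('" + user + "', '" + domain + "')";
}

console.log(buildNavigateUrl('myaddress@mydomain.com'));
```

A similar check in the control's OnLoad would turn a confusing runtime exception into a clear error message.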

Once registered in the web.config file, such a control can be used like this to hide the e-mail address from page crawlers:


<my:Email
    runat="server"
    Address="myaddress@mydomain.com"
    FontSize="10"
    ForeColor="Black"
    BackColor="White"
    />

Publishing e-mail addresses on our websites through this control should make them much harder for simple crawlers to harvest, and so reduce unwanted spam.

Hope you find it useful.

Any comments and feedback are welcome!


ASP.NET

Comments


United States My Blogging Net 
July 17. 2009 06:51
I just want to say thank you for the information that you have been shared. Although some words are not too familiar to me, I am glad that I have read your post.


United States sulumits retsambew 
July 22. 2009 09:40
Pretty good post. I just stumbled upon your blog and wanted to say that I have really enjoyed reading your blog posts. Any way I'll be subscribing to your feed and I hope you post again soon.


Indonesia pabx panasonic 
February 5. 2010 06:02
very nice thank's :)
