Encoding email links in Sitecore

An email address encoding/encrypting solution for Sitecore CMS
November 29 2011
asp.net    cms    sitecore

Background

I've been asked to encode/encrypt the email addresses on our work site, built on Sitecore's CMS, to reduce the likelihood of email harvesters scanning the site and picking up our emails, especially given we've listed email addresses for a number of staff, as well as our main contact email addresses.

In email harvesting, spiders (or bots) crawl sites and parse the HTML, extracting email addresses and adding them to lists for their own spamming or for sale to other organisations.  There are a number of ways to protect against email harvesters, of increasing complexity.

The most common and simplest is address munging: replacing the @ and periods ( . ) with words and spaces, such as "rob at robert gray net au".  Such a replacement is so trivial for a harvester to reverse that it's not worth the effort.
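To illustrate how little protection munging offers, here is a minimal sketch of the kind of reversal a harvester could apply. The function name unmunge and the sample address are hypothetical, and the sketch assumes the common "at"/"dot" style of munging rather than the logic of any particular harvester.

```javascript
// Hypothetical sketch: undo "name at domain dot tld" style munging.
function unmunge(text) {
    return text
        .replace(/\s+at\s+/i, "@")     // the first " at " becomes "@"
        .replace(/\s+dot\s+/gi, ".");  // every " dot " becomes "."
}

console.log(unmunge("rob at example dot com dot au")); // rob@example.com.au
```

Two regular expressions are enough, which is why munging on its own is rarely considered worthwhile.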

HTML obfuscation is another method: extra HTML elements, hidden by CSS, are inserted into the address.  The address no longer looks valid when the source is viewed, but the web user sees what appears to be a valid email address. This method has accessibility issues.

Other methods include using images (an accessibility problem) or a CAPTCHA (not user friendly).

I've chosen JavaScript obfuscation, where the contents of the email link and text are encoded in the page source and converted back by JavaScript after the page has loaded.  Using this technique, harvesters that only parse the HTML source will not find a recognisable email address, and the links work for any visitor whose browser has JavaScript enabled.
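A minimal sketch of the idea, written entirely in JavaScript for illustration: the address is turned into a string of hexadecimal character codes (on the real site this happens on the server), and the browser reverses it. The names encodeEmail and decodeHex are hypothetical and exist only for this example, which assumes plain ASCII addresses.

```javascript
// Hypothetical helper: hex-encode each character of an ASCII address.
function encodeEmail(email) {
    var hex = "";
    for (var i = 0; i < email.length; i++) {
        var code = email.charCodeAt(i).toString(16).toUpperCase();
        hex += code.length < 2 ? "0" + code : code; // pad to two digits
    }
    return hex;
}

// Reverse the encoding by reading two hex digits at a time.
function decodeHex(hex) {
    var email = "";
    for (var i = 0; i < hex.length; i += 2) {
        email += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16));
    }
    return email;
}

var encoded = encodeEmail("sales@mammothmedia.com.au");
console.log(encoded);            // 73616C6573406D616D6D6F74686D656469612E636F6D2E6175
console.log(decodeHex(encoded)); // sales@mammothmedia.com.au
```

Note that this is an encoding, not encryption in the cryptographic sense: anyone who runs the script can recover the address. The aim is only to defeat harvesters that read the raw HTML.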

 

The Good Stuff

When I first tackled this problem, I wanted a solution that would catch all and was easy to implement. My first attempt, yesterday, was to create an Xsl Helper extension that would create the link and render the desired HTML.  Sitecore uses Xsl Renderings to create components to display on the page, and my thinking was that developers and authors can use this extension method whenever they wished to encode an email address (which would be always).  I got this method working, using the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.XPath;
using Sitecore.Data;
using Sitecore.Xml;
using Sitecore.Data.Fields;

namespace MyVeryOwn.Xml.Xsl
{    
    public class XslHelper : Sitecore.Xml.Xsl.XslHelper
    {    
        /// <summary>
        /// Encodes an email address to reduce the chances of Email Harvesting.
        /// </summary>
        /// <param name="iterator">Xml node from within the sitecore xslt</param>
        /// <returns>an email link with character encoding</returns>
        /// <remarks>
        /// Rather than return <a href="mailto:test@email.com">test@email.com</a>,
        /// the email addresses are character encoded and javascript is used client side to reverse the encoding and display the correct email when clicked on
        /// If the Text of the link is an email address, the encrypted class is added to the link. jQuery is used to call the decode function after the page has loaded
        /// and display the correct text. The source still shows the encrypted html 
        /// eg: <a class="encrypted" href="javascript:sendEmail('73616C6573406D616D6D6F74686D656469612E636F6D2E6175')">73616C6573406D616D6D6F74686D656469612E636F6D2E6175</a>
        /// which should be <a href="mailto:sales@mammothmedia.com.au">sales@mammothmedia.com.au</a>
        /// (the 'mailto:' is added in the javascript function)
        /// </remarks>
        public string GetEncryptedEmailLink(XPathNodeIterator iterator, string fieldname)
        {
            try
            {
                if (iterator == null) return "";
                iterator.MoveNext();
                var currentItem = GetItem(iterator);

                if (currentItem != null)
                {
                    LinkField field = currentItem.Fields[fieldname];

                    var url = field.Url.Replace("mailto:", "");
                    var convertedUrl = EncodeEmailAddress(url);

                    var linkText = field.Text;
                    if (IsLinkTextEmailAddress(linkText)) {
                        linkText = EncodeEmailAddress(linkText);
                        return string.Format("<a class=\"encrypted\" href=\"javascript:sendEmail('{0}')\">{1}</a>", convertedUrl, linkText);
                    }                    
                    return string.Format("<a href=\"javascript:sendEmail('{0}')\">{1}</a>", convertedUrl, linkText);                        
                }
                return "";
            }
            catch (Exception ex) {
                Sitecore.Diagnostics.Log.Error("Failed to generate encrypted email address", ex, iterator);
            }
            return "";
        }

        private static string EncodeEmailAddress(string email)
        {
            return BitConverter.ToString(ASCIIEncoding.ASCII.GetBytes(email)).Replace("-", "");            
        }

        private static bool IsLinkTextEmailAddress(string linkText)
        {
            // copied from /sitecore/system/Settings/Validation Rules/Field Rules/Common/Is Email
            // to use the same pattern as the email rule within Sitecore.
            const string emailValidPattern = @"^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$";
            var emailValidator = new Regex(emailValidPattern);

            return emailValidator.IsMatch(linkText);
        }
    }
}


My Contact List Rendering would call the above like so:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
  xmlns:sc="http://www.sitecore.net/sc" 
  xmlns:dot="http://www.sitecore.net/dot"
  xmlns:mm="http://www.myveryown.com.au/mm"
  exclude-result-prefixes="dot sc mm">

<xsl:template name="ContactThumbnail">
  <xsl:param name="contact" />
  <div class="ContactThumbnail">    
    <sc:image field="headshotimage" select="$contact" />
    <div class="Details">
      <div class="Name">
        <sc:text field="name" select="$contact" />
      </div>
      <div class="Position">        
        <sc:text field="position" select="$contact"  />
      </div>
      <div class="Email">                
        <xsl:value-of select="mm:GetEncryptedEmailLink($contact, 'emailaddress')" disable-output-escaping="yes" />
      </div>
    </div>
  </div>
</xsl:template>

</xsl:stylesheet>

The main stumbling block with this approach is that the rendering author would need to know to include this method. Also, it would not work when displaying an email address using HTML, such as in a layout or sub-layout.  I needed a method that was more reliable.  On another project (using Kentico CMS) I created an HttpModule that did a similar thing using response filters.  Sitecore has a great extensibility story and its own pipelines, so I went looking through the web.config file, where pipelines are configured, to see if I could find something more suitable.

After a few minutes of looking I came to the renderField pipeline, which has a stage called GetLinkFieldValue.  That sounded like what I was after, so I broke out Reflector and took a look inside that class.  Based on my findings I crafted a new class, GetEncryptedLinkFieldValue, that performs the same base functionality, with additional logic for when the LinkType is “mailto”, which is the link type for email addresses.

It turned out not to be as easy as overriding the particular functionality I wanted, so I created some new classes. All I needed to change in GetLinkFieldValue was to have it create my own custom LinkRenderer, but its CreateRenderer method was protected and not virtual, so it could not be overridden.  To get around this I recreated the GetLinkFieldValue class as GetEncryptedLinkFieldValue.

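using Sitecore.Data.Items;            // Item
using Sitecore.Diagnostics;           // Assert
using Sitecore.Pipelines.RenderField; // RenderFieldArgs
using Sitecore.Xml.Xsl;               // LinkRenderer

public class GetEncrpytedLinkFieldValue 
{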
    protected LinkRenderer CreateRenderer(Item item)
    {
        return new EncryptedEmailLinkRenderer(item);
    }

    
    public void Process(RenderFieldArgs args)
    {
        switch (args.FieldTypeKey)
        {
            case "link":
            case "general link":
                {
                    SetWebEditParameters(args, new string[] { "class", "text", "target", "haschildren" });
                    if (!string.IsNullOrEmpty(args.Parameters["text"]))
                    {
                        args.WebEditParameters["text"] = args.Parameters["text"];
                    }
                    LinkRenderer renderer = this.CreateRenderer(args.Item);
                    renderer.FieldName = args.FieldName;
                    renderer.FieldValue = args.FieldValue;
                    renderer.Parameters = args.Parameters;
                    renderer.RawParameters = args.RawParameters;
                    args.DisableWebEditContentEditing = true;
                    RenderFieldResult result = renderer.Render();
                    args.Result.FirstPart = result.FirstPart;
                    args.Result.LastPart = result.LastPart;
                    break;
                }
        }
    }

    private static void SetWebEditParameters(RenderFieldArgs args, params string[] parameterNames)
    {
        Assert.ArgumentNotNull(args, "args");
        Assert.ArgumentNotNull(parameterNames, "parameterNames");
        foreach (string str in parameterNames)
        {
            if (!string.IsNullOrEmpty(args.Parameters[str]))
            {
                args.WebEditParameters[str] = args.Parameters[str];
            }
        }
    }
}

The only difference from Sitecore's GetLinkFieldValue is that mine returns my EncryptedEmailLinkRenderer, which adds the ability to encode/encrypt email addresses in links.  Sitecore's Link type can represent a number of kinds of link, such as links to other documents in the content tree, links to external sites, and more, including email links.

[Screenshot: sitecore-email]

The LinkType property indicates which of the above options was used to create a link. When “Insert Email” is selected, the LinkType is “mailto”.  The important piece of code to change is in the Render method, where I can update the dictionary of attributes for the anchor (<a>) tag that will be output.

if (((str8 = this.LinkType) != null) && (str8 == "javascript"))
{
    dictionary["href"] = "#";
    dictionary["onclick"] = StringUtil.GetString(new string[] { dictionary["onclick"], url });
}
else if (this.LinkType == "mailto") {
    var encryptedEmailAddress = EncodeEmailAddress(url.Replace("mailto:", ""));
    dictionary["href"] = string.Format("javascript:sendEmail('{0}')", encryptedEmailAddress);

    if (IsLinkTextEmailAddress(str)) {
        dictionary["class"] = "encrypted";
        str = EncodeEmailAddress(str);
    }                
}
else
{
    dictionary["href"] = HttpUtility.HtmlEncode(StringUtil.GetString(new string[] { dictionary["href"], url }));
}

 

The IsLinkTextEmailAddress method checks whether the inner text of the <a> element is an email address.  If so, it also needs to be encoded, which means I need a way to decode it on the client side.  To accomplish this I add the “encrypted” CSS class to every link whose text needs decoding.  The jQuery for my page needs to contain the following logic to perform the decoding on page load:

$(function() {
    // Run once the DOM is ready: swap the encoded text for the real address.
    $('a.encrypted').each(function() {
        var emailAddress = decodeEmail($(this).text());
        $(this).text(emailAddress);
    });
});
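If jQuery is not available on a page, the same decoding can be done with the standard DOM API. The sketch below is an assumption rather than part of the original solution: hexToText and decodeLinkText are hypothetical names, the sketch carries its own small hex decoder so it can run standalone, and it accepts any array-like collection of objects that have a textContent property.

```javascript
// Hypothetical standalone hex decoder (two hex digits per character).
function hexToText(hex) {
    var text = "";
    for (var i = 0; i < hex.length; i += 2) {
        text += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16));
    }
    return text;
}

// Replace the encoded text of each element with the readable address.
function decodeLinkText(elements) {
    for (var i = 0; i < elements.length; i++) {
        elements[i].textContent = hexToText(elements[i].textContent);
    }
}

// In a browser, run it once the DOM is ready.
if (typeof document !== "undefined") {
    document.addEventListener("DOMContentLoaded", function () {
        decodeLinkText(document.querySelectorAll("a.encrypted"));
    });
}
```

Because decodeLinkText only relies on length and textContent, it can also be exercised outside a browser with plain objects.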

 

You’ll notice that the anchor tag's href calls a JavaScript function, sendEmail.  This function decodes the encoded address and prepends mailto: before navigating to it:

function sendEmail(encodedEmail) {
    location.href = "mailto:" + decodeEmail(encodedEmail);
}

function decodeEmail(encodedEmail) {
    var email = "";

    for (var i = 0; i < encodedEmail.length; ) {
        var letter = "";
        letter = encodedEmail.charAt(i) + encodedEmail.charAt(i + 1);

        email += String.fromCharCode(parseInt(letter, 16));
        i += 2;
    }

    return email;
}

This JavaScript also needs to be added to the page (inline or, preferably, in a .js file).
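decodeEmail assumes it is always handed a well-formed string of hex pairs. If the markup could ever be edited by hand, a guard such as the one sketched below avoids producing garbage characters. decodeEmailSafely is a hypothetical name, and returning null for bad input is just one possible design choice.

```javascript
// Hypothetical defensive variant: returns null unless the input is
// a non-empty string made up entirely of two-digit hex pairs.
function decodeEmailSafely(encodedEmail) {
    if (typeof encodedEmail !== "string" ||
        !/^(?:[0-9A-Fa-f]{2})+$/.test(encodedEmail)) {
        return null;
    }
    var email = "";
    for (var i = 0; i < encodedEmail.length; i += 2) {
        email += String.fromCharCode(parseInt(encodedEmail.slice(i, i + 2), 16));
    }
    return email;
}

console.log(decodeEmailSafely("726F62")); // rob
console.log(decodeEmailSafely("726F6"));  // null (odd length)
console.log(decodeEmailSafely("sales"));  // null (not hex)
```

A caller could then skip the link, or leave its text untouched, whenever null comes back.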

Below is the complete class, which inherits from Sitecore.Xml.Xsl.LinkRenderer:

public class EncryptedEmailLinkRenderer : LinkRenderer
{
    private readonly char[] _delimiter = new char[] { '=', '&' };

    public EncryptedEmailLinkRenderer(Item item) : base(item) { }
    
    protected override string GetUrl(XmlField field)
    {
        if (field != null)
        {
            return new LinkUrl().GetUrl(field, this.Item.Database);
        }
        return LinkManager.GetItemUrl(this.Item, GetUrlOptions());
    }

    protected internal static UrlOptions GetUrlOptions()
    {
        UrlOptions defaultUrlOptions = LinkManager.GetDefaultUrlOptions();
        defaultUrlOptions.SiteResolving = Settings.Rendering.SiteResolving;
        return defaultUrlOptions;
    }

    public override RenderFieldResult Render()
    {
        string str8;
        SafeDictionary<string> dictionary = new SafeDictionary<string>();
        dictionary.AddRange(this.Parameters);
        if (MainUtil.GetBool(dictionary["endlink"], false))
        {
            return RenderFieldResult.EndLink;
        }
        Set<string> set = Set<string>.Create(new string[] { "field", "select", "text", "haschildren", "before", "after", "enclosingtag", "fieldname" });
        LinkField linkField = this.LinkField;
        if (linkField != null)
        {
            dictionary["title"] = StringUtil.GetString(new string[] { dictionary["title"], linkField.Title });
            dictionary["target"] = StringUtil.GetString(new string[] { dictionary["target"], linkField.Target });
            dictionary["class"] = StringUtil.GetString(new string[] { dictionary["class"], linkField.Class });
        }
        string str = string.Empty;
        string rawParameters = this.RawParameters;
        if (!string.IsNullOrEmpty(rawParameters) && (rawParameters.IndexOfAny(this._delimiter) < 0))
        {
            str = rawParameters;
        }
        if (string.IsNullOrEmpty(str))
        {
            Item targetItem = this.TargetItem;
            string str3 = (targetItem != null) ? targetItem.DisplayName : string.Empty;
            string str4 = (linkField != null) ? linkField.Text : string.Empty;
            str = StringUtil.GetString(new string[] { str, dictionary["text"], str4, str3 });
        }
        string url = this.GetUrl(linkField);
        if (((str8 = this.LinkType) != null) && (str8 == "javascript"))
        {
            dictionary["href"] = "#";
            dictionary["onclick"] = StringUtil.GetString(new string[] { dictionary["onclick"], url });
        }
        else if (this.LinkType == "mailto") {
            var encryptedEmailAddress = EncodeEmailAddress(url.Replace("mailto:", ""));
            dictionary["href"] = string.Format("javascript:sendEmail('{0}')", encryptedEmailAddress);

            if (IsLinkTextEmailAddress(str)) {
                dictionary["class"] = "encrypted";
                str = EncodeEmailAddress(str);
            }                
        }
        else
        {
            dictionary["href"] = HttpUtility.HtmlEncode(StringUtil.GetString(new string[] { dictionary["href"], url }));
        }
        StringBuilder tag = new StringBuilder("<a", 0x2f);
        foreach (KeyValuePair<string, string> pair in dictionary)
        {
            string key = pair.Key;
            string str7 = pair.Value;
            if (!set.Contains(key.ToLowerInvariant()))
            {
                FieldRendererBase.AddAttribute(tag, key, str7);
            }
        }
        tag.Append('>');
        if (!MainUtil.GetBool(dictionary["haschildren"], false))
        {
            if (string.IsNullOrEmpty(str))
            {
                return RenderFieldResult.Empty;
            }
            tag.Append(str);
        }
        RenderFieldResult result = new RenderFieldResult();
        result.FirstPart = tag.ToString();
        result.LastPart = "</a>";
        return result;
    }

    private static string EncodeEmailAddress(string email)
    {
        return BitConverter.ToString(ASCIIEncoding.ASCII.GetBytes(email)).Replace("-", "");
    }

    private static bool IsLinkTextEmailAddress(string linkText)
    {
        // copied from /sitecore/system/Settings/Validation Rules/Field Rules/Common/Is Email
        // to use the same pattern as the email rule within Sitecore.
        const string emailValidPattern = @"^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$";
        var emailValidator = new Regex(emailValidPattern);

        return emailValidator.IsMatch(linkText);
    }
}
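To get a feel for which link texts IsLinkTextEmailAddress treats as email addresses, the same pattern can be tried out in JavaScript. This is only an approximation: .NET's \w also matches Unicode letters and digits, whereas JavaScript's \w covers ASCII letters, digits and the underscore only, so results can differ for non-ASCII text. The function name looksLikeEmail is hypothetical.

```javascript
// Same pattern as the C# emailValidPattern constant, as a JavaScript regex literal.
var emailPattern = /^[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]$/;

// Hypothetical helper mirroring IsLinkTextEmailAddress.
function looksLikeEmail(linkText) {
    return emailPattern.test(linkText);
}

console.log(looksLikeEmail("sales@mammothmedia.com.au")); // true
console.log(looksLikeEmail("Email our sales team"));      // false
console.log(looksLikeEmail("a@example.com"));             // false
```

The last case is worth knowing about: the pattern requires at least two characters before the @, so a link whose text is a one-letter mailbox would be left unencoded in the page source, even though its href is still encoded.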

Finally, update the renderField pipeline in the web.config file, replacing Sitecore's GetLinkFieldValue processor with the new one:

<renderField>
    <processor type="Sitecore.Pipelines.RenderField.SetParameters, Sitecore.Kernel" />
    <processor type="Sitecore.Pipelines.RenderField.GetFieldValue, Sitecore.Kernel" />
    <processor type="Sitecore.Pipelines.RenderField.ExpandLinks, Sitecore.Kernel" />
    <processor type="Sitecore.Pipelines.RenderField.GetImageFieldValue, Sitecore.Kernel" />
    <!--<processor type="Sitecore.Pipelines.RenderField.GetLinkFieldValue, Sitecore.Kernel" />-->
    <processor type="MyVeryOwn.Pipelines.RenderField.GetEncrpytedLinkFieldValue, sitecore" />
    <processor type="Sitecore.Pipelines.RenderField.GetInternalLinkFieldValue, Sitecore.Kernel" />
    <processor type="Sitecore.Pipelines.RenderField.GetMemoFieldValue, Sitecore.Kernel" />
    <processor type="Sitecore.Pipelines.RenderField.GetDateFieldValue, Sitecore.Kernel" />
    <processor type="Sitecore.Pipelines.RenderField.GetDocxFieldValue, Sitecore.Kernel" />
    <processor type="Sitecore.Pipelines.RenderField.AddBeforeAndAfterValues, Sitecore.Kernel" />
    <processor type="Sitecore.Pipelines.RenderField.RenderWebEditing, Sitecore.Kernel" />
</renderField>
