Parallelizing image requests in ASP.NET for Kentico CMS using Response Filters

Using multiple domains for static files and images is a recommended way to decrease your page load time. Browsers allow only a limited number of concurrent connections per domain. The trick is to serve our images from multiple domain names. Here's how I did it with Kentico and a Response Filter.
November 03 2011

Background

Using multiple domains for static files and images is a recommended way to decrease your page load time. Browsers allow only a limited number of concurrent connections per domain.

RFC 2616 states: “Clients that use persistent connections SHOULD limit the number of simultaneous connections that they maintain to a given server. A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy. A proxy SHOULD use up to 2*N connections to another server or proxy, where N is the number of simultaneously active users. These guidelines are intended to improve HTTP response times and avoid congestion.”

The key point to take from the above is the recommendation that a browser should have a maximum of 2 connections per domain name. This means that if our pages download a lot of JavaScript, CSS, and image files, they will likely load slowly, as the browser waits for connections to become available.

We can reduce the number of requests for JavaScript and CSS by using script combiners that join all JavaScript files into one JavaScript file and all CSS files into one CSS file. But what about images?

The trick here is to use multiple domain names for our images. We can use up to four domain names for this, which lets the browser download up to four times as many images at once. The different domains or subdomains all still point to the same IP address. We also want each image to always go to the same domain name, so the browser can cache it properly.

The Good Stuff

A website I recently developed has many thumbnail images on the home page: as few as 20, but it could be as high as the site editor wants to make it. Occasionally the site was timing out, with a timeout defined by operations as taking longer than 60 seconds to completely load. Running the site through Firebug and then Pingdom revealed that some responses to image requests were on occasion taking longer than 2 minutes to return, even for static images where a 304 Not Modified was returned. The site does have iframes embedded, which are cached locally, but the cache expires and it's possible that the source of an iframe is slow to respond and in turn holds up other elements from loading. I wanted to make it obvious where the bottlenecks lay.

The website is built with Kentico CMS, so I had to be careful where to put the logic that changes the URL for images. Other projects, which used a CMS custom-built for them, made this easy because the domain name could be selected when the entity was retrieved from the database. Kentico doesn’t afford me the same luxury, so my options were either to update the src attribute when I created the HTML for the page/template (e.g. in the ASCX or ASPX file) or to opt for a catch-all method that scans every outgoing page and modifies it. I chose the latter, using response filters.

During one of the later stages in the HTTP Pipeline the rendered markup is handed off to a response filter which, if supplied, has an opportunity to inspect and modify the markup before it is returned to the requesting browser.

What I needed to do was intercept the outgoing response after it had been rendered, search for any img elements, extract the src attribute and modify it as necessary.

I store the possible domain names to use in the web.config file.

<add key="ImageHosts" value="i1.myimageserver.com,i2.myimageserver.com,i3.myimageserver.com,i4.myimageserver.com" />

(You would use your own domain names. These are made up.)

Because we always want the same image to resolve to the same server, I use GetHashCode() to help select a domain name from the above list. Kentico CMS stores images as attachments as well as in the Media Library. To leave images handled by Kentico out of the parallelization, we need to exclude URLs starting with /getattachment/ and /getmedia/. I control this with a web.config setting.

<add key="ParallelizeKenticoImages" value="true" />
<add key="UseImageParallelization" value="true" />

Code:

using System;
using System.Collections.Generic;
using System.Configuration;

public class ImageHostsHelper
{
    private static string[] _imageHosts;        
    private readonly static string[] _dynamicImageUrls = new []
                                            {
                                                "/getattachment/",
                                                "/getmedia/"
                                            };
                                                            
    private static string[] ImageHosts
    {
        get
        {
            if (_imageHosts == null)
            {
                var imageHostsString = string.IsNullOrEmpty(ConfigurationManager.AppSettings["ImageHosts"])
                    ? "www.mysite.com.au"
                    : ConfigurationManager.AppSettings["ImageHosts"];

                _imageHosts = imageHostsString.Split(new[] { "," }, StringSplitOptions.RemoveEmptyEntries);
            }

            return _imageHosts;
        }
    }

    // When true, Kentico-served images (/getattachment/, /getmedia/) are parallelized too.
    public static bool ParallelizeKenticImages
    {
        get
        {                
            return bool.Parse(string.IsNullOrEmpty(ConfigurationManager.AppSettings["ParallelizeKenticoImages"])
                ? "true"
                : ConfigurationManager.AppSettings["ParallelizeKenticoImages"]);
        }             
    }

    // Master switch for the whole feature; defaults to off when the setting is missing.
    public static bool UseImageParallelization
    {
        get
        {                
            return bool.Parse(string.IsNullOrEmpty(ConfigurationManager.AppSettings["UseImageParallelization"])
                ? "false"
                : ConfigurationManager.AppSettings["UseImageParallelization"]);
        }             
    }

    public string GetImageUrl(string imageUrl)
    {                     
        if (!string.IsNullOrEmpty(imageUrl) && !imageUrl.StartsWith("http") && !ContainsAnyOf(imageUrl, _dynamicImageUrls)) {             
            // Mask off the sign bit rather than calling Math.Abs, which throws for int.MinValue.
            var hostNumber = (imageUrl.GetHashCode() & int.MaxValue) % ImageHosts.Length;
            return string.Format("http://{0}{1}", ImageHosts[hostNumber], imageUrl);
        }
        return imageUrl;
    }

    private bool ContainsAnyOf(string content, IEnumerable<string> startStrings) 
    {
        // When Kentico images are to be parallelized too, nothing is excluded.
        if (ParallelizeKenticImages)
            return false;

        foreach (var starter in startStrings) {
            // Skip empty entries; report a match when the URL contains a dynamic-image path.
            if (!string.IsNullOrEmpty(starter) && (content.IndexOf(starter) > -1))
                return true;
        }
        return false;
    } 
}
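
One caveat on the hashing above: string.GetHashCode() is not guaranteed to return the same value across .NET versions or between 32-bit and 64-bit processes, so an image could move to a different host after a platform change and be downloaded again by browsers. If that matters to you, a hand-rolled hash gives a stable mapping. The following is only a sketch of that idea, not what the site runs; StableHostPicker, Fnv1a and PickHost are names I made up for illustration, and it uses the well-known FNV-1a 32-bit hash.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the original code): picks a host for an
// image URL using FNV-1a, which gives the same result on every platform.
public static class StableHostPicker
{
    // Standard FNV-1a 32-bit constants.
    private const uint FnvOffsetBasis = 2166136261;
    private const uint FnvPrime = 16777619;

    public static uint Fnv1a(string value)
    {
        uint hash = FnvOffsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(value))
        {
            unchecked
            {
                hash ^= b;        // mix in the next byte
                hash *= FnvPrime; // multiply, wrapping on overflow
            }
        }
        return hash;
    }

    public static string PickHost(string imageUrl, string[] hosts)
    {
        if (imageUrl == null)
            throw new ArgumentNullException("imageUrl");
        if (hosts == null || hosts.Length == 0)
            throw new ArgumentException("At least one host is required.", "hosts");

        // The hash is unsigned, so the modulo is never negative.
        return hosts[(int)(Fnv1a(imageUrl) % (uint)hosts.Length)];
    }
}
```

To try it, GetImageUrl could call StableHostPicker.PickHost(imageUrl, ImageHosts) instead of computing hostNumber from GetHashCode().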

With the logic to generate the domain name for each image done, I now need to create the response filter. This is done using an HttpModule that takes the incoming request context and hooks the response filter onto the response for that request.

The HttpModule is pretty basic, hooking the filter into the Response. The only notable part is that I had to skip requests for webresource.axd, or the resources embedded inside the assembly would not load and the CMS back office would not function correctly.

public class ImageHostsModule : IHttpModule 
{
    public void Dispose() { }
    
    public void Init(HttpApplication context)
    {
        context.BeginRequest += new EventHandler(this.BeginRequestHandler);
    }

    void BeginRequestHandler(object sender, EventArgs e)
    {
        HttpApplication application = (HttpApplication)sender;

        // Skip webresource.axd so embedded assembly resources still load.
        if (ImageHostsHelper.UseImageParallelization
            && application.Context.Request.Url.AbsolutePath.ToLowerInvariant().IndexOf("webresource.axd") == -1)
        {
            application.Response.Filter = new ImageHostsFilter(application.Response.Filter);
        }
    }
}

I pass in the Response.Filter to my custom filter so I have access to the outgoing response stream and can modify it.  The filter itself looks for any img elements, and if found, extracts the src value, modifies it and puts it back, using ImageHostsHelper.GetImageUrl to get the new src value.

/// <summary>
/// Scans outgoing HTML for img elements and updates the src path to use parallelization of images
/// </summary>
public class ImageHostsFilter : MemoryStream
{
    private System.IO.Stream _filter;
    private bool _filtered = false;
    private static ImageHostsHelper _helper = new ImageHostsHelper();

    public ImageHostsFilter(System.IO.Stream filter)
    {
        _filter = filter;
    }

    public override void Close()
    {
        const string sourceToken = "src=";
        const string imageToken = "<img ";
        const string imageEndToken = "/>";

        if (_filtered) {
            if (Length > 0) {
                byte[] bytes;
                string content = System.Text.Encoding.UTF8.GetString(this.ToArray());
                
                // find all images.
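                var occurrence = content.IndexOf(imageToken, 0);
                while (occurrence >= 0) {
                    int occurrenceEnd = content.IndexOf(imageEndToken, occurrence + imageToken.Length);
                    if (occurrenceEnd <= occurrence) 
                        break;

                    int srcIndex = content.IndexOf(sourceToken, occurrence + imageToken.Length);
                    int srcPos = srcIndex + sourceToken.Length;

                    // Only handle a quoted src attribute that lies inside this img element.
                    if (srcIndex >= 0 && srcPos < occurrenceEnd && (content[srcPos] == '"' || content[srcPos] == '\'')) {
                        int srcPosEnd = content.IndexOf(content[srcPos], srcPos + 1);
                        if (srcPosEnd > srcPos && srcPosEnd < occurrenceEnd) {
                            string imageUrl = content.Substring(srcPos + 1, srcPosEnd - srcPos - 1);
                            string newImageUrl = _helper.GetImageUrl(imageUrl.Replace(HttpContext.Current.Request.GetBaseUrl(), ""));

                            if (!newImageUrl.Equals(imageUrl)) {
                                // Replace only this occurrence, then shift the end marker by the
                                // change in length so the next search starts in the right place.
                                content = content.Substring(0, srcPos + 1) + newImageUrl + content.Substring(srcPosEnd);
                                occurrenceEnd += newImageUrl.Length - imageUrl.Length;
                            }
                        }
                    }

                    occurrence = content.IndexOf(imageToken, occurrenceEnd + imageEndToken.Length);
                }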
                bytes = System.Text.Encoding.UTF8.GetBytes(content);
                _filter.Write(bytes, 0, bytes.Length);
            }
            _filter.Close();
        }            
        base.Close();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        if ((System.Web.HttpContext.Current != null) 
            && ("text/html" == System.Web.HttpContext.Current.Response.ContentType)) {                
            base.Write(buffer, offset, count);
            _filtered = true;
        }  else {
            _filter.Write(buffer, offset, count);
            _filtered = false;
        }            
    }
}
 

Now all that’s left is to include the HttpModule in your web.config, which you know how to do, right?
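
In case it helps, a registration might look something like the fragment below. The MySite.Web namespace and assembly name are placeholders I made up; substitute the ones from your own project. Which section you need depends on how the site is hosted: httpModules under system.web for the classic pipeline, modules under system.webServer for the IIS 7+ integrated pipeline.

```xml
<!-- Classic pipeline (IIS 6, or IIS 7+ in classic mode) -->
<system.web>
  <httpModules>
    <!-- "MySite.Web" is a placeholder namespace and assembly name -->
    <add name="ImageHostsModule" type="MySite.Web.ImageHostsModule, MySite.Web" />
  </httpModules>
</system.web>

<!-- Integrated pipeline (IIS 7+) -->
<system.webServer>
  <modules>
    <add name="ImageHostsModule" type="MySite.Web.ImageHostsModule, MySite.Web" />
  </modules>
</system.webServer>
```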
