Enabling Client-Side Caching of Generated Content in ASP.NET

By: Yorai Aminov

Abstract: This article describes how to support client-side caching of dynamically generated content in ASP.NET and Delphi, and lists some of the lesser-known problems with ASP.NET’s HttpCachePolicy class.

    Introduction

Web sites are slow. Well, usually slower than native applications, anyway. It’s not that web applications are badly written, or that browsers are stupid – it’s simply that browsers and web servers have to perform many tasks just to show a single page. For example:

  1. The browser has to parse the URL, extract the server name, resolve its address, and connect to it.
  2. The browser then sends an HTTP request and waits for a response.
  3. The server parses the request, and tries to fulfill it.
  4. The server then sends an HTTP response to the browser.
  5. The browser parses the response headers and extracts the content.
  6. Finally, the browser renders the content. If the page has embedded content (such as images or style sheets), the steps are repeated for every embedded element.

The third and fourth steps usually take the longest. The server has to load or generate the content, then transfer it to the browser over the Internet. For static sites, that transfer is the slowest part, so browsers try to cache the data locally. This is a little more difficult for web applications.

Web applications usually work by generating content based on information stored in databases. In fact, almost every element you see on this page was retrieved from a database. Because content is dynamically generated, the web server doesn’t know if the content is going to change, so it can’t give the browser enough information to cache the results. It then becomes the responsibility of the application to provide that information.

    How Client-Side Caching Works

For caching to work, browsers and web servers follow a set of rules defined in the HTTP 1.1 specification. The rules can be quite complicated, but the basic process is simple:

  1. When the server returns content that can be cached, it provides additional information such as when the content was last modified, when it expires, and who is allowed to cache it.
  2. When the browser asks for content, it tells the server which version of the content it currently has in its cache.
  3. If the server determines the content has not changed, it returns an HTTP 304 (“Not Modified”) status code with no content, and the browser uses its cached copy. Otherwise, the request is treated as a normal request for new content, and the server tries to fulfill it.

The caching information is conveyed in HTTP request and response headers. For static content, most web servers provide the caching response headers and process the caching request headers automatically.
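
As a rough illustration of the exchange described above, here is a minimal sketch in Python (chosen only because it is easy to run; it is not ASP.NET code). The function name respond and the use of plain dictionaries for headers are assumptions made for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def respond(request_headers, last_modified, body):
    """Return (status, headers, body) for a cacheable resource.

    last_modified must be a UTC-aware datetime with no sub-second part,
    because HTTP dates have a resolution of one second.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            cached_at = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            cached_at = None  # unparseable date: send a full response
        if cached_at is not None:
            if cached_at.tzinfo is None:
                cached_at = cached_at.replace(tzinfo=timezone.utc)
            if last_modified <= cached_at:
                return 304, headers, b""  # the browser's copy is current
    return 200, headers, body
```

A first request carries no If-Modified-Since header and gets the full content; a repeat request that echoes the Last-Modified value gets an empty 304 response.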

    Client-Side Caching in ASP.NET

ASP.NET allows pages to control caching, including client-side caching, through its “output caching” feature, using several methods:

  • By using the outputCache and outputCacheSettings elements (represented by the OutputCacheSection and OutputCacheSettingsSection classes) in web.config and machine.config files.
  • By setting the @ OutputCache directive in ASP.NET pages and user controls.
  • By explicitly setting caching policy using the HttpCachePolicy class.

The last method is particularly useful for ASP.NET applications that generate content in code.

    HttpCachePolicy

The HttpCachePolicy class contains several methods for controlling output caching, and looks something like this:

type
  HttpCachePolicy = class sealed
  public
    procedure AddValidationCallback(handler: HttpCacheValidateHandler;
      data: TObject);
    procedure AppendCacheExtension(extension: string);
    procedure SetAllowResponseInBrowserHistory(allow: boolean);
    procedure SetCacheability(cacheability: HttpCacheability); overload;
    procedure SetCacheability(cacheability: HttpCacheability;
      field: string); overload;
    procedure SetETag(etag: string);
    procedure SetETagFromFileDependencies;
    procedure SetExpires(date: DateTime);
    procedure SetLastModified(date: DateTime);
    procedure SetLastModifiedFromFileDependencies;
    procedure SetMaxAge(delta: TimeSpan);
    procedure SetNoServerCaching;
    procedure SetNoStore;
    procedure SetNoTransforms;
    procedure SetOmitVaryStar(omit: boolean);
    procedure SetProxyMaxAge(delta: TimeSpan);
    procedure SetRevalidation(revalidation: HttpCacheRevalidation);
    procedure SetSlidingExpiration(slide: boolean);
    procedure SetValidUntilExpires(validUntilExpires: boolean);
    procedure SetVaryByCustom(custom: string);
    property VaryByContentEncodings: HttpCacheVaryByContentEncodings;
    property VaryByHeaders: HttpCacheVaryByHeaders;
    property VaryByParams: HttpCacheVaryByParams;
  end;

Code can access the HttpCachePolicy class using the Cache property of the current HttpResponse.

    Supporting Client-Side Caching in Code

To support client-side caching for dynamic content, an ASP.NET web application has to do two things:

  1. Content that can be cached needs to be decorated with the appropriate response headers.
  2. The application must process the request headers and determine whether cached content has changed since it was last cached.

Because caching headers use a timestamp, the application must keep a timestamp of the last modification to the content.

    Setting Response Headers

    Last Modification Timestamp

The SetLastModified method of HttpCachePolicy adds the Last-Modified HTTP header to the response, and correctly encodes the specified timestamp. The browser will pass the timestamp to the server next time it tries to retrieve the content.

  Response.Cache.SetLastModified(ModifyDate);
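
To show what “correctly encodes” means on the wire, here is a small Python sketch (not ASP.NET code) that produces an HTTP date from a timestamp. The function name last_modified_header is an assumption made for this example; the point is that the header is always expressed in GMT and carries whole seconds only.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def last_modified_header(modify_date):
    """Format a timezone-aware datetime as an HTTP date.

    The value is converted to GMT and any fraction of a second is
    dropped, since the HTTP date format cannot represent it.
    """
    utc = modify_date.astimezone(timezone.utc).replace(microsecond=0)
    return format_datetime(utc, usegmt=True)
```

For example, a timestamp of 01:37:27.5 on 25 April 2008 in a UTC+2 time zone becomes Thu, 24 Apr 2008 23:37:27 GMT.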

    Entity Tag

Some clients, however, do not send conditional requests unless the ETag header is also specified. The ETag header contains a unique “entity tag.” Clients may use the entity tag instead of or in addition to the last modification date, so applications should provide both.

The entity tag must identify the exact version of the content. One method of generating such a unique identifier is to use the content name and last modification date. The following code generates an entity tag based on those parameters:

function GetFileETag(fileName: string; modifyDate: DateTime): string;
var
  FileString: string;
  StringEncoder: Encoder;
  StringBytes: array of Byte;
  MD5Enc: MD5CryptoServiceProvider;
begin
  { Use file name and modify date as the unique identifier }
  FileString := fileName + modifyDate.ToString;

  { Get string bytes }
  StringEncoder := Encoding.UTF8.GetEncoder;
  SetLength(StringBytes,
    StringEncoder.GetByteCount(
      FileString.ToCharArray, 0, Length(FileString), True));
  StringEncoder.GetBytes(FileString.ToCharArray, 0,
    Length(FileString), StringBytes, 0, True);

  { Hash string using MD5 and return the hex-encoded hash }
  MD5Enc := MD5CryptoServiceProvider.Create;
  Result :=
    '"' + BitConverter.ToString(
      MD5Enc.ComputeHash(StringBytes)).Replace('-', '') + '"';
end;

The double quotes added by the function are expected by the HTTP 1.1 specification for entity tags.
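
For readers who want to experiment outside .NET, the same idea can be sketched in Python. This is an illustration only: the name get_file_etag mirrors the function above, but the date is formatted with isoformat rather than .NET’s ToString, so the resulting tags will not match those produced by the Delphi code.

```python
import hashlib
from datetime import datetime

def get_file_etag(file_name, modify_date):
    """Build a quoted entity tag from a resource name and timestamp."""
    # Use the name and modification date as the unique identifier
    source = file_name + modify_date.isoformat()
    # MD5 is used here only as a compact fingerprint, not for security
    digest = hashlib.md5(source.encode("utf-8")).hexdigest().upper()
    # Entity tags are quoted strings in HTTP 1.1
    return '"' + digest + '"'
```

Any change to the name or the timestamp yields a different tag, which is all a validator needs.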

The SetETag method of the HttpCachePolicy class adds the ETag header to the response. However, there are a few restrictions on the use of this method:

  1. The SetETag method can only be called once. Calling it a second time will raise an exception.
  2. The method does not add the double quotes expected by HTTP 1.1. If you use a different algorithm to create the tag, make sure you include the double quotes.
  3. HttpCachePolicy will not add the header if the “cacheability” is set to Private, which is the default value. Unless you call the SetCacheability method with a different value, you’ll have to add the header by calling HttpResponse’s AppendHeader method.

As far as I can tell, that last restriction isn’t documented, and may be a bug.

    Expiration

You can specify cache expiration using the SetExpires, SetMaxAge, and SetSlidingExpiration methods. Most browsers will not send a request for cached content until it expires. This means the server will not get the opportunity to check whether the content has been updated, so only set expiration for content you know is not going to change within the specified timeframe.
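
The HTTP 1.1 headers involved in expiration are Expires and the max-age directive of Cache-Control. The Python sketch below builds such a pair by hand; it illustrates the header format and is not a claim about the exact output of HttpCachePolicy. The function name expiration_headers is an assumption made for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def expiration_headers(now, max_age):
    """Return Expires and Cache-Control headers for a given lifetime.

    now must be a UTC-aware datetime; max_age is a timedelta.
    """
    expires = now.replace(microsecond=0) + max_age
    return {
        "Cache-Control": "public, max-age=%d" % int(max_age.total_seconds()),
        "Expires": format_datetime(expires, usegmt=True),
    }
```

With a one-hour lifetime, a browser that honors these headers is free to skip the server entirely for the next hour, which is exactly why the lifetime should be chosen conservatively.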

    Processing Requests for Cached Content

If the content is cached, browsers send what the HTTP 1.1 specification calls “conditional GET” requests. A conditional GET is a GET request that includes an If-Modified-Since, If-Unmodified-Since, If-Match, If-None-Match, or If-Range header field, but not the Range header field (in that case it is considered a “partial GET” request).
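
The definition above translates into a simple check on the request header names. The following Python sketch is illustrative only; the function name classify_get and the three labels it returns are assumptions made for this example.

```python
CONDITIONAL_FIELDS = (
    "if-modified-since",
    "if-unmodified-since",
    "if-match",
    "if-none-match",
    "if-range",
)

def classify_get(headers):
    """Classify a GET request as 'partial', 'conditional', or 'plain'."""
    # Header names are case-insensitive in HTTP
    names = {name.lower() for name in headers}
    if "range" in names:
        return "partial"
    if names.intersection(CONDITIONAL_FIELDS):
        return "conditional"
    return "plain"
```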

The If-Modified-Since request header corresponds to the Last-Modified response header, and contains the same value. Similarly, the If-None-Match request header contains the value passed in the ETag response header.

According to the HTTP 1.1 specification, browsers must pass the If-Match or If-None-Match headers for cache-conditional requests if the server provided an entity tag, but not all browsers do. The specification also states that servers receiving requests containing both an entity tag and a last modification date must process both, and may only return a 304 status code if all conditions match.
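
The “all conditions must match” rule can be sketched as follows. This Python example is an illustration under stated assumptions: the function names etag_matches and not_modified are made up for this example, header names are looked up with the exact casing shown, and entity tags are assumed to contain no commas (true for hex digests). It also accepts the comma-separated list and the * wildcard that HTTP 1.1 allows in If-None-Match.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def etag_matches(if_none_match, current_etag):
    """True if current_etag is listed in an If-None-Match value."""
    value = if_none_match.strip()
    if value == "*":
        return True
    return current_etag in [tag.strip() for tag in value.split(",")]

def not_modified(headers, current_etag, last_modified):
    """True only if a validator was sent and every validator sent matches.

    last_modified must be a UTC-aware datetime without sub-second part.
    """
    results = []
    if "If-None-Match" in headers:
        results.append(etag_matches(headers["If-None-Match"], current_etag))
    if "If-Modified-Since" in headers:
        try:
            since = parsedate_to_datetime(headers["If-Modified-Since"])
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            results.append(last_modified <= since)
        except (TypeError, ValueError):
            results.append(False)  # unparseable date counts as a mismatch
    return bool(results) and all(results)
```

Note one design choice: this sketch treats a request that carries only an entity tag as sufficient for a 304 response. An application may prefer to be stricter and require the date as well.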

The following code determines whether content has been updated since it was last cached:

function IsFileModified(fileName: string; modifyDate: DateTime;
  eTag: string; request: HttpRequest): Boolean;
var
  FileDateModified: Boolean;
  ModifiedSince: DateTime;
  ModifyDiff: TimeSpan;
  ETagChanged: Boolean;
begin
  { Assume file has been modified unless we can determine otherwise }
  FileDateModified := True;

  { Check If-Modified-Since request header, if it exists }
  if (Length(request.Headers['If-Modified-Since']) > 0) and
    DateTime.TryParse(request.Headers['If-Modified-Since'], ModifiedSince) then
  begin
    FileDateModified := False;
    if ModifyDate > ModifiedSince then
    begin
      ModifyDiff := ModifyDate - ModifiedSince;
      { Ignore time differences of up to one second to compensate for date
        encoding }
      FileDateModified := ModifyDiff > TimeSpan.FromSeconds(1);
    end;
  end;

  { Check the If-None-Match header, if it exists. This header is used by Firefox
    to validate entities based on the ETag response header }
  ETagChanged := False;
  if (Length(request.Headers['If-None-Match']) > 0) then
    ETagChanged := request.Headers['If-None-Match'] <> ETag;

  Result := ETagChanged or FileDateModified;
end;

Because all that’s needed to determine whether the content has been updated is the resource name and the last modification date, it’s possible to optimize applications that retrieve content from a database by only checking the required fields and not retrieving the actual content.

Notice the additional check for time differences larger than one second:

      ModifyDiff := ModifyDate - ModifiedSince;
      { Ignore time differences of up to one second to compensate for date
        encoding }
      FileDateModified := ModifyDiff > TimeSpan.FromSeconds(1);

The DateTime type has a higher resolution than the HTTP date/time format, so we lose values smaller than one second when setting the Last-Modified header. Making sure the content is at least one second newer than the cache avoids the problem of always assuming the cache is invalid if the last modification timestamp does not have a round second value.
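
The effect is easy to reproduce. In the Python sketch below (an illustration, not the article’s Delphi code), a timestamp with a fractional second is written to a header and read back; a plain comparison then reports the content as newer than the cache, while the tolerant comparison does not. The function name is_date_modified is an assumption made for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

def is_date_modified(modify_date, if_modified_since):
    """True if modify_date is more than one second past the cached date."""
    cached = parsedate_to_datetime(if_modified_since)
    return modify_date - cached > timedelta(seconds=1)

# A stored timestamp with a fractional second
stored = datetime(2008, 4, 24, 23, 37, 27, 730000, tzinfo=timezone.utc)
# What the browser receives and later sends back: whole seconds only
header = format_datetime(stored.replace(microsecond=0), usegmt=True)

naive_result = stored > parsedate_to_datetime(header)  # spuriously "modified"
tolerant_result = is_date_modified(stored, header)     # cache still valid
```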

    Returning Responses for Cached Content

If the application determines the cache is still valid, it may return a 304 HTTP status code:

  if not IsFileModified(fileName, modifyDate, ETag, request) then
  begin
    { File hasn't changed, so return HTTP 304 without retrieving the data }
    Response.StatusCode := 304;
    Response.StatusDescription := 'Not Modified';
    Response.&End;
    Exit;
  end;

A 304 response must not contain a message-body.

The code shown above isn’t actually HTTP 1.1 compliant. There are a couple of additional requirements:

  • If a normal response would have included an ETag header, that header must also be included in the 304 response.
  • Cache headers (Expires, Cache-Control, and/or Vary) must also be included if their values might differ from those sent in a previous response.

Since we always include the ETag header in normal responses, we must include it in 304 responses:

  if not IsFileModified(fileName, modifyDate, ETag, request) then
  begin
    { File hasn't changed, so return HTTP 304 without retrieving the data }
    response.StatusCode := 304;
    response.StatusDescription := 'Not Modified';
    response.Cache.SetCacheability(HttpCacheability.&Public);
    response.Cache.SetLastModified(ModifyDate);
    response.Cache.SetETag(ETag);
    response.&End;
    Exit;
  end;

There’s still one problem with the code, and it’s not obvious without checking the HTTP response at run-time: the Connection field is automatically added to the header, with its value set to close. This causes the browser to close the connection, which means the browser will have to open a new connection for future requests, which may affect performance.

The reason the field is added is that the content is empty, and because the browser doesn’t know what content to expect, it waits for additional data. The problem can be prevented by explicitly adding the Content-Length header. Since we’re not returning any data, we set the field value to 0:

  if not IsFileModified(fileName, modifyDate, ETag, request) then
  begin
    { File hasn't changed, so return HTTP 304 without retrieving the data }
    response.StatusCode := 304;
    response.StatusDescription := 'Not Modified';

    { Explicitly set the Content-Length header so the client doesn't wait for
      content but keeps the connection open for other requests }
    response.AddHeader('Content-Length', '0');

    response.Cache.SetCacheability(HttpCacheability.&Public);
    response.Cache.SetLastModified(ModifyDate);
    response.Cache.SetETag(ETag);
    response.&End;
    Exit;
  end;
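
The same response shape can be sketched outside ASP.NET. The Python example below is a minimal WSGI application that mirrors the header set of the final listing: the 304 branch repeats the validators and sets Content-Length to 0. The factory name make_app and the exact-match ETag comparison are assumptions made for this example, and whether an explicit Content-Length is needed depends on the server hosting the application.

```python
def make_app(etag, last_modified, body):
    """Create a WSGI app serving one resource with caching validators.

    etag is a quoted entity tag; last_modified is an HTTP date string.
    """
    def app(environ, start_response):
        validators = [
            ("Cache-Control", "public"),
            ("Last-Modified", last_modified),
            ("ETag", etag),
        ]
        if environ.get("HTTP_IF_NONE_MATCH") == etag:
            # Content hasn't changed: repeat the validators, send no body
            start_response("304 Not Modified",
                           [("Content-Length", "0")] + validators)
            return []
        start_response("200 OK",
                       [("Content-Length", str(len(body)))] + validators)
        return [body]
    return app
```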

    Sample Responses

The code presented in this article is a simplified version of the code used to display this site. Here are a couple of HTTP requests and responses with client caching enabled:

First Request:

GET /article/36897/images/36897/screen-shot-1.jpg HTTP/1.1
Accept: */*
Referer: http://www.codegear.com/products/radstudio
Accept-Language: en-US,he-IL;q=0.5
Accept-Encoding: gzip, deflate
Host: www.codegear.com
Connection: Keep-Alive

Response:

HTTP/1.1 200 OK
Date: Tue, 06 May 2008 17:58:16 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Content-disposition: inline; filename=screen-shot-1.jpg
Content-Length: 278054
Cache-Control: public
Last-Modified: Thu, 24 Apr 2008 23:37:27 GMT
ETag: "78BC2A032DBD0B296726BD94B57568F8"
Content-Type: image/jpeg; charset=utf-8

Second Request (image is cached):

GET /article/36897/images/36897/screen-shot-1.jpg HTTP/1.1
Accept: */*
Referer: http://www.codegear.com/products/radstudio
Accept-Language: en-US,he-IL;q=0.5
Accept-Encoding: gzip, deflate
If-Modified-Since: Thu, 24 Apr 2008 23:37:27 GMT
If-None-Match: "78BC2A032DBD0B296726BD94B57568F8"
Host: www.codegear.com
Connection: Keep-Alive

Response:

HTTP/1.1 304 Not Modified
Date: Tue, 06 May 2008 17:58:52 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Content-Length: 0
Cache-Control: public
Last-Modified: Thu, 24 Apr 2008 23:37:27 GMT
ETag: "78BC2A032DBD0B296726BD94B57568F8"
Content-Type: image/jpeg
