
Manual File Downloads In ASP.NET

Fighting Response.End and the infamous ThreadAbortException while trying to implement manual file downloading in an ASP.NET application.

In this article, I'm going to talk about a simple ASP.NET application I recently created for the Slovak parliament. The application allows browsing and displaying transcripts of parliamentary debates (see http://www.nrsr.sk/appbin/net/nrozprava/; Slovak language only).

The transcripts are stored as plain text files. Each text file contains the transcript for a given time frame, which is encoded in the file's name. All the text files are stored in a single file system folder.

The job of the ASP.NET application is, among other things, to display the list of transcripts with their associated time frames and to let the user view the contents of selected transcripts.

In order for the transcript files to be more readable to the user, the application applies some basic HTML formatting to the transcript's plain text before it is sent to the browser for viewing.

For various reasons, I've designed the formatting process to take place dynamically whenever a user requests a given transcript file:

First, I've defined an IDownloadService interface that is implemented by the class that formats the plain text files and generates the HTML content (the RozpravaRaw class - see below):

Public Interface IDownloadService
  Sub SendItemToBrowser( _
    ByVal itemID As String, _
    ByVal response As HttpResponse)
End Interface
The itemID argument identifies what should be downloaded. In the case of transcript files, it is the file name. The response argument references the current HttpResponse instance.

Second, I've implemented a generic Download.aspx page, which expects two QueryString variables passed to it:

  • The Type variable contains a type name of a class that would handle the download by implementing the IDownloadService interface.
  • The Ref variable is the name of an item to be downloaded (i.e. a transcript file), which is passed as the itemID argument in the call to IDownloadService.SendItemToBrowser.
Third, I've added the actual IDownloadService implementation to the RozpravaRaw class ("rozprava" is the Slovak word for "debate" :-)), which already handled other aspects of transcript file processing:
Public Class RozpravaRaw
  Implements IDownloadService
  ...
  Private Sub SendItemToBrowser( _
    ByVal itemID As String, _
    ByVal response As System.Web.HttpResponse) _
    Implements IDownloadService.SendItemToBrowser
    ' Generate content dynamically (Response.Write etc.)
  End Sub
End Class
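To make the "generate content dynamically" step more concrete, here is a minimal sketch of what an IDownloadService implementation for plain text might look like. It is not the actual RozpravaRaw code: the PlainTextTranscript class name, the TranscriptFolder path, the UTF-8 encoding and the one-paragraph-per-line formatting are all assumptions made for illustration. It relies on the IDownloadService interface shown above.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Web

' Hypothetical implementation, for illustration only. The real
' RozpravaRaw formatting logic is not shown in the article.
Public Class PlainTextTranscript
  Implements IDownloadService

  ' Assumed location of the transcript files.
  Private Const TranscriptFolder As String = "C:\Transcripts"

  Private Sub SendItemToBrowser( _
    ByVal itemID As String, _
    ByVal response As HttpResponse) _
    Implements IDownloadService.SendItemToBrowser

    ' Keep only the file name, so itemID cannot point outside the folder.
    Dim fileName As String = Path.GetFileName(itemID)
    Dim fullPath As String = Path.Combine(TranscriptFolder, fileName)

    response.ContentType = "text/html"
    response.Write("<html><head><title>")
    response.Write(HttpUtility.HtmlEncode(fileName))
    response.Write("</title></head><body>")

    ' Assumption: the files are UTF-8 encoded.
    Dim reader As New StreamReader(fullPath, Encoding.UTF8)
    Try
      Dim line As String = reader.ReadLine()
      While Not line Is Nothing
        ' Encode the text so "<" or "&" in a transcript cannot break the markup.
        response.Write("<p>")
        response.Write(HttpUtility.HtmlEncode(line))
        response.Write("</p>")
        line = reader.ReadLine()
      End While
    Finally
      reader.Close()
    End Try

    response.Write("</body></html>")
  End Sub
End Class
```

Because the item name arrives from the query string, the sketch strips any directory part with Path.GetFileName before touching the file system.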
The processing goes as follows:
  1. The user views a list of transcripts in her browser. Each transcript file link is generated using the Download.aspx page, for example
    Download.aspx?Type=NRozprava.RozpravaRaw&Ref=XXX030909101042000_030909101500000.txt.
  2. The user clicks the link, which activates the Download.aspx page.
  3. The Page_Load event handler in the Download.aspx page extracts the Type and Ref elements from the query string, instantiates the designated type and calls the IDownloadService.SendItemToBrowser method, passing the Ref element as the itemID argument:
Private Sub Page_Load( _
  ByVal sender As System.Object, _
  ByVal e As System.EventArgs) Handles MyBase.Load
  Try
    ...
    ' Parse the query string - "Ref" and "Type" variables are mandatory.
    Dim TypeName, Ref As String
    Me.ParseQueryString(TypeName, Ref)

    ' Create an IDownloadService dynamically and delegate to it.
    Dim Service As IDownloadService = CreateDownloadService(TypeName)
    Service.SendItemToBrowser(Ref, Response)
  Catch ex As Exception
    SetError(ex.ToString())
    Trace.Warn(ex.ToString())
    Diagnostics.Trace.WriteLine(ex.ToString())
  End Try
End Sub

Private Sub ParseQueryString( _
 ByRef typeName As String, _
 ByRef ref As String)

  typeName = Request.QueryString("Type")
  If typeName Is Nothing Then
    typeName = String.Empty
  End If
  ' If typeName contains a space, it is just a URL-decoded "+" sign,
  ' which the CLR uses when naming nested classes.
  typeName = typeName.Replace(" "c, "+"c)

  ref = Request.QueryString("Ref")
  If ref Is Nothing Then
    ref = String.Empty
  End If
End Sub

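Private Shared Function CreateDownloadService( _
  ByVal typeName As String) As IDownloadService
  Try
    Dim Assm As System.Reflection.Assembly = _
      System.Reflection.Assembly.GetExecutingAssembly()
    ' CreateInstance returns Nothing (it does not throw) if the type is not found.
    Dim Service As Object = Assm.CreateInstance(typeName, True)
    If Service Is Nothing Then
      Throw New ArgumentException("Type not found.", "typeName")
    End If
    ' DirectCast throws if the type does not implement IDownloadService.
    Return DirectCast(Service, IDownloadService)
  Catch ex As Exception
    Throw New ApplicationException( _
      String.Format("Unknown request ({0}).", typeName), ex)
  End Try
End Function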
I've designed it this way because, besides the dynamic transcript file download, the application had to support downloading other file types (namely .DOC and .MP3 files). This way, the download link is always the same and the actual download content generation logic is placed within the particular IDownloadService implementation.
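As an illustration of the "always the same link" idea, the following sketch builds a Download.aspx URL from a service type and an item name. The DownloadLinks class and the CreateDownloadUrl function are hypothetical names invented for this example, not code from the project.

```vbnet
Imports System
Imports System.Web

' Hypothetical helper, for illustration only.
Public Class DownloadLinks

  ' Builds a link such as Download.aspx?Type=...&Ref=...
  Public Shared Function CreateDownloadUrl( _
    ByVal serviceType As Type, _
    ByVal itemID As String) As String

    ' UrlEncode escapes the "+" that the CLR puts into the names of
    ' nested classes, so it is not decoded as a space on the way back.
    Return String.Format("Download.aspx?Type={0}&Ref={1}", _
      HttpUtility.UrlEncode(serviceType.FullName), _
      HttpUtility.UrlEncode(itemID))
  End Function
End Class
```

A call might look like DownloadLinks.CreateDownloadUrl(GetType(RozpravaRaw), "XXX030909101042000_030909101500000.txt"). When the result is written into an HTML attribute, the "&" separator should additionally be HTML-encoded.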

Here is the complete Download.aspx.vb code.

Well, everything seemed to work fine until I accidentally looked at the HTML source code of a downloaded transcript file.

Take a look at this URL, for example; open the link in the browser and view the HTML source code.

Do you see what's wrong?

...I'm waiting...

Now you've got it - there are two <HTML> blocks in the file! The first block is the one I've generated dynamically in the SendItemToBrowser implementation. The second one was presumably added by the ASP.NET page-processing infrastructure.

Although this is not a fatal error (IE and Mozilla display just the first HTML block), it is still an error. Moreover, the same mechanism is used to download binary files (.DOC and .MP3), where the effect of additional data at the end of the file can be fatal.
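To show why trailing data matters for binary content, here is a rough sketch of what an IDownloadService implementation for .MP3 files might look like. The Mp3Download class name and the AudioFolder path are assumptions made for this example; the application's real binary download classes are not shown in this article.

```vbnet
Imports System
Imports System.IO
Imports System.Web

' Hypothetical implementation, for illustration only.
Public Class Mp3Download
  Implements IDownloadService

  ' Assumed location of the audio files.
  Private Const AudioFolder As String = "C:\Recordings"

  Private Sub SendItemToBrowser( _
    ByVal itemID As String, _
    ByVal response As HttpResponse) _
    Implements IDownloadService.SendItemToBrowser

    ' Keep only the file name, so itemID cannot point outside the folder.
    Dim fileName As String = Path.GetFileName(itemID)
    Dim fullPath As String = Path.Combine(AudioFolder, fileName)

    ' Discard anything already buffered, then send the raw bytes.
    response.Clear()
    response.ContentType = "audio/mpeg"
    response.AddHeader("Content-Disposition", _
      "attachment; filename=""" & fileName & """")
    response.WriteFile(fullPath)

    ' Anything the page appends after this point ends up
    ' inside the file the user saves.
  End Sub
End Class
```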

So there it was - my quest for a solution began.

The first solution that came to mind was to use the Response.End method to terminate the ASP.NET page processing after the dynamic content had been generated:

Private Sub Page_Load( _
  ByVal sender As System.Object, _
  ByVal e As System.EventArgs) Handles MyBase.Load
  Try
    ...
    ' Parse the query string - "Ref" and "Type" variables are mandatory.
    Dim TypeName, Ref As String
    Me.ParseQueryString(TypeName, Ref)

    ' Create an IDownloadService dynamically and delegate to it.
    Dim Service As IDownloadService = CreateDownloadService(TypeName)
    Service.SendItemToBrowser(Ref, Response)
    Response.End()
  Catch ex As Exception
    SetError(ex.ToString())
    Trace.Warn(ex.ToString())
    Diagnostics.Trace.WriteLine(ex.ToString())
  End Try
End Sub
The problem with this approach is that the Response.End call raises a ThreadAbortException. This exception is a very lusty beast: even if you catch it, the runtime rethrows it at the end of the Catch block (any Finally blocks still run). In the Page_Load code above, this also means that the generic Catch block reports an error on every successful download. The only way to "swallow" the ThreadAbortException is to call Thread.ResetAbort. But...that call prevents the thread from being aborted, which is what we're trying to achieve in the first place.
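The rethrow behavior can be observed outside ASP.NET with a small console sketch. It assumes the classic .NET Framework (Thread.Abort is not supported on .NET Core and later), and it aborts the thread directly as a stand-in for what Response.End does; the AbortDemo module and Worker method are names made up for this example.

```vbnet
Imports System
Imports System.Threading

' Console sketch for the .NET Framework only.
Module AbortDemo

  Private Sub Worker()
    Try
      Try
        ' Stand-in for Response.End, which aborts the current thread.
        Thread.CurrentThread.Abort()
      Catch ex As ThreadAbortException
        Console.WriteLine("Inner Catch: caught ThreadAbortException")
        ' Calling Thread.ResetAbort() here would cancel the abort
        ' and let execution continue after End Try.
      End Try
      ' Skipped: the exception is raised again when the Catch block ends.
      Console.WriteLine("After inner Try")
    Catch ex As ThreadAbortException
      Console.WriteLine("Outer Catch: the exception arrived again")
    Finally
      Console.WriteLine("Finally: still runs")
    End Try
    ' Skipped as well: the thread ends once the abort has unwound the stack.
    Console.WriteLine("End of Worker")
  End Sub

  Sub Main()
    Dim t As New Thread(AddressOf Worker)
    t.Start()
    t.Join()
    Console.WriteLine("Worker thread has ended")
  End Sub
End Module
```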

In addition, good practice recommends that exceptions should not be thrown in the normal course of operation, and I wanted to stick with the rules.

Another solution, explained in this MSKB article, is to use the HttpApplication.CompleteRequest method instead of Response.End. Unfortunately, it doesn't help either: the "default" ASP.NET HTML code is still appended to the dynamically generated content even if you call CompleteRequest.

Yet another solution that I came across on my quest was to remove all the HTML code from the 'download' page (see the post at the end). Although it might work (I haven't tried it), I don't like this solution because it feels like a hack (I mean, it is completely undocumented). There is no guarantee that future ASP.NET versions won't put some HTML content into the response buffers even if the associated aspx page is empty.

At this point, I finally realized that what I really needed was complete control over the content sent to the browser. IHttpHandler comes to save the day.

IHttpHandler is an interface defined by ASP.NET, which "Defines the contract that ASP.NET implements to synchronously process HTTP Web requests using custom HTTP handlers."

Because I didn't want to change the name of the Download class (its Shared CreateHyperlink method was referenced in several other places in the project), I decided to "morph" the existing Download.aspx web page into a custom IHttpHandler:

I've excluded the Download.aspx file from the project and added a new Download class hosted in an ordinary Download.vb code file. I've added the Implements IHttpHandler clause and implemented the core IHttpHandler.ProcessRequest method by pasting the relevant code from the "old" Download.aspx.vb code-behind file.

Here is the new Download custom HTTP handler source code.
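For readers who cannot open the linked source, the following is a minimal sketch of what such a handler could look like. It is not the original Download.vb: the error handling (400 and 404 status codes) is an assumption, while the Type and Ref query string variables and the IDownloadService interface are the ones described earlier.

```vbnet
Imports System
Imports System.Reflection
Imports System.Web

' Sketch of a custom HTTP handler, not the author's actual code.
Public Class Download
  Implements IHttpHandler

  ' The handler keeps no per-request state, so one instance can be reused.
  Public ReadOnly Property IsReusable() As Boolean _
    Implements IHttpHandler.IsReusable
    Get
      Return True
    End Get
  End Property

  Public Sub ProcessRequest(ByVal context As HttpContext) _
    Implements IHttpHandler.ProcessRequest

    Dim typeName As String = context.Request.QueryString("Type")
    Dim itemID As String = context.Request.QueryString("Ref")
    If typeName Is Nothing OrElse itemID Is Nothing Then
      context.Response.StatusCode = 400
      context.Response.Write("Both Type and Ref are required.")
      Return
    End If

    ' A "+" in a nested class name arrives URL-decoded as a space.
    typeName = typeName.Replace(" "c, "+"c)

    ' CreateInstance returns Nothing for an unknown type name.
    Dim instance As Object = _
      [Assembly].GetExecutingAssembly().CreateInstance(typeName, True)
    If Not TypeOf instance Is IDownloadService Then
      context.Response.StatusCode = 404
      context.Response.Write("Unknown request.")
      Return
    End If

    ' Only what the service writes is sent to the browser.
    DirectCast(instance, IDownloadService).SendItemToBrowser( _
      itemID, context.Response)
  End Sub
End Class
```

Unlike a page, the handler has no rendering phase, so there is nothing left to append markup after ProcessRequest returns.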

In order to actually associate the Download.aspx URL with our custom HTTP handler, the following lines had to be added to the <system.web> section of the application's web.config file:

<httpHandlers>
  <add verb="*" path="Download.aspx"
       type="NRozprava.Download, NRozprava"/>
</httpHandlers>
Now, whenever a browser requests the Download.aspx URL, the ASP.NET plumbing instantiates our Download class and calls the IHttpHandler.ProcessRequest method. Exactly what the method writes to the response buffer is sent back to the browser - nothing more, nothing less!

The conclusion is simple: If you need to generate custom HTTP responses in your ASP.NET application, use the IHttpHandler-based technique. Forget about the other tricks; they're crude and simply not "politically correct".

© Palo Mraz, Tuesday, September 30, 2003

©2003-2007 Palo Mraz. All Rights Reserved.