Script injection and StringTemplate

I just read a very interesting article called Secure String Interpolation in JS that got me thinking about rendering in StringTemplate. I don’t follow the caja project but it looks interesting. I found the article linked from John Resig’s blog.

Script injection in web applications, which also goes by the name XSS (Cross Site Scripting), is a serious and common security problem. It arises when the application does not appropriately escape data that comes from an untrusted source. The data could contain malicious JavaScript that would then get executed in the user’s browser. If developers are burdened with remembering to escape each and every instance of data, they will eventually forget or make a mistake, and a security hole will result. One main point of the article is that string interpolation should be automatically secure: it should do the right escaping for you.
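To make the problem concrete, here is a minimal sketch in Python (not StringTemplate, and not the code from the article). The function names render_unsafe and render_escaped are made up for this illustration; html.escape is from the Python standard library.

```python
import html

def render_unsafe(template: str, **data: str) -> str:
    """Naive interpolation: untrusted values are inserted verbatim."""
    return template.format(**data)

def render_escaped(template: str, **data: str) -> str:
    """Interpolation that escapes every value for HTML element content."""
    return template.format(**{k: html.escape(v) for k, v in data.items()})

# Untrusted input, e.g. a comment submitted through a form.
comment = "<script>alert('xss')</script>"

unsafe = render_unsafe("<p>{comment}</p>", comment=comment)
safe = render_escaped("<p>{comment}</p>", comment=comment)

print(unsafe)  # the script tag survives and would run in a browser
print(safe)    # the markup is neutralized into character references
```

The second function is the "automatically secure" idea in miniature: the person writing the template never has to remember to call the escaper.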

Every now and again the issue of how to escape special characters when generating HTML files comes up on the StringTemplate mailing list. The answer is very straightforward: write a renderer. My main point, which I’ll get to eventually, is about the details of the renderer.

The author of the above article distinguishes string interpolation from “full blown” templating languages. JSP, PHP, and XSL were given as examples. Three problems with full blown templating languages were given (folks familiar with StringTemplate know there are more):

  1. They do nothing to help solve the escaping problem.
  2. They are verbose and add lots of boilerplate.
  3. They don’t make simple things simple.

I think StringTemplate holds up well against these complaints. Renderers can do the escaping. The simple templates (.st files) have no boilerplate at all and the group files (.stg) add only a little. Lastly, it doesn’t get much simpler than “Hello $name$!”.

The most important issue, and the one the article is really about, is number one: escaping.

On an HTML page, which characters need to be escaped depends on the context: element content, attribute value, JavaScript, or URI. StringTemplate rendering assumes a single rendering context. As of StringTemplate version 3.1, the format option can be used to select a specific rendering (format). The trouble with this is that the template writer must be aware of the different contexts and which format to use in each. An alternative is to come up with a one-size-fits-all escaping renderer. This will end up escaping more than it needs to, and there may be some fringe cases where that causes problems.
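Here is a rough Python sketch of why one value needs different treatment in each context. The four function names are my own invention for illustration; the calls they wrap (html.escape, json.dumps, urllib.parse.quote) are standard library. The rules are simplified and should not be taken as a complete or audited escaping policy.

```python
import html
import json
from urllib.parse import quote

def escape_element(value: str) -> str:
    """Element content: only &, < and > need replacing."""
    return html.escape(value, quote=False)

def escape_attribute(value: str) -> str:
    """Quoted attribute value: quotes must be replaced as well."""
    return html.escape(value, quote=True)

def escape_js_string(value: str) -> str:
    """JavaScript string literal inside a script block.

    json.dumps produces a quoted literal; '<' is additionally written
    as \\u003c so the value cannot contain a closing script tag.
    """
    return json.dumps(value).replace("<", "\\u003c")

def escape_uri_component(value: str) -> str:
    """URI component: percent-encode everything except unreserved chars."""
    return quote(value, safe="")

value = 'Tom & "Jerry" </script>'

for escaper in (escape_element, escape_attribute,
                escape_js_string, escape_uri_component):
    print(f"{escaper.__name__:22} {escaper(value)}")
```

With the format option, the template author is the one who has to pick the right function out of these four at every use.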

The thing I found most interesting about the approach given in the article is that the context scanner keeps track of the current context in the output so that the escaper can do the right escaping according to the context. To put this in StringTemplate terms, the output after rendering would be fed into a parser that tracks changes in context (element to attribute value to JavaScript to element content, etc.) and informs the renderer(s) of the current context.
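A toy version of the idea might look like the following Python sketch. It is my own simplification, not the article’s scanner and not anything in StringTemplate: it only knows three contexts, only understands double-quoted attributes, and assumes well-formed template text. All names (scan, render, ESCAPERS, refuse) are hypothetical.

```python
import html

# Contexts this toy scanner distinguishes (real HTML has many more).
TEXT, TAG, ATTR = "text", "tag", "attr"

def scan(literal: str, state: str) -> str:
    """Advance the context over one chunk of trusted template text."""
    for ch in literal:
        if state == TEXT and ch == "<":
            state = TAG
        elif state == TAG and ch == ">":
            state = TEXT
        elif state == TAG and ch == '"':
            state = ATTR   # entering a double-quoted attribute value
        elif state == ATTR and ch == '"':
            state = TAG    # leaving the attribute value
    return state

def refuse(value: str) -> str:
    """Values placed directly inside a tag are rejected in this sketch."""
    raise ValueError("no safe escaping for a value directly inside a tag")

ESCAPERS = {
    TEXT: lambda v: html.escape(v, quote=False),
    ATTR: lambda v: html.escape(v, quote=True),
    TAG: refuse,
}

def render(literals, values):
    """Interleave trusted literals with values escaped for their context."""
    out, state = [], TEXT
    for i, literal in enumerate(literals):
        out.append(literal)
        state = scan(literal, state)
        if i < len(values):
            out.append(ESCAPERS[state](str(values[i])))
    return "".join(out)

page = render(['<a title="', '">', "</a>"], ['say "hi"', "x < y"])
print(page)
```

Only the trusted literal text drives the state machine; the escaped values are appended without being scanned, which is safe here because escaping removes the characters that could change the context.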

You can still use the format option or other custom renderers to specify how the data is turned into a string (for example a number could be currency or a percentage). This is something that the template author is expected to know. It seems reasonable to me that it would be better if the author didn’t need to understand the intricate escaping rules of web pages.

A key innovation of StringTemplate is the identification of the Model-View-Controller-Renderer pattern: that there is a renderer distinct from the view. (Read about MVCR here.) Now I am seeing the renderer as two distinct pieces: the format renderer and the escaping renderer. The former is concerned with whether negative 23 should be output as (23) or -23. The latter is concerned with whether a single quote should be output as ' or &#39;. The format renderer runs first and passes its output to the escaping renderer.

The reason it makes sense to think of two different renderers is that they each have different masters. The format renderer is directed by the needs of the application, the locale, and perhaps presentation-directed formatting chosen with the format option. The escaping renderer’s function is determined by the language of the output: SQL, HTML, XML or just the text of an email message.
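The split could be sketched like this in Python. This is not StringTemplate’s renderer interface; the function names, the "accounting" format name and the ESCAPERS table are all assumptions made up for the example.

```python
import html

def format_value(value, format_name: str = "plain") -> str:
    """Format renderer: driven by the application and the format option."""
    if isinstance(value, (int, float)) and format_name == "accounting":
        # Accounting style shows negative numbers in parentheses.
        return f"({abs(value)})" if value < 0 else str(value)
    return str(value)

# Escaping renderer: chosen by the output language, not by the template.
ESCAPERS = {
    "html": lambda text: html.escape(text, quote=True),
    "text": lambda text: text,  # a plain-text email needs no escaping
}

def render_value(value, format_name: str, output_language: str) -> str:
    """Run the format renderer first, then hand its output to the escaper."""
    formatted = format_value(value, format_name)
    return ESCAPERS[output_language](formatted)

print(render_value(-23, "accounting", "html"))
print(render_value(-23, "plain", "html"))
print(render_value("O'Brien", "plain", "html"))
print(render_value("O'Brien", "plain", "text"))
```

The template author only ever chooses the format name. The output language is fixed once for the whole template group, so the escaping stage needs no input from the author.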

Other thoughts and observations:

Is there a need for a JavaScript implementation of StringTemplate? As it becomes more common to move the model, view and controller to the client there could be a role for it. It could also run server side in Rhino. Although the author of the article states that it is not a goal to create a template language I wonder if the extra capabilities of StringTemplate would be welcome.

The method used in the article for implementing auditable exemptions (cases where you know the data is safe and you don’t want it escaped) is the same as would be done in StringTemplate. Just wrap the data in a type that returns the raw string when rendered.
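As a sketch of that wrapper idea in Python (the class name SafeHtml and the function escape_for_html are hypothetical, not part of StringTemplate or the article’s code):

```python
import html

class SafeHtml:
    """Marks a string as markup the application has vetted as safe."""

    def __init__(self, markup: str):
        self.markup = markup

def escape_for_html(value) -> str:
    """Escape everything except values explicitly wrapped in SafeHtml."""
    if isinstance(value, SafeHtml):
        return value.markup  # exemption: emit the raw string
    return html.escape(str(value), quote=True)

print(escape_for_html("<b>bold?</b>"))
print(escape_for_html(SafeHtml("<b>bold!</b>")))
```

The exemptions are auditable because every one of them has to go through the wrapper type, so a search of the code base for SafeHtml finds them all.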

In the article a simple FSM was used to parse the HTML. Only an excerpt of the code was shown and I didn’t bother to download it and look deeper. It seems to me that the output HTML needs to be well formed; otherwise the parser would have to be very complicated to deal with the tag soup that browsers deal with.

This is now at least the fourth time I’ve seen a parser written in JavaScript, so I think there could be a need for a JavaScript backend for ANTLR. I would have liked to create an L-system language parser in JavaScript for a recent project of mine. Is there already an ANTLR for JavaScript that I don’t know about?

Does anyone else think that it would be useful/feasible to have an HTML escaping renderer (one that does the correct escaping automatically) as part of the StringTemplate library?

jQueryUI – A new hope

Recently I had some not so nice things to say about jQueryUI while still professing my love for jQuery. What didn’t come across in that post is that I am hopeful that jQueryUI will become as great as jQuery.

Now it seems I have good reason to be hopeful. John Resig tells us the good news that Paul Bakaus (the lead developer) now works full time on jQueryUI at Liferay and that a new release is due out soon.