The imperia View Processor

The main motivation for developing the imperia view processor was frustration with the templating engines available for Perl at the time. None of them offered all of the features that I considered a must for the V in the imperia MVC framework. Some of the more important design decisions behind it are outlined below, giving you some insight into the anatomy of a modern template processor.

Please note that the syntax highlighting for the examples below is far from perfect. This site uses the Rouge syntax highlighter, which has no support for imperia view templates.

General Syntax

Like many other templating engines, the imperia view processor copies its input to its output as is, except for sections marked up with the combination of curly braces and percent signs.

<div>
verbatim
{% VIEWCODE %}
verbatim
</div>

The opening and closing delimiters are actually configurable but that is a rather esoteric feature and rarely used.

Whitespace Digestion

Template engines often add gratuitous whitespace to their output, sometimes leading to layout problems. The imperia view processor gives you very fine control over the processing of whitespace around view instructions.

<div><a href="{% link %}">
    {%= "Label" -%}
  </a>
</div>

Delimiters decorated with a hyphen (-) are greedy and eat all the whitespace in their direction that is in the way until the next line feed. Those decorated with an equal sign (=) are very greedy and also consume line feeds. The above code will hence produce this:

<div><a href="http://content.of.variable.link">Label
  </a>
</div>

The basic idea was taken from Template Toolkit but extended.
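The trimming rules can be sketched in a few lines of JavaScript. This is only an illustration under assumptions, not the actual implementation: the function names digestWhitespace, trimLeading and trimTrailing are made up, and the evaluation of the view code is replaced by a simple callback.

```javascript
// Hypothetical sketch, not the imperia implementation.
// "-" eats whitespace up to (but not including) the next line feed,
// "=" eats all whitespace including line feeds.
function trimLeading(text, decoration) {
    if (decoration === '=') return text.replace(/^\s+/, '');
    if (decoration === '-') return text.replace(/^[^\S\n]+/, '');
    return text;
}

function trimTrailing(text, decoration) {
    if (decoration === '=') return text.replace(/\s+$/, '');
    if (decoration === '-') return text.replace(/[^\S\n]+$/, '');
    return text;
}

function digestWhitespace(template, evaluate) {
    const tag = /\{%([-=]?)([\s\S]*?)([-=]?)%\}/g;
    let output = '';
    let last = 0;
    let pending = '';  // decoration of the previous closing delimiter
    let match;
    while ((match = tag.exec(template)) !== null) {
        const [whole, open, code, close] = match;
        let literal = template.slice(last, match.index);
        literal = trimLeading(literal, pending);   // previous tag eats forwards
        literal = trimTrailing(literal, open);     // this tag eats backwards
        output += literal + evaluate(code.trim());
        pending = close;
        last = match.index + whole.length;
    }
    return output + trimLeading(template.slice(last), pending);
}

// The example from above; the values stand in for the real payload.
const values = { 'link': 'http://content.of.variable.link', '"Label"': 'Label' };
const template = '<div><a href="{% link %}">\n    {%= "Label" -%}\n  </a>\n</div>\n';
const rendered = digestWhitespace(template, (code) => values[code]);
console.log(rendered);
```

The opening {%= removes the line feed and the indentation in front of the label, while the closing -%} finds no whitespace before the next line feed and therefore leaves it alone.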

Comments

View comments are almost HTML comments but they start with <!--%. That has a big advantage:

  <!--% FIXME! This has never been tested because nobody has IE around here ... -->
  Best viewed with Internet Explorer.

The comment will simply vanish in the output. You can also combine this with the whitespace consuming features:

  <strong>
    <!--%= FIXME! This has never been tested because nobody has IE around here ... =%-->
    Best viewed with Internet Explorer.<!--%=
  %--></strong>

This will produce:

  <strong>Best viewed with Internet Explorer.</strong>

View Directives

There are two flavors of view directives, bodyless directives and directives with a body:

<ul>
{%- #foreach item (@{items}) =%}
  <li>{% item %}</li>
{%= #end -%}
</ul>

This is an example of a directive with a body. It begins with a #KEYWORD, the keyword being foreach in this case. It ends with a corresponding #end.

Then there are bodyless directives that begin with a double hash sign (##):

<footer>
{% ##include('layout/footer.html') %}
</footer>

Think <br/>!

Expressions

Everything else inside the curly-percent combinations is interpreted as code in the “X” mini language:

{% 
    factorial = 1;
    if (number != 0) {
        for (i = number; i > 0; --i) {
            factorial *= i;
        }
    }
%}
<span>The factorial of {% number %} is {% factorial %}.</span>

X is a mixture of JavaScript and Perl without its infamous variable sigils. The name X was chosen because B, C, and D were already taken.

Terms

Terms have a syntax very similar to JavaScript (also known as ECMAScript):

1
2
3
4
5
6
7
8
9
10
{% i %} <!--% Variable. -->
{% (a + b) * (a - b) %} <!--% Arithmetic. -->
{% document.head.title %} <!--% Hash lookups (or maybe method calls). -->
{% document[head][title] %} <!--% Same as above -->
{% items[5] %} <!--% 6th element of array "items". -->
{% items.5 %} <!--% Exactly the same. -->
{% items["5"] %} <!--% Still the same. -->
{% meta.getHeader('Content-Type') %} <!--% Method invocation. -->
{% meta.getAuthor() %} <!--% Method invocation without arguments. -->
{% meta.getAuthor %} <!--% Same as above. -->

When view templates are rendered, they are passed an arbitrarily complex data structure, called the payload. Terms are evaluated against this data structure. The term in line 3 is hence evaluated like this by the view processor (more precisely by the X interpreter):

return $payload->{document}->{head}->{title};
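For readers more at home in JavaScript, the lookup can be sketched as follows. This is an assumption-laden illustration: resolveTerm and the sample payload are invented for this example, and the real X interpreter certainly handles more cases (bracket syntax, method arguments).

```javascript
// Hypothetical sketch: walks a dotted term through the payload.
function resolveTerm(payload, term) {
    let current = payload;
    for (const key of term.split('.')) {
        if (current === null || current === undefined) return undefined;
        const value = current[key];
        // A function-valued property is treated as a method call
        // without arguments, so "meta.getAuthor" works without "()".
        current = typeof value === 'function' ? value.call(current) : value;
    }
    return current;
}

// Sample payload, invented for this illustration.
const payload = {
    document: { head: { title: 'Hello' } },
    items: ['a', 'b', 'c', 'd', 'e', 'f'],
    meta: { author: 'Jane', getAuthor() { return this.author; } },
};

console.log(resolveTerm(payload, 'document.head.title'));
console.log(resolveTerm(payload, 'items.5'));
console.log(resolveTerm(payload, 'meta.getAuthor'));
```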

Functions

The syntax for function calls should not be surprising to anybody.

{% push(array, item1, item2, item3) %}
<span>{% time() %} seconds since January 1st 1970, 00:00 GMT have passed.</span>

This should be enough information to understand the code snippets used below.

Interesting Features

I had worked with a lot of template engines before I started writing the imperia view processor. Its set of features combines almost everything that I missed in other templating engines.

Body-First-Head-Last-Rendering

In HTML templates you often have to struggle with the fact that the head is rendered before the body. That leads to awkward and unclean code.

Take this typical example of an HTML template, not at all specific to the imperia view processor:

1
2
3
4
5
6
7
8
<html>
  <head>
  {% ##include('layout/head.html') %}
  </head>
  <body>
  {% ##include(imperia.body.template) %}
  </body>
</html>

The HTML head is included in line 3, the body in line 6. Note that the argument of the include statement for the body is a variable, not a hardcoded string. That variable is set by the controller, so that a controller specific view is selected.

The body template for the helloWorld controller may look like this:

1
2
3
4
5
6
7
8
9
 <!--% Assignments. -->
 {%= 
     imperia.stylesheets.push('/assets/css/hello.css');
     imperia.title = 'Hello, world!' 
 -%}
<h1>{% imperia.title %}</h1>
<div>
Your content may go here ...
</div>

In line 3 the view specific stylesheet /assets/css/hello.css is added to the list of stylesheets. The page title is set in line 4, and printed out in line 6. That works well for the body.

Say the view layout/head.html looks like this:

<title>{% imperia.title %}</title>
{% #foreach stylesheet (@{imperia.stylesheets}) %}
 <link rel="stylesheet" href="{% stylesheet %}" />
{% #end %}

It is included before the body template. Therefore imperia.title is not yet set, and the view specific stylesheet is missing from the list in imperia.stylesheets.

You end up setting the title and the stylesheet in the controller. That is awkward and violates the MVC paradigm. The imperia view processor allows a very simple solution for this problem. The root template for HTML output looks like this:

1
2
3
4
5
6
7
8
9
10
11
<html>
  {% #assign(body) %}
  {% ##include(imperia.body.template) %}
  {% #end %}
  <head>
  {% ##include('layout/head.html') %}
  </head>
  <body>
  {% raw(body) %}
  </body>
</html>

The body is rendered into a variable body before the head is rendered. The body views can therefore set or manipulate variables that are used in the head.

The body is then output in line 9. Because it contains markup, it has to be passed through the function raw(), which prevents HTML escaping.

This is a common pattern in imperia view templates, ridiculously simple but very powerful.
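Stripped of the template syntax, the pattern boils down to the order of two function calls. The following JavaScript sketch only illustrates that ordering; renderBody, renderHead and renderPage are hypothetical names, the stylesheet /assets/css/base.css is invented, and escaping is left out for brevity.

```javascript
// Hypothetical sketch of body-first, head-last rendering.
function renderBody(payload) {
    // The body view registers what it needs in the head.
    payload.imperia.stylesheets.push('/assets/css/hello.css');
    payload.imperia.title = 'Hello, world!';
    return `<h1>${payload.imperia.title}</h1>`;
}

function renderHead(payload) {
    const links = payload.imperia.stylesheets
        .map((href) => `<link rel="stylesheet" href="${href}" />`)
        .join('');
    return `<title>${payload.imperia.title}</title>${links}`;
}

function renderPage(payload) {
    const body = renderBody(payload);  // rendered first, kept in a variable
    const head = renderHead(payload);  // now sees title and stylesheets
    return `<html><head>${head}</head><body>${body}</body></html>`;
}

const page = renderPage({
    imperia: { title: '', stylesheets: ['/assets/css/base.css'] },
});
console.log(page);
```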

Skinnability

One of the advantages of defining the look and feel of an application in templates is skinnability. Users can replace templates so that the application blends better into a corporate design. That works well most of the time, until the next upgrade of the software. The customized templates then have to be merged, mostly by hand, or the vendor cannot upgrade them.

The imperia view processor is always invoked with a list of include directories. The argument of every ##include directive is searched for in each of these directories, in order. The imperia default directory is always the last in that list.

The idea behind this is that imperia as a software vendor can mercilessly overwrite the view templates it ships with every update. Customers store their own templates in a project specific directory that takes precedence over the imperia default directory.

1
2
3
4
5
6
<body>
<img src="{% ##include('layout/logo.html') %}" width="200" height="100" />
...
{% ##include('scripts.html') %}
{% ##xinclude('custom-scripts.html') %}
</body>

The template layout/logo.html will be searched for in the project-specific directories first. In the same way you can switch server-side between templates for specific targets like mobile devices, or provide output templates for better accessibility, simply by prepending directories to the include path.

Another simple but efficient feature for flexible skinnability is the ##xinclude directive. It works just like ##include but fails gracefully without even a warning. This allows imperia to build hooks into the view templates. The file custom-scripts.html (see line 5 in the above example) is loaded only when present. If it does not exist, nothing happens.
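The search order and the difference between the two directives can be sketched like this. The sketch is an assumption, not the real code: the file system is replaced by a Map so that the example is self-contained, and the names findTemplate, include and xinclude as well as the directory names project and imperia are made up.

```javascript
// Hypothetical sketch: the first directory containing the template wins.
function findTemplate(name, includeDirs, files) {
    for (const dir of includeDirs) {
        const path = `${dir}/${name}`;
        if (files.has(path)) return files.get(path);
    }
    return undefined;
}

function include(name, includeDirs, files) {
    const template = findTemplate(name, includeDirs, files);
    if (template === undefined) throw new Error(`Template not found: ${name}`);
    return template;
}

// Same lookup, but a missing template silently yields nothing.
function xinclude(name, includeDirs, files) {
    const template = findTemplate(name, includeDirs, files);
    return template === undefined ? '' : template;
}

// Stand-in for the file system; the vendor directory comes last.
const files = new Map([
    ['project/layout/logo.html', 'custom logo'],
    ['imperia/layout/logo.html', 'default logo'],
    ['imperia/scripts.html', 'default scripts'],
]);
const includeDirs = ['project', 'imperia'];

console.log(include('layout/logo.html', includeDirs, files));
console.log(include('scripts.html', includeDirs, files));
console.log(JSON.stringify(xinclude('custom-scripts.html', includeDirs, files)));
```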

Rendering Contexts

When you design a template processor, one of the basic decisions is whether to escape output by default or not. For security reasons (XSS attacks) escaping by default is the preferred approach.

The imperia view processor is called with a so-called rendering context which defaults to HTML. That means that the result of all evaluations is normally HTML escaped before it is copied to the output. But function definitions may define a list of contexts in which they are considered safe.

The raw() function specifies that it is safe in all contexts, therefore its argument is always copied verbatim to the output. The escape_html function is safe in context “HTML” but unsafe in all other contexts, whereas the function escape_javascript is safe in both JavaScript and HTML context.

The only supported contexts are currently HTML and JavaScript but other contexts can easily be added.
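A rough JavaScript sketch of the idea follows. Everything in it is an assumption made for illustration: the table SAFE_CONTEXTS, the function output and the concrete escaping rules are invented and do not reflect the real library interface.

```javascript
// Hypothetical sketch of context-aware escaping.
const HTML_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHtml(text) {
    return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Uses \uXXXX escapes so that the result contains no HTML special characters.
function escapeJavascript(text) {
    return String(text).replace(/[\\'"<>&\n\r]/g,
        (ch) => '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

const ESCAPERS = { HTML: escapeHtml, JavaScript: escapeJavascript };

// Contexts in which the result of a function is copied verbatim.
const SAFE_CONTEXTS = {
    raw: ['HTML', 'JavaScript'],
    escape_html: ['HTML'],
    escape_javascript: ['HTML', 'JavaScript'],
};

// "producedBy" names the function that returned the value, if any.
function output(value, context, producedBy) {
    const safe = producedBy !== undefined
        && SAFE_CONTEXTS[producedBy].includes(context);
    return safe ? String(value) : ESCAPERS[context](value);
}

console.log(output('<b>', 'HTML'));                              // escaped by default
console.log(output('<b>', 'HTML', 'raw'));                       // copied verbatim
console.log(output(escapeHtml('<b>'), 'HTML', 'escape_html'));   // not escaped twice
```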

Pluggability

The imperia view processor is rather a framework for a templating engine that can be extended by two plug-in APIs.

Directive plug-ins are those that are invoked with one or two leading hash signs like ##include, #assign or #macro (for function definitions inside view templates). It may be surprising that even #for, #foreach or #if are defined as plug-ins. That means that the semantics can not only be extended - you could for example define a #switch directive - but also changed by overwriting a plug-in.

The other plug-in API is the library interface. Functions like raw(), escape_html() and so on are grouped in libraries called “import realms”. The list of realms/libraries to import is specified when the view processor is invoked. It is very easy to add self-written plug-ins to extend the functionality of the view processor with custom functions.

While developing you will often also import the Util library that defines the functions dump(arg) for dumping data structures into a human readable form, warn(arg) which also dumps data structures but writes them to standard error (normally the web server error log) and log(msg) which logs a message along with the exact location in the source code to standard error.

Payload Variable Scoping

Every view include gets its own copy of the payload data structure. But this is a shallow copy only! This leads to a somewhat obscure scoping of variables, at least at first glance. After some time it becomes second nature and is quite comfortable.

1
2
3
4
5
6
7
8
9
10
{%= a = 'foo' =%}
{%= data.a = 'bar' =%}

Value of a is {% a %}.
Value of data.a is {% data.a %}.

{%- ##include('child.html') -%}

Value of a is {% a %}.
Value of data.a is {% data.a %}.

In line 1 the (top-level) variable a is defined, in line 2 the second-level variable data.a. Second-level means that data.a is a property of the top-level variable data.

The included view child.html looks like this:

{%= a = 'overwritten' =%}
{%= data.a = 'overwritten' =%}

Value of a inside child.html is {% a %}.
Value of data.a inside child.html is {% data.a %}.

It overwrites exactly what was defined in the view that included it. The rendered output is a little surprising:

Value of a is foo.
Value of data.a is bar.

Value of a inside child.html is overwritten.
Value of data.a inside child.html is overwritten.

Value of a is foo.
Value of data.a is overwritten.

That means: An include can modify second-level variables of its parent but not top-level variables. This is due to the fact that the payload of the included child view is just a shallow copy of the parent’s payload.

A shallow copy is perhaps easier to understand when translated into JavaScript.

1
2
3
4
5
6
7
8
9
10
11
12
var payload = {
    a: 'foo',
    data: {
        a: 'bar'
    }
};
var shallowCopy = {};
for (var key in payload) {
    shallowCopy[key] = payload[key];
}

console.log(shallowCopy.data === payload.data);

Just like Perl, JavaScript copies scalars (string literals, numbers, …) by value and objects and arrays by reference. Therefore shallowCopy.data and payload.data point to the same object, and an assignment to shallowCopy.data.a is visible as payload.data.a, whereas an assignment to shallowCopy.a leaves payload.a untouched. Execute the code snippet with nodejs or copy it to the JavaScript console of your browser. You will see that it outputs true to the console (see line 12 above).

You may think that the somewhat obscure variable scoping of the imperia view processor is a prime example of the promote-a-bug-to-feature paradigm of software development. Admittedly, creating a shallow copy is a lot cheaper than creating a deep copy, but the scoping behavior is done that way fully intentionally.

One of the goals of the view processor was to trade off consistency for intuitiveness, so that the language was easy to use and understand for designers and non-programmers. It turned out that most early adopters somehow expected a behavior that was very similar to the one implemented now. They wrote things like this:

<div>Logged in as {% imperia.user.name %}</div>
<div>
Mails:
<ul>
{% #for (i = 0; i < imperia.mails.length; ++i) %}
  <li>{% imperia.mails[i].subject %}</li>
{% #end %}
</ul>
</div>

They expected that they could happily use and overwrite “simple” variables like i or tmp without bothering to check whether these variables were already defined in the parent scope. They also expected that the variables that they defined were visible in included views and that they would have the same value there. And they rarely uttered the desire to write into those variables that used cryptic things like dots (imperia.user) or brackets (mails[i]).

I ultimately decided that this is more intuitive for non-programmers than the concept of local and global variables.

Do The Right Thing™

Variable scoping is one example where consistency was traded off for usability. You may have spotted more examples of that in the code examples above.

{% a = 'foo' %}

The assignment statement does not produce any output. Conventionally, assignments evaluate to the value of their left-hand operand after the assignment is done, so the above statement originally produced the output “foo”. In the very beginning people therefore always wrote assignments like this:

{% a = 'foo'; '' %}

Now the expression evaluated to the empty string but it was awkward to write. I therefore decided that assignments in the X mini language always evaluate to the empty string. If you want to have the traditional behavior, you have two options:

{% a = 'foo'; a %}
or
{% a := 'foo' %}

Another example of intuitiveness over consistency:

{% data.a = 'foo' %}

In other languages, you would first have to create a variable data. Then you have to initialize it to an empty hash. And then you can set values inside it. In the X language this is done all at once. If the variable does not exist it gets created. If it is used like a hash it is created as a hash. If it is used as an array it is created as an array.

This autovivification (borrowed from Perl) may be considered unclean but most of the time it is exactly what is meant and it can hardly lead to errors.
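JavaScript has no autovivification, so a sketch of the behavior has to spell it out. The helper assign below is hypothetical and takes the path as an array of keys. Deciding between hash and array by the type of the next key is an assumption made for this illustration.

```javascript
// Hypothetical sketch of an autovivifying assignment.
function assign(payload, path, value) {
    let current = payload;
    for (let i = 0; i < path.length - 1; i++) {
        const key = path[i];
        if (current[key] === undefined) {
            // Used like an array (numeric next key): create an array.
            // Used like a hash: create a hash.
            current[key] = typeof path[i + 1] === 'number' ? [] : {};
        }
        current = current[key];
    }
    current[path[path.length - 1]] = value;
    return '';  // assignments render as the empty string
}

const payload = {};
assign(payload, ['data', 'a'], 'foo');             // data.a = 'foo'
assign(payload, ['list', 0, 'name'], 'first');     // list[0].name = 'first'
console.log(JSON.stringify(payload));
```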

There are more examples in the imperia view processor where complaints by early adopters about awkward coding led to modifications of the language.

Error Output

Access denied connecting to mysql server at /var/www/shop/db.php:25!
The supplied user name was "root", the password was "Ken Sent Me".

Everybody surfing the web knows such funny messages that are not so funny for the maintainers of the site. They are meant for developers but not for the visitors of the site.

The imperia view processor has three possible error channels. The default one is STDERR, normally the web server error log.

Two more error channels can be configured for development systems. HTML writes error messages as HTML comments into the rendered output and JAVASCRIPT writes little pieces of inline JavaScript code that calls alert(error) so that error messages pop up as alert dialog boxes.

It should also be mentioned that error messages contain the exact location of the error, that is the filename plus the line number plus the column. No more searching for forgotten semi-colons.
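How the three channels might differ can be sketched as follows. This is purely illustrative: reportError, the shape of the error object and the exact message format are assumptions, only the channel names are taken from the description above.

```javascript
// Hypothetical sketch: returns what is written into the rendered output.
function reportError(channel, error) {
    const text = `${error.message} at ${error.file}:${error.line}:${error.column}`;
    switch (channel) {
    case 'HTML':
        // "--" is not allowed inside an HTML comment.
        return `<!-- ${text.replace(/--/g, '- -')} -->`;
    case 'JAVASCRIPT':
        // JSON.stringify yields a valid JavaScript string literal;
        // "<" is escaped so that the message cannot close the script element.
        return `<script>alert(${JSON.stringify(text).replace(/</g, '\\u003c')});</script>`;
    default:
        // STDERR: nothing is added to the page.
        console.error(text);
        return '';
    }
}

const error = { message: 'Unknown function foo', file: 'hello.html', line: 3, column: 7 };
console.log(reportError('HTML', error));
console.log(reportError('JAVASCRIPT', error));
```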

Internationalization

The I18N library that gets imported by default exposes the complete Gettext API to the view processor so that you can use all of its advanced translation functions including plural forms and translation contexts. The syntax is very simple:

<h1>{% __('Fatal Error!') %}</h1>

<div>
{% __x('Error deleting "{file}": {error}!',
       file => filename, error => syserror) %}
</div>

Most of the time this is all you have to know. You wrap simple messages into __() and messages with interpolated named variables into __x().

By the way, the => is the so-called fat comma known from Perl. It has exactly the same semantics as a regular comma but treats its left-hand side argument as a quoted string.
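What __x() does with the named placeholders can be sketched in JavaScript. The sketch is not the Gettext API: translateX, interpolate and the tiny German catalog are invented for this illustration, and a real implementation would look up translations in a compiled message catalog.

```javascript
// Hypothetical sketch: replaces {name} placeholders with named values.
function interpolate(message, values) {
    return message.replace(/\{(\w+)\}/g, (placeholder, name) =>
        Object.prototype.hasOwnProperty.call(values, name)
            ? String(values[name])
            : placeholder);  // unknown placeholders are left untouched
}

// The message is translated first, then interpolated, so translators
// see the placeholder names and can reorder them freely.
function translateX(catalog, msgid, values) {
    const translated = Object.prototype.hasOwnProperty.call(catalog, msgid)
        ? catalog[msgid]
        : msgid;
    return interpolate(translated, values);
}

// Invented catalog with a single German translation.
const catalog = {
    'Error deleting "{file}": {error}!': 'Fehler beim Löschen von „{file}“: {error}!',
};
const values = { file: 'a.txt', error: 'Permission denied' };

console.log(translateX({}, 'Error deleting "{file}": {error}!', values));
console.log(translateX(catalog, 'Error deleting "{file}": {error}!', values));
```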

Compile On Read

The view compiler is called with either a string as an argument or a string reference. String references are treated as a reference to view code, normal strings are interpreted as the filename of a view template, which is then searched relative to the list of include directories.

When you pass the name of a view template file, actually nothing happens. The Perl code looks pretty much like this:

$code = \"{\% ##include('$filename') \%}";

It just generates an include statement and returns immediately. Nothing gets parsed, nothing gets compiled, it is not even checked whether the template file exists. All directives are lazily evaluated, that is only when they are used while rendering the template.

{% #if ('x' == 'x') %}
  {% ##include('root.html') %}
{% #else %}
  {% ##include('does/not/exist.html') %}
{% #endif %}

In the above example, the include directive for does/not/exist.html is never executed because the if-condition is always true.

By the way, the parser/compiler aggressively caches all templates in memory up to a configurable maximum size. For development you can change the caching strategy to check the last modification date of template files before a compiled syntax tree is used from the cache.
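The combination of lazy compilation and caching can be sketched as follows. The sketch rests on assumptions: createTemplateCache and its options are invented names, the file system is faked with a plain object, and the configurable maximum cache size is left out.

```javascript
// Hypothetical sketch: templates are compiled on first use and cached.
function createTemplateCache({ compile, readFile, getMtime, checkMtime = false }) {
    const cache = new Map();
    return function load(path) {
        const cached = cache.get(path);
        // In development mode the cached entry is only used when the
        // file has not been modified since it was compiled.
        if (cached && (!checkMtime || cached.mtime === getMtime(path))) {
            return cached.compiled;
        }
        const entry = { mtime: getMtime(path), compiled: compile(readFile(path)) };
        cache.set(path, entry);
        return entry.compiled;
    };
}

// Fake file system and a trivial "compiler" that counts its invocations.
let compileCount = 0;
const fakeFiles = { 'root.html': { mtime: 1, source: 'Hello' } };
const load = createTemplateCache({
    compile: (source) => { compileCount += 1; return () => source; },
    readFile: (path) => fakeFiles[path].source,
    getMtime: (path) => fakeFiles[path].mtime,
    checkMtime: true,
});

load('root.html');
load('root.html');                                   // served from the cache
const countAfterCacheHit = compileCount;
fakeFiles['root.html'] = { mtime: 2, source: 'Hello again' };
const rendered = load('root.html')();                // file changed: recompiled
console.log(countAfterCacheHit, compileCount, rendered);
```

Nothing is read or compiled until load() is called for a path, which mirrors the fact that a template that is never included is never touched.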

 

That should conclude the feature tour of the imperia view processor. None of these features is exactly rocket science, and a lot of them are available in other templating engines today. When I started developing the view processor back in 2008, many of them were not.

As a whole, this feature set makes the view processor a very powerful tool for web development allowing intelligent views with complex features second to no other templating engine I know.

