Multilingual Web Sites with Jekyll

After it had been waiting on my TODO list for years, I recently decided to tackle the project “web site”. Since I no longer work for Imperia, I was looking for a light-weight alternative. I did not want to install PHP on my server, which ruled out a lot of options. A colleague finally recommended Jekyll. Its simple, semi-static approach reminded me of Imperia, and I decided to give it a shot.

Multilingual support is one of the features that separates serious contenders from the rest in the CMS market, and coming from Imperia I was falling from a great height: there, multilinguality is deeply integrated into the system and more or less works out of the box.

Multilang Options For Jekyll

A number of plug-ins for Jekyll claim to make the system multilingual. After a little research, however, I found the excellent post Making Jekyll multilingual by Sylvain Durand, who describes an approach that works without plug-ins.

Here I only present the parts of my solution where I did not follow Sylvain’s recommendations. For complete coverage, read his post first.

Basic Considerations

The structure of a multilingual site depends a lot on the choice and configuration of the web server. Best practice is page-based content negotiation where the browser and the server negotiate the language version of the landing page. This technique is the default in Imperia.

However, I wanted to use Nginx instead of Apache as a web server, and content negotiation is still only available as a source code patch for Nginx. I did not want to go that way and decided to solve the problem with a little handler written in Perl.

The Perl handler is only used for the start page /. From there it triggers a redirect to the language-specific start page /en/, /de/ and so on. That is described in the post Simple Content Negotiation For Nginx.
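
The redirect decision can be sketched in a few lines. This is not the Perl handler from the linked post but a rough Ruby illustration of the same idea, assuming the handler reads the browser's Accept-Language header and that English is the fallback. The names preferred_language, start_page_for, SUPPORTED and DEFAULT are made up for this example.

```ruby
# Hypothetical sketch: choose the language-specific start page from an
# Accept-Language header. Not the actual Perl handler.
SUPPORTED = %w[en de].freeze # the languages the site offers
DEFAULT   = "en"             # assumed fallback language

def preferred_language(header, supported = SUPPORTED, default = DEFAULT)
  return default if header.nil? || header.strip.empty?

  # Parse "de-DE,de;q=0.9,en;q=0.8" into [tag, quality, position] triples.
  ranges = header.split(",").each_with_index.map do |item, index|
    tag, *params = item.strip.split(";").map(&:strip)
    q_param = params.find { |p| p.start_with?("q=") }
    quality = q_param ? q_param.delete_prefix("q=").to_f : 1.0
    [tag.to_s.downcase, quality, index]
  end

  # Highest quality first; equal qualities keep their order in the header.
  ranges.sort_by { |_, quality, index| [-quality, index] }.each do |tag, quality, _|
    next if quality <= 0 # q=0 means "not acceptable"

    primary = tag.split("-").first # "de-de" -> "de"
    return primary if supported.include?(primary)
  end

  default
end

# The path the handler would redirect to from "/".
def start_page_for(header)
  "/#{preferred_language(header)}/"
end
```

Matching only on the primary language subtag keeps the sketch short. A real handler may want to treat regional variants separately.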

Language Switch

Sylvain recommends setting a variable name in the front matter of each page, with a value that is shared between the different language versions of a post. That makes it possible to find the other versions of a particular document. His menu for the language switch looks like this:

{% assign posts=site.posts | where:"name", page.name | sort: 'path' %}
<ul>
{% for post in posts %}
    <li class="lang">
        <a href="{{ post.url }}" class="{{ post.lang }}">{{ post.lang }}</a>
    </li>
{% endfor %}
</ul>

Line 1 selects all posts that share the same value for the property name. The loop in lines 3 to 7 creates a link for each of these versions.

If a translation for a particular language is missing, no link for that language is displayed. I prefer to always display the same set of languages and, when a translation is missing, to link to a category or, failing that, to the start page:

{% for lang in site.languages %}
  {% if page.type == 'posts' %}
    {% assign other=site.posts | where: "name", page.name 
                               | where: "lang", lang | first %}
  {% else %}
    {% assign other=site.pages | where: "pageid", page.pageid 
                               | where: "lang", lang | first %}
  {% endif %}
  <li class="lang">
    <a href="{% if other.url %}{{ other.url }}{% else %}/{{ lang }}/{% endif %}" 
       class="{{ lang }}">{{ lang | upcase }}</a></li>
{% endfor %}

In line 1 I iterate over the languages for the site. They are defined in the variable site.languages in _config.yml:

languages: [en, de]

The if-clause in line 2 is also new. The property name is supposed to link all language versions of a document, but for me that only worked for posts, not for pages. For pages I therefore define a similar variable pageid that serves the same purpose. You may prefer to use pageid everywhere instead of name and do without the if-clause.

Edit: Instead of name Sylvain now uses a new variable ref which has the same effect as using pageid everywhere.

At the end of the day the variable other contains the corresponding resource in that particular language if it exists. Depending on this, the link points to either the version for that language or as a fallback to the start page.
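
To make the fallback rule concrete, here is a small Ruby model of what the template computes. It is only an illustration, assuming a single shared identifier for all resources (in the spirit of pageid or ref). Resource and language_switch are hypothetical names, not part of Jekyll.

```ruby
# Hypothetical model of the language switch: one entry per site language,
# pointing to the translation if it exists and to the start page otherwise.
Resource = Struct.new(:id, :lang, :url)

def language_switch(current, resources, languages)
  languages.map do |lang|
    # Same identifier, requested language: the translated version, or nil.
    other = resources.find { |r| r.id == current.id && r.lang == lang }
    { lang: lang, label: lang.upcase, href: other ? other.url : "/#{lang}/" }
  end
end

# Example data: "about" exists in both languages, "imprint" only in German.
resources = [
  Resource.new("about",   "en", "/en/about/"),
  Resource.new("about",   "de", "/de/ueber/"),
  Resource.new("imprint", "de", "/de/impressum/"),
]

switch = language_switch(resources.last, resources, %w[en de])
switch.each { |entry| puts "#{entry[:label]} -> #{entry[:href]}" }
```

For the German-only imprint, the English entry falls back to /en/ while the German entry points to the page itself.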

Linking Between Language Variants

The code for sitemap.xml has to be modified accordingly:

---
layout:
permalink: /sitemap.xml
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  {% for page in site.pages %}
  <url>
    <loc>{{ site.url }}{{ page.url }}</loc>
    {% assign versions=site.pages | where:"pageid", page.pageid %}
    {% for version in versions %}
      <xhtml:link rel="alternate" hreflang="{{ version.lang }}" 
                  href="{{ site.url }}{{ version.url }}" />
    {% endfor %}
    {% if page.date %}
    <lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
    {% endif %}
    <changefreq>monthly</changefreq>
  </url>
  {% endfor %}
  {% for post in site.posts %}
  <url>
    <loc>{{ site.url }}{{ post.url }}</loc>
    {% assign versions=site.posts | where:"name", post.name %}
    {% for version in versions %}
      <xhtml:link rel="alternate" hreflang="{{ version.lang }}" 
                  href="{{ site.url }}{{ version.url }}" />
    {% endfor %}
    <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
    <changefreq>weekly</changefreq>
  </url>
  {% endfor %}
</urlset>

Again we have to distinguish between documents of type page and post. Posts are linked via name, pages via pageid. The linking in the HTML <head> works the same way:

{% for lang in site.languages %}
  {% if page.type == 'posts' %}
    {% assign other=site.posts | where: "name", page.name
                               | where: "lang", lang | first %}
  {% else %}
    {% assign other=site.pages | where: "pageid", page.pageid
                               | where: "lang", lang | first %}
  {% endif %}
  {% if other and page.lang != other.lang %}
  <link rel="alternate" hreflang="{{other.lang}}" href="{{other.url}}" />
  {% endif %}
{% endfor %}

This time there is no fallback to the start page because we only want to link to the corresponding resources in the other language.
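
The difference from the language switch is easy to see in a small Ruby model: languages without a translation are skipped instead of being pointed at the start page, and the current language is left out. This is a sketch with plain hashes as stand-ins for Jekyll's page objects. alternate_links is a hypothetical helper name.

```ruby
# Hypothetical model of the <head> links: only real translations in other
# languages are emitted; there is no fallback.
def alternate_links(current, resources, languages)
  languages.filter_map do |lang|
    next if lang == current[:lang] # no alternate link to the page itself

    other = resources.find { |r| r[:id] == current[:id] && r[:lang] == lang }
    next unless other # missing translation: emit nothing

    %(<link rel="alternate" hreflang="#{other[:lang]}" href="#{other[:url]}" />)
  end
end

# Example data: "about" is translated, "imprint" exists only in German.
pages = [
  { id: "about",   lang: "en", url: "/en/about/" },
  { id: "about",   lang: "de", url: "/de/ueber/" },
  { id: "imprint", lang: "de", url: "/de/impressum/" },
]

puts alternate_links(pages[0], pages, %w[en de])
```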

Translations Of Template Texts

Not only the actual content has to be translated but also the boilerplate text from the templates. I currently do that with the approach recommended by Sylvain Durand: the translations are defined in _config.yml:

# Boilerplate translations.
t:
  en:
    home: Home
    toggle_navigation: "Toggle navigation"
    categories: Categories
    featured_posts: "Featured Posts"
    ads: Ads
  de:
    home: Start
    toggle_navigation: "Navigation ein-/ausklappen"
    categories: Rubriken
    featured_posts: "Mehr zu lesen"
    ads: Werbung

In the templates you access these texts like this:

<span class="sr-only">{{site.t[page.lang].toggle_navigation}}</span>

Now that is plain ugly! Using placeholders for translatable strings is a recipe for trouble. I would prefer marking strings in the primary language:

<span class="sr-only">{% gettext "Toggle navigation" %}</span>

At the moment I only have a handful of such strings in the configuration file and therefore do not mind too much. With more strings coming I have to think of something better.
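
What I have in mind is a lookup keyed by the string in the primary language, falling back to that string when no translation exists. The following Ruby sketch only illustrates the lookup under that assumption. It is not an existing Jekyll plug-in; CATALOG and gettext are hypothetical names, and wiring this up as a Liquid tag is not shown.

```ruby
# Hypothetical catalog: source strings in the primary language (English)
# map to their translations, one table per target language.
CATALOG = {
  "de" => {
    "Toggle navigation" => "Navigation ein-/ausklappen",
    "Categories"        => "Rubriken",
  },
}.freeze

# Return the translation of msgid for lang, or msgid itself if the
# language or the string is missing from the catalog.
def gettext(msgid, lang, catalog = CATALOG)
  catalog.fetch(lang, {}).fetch(msgid, msgid)
end

puts gettext("Toggle navigation", "de")
puts gettext("Toggle navigation", "en")
```

The advantage over the placeholder approach is that an untranslated string still renders as readable text in the primary language instead of as an empty value.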

