Blogging in vim

We blog using Jekyll at MapBox, which means that all of our blog posts are written in code. Sometimes we make mistakes though, and missing or invalid metadata can cause layout quirks or unexpected errors. To catch these problems earlier, we decided to treat our blog like we do our code — automated unit tests now run after every commit.

Travis tests failed

Jekyll is a static site generator. We keep all content in simple text files, and Jekyll reads each file and transforms it into HTML. We use Jekyll for all static content on our site: the blog, developer docs, help pages and much more.

Each bit of content, like a blog post or a help document, is a file composed of two parts: metadata stored in YAML, and content written in Markdown.

Here’s what the YAML part of this blog post looks like:

---
layout: blog
category: blog
title: "We unit test our blog at MapBox"
permalink: /unit-test-blog/
image: https://farm4.staticflickr.com/3710/10732224274_a4c27f21fc.jpg
author: Mike Morris
---

If we forget the author tag, the blog layout breaks. If we write invalid YAML, the blog won’t rebuild and the post will stay in limbo.

Missing author

Content testing catches these failures ahead of time. Every blog post is submitted as a pull request on GitHub, and with pull request testing hooked up to Travis CI, every change runs through a test suite that must pass before the post gets the green light.

Good to merge - the Travis build passed

If there’s a problem, we know immediately.

Failed - the Travis CI build failed

Travis CI supports plenty of languages for test suites, and we ended up writing ours in Node.js. Since Jekyll is a Ruby project, the build installs Jekyll as a Ruby gem for the compilation and uses Node for the test runner. We use mocha and assert for our content tests.

Here’s our .travis.yml file:

language: node_js

before_install:
  - gem install liquid -v 2.5.1 --no-rdoc --no-ri
  - gem install jekyll -v 1.0.2 --no-rdoc --no-ri
  - gem install rdiscount -v 1.6.8 --no-rdoc --no-ri

script:
  - ./node_modules/.bin/mocha test/test.metadata.js
  - jekyll build

One important consideration is that Mocha requires all tests to be created (but not necessarily run) synchronously, so we use the synchronous variants of some Node functions to build tests dynamically. While writing some of the more complex tests, we found that it was more efficient to load all posts using fs.readFileSync before any tests were run, rather than loading each post asynchronously during its corresponding test. This approach allows for testing one-to-many relationships between posts (such as unique permalinks) while minimizing the time spent loading files from disk.

We first construct a posts object and create a test for each post.

var paths = {
        blog: '_posts/blog/',
        team: '_posts/team/'
    },
    dirs = Object.keys(paths);

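// Load every post up front. readPost registers a YAML validity test for
// each file, so the read happens inside a describe block.
var posts = dirs.reduce(function(prev, dir) {
    var path = paths[dir];
    describe(path, function() {
        prev[dir] = readDir(path);
    });
    return prev;
}, {});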

dirs.forEach(function(dir) {
    var path = paths[dir];
    describe(path, function() {
        posts[dir].forEach(function(post) {
            it(post.name, tests[dir](post));
        });
    });
});

The metadata parsing is wrapped in a try/catch statement because js-yaml throws an error when parsing invalid YAML.

function readPost(dir, filename) {
    var buffer = fs.readFileSync(dir + filename),
        file = buffer.toString('utf8');

    try {
        var parts = file.split('---'),
            frontmatter = parts[1];

        it(filename, function() {
            assert.doesNotThrow(function() { jsyaml.load(frontmatter); });
        });

        return {
            name: filename,
            file: file,
            metadata: jsyaml.load(frontmatter),
            content: parts[2]
        };
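    } catch(err) {
        // Invalid YAML: the test registered above reports the failure,
        // so nothing is returned for this post.
    }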
}

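function readDir(dir) {
    return fs.readdirSync(dir).map(function(filename) {
        return readPost(dir, filename);
    }).filter(function(post) {
        // readPost returns undefined for posts with invalid YAML;
        // skip them so later tests don't dereference undefined.
        return post !== undefined;
    });
}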
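Splitting on every `---` works for our posts, but it would also split on a horizontal rule or table divider inside the Markdown body. Here is a minimal sketch of a stricter split, assuming front matter always starts on the first line of the file. The name splitFrontMatter is illustrative, and the frontmatter string would still be handed to jsyaml.load as before.

```javascript
// Illustrative helper: separates YAML front matter from the Markdown body.
// The opening delimiter must be the first line of the file; the closing
// delimiter is the next line that contains only '---'.
function splitFrontMatter(file) {
    var pattern = /^---[ \t]*\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)([\s\S]*)$/;
    var match = pattern.exec(file);
    if (!match) return null; // no front matter block found
    return { frontmatter: match[1], content: match[2] };
}

var parts = splitFrontMatter('---\ntitle: "Hello"\n---\nIntro\n\n---\n\nMore text');
// parts.frontmatter is 'title: "Hello"'
// parts.content is 'Intro\n\n---\n\nMore text'
```

Returning null for files without front matter lets the caller turn that case into its own failing test instead of an exception at load time.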

tests['blog'] asserts each necessary property of a blog post: all image links and iframes are HTTPS, exactly the expected metadata keys are present, and the metadata is valid. The date key, if it exists, must be a valid JavaScript Date object, and the permalink must begin with /blog/. Each post also needs to contain a <!--more--> tag for generating post excerpts with the excerpt.rb Jekyll plugin.

var tests = {
    'blog': function(post) {
        return function() {
            var file = post.file,
                metadata = post.metadata,
                content = post.content,
                keys = [
                'published', 'date',
                'layout', 'category',
                'title', 'image',
                'permalink', 'tags'];

            // HTTPS images & iframes in blog
            var urls = file.match(/https?:\/\/[\w,%-\/\.]+\/?/g);
            if (urls) urls.forEach(function(url) {
                assert.ok(!(/http:[^'\"]+\.(jpg|png|gif)/).test(url), url + ' should be https');
            });

            var iframes = file.match(/<iframe [^>]*src=[\"'][^\"']+/g);
            if (iframes) iframes.forEach(function(iframe) {
                assert.ok(!(/<iframe [^>]*src=[\"']http:/).test(iframe), iframe + ' should be https');
                assert.ok(!(/<iframe [^>]*src=[\"']https:\/\/[abcd]\.tiles\.mapbox\.com.*\.html[^\?]/).test(iframe), iframe + ' is insecure embedded map (add ?secure=1)');
            });

            assert.equal(typeof metadata, 'object');
            assert.ok('layout' in metadata, missing('layout'));
            assert.ok('category' in metadata, missing('category'));
            assert.ok('title' in metadata, missing('title'));
            assert.ok('image' in metadata, missing('image'));
            assert.ok('permalink' in metadata, missing('permalink'));
            assert.ok('tags' in metadata, missing('tags'));

            if (metadata.date) {
                assert.ok(metadata.date instanceof Date, invalid('date', metadata.date));
            }

            assert.equal(metadata.category, 'blog', invalid('category', metadata.category));
            assert.ok(isImage(metadata.image), invalid('image', metadata.image));
            assert.ok(/^\/blog\//.test(metadata.permalink), invalid('permalink', metadata.permalink));

            assert.ok(content.indexOf('<!--more-->') !== -1, missing('<!--more-->'));

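            // Any key outside the expected list is reported as extraneous.
            var extraKeys = Object.keys(metadata).filter(function(key) {
                return keys.indexOf(key) === -1;
            });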
            assert.deepEqual(extraKeys, [], extraneous(extraKeys));
        };
    }
};

We also check the integration between different posts and confirm that the author of each blog post matches the title of a post in _posts/team/.

// Build a list of team member names
var team = posts.team.map(function(post) {
    return post.metadata.title;
});

// Later, in a test assertion, make sure that
// the author of a blog post is a team member.
assert.ok(team.indexOf(author) !== -1, 'no team post found for author ' + author);

We’ve saved ourselves a lot of frustration by automating this little part of our publishing workflow. The integration between Travis CI and GitHub lets everyone on our team, not just developers, benefit from tests and push new posts with confidence.