Saying stuff about stuff.

Introducing Decant

Decant is a dependency-free, frontmatter-aware, framework-agnostic wrapper around a directory of static content (probably Markdown). It pairs perfectly with Parklife plus your favourite Ruby web framework to help you make a content-driven static website.

Usage

Start by defining a Decant model-like content class that points to a directory of files and their extension. You can declare convenience readers for common frontmatter data, and add your own methods – it’s a standard Ruby class.

Page = Decant.define(dir: 'content', ext: 'md') do
  # Declare frontmatter convenience readers.
  frontmatter :title

  # Add custom methods - it's a standard Ruby class.
  def shouty
    "#{title.upcase}!!!"
  end
end

Then, given the file content/about.md with the following contents:

---
title: About
stuff: nonsense
---
# About

More words.

You can find the item by its extension-less path within the directory, read its content and frontmatter, and call frontmatter convenience methods or your own manually-defined methods:

about = Page.find('about')
about.content     # => "# About\n\nMore words.\n"
about.frontmatter # => {:title=>"About", :stuff=>"nonsense"}
about.title       # => "About"
about.shouty      # => "ABOUT!!!"

Nesting

Files can be nested. Given the following layout:

└ content/
  ├ about.md
  ├ features/
  │ ├ frontmatter.md
  │ ├ globs.md
  │ ├ nesting/
  │ │ ├ lots.md
  │ │ └ more.md
  │ └ slugs.md
  └ usage.md

A nested file can be targeted by its full path:

page = Page.find('features/nesting/more')
page.title # => "More Nesting"

Which makes it easy to support catch-all-style routes, like this one in Rails:

# Route.
get '/*slug', to: 'pages#show'

# Controller action.
def show
  @page = Page.find(params[:slug])
end

Or Sinatra:

get '/*slug' do
  @page = Page.find(params[:slug])
  erb :page
end

Collections

Each class provides a few methods for accessing its collection of files. You’ve already seen find, which returns a single matching item. There’s also all, which returns all of the collection’s matching items, and glob, which takes a shell-like pattern that can include some special characters. It uses Pathname#glob so some knowledge of Ruby’s glob behaviour can be useful – though I imagine * and **/* will be most commonly used. The other thing to note is that the file extension must not be included in the pattern because it’s added for you from the configured ext.

Here are a couple of examples using the same nested file layout from earlier. You can target a subdirectory’s immediate files:

Page.glob('features/*')
# Returns Page objects for:
# - features/frontmatter.md
# - features/globs.md
# - features/slugs.md

Or include everything from a directory down:

Page.glob('features/**/*')
# Returns Page objects for:
# - features/frontmatter.md
# - features/globs.md
# - features/nesting/lots.md
# - features/nesting/more.md
# - features/slugs.md
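
Since glob is described above as sitting on top of Pathname#glob with the extension appended, both patterns can be tried against a throwaway directory using only the standard library. This is a rough sketch rather than Decant’s implementation: glob_relative and glob_demo are hypothetical helpers, and the temporary files simply mirror the layout from earlier.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Hypothetical helper (not part of Decant): glob inside `dir`, appending the
# extension to the pattern, and return sorted paths relative to `dir`.
def glob_relative(dir, ext, pattern)
  root = Pathname.new(dir)
  root.glob("#{pattern}.#{ext}")
      .map { |path| path.relative_path_from(root).to_s }
      .sort
end

# Build the example layout in a temporary directory and run both patterns.
def glob_demo
  Dir.mktmpdir do |dir|
    %w[
      about.md
      features/frontmatter.md
      features/globs.md
      features/nesting/lots.md
      features/nesting/more.md
      features/slugs.md
      usage.md
    ].each do |relative|
      file = File.join(dir, relative)
      FileUtils.mkdir_p(File.dirname(file))
      File.write(file, "# #{relative}\n")
    end

    {
      immediate: glob_relative(dir, 'md', 'features/*'),
      recursive: glob_relative(dir, 'md', 'features/**/*')
    }
  end
end

pp glob_demo
```

The results are sorted because glob order isn’t something I’d want to rely on.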

The Content object

As well as the file’s content, its frontmatter, frontmatter-added convenience methods, and your own manually-defined methods, a Decant::Content object also knows a little about itself:

  • #path returns a Pathname – which may or may not be an absolute path depending on the configured dir.
  • #relative_path returns the item’s relative path within its collection.
  • #slug returns the item’s “slug” (its extension-less relative path) within the collection.

page = Page.find('features/slugs')
page.path             # => #<Pathname:content/features/slugs.md>
page.path.expand_path # => #<Pathname:/Users/dave/my-website/content/features/slugs.md>
page.relative_path    # => "features/slugs.md"
page.slug             # => "features/slugs"
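
The relationship between these values can be illustrated with Pathname alone. A small sketch, assuming the slug is just the relative path minus its extension as described above; path_details is a hypothetical helper, not a Decant method.

```ruby
require 'pathname'

# Hypothetical helper: derive the relative path and slug of `file` within `dir`.
def path_details(dir, file)
  path = Pathname.new(file)
  relative = path.relative_path_from(Pathname.new(dir))

  {
    path: path,
    relative_path: relative.to_s,
    slug: relative.sub_ext('').to_s # strip the extension to get the slug
  }
end

details = path_details('content', 'content/features/slugs.md')
details[:relative_path] # => "features/slugs.md"
details[:slug]          # => "features/slugs"
```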

Frontmatter

Frontmatter must start with a line consisting of three dashes ---, then the YAML, then another line of three dashes (just like Jekyll). The YAML should be key/value pairs and the returned Hash will have Symbol keys.

---
title: Frontmatter
tags:
 - lists
 - are
 - fine
as:
  is:
    nesting: etc (it's YAML)
---
Content begins here

The above YAML frontmatter will be turned into the following Ruby data:

{
  title: "Frontmatter",
  tags: ["lists", "are", "fine"],
  as: {
    is: {
      nesting: "etc (it's YAML)"
    }
  }
}
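
For the curious, the rules above can be approximated in a few lines with Ruby’s bundled YAML library. This is a minimal sketch under my own assumptions, not Decant’s actual parsing code: FRONTMATTER_BLOCK and split_frontmatter are hypothetical names.

```ruby
require 'yaml'

# A leading block: a line of three dashes, the YAML, another line of three dashes.
FRONTMATTER_BLOCK = /\A---[ \t]*\r?\n(.*?)^---[ \t]*(?:\r?\n|\z)/m

# Hypothetical helper: returns [frontmatter, content]. The frontmatter is nil
# when the text doesn't start with a frontmatter block.
def split_frontmatter(text)
  match = FRONTMATTER_BLOCK.match(text)
  return [nil, text] unless match

  # symbolize_names gives Symbol keys at every level of nesting.
  data = YAML.safe_load(match[1], symbolize_names: true) || {}
  [data, match.post_match]
end

frontmatter, content = split_frontmatter(<<~TEXT)
  ---
  title: About
  stuff: nonsense
  ---
  # About

  More words.
TEXT

frontmatter # => {:title=>"About", :stuff=>"nonsense"}
content     # => "# About\n\nMore words.\n"
```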

What, no Markdown?

You’re probably using Decant with collections of Markdown files – à la Jekyll and friends – so it might be a surprise to find that Decant knows nothing about Markdown or how to turn it into HTML. It’s just not a Decant concern: there are loads of different Markdown libraries out there and it’s up to you which one you choose. Besides, there’s more than just Markdown – remember Textile?

Anyway it’s generally simple to add Markdown support using the library of your choice configured exactly how you like it. Here I’m using Kramdown with its GitHub-Flavoured Markdown support:

# GFM input needs the kramdown-parser-gfm gem installed (Kramdown 2+).
require 'kramdown'

Page = Decant.define(dir: 'content', ext: 'md') do
  def html
    Kramdown::Document.new(content, input: 'GFM').to_html
  end
end

about = Page.find('about')
about.html # => "<h1 id=\"about\">About</h1>\n\n<p>More words.</p>\n"

Sinatra also has a builtin markdown helper that can be used in a view:

<div class="page">
  <%= markdown @page.content %>
</div>

Or directly from a route (and wrapped by a layout):

get '/*slug' do
  @page = Page.find(params[:slug])
  markdown @page.content, layout_engine: :erb
end

But maybe you can even get away without Markdown and use something like Rails’s simple_format helper.

Maybe just make a website

When paired with Parklife plus the Ruby web framework of your choice, Decant helps make it incredibly easy to make a content-driven static website with very little code and none of the constraints of a traditional static site framework. I’m using Decant in a few places (here for instance); the most visible is the Parklife website which, thanks to Sinatra, has just a handful of lines of code, leaving me to get on with making a website.

Challenge: find the longest time gap

Very occasionally I’m inspired by a code challenge. This time it’s finding the longest gap in minutes between consecutive timestamps (via @andycroll). Here’s my approach using Ruby:

def find_longest_time_gap(times)
  times
    .map { _1.split(':', 2).map(&:to_i) }
    .map { |(h, m)| h * 60 + m }
    .sort
    .each_cons(2)
    .map { |(a, b)| b - a }
    .max || 0
end

find_longest_time_gap(['14:00', '09:00', '15:00', '10:30'])
# => 210

Breaking it down.

I’m using a “numbered parameter” _1 (which I kinda like more than the newer it 😬 though I’m not sure I’d use either in “proper” code). This is split into two-and-only-two parts (it’s maybe a bit defensive but I like that it’s explicit) and turned into integers to get an array of hours/minutes tuples:

.map { _1.split(':', 2).map(&:to_i) }
# => [[14, 0], [9, 0], [15, 0], [10, 30]]

They’re converted to minutes. Here I’m expanding each two-element tuple/array block argument into two variables (I’m not sure what this is called) which is helpful in these simple cases because it turns it into a neat single line:

.map { |(h, m)| h * 60 + m }
# => [840, 540, 900, 630]
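
For what it’s worth, this is usually referred to as destructuring. As a quick sketch of the behaviour: a block with more than one parameter also spreads a yielded array automatically, so the explicit parentheses are optional here, while a single parameter receives the whole pair. The method names below are just for illustration.

```ruby
# The tuples from the previous step.
TUPLES = [[14, 0], [9, 0], [15, 0], [10, 30]].freeze

# Explicit destructuring with parentheses, as used above.
def minutes_explicit(tuples)
  tuples.map { |(h, m)| h * 60 + m }
end

# A block with several parameters spreads a yielded array automatically,
# so here the parentheses are optional.
def minutes_implicit(tuples)
  tuples.map { |h, m| h * 60 + m }
end

# A single parameter receives the whole two-element array instead.
def minutes_indexed(tuples)
  tuples.map { |tuple| tuple[0] * 60 + tuple[1] }
end

minutes_explicit(TUPLES) # => [840, 540, 900, 630]
```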

The original times may not be in order so we need to sort them:

.sort
# => [540, 630, 840, 900]

Ruby has a wealth of useful methods – I’m thinking of chunk_while – so I knew there would be something to help me out. I cheated a little and peeked at Andy’s solution 👀 which revealed that this time it’s each_cons (its name was bound to contain an underscore).

each_cons is one of those methods I find slightly weird. It’s called “each” which suggests a side-effect rather than something being returned, but calling it without a block returns an Enumerator which I think I’d want more often (map_cons?).

Here’s what each_cons(n) does:

# Original array
[1, 2, 3, 4, 5]

# each_cons(2)
[1, 2]
   [2, 3]
      [3, 4]
         [4, 5]

# each_cons(3)
[1, 2, 3]
   [2, 3, 4]
      [3, 4, 5]
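
The same thing as runnable Ruby, in case you want to poke at it in irb. windows is just a throwaway helper name for this sketch.

```ruby
# Collect the sliding windows that each_cons yields for a given size.
def windows(list, size)
  list.each_cons(size).to_a
end

windows([1, 2, 3, 4, 5], 2) # => [[1, 2], [2, 3], [3, 4], [4, 5]]
windows([1, 2, 3, 4, 5], 3) # => [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
windows([1], 2)             # => []

# Without a block each_cons returns an Enumerator, so it chains like map.
[1, 2, 3, 4, 5].each_cons(2).map { |a, b| b - a } # => [1, 1, 1, 1]
```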

So here’s the each_cons(2) magic:

.each_cons(2)
# => [[540, 630], [630, 840], [840, 900]]

Now we can map over these to get the differences:

.map { |(a, b)| b - a }
# => [90, 210, 60]

Then get the maximum value. If the array has fewer items than the number passed to each_cons then the returned Enumerator will be empty and the call to max will return nil so we should default to zero:

.max || 0
# => 210

Update: I realised I hadn’t read the question fully and had an incorrect assumption that the times would always be sorted. I fixed that, and the case where there’s only one time (or zero) and therefore no gap, by adding some tests:

require 'minitest/autorun'

class TestMe < Minitest::Test
  def test_cases
    assert_equal 0, find_longest_time_gap(['12:00'])
    assert_equal 120, find_longest_time_gap(['09:00', '11:00'])
    assert_equal 210, find_longest_time_gap(['14:00', '09:00', '15:00', '10:30'])
    assert_equal 240, find_longest_time_gap(['08:00', '10:00', '10:00', '14:00'])
  end
end

Operatic 0.7

Operatic defines a minimal standard interface to encapsulate your Ruby operations. The job of Operatic is to receive input and make it available to the operation, and to gather output and return it via a result object. This leaves you a well-defined space to write the actual code by implementing the #call method – and with so much of the ceremony taken care of it feels like writing a function but in a Ruby wrapping.

Version 0.7.0 introduces concrete Success/Failure result classes, which bring a slight change to the API and may therefore require some tweaks to your code.

In earlier versions an operation gathered data on its result object and marked the result as a success or failure. Now the operation gathers data on a separate Operatic::Data object and it’s the operation that controls the success/failure status by choosing the appropriate Success/Failure result class and initialising it with its data.

Operatic::Data could have perhaps been a plain Hash but I wanted to retain the ability to define a per-operation subclass with custom accessors (which remain accessible via the result thanks to the magic of #method_missing). I couldn’t use the Data class introduced in Ruby 3.2 because it’s frozen at the point of creation whereas in an operation data is gathered throughout execution and frozen on completion.

Having a concrete success/failure result class also makes pattern matching clearer by replacing an anonymous boolean ([true, { message: }]) with the self-documenting result class itself:

case SayHello.call(name: 'Dave')
in [Operatic::Success, { message: }]
  # Result is a success, do something with the `message` variable.
in [Operatic::Failure, _]
  # Result is a failure, ignore any data and do something else.
end
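
If you’re wondering how an object can match an array pattern like this: Ruby calls #deconstruct on the value being matched and then compares each element with ===. The sketch below uses hypothetical stand-in classes (SketchResult, SketchSuccess, SketchFailure) and is only my illustration of the general mechanism, not a copy of Operatic’s code.

```ruby
# Hypothetical stand-ins for illustration; these are not Operatic's classes.
class SketchResult
  def initialize(**data)
    @data = data.freeze
  end

  # Array patterns call #deconstruct. Returning [self, data] lets the first
  # element be matched by class and the second by a hash pattern.
  def deconstruct
    [self, @data]
  end
end

class SketchSuccess < SketchResult; end
class SketchFailure < SketchResult; end

def describe(result)
  case result
  in [SketchSuccess, { message: }]
    "success: #{message}"
  in [SketchFailure, _]
    'failure'
  end
end

describe(SketchSuccess.new(message: 'Hello Dave')) # => "success: Hello Dave"
describe(SketchFailure.new(error: 'nope'))         # => "failure"
```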

Altogether I’m pleased with the refactor. I think it’s resulted in a better API by more clearly defining the roles and responsibilities of each of the component parts.

Streaming Phlex from Sinatra

Sinatra supports streaming, Phlex supports streaming, but until recently I hadn’t put the two together in phlex-sinatra. Well now I have and from version 0.3 you can pass stream: true like so:

get '/foo' do
  phlex MyView.new, stream: true
end

When streaming is enabled Phlex will automatically flush to the response after a closing </head> which means that the browser can start processing external resources as quickly as possible while the rest of the view is being generated. Even with no further intervention this small change should improve your Time to First Byte and could improve your First Contentful Paint.

It took very little code to integrate Phlex with Sinatra’s streaming but it did require a lot of learning and understanding, some of which will be useful to you dear reader. The first thing to note is that streaming with Sinatra requires a compatible server like Puma (it appears that WEBrick isn’t compatible) but the main thing to be aware of is the assortment of buffers between your code and the receiving client.

Buffers buffers buffers

It’s normal for a web framework to wait until a page has been fully generated (to buffer) before sending the response – so that the Content-Length can be determined, among other things. Sinatra (et al) provides a way to bypass its buffer and write directly to the response, and it’s this object that’s passed to Phlex.

I was using curl to inspect the response and although I could now see the immediately-streamed HTTP headers (pass -i/--include to curl to include the headers in its output) the rest of the response was only shown once complete. I spent an age investigating both Sinatra and Phlex’s internals only to realise that everything was working correctly and the culprit was curl itself which has its own buffer. Curl behaves like other command-line tools and outputs lines but Phlex “uglifies by default” (it doesn’t add extra whitespace to generate pretty HTML) and this lack of newlines exacerbated the apparent problem – the trick is to pass -N/--no-buffer.

But streaming still wasn’t working as expected. It was then I discovered that Phlex has its own buffer that’s automatically flushed after a closing </head> tag, but it’s otherwise left to the developer to choose if and when to call the provided #flush method (I’m guessing Phlex doesn’t write directly to the response due to some sort of performance penalty).
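
To make the idea of an intermediate buffer concrete, here’s a toy model in plain Ruby. It’s only a sketch under my own assumptions and not how Phlex is implemented: ToyBuffer is a hypothetical class that holds output until #flush is called, and flushes on its own after a closing </head>.

```ruby
require 'stringio'

# Hypothetical toy buffer, loosely modelled on the behaviour described above.
class ToyBuffer
  def initialize(output)
    @output = output # stands in for the streamed response
    @pending = +''   # text generated but not yet sent
  end

  def <<(html)
    @pending << html
    flush if html.include?('</head>') # automatic flush after the head
    self
  end

  # Send whatever has accumulated and start a fresh pending string.
  def flush
    @output << @pending
    @pending = +''
  end
end

# Record what the "client" has received after each step.
def toy_stream
  response = StringIO.new
  buffer = ToyBuffer.new(response)
  received = []

  buffer << '<html><head><title>Hi</title></head>'
  received << response.string.dup # the head has already been sent

  buffer << '<body><p>Slow content</p>'
  received << response.string.dup # the body is still pending

  buffer.flush
  received << response.string.dup # now the body has been sent too
  received
end
```

The middle snapshot is the interesting one: the body has been generated but nothing new has reached the response until #flush is called.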

One more thing that I didn’t encounter while testing, but that’s worth being aware of, is that other intermediaries can buffer the response without your knowledge. For example, the reason for adding an X-Accel-Buffering: no header is explained here (the whole article is well worth a read).

When to use streaming

So… I think ideally you shouldn’t need to use streaming at all but it’s easy to find yourself with a particularly problematic page (I’m thinking of a very long list or if it depends on some slow external service) where streaming can help. Adding pagination or completely rethinking the approach is quite probably the proper solution but enabling streaming combined with some judicious calls to #flush could be enough to patch over the issue and give some breathing room for the real solution.

Using Phlex in Sinatra with phlex-sinatra

Phlex already works with Sinatra (and everything else) but its normal usage leaves you without access to Sinatra’s standard helper methods. That’s why I created phlex-sinatra which lets you use Sinatra’s url() helper from within Phlex (along with the rest of the usual helper methods available in a Sinatra action).

To enable the integration use the phlex method in your Sinatra action and pass an instance of the Phlex view (instead of using .call to get its output):

get '/foo' do
  phlex MyView.new
end

You can now use Sinatra’s url() helper method directly and its other methods (params, request, etc) via the helpers proxy:

class MyView < Phlex::HTML
  def template
    h1 { 'Phlex / Sinatra integration' }
    p {
      a(href: url('/foo', false)) { 'link to foo' }
    }
    pre { helpers.params.inspect }
  end
end

Why?

It might not seem obvious at first why you’d use url() at all given that you mostly just pass the string you want to output 🤷🏻‍♂️ but I hit the issue immediately when I switched to Phlex in my Wordle results Sinatra/Parklife microsite hosted on GitHub Pages.

One of the main features of using Parklife is that your development flow remains completely unchanged. In development you start the server as usual which means the app is almost certainly served from the root /, but if the static site is hosted as a GitHub Pages repository site it’ll be served from /my-repository-name – which means all your links will be broken in production! It’s incredibly frustrating but luckily easily fixed.

Step 1 is to use Sinatra’s url() helper method wherever you need a URL (the false second argument means the scheme/host isn’t included):

link(href: url('/app.css', false), rel: 'stylesheet', type: 'text/css')

Step 2, configure a Parklife base:

Parklife.application.config.base = '/wordle'

It’s also possible to pass --base at build time. In fact, if you used Parklife to generate a GitHub Actions workflow (parklife init --github-pages) then it’s already configured to fetch your GitHub Pages site URL – whether it’s a custom domain or a standard repository site – and pass it to the build script so you won’t need to manually configure it as above.

Step 3 (profit?). The result is that when Parklife generates the static build Sinatra will know to serve the site from the /wordle subpath and will include the prefix on all url()-generated URLs:

<link href="/wordle/app.css" rel="stylesheet" type="text/css">

Another main reason to use the url() helper is to generate a full URL – for instance from within a feed or for an og:image social media preview link. In this case don’t pass the false second argument (it defaults to true) and the full URL will be generated. Once again you’ll need to configure Parklife with the correct base but once again it’s already taken care of if you generated the GitHub Actions workflow with parklife init --github-pages.
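
The effect of the base on both kinds of URL can be modelled in a few lines of plain Ruby. To be clear, this is a simplified sketch and not Sinatra’s or Parklife’s code: build_url is a hypothetical helper, and the https://example.github.io origin is made up for the example.

```ruby
# Hypothetical, simplified model of a base-aware URL helper.
def build_url(path, base: '', origin: 'https://example.github.io', absolute: true)
  # Join base and path with exactly one slash between them.
  prefixed = "#{base.chomp('/')}/#{path.delete_prefix('/')}"
  absolute ? "#{origin}#{prefixed}" : prefixed
end

# Development: served from the root, so no prefix is added.
build_url('/app.css', absolute: false)                  # => "/app.css"

# Static build with a base: the prefix appears on every generated URL.
build_url('/app.css', base: '/wordle', absolute: false) # => "/wordle/app.css"

# Full URL, for example for a feed or an og:image link.
build_url('/preview.png', base: '/wordle')
# => "https://example.github.io/wordle/preview.png"
```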