
My YAML reference

I find myself writing ever more YAML nowadays, and whilst it seems pretty simple at first (isn’t it just key/value?), a bit more knowledge can be helpful. So here are a bunch of YAML things that I find useful, and that I continue to forget and have to look up again.

Strings

It’s not always necessary to “quote” a string, and because of this there are lots of subtle ways that things can go awry (The Norway Problem being a classic example). But there are also lots of features that can help with formatting strings.

Formatting with |

The | treats its following lines as a block of multiline text:

text: |
  # Markdown Heading

  These separate lines
  will remain separate lines
  with no extra indentation.
"# Markdown Heading\n\nThese separate lines\nwill remain separate lines\nwith no extra indentation.\n"

There’s also |- which removes the final trailing newline.
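
To see the difference between | and |- side by side, here’s a minimal Ruby sketch (the keys clip and strip are just names I made up for illustration):

```ruby
require 'yaml'

# Two literal blocks with identical content; only the chomping indicator differs.
doc = YAML.load(<<~YAML)
  clip: |
    one
    two
  strip: |-
    one
    two
YAML

p doc["clip"]  # "one\ntwo\n" (final newline kept)
p doc["strip"] # "one\ntwo" (final newline removed)
```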

Formatting with >

The > joins its following lines with a space:

text: >
  These separate lines
  will become one long line
  joined with spaces and
  with no indentation.
"These separate lines will become one long line joined with spaces and with no indentation.\n"

Using > can aid readability by splitting a single long command over many lines:

step:
  run: >
    NODE_ENV=production
    SOME=more
    ENV=vars
    npm build
"NODE_ENV=production SOME=more ENV=vars npm build\n"

There’s also >- which joins its following lines with a space and removes the final trailing newline.

Note that lines will only continue to be joined while the indentation level remains the same (this has caught me out in the past). So the following will join the first two lines with a space but the rest with newlines:

text: >
  These separate lines
  will NOT become one long line
    joined with spaces and
  with no indentation.
"These separate lines will NOT become one long line\n  joined with spaces and\nwith no indentation.\n"

Here’s what the spec says:

each line break is folded to a space unless it ends an empty or a more-indented line
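
The “empty line” half of that rule is easy to miss, so here’s a small Ruby sketch of it: ordinary line breaks fold to spaces, but a blank line inside the block survives as a newline. I’ve used >- so there’s no trailing newline to think about, and the text itself is made up:

```ruby
require 'yaml'

# A folded block with a blank line separating two "paragraphs".
doc = YAML.load(<<~YAML)
  text: >-
    first paragraph
    continues here

    second paragraph
YAML

p doc["text"] # "first paragraph continues here\nsecond paragraph"
```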

Multiline with no extra formatting

The behaviour of >- is similar to that of a plain (unquoted) multiline string: the lines are joined with a space and there’s no trailing newline. The big difference is that, for a plain string, indentation changes are ignored:

text:
  These separate lines
    will become one long line
      joined with spaces and
  with no indentation.
"These separate lines will become one long line joined with spaces and with no indentation."

This, again, can aid readability of a long command by splitting it over many lines with different indentation:

step:
  run:
    ./run_a_command
      --with=a
      --big=list
      --of=arguments
      -- and/file/paths
"./run_a_command --with=a --big=list --of=arguments -- and/file/paths"

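To compare the two directly, here’s a sketch that feeds the same indented command to both forms. The keys plain and folded are hypothetical names, and I’m assuming Ruby’s bundled parser:

```ruby
require 'yaml'

# Same text twice: once as a plain multiline string, once under >-.
doc = YAML.load(<<~YAML)
  plain:
    ./run_a_command
      --with=a
      --big=list
  folded: >-
    ./run_a_command
      --with=a
      --big=list
YAML

p doc["plain"]  # "./run_a_command --with=a --big=list"
p doc["folded"] # "./run_a_command\n  --with=a\n  --big=list"
```

The plain form throws the extra indentation away, whereas >- keeps both the newlines and the extra two spaces on the more-indented lines.
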
Maps

Key/value pairs are “simple” and can be nested:

key: value
nested:
  key: value
{
  "key": "value",
  "nested": {
    "key": "value"
  }
}

But this is exactly the sort of thing that causes The Norway Problem: unquoted values are typed implicitly, so they may not result in what you expected. Here YES and NO become booleans rather than strings:

a: true
b: false
c: null
d: YES
e: NO
f: hello
g: 1.234
h: a long unquoted string
{
  "a": true,
  "b": false,
  "c": null,
  "d": true,
  "e": false,
  "f": "hello",
  "g": 1.234,
  "h": "a long unquoted string"
}
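
The fix is to quote anything that must stay a string. Here’s a minimal sketch using a made-up list of country codes, assuming Ruby’s bundled parser (other parsers may treat NO differently):

```ruby
require 'yaml'

# The same list twice; only the quoting of NO differs.
doc = YAML.load(<<~YAML)
  unquoted: [GB, NO, SE]
  quoted: [GB, "NO", SE]
YAML

p doc["unquoted"] # ["GB", false, "SE"]
p doc["quoted"]   # ["GB", "NO", "SE"]
```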

Collections

I think of arrays as Markdown bullet lists — they can also be nested:

- one
- two
- three
-
  - nested
  - array
["one", "two", "three", ["nested", "array"]]

They can also be written “inline”:

- one
- two
- three
- [nested, array]
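
As a quick check that the two styles really are equivalent, this Ruby sketch loads the bullet-list form and a fully inline form and compares them (the variable names block and flow are mine):

```ruby
require 'yaml'

# Bullet-list style, including a nested array.
block = YAML.load(<<~YAML)
  - one
  - two
  - three
  -
    - nested
    - array
YAML

# The same data written entirely inline.
flow = YAML.load("[one, two, three, [nested, array]]")

p flow          # ["one", "two", "three", ["nested", "array"]]
p block == flow # true
```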

Anchors (&)

Anchors act as variables and can be used to reduce repetition. Here’s example 2.10 from the spec, where & declares the named anchor SS and * references it later in the document:

---
hr:
  - Mark McGwire
  # Following node labeled SS
  - &SS Sammy Sosa
rbi:
  - *SS # Subsequent occurrence
  - Ken Griffey
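
Here’s a sketch of loading that example in Ruby. One thing to be aware of: YAML.safe_load rejects aliases unless you pass aliases: true (I’m assuming Ruby 2.6 or later here, where that keyword argument is available):

```ruby
require 'yaml'

source = <<~YAML
  hr:
    - Mark McGwire
    - &SS Sammy Sosa
  rbi:
    - *SS
    - Ken Griffey
YAML

# With aliases enabled, *SS resolves to the anchored value.
doc = YAML.safe_load(source, aliases: true)
p doc["rbi"] # ["Sammy Sosa", "Ken Griffey"]

# Without it, safe_load refuses to resolve the alias.
begin
  YAML.safe_load(source)
rescue Psych::BadAlias => e
  puts "rejected: #{e.class}"
end
```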

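Anchors can be used to DRY up CI config (although, at the time of writing, they can’t be used in GitHub Actions YAML) or a Rails database.yml. Here they’re combined with the merge key (<<), which copies the anchored map’s keys into the map it appears in: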

default: &default
  adapter: postgresql
  encoding: unicode
  pool: 5

development:
  <<: *default
  database: app_development

test:
  <<: *default
  database: app_test
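
To check what the merge actually produces, here’s a trimmed-down sketch in Ruby. The pool: 2 override in test is my own addition, to show that a key written after the << line wins over the merged one:

```ruby
require 'yaml'

config = YAML.safe_load(<<~YAML, aliases: true)
  default: &default
    adapter: postgresql
    pool: 5

  development:
    <<: *default
    database: app_development

  test:
    <<: *default
    database: app_test
    pool: 2
YAML

dev = config["development"]
p dev["adapter"]         # "postgresql" (merged in from default)
p dev["database"]        # "app_development"
p config["test"]["pool"] # 2 (overrides the merged value)
```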

Comments

A comment starts with a #:

# Commented line.
- hello
# Interleaved comment.
- there # Another comment.
["hello", "there"]
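
A # only starts a comment at the beginning of a line or after whitespace, and never inside quotes. Here’s a small Ruby sketch of those cases (the keys and the URL are made up for illustration):

```ruby
require 'yaml'

doc = YAML.load(<<~YAML)
  plain: value # trailing comment
  quoted: "value # not a comment"
  url: http://example.com/#fragment
YAML

p doc["plain"]  # "value"
p doc["quoted"] # "value # not a comment"
p doc["url"]    # "http://example.com/#fragment"
```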

But, as you may have noticed from a previous example, a # can appear in a multiline string when using | or > without being interpreted as a comment:

text: |
  # Markdown Heading

  These separate lines
  will remain separate lines
  with no extra indentation.
"# Markdown Heading\n\nThese separate lines\nwill remain separate lines\nwith no extra indentation.\n"

How to quickly test a snippet of YAML with Ruby

While writing this I encountered loads of little mistakes in my YAML and often had to verify that the output was what I expected. To check, I used Ruby’s DATA/__END__ trick, where everything after __END__ in a script can be read through DATA:

require 'yaml'

pp YAML.load(DATA.read)

__END__

a: true
b: false
c: null
d: YES
e: NO
f: hello
g: 1.234
h: a long unquoted string
{"a"=>true,
 "b"=>false,
 "c"=>nil,
 "d"=>true,
 "e"=>false,
 "f"=>"hello",
 "g"=>1.234,
 "h"=>"a long unquoted string"}