---
title: "YAML in 2 Minutes: A Gentle Introduction for R Users"
output: rmarkdown::html_vignette
editor_options:
  markdown:
    wrap: 72
    canonical: true
vignette: >
  %\VignetteIndexEntry{YAML in 2 Minutes: A Gentle Introduction for R Users}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(yaml12)
```

Here’s a short introduction to YAML for R users. YAML is a data
serialization format designed to be easy for humans to read and write.

Think of YAML as “JSON with comments and nicer multiline strings.”
`yaml12` parses YAML 1.2 (the modern specification that removes some of
YAML 1.1’s surprising eager conversions) into plain R objects.

YAML has three building blocks: **scalars** (single values),
**sequences** (ordered collections of items), and **mappings**
(key/value pairs with unique keys). JSON is a subset of YAML 1.2, so all
valid JSON is also valid YAML and parses the same way.

## A first example

``` yaml
title: A Modern YAML parser written in Rust
properties: [correct, safe, fast, simple]
score: 9.5
categories:
  - yaml
  - r
  - example
settings:
  simplify: true
  note: >
    This is a folded block
    that turns line breaks
    into spaces.
  note_literal: |
    This is a literal block
    that keeps
    line breaks.
```

```{r, echo = FALSE}
first_example <- '
title: A Modern YAML parser written in Rust
properties: [correct, safe, fast, simple]
score: 9.5
categories:
  - yaml
  - r
  - example
settings:
  simplify: true
  note: >
    This is a folded block
    that turns line breaks
    into spaces.
  note_literal: |
    This is a literal block
    that keeps
    line breaks.
'
```

```{r}
str(parse_yaml(first_example))
```

## Comments

Comments start with `#` and run to the end of the line. A `#` only
starts a comment when it begins a line or is preceded by whitespace.
Comments can sit on their own line or at the end of a line, and the
parser ignores them.

``` yaml
# Whole-line comment
title: example # inline comment
items: [a, b] # trailing comment
```

→ `list(title = "example", items = c("a", "b"))`
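
To check this from R, the sketch below parses a commented snippet with
`parse_yaml()`. The `channel` key is a made-up addition, included only
to illustrate the whitespace rule: a `#` with no whitespace before it
should stay part of the value rather than start a comment.

``` r
library(yaml12)

commented <- '
# Whole-line comment
title: example # inline comment
items: [a, b] # trailing comment
channel: dev#general
'

parsed <- parse_yaml(commented)

# Comments leave no trace in the parsed result; only the keys remain.
names(parsed)

# No whitespace before "#", so it belongs to the plain scalar
# (hypothetical key, not part of the example above).
parsed$channel
```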

## Collections

YAML has two collection types: sequences and mappings.

### Sequences: YAML’s ordered collections

A sequence is an ordered list of items. Each item begins with a dash
and a space (`- `), and sibling items share the same indent.

``` yaml
- cat
- dog
```

→ `c("cat", "dog")` (or `list("cat", "dog")` when `simplify = FALSE`)

JSON-style arrays work too:

``` yaml
[cat, dog]
```

→ same result
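
As a quick check of the two spellings and of the `simplify` argument
mentioned above, here is a minimal sketch. The variable names are
illustrative only.

``` r
library(yaml12)

block_style <- "
- cat
- dog
"
flow_style <- "[cat, dog]"

# With the default simplify = TRUE, a homogeneous sequence is a vector.
pets_vector <- parse_yaml(block_style)
is.character(pets_vector)

# With simplify = FALSE, the sequence stays a list with one element
# per item.
pets_list <- parse_yaml(block_style, simplify = FALSE)
is.list(pets_list)

# Block and flow styles describe the same data.
pets_flow <- parse_yaml(flow_style)
```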

Anything belonging to one of the sequence entries is indented at least
one space past the dash:

``` yaml
- name: cat
  toys: [string, box]
- name: dog
  toys: [ball, bone]
```

↓

``` r
list(
  list(name = "cat", toys = c("string", "box")),
  list(name = "dog", toys = c("ball", "bone"))
)
```
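
Once parsed, a sequence of mappings is an ordinary list of named lists,
so the usual R tools apply. A small sketch, assuming the YAML shown
above:

``` r
library(yaml12)

pets <- parse_yaml("
- name: cat
  toys: [string, box]
- name: dog
  toys: [ball, bone]
")

# Index into the sequence by position, then into the mapping by name.
dog_toys <- pets[[2]]$toys

# Pull one field out of every entry.
pet_names <- vapply(pets, function(pet) pet$name, character(1))
```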

### Mappings: key/value pairs

A mapping is a set of `key: value` pairs at the same indent:

``` yaml
foo: 1
bar: true
```

→ `list(foo = 1L, bar = TRUE)`

A key owns everything indented more deeply beneath it:

``` yaml
settings:
  simplify: true
  max_items: 3
```

→ `list(settings = list(simplify = TRUE, max_items = 3L))`

JSON-style objects also work:

``` yaml
{a: true}
```

→ `list(a = TRUE)`

Mappings become named lists in R.
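
Because the result is a named list, nested values can be reached with
`$` or `[[`. The sketch below reuses the `settings` example from this
section. The final comparison against a JSON-style spelling is there to
try out, not a documented guarantee.

``` r
library(yaml12)

config <- parse_yaml("
settings:
  simplify: true
  max_items: 3
")

# Keys become list names, in document order.
setting_names <- names(config$settings)

# Walk down the nesting with `$`.
max_items <- config$settings$max_items

# The same data written as a JSON-style object.
config_flow <- parse_yaml("{settings: {simplify: true, max_items: 3}}")
identical(config, config_flow)
```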

## Scalars

All nodes that are not collections are scalars; these are the leaf nodes
of a YAML document.

Scalars can be written in three styles: block, quoted, or plain.

### Block scalars

`|` starts a **literal** block that keeps newlines; `>` starts a
**folded** block that joins consecutive lines with spaces, while blank
lines and more-indented lines keep their line breaks. Block scalars
always become strings.

``` yaml
|
  hello
  world
```

→ `"hello\nworld\n"`

``` yaml
>
  hello
  world
```

→ `"hello world\n"`
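
The difference is easiest to see by printing the parsed strings with
`cat()`. A minimal sketch of the two examples above:

``` r
library(yaml12)

literal_yaml <- "|
  hello
  world
"
folded_yaml <- ">
  hello
  world
"

literal <- parse_yaml(literal_yaml)
folded <- parse_yaml(folded_yaml)

cat(literal)  # two lines: the line break is kept
cat(folded)   # one line: the line break was folded into a space
```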

### Quoted scalars

Like block scalars, quoted scalars always resolve to strings. Double
quotes interpret escapes (`\n`, `\t`, `\\`, `\"`). Single quotes are
literal and do not interpret escapes, except for `''` which is parsed as
a single `'`.

``` yaml
["line\nbreak", "quote: \"here\""]
```

→ `c("line\nbreak", 'quote: "here"')`

``` yaml
['line\nbreak', 'quote: ''here''']
```

→ `c("line\\nbreak", "quote: 'here'")`
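
Backslashes are awkward to read when YAML is embedded in an R string,
because R applies its own escapes first. The sketch below uses R's raw
strings (`r"(...)"`, available from R 4.0.0) so the YAML text reaches
the parser exactly as written, then compares lengths with `nchar()`.

``` r
library(yaml12)

# Double quotes: YAML turns \n into one newline character.
double_quoted <- parse_yaml(r"("line\nbreak")")
nchar(double_quoted)  # 10

# Single quotes: the backslash and the n are kept as two characters.
single_quoted <- parse_yaml(r"('line\nbreak')")
nchar(single_quoted)  # 11

# Inside single quotes, '' stands for one literal '.
apostrophe <- parse_yaml(r"('quote: ''here''')")
```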

### Plain (unquoted) scalars

If a node is not a sequence, mapping, block scalar, or quoted scalar, it
is a *plain* scalar.

A plain scalar resolves to one of five types: string, integer, float,
boolean, or null.

YAML 1.2 infers the type of a plain scalar with a few simple rules:

-   `true` / `false` → `TRUE` / `FALSE`
-   `null`, `~`, or empty → `NULL`
-   numbers: signed, decimal, scientific, hex (`0x`), octal (`0o`),
    `.inf`, `.nan` → integer or double
-   everything else stays a string (`yes`, `no`, `on`, `off` and other
    aliases remain strings in YAML 1.2)

``` yaml
[true, 123, 4.5e2, 0x10, .inf, yes]
```

→ `list(TRUE, 123L, 450, 16L, Inf, "yes")`

When a sequence simplifies to a vector (it is homogeneous and
`simplify = TRUE`, the default), any `null` entries become the matching
`NA` value, such as `NA_integer_` or `NA_real_`.
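
The rules above can be checked by looking at the R type of each parsed
element. This is a small sketch; the second and third inputs are extra
illustrations of the quoting and null rules rather than examples from
the text.

``` r
library(yaml12)

# A mixed sequence stays a list, so each element keeps its own type.
values <- parse_yaml("[true, 123, 4.5e2, 0x10, .inf, yes]")
value_types <- vapply(values, typeof, character(1))
value_types

# Quoting opts out of type inference: these are strings, not a
# boolean and a number.
quoted <- parse_yaml('["true", "123"]')

# In a homogeneous sequence, null becomes the matching NA.
with_null <- parse_yaml("[1, 2, null]")
```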

## End-to-end example

``` yaml
doc:
  pets:
    - cat
    - dog
  numbers: [1, 2.5, 0x10, .inf, null]
  integers: [1, 2, 3, 0x10, null]
  flags: {enabled: true, label: on}
  literal: |
    hello
    world
  folded: >
    hello
    world
  quoted:
    - "line\nbreak"
    - 'quote: ''here'''
  plain: [yes, no]
  mixed: [won't simplify, 123, true]
```

R result (`parse_yaml()` with defaults):

``` r
list(
  doc = list(
    pets = c("cat", "dog"),
    numbers = c(1, 2.5, 16, Inf, NA_real_),
    integers = c(1L, 2L, 3L, 16L, NA_integer_),
    flags = list(enabled = TRUE, label = "on"),
    literal = "hello\nworld\n",
    folded = "hello world\n",
    quoted = c("line\nbreak", "quote: 'here'"),
    plain = c("yes", "no"),
    mixed = list("won't simplify", 123L, TRUE)
  )
)
```

## Quick notes

-   Indentation defines structure for collections. Sibling elements
    share an indent, children are indented more. YAML forbids tabs for
    indentation; use spaces.

-   All JSON is valid YAML.

-   Homogeneous sequences simplify to vectors unless `simplify = FALSE`.

-   Block scalars (`|`, `>`) always produce strings.

-   Booleans are only `true`/`false`; `yes`, `no`, `on`, and `off` stay
    strings.

-   `null` maps to `NULL` (or `NA` inside simplified vectors).

These essentials cover most YAML you’ll run into in practice. If you
encounter YAML tags or non-string mapping keys, check out the "Advanced
YAML" vignette.
