---
title: "Introduction to elixir"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to elixir}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  markdown: 
    wrap: 80
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(elixir)
```

`elixir` is a set of tools for transforming R expressions, including into other
programming languages.

One of the neat features of R is that you can use the language to inspect
itself. Expressions, functions, indeed entire R scripts can be examined and
manipulated just like any list, data.frame, or other R object.

However, the syntax for manipulating R language objects is a little tricky.
Packages such as `rlang` help to make this task easier. `elixir` makes a few
extra shortcuts available, and is geared for advanced R users.

# Find and replace for language objects

Sometimes you want to detect certain patterns within an expression or list of
expressions, or easily replace a certain pattern with another. When working with
strings, regular expressions are a handy way of accomplishing such tasks.
`elixir` provides a sort of "regular expressions for R expressions"
functionality through the functions `expr_match()`, `expr_replace()`, and the
"shortcut" functions `expr_count()`, `expr_detect()`, `expr_extract()`, and
`expr_locate()`.

Frequent users of the [`stringr`](https://stringr.tidyverse.org/) package will 
recognize the intentional similarity between the above functions and 
`str_match()`, `str_replace()`, `str_count()`, `str_detect()`, `str_extract()`, 
and `str_locate()` from `stringr`.

The easiest way to demonstrate these is through an example.

## Example: a domain-specific language for ordinary differential equations

The Lotka-Volterra equations can be used to model a predator-prey interaction as
a system of ordinary differential equations:

$$
dx/dt = \alpha x - \beta x y \\
dy/dt = -\gamma y + \delta x y
$$

Here, $x$ is the predator density, $y$ is the prey density, $\alpha$ is the prey
birth rate, $\beta$ is the rate at which the prey is killed by the predator,
$\gamma$ is the predator net death rate in the absence of prey to eat, and
$\delta$ is the effect of eating prey on the predator birth rate.

One could implement this in R using the package `deSolve`:

```{r, eval = TRUE}
y <- c(x = 1, y = 1)
times <- 0:100
parms <- c(alpha = 1/6, beta = 1/3, gamma = 0.25, delta = 0.25) 

func <- function(t, y, parms)
{
    with(as.list(c(y, parms)), { 
        dx <- alpha * x - beta * x * y 
        dy <- -gamma * y + delta * x * y 
        return (list(c(dx, dy)))
    })
}

# Run this with:
# sol <- deSolve::ode(y, times, func, parms) 
# matplot(sol[, 1], sol[, -1], type = "l")
```

![Oscillating time series of predator and prey population size](ode_plot.png)\

If you run those last two lines that are commented out, you should see the
solution plotted as above. (The lines are commented out so that `deSolve` isn't
required to build this vignette.)

Let's suppose that instead we want to start with a set of quoted statements like
this:

```{r}
system <- quote({
    t_end = 100

    x(0) = 1
    y(0) = 1

    dx/dt = alpha * x - beta * x * y
    dy/dt = -gamma * y + delta * x * y

    alpha = 1/6
    beta = 1/3
    gamma = 0.25
    delta = 0.25
})
```

and turn this into the components above. Here is one approach with `elixir`. The
aim isn't to make something totally robust, but just to get something up and
running, and demonstrate the use of the package.

First, we want to set the variable `times` to a set of integers running from 0
to the specified `t_end` within `system`. We can look for a statement of the
form `t_end = .X` like so:

```{r}
expr_match(system, { t_end = .X })
```

`elixir` allows you to quote expressions inline using `{ curly braces }` as
above. It's sort of the equivalent of `"quotation marks"` for a string. This is
handy to avoid having the equals sign `=` interpreted as naming a parameter to
`quote` or `rlang::expr`:

```{r, eval = FALSE}
# neither of these will work
expr_match(system, quote(t_end = .X))
expr_match(system, rlang::expr(t_end = .X))

# instead you would have to do something like this:
expr_match(system, quote((t_end = .X))[[2]])
expr_match(system, rlang::expr((t_end = .X))[[2]])

# This works because the expression (t_end = .X) is a call, which is list-like
# with two elements: 
# [[1]] is the symbol `(`, and [[2]] is the call t_end = .X.
```

We can extract the number `100` from this list returned by `expr_match`, but
instead we will use a shortcut, `expr_extract`:

```{r}
expr_extract(system, { t_end = .X }, "X")
```

This always returns a `list` with as many entries as there are matches, so if
there were two statements of the form `t_end = .X` then this would be a
two-element `list`. We can tell `elixir` to stop after the first match:

```{r}
expr_extract(system, { t_end = .X }, "X", n = 1)
```

or use `expr_count` to make sure there is exactly one `t_end = .X` statement:

```{r}
if (expr_count(system, { t_end = .X }) != 1) {
    stop("Need exactly one specification of end time.")
}
```

and set `times` like so:

```{r}
times <- 0:expr_extract(system, { t_end = .X }, "X")[[1]]
times
```

OK, that's done. Now let's extract the initial state vector. For this we want to
look for patterns of the form `.X(0) = .V`:

```{r}
expr_match(system, { .X(0) = .V })
```

Again, we'll use `expr_extract` to pull out the two needed components, the names
of the states (here "x" and "y") and their initial values (both 1).

```{r}
expr_extract(system, { .X(0) = .V }, "X")
expr_extract(system, { .X(0) = .V }, "V")

y <- as.numeric(expr_extract(system, { .X(0) = .V }, "V"))
names(y) <- as.character(expr_extract(system, { .X(0) = .V }, "X"))
y
```

Now for parameters, we might initially think to use:

```{r}
expr_match(system, { .P = .X })
```

But that picks up `t_end` and misses out on `alpha` and `beta`. The reason the
latter two components are missed out is that in `.P = .X`, `.X` only matches a
single token, and `1/6` is an expression with three tokens, `/`, `1`, and `6`.
So we'll use `..X` instead of `.X` so that we can match any subexpression:

```{r}
expr_match(system, { .P = ..X })
```

We can also filter out t_end by adding a "test" to the capture token `.P` like
so:

```{r}
expr_match(system, { `.P|P != "t_end"` = ..X })
```

Anything after the `|` is interpreted as a condition to evaluate, and the match
only succeeds if the condition evaluates to `TRUE`. Within the condition, `.` is
a placeholder for the matched token, but we can also use the name of the token
itself, i.e. `P`. Note that we have to wrap the whole capture token in backticks
so that it gets read as a single symbol.

All together, we can get the parameters like so:

```{r}
parms <- expr_extract(system, { `.P|P != "t_end"` = ..X }, "X")
parms <- sapply(parms, eval)
names(parms) <- as.character(expr_extract(system, { `.P|P != "t_end"` = ..X }, "P"))
```

We need to use `eval` on what has been captured by `..X` in order to evaluate
the quoted expressions.

Capturing the ordinary differential equations themselves, and inserting this
into a function that `deSolve` can use, requires us to look for patterns
`dX/dt = ...`; since the "dX" there is one symbol, we will check it to make sure
it is a symbol that starts with a lowercase d:

```{r}
expr_match(system, { `.A:name|substr(A, 1, 1) == "d"`/dt = ..X })
```

Here, the `:name` checks that the the captured element `.A` is of class `name`.

Let's extract the statements:

```{r}
statements <- expr_extract(system, { `.A:name|substr(A, 1, 1) == "d"`/dt = ..X })
statements
```

We can now use `expr_replace` to change e.g. `dX/dt = ...` to `dX <- ...` so
that it is a valid R assignment statement:

```{r}
R_statements <- expr_replace(statements,
    { `.A:name|substr(A, 1, 1) == "d"`/dt = ..X },
    { .A <- ..X })
R_statements
```

Let's also extract the names of the derivatives themselves, i.e. `dx` and `dy`:

```{r}
derivatives <- expr_replace(R_statements, { .D <- ..X }, { .D })
derivatives
```

Finally we put this all into a function using `rlang` and its injection
operators:

```{r}
func <- eval(rlang::expr(
    function(t, y, parms)
    {
        with(as.list(c(y, parms)), {
            !!!R_statements
            return (list(c(!!!derivatives)))
        })
    }
))
```

Putting it all together into a wrapper function, we get something like this:

```{r}
run_ode <- function(system)
{
    # Get times
    if (expr_count(system, { t_end = .X }) != 1) {
        stop("Need exactly one specification of end time.")
    }
    times <- 0:expr_extract(system, { t_end = .X }, "X")[[1]]
    
    # Get initial state
    y <- as.numeric(expr_extract(system, { .X(0) = .V }, "V"))
    names(y) <- as.character(expr_extract(system, { .X(0) = .V }, "X"))
    
    # Get parameters
    parms <- expr_extract(system, { `.P|P != "t_end"` = ..X }, "X")
    parms <- sapply(parms, eval)
    names(parms) <- as.character(expr_extract(system, { `.P|P != "t_end"` = ..X }, "P"))
    
    # Get statements
    statements <- expr_extract(system, { `.A:name|substr(A, 1, 1) == "d"`/dt = ..X })
    R_statements <- expr_replace(statements,
        { `.A:name|substr(A, 1, 1) == "d"`/dt = ..X },
        { .A <- ..X })
    derivatives <- expr_replace(R_statements, { .D <- ..X }, { .D })
    
    func <- eval(rlang::expr(
        function(t, y, parms)
        {
            with(as.list(c(y, parms)), {
                !!!R_statements
                return (list(c(!!!derivatives)))
            })
        }
    ))
    
    # uncomment if deSolve is available:
    # sol <- deSolve::ode(y, times, func, parms) 
    # matplot(sol[, 1], sol[, -1], type = "l")
}

system <- quote({
    t_end = 100

    x(0) = 1
    y(0) = 1

    dx/dt = alpha * x - beta * x * y
    dy/dt = -gamma * y + delta * x * y

    alpha = 1/6
    beta = 1/3
    gamma = 0.25
    delta = 0.25
})

run_ode(system)
```

![Oscillating time series of predator and prey population size](ode_plot.png)\

# Other `elixir` features

The function `expr_apply()` allows you to transform and extract information from
nested list structures which contain expressions, so if you have a big structure
and you want to check all the variable names or make certain replacements, this
may be useful.

`expr_sub()` offers an interface for extracting or replacing part of an
expression; the one advantage this has over `[[` is that it allows you to use
`NULL` as the index, which gives back the whole expression.

`lang2str()` does the opposite of `str2lang()`; it is like `deparse1()` which is
new since R 4.0.0, but with `collapse = ""` instead of `collapse = " "`.

Finally, `meld()`, `translate()`, and `reindent()` are various experimental
functions for constructing code using R.