---
title: "Reproducible Analytical Pipelines with Nix"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Reproducible Analytical Pipelines with Nix}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

Isolated environments are great to run pipelines in a safe and reproducible
manner. This vignette details how to build a reproducible analytical pipeline
using an environment built with Nix that contains the right version of R and
packages.

## An example of a reproducible analytical pipeline using Nix

Suppose that you've used `{targets}` to build a pipeline for a project and that
you did so using a tailor-made Nix environment. Here is the call to `rix()` that
you could have used to build that environment:

```{r, eval = F}
path_default_nix <- tempdir()

rix(
  r_ver = "4.2.2",
  r_pkgs = c("targets", "tarchetypes", "rmarkdown"),
  system_pkgs = NULL,
  git_pkgs = list(
    package_name = "housing",
    repo_url = "https://github.com/rap4all/housing/",
    commit = "1c860959310b80e67c41f7bbdc3e84cef00df18e"
  ),
  ide = "none",
  project_path = path_default_nix,
  overwrite = TRUE
)
```

This call to `rix()` generates the following `default.nix` file:

```
let
 pkgs = import (fetchTarball "https://github.com/rstats-on-nix/nixpkgs/archive/2023-02-13.tar.gz") {};
 
  rpkgs = builtins.attrValues {
    inherit (pkgs.rPackages) 
      rmarkdown
      tarchetypes
      targets;
  };
 
    housing = (pkgs.rPackages.buildRPackage {
      name = "housing";
      src = pkgs.fetchgit {
        url = "https://github.com/rap4all/housing/";
        rev = "1c860959310b80e67c41f7bbdc3e84cef00df18e";
        sha256 = "sha256-s4KGtfKQ7hL0sfDhGb4BpBpspfefBN6hf+XlslqyEn4=";
      };
      propagatedBuildInputs = builtins.attrValues {
        inherit (pkgs.rPackages) 
          dplyr
          ggplot2
          janitor
          purrr
          readxl
          rlang
          rvest
          stringr
          tidyr;
      };
    });
    
  system_packages = builtins.attrValues {
    inherit (pkgs) 
      glibcLocales
      nix
      R;
  };
  
in

pkgs.mkShell {
  LOCALE_ARCHIVE = if pkgs.system == "x86_64-linux" then "${pkgs.glibcLocales}/lib/locale/locale-archive" else "";
  LANG = "en_US.UTF-8";
   LC_ALL = "en_US.UTF-8";
   LC_TIME = "en_US.UTF-8";
   LC_MONETARY = "en_US.UTF-8";
   LC_PAPER = "en_US.UTF-8";
   LC_MEASUREMENT = "en_US.UTF-8";

  buildInputs = [ housing rpkgs  system_packages   ];
  
}
```

The environment that gets built from this `default.nix` file contains R version
4.2.2, the `{targets}` and `{tarchetypes}` packages, as well as the `{housing}`
packages, which is a package that is hosted on GitHub only with some data and
useful functions for the project. Because it is on GitHub, it gets installed
using the `buildRPackage` function from Nix. You can use this environment to
work on you project, or to launch a `{targets}` pipeline. [This GitHub
repository](https://github.com/b-rodrigues/nix_targets_pipeline/tree/master)
contains the finalized project.

On your local machine, you could execute the pipeline in the environment by
running this in a terminal:

```
cd /absolute/path/to/housing/ && nix-shell default.nix --run "Rscript -e 'targets::tar_make()'"
```

If you wish to run the pipeline whenever you drop into the Nix shell, you could
add a *Shell-hook* to the generated `default.nix` file:

```{r parsermd-chunk-3, eval = FALSE}
path_default_nix <- tempdir()

rix(
  r_ver = "4.2.2",
  r_pkgs = c("targets", "tarchetypes", "rmarkdown"),
  system_pkgs = NULL,
  git_pkgs = list(
    package_name = "housing",
    repo_url = "https://github.com/rap4all/housing/",
    commit = "1c860959310b80e67c41f7bbdc3e84cef00df18e"
  ),
  ide = "none",
  shell_hook = "Rscript -e 'targets::tar_make()'",
  project_path = path_default_nix,
  overwrite = TRUE
)
```

Now, each time you drop into the Nix shell for that project using `nix-shell`,
the pipeline gets automatically executed. `{rix}` also features a function
called `tar_nix_ga()` that adds a GitHub Actions workflow file to make the
pipeline run automatically on GitHub Actions. The GitHub repository linked above
has such a file, so each time changes get pushed, the pipeline runs on GitHub
Actions and the results are automatically pushed to a branch called
`targets-runs`. See the workflow file
[here](https://github.com/b-rodrigues/nix_targets_pipeline/blob/master/.github/workflows/run-pipeline.yaml).
This feature is very heavily inspired and adapted from the
`targets::github_actions()` function.

