---
title: "Sobol4R: Sobol indices for stochastic models"
shorttitle: "Sobol4R: Sobol indices for stochastic models"
author:
- name: "Frédéric Bertrand"
  affiliation:
  - Cedric, Cnam, Paris
  email: frederic.bertrand@lecnam.net
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{Sobol4R: Sobol indices for stochastic models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "figures/sobol-stochastic-",
  fig.width = 7,
  fig.height = 5,
  dpi = 150,
  message = FALSE,
  warning = FALSE,
  eval=FALSE
)

LOCAL <- identical(Sys.getenv("LOCAL"), "TRUE")

library(Sobol4R)
library(sensitivity)
set.seed(4669)
```

# Introduction

This vignette illustrates the use of the Sobol4R package to compute Sobol indices
for deterministic and stochastic versions of the classical Sobol g function.

The designs are generated with `sensitivity::sobol` and the models are provided
by Sobol4R, in particular

- `sobol_g_function` for the deterministic g function  
- `sobol_g2_additive_noise` for a version of the model with additive Gaussian noise  
- `sobol_g2_with_covariate_noise` for a version with covariate dependent noise  

The stochastic models can be combined with the generic quantity of interest
wrapper `sobol4r_qoi`.

# Deterministic Sobol g function

## Order 1 and Total via `sensitivity::sobol()`
```{r g-det-design, cache=TRUE, eval=LOCAL}
n  <- 1e4
p  <- 8
X1 <- data.frame(matrix(runif(p * n), nrow = n))
X2 <- data.frame(matrix(runif(p * n), nrow = n))

sob_det <- sobol4r_design(X1 = X1, X2 = X2, order = 2, nboot = 50)

Y <- sobol_g_function(sob_det$X)
sensitivity::tell(sob_det, Y)
```

```{r g-det-plot, cache=TRUE, eval=LOCAL}
print(sob_det)
Sobol4R::autoplot(sob_det, ncol = 1)
```

## Order 1 and Total via `sensitivity::sobol2007()`

```{r, cache=TRUE, eval=LOCAL}
n  <- 1e4
p  <- 8
X1 <- data.frame(matrix(runif(p * n), nrow = n))
X2 <- data.frame(matrix(runif(p * n), nrow = n))

sob_det_2007 <- sobol4r_design(
  X1    = X1,
  X2    = X2,
  nboot = 50,
  type  = "sobol2007"
)

Y <- sobol_g_function(sob_det_2007$X)
sensitivity::tell(sob_det_2007, Y)
```

```{r g-det-plot_2007, cache=TRUE, eval=LOCAL}
print(sob_det_2007)
Sobol4R::autoplot(sob_det_2007)
```

# Random effect on the output

We restrict the g function to the first two inputs and add a Gaussian noise term
with zero mean and unit variance.

## Order 1 and Total via `sensitivity::sobol()`

```{r g-noise-const-run, cache=TRUE, eval=LOCAL}
sob_noise_add <- sobol4r_design(X1 = X1[, 1:2], X2 = X2[, 1:2], order = 2, nboot = 50, type = "sobol")

Y <- sobol_g2_additive_noise(sob_noise_add$X)
sensitivity::tell(sob_noise_add, Y)
```

```{r g-noise-const-plot, cache=TRUE, eval=LOCAL}
print(sob_noise_add)
Sobol4R::autoplot(sob_noise_add)
```

## Order 1 and Total via `sensitivity::sobol2007()`

```{r g-noise-const-run-2007, cache=TRUE, eval=LOCAL}
sob_noise_add <- sobol4r_design(X1 = X1[, 1:2], X2 = X2[, 1:2], nboot = 50, type = "sobol2007")

Y <- sobol_g2_additive_noise(sob_noise_add$X)
sensitivity::tell(sob_noise_add, Y)
```

```{r g-noise-const-plot-2007, cache=TRUE, eval=LOCAL}
print(sob_noise_add)
Sobol4R::autoplot(sob_noise_add)
```

# Quantity of interest based on repeated runs

Instead of a single noisy run, we can focus on a quantity of interest, here the
conditional mean of the output given the inputs. This is approximated by
repeated calls to the stochastic model.

## Order 1 and Total via `sensitivity::sobol()`

```{r g-noise-const-qoi-run, cache=TRUE, eval=LOCAL}
sob_noise_const_qoi <- sobol4r_qoi_indices(
  model = sobol_g2_additive_noise,
  X1 = X1[, 1:2], 
  X2 = X2[, 1:2], 
  order = 2, 
  nboot = 50, 
  type = "sobol")
```

```{r g-noise-const-qoi-plot, cache=TRUE, eval=LOCAL}
print(sob_noise_const_qoi)
Sobol4R::autoplot(sob_noise_const_qoi)
```

## Order 1 and Total via `sensitivity::sobol2007()`

```{r g-noise-const-qoi-run-2007, cache=TRUE, eval=LOCAL}
sob_noise_const_qoi <- sobol4r_qoi_indices(
  model = sobol_g2_additive_noise,
  X1 = X1[, 1:2], 
  X2 = X2[, 1:2], 
  nboot = 50, 
  type = "sobol2007")
```

```{r g-noise-const-qoi-plot-2007, cache=TRUE, eval=LOCAL}
print(sob_noise_const_qoi)
Sobol4R::autoplot(sob_noise_const_qoi)
```

# Covariate dependent noise

We now add a third input which controls the mean of the Gaussian noise term.
The mean is equal to the third input, and the variance is fixed.

```{r g-noise-cov-design, cache=TRUE, eval=LOCAL}
X1_cov <- data.frame(
  C1 = runif(n),
  C2 = runif(n),
  C3 = runif(n, min = 1, max = 100)
)
X2_cov <- data.frame(
  C1 = runif(n),
  C2 = runif(n),
  C3 = runif(n, min = 1, max = 100)
)
```

## Order 1 and Total via `sensitivity::sobol()`

```{r g-noise-cov-run, cache=TRUE, eval=LOCAL}
sob_cov_single <- sobol4r_design(X1 = X1_cov, X2 = X2_cov, order = 2, nboot = 50, type = "sobol")

Y <- sobol_g2_additive_noise(sob_cov_single$X)
sensitivity::tell(sob_cov_single, Y)
```

```{r g-noise-cov-plot, cache=TRUE, eval=LOCAL}
print(sob_cov_single)
Sobol4R::autoplot(sob_cov_single)
```

## Order 1 and Total via `sensitivity::sobol2007()`

```{r g-noise-cov-qoi-run, cache=TRUE, eval=LOCAL}
sob_cov_qoi <- sobol4r_qoi_indices(
  model = sobol_g2_with_covariate_noise,
  X1    = X1_cov,
  X2    = X2_cov,
  nboot = 50, 
  type = "sobol2007")
```

```{r g-noise-cov-qoi-plot, cache=TRUE, eval=LOCAL}
print(sob_cov_qoi)
Sobol4R::autoplot(sob_cov_qoi)
```

# Conclusion

This vignette shows how Sobol4R can be used to study the impact of randomness in
the model output on Sobol indices. More advanced examples, including models with
random distributional parameters, are presented in a separate vignette.
