---
title: "Bayesian Mediation Analysis in R"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Bayesian Mediation Analysis in R}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

Perform mediation analysis in the presence of high-dimensional
mediators based on the potential outcome framework. Bayesian Mediation
Analysis (BAMA), developed by Song et al (2019) and
Song et al (2020),
relies on two Bayesian sparse linear mixed models to simultaneously analyze
a relatively large number of mediators for a continuous exposure and outcome
assuming a small number of mediators are truly active. This sparsity
assumption also allows the extension of univariate mediator analysis by
casting the identification of active mediators as a variable selection
problem and applying Bayesian methods with continuous shrinkage priors on
the effects.

## Installation
You can install `bama` from CRAN
```{r, eval = FALSE}
install.packages("bama")
```

or from Github via `devtools`
```{r, eval = FALSE}
# install.packages(devtools)
devtools::install_github("umich-cphds/bama", built_opts = c())
```
`bama` requires the R packages `Rcpp` and `RcppArmadillo`, so you may want to
install / update them before downloading. If you decide to install `bama` from
source (eg github), you will need a C++ compiler that supports C++11. On Windows
this can accomplished by installing
[Rtools](https://cran.r-project.org/bin/windows/Rtools/), and
[Xcode](https://apps.apple.com/us/app/xcode/id497799835?mt=12) on MacOS.

The Github version may contain new features or bug fixes not yet present on
CRAN, so if you are experiencing issues, you may want to try the Github
version of the package.
## Example problem
`bama` contains a semi-synthetic example data set, `bama.data` that is used in
this example. `bama.data` contains a continuous response `y` and a continuous
exposure `a` that is mediated by 100 mediators, `m[1:100]`.

```{r, eval = FALSE}
library(bama)
# print just the first 10 columns
head(bama.data[,1:10])
```

The mediators have an internal correlation structure that is based off the
covariance matrix from the Multi-Ethnic Study of Atherosclerosis (MESA) data.
However, `bama` does not model internal correlation between mediators.
Instead, `bama` employs continuous Bayesian shrinkage priors to select mediators
and assumes that all the potential mediators contribute small effects
in mediating the exposure-outcome relationship, but only a small proportion of
mediators exhibit large effects.

We use no adjustment covariates in this example, so we just include the
intercept. Also, in a real world situation, it may be beneficial to normalize
the input data.

```{r, eval = FALSE}
Y <- bama.data$y
A <- bama.data$a

# grab the mediators from the example data.frame
M <- as.matrix(bama.data[, paste0("m", 1:100)], nrow(bama.data))

# We just include the intercept term in this example as we have no covariates
C1 <- matrix(1, 1000, 1)
C2 <- matrix(1, 1000, 1)
beta.m  <- rep(0, 100)
alpha.a <- rep(0, 100)

out <- bama(Y = Y, A = A, M = M, C1 = C1, C2 = C2, method = "BSLMM", seed = 1234,
            burnin = 1000, ndraws = 1100, weights = NULL, inits = NULL, 
            control = list(k = 2, lm0 = 1e-04, lm1 = 1, l = 1))

# The package includes a function to summarise output from 'bama'
summary <- summary(out)
head(summary)

# Product Threshold Gaussian 
ptgmod = bama(Y = Y, A = A, M = M, C1 = C1, C2 = C2, method = "PTG", seed = 1234,
              burnin = 1000, ndraws = 1100, weights = NULL, inits = NULL, 
              control = list(lambda0 = 0.04, lambda1 = 0.2, lambda2 = 0.2))

mean(ptgmod$beta.a)
apply(ptgmod$beta.m, 2, mean)
apply(ptgmod$alpha.a, 2, mean)
apply(ptgmod$betam_member, 2, mean)
apply(ptgmod$alphaa_member, 2, mean)

# Gaussian Mixture Model
gmmmod = bama(Y = Y, A = A, M = M, C1 = C1, C2 = C2, method = "GMM", seed = 1234,
              burnin = 1000, ndraws = 1100, weights = NULL, inits = NULL, 
              control = list(phi0 = 0.01, phi1 = 0.01))

mean(gmmmod$beta.a)
apply(gmmmod$beta.m, 2, mean)
apply(gmmmod$alpha.a, 2, mean)

mean(gmmmod$sigma.sq.a)
mean(gmmmod$sigma.sq.e)
mean(gmmmod$sigma.sq.g)
```

## Reference
Song, Y, Zhou, X, Zhang, M, et al. Bayesian shrinkage estimation of high
dimensional causal mediation effects in omics studies. Biometrics. 2019;
1-11.
