---
title: "Manipulating Discrete Joint Distributions"
author: "Robin Evans"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Manipulating Discrete Joint Distributions}
  %\VignetteEngine{knitr::rmarkdown}
  %\usepackage[utf8]{inputenc}
---

### Marginal and Conditional Distributions

```{r message=FALSE}
library(rje)
```

First let's generate a joint probability distribution for a 
$2 \times 2 \times 2 \times 2$-table. 
```{r}
set.seed(123)
p = rprobdist(2, 4)
```
We can easily calculate the marginal distribution for the first
two variables:
```{r}
marginTable(p, 1:2)
```
Note that the function `base::margin.table()` performs the same 
function as `marginTable()`, but is not as fast.  Output is 
ordered according to how the variables are entered into the function:
```{r}
marginTable(p, 2:1)
```
but this can be over-ridden by setting the argument `order=FALSE`.

We can also obtain conditional distributions:
```{r}
conditionTable(p, 3, 1)
```
`conditionTable()` orders output with the 'free' variables first
(as ordered in the argument `variables`) followed by the conditioning
variables.  Sometimes it's useful to keep a conditional (or marginal) 
distribution in the same form as the original table, even for variables 
which are removed
```{r}
conditionTable2(p, 3, 1)
```

### Interventions

In causal inference it is common to want to know what happens if we 
*intervene* on a variable under a certain causal ordering.  This is 
effectively just knowing about a joint distribution after dividing by
a particular conditional distribution.  
```{r}
p_int = interventionTable(p, 3, 1:2)
## check this is p(1,2) * p(4 | 1, 2, 3)
p_int2 = conditionTable2(p, 1:2, c())*conditionTable2(p, 4, 1:3)
all.equal(p_int, p_int2)
```


### Multiple distributions

```{r}
#rprobMat(100, dim=2, d=4)
```

### Reconstructing Joint Distributions

When dealing with margins of multivariate distributions, it can be
useful to be able to repeat probabilities to match the pattern of a
joint distribution.  In particular if we are given various conditional
distributions (say from a Bayesian network model), we may wish to
multiply them together to obtain the joint distribution.

For example, the model in which $X_2$ is independent of $X_3$ given
$X_1$ might be stored as the conditional probability tables $P(X_1)$,
$P(X_2 | X_1)$ and $P(X_3 | X_1)$.  In order to reconstruct the joint
distribution $P(X_1,X_2,X_3)$, one needs to multiply
$$
P(X_1=x_1,X_2=x_2,X_3=x_3)
= P(X_1=x_1) \cdot P(X_2 = x_2 | X_1=x_1) \cdot P(X_3 = x_3 | X_1=x_1),
$$
so that the values of $x_1,x_2,x_3$ match.

To use R's vectorization for this we must turn the 
probability tables into vectors indexed by
$(x_1,x_2,x_3)$, regardless of which variables are actually represented
in the table; if a variable is not represented then values will be
repeated.  The indexing should be in reverse lexicographical order
(i.e. first index changes fastest: 000, 100, 010, 110, ..., 111),
which is the way arrays are stored in R.  

For example, if $X_1,X_2,X_3$ are all binary (i.e.\ take values in
$\{0,1\}$) then we'd transform the table of $X_3 | X_1$ into
$$
P(X_3 = 0 | X_1 = 0), \, P(X_3 = 0 | X_1 = 1), P(X_3 = 0 | X_1 = 0), \, P(X_3 = 0 | X_1 = 1)\\
P(X_3 = 1 | X_1 = 0), \, P(X_3 = 1 | X_1 = 1), P(X_3 = 1 | X_1 = 0), \, P(X_3 = 1 | X_1 = 1).
$$
Now, suppose we already have a vector for $P(X_3 = x_3 | X_1 = x_1)$
indexed by $(x_1, x_3)$ in reverse lexicographical order:
$$
P(X_3 = 0 | X_1 = 0), \, P(X_3 = 0 | X_1 = 1), P(X_3 = 1 | X_1 = 0), \, P(X_3 = 1 | X_1 = 1),
$$
we need the first and second entries repeated, followed by the third
and fourth entries:
```{r}
patternRepeat0(c(1,3), c(2,2,2))
```

`patternRepeat0()` requires us only to specify the elements present
and the dimension of the full distribution.  The existing order of the
distribution is assumed to be reverse lexicographic, regardless of the
order given in the first argument, but this can be over-ridden.

```{r}
patternRepeat0(c(3,1), c(2,2,2))
patternRepeat0(c(3,1), c(2,2,2), keep.order=TRUE)
```

Another way to think about this is that if we take the possible
indices for a 3 dimensional array and match them to the indices of 
just the first and third dimensions, `patternRepeat0()` tells us 
which point should be matched.

#### Example

Let's generate some conditional probability tables.
```{r}
set.seed(134)

p1 = c(rdirichlet(1,c(1,1)))
p2.1 = c(rdirichlet(2,c(1,1)))
p3.1 = c(rdirichlet(2,c(1,1)))

p12 = p1*p2.1
## get joint distribution
p123 = p12*p3.1[patternRepeat0(c(1,3), c(2,2,2))]

## put into array to verify this has correct
## conditional distribution
dim(p123) = c(2,2,2)
conditionTable(p123, 3, 1)

## can also get conditional distribution indexed by all variables
p3.1[patternRepeat0(c(1,3), c(2,2,2))]
c(conditionTable2(p123, 3, 1))
```