---
title: "Workflow with fmcmc"
author: "George G. Vega Yon"
date: "April 19th, 2019"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Workflow with fmcmc}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(fig.dim=c(9, 9), out.width=600, out.height=600)
```

# Motivating example

We start by loading the dataset that the `mcmc` package includes. We will use
the `logit` data set to obtain a posterior distribution of the model parameters
using the `MCMC` function.

```{r loading-data, include = TRUE}
library(fmcmc)
data(logit, package = "mcmc")
out <- glm(y ~ x1 + x2 + x3 + x4, data = logit, family = binomial, x = TRUE)
beta.init <- as.numeric(coefficients(out))
```

To use the Metropolis-Hastings MCMC algorithm, the function should be
(in principle) the log unnormalized posterior. The following block of code,
extracted from the `mcmc` package vignette "MCMC Package Example,"
creates the function that we will be using:

```{r log-unnorm-posterior}
lupost_factory <- function(x, y) function(beta) {
    eta <- as.numeric(x %*% beta)
    logp <- ifelse(eta < 0, eta - log1p(exp(eta)), - log1p(exp(- eta)))
    logq <- ifelse(eta < 0, - log1p(exp(eta)), - eta - log1p(exp(- eta)))
    logl <- sum(logp[y == 1]) + sum(logq[y == 0])
    return(logl - sum(beta^2) / 8)
}

lupost <- lupost_factory(out$x, out$y)
```

Let's give it the first try. In this case, we will use the beta estimates from the
estimated GLM model as a starting point for the algorithm, and we will ask it to
sample 1e4 points from the posterior distribution (`nsteps`).

```{r estimation1.0}
# to get reproducible results
set.seed(42)
out <- MCMC(
  initial = beta.init,
  fun     = lupost,
  nsteps  = 1e3
)
```

Since the resulting object is of class `mcmc` (from the `coda` R package), we
can use all the functions included in `coda` for model diagnostics:

```{r post-estimation}
library(coda)
plot(out[,1:3])
```

So this chain has very poor mixing, so let's try again by using a smaller scale
for the normal kernel proposal moving it from 1 (the default value) to .2:

```{r estimation1.1}
# to get reproducible results
set.seed(42)
out <- MCMC(
  initial = beta.init,
  fun     = lupost,
  nsteps  = 1e3,
  kernel  = kernel_normal(scale = .2) 
)
```

The `kernel_normal`--the default kernel in the `MCMC` function--returns
an object of class `fmcmc_kernel`. In principle, it consists of a list of two
functions that are used by the `MCMC` routine: `proposal`, the proposal kernel
function, and `logratio`, the function that returns the log of the
Metropolis-Hastings ratio. We will talk more about `fmcmc_kernel` objects later.
Now, let's look at the first three variables of our model:

```{r}
plot(out[,1:3])
```

Better. Now, ideally we should only be using observations from the stationary
distribution. Let's give it another try checking for convergence every 1,000
steps using the `convergence_geweke`:

```{r estimation1.2}
# to get reproducible results
set.seed(42)
out <- MCMC(
  initial = beta.init,
  fun     = lupost,
  nsteps  = 1e4,
  kernel  = kernel_normal(scale = .2),
  conv_checker = convergence_geweke(200)
)
```


A bit better. As announced by `MCMC`, the `convergence_geweke` checker suggests
that the chain reached a stationary state.
With this in hand, we can now rerun the algorithm such that we start from the
last couple of steps of the chain, this time, without convergence monitoring as
it is no longer necessary.

We will increase the number of steps (sample size), use two chains using parallel
computing, and add some thinning to reduce autocorrelation:

```{r estimation2.0}
# Now, we change the seed, so we get a different stream of
# pseudo random numbers
set.seed(112) 

out_final <- MCMC(
  initial   = out,                       # Automagically takes the last 2 points
  fun       = lupost, 
  nsteps    = 5e4,                       # Increasing the sample size
  kernel    = kernel_normal(scale = .2),
  thin      = 10,
  nchains   = 2L,                        # Running parallel chains
  multicore = TRUE                       # in parallel.
  )
```

Notice that, instead of specifying the starting points for each
chain, we passed the `out` object to `initial`. By default, if
`initial` is of class `mcmc`, `MCMC` will take the last `nchains` points from
the chain as starting point for the new sequence. If `initial` is of class
`mcmc.list`, the number of chains in `initial` must match the `nchains`
parameter. We now see that the output posterior distribution appears to be
stationary

```{r final-results}
plot(out_final[, 1:3])
summary(out_final[, 1:3])
```


# Reusing fmcmc_kernel objects

`fmcmc_kernel` objects are environments that are passed to the `MCMC` function.
While the `MCMC` function only returns the `mcmc` class object (as defined in
the `coda` package), users can exploit the fact that the kernel objects are
environments to reuse them or inspect them once the `MCMC` function returns.
For example, `fmcmc_kernel` objects can be useful with adaptive
kernels as users can review the covariance structure or other
components.

To illustrate this, let's re-do the MCMC chain of the previous example but using
an adaptive kernel instead, in particular, Haario's 2010 adaptive metropolis.

```{r kernel-object-adapt}
khaario <- kernel_adapt(freq = 1, warmup = 500)
```

The MCMC function will update the kernel at every step (`freq = 1`), and the
adaptation will start from step 500 (`warmup = 500`). We can see that some of
its components haven't been initialized or have a default starting value before
the call of the `MCMC` function:

```{r haario-first-inspect}
# Number of iterations (absolute count, starts in 0)
khaario$abs_iter

# Variance covariance matrix (is empty... for now)
khaario$Sigma
```


Let's see how it works:

```{r haario-first-run}
set.seed(12) 

out_haario_1 <- MCMC(
  initial   = out,                       
  fun       = lupost, 
  nsteps    = 1000,    # We will only run the chain for 100 steps                    
  kernel    = khaario, # We passed the predefined kernel
  thin      = 1,       # No thining here
  nchains   = 1L,      # A single chain
  multicore = FALSE    # Running in serial
  )
```

Let's inspect the output and mark when the adaptation starts:

```{r haario-first-run-plots}
traceplot(out_haario_1[,1], main = "Traceplot of the first parameter")
abline(v = 500, col = "red", lwd = 2, lty=2)
```

If we look at the `khaario` kernel, the `fmcmc_kernel` object, we can see that
things changed from the first time we ran it

```{r haario-second-inspect}
# Number of iterations (absolute count, the counts equal the number of steps)
khaario$abs_iter

# Variance covariance matrix (now is not empty)
(Sigma1 <- khaario$Sigma)
```

If we re-run the chain, using as starting point the last step of the first run,
we can also continue using the kernel object:

```{r haario-second-run}
out_haario_2 <- MCMC(
  initial   = out_haario_1,
  fun       = lupost, 
  nsteps    = 2000,    # We will only run the chain for 2000 steps now
  kernel    = khaario, # Same as before, same kernel.
  thin      = 1,       
  nchains   = 1L,      
  multicore = FALSE,
  seed      = 555      # We can also specify the seed in the MCMC function    
  )
```

Let's see again how does everything looks like:

```{r haario-second-run-plots}
traceplot(out_haario_2[,1], main = "Traceplot of the first parameter")
abline(v = 500, col = "red", lwd = 2, lty=2)
```

As shown in the plot, since the warmup period already passed for the kernel object,
the adaptation process is happening at every step, so we don't see a big break at
step 500 as before. Let's see the counts and the covariance matrix and compare
it with the previous one:

```{r haario-third-inspect}
# Number of iterations (absolute count, the counts equal the number of steps)
# This will have 1000 (first run) + 2000 (second run) steps
khaario$abs_iter

# Variance covariance matrix (now is not empty)
(Sigma2 <- khaario$Sigma)

# How different are these?
Sigma1 - Sigma2
```

Things have changed since the last time we used the kernel, as expected. Kernel
objects in the `fmcmc` package can also be used with multiple chains and in
parallel. The `MCMC` function is smart enough to create independent copies of
`fmcmc_kernel` objects when running multiple chains, and keep the original
kernel objects up-to-date even when using multiple cores to run `MCMC`. For
more technical details on how `fmcmc_kernel` objects work see the manual
`?fmcmc_kernel` or the vignette "User-defined kernels" included in the package
`vignette("user-defined-kernels", package = "fmcmc")`.

# Accessing other elements of the chain

In some situations, you may want to access the computed unnormalized log-posterior
probabilities, the states proposed by the kernel, or other process components.
In those cases, the functions with the prefix `get_*` can help you.

Starting version 0.5-0, we replaced the family of functions `last_*` with
`get_*`; a re-design of this "memory" component that gives users access
to data generated during the Markov process. After each run of the `MCMC` function,
information regarding the last execution is stored in the environment
`MCMC_OUTPUT`.

If you wanted to look at the logposterior of the last call and proposed states,
you can do the following:

```{r get-logpost-draws}
plot(get_logpost(), type="l")

# Pretty figure showing proposed and accepted
plot(
  get_draws()[,1:2], pch = 20, col = "gray",
  main = "Haario's second run"
  )
points(out_haario_2[,1:2], col = "red", pch = 20)
legend(
  "topleft", legend = c("proposed", "accepted"),
  col = c("gray", "red"),
  pch = 20,
  bty = "n"
)
```

The `MCMC_OUTPUT` environment also contains the arguments passed to `MCMC()`:

```{r get-arguments}
get_initial()
get_fun()
get_kernel()
```