---
title: "Imperfect serological test"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Imperfect serological test}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup, output=FALSE}
library(serosv)
```

## Imperfect test

The `correct_prevalence()` function estimates the true seroprevalence when the serological test used is imperfect.

Arguments:

-   `data` the input data frame, which must have either:

    -   `age`, `pos`, `tot` columns (for aggregated data)

    -   **OR** `age`, `status` columns (for linelisting data)

    -   Users can specify the names of these columns in the input data frame using the arguments `age_col`, `pos_col`, `tot_col` or `age_col`, `status_col` respectively

-   `bayesian` whether to adjust seroprevalence using the Bayesian (`TRUE`) or frequentist (`FALSE`) approach. If set to `TRUE`, true seroprevalence is estimated using MCMC.

-   `init_se` sensitivity of the serological test (default value `0.95`)

-   `init_sp` specificity of the serological test (default value `0.8`)

-   `study_size_se` (applicable when `bayesian=TRUE`) sample size for sensitivity validation study (default value `1000`)

-   `study_size_sp` (applicable when `bayesian=TRUE`) sample size for specificity validation study (default value `1000`)

-   `chains` (applicable when `bayesian=TRUE`) number of Markov chains (default value `1`)

-   `warmup` (applicable when `bayesian=TRUE`) number of warmup iterations (default value `1000`)

-   `iter` (applicable when `bayesian=TRUE`) number of iterations (default value `2000`)
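
The two input layouts accepted by `data` can be illustrated with a small made-up data set. The sketch below uses base R only; the values and the object names `linelist` and `aggregated` are hypothetical and are not part of `serosv`.

```{r}
# Hypothetical linelisting data: one row per individual,
# status = 1 for seropositive and 0 for seronegative
linelist <- data.frame(
  age    = c(1, 1, 1, 2, 2, 3),
  status = c(0, 1, 0, 1, 1, 0)
)

# Equivalent aggregated layout: one row per age,
# with the number of positives (pos) and the number tested (tot)
pos <- tapply(linelist$status, linelist$age, sum)
tot <- tapply(linelist$status, linelist$age, length)

aggregated <- data.frame(
  age = as.numeric(names(pos)),
  pos = as.vector(pos),
  tot = as.vector(tot)
)
aggregated
```

Either layout carries the same information; the aggregated form simply counts positives and totals per age.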

The function returns a list of 2 items:

-   `info`

    -   if `bayesian = TRUE`, contains the estimated values for se, sp and corrected seroprevalence

    -   otherwise, contains the formula used to compute the corrected seroprevalence

-   `corrected_sero` a data.frame with `age`, `sero` (corrected seroprevalence) and `pos`, `tot` (adjusted based on the corrected prevalence)
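
For orientation, a frequentist correction of this kind is commonly written as the Rogan-Gladen estimator. The expression below is that standard textbook form, given here on the assumption that it matches what the package reports in `info`; the symbols $\hat{p}_{\text{apparent}}$ and $\hat{p}_{\text{true}}$ are notation introduced for this illustration.

$$
\hat{p}_{\text{true}} = \frac{\hat{p}_{\text{apparent}} + sp - 1}{se + sp - 1}
$$

Here $se$ and $sp$ are the sensitivity and specificity of the test. The estimator is only meaningful when $se + sp > 1$, and the raw value can fall outside $[0, 1]$ when the apparent prevalence is close to the false positive rate $1 - sp$.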

```{r}
# ---- estimate real prevalence using Bayesian approach ----
data <- rubella_uk_1986_1987
output <- correct_prevalence(
  data,
  warmup = 1000, iter = 4000,
  init_se = 0.9, init_sp = 0.8,
  study_size_se = 1000, study_size_sp = 3000
)

# check fitted value 
output$info[1:2, ]

# ---- estimate real prevalence using frequentist approach ----
freq_output <- correct_prevalence(data, bayesian = FALSE, init_se = 0.9, init_sp = 0.8)

# check info
freq_output$info
```
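
As a rough sanity check on the frequentist numbers, the standard Rogan-Gladen correction can be computed by hand in base R. This is a minimal sketch under stated assumptions: the helper `rogan_gladen()` and the apparent prevalence values are hypothetical, the clamping to `[0, 1]` is a choice made for this illustration, and none of it is taken from the `serosv` internals.

```{r}
# Hypothetical helper: Rogan-Gladen correction of an apparent seroprevalence
rogan_gladen <- function(apparent, se, sp) {
  corrected <- (apparent + sp - 1) / (se + sp - 1)
  # The raw estimator can leave the unit interval, so clamp it to [0, 1]
  pmin(pmax(corrected, 0), 1)
}

# Made-up apparent seroprevalence values, corrected with the same
# sensitivity and specificity as in the example above
apparent <- c(0.15, 0.40, 0.75)
rogan_gladen(apparent, se = 0.9, sp = 0.8)
```

With a specificity of 0.8, any apparent prevalence below 0.2 is consistent with false positives alone, which is why the first value is clamped to zero.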

The output can then be visualized with the `plot_corrected_prev()` function.

```{r}
# Plot output of the frequentist approach
plot_corrected_prev(freq_output)

# Plot output of the Bayesian approach
plot_corrected_prev(output)
```

To compare both correction methods in a single plot, pass the output of the second method as the optional `y` argument of `plot_corrected_prev()`.

```{r}
plot_corrected_prev(output, freq_output)

# set facet = TRUE to display the confidence or credible intervals for each method
plot_corrected_prev(output, freq_output, facet = TRUE)
```

### Fitting corrected data

**Data after seroprevalence correction**

Bayesian approach

```{r}
suppressWarnings(
  corrected_data <- farrington_model(
    output$corrected_se,
    start = list(alpha = 0.07, beta = 0.1, gamma = 0.03)
  )
)

plot(corrected_data)
```

Frequentist approach

```{r}
suppressWarnings(
  corrected_data <- farrington_model(
    freq_output$corrected_se,
    start = list(alpha = 0.07, beta = 0.1, gamma = 0.03)
  )
)

plot(corrected_data)
```

**Original data**

```{r}
suppressWarnings(
  original_data <- farrington_model(
    data,
    start = list(alpha = 0.07, beta = 0.1, gamma = 0.03)
  )
)
plot(original_data)
```