
---
title: "summaryLevels"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Summary table}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  markdown: 
    wrap: 72
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = TRUE,
  eval = TRUE,
  warning=FALSE,
  fig.height = 6,
  fig.width = 9,
  fig.align='center'
)
```

```{r color-function, echo = FALSE}
colorize <- function(text, color) {
  if (knitr::is_latex_output()) {
    sprintf("\\textcolor{%s}{%s}", color, text)
  } else if (knitr::is_html_output()) {
    sprintf("<span style='color: %s;'>%s</span>", color, text)
  } else text
}

```

```{r setup, echo = FALSE, message = FALSE, warning = FALSE}
library(dplyr)
library(tidyverse)
library(gtsummary)
library(summarySCI)
library(flextable)
```

The function `summaryLevels()` produces a table of descriptive
statistics for the levels of a categorical variable when each level is
stored as a binary variable in its own column. It is largely based on
`gtsummary::tbl_summary()`. The difference compared to `tbl_summary()`
is:

-   It collapses binary variables from different columns into the levels of a single factor and creates a summary table for that factor.

## Setup and data

To demonstrate the function, we create a small dataset. The factor of
interest is 'Site of progression'. For each site of progression,
presence or absence is coded in a separate column. In addition, there
is a grouping variable called 'arm'.

```{r, message = FALSE}
# One row per patient; each site of progression has its own
# "present"/"absent" column.
data <- data.frame(
  upn   = as.character(1:10),  # patient identifier (character)
  arm   = rep(c("A", "B"), each = 5),
  liver = rep(c("absent", "present", "absent", "present", "absent"), times = 2),
  lung  = c("absent", "absent", "present", rep("absent", 7)),
  brain = c("present", "absent", rep("present", 8))
)
```

## Basic table

Now we use `summarySCI::summaryLevels()` to collapse the columns 'liver', 'lung' and 'brain' into a single factor labelled 'Site of progression'. The columns containing the factor levels are specified with the `vars` argument. In these columns, the presence of a site of progression is coded as 'present', which is the value passed to the `levels` argument.

```{r}
summarySCI::summaryLevels(data=data,
                      vars = c("liver", "lung", "brain"),
                      label = "Site of progression",
                      levels = "present")
```

The footnote emphasizes that a patient may have more than one site of progression. Therefore, the column percentages do not necessarily add up to 100%.
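
To see why the percentages can total more than 100%, they can be recomputed by hand. The following is a minimal base R sketch, not the internals of `summaryLevels()`. The object `sites` is a local copy of the three site columns, created only for this illustration.

```{r}
# Local copy of the site columns from the example data (illustration only).
sites <- data.frame(
  liver = rep(c("absent", "present", "absent", "present", "absent"), times = 2),
  lung  = c("absent", "absent", "present", rep("absent", 7)),
  brain = c("present", "absent", rep("present", 8))
)

# Percentage of patients with each site of progression.
pct_present <- colMeans(sites == "present") * 100
pct_present

# Number of sites per patient; patients with more than one site are
# counted in several rows of the summary table.
sites_per_patient <- rowSums(sites == "present")
table(sites_per_patient)

# Sum of the per-site percentages.
sum(pct_present)
```

Because several patients have two sites of progression, each of them contributes to two levels, and the per-site percentages sum to more than 100%.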

## By group

We can stratify the table by groups via the `group` argument. The overall column can still be shown, if desired, by setting `overall = TRUE`.

```{r}
summarySCI::summaryLevels(data=data,
                      vars = c("liver", "lung", "brain"),
                      group = "arm",
                      label = "Site of progression",
                      levels= "present")
```

```{r}
summarySCI::summaryLevels(data=data,
                      vars = c("liver", "lung", "brain"),
                      group = "arm",
                      label = "Site of progression",
                      levels= "present",
                      overall = TRUE)
```

## Performing a statistical test

A statistical test can be performed for each level separately by setting the argument `test` to `TRUE`. The type of test can be changed with the `test_cat` argument. Options include `chisq.test`, `chisq.test.no.correct`, and `fisher.test` (the default).

```{r}
summarySCI::summaryLevels(data=data,
                      vars = c("liver", "lung", "brain"),
                      group = "arm",
                      label = "Site of progression",
                      test = TRUE,
                      levels= "present")
```

```{r}
summarySCI::summaryLevels(data=data,
                      vars = c("liver", "lung", "brain"),
                      group = "arm",
                      levels= "present",
                      label = "Site of progression",
                      test = TRUE,
                      test_cat = "chisq.test")
```
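
What "a test for each level separately" amounts to can be sketched in base R: one 2x2 table of arm by present/absent per site, and one test per table. This is only an illustration under that assumption; whether `summaryLevels()` builds its tables in exactly this way is not verified here. The object `progression` is a local copy of the example data.

```{r}
# Local copy of the example data (illustration only).
progression <- data.frame(
  arm   = rep(c("A", "B"), each = 5),
  liver = rep(c("absent", "present", "absent", "present", "absent"), times = 2),
  lung  = c("absent", "absent", "present", rep("absent", 7)),
  brain = c("present", "absent", rep("present", 8))
)
site_cols <- c("liver", "lung", "brain")

# One 2x2 table (arm by present/absent) and one Fisher's exact test per site.
p_values <- sapply(site_cols, function(site) {
  tab <- table(progression$arm, progression[[site]])
  fisher.test(tab)$p.value
})
p_values
```

With only five patients per arm and nearly identical counts in both arms, none of the sites shows a difference between arms in this sketch.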

## Add a confidence interval

A confidence interval can be added by setting `ci = TRUE`. The confidence level can be adjusted with `conf_level`, and the type of confidence interval can be chosen with the `ci_cat` argument.

```{r}
summarySCI::summaryLevels(data=data,
                      vars = c("liver", "lung", "brain"),
                      group = "arm",
                      label = "Site of progression",
                      levels= "present",
                      test = TRUE,
                      test_cat = "fisher.test",
                      ci=TRUE,
                      conf_level = 0.9,
                      overall = FALSE)
```
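
For orientation, a confidence interval for a single proportion can also be computed in base R. The sketch below uses the overall liver counts from the example data (4 of 10 patients) and two common interval types: the exact (Clopper-Pearson) interval from `binom.test()` and the Wilson score interval from `prop.test(correct = FALSE)`. Which interval types `ci_cat` accepts, and which one `summaryLevels()` uses by default, is not covered here, so these numbers need not match the table above.

```{r}
n_present <- 4   # patients with liver progression (overall)
n_total   <- 10  # all patients

# Exact (Clopper-Pearson) 90% confidence interval.
exact_ci <- binom.test(n_present, n_total, conf.level = 0.9)$conf.int

# Wilson score 90% confidence interval (no continuity correction).
wilson_ci <- prop.test(n_present, n_total, conf.level = 0.9,
                       correct = FALSE)$conf.int

# Both intervals in percent, rounded to one decimal place.
round(100 * rbind(exact = exact_ci, wilson = wilson_ci), 1)
```

With a sample this small, the choice of interval type changes the limits noticeably, which is why the interval type is worth setting explicitly.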


## Further steps

-   Make it possible to stack tables.
