---
title: "Introduction to tern.gee"
date: "`r Sys.Date()`"
output:
    rmarkdown::html_document:
        theme: "spacelab"
        highlight: "kate"
        toc: true
        toc_float: true
vignette: >
    %\VignetteIndexEntry{Introduction to tern.gee}
    %\VignetteEngine{knitr::rmarkdown}
    %\VignetteEncoding{UTF-8}
editor_options:
    markdown: 
        wrap: 72
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

Generalized Estimating Equations (GEEs) are mainly used for modeling
longitudinal binary or count endpoints from clinical trials. Within this
package, a GEE is used to estimate the parameters of a generalized
linear model that includes as fixed effects the variables: treatment
arm, categorical visit, and other covariates for adjustment (e.g. age,
sex, race). The covariance structure of the residuals can take on
different forms. Often, an unstructured (i.e. saturated
parameterization) covariance matrix is assumed which can be represented
by random effects in the model.

This vignette shows the general purpose and syntax of the `tern.gee` R
package which provides an interface for GEEs within the `tern`
framework. This package builds upon some of the GEE functionality
included in the `geepack` and `geeasy` R packages. Within this package,
we have implemented GEEs in R in such a way that they can easily be
embedded into a `shiny` application. See
`teal.modules.clinical::tm_a_gee()` and the [`teal.modules.clinical`
package](https://insightsengineering.github.io/teal.modules.clinical/)
for more details about using this code inside a `shiny` application.

------------------------------------------------------------------------

## Example

Here we will demonstrate how the `tern.gee` package functionality can be
used to fit a GEE model and tabulate its output.

### Setup

Our sample dataset, `fev_data`, is available in the `tern.gee` package
and consists of seven variables: subject ID (`USUBJID`), visit number
(`AVISIT`), treatment (`ARMCD` = TRT or PBO), 3-category `RACE`, `SEX`,
FEV1 at baseline (%) (`FEV1_BL`), and FEV1 at study visits (%) (`FEV1`).
Additionally we create an arbitrary binary variable `FEV1_BINARY` for
our analysis which takes a value of 1 where `FEV1 > 30` and 0 otherwise.
FEV1 (forced expired volume in one second) is a measure of how quickly
the lungs can be emptied. Low levels of FEV1 may indicate chronic
obstructive pulmonary disease (COPD). The scientific question at hand is
whether treatment leads to an increase in FEV1 over time after adjusting
for baseline covariates.

```{r, message=FALSE}
library(tern.gee)
fev_data$FEV1_BINARY <- as.integer(fev_data$FEV1 > 30)
head(fev_data)
```

### Model Fitting

Fitting a GEE model is easy when you use `tern.gee`. By default, the
model fitting function `fit_gee()` assumes unstructured correlation and
proportional weights when calculating LS means, and fits a logistic
regression model. Currently only logistic regression has been
implemented as an available regression model when using `fit_gee()`. In
future the package will be extended to include other models such as
Poisson regression, etc. as alternative options.

```{r}
fev_fit <- fit_gee(
  vars = list(
    response = "FEV1_BINARY",
    covariates = c("RACE", "SEX", "FEV1_BL"),
    arm = "ARMCD",
    id = "USUBJID",
    visit = "AVISIT"
  ),
  data = fev_data
)
fev_fit
```

The resulting object consists of many pieces of information pertaining
to the model such as the estimated coefficients, correlation parameters,
etc. Additionally, the `lsmeans()` function from `tern.gee` can be used
to extract the least squares means from any GEE model created using
`fit_gee()`.

```{r}
fev_lsmeans <- lsmeans(fev_fit, data = fev_data)
fev_lsmeans
```

Based on the output, there is evidence to support that treatment leads
to an increase in FEV1 over placebo. The GEE model can be refined by
using different correlation structures and weighting schemes.

### Tabulation

After fitting a GEE model and extracting the LS means you may want to
display your results in a table. The `tern.gee` package contains
functionality to summarize the results of a `lsmeans()` object in an
`rtable` structure, using additional functions from the [`rtables`
package](https://insightsengineering.github.io/rtables/).

```{r}
fev_counts <- fev_data %>%
  dplyr::select(USUBJID, ARMCD) %>%
  unique()

fev_gee_table <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ARMCD", ref_group = "PBO") %>%
  summarize_gee_logistic() %>%
  build_table(fev_lsmeans, alt_counts_df = fev_counts)

fev_gee_table
```

First we create a table `fev_counts` to get the number of unique
subjects receiving each treatment. These counts are displayed in the
header of the table under each of the column names by specifying
`show_colcounts = TRUE` when initializing the table via the
`basic_table()` function. The table is split by arm (`ARMCD`), with
`PBO` specified as the reference group to compare the `TRT` group to.
Then the `summarize_gee_logistic()` function from `tern.gee` is applied.
Finally, the `build_table()` function builds the `rtable` using our LS
means dataset with `fev_counts` providing the counts of unique subjects
in each arm.