---
title: "Concatenating cohort records"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{a05_concatanate_cohorts}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true")

knitr::opts_chunk$set(
  collapse = TRUE, 
  warning = FALSE, 
  message = FALSE,
  comment = "#>",
  eval = NOT_CRAN
)
```


```{r setup}
library(omock)
library(dplyr)
library(CohortConstructor)
library(CohortCharacteristics)
library(ggplot2)
```

For this example we'll use the Eunomia synthetic data from the [omock](https://ohdsi.github.io/omock/) package.

```{r}
cdm <- mockCdmFromDataset(datasetName = "GiBleed", source = "duckdb")
```

Let's start by creating a cohort of users of acetaminophen.

```{r}
cdm$medications <- conceptCohort(cdm = cdm, 
                                 conceptSet = list("acetaminophen" = 1127433), 
                                 name = "medications")
cohortCount(cdm$medications)
```

We can merge cohort records using the `collapseCohorts()` function in the CohortConstructor package. Through its `gap` argument, the function lets us specify the maximum number of days allowed between two cohort entries for them to be merged into a single record.
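To build intuition for what a gap does, here is a minimal sketch of gap-based merging written with plain dplyr on a toy data frame. It is not the implementation used by `collapseCohorts()`. The records, the `toy` and `toy_collapsed` objects, and the 30-day gap are all hypothetical, and the sketch assumes that the gap is measured from the latest end date seen so far to the start of the next record, and that a gap exactly equal to the threshold still merges.

```{r}
library(dplyr)

# Hypothetical records for one subject (not taken from the GiBleed data)
toy <- data.frame(
  subject_id = 1L,
  cohort_start_date = as.Date(c("2000-01-01", "2000-01-20", "2000-06-01")),
  cohort_end_date = as.Date(c("2000-01-10", "2000-01-31", "2000-06-15"))
)
gap <- 30 # hypothetical threshold, in days

toy_collapsed <- toy |>
  arrange(subject_id, cohort_start_date) |>
  group_by(subject_id) |>
  mutate(
    # latest end date among all earlier records of the same subject
    prev_end = lag(cummax(as.numeric(cohort_end_date))),
    # a record opens a new group when it starts more than `gap` days later
    new_group = is.na(prev_end) |
      as.numeric(cohort_start_date) - prev_end > gap,
    group_id = cumsum(new_group)
  ) |>
  group_by(subject_id, group_id) |>
  summarise(
    cohort_start_date = min(cohort_start_date),
    cohort_end_date = max(cohort_end_date),
    .groups = "drop"
  )

toy_collapsed
```

In this toy example the first two records are 10 days apart and are merged, while the third starts more than 30 days after the second ends and stays separate.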

Let's first define a new cohort where records within 1095 days (~ 3 years) of each other will be merged.

```{r}
cdm$medications_collapsed <- cdm$medications |>
  collapseCohorts(
    gap = 1095,
    name = "medications_collapsed"
  )
```

Let's compare how this function would change the records of a single individual.

```{r}
cdm$medications |>
  filter(subject_id == 1)
cdm$medications_collapsed |>
  filter(subject_id == 1)
```

Subject 1 initially had four records between 1971 and 1982. After specifying that records within three years of each other are to be merged, the number of records decreases to three: the record from 1980-03-15 to 1980-03-29 and the record from 1982-09-11 to 1982-10-02 are merged into a single record from 1980-03-15 to 1982-10-02.
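As a quick check of why these two records were merged, the distance between them can be computed with base R date arithmetic. This is only a sketch: it assumes the gap is counted from the end of the earlier record to the start of the later one, and the variable names are illustrative.

```{r}
# Dates of the two merged records described above
previous_end <- as.Date("1980-03-29")
next_start <- as.Date("1982-09-11")

# Days between the end of one record and the start of the next
gap_days <- as.numeric(next_start - previous_end)
gap_days

# Compare against the gap passed to collapseCohorts()
gap_days <= 1095
```

The two records are 896 days apart, which is below the 1095-day gap, so they fall within the window for merging.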

Now let's look at the cohort attrition to see how collapsing records has changed the cohort as a whole.

```{r}
summary_attrition <- summariseCohortAttrition(cdm$medications_collapsed)
tableCohortAttrition(summary_attrition)
```

