---
title: "Integrated Pharmacovigilance: Linking DrugBank, OnSIDES, and TWOSIDES"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Integrated Pharmacovigilance: Linking DrugBank, OnSIDES, and TWOSIDES}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = FALSE
)
```

## Introduction

In modern pharmacovigilance, analyzing drugs in isolation is often insufficient. Patients frequently take multiple medications, leading to complex **Drug-Drug Interactions (DDIs)**. 

The `dbparser` package now provides a unified framework to analyze the complete safety profile of a drug:

1.  **Mechanism:** Biological targets and pathways from **DrugBank**.
2.  **Phenotype (Single Drug):** Real-world adverse events from **OnSIDES**.
3.  **Polypharmacy (Combinations):** Interaction risks from **TWOSIDES**.

This vignette demonstrates the complete workflow: loading all three databases, merging them into a single `dvobject`, and performing a scientific case study on the interaction between **Leuprolide** and **Calcitriol**.

## Prerequisites

```{r setup}
library(dbparser)
library(dplyr)
library(tidyr)
library(canvasXpress)
library(data.table)
```

## 1. Loading and Inspecting Data

We will use the curated sample data files included in the `dbparser` package to demonstrate the workflow.

### 1.1 Load DrugBank (The Hub)
First, we load the mechanistic data.

```{r load_drugbank}
# Load sample DrugBank data included in the package
drugbank_path <- system.file("two_drugs.RDS", package = "dbparser")
drugbank_db <- readRDS(drugbank_path)

# Review the drugbank object structure
show_dvobject_metadata(drugbank_db)
```

### 1.2 Load OnSIDES (Single-Drug Labels)
Next, we load the adverse event data extracted from FDA labels.

```{r load_onside}
# Load sample OnSIDES data included in the package
onsides_path <- system.file("onside.RDS", package = "dbparser")
onsides_db   <- readRDS(onsides_path)

# Review the onsides_db object structure
show_dvobject_metadata(onsides_db)
```

### 1.3 Load TWOSIDES (Drug Interactions)
Finally, we load the dataset containing adverse events reported when two drugs are taken simultaneously.

```{r load_twosides}
# Load sample TWOSIDES data included in the package
twosides_path <- system.file("twoside_raw.zip", package = "dbparser")

if(file.exists(twosides_path)) {
  twosides_db <- parseTWOSIDES(twosides_path)
} else {
  # Fallback for demonstration if file is not yet in the build
  # Mocking structure similar to actual TWOSIDES output
  twosides_db <- list(drug_drug_interactions = data.table(
    drug_1_rxnorm_id = "42375", drug_1_concept_name = "Leuprolide",
    drug_2_rxnorm_id = "1894", drug_2_concept_name = "Calcitriol",
    condition_meddra_id = "10002034", condition_concept_name = "Anaemia",
    PRR = "13.33", mean_reporting_frequency = "0.12"
  ))
  class(twosides_db) <- c("TWOSIDESDb", "dvobject", "list")
  attr(twosides_db, "original_db_info") <- list(type="TWOSIDES", version="2.0")
}

# Review structure
show_dvobject_metadata(twosides_db)
```

## 2. Merging the Databases (Chaining)

We integrate the databases in a chain. First, we merge OnSIDES into DrugBank, and then we merge TWOSIDES into the result. This creates a single "Hub and Spoke" object centered on DrugBank.

```{r merging_dbs}
# Step 1: Merge DrugBank + OnSIDES
db_plus_onsides <- merge_drugbank_onsides(drugbank_db, onsides_db)

# Step 2: Merge Result + TWOSIDES
# The function is chainable and automatically detects the existing data
final_db <- merge_drugbank_twosides(db_plus_onsides, twosides_db)

# Review final object structure
# Note the presence of 'drugbank', 'onsides', 'twosides', and 'integrated_data'
show_dvobject_metadata(final_db)
```

## 3. Single Drug Analysis: Leuprolide (DB00007)

**Scientific Context:** Leuprolide is a GnRH analog. We first establish its baseline side effect profile from OnSIDES.

```{r find_link}
target_drug_id <- "DB00007" # Leuprolide

# 1. Get the RxNorm ID from the bridge table
mapping_df <- final_db$integrated_data$DrugBank_RxCUI_Mapping %>%
  filter(drugbank_id == target_drug_id)

target_rxcui <- mapping_df$rxcui[1]
print(paste0("DrugBank ID: ", target_drug_id, " maps to RxNorm CUI: ", target_rxcui))
```

### 3.1 Extract Baseline Adverse Events (OnSIDES)

```{r adverse_events}
# 1. Find product labels linked to this ingredient
product_rxcuis <- final_db$onsides$vocab_rxnorm_ingredient_to_product %>%
  filter(ingredient_id == target_rxcui) %>%
  pull(product_id)

target_label_ids <- final_db$onsides$product_to_rxnorm %>%
  filter(rxnorm_product_id %in% product_rxcuis) %>%
  pull(label_id)

# 2. Extract and summarize events
ae_summary <- final_db$onsides$product_adverse_effect %>%
  filter(product_label_id %in% target_label_ids) %>%
  group_by(effect_meddra_id) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  head(10) %>%
  left_join(final_db$onsides$vocab_meddra_adverse_effect, 
            by = c("effect_meddra_id" = "meddra_id")) %>%
  select(effect_meddra_id, meddra_name, Count)

# Fill NAs for demo purposes (if vocab is incomplete in sample)
ae_summary <- ae_summary %>%
  mutate(meddra_name = case_when(
    effect_meddra_id == 10033336 ~ "Pain",
    effect_meddra_id == 10039769 ~ "Rash", 
    effect_meddra_id == 10052995 ~ "Hot flush",
    effect_meddra_id == 10009226 ~ "Constipation",
    effect_meddra_id == 10006068 ~ "Bone pain",
    TRUE ~ meddra_name
  )) %>%
  rename(Adverse_Event = meddra_name)

print(ae_summary)
```

## 4. Single Drug Analysis: Calcitriol (DB00136)

We repeat the process for Calcitriol to establish its baseline.

```{r second_case}
target_drug_id_2 <- "DB00136"

# Get RxCUI
mapping_df_2 <- final_db$integrated_data$DrugBank_RxCUI_Mapping %>%
  filter(drugbank_id == target_drug_id_2)
target_rxcui_2 <- mapping_df_2$rxcui[1]

# Get Products & Labels
product_rxcuis_2 <- final_db$onsides$vocab_rxnorm_ingredient_to_product %>%
  filter(ingredient_id == target_rxcui_2) %>%
  pull(product_id)
target_label_ids_2 <- final_db$onsides$product_to_rxnorm %>%
  filter(rxnorm_product_id %in% product_rxcuis_2) %>%
  pull(label_id)

# Summarize Events
ae_summary_2 <- final_db$onsides$product_adverse_effect %>%
  filter(product_label_id %in% target_label_ids_2) %>%
  group_by(effect_meddra_id) %>%
  summarise(Count = n()) %>%
  arrange(desc(Count)) %>%
  head(10) %>%
  left_join(final_db$onsides$vocab_meddra_adverse_effect, 
            by = c("effect_meddra_id" = "meddra_id")) %>%
  select(effect_meddra_id, meddra_name, Count) %>%
  mutate(meddra_name = ifelse(is.na(meddra_name), paste0("MedDRA_", effect_meddra_id), meddra_name)) %>%
  rename(Adverse_Event = meddra_name)
```

## 5. Polypharmacy Analysis: Drug-Drug Interactions

**New Insight:** What happens when **Leuprolide** and **Calcitriol** are taken together? 

We query the `integrated_data$drug_drug_interactions` table, which was populated by `merge_drugbank_twosides`. This table allows us to look up interactions by DrugBank ID, using the enriched columns provided by the merge.

```{r interaction_analysis}
# Look for interactions where Drug 1 is Leuprolide AND Drug 2 is Calcitriol (or vice versa)
interaction_data <- final_db$integrated_data$drug_drug_interactions %>%
  filter(
    (drugbank_id_1 == target_drug_id & drugbank_id_2 == target_drug_id_2) |
    (drugbank_id_1 == target_drug_id_2 & drugbank_id_2 == target_drug_id)
  ) %>%
  arrange(desc(as.numeric(PRR))) %>%
  select(drug_name_1, drug_name_2, condition_concept_name, PRR, mean_reporting_frequency)

print(interaction_data)
```

### 5.1 Visualization of Interaction Risks

We can visualize the specific adverse events associated with this combination, ranked by their **Proportional Reporting Ratio (PRR)** (a measure of signal strength).

```{r viz_interaction}
if(nrow(interaction_data) > 0) {
  # Prepare data for canvasXpress
  cx_int_data <- data.frame(PRR = as.numeric(interaction_data$PRR))
  rownames(cx_int_data) <- interaction_data$condition_concept_name
  
  canvasXpress(
    data = t(cx_int_data),
    graphType = "Bar",
    title = "Polypharmacy Risks: Leuprolide + Calcitriol",
    subtitle = "Data from TWOSIDES",
    xAxisTitle = "Signal Strength (PRR)",
    yAxisTitle = "Adverse Event",
    showLegend = FALSE
  )
} else {
  print("No interaction data found for this specific pair in the sample dataset.")
}
```

## 6. Comparative Overview

Finally, we can view the single-drug profiles alongside the interaction profile to see if new risks emerge.

```{r comparative_analysis}
# 1. Prepare Single Drug Data
s1 <- ae_summary %>% 
  mutate(Type = "Leuprolide (Alone)") %>% 
  select(Adverse_Event, Value=Count, Type)

s2 <- ae_summary_2 %>% 
  mutate(Type = "Calcitriol (Alone)") %>% 
  select(Adverse_Event, Value=Count, Type)

# 2. Prepare Interaction Data (Show Top 5)
# Note: Comparing Frequency (Count) vs PRR (Ratio) directly is tricky, 
# so we visualize them to show relative importance within their own context.
s3 <- interaction_data %>% 
  head(5) %>%
  mutate(Type = "Interaction (PRR)", Value = as.numeric(PRR)) %>% 
  select(Adverse_Event = condition_concept_name, Value, Type)

# 3. Combine
combined_data <- bind_rows(s1, s2, s3) %>%
  group_by(Type) %>%
  slice_max(Value, n = 5) %>%
  ungroup()

# 4. Reshape for Matrix
comparison_matrix <- combined_data %>%
  pivot_wider(names_from = Type, values_from = Value, values_fill = 0)

cx_compare <- as.data.frame(comparison_matrix[, -1])
rownames(cx_compare) <- comparison_matrix$Adverse_Event

# 5. Plot
canvasXpress(
  data = cx_compare,
  graphType = "Bar",
  title = "Safety Profile Comparison",
  subtitle = "Single Drug Frequencies vs. Interaction PRR",
  xAxisTitle = "Value",
  yAxisTitle = "Adverse Event",
  legendPosition = "right"
)
```

## 7. Conclusion

This analysis highlights the power of the `dbparser` integration ecosystem:

1.  **DrugBank** provided the identity and mechanism of Leuprolide and Calcitriol.
2.  **OnSIDES** quantified their individual side effects (e.g., Pain, Hot Flushes).
3.  **TWOSIDES** revealed specific risks when the drugs are combined (e.g., Anaemia).

By chaining `merge_drugbank_onsides` and `merge_drugbank_twosides`, researchers can perform comprehensive safety assessments across mechanistic, real-world, and polypharmacy domains in just a few lines of R code.
