---
title: "Getting Started with SingleCellComplexHeatMap"
author: "XueCheng"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with SingleCellComplexHeatMap}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 12,
  fig.height = 8,
  warning = FALSE,
  message = FALSE
)
```

# Introduction

The `SingleCellComplexHeatMap` package provides a powerful and flexible way to visualize single-cell RNA-seq data using complex heatmaps that simultaneously display both gene expression levels (as color intensity) and expression percentages (as circle sizes). This dual-information approach allows for comprehensive visualization of expression patterns across different cell types and conditions.

## Key Features

- **Dual Information Display**: Shows both expression intensity and percentage in a single visualization
- **Gene Grouping**: Organize genes by functional categories with color-coded annotations
- **Time Course Support**: Handle time series data with proper ordering and visualization
- **Cell Type Annotations**: Annotate and group different cell types
- **Flexible Display Options**: Show/hide different annotation layers
- **Highly Customizable**: Extensive parameters for colors, sizes, fonts, and layouts
- **Seurat Integration**: Direct integration with Seurat objects

# Installation

```{r eval=FALSE}
# Install from GitHub
devtools::install_github("FanXuRong/SingleCellComplexHeatMap")
```

# Quick Start

## Loading Libraries and Data

```{r setup}
library(SingleCellComplexHeatMap)
library(Seurat)
library(dplyr)
# Load optional color packages for testing
if (requireNamespace("ggsci", quietly = TRUE)) {
  library(ggsci)
}
if (requireNamespace("viridis", quietly = TRUE)) {
  library(viridis)
}

# For this vignette, we'll use the built-in pbmc_small dataset
data("pbmc_small", package = "SeuratObject")
seurat_obj <- pbmc_small

# Add example metadata for demonstration
set.seed(123)
seurat_obj$timepoint <- sample(c("Mock", "6hpi", "24hpi"), ncol(seurat_obj), replace = TRUE)
seurat_obj$celltype <- sample(c("T_cell", "B_cell", "Monocyte"), ncol(seurat_obj), replace = TRUE)
seurat_obj$group <- paste(seurat_obj$timepoint, seurat_obj$celltype, sep = "_")

# Check the structure
head(seurat_obj@meta.data)
```

## Basic Usage

The simplest way to create a complex heatmap is to provide a Seurat object and a list of features:

```{r basic_example, fig.width=12, fig.height=8}
# Define genes to visualize
features <- c("CD3D", "CD3E", "CD8A", "IL32", "CD79A")

# Create basic heatmap
heatmap_basic <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  group_by = "group"
)
```

## Advanced Usage with Gene Grouping

For more complex analyses, you can group genes by functional categories:

```{r advanced_example, fig.width=12, fig.height=8}
# Define gene groups
gene_groups <- list(
  "T_cell_markers" = c("CD3D", "CD3E", "CD8A", "IL32"),
  "B_cell_markers" = c("CD79A", "CD79B", "MS4A1"),
  "Activation_markers" = c("GZMK", "CCL5")
)

# Get all genes from groups
all_genes <- c("CD3D", "CD3E", "CD8A", "IL32","CD79A", "CD79B", "MS4A1","GZMK", "CCL5")

# Create advanced heatmap with gene grouping
heatmap_advanced <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = all_genes,
  gene_classification = gene_groups,
  group_by = "group",
  time_points_order = c("Mock", "6hpi", "24hpi"),
  cell_types_order = c("T_cell", "B_cell", "Monocyte"),
  color_range = c(-2, 0, 2),
  color_palette = c("navy", "white", "firebrick"),
  max_circle_size = 3,
  split_by = "time"
)
```

# Understanding the Visualization

## Dual Information Display

The complex heatmap displays two types of information simultaneously:

1. **Color Intensity**: Represents the scaled expression level of each gene
2. **Circle Size**: Represents the percentage of cells expressing each gene

This dual approach allows you to distinguish between genes that are:
- Highly expressed in few cells (small intense circles)
- Moderately expressed in many cells (large moderate circles)
- Highly expressed in many cells (large intense circles)

## Data Preparation Requirements

For optimal functionality, your group identifiers should follow the format `"timepoint_celltype"`:

```{r data_prep}
# Example of proper data preparation
seurat_obj@meta.data <- seurat_obj@meta.data %>%
  mutate(
    # Create combined group for time course + cell type analysis
    time_celltype = paste(timepoint, celltype, sep = "_"),
    
    # Or create other combinations as needed
    cluster_time = paste(RNA_snn_res.0.8, timepoint, sep = "_")
  )

head(seurat_obj@meta.data[, c("timepoint", "celltype", "time_celltype","cluster_time")])

```

# Customization Options

## Custom Annotation Titles

Here, we test the ability to change the default titles for the annotation tracks.

```{r test_custom_titles, fig.width=12, fig.height=8}
heatmap <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  gene_classification = gene_groups,
  group_by = "group",
  time_points_order = c("Mock", "6hpi", "24hpi"),
  
  # NEW: Custom annotation titles
  gene_group_title = "Gene Function",
  time_point_title = "Time Point", 
  cell_type_title = "Cell Type",
  
  split_by = "time"
)
```

## Color Schemes

You can customize colors for different components:

```{r color_customization, fig.width=12, fig.height=8}
# Custom color schemes
heatmap_colors <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  gene_classification = gene_groups,
  group_by = "group",
  color_range = c(-1.5, 0, 1.5,3),  # 4-point gradient
  color_palette = c("darkblue", "blue", "white", "red"), # 4 colors
  gene_color_palette = "Spectral",
  time_color_palette = "Set2",
  celltype_color_palette = "Pastel1"
)
```

### Using `ggsci` Palettes

```{r test_ggsci_colors, fig.width=12, fig.height=8}
if (requireNamespace("ggsci", quietly = TRUE)) {
  heatmap_colors <- create_single_cell_complex_heatmap(
    seurat_object = seurat_obj,
    features = features,
    gene_classification = gene_groups,
    group_by = "group",
    time_points_order = c("Mock", "6hpi", "24hpi"),
    
    # NEW: Using ggsci color vectors
    gene_color_palette = pal_npg()(3),
    time_color_palette = pal_lancet()(3),
    celltype_color_palette = pal_jama()(4),
    
    # Custom expression heatmap colors
    color_range = c(-2, 0, 2),
    color_palette = c("#2166AC", "#F7F7F7", "#B2182B")
  )
} 
```

### 2.2 Using `viridis` and Custom Colors

```{r test_viridis_custom_colors, fig.width=12, fig.height=8}
if (requireNamespace("ggsci", quietly = TRUE)) {
  heatmap_colors <- create_single_cell_complex_heatmap(
    seurat_object = seurat_obj,
    features = features,
    gene_classification = gene_groups,
    group_by = "group",
    time_points_order = c("Mock", "6hpi", "24hpi"),
    
    # NEW: Using ggsci color vectors
    gene_color_palette = pal_npg()(3),
    time_color_palette = pal_lancet()(3),
    celltype_color_palette = pal_jama()(4),
    
    # Custom expression heatmap colors
    color_range = c(-2, 0, 2),
    color_palette = c("#2166AC", "#F7F7F7", "#B2182B")
  )
} 
```

## Font and Size Adjustments

```{r font_customization, fig.width=12, fig.height=8}
# Publication-ready styling
heatmap_publication <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = all_genes,
  gene_classification = gene_groups,
  group_by = "group",
  max_circle_size = 2.5,
  row_fontsize = 12,
  col_fontsize = 12,
  row_title_fontsize = 14,
  col_title_fontsize = 12,
  percentage_legend_title = "Fraction of cells",
  percentage_legend_labels = c("0", "20", "40", "60", "80"),
  legend_side = "right"
)
```

## Visual Element Control

This test demonstrates how to remove cell borders and column annotations for a cleaner look.

```{r test_visual_control, fig.width=12, fig.height=8}
heatmap_con <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  gene_classification = gene_groups,
  group_by = "group",
  
  # NEW: Visual control parameters
  show_cell_borders = FALSE,
  show_column_annotation = FALSE,
  
  # Other parameters for a clean plot
  split_by = "none",
  cluster_cells = TRUE
)
```

## Gene Name Mapping

This test shows how to replace default gene names (e.g., symbols) with custom labels on the y-axis.

```{r test_gene_mapping, fig.width=12, fig.height=8}
# Create a mapping for a subset of genes
gene_mapping <- c(
  "CD3D" = "T-cell Receptor CD3d",
  "CD79A" = "B-cell Antigen Receptor CD79a",
  "GZMK" = "Granzyme K",
  "NKG7" = "Natural Killer Cell Granule Protein 7"
)

heatmap_map <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  gene_classification = gene_groups,
  group_by = "group",
  
  # NEW: Gene name mapping
  gene_name_mapping = gene_mapping,
  
  row_fontsize = 9
)
```

## Clustering Control

You can control clustering behavior for both genes and cells:

```{r clustering_control, fig.width=12, fig.height=8}
# Custom clustering
heatmap_clustering <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  group_by = "group",
  cluster_cells = TRUE,
  cluster_features = TRUE,
  clustering_distance_rows = "pearson",
  clustering_method_rows = "ward.D2",
  clustering_distance_cols = "euclidean",
  clustering_method_cols = "complete"
)
```

# Different Use Cases

## Case 1: Time Course Analysis

For time course experiments, focus on temporal patterns:

```{r time_course, fig.width=12, fig.height=8}
# Time course focused analysis
heatmap_time <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  group_by = "group",
  time_points_order = c("Mock", "6hpi", "24hpi"),
  cell_types_order = c("T_cell", "B_cell", "Monocyte"),
  split_by = "time",
  show_celltype_annotation = TRUE,
  show_time_annotation = TRUE
)
```

## Case 2: Cell Type Comparison

For cell type-focused analysis:

```{r cell_type, fig.width=12, fig.height=8}
# Cell type focused analysis
heatmap_celltype <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  group_by = "celltype",
  split_by = "celltype",
  show_time_annotation = FALSE,
  show_celltype_annotation = TRUE
)
```

## Case 3: Simple Expression Analysis

For basic expression visualization without grouping:

```{r simple, fig.width=8, fig.height=6}
# Simple analysis
heatmap_sample <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  gene_classification = NULL,  # No gene grouping
  group_by = "group",
  show_time_annotation = FALSE,
  show_celltype_annotation = FALSE,
  split_by = "none"
)
```

# Comprehensive Example

This final example combines several new and existing features to create a highly customized, publication-ready plot.

```{r test_comprehensive, fig.width=14, fig.height=12}
create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  gene_classification = gene_groups,
  group_by = "group",
  time_points_order = c("Mock", "6hpi", "24hpi"),
  
  # --- New Features ---
  gene_group_title = "Functional Category",
  time_point_title = "Time Post-Infection",
  cell_type_title = "Cell Identity",
  show_cell_borders = TRUE,
  cell_border_color = "white",
  gene_name_mapping = c("MS4A1" = "CD20"),
  
  # --- Color Customization ---
  color_range = c(-2, 0, 2),
  color_palette = c("#0072B2", "white", "#D55E00"), # Colorblind-friendly
  gene_color_palette = "Dark2",
  time_color_palette = "Set2",
  celltype_color_palette = "Paired",
  
  # --- Layout and Font ---
  row_fontsize = 10,
  col_fontsize = 9,
  row_title_fontsize = 12,
  col_title_fontsize = 12,
  col_name_rotation = 45,
  legend_side = "right",
  merge_legends = TRUE,
  
  # --- Clustering and Splitting ---
  cluster_features = FALSE, # Rows are already grouped
  cluster_cells = FALSE,    # Columns are already grouped
  split_by = "time"
)
```

# Working with Helper Functions

The package provides helper functions for step-by-step analysis:

## Step 1: Prepare Expression Matrices

```{r helper_1}
# Prepare matrices
matrices <- prepare_expression_matrices(
  seurat_object = seurat_obj,
  features = features,
  group_by = "group",
  idents = NULL  # Use all groups
)

# Check the structure
dim(matrices$exp_mat)
dim(matrices$percent_mat)
head(matrices$dotplot_data)
```

## Step 2: Create Gene Annotations

```{r helper_2}
# Create gene annotations
if (!is.null(gene_groups)) {
  gene_ann <- create_gene_annotations(
    exp_mat = matrices$exp_mat,
    percent_mat = matrices$percent_mat,
    gene_classification = gene_groups,
    color_palette = "Set1"
  )
  
  # Check results
  dim(gene_ann$exp_mat_ordered)
  levels(gene_ann$annotation_df$GeneGroup)
}
```

## Step 3: Create Cell Annotations

```{r helper_3}
# Create cell annotations
cell_ann <- create_cell_annotations(
  exp_mat = matrices$exp_mat,
  percent_mat = matrices$percent_mat,
  time_points_order = c("Mock", "6hpi", "24hpi"),
  cell_types_order = c("T_cell", "B_cell", "Monocyte"),
  show_time_annotation = TRUE,
  show_celltype_annotation = TRUE
)

# Check results
dim(cell_ann$exp_mat_ordered)
head(cell_ann$annotation_df)
```

# Saving and Exporting

You can save plots directly from the function:

```{r save_plot, eval=FALSE}
# Save plot to file
heatmap_saved <- create_single_cell_complex_heatmap(
  seurat_object = seurat_obj,
  features = features,
  gene_classification = gene_groups,
  group_by = "group",
  save_plot = "my_heatmap.png",
  plot_width = 12,
  plot_height = 10,
  plot_dpi = 300
)
```

# Tips and Best Practices

## 1. Data Preparation

- Ensure your metadata columns are properly formatted
- Use consistent naming conventions for time points and cell types
- Consider the number of features - too many can make the plot cluttered

## 2. Color Selection

- Use diverging color palettes for expression data
- Ensure sufficient contrast for accessibility
- Consider colorblind-friendly palettes

## 3. Performance Optimization

- For large datasets, consider subsetting cells or using representative clusters
- Reduce visual complexity with smaller circle sizes and fonts
- Use `merge_legends = TRUE` for cleaner layouts

## 4. Publication Guidelines

- Use high DPI (300+) for publication figures
- Ensure font sizes are readable in the final format
- Include scale bars and legends
- Consider splitting complex plots into multiple panels

# Troubleshooting

## Common Issues

1. **"Features not found"**: Check that gene names match exactly with your Seurat object
2. **"group_by column not found"**: Verify the metadata column name exists
3. **Empty heatmap**: Check that your `idents` parameter includes valid groups
4. **Clustering errors**: Ensure you have enough features for clustering algorithms

## Getting Help

If you encounter issues:

1. Check the function documentation: `?create_single_cell_complex_heatmap`
2. Review the examples in this vignette
3. File an issue on GitHub: https://github.com/FanXuRong/SingleCellComplexHeatMap/issues

# Session Information

```{r session_info}
sessionInfo()
```
