---
title: "Getting Started"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

## Introduction

The `charisma` package provides a standardized and reproducible framework for characterizing and classifying discrete color classes from digital images of biological organisms. This vignette walks you through the basic workflow and demonstrates key features of the package.

### What does charisma do?

`charisma` automatically determines the presence or absence of 10 human-visible color categories in images:

- **black**, **blue**, **brown**, **green**, **grey**
- **orange**, **purple**, **red**, **white**, **yellow**

The package uses a biologically inspired Color Look-Up Table (CLUT) that partitions HSV color space into non-overlapping regions, ensuring each color maps to exactly one category.

## Installation

### System Dependencies

`charisma` depends on spatial R packages that require system-level libraries. Install these first:

**macOS (via Homebrew):**
```bash
brew install udunits gdal proj geos
```

**Ubuntu/Debian:**
```bash
sudo apt-get install libudunits2-dev libgdal-dev libgeos-dev libproj-dev
```

**Fedora/RedHat:**
```bash
sudo dnf install udunits2-devel gdal-devel geos-devel proj-devel
```

### Development Version (GitHub)

```{r, eval = FALSE}
# install.packages("remotes")
remotes::install_github("shawntz/charisma")
```

### Stable Version (CRAN)

```{r, eval = FALSE}
install.packages("charisma")  # Coming soon!
```

## Load the Package

```{r setup}
library(charisma)
```

## Basic Workflow

### Step 1: Load an Image

The package includes an example image of a colorful bird (*Tangara fastuosa*):

```{r}
img_path <- system.file(
  "extdata",
  "Tangara_fastuosa_LACM60421.png",
  package = "charisma"
)
```

### Step 2: Run charisma

A basic, non-interactive analysis sets a few key parameters explicitly:

```{r, eval = FALSE}
result <- charisma(
  img_path,
  threshold = 0.0,
  interactive = FALSE,
  plot = FALSE,
  pavo = FALSE
)
```

**Key parameters:**

- `threshold`: Minimum proportion of pixels for a color to be retained (0-1)
- `interactive`: Enable manual color merging/replacement
- `plot`: Show diagnostic plots during processing
- `pavo`: Compute color pattern geometry statistics

### Step 3: Visualize Results

```{r, eval = FALSE}
plot(result)
```

This creates a multi-panel visualization showing:

- Original image
- Color-masked image
- Color proportions
- Color histogram

## Understanding the Pipeline

The `charisma` workflow consists of three main stages:

### 1. Image Preprocessing

Images are pre-processed using the `recolorize` package to:

- Perform spatial-color binning
- Remove noisy pixels
- Create a smoothed representation of dominant colors

```{r, eval = FALSE}
# Control preprocessing with bins and cutoff parameters
result <- charisma(
  img_path,
  bins = 4,     # Bins per RGB channel (4^3 = 64 clusters)
  cutoff = 20   # Euclidean distance threshold
)
```
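
To build intuition for what `bins = 4` means, the sketch below bins simulated pixels per RGB channel using base R only. It is a toy illustration of the binning idea, not the `recolorize` implementation; the simulated `pixels` matrix and the `cluster_id` encoding are assumptions made for this example.

```{r}
# Toy illustration of per-channel binning (not recolorize's implementation)
set.seed(1)
pixels <- matrix(runif(300), ncol = 3,
                 dimnames = list(NULL, c("r", "g", "b")))  # 100 RGB pixels in [0, 1]

bins <- 4
breaks <- seq(0, 1, length.out = bins + 1)

# Bin index (1..bins) for every channel of every pixel
bin_idx <- apply(pixels, 2, findInterval, vec = breaks, rightmost.closed = TRUE)

# Combine the three channel indices into a single cluster id (1..bins^3)
cluster_id <- (bin_idx[, "r"] - 1) * bins^2 +
  (bin_idx[, "g"] - 1) * bins +
  bin_idx[, "b"]

bins^3                      # number of possible clusters: 64
length(unique(cluster_id))  # clusters actually occupied by these pixels
```

Increasing `bins` grows the number of possible clusters cubically, so small values are usually a sensible place to start.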

### 2. Color Classification

Each color cluster is converted from RGB to HSV and matched against the CLUT:

```{r}
# Example: Classify a single RGB color
color2label(c(255, 0, 0))    # Red
color2label(c(0, 0, 255))    # Blue
color2label(c(255, 255, 0))  # Yellow
```
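
The RGB-to-HSV step can be reproduced with base R's `grDevices::rgb2hsv()`. This sketch only shows the conversion for the three colors above; how `charisma` performs the conversion internally and how the CLUT ranges are laid out are not shown here.

```{r}
# Convert 0-255 RGB triplets to HSV with base R (grDevices)
rgb_cols <- cbind(red = c(255, 0, 0),
                  blue = c(0, 0, 255),
                  yellow = c(255, 255, 0))
rownames(rgb_cols) <- c("r", "g", "b")

hsv_cols <- grDevices::rgb2hsv(rgb_cols, maxColorValue = 255)
hsv_cols  # rows h, s, v, each scaled to [0, 1]

# Hue in degrees: red is at 0, blue near 240, yellow near 60
hue_deg <- setNames(hsv_cols["h", ] * 360, colnames(rgb_cols))
hue_deg
```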

### 3. Optional Manual Curation

In interactive mode, you can manually refine classifications:

```{r, eval = FALSE}
result <- charisma(
  img_path,
  interactive = TRUE,
  threshold = 0.0
)
```

**Interactive operations:**

- **Merge**: Combine color clusters (e.g., `c(2,3)`)
- **Replace**: Reassign pixels from one cluster to another
- Complete operation history is saved for reproducibility

## Working with Thresholds

Thresholds automatically remove colors with low pixel proportions:

```{r, eval = FALSE}
# No threshold - keep all colors
result_0 <- charisma(img_path, threshold = 0.0)

# 5% threshold - remove colors < 5% of image
result_5 <- charisma(img_path, threshold = 0.05)

# 10% threshold - remove colors < 10% of image
result_10 <- charisma(img_path, threshold = 0.10)
```

Higher thresholds are useful for:

- Removing image artifacts (shadows, feather overlap in bird specimens)
- Focusing on dominant colors
- Reducing noise in automated workflows
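
The effect of a threshold can be previewed on a vector of color proportions. The proportions below are made up, and the `>=` rule in the hypothetical helper `keep_colors()` is an assumption for illustration; check `?charisma` for the exact rule the package applies.

```{r}
# Hypothetical color proportions for one image (sum to 1)
props <- c(blue = 0.52, black = 0.30, green = 0.12, yellow = 0.04, brown = 0.02)

# Keep colors at or above the threshold (assumed rule, for illustration)
keep_colors <- function(props, threshold) names(props)[props >= threshold]

keep_colors(props, 0.00)  # all five colors
keep_colors(props, 0.05)  # blue, black, green
keep_colors(props, 0.10)  # blue, black, green

# Number of retained colors (k) across a range of thresholds
thresholds <- c(0, 0.05, 0.10, 0.25, 0.50)
k_by_threshold <- sapply(thresholds, function(t) length(keep_colors(props, t)))
setNames(k_by_threshold, thresholds)
```

Plotting k against the threshold for a subset of images is one way to choose a value before running a full dataset.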

## Saving and Loading Results

Save results for reproducibility:

```{r, eval = FALSE}
# Save with automatic timestamping
out_dir <- file.path("~", "Documents", "charisma_outputs")

result <- charisma(
  img_path,
  threshold = 0.05,
  logdir = out_dir
)
```

This creates:

- `charisma_objects/`: Timestamped .RDS files (full charisma object)
- `diagnostic_plots/`: Timestamped .PDF files (visualization)

Load and re-analyze saved objects:

```{r, eval = FALSE}
# Load a saved object (here, the example object shipped with the package)
obj_path <- system.file("extdata", "Tangara_fastuosa.RDS", package = "charisma")
obj <- readRDS(obj_path)

# Re-analyze with different threshold
result2 <- charisma2(
  obj,
  new.threshold = 0.10
)

# Revert to specific state
result3 <- charisma2(
  obj,
  which.state = "merge",
  state.index = 2
)
```
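
When a directory accumulates several timestamped `.RDS` files, a small helper can pick out the most recent one. The sketch below uses base R only; `latest_rds()` is a hypothetical helper (not part of `charisma`), and the files it reads are stand-in lists written to a temporary directory rather than real charisma objects.

```{r}
# Return the most recently modified .RDS file in a directory
latest_rds <- function(dir) {
  files <- list.files(dir, pattern = "\\.rds$", ignore.case = TRUE,
                      full.names = TRUE)
  if (length(files) == 0) stop("No .RDS files found in ", dir)
  files[which.max(as.numeric(file.info(files)$mtime))]
}

# Demonstration with stand-in objects in a temporary directory
demo_dir <- file.path(tempdir(), "charisma_objects_demo")
dir.create(demo_dir, showWarnings = FALSE)
old_file <- file.path(demo_dir, "run_old.RDS")
new_file <- file.path(demo_dir, "run_new.RDS")
saveRDS(list(threshold = 0.05), old_file)
saveRDS(list(threshold = 0.10), new_file)
Sys.setFileTime(old_file, Sys.time() - 3600)  # pretend this run is an hour older

demo_obj <- readRDS(latest_rds(demo_dir))
demo_obj$threshold
```

With real output, you would point `latest_rds()` at the `charisma_objects/` folder inside your `logdir` and pass the loaded object to `charisma2()`.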

## Extracting Color Data

The charisma object contains all classification data:

```{r, eval = FALSE}
# Get unique colors present
unique_colors <- unique(result$classification)

# Get number of colors (k)
k <- length(unique_colors)

# Get color proportions
color_props <- result$color_mask_LUT_filtered

# Create presence/absence matrix (named to avoid masking base::summary)
color_summary <- summarize(result)
```
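
Conceptually, a presence/absence record is a 0/1 vector over the 10 categories. The sketch below builds one from a made-up vector of labels; the exact layout returned by `summarize()` may differ, so treat this as an illustration of the idea only.

```{r}
# The 10 categories listed in the introduction
categories <- c("black", "blue", "brown", "green", "grey",
                "orange", "purple", "red", "white", "yellow")

# Hypothetical per-cluster labels for one image
labels <- c("blue", "black", "blue", "green", "yellow", "black")

presence <- setNames(as.integer(categories %in% labels), categories)
presence
sum(presence)  # k = 4 distinct colors
```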

## Custom Color Look-Up Tables

The default CLUT covers 10 human-visible colors, but you can create custom CLUTs:

```{r, eval = FALSE}
# View default CLUT
View(charisma::clut)

# Build a custom CLUT
my_clut <- charisma::clut  # Start with default

# ... modify HSV ranges ...

# Validate custom CLUT before use (ensures complete HSV coverage)
validation <- validate(clut = my_clut)

# Use custom CLUT
result <- charisma(img_path, clut = my_clut)
```

**CLUT validation** tests every HSV coordinate to ensure:

1. No gaps (every color maps to a category)
2. No overlaps (each color maps to exactly one category)
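
The two checks can be illustrated with a toy, hue-only table. The ranges in `toy_clut` are invented for this example and are not the package CLUT, and `count_matches()` is a hypothetical helper; the real `validate()` works over full HSV coordinates.

```{r}
# Toy hue-only table (hypothetical ranges, in degrees; not the package CLUT).
# Each range is half-open: h_min <= hue < h_max.
toy_clut <- data.frame(
  label = c("red", "yellow", "green", "blue", "red"),
  h_min = c(0, 40, 80, 170, 330),
  h_max = c(40, 80, 170, 330, 360)
)

# Count how many rows claim each hue on a 1-degree grid
count_matches <- function(clut, hues) {
  vapply(hues, function(h) sum(h >= clut$h_min & h < clut$h_max), integer(1))
}

hues <- 0:359
n_ok <- count_matches(toy_clut, hues)
all(n_ok == 1)  # TRUE: no gaps and no overlaps

# Introduce an overlap and a gap to see them detected
bad_clut <- toy_clut
bad_clut$h_max[2] <- 90   # yellow now overlaps green on [80, 90)
bad_clut$h_min[4] <- 180  # hues in [170, 180) are no longer covered
n_bad <- count_matches(bad_clut, hues)
range(hues[n_bad == 0])  # gap: 170 to 179
range(hues[n_bad > 1])   # overlap: 80 to 89
```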

## Integration with Evolutionary Analyses

`charisma` output can be passed to phylogenetic comparative packages such as `geiger` and `phytools`:

```{r, eval = FALSE}
# `image_paths` (a named character vector whose names match the tree's tip
# labels) and `phylogeny` (an ape `phylo` object) are placeholders you supply.
species_colors <- lapply(image_paths, function(img) {
  result <- charisma(img, threshold = 0.05)
  summarize(result)
})

# Combine into one row per species
color_matrix <- do.call(rbind, species_colors)

# Use with geiger, phytools, pavo, etc.
library(ape)      # provides ace()
library(geiger)   # provides fitDiscrete()
library(phytools)

# Name the trait vector so values can be matched to tip labels
blue <- setNames(color_matrix[, "blue"], names(image_paths))

# Fit evolutionary models
fit_er <- fitDiscrete(phylogeny, blue, model = "ER")
fit_ard <- fitDiscrete(phylogeny, blue, model = "ARD")

# Reconstruct ancestral states
ancestral <- ace(blue, phylogeny, type = "discrete")
```
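
Before fitting models, it helps to confirm that the species in the color matrix line up with the tips of the tree. The sketch below uses base R only, with made-up presence/absence rows and a stand-in vector of tip labels; `Species_B`, `Species_C`, and `Species_D` are hypothetical names.

```{r}
# Hypothetical presence/absence rows, one per species
rows <- list(
  Tangara_fastuosa = c(blue = 1, green = 1, red = 0),
  Species_B        = c(blue = 0, green = 1, red = 1),
  Species_C        = c(blue = 1, green = 0, red = 0)
)
demo_matrix <- do.call(rbind, rows)  # row names come from the list names

# Stand-in for phylogeny$tip.label
tip_labels <- c("Species_C", "Tangara_fastuosa", "Species_B", "Species_D")

setdiff(tip_labels, rownames(demo_matrix))  # tips without color data
shared <- intersect(tip_labels, rownames(demo_matrix))

# Named trait vector in tip order, ready for a discrete-trait model
blue_trait <- demo_matrix[shared, "blue"]
blue_trait
```

Tips that lack color data need to be pruned from the tree, or their images processed, before the trait vector and the phylogeny will match.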

## Tips for Best Results

### For Bird Museum Specimens

- Use **interactive mode** (`interactive = TRUE`) to remove feather artifact colors (brown/grey from feather bases)
- Set `threshold = 0.0` and manually curate
- Remove bill, leg, and tag pixels before analysis

### For Automated Workflows

- Test different `threshold` values on a subset
- Use `bins = 4` and `cutoff = 20` as starting points
- Save all intermediate results with `logdir`

### For Custom Image Sets

- Validate that the default CLUT works for your images
- Consider creating a custom CLUT for non-biological images
- Always validate custom CLUTs with `validate()`

## Citation

If you use `charisma` in your research, please cite:

> Schwartz, S.T., Tsai, W.L.E., Karan, E.A., Juhn, M.S., Shultz, A.J.,
> McCormack, J.E., Smith, T.B., and Alfaro, M.E. (2025). charisma: An R package
> to perform reproducible color characterization of digital images for biological
> studies. (In Review).

## Getting Help

- **Documentation**: `?charisma`, `?charisma2`, `?color2label`
- **Issues**: https://github.com/shawntz/charisma/issues
- **Email**: shawn.t.schwartz@gmail.com

## Acknowledgments

`charisma` builds upon and integrates with:

- [`recolorize`](https://cran.r-project.org/package=recolorize) (Weller et al. 2024) for image preprocessing
- [`pavo`](https://cran.r-project.org/package=pavo) (Maia et al. 2019) for color pattern geometry
- [`imager`](https://cran.r-project.org/package=imager) (Barthelme 2025) for image processing operations

We thank the developers of these excellent packages for making this work possible.
