---
title: "Specifying Priors"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Specifying Priors}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
  \usepackage{xcolor}
  \usepackage{bbding}
bibliography: "`r here::here('vignettes', 'library.bib')`"
---

```{r setup, include=FALSE, message=FALSE, warning=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  warning = FALSE,
  message = FALSE,
  fig.retina = 3,
  comment = "#>"
)

set.seed(123)
```

Prior specifications define your assumptions about respondent preferences before collecting data. These priors serve two key purposes in **cbcTools**: optimizing experimental designs (for D-optimal methods) and simulating choices. This article shows how to create and work with prior specifications using `cbc_priors()`.

# What Are Priors?

Priors represent your beliefs about how attributes influence respondent choices before collecting data. They specify:

- **Direction of effects**: Positive values increase utility, negative values decrease it.
- **Magnitude of effects**: Larger absolute values indicate stronger preferences.
- **Heterogeneity in effects**: Random parameters can be used to specify heterogeneity in preferences across the population.
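
To make direction and magnitude concrete, the sketch below assumes a standard multinomial logit model, in which the probability of choosing an alternative is proportional to the exponential of its utility. The coefficient values and the `logit_probs()` helper are illustrative assumptions and are not part of **cbcTools**.

```{r}
# Hypothetical helper: multinomial logit choice probabilities
logit_probs <- function(utility) {
  exp_u <- exp(utility - max(utility)) # subtract the max for numerical stability
  exp_u / sum(exp_u)
}

# Two apples that differ only in price ($1 vs. $2)
ex_prices <- c(1, 2)

# A weak and a strong negative price coefficient (illustrative values)
probs_weak <- logit_probs(-0.25 * ex_prices)
probs_strong <- logit_probs(-1.00 * ex_prices)

round(rbind(weak = probs_weak, strong = probs_strong), 3)
```

Both coefficients favor the cheaper apple, but the larger magnitude produces a stronger preference for it.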

## Sources of Prior Information

- **Literature review**: Published studies in similar contexts.
- **Expert judgment**: Domain knowledge from researchers or practitioners.
- **Pilot studies**: Small preliminary studies to estimate effects.
- **Previous studies**: Your own past research in related areas.
- **Theoretical expectations**: Economic theory or behavioral assumptions.

# Basic Prior Specification

For the purposes of this article, we'll keep working with the same set of apple profiles:

```{r}
library(cbcTools)

profiles <- cbc_profiles(
  price = c(1, 1.5, 2, 2.5, 3),
  type = c('Fuji', 'Gala', 'Honeycrisp'),
  freshness = c('Poor', 'Average', 'Excellent')
)

profiles
```

## Fixed Parameters

Fixed parameters assume you know the exact coefficient values. Start with the simplest case:

```{r}
# Basic fixed priors
priors_fixed <- cbc_priors(
  profiles = profiles,
  price = -0.25, # Negative = prefer lower prices
  type = c(0.5, 1.0), # Preferences relative to reference level
  freshness = c(0.6, 1.2) # Preferences relative to reference level
)

priors_fixed
```

### Understanding Categorical Variables

For categorical attributes, **the reference level is set by the first level defined in `cbc_profiles()`**, which in this case is `"Fuji"` for **Type** and `"Poor"` for **Freshness**. The priors above therefore imply the following coefficients:

**Type**:

- **Fuji**: coefficient = 0 (Reference level)
- **Gala**: coefficient = 0.5 (preferred over Fuji)
- **Honeycrisp**: coefficient = 1.0 (most preferred)

**Freshness**:

- **Poor**: coefficient = 0 (Reference level)
- **Average**: coefficient = 0.6
- **Excellent**: coefficient = 1.2
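
The same coding can be reproduced with base R's `model.matrix()`, which by default also drops the first factor level as the reference. This is a sketch of the idea rather than the package's internal code, and the `ex_` objects are hypothetical.

```{r}
# Two example apples, with factor levels in the same order as in cbc_profiles()
ex_apples <- data.frame(
  price = c(1.5, 2.5),
  type = factor(
    c("Fuji", "Honeycrisp"),
    levels = c("Fuji", "Gala", "Honeycrisp")
  ),
  freshness = factor(
    c("Excellent", "Poor"),
    levels = c("Poor", "Average", "Excellent")
  )
)

# Dummy-code the attributes; the first level of each factor gets no column
ex_X <- model.matrix(~ price + type + freshness, data = ex_apples)[, -1]
colnames(ex_X)

# Coefficients from priors_fixed, in the same column order
ex_beta <- c(-0.25, 0.5, 1.0, 0.6, 1.2)

# Utility of each apple = sum of coefficient * coded attribute value
drop(ex_X %*% ex_beta)
```

Because the reference levels have no column, they contribute zero to the utility, which is why their coefficients are listed as 0 above.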

### Using Named Specifications

You can specify the levels for categorical priors using names:

```{r}
priors_named <- cbc_priors(
  profiles = profiles,
  price = -0.25,
  type = c("Gala" = 0.5, "Honeycrisp" = 1.0),
  freshness = c("Average" = 0.6, "Excellent" = 1.2)
)
```

This produces the same set of priors as the `priors_fixed` example above:

```{r}
identical(priors_fixed$pars, priors_named$pars)
```

## Random Parameters

Random parameters allow for preference heterogeneity across respondents. Use `rand_spec()` to define them. In the example below, `"n"` specifies normal distributions for the `price` and `freshness` attributes. Because `freshness` is a categorical attribute with three levels, one of which is the reference level, its `mean` and `sd` are vectors of length two:

```{r}
priors_random <- cbc_priors(
  profiles = profiles,
  price = rand_spec(
    dist = "n",
    mean = -0.25,
    sd = 0.1
  ),
  type = c(0.5, 1.0),
  freshness = rand_spec(
    dist = "n",
    mean = c(0.6, 1.2),
    sd = c(0.1, 0.1)
  )
)

priors_random
```

Three distributions are supported:

- `"n"`: normal
- `"ln"`: log-normal (forces positivity)
- `"cn"`: censored normal (forces positivity)
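
The sketch below illustrates how the three distributions differ, using evenly spaced quantiles instead of random draws. It assumes the common convention that a log-normal parameter is the exponential of a normal variable and that a censored normal sets negative values to zero. How **cbcTools** parameterizes `mean` and `sd` in each case is not shown here, so treat the numbers as illustrative.

```{r}
# Evenly spaced quantiles of a standard normal (no random numbers needed)
ex_z <- qnorm(seq(0.01, 0.99, by = 0.01))

# Illustrative mean and standard deviation
ex_mean <- 0.5
ex_sd <- 1

ex_normal <- ex_mean + ex_sd * ex_z             # can be negative
ex_lognormal <- exp(ex_mean + ex_sd * ex_z)     # strictly positive
ex_censored <- pmax(0, ex_mean + ex_sd * ex_z)  # negative values set to zero

# Smallest value under each distribution
c(
  normal = min(ex_normal),
  lognormal = min(ex_lognormal),
  censored = min(ex_censored)
)
```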

## Parameter Correlations

Correlations between random parameters can be specified by passing `cor_spec()` objects to the `correlations` argument of `rand_spec()`:

```{r}
priors_correlated <- cbc_priors(
  profiles = profiles,
  price = rand_spec(
    dist = "n",
    mean = -0.1,
    sd = 0.05,
    correlations = list(
      cor_spec(
        with = "type",
        with_level = "Honeycrisp",
        value = 0.3
      )
    )
  ),
  type = rand_spec(
    dist = "n",
    mean = c("Gala" = 0.1, "Honeycrisp" = 0.2),
    sd = c("Gala" = 0.05, "Honeycrisp" = 0.1)
  ),
  freshness = c(0.1, 0.2)
)

# View the correlation matrix
priors_correlated$correlation
```
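
A correlation only pins down the joint distribution once it is combined with the standard deviations. The base R sketch below, which is an illustration and not the package's internal code, converts the price and Honeycrisp values from the example above into a covariance matrix and checks that the result is positive definite.

```{r}
# Standard deviations from the rand_spec() calls above
ex_sds <- c(price = 0.05, Honeycrisp = 0.1)

# Correlation matrix with the 0.3 correlation from cor_spec()
ex_cor <- matrix(
  c(1, 0.3, 0.3, 1),
  nrow = 2,
  dimnames = list(names(ex_sds), names(ex_sds))
)

# Covariance = D %*% R %*% D, where D is a diagonal matrix of standard deviations
ex_cov <- diag(ex_sds) %*% ex_cor %*% diag(ex_sds)
dimnames(ex_cov) <- dimnames(ex_cor)
ex_cov

# A valid covariance matrix has strictly positive eigenvalues
all(eigen(ex_cov, symmetric = TRUE)$values > 0)
```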

### Types of Correlations

General correlation between all levels of two attributes:

```{r eval=FALSE}
cor_spec(
  with = "type",
  value = -0.2
)
```

Correlation with a specific level of a categorical attribute:

```{r eval=FALSE}
cor_spec(
  with = "type",
  with_level = "Honeycrisp",
  value = 0.3
)
```

Correlation from a specific level to another specific level:

```{r eval=FALSE}
cor_spec(
  with = "freshness",
  level = "Gala",
  with_level = "Excellent",
  value = 0.4
)
```

## Interaction Effects

You can include interaction terms in your priors using `int_spec()`:

```{r}
# Create priors with interaction effects
priors_interactions <- cbc_priors(
  profiles = profiles,
  price = -0.25,
  type = c("Fuji" = 0.5, "Honeycrisp" = 1.0),
  freshness = c("Average" = 0.6, "Excellent" = 1.2),
  interactions = list(
    # Price sensitivity varies by apple type
    int_spec(
      between = c("price", "type"),
      with_level = "Fuji",
      value = 0.1
    ),
    int_spec(
      between = c("price", "type"),
      with_level = "Honeycrisp",
      value = 0.2
    ),
    # Type preferences vary by freshness
    int_spec(
      between = c("type", "freshness"),
      level = "Honeycrisp",
      with_level = "Excellent",
      value = 0.3
    )
  )
)

priors_interactions
```
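
To see what an interaction coefficient does, the arithmetic sketch below uses the main-effect and interaction values from the example above to compute the implied price slope. It assumes that a continuous-by-categorical interaction simply adds to the main price effect whenever that level is present.

```{r}
# Values taken from the example above
ex_price_main <- -0.25        # main effect of price
ex_price_x_honeycrisp <- 0.2  # price x Honeycrisp interaction

# Implied price slope: main effect plus any interaction for that type
ex_slopes <- c(
  reference = ex_price_main,
  Honeycrisp = ex_price_main + ex_price_x_honeycrisp
)
ex_slopes

# Utility change from a $1 price increase for a Honeycrisp apple
unname(ex_slopes["Honeycrisp"]) * 1
```

Under these values, price matters much less for Honeycrisp apples than for the reference type.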

### Interaction Types

Continuous × Categorical: Must specify the categorical level:

```{r eval=FALSE}
int_spec(
  between = c("price", "type"),
  with_level = "Fuji",
  value = 0.05
)
```

Categorical × Categorical: Must specify both levels:

```{r eval=FALSE}
int_spec(
  between = c("type", "freshness"),
  level = "Gala",
  with_level = "Excellent",
  value = 0.1
)
```

Continuous × Continuous: No level specification needed:

```{r eval=FALSE}
int_spec(
  between = c("price", "weight"),
  value = 0.02
)
```

## No-Choice Priors

For designs with no-choice options, specify the no-choice utility with the `no_choice` argument:

```{r}
# Fixed no-choice prior
priors_nochoice_fixed <- cbc_priors(
  profiles = profiles,
  price = -0.25,
  type = c(0.5, 1.0),
  freshness = c(0.6, 1.2),
  no_choice = -0.5 # Negative values make no-choice less attractive
)

# Random no-choice prior
priors_nochoice_random <- cbc_priors(
  profiles = profiles,
  price = -0.25,
  type = c(0.5, 1.0),
  freshness = c(0.6, 1.2),
  no_choice = rand_spec(dist = "n", mean = -0.5, sd = 0.2)
)

priors_nochoice_fixed
```
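
As a rough illustration of what a no-choice utility of -0.5 implies, the sketch below assumes a multinomial logit model and two hypothetical apples whose utilities are computed from the fixed priors above. None of this is package code.

```{r}
# Utilities of two hypothetical apples under the fixed priors above
ex_u_apple1 <- -0.25 * 1.5 + 1.2 # $1.50 Fuji, Excellent freshness
ex_u_apple2 <- -0.25 * 2.5 + 1.0 # $2.50 Honeycrisp, Poor freshness
ex_u_none <- -0.5                # no-choice utility

# Logit probabilities over the two apples and the opt-out
ex_u_all <- c(apple1 = ex_u_apple1, apple2 = ex_u_apple2, none = ex_u_none)
ex_p_all <- exp(ex_u_all) / sum(exp(ex_u_all))
round(ex_p_all, 3)
```

Making `no_choice` more negative lowers the share of respondents who opt out.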

# Parameter Draws for Bayesian Analysis

When you specify random parameters, `cbc_priors()` automatically generates parameter draws for Bayesian D-error calculation. You can control the draw type with the `draw_type` argument (`"halton"` or `"sobol"`) and the number of draws with the `n_draws` argument, e.g.:

```{r, fig.width=6, fig.height=4, fig.alt="Histogram showing the distribution of price parameter draws. The distribution appears roughly normal, centered around -0.25, with values ranging from approximately -0.5 to 0, indicating negative price sensitivity as expected."}

priors_bayesian <- cbc_priors(
  profiles = profiles,
  price = rand_spec(
    dist = "n",
    mean = -0.25,
    sd = 0.1
  ),
  type = rand_spec(
    dist = "n",
    mean = c(0.5, 1.0),
    sd = c(0.1, 0.2)
  ),
  freshness = c(0.6, 1.2),
  n_draws = 500, # Default = 100
  draw_type = "sobol" # Default = "halton"
)

# Inspect the parameter draws
price_draws <- priors_bayesian$par_draws[, 1]
cat("Parameter draws dimensions:", dim(priors_bayesian$par_draws), "\n")
cat("Mean of price draws:", mean(price_draws), "\n")
cat("SD of price draws:", sd(price_draws), "\n")

# Plot distribution of one parameter
hist(
  price_draws,
  main = "Distribution of Price Parameter Draws",
  xlab = "Price Coefficient"
)
```
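
For intuition about quasi-random draws, the sketch below builds a one-dimensional Halton sequence in base R and maps it to normal draws with `qnorm()`. The `halton_seq()` helper is hypothetical and written only for illustration; it is not how **cbcTools** generates its draws.

```{r}
# Hypothetical helper: first n points of the Halton sequence in a given base
halton_seq <- function(n, base = 2) {
  vapply(seq_len(n), function(i) {
    h <- 0
    f <- 1 / base
    while (i > 0) {
      h <- h + f * (i %% base) # add the next digit of i in this base
      i <- i %/% base
      f <- f / base
    }
    h
  }, numeric(1))
}

# Quasi-random points spread evenly over (0, 1)
ex_halton <- halton_seq(500)

# Transform the uniform points into normal draws for the price coefficient
ex_price_draws <- qnorm(ex_halton, mean = -0.25, sd = 0.1)
c(mean = mean(ex_price_draws), sd = sd(ex_price_draws))
```

Because the points cover the unit interval evenly, the resulting draws reproduce the target mean and standard deviation closely even with a modest number of draws.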

# Common Pitfalls

## Mismatched Scales

Ensure prior magnitudes match your attribute scales:

```{r eval=FALSE}
# Problem: price is in dollars, but the prior is far too large for that scale
profiles_dollars <- cbc_profiles(price = c(1.00, 2.00, 3.00), ...)
priors_mismatched <- cbc_priors(profiles_dollars, price = -10, ...) # Too large!

# Solution: Match scales
priors_dollars <- cbc_priors(profiles_dollars, price = -0.10, ...) # Appropriate
```
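
A quick sanity check, sketched below under a multinomial logit assumption, is to compute the choice probability a price prior implies for the cheapest option against an otherwise identical, most expensive one. The `ex_prob_cheapest()` helper is hypothetical, and the numbers reuse the dollar-scale example above.

```{r}
# Probability of choosing the $1 option over an otherwise identical $3 option
ex_prob_cheapest <- function(beta_price, prices = c(1, 3)) {
  u <- beta_price * prices
  exp(u[1]) / sum(exp(u))
}

ex_prob_cheapest(-10)   # prior of -10 per dollar
ex_prob_cheapest(-0.10) # prior of -0.10 per dollar
```

A probability that is indistinguishable from 1 means price alone decides every choice, which is rarely a realistic assumption.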

## Wrong Reference Levels

Remember that the first level defined in `cbc_profiles()` is the reference:

```{r}
# If you want "Excellent" as reference, reorder profiles
profiles_reordered <- cbc_profiles(
  price = c(1, 1.5, 2, 2.5, 3),
  type = c('Fuji', 'Gala', 'Honeycrisp'),
  freshness = c('Excellent', 'Average', 'Poor') # Excellent now reference
)

priors_reordered <- cbc_priors(
  profiles_reordered,
  price = -0.1,
  type = c(0.1, 0.2),
  freshness = c(-0.1, -0.2) # Negative = worse than excellent
)
```
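
Changing the reference level shifts every coefficient of that attribute by a constant, so the differences between levels are preserved. The base R sketch below re-expresses the earlier freshness priors (0.6 and 1.2 relative to `"Poor"`) relative to `"Excellent"`. These values are illustrative and differ from those used in `priors_reordered`.

```{r}
# Freshness coefficients with "Poor" as the reference level
ex_fresh_poor_ref <- c(Poor = 0, Average = 0.6, Excellent = 1.2)

# Re-express relative to "Excellent" by subtracting its coefficient
ex_fresh_exc_ref <- ex_fresh_poor_ref - ex_fresh_poor_ref["Excellent"]

# Show the levels in the reordered sequence, reference level first
ex_fresh_exc_ref[c("Excellent", "Average", "Poor")]
```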

## Incompatible Restrictions

Ensure your priors are compatible with restricted profiles:

```{r eval=FALSE}
# If you've restricted certain profile combinations,
# make sure your priors don't assume those combinations are common
restricted_profiles <- cbc_restrict(
  profiles,
  type == "Fuji" & price > 2.5
)

# Prior should reflect that expensive Fuji combinations don't exist
priors_compatible <- cbc_priors(restricted_profiles, ...)
```
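
To see which profiles such a restriction affects, the base R sketch below rebuilds the full factorial with `expand.grid()` and flags the rows that match the condition. It mirrors the idea behind the restriction, assuming matching profiles are removed, rather than calling `cbc_restrict()` itself. The `ex_` objects are hypothetical.

```{r}
# Full factorial of the apple attributes (5 x 3 x 3 = 45 profiles)
ex_grid <- expand.grid(
  price = c(1, 1.5, 2, 2.5, 3),
  type = c("Fuji", "Gala", "Honeycrisp"),
  freshness = c("Poor", "Average", "Excellent"),
  stringsAsFactors = FALSE
)

# Rows matching the restriction: Fuji apples priced above $2.50
ex_dropped <- ex_grid$type == "Fuji" & ex_grid$price > 2.5

# Number of profiles removed and remaining
c(removed = sum(ex_dropped), remaining = sum(!ex_dropped))
```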

# Using Priors in Practice

Once created, priors are used in:

1. **Design optimization**: Pass to `cbc_design()` for D-optimal methods. See the [Generating Designs](design.html) article for more details.
2. **Choice simulation**: Pass to `cbc_choices()` for realistic choice patterns. See the [Simulating Choices](choices.html) article for more details.
