---
title: "semboottools"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{semboottools}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: references.bib
csl: apa.csl
link-citations: true
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Overview

This vignette demonstrates how to form bootstrap confidence intervals (CIs) and examine bootstrap estimates in structural equation modeling (SEM) using [`semboottools`](https://yangzhen1999.github.io/semboottools/), described in @yang_forming_2026.

The following packages will be used:

```{r setup}
library(semboottools)
library(lavaan)
```

# Example: Simple Mediation Model

We use a simple mediation model with a large sample (N = 1000) for demonstration.

This model includes a predictor `x`, a mediator `m`, and an outcome `y`.

The indirect effect (`ab`) and the total effect (`total`)
are defined as user-defined parameters in the model syntax below.

```{r}
# Set seed for reproducibility
set.seed(1234)

# Generate data
n <- 1000
x <- runif(n) - 0.5
m <- 0.20 * x + rnorm(n)
y <- 0.17 * m + rnorm(n)
dat <- data.frame(x, y, m)

# Specify mediation model in lavaan syntax
mod <- '
  m ~ a * x
  y ~ b * m + cp * x
  ab := a * b
  total := a * b + cp
'
```
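
As a quick check on what the fitted model should recover, the population values implied by the data-generating code above can be worked out by hand. The object names below (`a_pop`, `b_pop`, and so on) are introduced only for this illustration.

```{r}
# Population coefficients used in the simulation above
a_pop  <- 0.20  # path from x to m
b_pop  <- 0.17  # path from m to y
cp_pop <- 0     # no direct effect of x on y was simulated

# Population indirect and total effects
ab_pop    <- a_pop * b_pop
total_pop <- ab_pop + cp_pop
c(ab = ab_pop, total = total_pop)
```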

## Fit the Model with Bootstrapping

Suppose we fit the model using the default
method for standard errors and confidence
intervals for the model parameters:

```{r}
fit <- sem(mod,
           data = dat,
           fixed.x = FALSE)
summary(fit,
        ci = TRUE)
```

For the indirect effect, we would like
to use bootstrap confidence intervals.
Instead of refitting the model, we can
call `store_boot()` to do the bootstrapping
and add the bootstrap estimates to the
original output. The original object
can be safely overwritten.

```{r}
# Ensure bootstrap estimates are stored
# `R`, the number of bootstrap samples, should be ≥2000 in real studies.
# `parallel` should be used unless fitting the model is fast.
# Set `ncpus` to a larger value or omit it in real studies.
# `iseed` is set to make the results reproducible.
fit <- store_boot(fit,
                  R = 500,
                  parallel = "snow",
                  ncpus = 2,
                  iseed = 1248)
```

## Form Bootstrap CIs for Standardized Coefficients

```{r}
# Basic usage: default settings
# Compute standardized solution with percentile bootstrap CIs
std_boot <- standardizedSolution_boot(fit)
print(std_boot)
```
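
To make the idea concrete, here is a minimal base-R sketch of a percentile bootstrap CI for a standardized indirect effect, in which the estimate is re-standardized within each bootstrap sample. It is an illustration under simplifying assumptions (ordinary least squares via `lm()` in place of `lavaan`, and a small number of resamples), not the implementation used by `standardizedSolution_boot()`. The helper `std_ab()` and all objects prefixed with `sk_` are hypothetical names introduced for this sketch.

```{r}
# Simulate data with the same structure as the example above
set.seed(1234)
sk_n <- 1000
sk_x <- runif(sk_n) - 0.5
sk_m <- 0.20 * sk_x + rnorm(sk_n)
sk_y <- 0.17 * sk_m + rnorm(sk_n)
sk_dat <- data.frame(x = sk_x, m = sk_m, y = sk_y)

# Hypothetical helper: standardized indirect effect a * b * sd(x) / sd(y)
std_ab <- function(d) {
  a <- coef(lm(m ~ x, data = d))[["x"]]
  b <- coef(lm(y ~ m + x, data = d))[["m"]]
  a * b * sd(d$x) / sd(d$y)
}

# Nonparametric bootstrap: resample rows, then re-estimate and
# re-standardize within each bootstrap sample
sk_R <- 200
sk_boot <- replicate(sk_R, {
  i <- sample.int(sk_n, replace = TRUE)
  std_ab(sk_dat[i, ])
})

# Point estimate and 95% percentile bootstrap CI
std_ab(sk_dat)
quantile(sk_boot, c(0.025, 0.975))
```

Because the standard deviations are recomputed in every resample, their sampling variability is reflected in the resulting interval.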

## Form Bootstrap CIs for Unstandardized Coefficients

Although the main feature of the package is the standardized solution,
`parameterEstimates_boot()` can be used to compute bootstrap CIs, bootstrap standard errors, and optional asymmetric *p*-values for unstandardized parameter estimates, including both free and user-defined parameters.

It requires bootstrap estimates stored via `store_boot()`, supports percentile and bias-corrected CIs, and reports the bootstrap SE as the standard deviation of the bootstrap estimates.

```{r}
# Basic usage: default settings
# Compute unstandardized solution with percentile bootstrap CIs
est_boot <- parameterEstimates_boot(fit)

# Print results
print(est_boot)
```
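
The quantities named above can be sketched in base R for a single parameter. The code below is an illustration only, not the code used by `parameterEstimates_boot()`: the vector `bt_est` is simulated to stand in for stored bootstrap estimates, `pt_est` is a made-up point estimate, and the asymmetric *p*-value uses one simple definition (twice the smaller proportion of bootstrap estimates on either side of zero), which is an assumption of this sketch.

```{r}
# Simulated stand-ins for a point estimate and its bootstrap estimates
set.seed(5678)
pt_est <- 0.034
bt_est <- rnorm(2000, mean = 0.034, sd = 0.012)
alpha  <- 0.05

# Bootstrap SE: standard deviation of the bootstrap estimates
bt_se <- sd(bt_est)

# Percentile CI: empirical quantiles of the bootstrap estimates
ci_perc <- quantile(bt_est, c(alpha / 2, 1 - alpha / 2))

# Bias-corrected (BC) CI: shift the quantile levels by z0, which
# reflects the proportion of bootstrap estimates below the point estimate
z0 <- qnorm(mean(bt_est < pt_est))
bc_levels <- pnorm(2 * z0 + qnorm(c(alpha / 2, 1 - alpha / 2)))
ci_bc <- quantile(bt_est, bc_levels)

# Asymmetric p-value (one simple definition; an assumption in this sketch)
p_asym <- 2 * min(mean(bt_est <= 0), mean(bt_est >= 0))

bt_se
ci_perc
ci_bc
p_asym
```

When the bootstrap distribution is roughly symmetric around the point estimate, `z0` is close to zero and the bias-corrected interval nearly coincides with the percentile interval.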


## Visualize Bootstrap Estimates

To examine the distribution of bootstrap estimates, two functions are available:

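-   `hist_qq_boot()`\
    Histogram and normal QQ-plot of **one parameter**.

-   `scatter_boot()`\
    Scatterplot matrix of **two or more parameters**.

The examples below call the `gg_`-prefixed counterparts of these functions, `gg_hist_qq_boot()` and `gg_scatter_boot()`.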

### Histogram and QQ Plot: `hist_qq_boot()`

```{r, fig.width = 6, fig.height = 3, fig.align='center'}
# Unstandardized estimates of a user-defined parameter
gg_hist_qq_boot(fit,
                param = "ab",
                standardized = FALSE)
# Estimates of the same parameter in the standardized solution
gg_hist_qq_boot(fit,
                param = "ab",
                standardized = TRUE)
```
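
For readers who want to see what such a display conveys, the following base-R sketch draws a histogram and a normal QQ-plot for a simulated vector of estimates. It does not use or reproduce the package's plotting code; the vector `pl_ab` and the means and standard deviations used to generate it are made up for illustration.

```{r, fig.width = 6, fig.height = 3, fig.align='center'}
# Simulated stand-in for bootstrap estimates of an indirect effect:
# the product of two normally distributed estimates is typically skewed
set.seed(91011)
pl_a  <- rnorm(500, mean = 0.20, sd = 0.11)
pl_b  <- rnorm(500, mean = 0.17, sd = 0.03)
pl_ab <- pl_a * pl_b

old_par <- par(mfrow = c(1, 2))   # two panels side by side
hist(pl_ab, main = "Histogram", xlab = "a * b")
qqnorm(pl_ab, main = "Normal QQ-plot")
qqline(pl_ab)                     # reference line for a normal distribution
par(old_par)                      # restore graphics settings
```

Points that bend away from the reference line in the QQ-plot indicate a nonnormal distribution, which is one reason to prefer bootstrap CIs over symmetric normal-theory CIs for an indirect effect.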

### Scatterplot Matrix: `scatter_boot()`

```{r}
# standardized solution
gg_scatter_boot(fit,
                param = c("a", "b", "ab"),
                standardized = TRUE)
# unstandardized solution
gg_scatter_boot(fit,
                param = c("a", "b", "ab"),
                standardized = FALSE)
```

## Reference(s)