---
title: "Approximate Case Influence Using Scores and Casewise Likelihood"
date: "2026-03-04"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Approximate Case Influence Using Scores and Casewise Likelihood}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: references.bib
csl: apa.csl
---



This vignette explains the computational shortcut implemented in [semfindr](https://sfcheung.github.io/semfindr/) [@cheung_semfindr_2026] to approximate the casewise influence. The approximation is most beneficial when the sample size *N* is large, where the approximation works better and the computational cost for refitting models *N* times is high.


``` r
library(semfindr)
dat <- pa_dat
```


``` r
library(lavaan)
mod <-
"
m1 ~ iv1 + iv2
dv ~ m1
"
fit <- sem(mod, dat)
```



## Using Scores to Approximate Case Influence

`lavaan` provides the handy `lavScores()` function to evaluate
$$s_i(\theta_m) = \frac{\partial \ell_i(\theta_m)}{\partial \theta_m}$$
for observation $i$, where $\ell_i(\theta)$ denotes the casewise loglikelihood function and $\theta_m$ is the $m$th model parameter.

For example,


``` r
head(lavScores(fit)[ , 1, drop = FALSE])
#>           m1~iv1
#> [1,]  0.23975993
#> [2,]  0.06630118
#> [3,] -0.33999129
#> [4,] -0.22536120
#> [5,]  0.61375747
#> [6,]  0.03747179
```

indicates the partial derivative of the casewise loglikelihod with respect to the parameter `m1~iv1`. Because the sum of the partial derivatives across all observations is zero at the maximum likelihood estimate with the full sample ($\hat \theta_m$; i.e., the derivative of loglikelihood of the full data is 0), $- s_i(\theta_m)$ can be used as an estimate of the partial derivative of the loglikelihood at $\hat \theta_m$ **for the sample without observation $i$**. This information can be used to approximate the maximum likelihood estimate for $\theta_m$ when case $i$ is dropped, denoted as $\hat \theta_{m(-i)}$

The second-order Taylor series expansion can be used to approximate the parameter vector estimate with an observation deleted, $\hat \theta_{(-i)}$, as in the iterative [Newton's method](https://en.wikipedia.org/wiki/Newton%27s_method_in_optimization). Specifically,

$$\hat \theta_{(i)} \approx \hat \theta - \frac{N}{N - 1}V(\hat \theta) \nabla \ell(\hat \theta)$$
$$\hat \theta - \hat \theta_{(i)} \approx \frac{N}{N - 1}V(\hat \theta) \nabla \ell_i(\hat \theta),$$
where $\nabla \ell_i(\hat \theta)$ is the gradient vector of the casewise loglikelihood with respect to the parameters (i.e., score). The $N / (N - 1)$ term is used to adjust for the decrease in sample size (this adjustment is trivial in large samples). This procedure should be the same as equation (4) of @tanaka_influence_1991 (p. 3807) and is related to the one-step approximation described by [@cook_residuals_1982, p. 182].

### Comparison

The approximation is implemented in the `est_change_raw_approx()` function:


``` r
fit_est_change_approx <- est_change_raw_approx(fit)
fit_est_change_approx
#> 
#> -- Approximate Case Influence on Parameter Estimates --
#> 
#>    id m1~iv1  id m1~iv2 id  dv~m1  id m1~~m1 id dv~~dv
#> 1  51  0.042  43 -0.025 65  0.037  61  0.042 16  0.106
#> 2  43 -0.040  94  0.023 11 -0.027  85  0.040  9  0.050
#> 3  34 -0.032 100 -0.022 16 -0.024 100  0.037 76  0.049
#> 4  18 -0.028  85  0.021 32 -0.020  18  0.031 25  0.049
#> 5  13  0.028  20  0.020 99  0.020  42  0.028 91  0.043
#> 6  32 -0.025  32  0.019 79  0.018  43  0.025 17  0.039
#> 7  20 -0.024  65  0.019 93  0.018  32  0.023 26  0.028
#> 8  75  0.021  34 -0.018 22  0.017  34  0.022 65  0.027
#> 9  42 -0.020  64 -0.016 61 -0.016  40  0.022 62  0.027
#> 10 68  0.018  52  0.016 25 -0.015  20  0.022 90  0.024
#> 
#> Note:
#> - Changes are approximate raw changes if a case is included.
#> - Only the first 10 case(s) is/are displayed. Set 'first' to NULL to display all cases.
#> - Cases sorted by the absolute changes for each variable.
```

Here is a comparison between the approximation using `semfindr::est_change_raw_approx()` and `semfindr::est_change_raw()`


``` r
# From semfindr
fit_rerun <- lavaan_rerun(fit)
#> The expected CPU time is 39 second(s).
#> Could be faster if run in parallel.
fit_est_change_raw <- est_change_raw(fit_rerun)
# Plot the differences
library(ggplot2)
tmp1 <- as.vector(t(as.matrix(fit_est_change_raw)))
tmp2 <- as.vector(t(as.matrix(fit_est_change_approx)))
est_change_df <- data.frame(param = rep(colnames(fit_est_change_raw),
                                        nrow(fit_est_change_raw)),
                             est_change = tmp1,
                             est_change_approx = tmp2)
ggplot(est_change_df, aes(x = est_change, y = est_change_approx)) +
  geom_abline(intercept = 0, slope = 1) +
  geom_point(size = 0.8, alpha = 0.5) +
  facet_wrap(~ param) +
  coord_fixed()
```

![Compare Changes in Estimates](compare-est-change-1.png)

The results are pretty similar.

### Generalized Cook's distance (*gCD*)

We can use the approximate parameter changes to approximate the *gCD* [see also @tanaka_influence_1991, equation 13, p. 3811]:


``` r
# Information matrix (Hessian)
information_fit <- lavInspect(fit, what = "information")
# Short cut for computing quadratic form (https://stackoverflow.com/questions/27157127/efficient-way-of-calculating-quadratic-forms-avoid-for-loops)
gcd_approx <- (nobs(fit) - 1) * rowSums(
  (fit_est_change_approx %*% information_fit) * fit_est_change_approx
)
```

This is implemented in the `est_change_approx()` function:


``` r
fit_est_change_approx <- est_change_approx(fit)
fit_est_change_approx
#> 
#> -- Approximate Standardized Case Influence on Parameter Estimates --
#> 
#>     m1~iv1 m1~iv2  dv~m1 m1~~m1 dv~~dv gcd_approx
#> 16   0.052 -0.038 -0.228 -0.006  0.572      0.372
#> 43  -0.387 -0.249 -0.135  0.201  0.116      0.270
#> 65   0.150  0.189  0.355  0.071  0.148      0.203
#> 85  -0.170  0.211 -0.118  0.315 -0.054      0.187
#> 51   0.405 -0.052  0.094  0.075 -0.046      0.179
#> 34  -0.306 -0.186 -0.110  0.176  0.028      0.163
#> 32  -0.241  0.190 -0.189  0.181 -0.002      0.161
#> 20  -0.234  0.199 -0.140  0.172 -0.034      0.144
#> 18  -0.269  0.035  0.101  0.246 -0.048      0.143
#> 100 -0.001 -0.221 -0.069  0.290 -0.058      0.137
#> 
#> Note:
#> - Changes are approximate standardized raw changes if a case is included.
#> - Only the first 10 case(s) is/are displayed. Set 'first' to NULL to display all cases.
#> - Cases sorted by approximate generalized Cook's distance.
```


``` r
# Compare to exact computation
fit_est_change <- est_change(fit_rerun)
# Plot
gcd_df <- data.frame(
  gcd_exact = fit_est_change[ , "gcd"],
  gcd_approx = fit_est_change_approx[ , "gcd_approx"]
)
ggplot(gcd_df, aes(x = gcd_exact, y = gcd_approx)) +
  geom_abline(intercept = 0, slope = 1) +
  geom_point() +
  coord_fixed()
```

![Compare gCDs](compare-approx-gcd-1.png)

The approximation tend to  underestimate the actual *gCD* but the rank ordering is almost identical. This is discussed also in @tanaka_influence_1991, who proposed a correction by applying a one-step approximation after the correction (currently not implemented due to the need to recompute scores with updated parameter values).


``` r
cor(gcd_df, method = "spearman")
#>            gcd_exact gcd_approx
#> gcd_exact   1.000000   0.999892
#> gcd_approx  0.999892   1.000000
```

## Approximate Change in Fit

The casewise loglikelihood---the contribution to the likelihood function by an observation---can be computed in `lavaan`, which approximates the change in loglikelihood when an observation is deleted:


``` r
lli <- lavInspect(fit, what = "loglik.casewise")
head(lli)
#> [1] -2.776787 -2.034084 -2.154825 -2.248100 -2.793426 -2.049238
```

Here, $\ell(\hat \theta)$ will drop 2.78 when observation 1 is deleted. This should approximate $\ell(\hat \theta_{(-i)})$ as long as $\hat \theta_{(-i)}$ is not too different from $\hat \theta$. Here's a comparison:


``` r
# Predicted ll without observation 1
fit@loglik$loglik - lli[1]
#> [1] -289.8272
# Actual ll without observation 1
fit_no1 <- sem(mod, dat[-1, ])
fit_no1@loglik$loglik
#> [1] -289.8156
```

They are pretty close. To approximate the change in $\chi^2$, as well as other $\chi^2$-based fit indices, we can use the `fit_measures_change_approx()` function:


``` r
chisq_i_approx <- fit_measures_change_approx(fit)
# Compare to the actual chisq when dropping observation 1
c(predict = chisq_i_approx[1, "chisq"] + fitmeasures(fit, "chisq"),
  actual = fitmeasures(fit_no1, "chisq"))
#> predict.chisq  actual.chisq 
#>      6.871042      6.557397
```

### Comparing exact and approximate changes in fit indices

Change in $\chi^2$


``` r
# Exact measure from semfindr
out <- fit_measures_change(fit_rerun)
# Plot
chisq_change_df <- data.frame(
  chisq_change = out[ , "chisq"],
  chisq_change_approx = chisq_i_approx[ , "chisq"]
)
ggplot(chisq_change_df, aes(x = chisq_change, y = chisq_change_approx)) +
  geom_abline(intercept = 0, slope = 1) +
  geom_point() +
  coord_fixed()
```

![Compare Changes in Chi-Square](plot-change-chisq-1.png)

Change in RMSEA


``` r
# Plot
rmsea_change_df <- data.frame(
  rmsea_change = out[ , "rmsea"],
  rmsea_change_approx = chisq_i_approx[ , "rmsea"]
)
ggplot(rmsea_change_df, aes(x = rmsea_change, y = rmsea_change_approx)) +
  geom_abline(intercept = 0, slope = 1) +
  geom_point() +
  coord_fixed()
```

![Compare Changes in RMSEA](plot-change-rmsea-1.png)

The values aligned reasonably well along the 45-degree line.

# Limitations

- The approximate approach is tested only for models
  fitted by maximum likelihood (ML) with normal theory
  standard errors (the default).

- The approximate approach does not yet support multilevel
models.

The `lavaan` object will be checked by `approx_check()` to see if it is
supported. If not, an error will be raised.

# References
