---
title: "Using the eFCM Package: An Example with European Winter Precipitation"
output:
  rmarkdown::html_vignette:
    code_download: false
vignette: >
  %\VignetteIndexEntry{prEurope}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

This vignette illustrates a complete workflow for fitting the exponential factor copula model using the `eFCM` package, including data preprocessing, model fitting, diagnostic checking, and simulation. We illustrate the workflow using precipitation data from a counterfactual climate scenario, chosen here purely as an example dataset.

## Exponential factor copula model

The exponential factor copula model (eFCM; Castro-Camilo and Huser, 2020) is stochastically defined as

$$
W(s) = Z(s) + V,
$$

where:

- \( Z(s) \), \( s \in \boldsymbol{S} \subset \mathbb{R}^2 \), is a standard Gaussian process with stationary correlation function \( \rho(h) \),  
- \( h = \| s_1 - s_2 \| \) is the distance between two locations \( s_1, s_2 \in \boldsymbol{S} \),
- \( V \sim \text{Exp}(\lambda) \) is an exponentially distributed random variable with rate parameter $\lambda$, independent of both \( Z(s) \) and the spatial location \(s\).

The process W(s) belongs to the class of asymptotically dependent models and may be viewed as a Gaussian location mixture. Dependence is introduced through the common factor V, which induces asymptotic dependence while allowing flexible representation of sub-asymptotic extremal dependence. Because this factor does not vary spatially, the model is particularly well-suited to spatially homogeneous regions, where it is reasonable to assume similar marginal distributions and tail behavior across sites.

If \( Z_j = Z(s_j) \), for \( j = 1,\ldots,d \), the multivariate latent Gaussian vector \( \boldsymbol{Z} = (Z_1, \dots, Z_d)^\top \) follows a multivariate normal distribution, \(\boldsymbol{Z} \sim \Phi(\cdot; \Sigma_Z),\), where the covariance matrix \( \Sigma_Z \) is determined by the correlation function \( \rho(h) \). In this application, we adopt the exponential correlation function,a special case of the Matérn class,
$$
\rho(h) = \exp\left(-\frac{h}{\delta}\right),
$$

where \( \delta > 0 \) is a range parameter controlling the spatial decay. With this, the joint distribution of the process \(W\) is
$$F_d^W(\mathbf{w}) = \Phi_{D}(\mathbf{w};\mathbf{\Sigma}_{\mathbf{Z}})  - \sum_{j = 1}^D\exp\left(\frac{\lambda^2}{2} - \lambda w_j\right)\Phi_{D}\left(\mathbf{q}_{j,0}- \mathbf{\mu}_{j,0};\mathbf\Omega_{j,0}\right),$$
where 
$$
\mathbf{q}_{j,0} = \begin{pmatrix}\mathbf{w}_{-j} - \mathbf{\Sigma}_{\mathbf{Z};-j,j}w_j\\ 0\end{pmatrix},  \quad 
\mathbf{\mu}_{j,0} = \begin{pmatrix} (w_j - \lambda)(\mathbf{1}_{d-1} - \mathbf{\Sigma}_{\mathbf{Z};-j,j})\\ \lambda -w_j \end{pmatrix},$$
$$\mathbf\Omega_{j,0} =\begin{pmatrix} \mathbf{1}_{d-1}\mathbf{1}_{d-1}^T - 2\mathbf{\Sigma}_{\mathbf{Z};-j,j}+\mathbf{\Sigma}_{\mathbf{Z};-j,-j}&\mathbf{\Sigma}_{\mathbf{Z}; -j,j}-\mathbf{1}_{d-1}\\\mathbf{\Sigma}_{\mathbf{Z}; -j,j}-\mathbf{1}_{d-1}&1\end{pmatrix},
$$
In particular, by setting $d=1$, the marginal distribution may be written as 
$$F_1^{\mathbf{W}}(w;\lambda) = \Phi(w) - \exp(\lambda^2/2 - \lambda w)\Phi(w - \lambda).$$

## Data Preparation

We start by loading the package and the pre-processed weekly counterfactual precipitation data.
Specifically, `counterfactual` contains weekly maxima of precipitation under natural-only forcing,
and `LonLat` provides spatial coordinates for each station.

```{r setup, message=FALSE, warning=FALSE}
library(eFCM)

# Load weekly precipitation maxima (pre-aggregated)
data("counterfactual")  # matrix/data frame: [weeks × stations]

# Load station coordinates
data("LonLat")
coord <- LonLat
```

We can visualize how spatially averaged weekly maxima evolve over time:

```{r, fig.width = 6, fig.height = 4, fig.align='center'}
plot(1:nrow(counterfactual), apply(counterfactual, 1, mean),
     type = "l", xlab = "Week", ylab = "Mean Precipitation (mm)",
     main = "Weekly Maxima of Counterfactual Precipitation")
abline(h = quantile(apply(counterfactual, 1, mean), 0.9), col = "red", lty = 2)
```

## Create Spatial Data Objects

The function `fdata()` converts the spatiotemporal precipitation data into the format required for model fitting:

```{r, eval=FALSE}
cf_data = fdata(counterfactual, coord, cellsize = c(1, 1))
```
To reduce vignette build time, we load precomputed results:

```{r, echo=T}
data(cf_data)
data(fit)
```

## Model Fitting

We fit the exponential factor copula model to the counterfactual data using `fcm()`:

```{r, eval=FALSE}
fit = fcm(s = 1, cf_data, boot = T, R = 1000)
```
Here, `s = 1` is the index of the grid point and the and the exceedance threshold is set via `thres`, a quantile level in (0,1) (default 0.9), and `boot = TRUE`, `R = 1000` requests 1,000 bootstrap replications for uncertainty quantification.


## Model Summary and Diagnostics

```{r}
summary(fit)
```

We can extract point estimates of the parameters. Recall that the parameter estimates ($\lambda$ and $\delta$) describe the strength of the common factor and the spatial range of dependence, respectively:

```{r}
coef(fit, method = "hessian")
coef(fit, method = "boot")
```

We can also compute the usual model selection criteria indices:

```{r}
logLik(fit)
AIC(fit)
BIC(fit)
AICc(fit)
```

## Goodness-of-Fit

We can draw Q-Q plots to compare empirical and model-based upper tail quantiles for a given station. By default, `qqplot()` also overlays a generalized Pareto distribution (GPD) fit; this can be suppressed by setting `gpd = FALSE`.

```{r, fig.width = 4, fig.height = 4.5, fig.align='center'}
qqplot(fit, which = 1)
```

We can also assess extremal dependence using the conditional exceedance probability $\chi_h(u)$, which measures the probability of simultaneous exceedances at high but finite thresholds. For two locations $s_1$ and $s_2$ separated by distance $h$, with respective vector components $W(s_1)$ and $W(s_2)$, $\chi_h(u)$ is defined as $\chi_h(u)= \lim_{u\to1}\Pr(W_{s_1}(W(s_1))>u\mid F_{s_2}(W(s_2))>u)$. The function `chiplot()` in `eFCM` plots model-based estimated of $\chi_h(u)$ alongside their empirical counterpart for a range of values of $u$. The package also provides two methods for pointwise confidence intervals: a normal approximation to the MLE or bootstrap resampling. Both approaches are illustrated below.

```{r, eval=T, fig.width = 4, fig.height = 4.5, fig.align='center'}
chiplot(fit, method = "hessian", ylim = c(0, 1))
```

```{r, eval=T, fig.width = 4, fig.height = 4.5, fig.align='center'}
chiplot(fit, method = "boot", ylim = c(0, 1))
```


## Simulation from the Fitted Model

To generate synthetic precipitation fields consistent with the estimated extremal dependence structure, we can simulate from the fitted model as follows:

```{r, eval=FALSE}
lbda <- fit$par[1]
delta <- fit$par[2]
dist <- rdist.earth(fit$coord)
sim  <- rmfcm(lbda, delta, dist, n = 1e4)
```



## Summary

This vignette has demonstrated the complete pipeline for fitting the exponential factor copula model using the `eFCM` package. It covered preprocessing of data (in this case, climate model outputs), model estimation, diagnostic checks, and simulation. For a more detailed discussion of the method and its application to extreme event attribution, see Li and Castro-Camilo (2025+).

## References

Castro-Camilo, D., & Huser, R. (2020). Local likelihood estimation of complex tail dependence structures, with application to U.S. precipitation extremes. *Journal of the American Statistical Association*, *115*(531), 1037–1054. <https://doi.org/10.1080/01621459.2019.1611584>

Li, M., & Castro-Camilo, D. (2025+). On the importance of tail assumptions in climate extreme event attribution. arXiv preprint. <https://doi.org/10.48550/arXiv.2507.14019>
