---
title: "Double RKHS PLS (rkhs_xy): Theory and Usage"
shorttitle: "Double RKHS PLS"
author:
- name: "Frédéric Bertrand"
  affiliation:
  - Cedric, Cnam, Paris
  email: frederic.bertrand@lecnam.net
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{Double RKHS PLS (rkhs_xy): Theory and Usage}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup_ops, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "figures/double-rkhs-pls-",
  fig.width = 7,
  fig.height = 5,
  dpi = 150,
  message = FALSE,
  warning = FALSE
)

LOCAL <- identical(Sys.getenv("LOCAL"), "TRUE")
set.seed(2025)
```

## Overview

We implement a **double RKHS** variant of PLS, where both the input and the output
spaces are endowed with reproducing kernels:

- \( K_X \in \mathbb{R}^{n\times n} \) with entries \( [K_X]_{ij} = k_X(x_i, x_j) \),
- \( K_Y \in \mathbb{R}^{n\times n} \) with entries \( [K_Y]_{ij} = k_Y(y_i, y_j) \).

We use centered Grams \( \tilde K_X = H K_X H \) and \( \tilde K_Y = H K_Y H \), where
\( H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top \).
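The two Gram matrices and their centering can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the helper `rbf_gram()` is hypothetical, and the parameterization \(k_X(x,z)=\exp(-\gamma\|x-z\|^2)\) is an assumption.

```r
# Hypothetical helper (not part of bigPLSR): RBF Gram matrix,
# assuming k(x, z) = exp(-gamma * ||x - z||^2).
rbf_gram <- function(X, Z = X, gamma = 0.5) {
  # Squared distances via ||x||^2 + ||z||^2 - 2 x'z, clipped at zero
  d2 <- outer(rowSums(X^2), rowSums(Z^2), "+") - 2 * tcrossprod(X, Z)
  exp(-gamma * pmax(d2, 0))
}

set.seed(1)
n <- 8
X <- matrix(rnorm(n * 3), n, 3)
Y <- matrix(rnorm(n * 2), n, 2)

K_X <- rbf_gram(X)     # nonlinear input kernel
K_Y <- tcrossprod(Y)   # linear output kernel: [K_Y]_ij = y_i' y_j

H    <- diag(n) - matrix(1 / n, n, n)   # centering matrix H = I - 11'/n
Kc_X <- H %*% K_X %*% H
Kc_Y <- H %*% K_Y %*% H

# Rows and columns of a centered Gram sum to (numerically) zero.
max(abs(rowSums(Kc_X)))
max(abs(colSums(Kc_X)))

# For a linear kernel, centering the Gram is the same as centering the data.
Yc <- scale(Y, center = TRUE, scale = FALSE)
all.equal(Kc_Y, tcrossprod(Yc), check.attributes = FALSE)
```

The last check shows why a linear \(k_Y\) amounts to working with the column-centered response matrix.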

### Operator and Latent Directions

In the spirit of *Kernel PLS Regression II* (IEEE TNNLS, 2019), we avoid
explicit matrix square roots and form the **symmetric positive semidefinite surrogate operator**
\[
\mathcal{M} \, v
= (K_X+\lambda_x I)^{-1} \; K_X \; K_Y \; K_X \; (K_X+\lambda_x I)^{-1} \, v,
\]
with a small ridge \( \lambda_x > 0 \) for numerical stability. We compute the first \(A\)
orthonormal latent directions \(T = [t_1,\dots,t_A]\) by power iteration on
\(\mathcal{M}\) with Gram–Schmidt orthogonalization.
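The extraction of latent directions can be sketched as below. This is an illustrative base-R version under stated assumptions, not the package's compiled implementation: the helper `power_directions()` is hypothetical, the centered Grams are used in place of \(K_X\) and \(K_Y\), and each new direction is orthogonalized against the ones already found.

```r
set.seed(1)
n <- 30
X <- matrix(rnorm(n * 3), n, 3)
Y <- cbind(sin(X[, 1]), X[, 2]^2, X[, 3])   # m = 3 responses

H   <- diag(n) - matrix(1 / n, n, n)
K_X <- H %*% exp(-0.5 * as.matrix(dist(X))^2) %*% H   # centered RBF Gram
K_Y <- H %*% tcrossprod(Y) %*% H                      # centered linear Gram

lambda_x <- 1e-6
A <- solve(K_X + lambda_x * diag(n), K_X)   # (K_X + lambda_x I)^{-1} K_X
M <- A %*% K_Y %*% t(A)
M <- (M + t(M)) / 2                         # remove rounding asymmetry

# Hypothetical helper: leading directions of a symmetric PSD matrix by
# power iteration, orthogonalizing against the directions already found.
# With a linear output kernel M has rank at most m, so ncomp <= m here.
power_directions <- function(M, ncomp, iters = 1000, tol = 1e-12) {
  n <- nrow(M)
  T_mat <- matrix(0, n, ncomp)
  for (a in seq_len(ncomp)) {
    prev <- T_mat[, seq_len(a - 1), drop = FALSE]   # n x (a - 1)
    v <- rnorm(n)
    v <- v / sqrt(sum(v^2))
    for (it in seq_len(iters)) {
      w <- M %*% v
      w <- w - prev %*% crossprod(prev, w)   # Gram-Schmidt step
      w <- w / sqrt(sum(w^2))
      converged <- sqrt(sum((w - v)^2)) < tol
      v <- w
      if (converged) break
    }
    T_mat[, a] <- v
  }
  T_mat
}

T_mat <- power_directions(M, ncomp = 2)
round(crossprod(T_mat), 6)   # close to the 2 x 2 identity

# The Rayleigh quotient of t_1 should approach the largest eigenvalue of M.
ev <- eigen(M, symmetric = TRUE)$values
c(rayleigh = drop(crossprod(T_mat[, 1], M %*% T_mat[, 1])), top_eigen = ev[1])
```

Comparing against `eigen()` is only a sanity check; the point of power iteration is that it needs matrix-vector products with \(\mathcal{M}\) rather than a full eigendecomposition.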

We then solve a **small** regression in the latent space:
\[
C = (T^\top T)^{-1} (T^\top \tilde Y),
\qquad \tilde Y = Y - \mathbf{1} \bar y^\top,
\]
and form dual coefficients
\[
\alpha \;=\; U \, C, \qquad U \;=\; (K_X+\lambda_x I)^{-1} T,
\]
so that training predictions satisfy
\[
\hat Y \;=\; \tilde K_X \, \alpha + \mathbf{1}\,\bar y^\top .
\]
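The latent regression and the dual coefficients can be traced numerically with the sketch below. It is an assumption-laden illustration rather than the package's code: the scores \(T\) are taken from `eigen()` as a shortcut for power iteration, the centered Gram is used wherever \(K_X\) appears, and all variable names are chosen for this example only.

```r
set.seed(1)
n <- 30
lambda_x <- 1e-6
X <- matrix(rnorm(n * 3), n, 3)
Y <- cbind(sin(X[, 1]), X[, 2]^2, X[, 3])

H     <- diag(n) - matrix(1 / n, n, n)
K_X   <- H %*% exp(-0.5 * as.matrix(dist(X))^2) %*% H   # centered RBF Gram
y_bar <- colMeans(Y)
Yc    <- sweep(Y, 2, y_bar)                              # Y - 1 y_bar'

# Shortcut for this sketch: leading eigenvectors of M play the role of T.
R <- K_X + lambda_x * diag(n)
A <- solve(R, K_X)
M <- A %*% tcrossprod(Yc) %*% t(A)
T_mat <- eigen((M + t(M)) / 2, symmetric = TRUE)$vectors[, 1:2]

C     <- solve(crossprod(T_mat), crossprod(T_mat, Yc))   # small A x m system
U     <- solve(R, T_mat)                                 # (K_X + lambda_x I)^{-1} T
alpha <- U %*% C                                         # n x m dual coefficients
Y_hat <- sweep(K_X %*% alpha, 2, y_bar, "+")             # K alpha + 1 y_bar'

dim(alpha)
mean((Y - Y_hat)^2)   # training mean squared error
```

Because \(T\) is orthonormal here, \(C\) reduces to \(T^\top \tilde Y\); the general form \((T^\top T)^{-1} T^\top \tilde Y\) also covers scores that are only approximately orthonormal.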

### Centering for Prediction

Given new inputs \(X_*\), define the **cross-Gram**
\[
K_* = K(X_*, X) .
\]
To apply the training centering to \(K_*\), use
\[
\tilde K_* \;=\; K_* \;-\; \mathbf{1}_* \, \bar k_X^\top \;-\; \bar k_* \mathbf{1}^\top \;+\; \mu_X \, \mathbf{1}_* \mathbf{1}^\top,
\]
where \(\mathbf{1}_*\) and \(\mathbf{1}\) are all-ones vectors of lengths \(n_*\) and \(n\), and:

- \( \bar k_X = \frac{1}{n} K_X \mathbf{1} \in \mathbb{R}^{n} \) is the **column mean** vector of the (uncentered) training Gram,
- \( \mu_X = \frac{1}{n^2} \mathbf{1}^\top K_X \mathbf{1} \) is its **grand mean**,
- \( \bar k_* = \frac{1}{n} K_* \mathbf{1} \in \mathbb{R}^{n_*} \) is the **row mean** vector of \(K_*\) (computed at prediction time).
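The centering formula above can be checked with a short base-R sketch. It reads \(\bar k_X\) as the vector of column means of the training Gram and the cross-Gram row mean as a vector of length \(n_*\); the helpers `rbf_gram()` and `center_cross_gram()` are hypothetical names used for illustration only.

```r
# Hypothetical helpers (not part of bigPLSR).
rbf_gram <- function(X, Z = X, gamma = 0.5) {
  d2 <- outer(rowSums(X^2), rowSums(Z^2), "+") - 2 * tcrossprod(X, Z)
  exp(-gamma * pmax(d2, 0))
}

center_cross_gram <- function(K_star, K_X) {
  k_bar_X    <- colMeans(K_X)      # training column means, length n
  k_bar_star <- rowMeans(K_star)   # cross-Gram row means, length n_*
  mu_X       <- mean(K_X)          # grand mean of the training Gram
  # Subtract k_bar_X[j] from column j and k_bar_star[i] from row i
  # (the vector recycles down the columns), then add the grand mean.
  sweep(K_star, 2, k_bar_X, "-") - k_bar_star + mu_X
}

set.seed(1)
n <- 8
n_star <- 5
X      <- matrix(rnorm(n * 3), n, 3)
X_star <- matrix(rnorm(n_star * 3), n_star, 3)

K_X     <- rbf_gram(X)
K_star  <- rbf_gram(X_star, X)            # n_* x n cross-Gram
Kc_star <- center_cross_gram(K_star, K_X)

# Sanity check: with X_* = X the formula reproduces H K_X H.
H <- diag(n) - matrix(1 / n, n, n)
all.equal(center_cross_gram(K_X, K_X), H %*% K_X %*% H)

# Each row of the centered cross-Gram sums to (numerically) zero.
max(abs(rowSums(Kc_star)))
```

Only the uncentered training Gram statistics (column means and grand mean) need to be stored at fit time; everything else is computed from \(K_*\) when predicting.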

Predictions then follow the familiar dual form:
\[
\hat Y_* \;=\; \tilde K_* \, \alpha + \mathbf{1}_* \, \bar y^\top .
\]
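Putting the pieces together, prediction for new inputs might look like the sketch below. The functions `rbf()`, `fit_sketch()` and `predict_sketch()` are hypothetical and only mirror the formulas in this vignette; they are not the package's `pls_fit()` or `predict()` methods, and the scores are again obtained with `eigen()` as a shortcut.

```r
rbf <- function(X, Z, gamma = 0.5) {
  d2 <- outer(rowSums(X^2), rowSums(Z^2), "+") - 2 * tcrossprod(X, Z)
  exp(-gamma * pmax(d2, 0))
}

# Hypothetical fit: stores what the prediction formula needs.
# With a linear output kernel, use ncomp <= ncol(Y) in this sketch.
fit_sketch <- function(X, Y, ncomp, gamma = 0.5, lambda_x = 1e-6) {
  n     <- nrow(X)
  H     <- diag(n) - matrix(1 / n, n, n)
  K_raw <- rbf(X, X, gamma)                 # uncentered training Gram
  K     <- H %*% K_raw %*% H
  y_bar <- colMeans(Y)
  Yc    <- sweep(Y, 2, y_bar)
  R     <- K + lambda_x * diag(n)
  A     <- solve(R, K)
  M     <- A %*% tcrossprod(Yc) %*% t(A)
  T_mat <- eigen((M + t(M)) / 2, symmetric = TRUE)$vectors[, seq_len(ncomp),
                                                           drop = FALSE]
  alpha <- solve(R, T_mat) %*% crossprod(T_mat, Yc)
  list(X = X, K_raw = K_raw, alpha = alpha, y_bar = y_bar, gamma = gamma)
}

# Hypothetical predict: center the cross-Gram, then apply the dual form.
predict_sketch <- function(fit, X_new) {
  K_star  <- rbf(X_new, fit$X, fit$gamma)
  Kc_star <- sweep(K_star, 2, colMeans(fit$K_raw)) -
    rowMeans(K_star) + mean(fit$K_raw)
  sweep(Kc_star %*% fit$alpha, 2, fit$y_bar, "+")
}

set.seed(1)
n <- 40
X <- matrix(rnorm(n * 3), n, 3)
Y <- cbind(sin(X[, 1]), X[, 2]^2, X[, 3])
X_new <- matrix(rnorm(5 * 3), 5, 3)

fit    <- fit_sketch(X, Y, ncomp = 2)
Y_new  <- predict_sketch(fit, X_new)   # 5 x 3 matrix of predictions
Y_head <- predict_sketch(fit, X)       # predictions at the training inputs
dim(Y_new)
```

Predicting at the training inputs reproduces the training fit \(\tilde K_X \alpha + \mathbf{1}\bar y^\top\), which is a convenient consistency check for any implementation of the centering step.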

### Practical Notes

- Choose \(k_X\) (e.g., RBF) to reflect **nonlinear structure** in inputs. A linear \(k_Y\) already produces numeric outputs in \(\mathbb{R}^m\).
- The ridge terms \( \lambda_x, \lambda_y \) stabilize inversions and dampen numerical noise.
- With `algorithm = "rkhs_xy"`, the package returns:
  - `dual_coef` \(=\alpha\),
  - `scores` \(=T\) (approximately orthonormal),
  - `intercept` \(=\bar y\),
  - and uses the centered cross-kernel formula above in `predict()`.

### Minimal Example

```{r, eval=LOCAL, cache=TRUE}
library(bigPLSR)
set.seed(42)
n <- 60; p <- 6; m <- 2
X <- matrix(rnorm(n * p), n, p)
Y <- cbind(sin(X[,1]) + 0.4 * X[,2]^2,
           cos(X[,3]) - 0.3 * X[,4]^2) + matrix(rnorm(n*m, sd=.05), n, m)

op <- options(
  bigPLSR.rkhs_xy.kernel_x = "rbf",
  bigPLSR.rkhs_xy.gamma_x  = 0.5,
  bigPLSR.rkhs_xy.kernel_y = "linear",
  bigPLSR.rkhs_xy.lambda_x = 1e-6,
  bigPLSR.rkhs_xy.lambda_y = 1e-6
)
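fit <- pls_fit(X, Y, ncomp = 3, algorithm = "rkhs_xy", backend = "arma")
Yhat <- predict(fit, X)
mean((Y - Yhat)^2)  # training mean squared error

# Restore the previous options explicitly: on.exit() is designed for function
# bodies and is not reliable at the top level of a knitr chunk.
options(op)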
```

## References

- Rosipal & Trejo (2001). Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. JMLR 2:97–123. doi:10.5555/944733.944741.
- Kernel PLS Regression II: Kernel Partial Least Squares Regression by Projecting Both Independent and Dependent Variables into Reproducing Kernel Hilbert Space. IEEE TNNLS (2019). doi:10.1109/TNNLS.2019.2932014.

