---
title: "RKHS-based Algorithms in bigPLSR"
shorttitle: "RKHS-based Algorithms"
author:
- name: "Frédéric Bertrand"
  affiliation:
  - Cedric, Cnam, Paris
  email: frederic.bertrand@lecnam.net
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{RKHS-based Algorithms in bigPLSR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup_ops, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "figures/rkhs-",
  fig.width = 7,
  fig.height = 5,
  dpi = 150,
  message = FALSE,
  warning = FALSE
)

LOCAL <- identical(Sys.getenv("LOCAL"), "TRUE")
set.seed(2025)
```

## Overview

bigPLSR implements two kernel-based partial least squares solvers:

- `algorithm = "rkhs"` (in the style of Rosipal & Trejo) projects only
  the predictor matrix \(X\) into an RKHS (see the sketch after this
  list);
- `algorithm = "rkhs_xy"` projects both \(X\) and the response matrix
  \(Y\) into (possibly different) RKHSs and couples the latent scores
  through a regularised cross-covariance operator.
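
For orientation, one pass of the Rosipal–Trejo recursion on the centered
Gram matrix \(K = \Phi\Phi^\top\) can be sketched as follows; the
package's internal normalisation and deflation details may differ:

\[
\mathbf{t} \propto K\mathbf{u}, \qquad
\mathbf{c} = Y^\top \mathbf{t}, \qquad
\mathbf{u} \propto Y\mathbf{c},
\]

with, after each extracted component, the deflations
\(K \leftarrow (I - \mathbf{t}\mathbf{t}^\top)\,K\,(I - \mathbf{t}\mathbf{t}^\top)\)
and \(Y \leftarrow Y - \mathbf{t}\mathbf{t}^\top Y\).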

Both solvers are available for dense matrices and for
`bigmemory::big.matrix` objects. The big-memory paths stream kernel
blocks and persist centering statistics so predictions remain cheap.

## Dense example

```{r dense-example}
library(bigPLSR)
set.seed(42)
n <- 120; p <- 8  # observations and predictors
X <- matrix(rnorm(n * p), n, p)
Y <- cbind(
  sin(X[, 1]) + 0.3 * X[, 2]^2 + rnorm(n, sd = 0.1),
  cos(X[, 3]) - 0.2 * X[, 4] + rnorm(n, sd = 0.1)
)

fit_rkhs <- pls_fit(X, Y, ncomp = 3, algorithm = "rkhs",
                    kernel = "rbf", gamma = 1 / p, scores = "r")

# Regularisation parameters for the cross-covariance coupling
options(bigPLSR.rkhs_xy.lambda_x = 1e-6,
        bigPLSR.rkhs_xy.lambda_y = 1e-6)

fit_rkhs_xy <- pls_fit(X, Y, ncomp = 3, algorithm = "rkhs_xy",
                       kernel = "rbf", gamma = 1 / p,
                       scores = "none")

head(predict(fit_rkhs, X))
head(predict(fit_rkhs_xy, X))
```

Both fits run in well under five seconds for this moderately sized
example. The RKHS-XY variant stores kernel centering statistics for both
sides so that `predict()` can re-use them without recomputing the entire
Gram matrix.
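
Concretely, the centering statistics reduce to the column means and the
grand mean of the training Gram matrix; with those two summaries, a
cross-kernel between new and training rows can be double-centered
directly. A minimal sketch of that arithmetic in plain R follows (the
`rbf_kernel()` helper and the object names are illustrative, not part of
the package API):

```{r centering-sketch}
# Illustrative double-centering; not bigPLSR's internal code.
rbf_kernel <- function(A, B, gamma) {
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  exp(-gamma * d2)
}
K  <- rbf_kernel(X, X, gamma = 1 / p)      # training Gram matrix
km <- colMeans(K)                          # persisted column means
kg <- mean(K)                              # persisted grand mean

Xnew <- matrix(rnorm(10 * p), 10, p)       # stand-in for new rows
Kx   <- rbf_kernel(Xnew, X, gamma = 1 / p) # cross-kernel, new x train
# Double-center the cross-kernel using only the stored summaries:
Kx_c <- sweep(sweep(Kx, 1, rowMeans(Kx)), 2, km) + kg
```

Given the fitted dual coefficients, the centered cross-kernel `Kx_c` is
all that is needed to score new rows, which is why the full training Gram
matrix never has to be rebuilt at prediction time.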

## Streaming example

```{r streaming-example, eval = LOCAL}
library(bigmemory)
# Wrap the in-memory matrices as big.matrix objects
Xbm <- as.big.matrix(X)
Ybm <- as.big.matrix(Y)

fit_stream <- pls_fit(Xbm, Ybm, ncomp = 3, backend = "bigmem",
                      algorithm = "rkhs", kernel = "rbf",
                      gamma = 1 / p, chunk_size = 1024L,
                      scores = "none")
```

The streaming call attaches training descriptors (`$X_ref`) and kernel
centering summaries (`$kstats`) automatically. When `predict()` is
invoked on new data with `Xtrain = fit_stream$X_ref`, the package streams
the cross-kernel blocks and avoids materialising the full \(n_\text{new}
\times n_\text{train}\) Gram matrix.
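
A subsequent prediction call might then look like the sketch below;
`Xnew_bm` is a stand-in for the new rows, and passing a `big.matrix` of
new data directly to `predict()` is an assumption of this sketch:

```{r streaming-predict, eval = LOCAL}
Xnew_bm <- as.big.matrix(matrix(rnorm(50 * p), 50, p))
pred <- predict(fit_stream, Xnew_bm, Xtrain = fit_stream$X_ref)
head(pred)
```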

## Logistic response

Kernel logistic PLS (`algorithm = "klogitpls"`) builds on the same RKHS
infrastructure. After extracting latent scores from the centered Gram
matrix, the algorithm runs a logistic IRLS procedure in score space, with
support for class weighting and optional alternating score updates. Small
datasets (hundreds of observations) remain well within the five-second
budget.
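
The IRLS step itself is ordinary iteratively reweighted least squares on
the latent scores. The function below is a generic sketch of that update
loop in plain R, not the package's internal routine; `Tm` stands for the
extracted score matrix and `class_w` for per-observation class weights:

```{r irls-sketch}
# Generic logistic IRLS on a fixed score matrix; illustrative only.
irls_scores <- function(Tm, y, class_w = rep(1, length(y)),
                        maxit = 25L, tol = 1e-8) {
  beta <- rep(0, ncol(Tm))
  for (it in seq_len(maxit)) {
    eta <- drop(Tm %*% beta)          # linear predictor
    mu  <- plogis(eta)                # fitted probabilities
    v   <- pmax(mu * (1 - mu), 1e-10) # logistic variance function
    w   <- class_w * v                # working weights
    z   <- eta + (y - mu) / v         # working response
    beta_new <- drop(solve(crossprod(Tm, w * Tm), crossprod(Tm, w * z)))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}
```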

```{r logit-example, eval = LOCAL}
# Binary response from a nonlinear rule on the first two predictors
y <- as.integer(X[, 1]^2 + X[, 2]^2 + rnorm(n, sd = 0.2) > 1)
fit_logit <- pls_fit(X, y, ncomp = 2, algorithm = "klogitpls",
                     kernel = "rbf", gamma = 1 / p)
mean(predict(fit_logit, X))  # average fitted value on the training rows
```