---
title: "Benchmarking PLS2 Implementations"
shorttitle: "Benchmarking PLS2 Implementations"
author:
- name: "Frédéric Bertrand"
  affiliation:
  - Cedric, Cnam, Paris
  email: frederic.bertrand@lecnam.net
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{Benchmarking PLS2 Implementations}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup_ops, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "figures/benchmarking-pls2-",
  fig.width = 7,
  fig.height = 5,
  dpi = 150,
  message = FALSE,
  warning = FALSE
)

LOCAL <- identical(Sys.getenv("LOCAL"), "TRUE")
set.seed(2025)
```

```{r setup, message=FALSE}
library(bigPLSR)
library(bigmemory)
library(bench)
set.seed(456)
```

## Overview

The package offers dense (`pls2_dense`) and streaming (`pls2_stream`)
solvers for multi-response partial least squares regression (PLS2).
This vignette demonstrates how to benchmark both variants on a synthetic
dataset featuring three correlated response variables.
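
For readers new to the multi-response setting, PLS2 is commonly described
through a bilinear latent-variable model. The notation below is the
conventional textbook one and is an assumption here, not taken from the
bigPLSR sources: $X$ is the $n \times p$ predictor matrix, $Y$ the
$n \times q$ response matrix, and $T$ the $n \times k$ score matrix shared by
both blocks, with $k$ playing the role of `ncomp`.

$$
X = T P^\top + E, \qquad Y = T Q^\top + F, \qquad \hat{Y} = X B
$$

Here $P$ ($p \times k$) and $Q$ ($q \times k$) are loading matrices, $E$ and
$F$ are residuals, and $B$ ($p \times q$) collects the regression
coefficients derived from the fitted components. In the benchmark below,
$n = 1200$, $p = 60$, $q = 3$ and $k = 4$.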

### Recent additions

Beyond the dense and streaming SIMPLS/NIPALS solvers, bigPLSR now ships
with Kalman-filter PLS (`algorithm = "kf_pls"`), double-RKHS modelling
(`algorithm = "rkhs_xy"`) and optional coefficient thresholding. The
resampling helpers (`pls_cross_validate()`, `pls_bootstrap()`) can also
leverage the [`future`](https://future.futureverse.org) ecosystem for
parallel execution.

To benchmark these variants, change the `algorithm` argument in the chunks
below, for example:

```{r, eval=FALSE}
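# The two fits use different algorithms and backends, so their results are
# not expected to be identical: disable bench's equality check.
bench::mark(
  dense = pls_fit(X[], Y_mat, ncomp = ncomp, algorithm = "rkhs_xy"),
  streaming = pls_fit(X, Y, ncomp = ncomp, backend = "bigmem",
                      algorithm = "kf_pls", chunk_size = 1024L),
  check = FALSE
)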
```

and remember to reset your `future` plan after enabling parallelism:

```{r, eval=FALSE}
future::plan(future::multisession, workers = 2)
pls_cross_validate(X[], Y_mat, ncomp = 4, folds = 3,
                   parallel = TRUE)
future::plan(future::sequential)
```
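
The snippet above always falls back to `sequential`. If a different plan was
active beforehand, it may be preferable to restore that plan instead. The
helper below is a minimal sketch of this idea; `with_temporary_plan()` is a
hypothetical name and not part of bigPLSR or `future`. It relies only on the
documented behaviour that `future::plan()` returns the previous plan when a
new one is set.

```{r, eval=FALSE}
# Hypothetical helper (not part of bigPLSR): evaluate `expr` under a temporary
# future plan and restore the previous plan afterwards, even if `expr` fails.
with_temporary_plan <- function(strategy, expr, ...) {
  old_plan <- future::plan(strategy, ...)      # returns the plan being replaced
  on.exit(future::plan(old_plan), add = TRUE)  # restore on exit or on error
  force(expr)                                  # evaluate lazily, under the new plan
}
```

With this helper, the cross-validation call above could be wrapped as
`with_temporary_plan(future::multisession, pls_cross_validate(X[], Y_mat, ncomp = 4, folds = 3, parallel = TRUE), workers = 2)`.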

Multi-response benchmarks follow the same principles as the PLS1 case.
We focus on the `pls_fit()` API and contrast its dense and streaming
backends, then report stored timings for third-party packages.

## Simulated data

```{r data-generation, eval=LOCAL, cache=TRUE}
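n <- 1200      # observations
p <- 60        # predictors
q <- 3         # responses
ncomp <- 4     # latent components to extract

# Predictors: i.i.d. standard normal values stored in a big.matrix.
X <- bigmemory::big.matrix(nrow = n, ncol = p, type = "double")
X[,] <- matrix(rnorm(n * p), nrow = n)

# Responses: q latent scores mixed by the first q rows of the loading matrix,
# plus noise, which makes the columns of Y_mat correlated with each other.
# Y_mat is generated independently of X, which is sufficient for timing.
loading_matrix <- matrix(rnorm(p * q), nrow = p)
latent_scores <- matrix(rnorm(n * q), nrow = n)
Y_mat <- scale(latent_scores %*% t(loading_matrix[1:q, , drop = FALSE]) +
                 matrix(rnorm(n * q, sd = 0.5), nrow = n))

# Copy of the responses for the streaming backend.
Y <- bigmemory::big.matrix(nrow = n, ncol = q, type = "double")
Y[,] <- Y_mat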

X[1:6, 1:6]
Y[1:6, 1:min(6, q)]
```
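
To see why mixing independent latent scores yields correlated responses, the
construction can be repeated on plain matrices and checked with `cor()`. This
is a self-contained sketch; `mixing`, `latent` and `Y_demo` are hypothetical
names chosen so that the objects created above are left untouched.

```{r, eval=FALSE}
# Self-contained sketch with hypothetical names; it mimics the construction
# above but does not modify X, Y or Y_mat.
set.seed(1)
n_demo <- 1200
q_demo <- 3

# Independent latent scores become correlated once mixed by a q x q matrix.
latent <- matrix(rnorm(n_demo * q_demo), nrow = n_demo)
mixing <- matrix(rnorm(q_demo * q_demo), nrow = q_demo)
Y_demo <- scale(latent %*% t(mixing) +
                  matrix(rnorm(n_demo * q_demo, sd = 0.5), nrow = n_demo))

# Off-diagonal entries away from zero indicate correlated responses.
response_cor <- cor(Y_demo)
round(response_cor, 2)
```

The same check can be run on the benchmark data with `round(cor(Y_mat), 2)`.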

## Internal benchmarks

```{r internal-benchmark, eval=LOCAL, cache=TRUE}
internal_bench <- bench::mark(
  dense_simpls = pls_fit(as.matrix(X[]), Y_mat, ncomp = ncomp,
                         backend = "arma", algorithm = "simpls"),
  streaming_simpls = pls_fit(X, Y, ncomp = ncomp, backend = "bigmem",
                             algorithm = "simpls", chunk_size = 512L),
  dense_nipals = pls_fit(as.matrix(X[]), Y_mat, ncomp = ncomp,
                         backend = "arma", algorithm = "nipals"),
  streaming_nipals = pls_fit(X, Y, ncomp = ncomp, backend = "bigmem",
                             algorithm = "nipals", chunk_size = 512L),
  iterations = 15,
  check = FALSE
)
internal_bench
```

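As in the PLS1 benchmarks, the dense path tends to be fastest when the data
fit in memory, whereas the streaming backend prioritises scalability by
processing the data in blocks (`chunk_size = 512L` here). Note that the dense
timings include the cost of copying the `big.matrix` into RAM with `X[]`.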
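
Block-wise processing can be illustrated without bigPLSR. SIMPLS is built on
the cross-product $X^\top Y$, and that matrix can be accumulated one block of
rows at a time, so only a single block needs to be in memory at once. The
function below is a base-R sketch of this idea under a hypothetical name,
`chunked_crossprod()`; it is not the package's actual implementation.

```{r, eval=FALSE}
# Hypothetical illustration: accumulate t(X) %*% Y over blocks of rows.
chunked_crossprod <- function(X, Y, chunk_size = 512L) {
  stopifnot(nrow(X) == nrow(Y), chunk_size >= 1L)
  S <- matrix(0, nrow = ncol(X), ncol = ncol(Y))
  starts <- seq(1L, nrow(X), by = chunk_size)
  for (start in starts) {
    # Row indices of the current block; the last block may be shorter.
    rows <- start:min(start + chunk_size - 1L, nrow(X))
    S <- S + crossprod(X[rows, , drop = FALSE], Y[rows, , drop = FALSE])
  }
  S
}

# Compare with the one-shot cross-product on data of the same shape as above.
set.seed(1)
X_demo <- matrix(rnorm(1200 * 60), nrow = 1200)
Y_demo <- matrix(rnorm(1200 * 3), nrow = 1200)
all.equal(chunked_crossprod(X_demo, Y_demo), crossprod(X_demo, Y_demo))
```

The chunked and one-shot results agree up to floating-point rounding, which
is why a streaming solver can trade a few extra passes over the data for a
much smaller memory footprint.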

## External references

```{r external-benchmark}
data("external_pls_benchmarks", package = "bigPLSR")
subset(external_pls_benchmarks, task == "pls2")
```

The stored table mirrors the structure of the PLS1 benchmark and was
produced with the script in `inst/scripts/external_pls_benchmarks.R`.

## Key messages

* Dense SIMPLS remains the fastest option for dense matrices that fit
  comfortably in memory.
* Streaming NIPALS offers robustness when responses are numerous or when
  the predictor matrix is file-backed.
* External comparisons help position bigPLSR relative to established
  alternatives without adding heavyweight dependencies to the vignette.
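
Whether a predictor matrix is small enough for the dense path can be
estimated with simple arithmetic, since a double-precision value occupies
8 bytes. The helper below is a rough sketch under a hypothetical name,
`dense_bytes()`; it ignores R's per-object overhead and any working copies a
solver may allocate, so it should be read as a lower bound.

```{r, eval=FALSE}
# Hypothetical helper: bytes needed to hold an n x p matrix of doubles
# (8 bytes per value), ignoring R's small per-object overhead.
dense_bytes <- function(n, p) as.numeric(n) * as.numeric(p) * 8

# The simulated predictor matrix in this vignette: 1200 x 60 doubles.
dense_bytes(1200, 60) / 1024^2   # size in MiB

# A much larger problem: one million rows and one thousand predictors.
dense_bytes(1e6, 1e3) / 1024^3   # size in GiB
```

The first call gives roughly 0.55 MiB and the second roughly 7.45 GiB, which
suggests that the simulated data sit firmly in dense territory while the
larger problem is a natural candidate for the streaming backend.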