---
title: "2. Guided Partial Least Squares (guided-PLS)"
author:
- name: Koki Tsuyuzaki
  affiliation: Laboratory for Bioinformatics Research,
    RIKEN Center for Biosystems Dynamics Research
  email: k.t.the-answer@hotmail.co.jp
date: "`r Sys.Date()`"
bibliography: bibliography.bib
package: guidedPLS
output: rmarkdown::html_vignette
vignette: |
  %\VignetteIndexEntry{2. Guided Partial Least Squares (guided-PLS)}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Introduction

In this vignette, we introduce guided partial least squares (guided-PLS), a novel supervised dimensionality reduction method.

Test data can be generated with the `toyModel` function.

```{r data, echo=TRUE}
library("guidedPLS")
data <- guidedPLS::toyModel("Easy")
str(data, 2)
```

You will see that there are three blocks in the data matrix as follows.

```{r data2, echo=TRUE, fig.height=6, fig.width=6}
suppressMessages(library("fields"))
layout(c(1,2,3))
image.plot(data$Y1_dummy, main="Y1 (Dummy)", legend.mar=8)
image.plot(data$Y1, main="Y1", legend.mar=8)
image.plot(data$X1, main="X1", legend.mar=8)
```

# Guided Partial Least Squares (guided-PLS)

Suppose that we have two data matrices $X_1$ ($N \times M$) and $X_2$ ($S \times T$), whose row vectors are assumed to be centered. Since these two matrices share neither rows nor columns, integrating them is not trivial. Such a data structure is called "diagonal" and is known as a barrier to omics data integration [@diagonal].

The problem becomes simpler if additional information is available: suppose that we also have another set of matrices $Y_1$ ($M \times I$) and $Y_2$ ($T \times I$), which are the label matrices for $X_1$ and $X_2$, respectively.
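For intuition about what a label matrix can look like, the following base R sketch builds a one-hot (dummy) encoding from a set of class labels. The labels here are hypothetical and are not taken from `toyModel`; the orientation (one row per labeled item, one column per class) is also an assumption made for illustration, so check the dimensions of your own matrices against what `guidedPLS` expects.

```{r labelsketch, echo=TRUE}
# Hypothetical class labels for five items (not taken from toyModel)
labels <- factor(c("a", "b", "a", "c", "b"))

# One-hot (dummy) encoding: one row per item, one column per class
Y <- outer(as.integer(labels), seq_len(nlevels(labels)), "==") * 1
colnames(Y) <- levels(labels)
Y

# Each item belongs to exactly one class, so every row sums to 1
stopifnot(all(rowSums(Y) == 1))
```

A continuous-valued label matrix can play the same role; the dummy encoding is simply the easiest case to inspect by eye.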

In guided-PLS, the data matrices $X_1$ and $X_2$ are first projected into a lower-dimensional space via $Y_1$ and $Y_2$, and PLS-SVD is then performed on $Y_{1} X_{1}$ and $Y_{2} X_{2}$ as follows:

$$
\max_{W_{1},W_{2}} \mathrm{tr}
\left(
W_{1}^{T} X_{1}^{T} Y_{1}^{T} Y_{2} X_{2} W_{2}
\right)\ \mathrm{s.t.}\ W_{1}^{T}W_{1} = W_{2}^{T}W_{2} = I_{K}
$$
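An objective of this form (a trace maximized under orthonormality constraints) is solved by a singular value decomposition of the cross-product matrix: the leading $K$ left and right singular vectors give $W_1$ and $W_2$, and the maximum equals the sum of the top $K$ singular values. The base R sketch below illustrates this on random matrices. The names `A` and `B` are hypothetical stand-ins for $Y_{1} X_{1}$ and $Y_{2} X_{2}$, and the sizes are arbitrary; this is not the internal implementation of `guidedPLS`.

```{r svdsketch, echo=TRUE}
set.seed(1)
n_lab <- 3; n_m <- 8; n_t <- 6; K <- 2   # arbitrary sizes for illustration

# Hypothetical stand-ins for the guided matrices (shared first dimension)
A <- matrix(rnorm(n_lab * n_m), nrow = n_lab)  # plays the role of Y1 X1
B <- matrix(rnorm(n_lab * n_t), nrow = n_lab)  # plays the role of Y2 X2

# Cross-product matrix t(A) %*% B and its SVD
C <- crossprod(A, B)
s <- svd(C)

# Leading K singular vectors satisfy the orthonormality constraints
W1 <- s$u[, seq_len(K), drop = FALSE]
W2 <- s$v[, seq_len(K), drop = FALSE]
stopifnot(isTRUE(all.equal(crossprod(W1), diag(K))))
stopifnot(isTRUE(all.equal(crossprod(W2), diag(K))))

# The objective at the SVD solution equals the sum of the top K singular values
objective <- sum(diag(t(W1) %*% C %*% W2))
stopifnot(isTRUE(all.equal(objective, sum(s$d[seq_len(K)]))))

# A random feasible pair (orthonormal columns via QR) does not exceed it
R1 <- qr.Q(qr(matrix(rnorm(n_m * K), nrow = n_m)))
R2 <- qr.Q(qr(matrix(rnorm(n_t * K), nrow = n_t)))
random_objective <- sum(diag(t(R1) %*% C %*% R2))
stopifnot(random_objective <= objective + 1e-8)
```

Because the cross-product is taken over the shared label dimension, the two original matrices never need a common row or column, which is what makes the "diagonal" setting tractable here.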

## Basic Usage

`guidedPLS` is run as follows, here with `k=2` components. The resulting scores of $X_1$ and $X_2$ are then plotted together.

```{r plssvd, echo=TRUE}
out <- guidedPLS(X1=data$X1, X2=data$X2, Y1=data$Y1, Y2=data$Y2, k=2)
```

```{r plotplssvd, echo=TRUE, fig.height=4, fig.width=4}
# Plot the scores of X1 (triangles) and X2 (crosses) in the same space
plot(rbind(out$scoreX1, out$scoreX2), col=c(data$col1, data$col2),
    pch=c(rep(2, length=nrow(out$scoreX1)), rep(3, length=nrow(out$scoreX2))))
legend("bottomleft", legend=c("XY1", "XY2"), pch=c(2,3))
```

# Session Information {.unnumbered}

```{r sessionInfo, echo=FALSE}
sessionInfo()
```

# References