---
title: "hyperoverlap"
author: "Matilda Brown"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{hyperoverlap}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(knitr)
library(rgl)
knit_hooks$set(webgl = hook_webgl)

if (!requireNamespace("rmarkdown", quietly = TRUE) ||
    !rmarkdown::pandoc_available("1.14")) {
  # Stop knitting early if the required tools are missing
  warning(call. = FALSE,
          "These vignettes assume rmarkdown and pandoc version 1.14. ",
          "These were not found. Older versions will not work.")
  knitr::knit_exit()
}
```

Hyperoverlap can be used to detect and visualise overlap in n-dimensional space. 

## Data: iris
To explore the functions in hyperoverlap, we'll use the `iris` dataset, which contains 150 observations of three species of iris ("setosa", "versicolor" and "virginica"). The data are four-dimensional (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) and are documented in `?iris`. We'll set up five test datasets to explore the different functions:

1. `test1`: two entities (setosa, virginica); three dimensions (Sepal.Length, Sepal.Width, Petal.Length)
1. `test2`: two entities (versicolor, virginica); three dimensions (as above)
1. `test3`: two entities (setosa, virginica); four dimensions
1. `test4`: two entities (versicolor, virginica); four dimensions
1. `test5`: all entities, all dimensions

```{r, results='show'}
# Columns 1:3 are the three measurements used in 3D; column 5 is Species
test1 <- iris[iris$Species != "versicolor", c(1:3, 5)]
test2 <- iris[iris$Species != "setosa", c(1:3, 5)]
# All four measurements plus Species
test3 <- iris[iris$Species != "versicolor", ]
test4 <- iris[iris$Species != "setosa", ]
test5 <- iris
```
Note that entities may be species, genera, populations, etc.
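
A quick sanity check of these subsets can be useful before running any analysis. The sketch below uses base R only and rebuilds two of the subsets under hypothetical names (`check1`, `check5`) so that it does not touch the objects defined above. It also shows that subsetting a data frame keeps unused factor levels, so the dropped species still appears as a level with a count of zero.

```{r, results='show'}
# Hypothetical names, used only for this check
check1 <- iris[iris$Species != "versicolor", c(1:3, 5)]  # same rows and columns as test1
check5 <- iris                                            # same as test5

dim(check1)            # number of rows and columns
dim(check5)
table(check1$Species)  # versicolor is still a level, with zero observations
```

Whether unused levels matter depends on the function that receives the factor; `droplevels()` removes them if needed.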

## Examining overlap between two entities in 3D
`hyperoverlap_plot` plots the decision boundary, so it can only be used when the data do not exceed three dimensions. For higher-dimensional visualisation, see `hyperoverlap_lda`.

```{r, results='show'}
library(hyperoverlap)
setosa_virginica3d <- hyperoverlap_detect(test1[,1:3], test1$Species)
versicolor_virginica3d <- hyperoverlap_detect(test2[,1:3], test2$Species)
```
To examine the result:
```{r, results='show', fig.height=5,fig.width=7, webgl=TRUE, fig.align="center"}
setosa_virginica3d@result      # the result: overlap or non-overlap?
versicolor_virginica3d@result

setosa_virginica3d@shape       # for the non-overlapping pair: was the decision boundary linear or curvilinear?

hyperoverlap_plot(setosa_virginica3d)  # plot the data and the decision boundary in 3D
```

```{r, results='show', fig.height=5,fig.width=7, webgl=TRUE, fig.align="center"}
hyperoverlap_plot(versicolor_virginica3d) 
```
Note that, when comparing versicolor and virginica, some points lie on the 'wrong side' of the boundary.
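
One way to put a number on this is to count misclassified points. The sketch below is not how hyperoverlap draws its boundary: it swaps in a plain linear discriminant analysis from the `MASS` package (an assumption of this example) and counts how many training points that linear rule assigns to the wrong species for each pair. The helper `count_wrong_side` is hypothetical.

```{r, results='show'}
# Hypothetical helper: fit an LDA to the two remaining species (3D data)
# and count the training points that it assigns to the wrong species.
count_wrong_side <- function(drop_species) {
  d <- droplevels(iris[iris$Species != drop_species, c(1:3, 5)])
  fit <- MASS::lda(Species ~ ., data = d)
  predicted <- predict(fit, d)$class
  sum(as.character(predicted) != as.character(d$Species))
}

count_wrong_side("versicolor")  # setosa vs virginica
count_wrong_side("setosa")      # versicolor vs virginica
```

A count of zero means this linear rule separates the training points; a non-zero count means at least one point falls on the wrong side of it. Because hyperoverlap can also report curvilinear boundaries (see the `shape` slot above), its result need not match that of a purely linear rule.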

## Examining overlap between two entities in n dimensions
To visualise overlap in n dimensions, we need to use ordination techniques. The function `hyperoverlap_lda` uses a combination of linear discriminant analysis (LDA) and principal components analysis (PCA) to choose the best two (or three) axes for visualisation. The function also returns the point coordinates, here stored as `transformed_data`, so that they can be plotted using other methods (e.g. `ggplot2`).
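
To make the idea of combining LDA and PCA concrete, here is a minimal sketch of one way to build two plotting axes for a pair of entities. It is an assumption about the general approach, not a description of the internals of `hyperoverlap_lda`: the first axis is the linear discriminant from `MASS::lda`, and the second is the first principal component of what remains after the discriminant direction is projected out. All object names are hypothetical.

```{r, results='show', fig.height=4, fig.width=5, fig.align='center'}
# Versicolor vs virginica, all four measurements, centred on the column means
vv <- droplevels(iris[iris$Species != "setosa", ])
vv_matrix <- scale(as.matrix(vv[, 1:4]), center = TRUE, scale = FALSE)

# Axis 1: the single linear discriminant available for two groups
vv_lda <- MASS::lda(vv_matrix, grouping = vv$Species)
ld_direction <- vv_lda$scaling[, 1] / sqrt(sum(vv_lda$scaling[, 1]^2))  # unit length
axis1 <- as.vector(vv_matrix %*% ld_direction)

# Axis 2: first principal component of the data with that direction removed
vv_residual <- vv_matrix - axis1 %o% ld_direction
axis2 <- prcomp(vv_residual)$x[, 1]

sketch_coords <- data.frame(Species = vv$Species, LD1 = axis1, resPC1 = axis2)
plot(sketch_coords$LD1, sketch_coords$resPC1,
     col = as.integer(sketch_coords$Species), pch = 19,
     xlab = "Linear discriminant", ylab = "Residual PC1")
legend("topright", legend = levels(sketch_coords$Species),
       col = seq_along(levels(sketch_coords$Species)), pch = 19)
```

The coordinates returned by `hyperoverlap_lda` can be plotted with any graphics package in the same way. Inspect the returned object with `str()` first, since its column names are not listed in this vignette.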

```{r, results='show'}
setosa_virginica4d <- hyperoverlap_detect(test3[,1:4], test3$Species)
versicolor_virginica4d <- hyperoverlap_detect(test4[,1:4], test4$Species)
```
To examine the result:
```{r, results='show',  fig.height=4,fig.width=5, fig.show='hold', fig.align='center'}
setosa_virginica4d@result      # the result: overlap or non-overlap?
versicolor_virginica4d@result

setosa_virginica4d@shape       # for the non-overlapping pair: was the decision boundary linear or curvilinear?

# Each call plots the best two dimensions for visualising overlap
transformed_data <- hyperoverlap_lda(setosa_virginica4d)
transformed_data <- hyperoverlap_lda(versicolor_virginica4d)
```

In three dimensions: 
```{r, results='show', fig.height=5,fig.width=7, webgl=TRUE, fig.align="center"}
close3d()  #close previous device
transformed_data <- hyperoverlap_lda(setosa_virginica4d, visualise3d=TRUE) 
```

```{r, results='show', fig.height=5,fig.width=7, webgl=TRUE, fig.align="center"}
close3d()  #close previous device
transformed_data <- hyperoverlap_lda(versicolor_virginica4d, visualise3d=TRUE) #plots the best three dimensions for visualising overlap
```

## Examining patterns of overlap in groups of entities
We might want to know which species within an entire genus overlap in certain variables. To do this, we can use `hyperoverlap_set` and visualise the results using `hyperoverlap_pairs_plot`.

```{r, results='show', fig.dim=c(5,3),fig.align="center"}
all_spp <- hyperoverlap_set(test5[,1:4],test5$Species)
all_spp_plot <- hyperoverlap_pairs_plot(all_spp)
all_spp_plot
```
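
Working on a set of entities means considering every pair. As a sketch of the bookkeeping only (not the internals of `hyperoverlap_set`), base R's `combn` lists the pairs that a three-species dataset gives rise to. The name `species_pairs` is hypothetical.

```{r, results='show'}
# Every unordered pair of species, one pair per row
species_pairs <- t(combn(levels(iris$Species), 2))
colnames(species_pairs) <- c("entity1", "entity2")
species_pairs

# The number of pairs is n * (n - 1) / 2 for n entities
nrow(species_pairs) == choose(nlevels(iris$Species), 2)
```

The number of pairs grows quickly with the number of entities: 20 entities give 190 pairs, so larger sets can take noticeably longer to process.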

