---
title: "Using distance based edge-list generating functions, dyad_id and fusion_id"
author: "Alec L. Robitaille, Quinn Webber and Eric Vander Wal"
output: 
  rmarkdown::html_vignette:
    number_sections: true
    toc: true
vignette: >
  %\VignetteIndexEntry{Using distance based edge-list generating functions, dyad_id and fusion_id}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  eval = FALSE,
  echo = TRUE,
  comment = "#>"
)
```

`spatsoc` can be used in social network analysis to generate distance based
edge-lists from GPS relocation data using either the `edge_dist` or the
`edge_nn` function.

---

See the other vignettes for further information:

- [Introduction to spatsoc](https://docs.ropensci.org/spatsoc/articles/intro.html)
  - temporal grouping
  - spatiotemporal grouping with `group_pts`, `group_lines`, `group_polys`
  - distance based edge-list generation with `edge_dist`
- [Frequently asked questions about spatsoc](https://docs.ropensci.org/spatsoc/articles/faq.html)
  - install
  - function details for `group_times`, `group_pts`, `group_lines`, `group_polys`,
  `edge_dist`, `edge_nn`, and `randomizations`
  - package design including modify-by-reference, data.table column allocation
  - calculating summary information
- [Using spatsoc in social network analysis](https://docs.ropensci.org/spatsoc/articles/using-in-sna.html)
  - generating gambit-of-the-group data
  - generating observed networks
  - data stream randomization, randomized networks
  - network metrics
- [Using distance based edge-list generating functions, dyad_id and fusion_id](https://docs.ropensci.org/spatsoc/articles/using-edge-and-dyad.html)
  - generate distance based edge-lists with `edge_dist` and `edge_nn`
  - generate dyad identifiers for edge-lists with `dyad_id`
  - identify fusion events with `fusion_id`
- [Geometry interface](https://docs.ropensci.org/spatsoc/articles/geometry-interface-and-spatial-measures.html)
  - using `get_geometry` to setup a geometry column and use the geometry interface
  - details of underlying distance, direction and centroid spatial measures
  - converting to and from related packages
- [Interspecific interactions](https://docs.ropensci.org/spatsoc/articles/interspecific-interactions.html)
  - combine two movement datasets
  - identify interspecific interactions


## Generate edge-lists

`spatsoc` provides one temporal grouping function (`group_times`) and two
distance based edge-list generating functions (`edge_dist`, `edge_nn`) to
generate edge-lists from GPS relocations. Edges can be defined by the spatial
proximity between individuals (with `edge_dist`), by nearest neighbour (with
`edge_nn`) or by nearest neighbour within a maximum distance (with `edge_nn` and
its `threshold` argument). The edge-lists can be used directly by the animal
social network package `asnipe` to generate networks.

### 1. Load packages and prepare data

`spatsoc` expects a `data.table` for all `DT` arguments and date time columns to
be formatted `POSIXct`.

```{r, eval = TRUE}
## Load packages
library(spatsoc)
library(data.table)
```

```{r, echo = FALSE, eval = TRUE}
data.table::setDTthreads(1)
```

```{r, eval = TRUE}
## Read data as a data.table
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))

## Cast datetime column to POSIXct
DT[, datetime := as.POSIXct(datetime)]
```

Next, we will group relocations temporally with `group_times` and generate
edge-lists with either `edge_dist` or `edge_nn`. Note: these are mutually
exclusive, so select only one edge-list generating function at a time.

### 2. a) `edge_dist` 

Distance based edge-lists where relocations in each timegroup are considered
edges if they are within the spatial distance defined by the user with the
`threshold` argument. The relevant temporal and spatial thresholds depend on the
species and study system. In this case, relocations within 5 minutes and 100
meters are considered edges.

This is the non-chain rule implementation similar to `group_pts`. Edges are
defined by the distance threshold, and NAs are returned for individuals that are
not within the threshold distance of any other individual in their timegroup (if
`fillNA = TRUE`).

Optionally, `edge_dist` can return the distances between individuals (less than
the threshold) in a column named 'distance' with argument `returnDist = TRUE`.

```{r, eval = TRUE}
# Temporal groups
group_times(DT, datetime = 'datetime', threshold = '5 minutes')

# Edge-list generation
edges <- edge_dist(
  DT,
  threshold = 100,
  id = 'ID',
  coords = c('X', 'Y'),
  timegroup = 'timegroup',
  returnDist = TRUE,
  fillNA = TRUE
)
```
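
To make the threshold rule concrete, here is a minimal base R sketch of
pairwise, non-chain edges within a single timegroup. It does not call `spatsoc`
and is not its implementation. The toy data `pts` and the result `edges_toy` are
hypothetical names used only for illustration.

```{r, eval = FALSE}
# Hypothetical relocations for four individuals in one timegroup
pts <- data.frame(
  ID = c("A", "B", "C", "D"),
  X = c(0, 30, 80, 500),
  Y = c(0, 40, 0, 500)
)
threshold <- 100

# Pairwise Euclidean distances between all individuals
d <- as.matrix(dist(pts[, c("X", "Y")]))
dimnames(d) <- list(pts$ID, pts$ID)

# Keep ordered pairs closer than the threshold, excluding self-pairs
pairs <- which(d < threshold & row(d) != col(d), arr.ind = TRUE)
edges_toy <- data.frame(
  ID1 = pts$ID[pairs[, "row"]],
  ID2 = pts$ID[pairs[, "col"]],
  distance = d[pairs]
)
edges_toy
```

Each edge appears in both directions (A-B and B-A), and individual D has no rows
because it is not within the threshold of any other individual. In `edge_dist`,
such an individual is returned with NAs when `fillNA = TRUE`, as described
above.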

### 2. b) `edge_nn`

Nearest neighbour based edge-lists where each individual is connected to their
nearest neighbour. `edge_nn` can be used to generate edge-lists defined either
by nearest neighbour or nearest neighbour with a maximum distance. As with
grouping functions and `edge_dist`, temporal and spatial thresholds depend on
species and study system.

NAs are returned for the nearest neighbour if an individual was alone in a
timegroup (and/or splitBy) or if the distance between an individual and its
nearest neighbour is greater than the threshold.

Optionally, `edge_nn` can return the distances between individuals (less than
the threshold) in a column named 'distance' with argument `returnDist = TRUE`.

```{r, eval = FALSE}
# Temporal groups
group_times(DT, datetime = 'datetime', threshold = '5 minutes')

# Edge-list generation
edges <- edge_nn(
  DT,
  id = 'ID',
  coords = c('X', 'Y'),
  timegroup = 'timegroup'
)

# Edge-list generation using maximum distance threshold
edges <- edge_nn(
  DT, 
  id = 'ID', 
  coords = c('X', 'Y'),
  timegroup = 'timegroup', 
  threshold = 100
)

# Edge-list generation using maximum distance threshold, returning distances
edges <- edge_nn(
  DT, 
  id = 'ID', 
  coords = c('X', 'Y'),
  timegroup = 'timegroup', 
  threshold = 100,
  returnDist = TRUE
)

```
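
The nearest neighbour rule, with an optional maximum distance, can be sketched
in base R for a single timegroup. This is an illustration under simplifying
assumptions and not how `edge_nn` is implemented. The toy data `pts` and the
result `nn` are hypothetical names.

```{r, eval = FALSE}
# Hypothetical relocations for four individuals in one timegroup
pts <- data.frame(
  ID = c("A", "B", "C", "D"),
  X = c(0, 30, 80, 500),
  Y = c(0, 40, 0, 500)
)
threshold <- 100

# Pairwise distances, with self-distances set to Inf so they are never chosen
d <- as.matrix(dist(pts[, c("X", "Y")]))
diag(d) <- Inf

# Index of, and distance to, the nearest neighbour of each individual
nn_index <- apply(d, 1, which.min)
nn_dist <- d[cbind(seq_len(nrow(d)), nn_index)]
nn <- data.frame(
  ID = pts$ID,
  NN = pts$ID[nn_index],
  distance = nn_dist
)

# Apply the maximum distance: neighbours beyond the threshold become NA
too_far <- nn$distance > threshold
nn$NN[too_far] <- NA
nn$distance[too_far] <- NA
nn
```

Nearest neighbour edges are not necessarily symmetric: in this toy data the
nearest neighbour of C is B, while the nearest neighbour of B is A. Individual D
gets NA because its nearest neighbour is farther than the threshold.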


## Dyads

### 3. `dyad_id`

The function `dyad_id` can be used to generate a unique, undirected dyad
identifier for edge-lists.

```{r, eval = TRUE}
# In this case, using the edges generated in 2. a) edge_dist
dyad_id(edges, id1 = 'ID1', id2 = 'ID2')
```


Once we have generated dyad IDs, we can measure consecutive relocations, start
and end relocations, etc. **Note:** since each edge is duplicated (A-B and B-A),
count unique combinations of timegroup and dyadID, or divide counts by 2.
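
As a rough sketch of the two counting options in the note above, the following
uses a small hand-built edge-list. The table `edges_toy` and the column
`dyad_key` are hypothetical; the key is built with `pmin` and `pmax` only to
illustrate an undirected identifier and is not necessarily how `dyad_id` builds
its own.

```{r, eval = FALSE}
library(data.table)

# Hypothetical edge-list with each edge present in both directions
edges_toy <- data.table(
  timegroup = c(1L, 1L, 2L, 2L, 3L, 3L),
  ID1 = c("A", "B", "A", "B", "A", "C"),
  ID2 = c("B", "A", "B", "A", "C", "A")
)

# Undirected key: the same value for A-B and B-A
edges_toy[, dyad_key := paste(pmin(ID1, ID2), pmax(ID1, ID2), sep = "-")]

# Option 1: count unique timegroup and dyad combinations
n_by_unique <- unique(edges_toy, by = c("timegroup", "dyad_key"))[, .N, by = dyad_key]

# Option 2: count all rows and divide by 2
n_by_halving <- edges_toy[, .(N = .N / 2), by = dyad_key]

n_by_unique
n_by_halving
```

Both options give the same number of timegroups per dyad when every edge is
duplicated exactly once.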


## Fusion events

### 4. `fusion_id`

The function `fusion_id` can be used to identify fusion events in distance based
edge-lists. The `n_min_length` argument defines the minimum number of successive
fixes that are required to establish a fusion event. The `n_max_missing`
argument defines the maximum number of allowable missing observations for
the dyad within a fusion event. The `allow_split` argument defines if a single
observation can be greater than the threshold distance without initiating a
fission event.


```{r, eval = TRUE}
fusion_id(
  edges = edges,
  threshold = 100,
  n_min_length = 1,
  n_max_missing = 0,
  allow_split = FALSE
)

# Print rows belonging to the first 5 fusion events
print(edges[fusionID <= 5])
```
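
To illustrate how `n_min_length` and `n_max_missing` interact, here is a
simplified base R sketch that labels runs of timegroups for a single dyad. It is
an assumption-laden illustration, not the `fusion_id` implementation: it ignores
`allow_split` and the distance threshold, and the helper `label_runs` and its
argument `tg` are hypothetical.

```{r, eval = FALSE}
# Hypothetical helper: label runs of timegroups in which one dyad was observed
# together. `tg` is assumed to be a non-empty vector of integer timegroups.
label_runs <- function(tg, n_min_length = 1, n_max_missing = 0) {
  stopifnot(length(tg) > 0)
  tg <- sort(unique(tg))

  # A new run starts when the gap exceeds the allowed missing observations
  new_run <- c(TRUE, diff(tg) > n_max_missing + 1)
  run <- cumsum(new_run)

  # Runs shorter than the minimum length are not counted as events
  run_len <- ave(tg, run, FUN = length)
  run[run_len < n_min_length] <- NA

  data.frame(timegroup = tg, run = run)
}

# Timegroups in which a hypothetical dyad was within the threshold
tg <- c(1, 2, 3, 6, 7, 10)

strict <- label_runs(tg, n_min_length = 2, n_max_missing = 0)
lenient <- label_runs(tg, n_min_length = 2, n_max_missing = 2)
strict
lenient
```

With no missing observations allowed, the timegroups split into two events (1 to
3, and 6 to 7), and the single observation at timegroup 10 is dropped for being
shorter than the minimum length. Allowing two missing observations bridges the
gaps and yields a single event.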


