---
title: "1. Overview of sftrack"
output:
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{1. Overview of sftrack}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
    echo = TRUE,
    fig.width = 6,
    fig.asp = 0.7
)

```

## Definitions of movement data

We start by defining the main objects that `sftrack` is dealing
with. At the top level, there is a **path**, i.e. the real-world
(continuous) curve made by moving objects (Fig. 1).

```{r, out.width = "75%", fig.align = "center", fig.cap = "Fig.1: Glossary introducing the main concepts.", echo = FALSE}
knitr::include_graphics("glossary.png")

```

In practice, this curve is sampled (recorded) at discrete times to
collect **locations**, with or without temporal attributes (think of
non-invasive monitoring of animals such as snow tracks vs. direct
observations), i.e. in the form of $(x, y)$ or $(x, y, t)$
coordinates. Taken together, locations form the bulk of **occurrence
datasets**, such as the massive amount of biodiversity data found on
[Global Biodiversity Information Facility
(GBIF)](https://www.gbif.org/).

If the organism monitored is known and identified ($id$ attribute), we
can collect **tracking locations** (Fig. 1), i.e. $(x, y, t, id)$
coordinates. An ordered series of tracking locations together are in
turn the vertices of a **track**. A track is thus composed of a series
of locations, i.e. the "complete spatio-temporal record of a followed
organism, from the beginning to the end of observations"[^turchin].
Note that $id$ can be further refined as a general grouping factor
(e.g. collar, season, behavior, etc.), which basically split tracking
data into sub-tracks.

[^turchin]: Turchin, P. (1998). Quantitative analysis of movement: measuring and modeling population redistribution in animals and plants. Sinauer Associates, Sunderland, MA, USA.

Finally, tracking data can be modeled considering the temporal
autocorrelation between successive locations as part of the movement
information[^martin]. The most common approach for that purpose is
based on a model of **steps** (Fig. 1), defined as straight-line
segments between two successive tracking locations. Treating movement
steps as straight lines is of course an idealization, and the most
parsimonious approach (other complex approaches exist in some fields,
such as splines and Bezier curves), but at sufficient resolution, no
significant information is lost. A sequence of steps from the same
individual (not necessarily connected) finally forms a **trajectory**.

[^martin]: Martin, J., Tolon, V., Van Moorter, B., Basille, M., & Calenge, C. (2009). On the use of telemetry in habitat selection studies. In D. Barculo, & J. Daniels (Eds.), Telemetry: Research, Technology and Applications (pp. 37–55). Nova Science Publishers Inc.


### Tracking and movement data analyses

Because of the dual nature of tracking and movement data, each data
type allows different analyses based on the underlying geometric
representation (points vs. steps). The following table summarizes in
which case the use of tracking vs. movement data is appropriate:

| Data                     | Tracking data                | Movement data                                       |
|:-------------------------|:----------------------------:|:---------------------------------------------------:|
| Geometric representation | Points                       | Steps                                               |
| Applications             | Home ranges                  | Random walks                                        |
|                          | Resource selection functions | Step selection functions                            |
|                          | K-select                     | typical Hidden-Markov models and State-space models |
|                          |                              | typical segmentation methods                        |
|                          |                              | First passage time and Residence time               |
Table: Table 1: Overview of applications based on tracking and movement data.

While this is out of the scope of the `sftrack` package to provide
methods for these space use analyses, the package however provide the
underlying infrastructure for each case, and a few tools to handle
movement of both living organisms and inanimate objects.


## A conceptual model for movement

To organize these elements of data and how they relate to one another
and to properties of the real world entities (here a moving object),
we followed the methodology for conceptual modeling of geographic
information that is specified in the International Standards (ISO/TC
211), which relies on a standard conceptual model specified in Unified
Modeling Language (UML)[^roswell].

[^roswell]: Roswell, C. (2011). Modeling of geographic information. Springer Handbook of Geographic Information, 3–6. http://dx.doi.org/10.1007/978-3-540-72680-7_1


### Tracking locations

The basic element of the movement data is a **tracking location**. A
tracking location is, in its simplest form, a point defined by its
spatial coordinates $(x, y, z, t)$, with $z$ being optional,
associated to the corresponding timestamp $t$ (note that the time can
also be a simple integer defining the order, for instance in the case
of a track in the snow), and the identity of the moving object. This
is the raw data of movement, which is usually directly provided by a
tracking device.

![Fig. 2: The location model.](conceptual-model-loc.png "Fig. 2: The location model.")

Additionally, "grouping" information can be provided for the location:
this needs to be *a minima* the individual (`ind`), e.g. the actual
moving object (for instance an animal), but can also be extended to
other hierarchical level of the data, such as the year, the grouping, the
movement behavior, etc.

Spatial points are never exact and can be characterized by an
error. Typically, GPS and Argos data will provide information on the
quality of the location, such as satellite data or dilution of
precision (for GPS data). The challenge with this information on the
error is the complete lack of standardization — as such, this will be
left open for the time being.

Finally, a tracking location can also include any information at the
level of the point, such as environmental data measured at the
location (e.g. land cover).


### Tracking data

Tracking data, in the form of **tracks**, is a series of **tracking
locations**. This can be modeled as a collection of tracking
locations, unique and ordered in time by individual. As such,
locations are the vertices of a single track object; conversely, all
locations can be directly extracted from a single track (without any
loss of information).  The strong constraint on time is necessary and
sufficient to organize the data in a track.

![Fig. 3: The track model.](conceptual-model-track.png "Fig. 3: The track model.")


### Steps

A **step** is specifically defined as the straight-line segment
connecting two successive **tracking locations**. In other words, it
takes two and only two locations to define a step, with the
constraints that they belong to the same grouping level (individual in
its simplest form), and that the second location comes temporally
after the first one.

![Fig. 4: The step model.](conceptual-model-step.png "Fig. 4: The step model.")

Note that a step can also include information at the level of the line
segment. This can be for instance the amount of forest along the step,
or the movement rate (i.e. speed).


### Trajectories

Similarly to tracks, **trajectories** are defined by a series of
**steps**.  A trajectory is thus modeled as a collection of steps,
unique and ordered in time by individual. As such, steps are the
vertices of a single trajectory object; conversely, all steps can be
directly extracted from a single trajectory (without any loss of
information).  The strong constraint on time is necessary and
sufficient to organize the data in a trajectory.

![Fig. 5: The trajectory model.](conceptual-model-traj.png "Fig. 5: The trajectory model.")

Given the double equivalence between tracking locations and tracks, on
the one hand, and steps and trajectories, on the other hand, it
follows that there is also a direct equivalence between a track and a
trajectory (based on the same movement data); lossless conversion is
possible in both directions.

![Fig. 6: Full model.](conceptual-model-full.png "Fig. 6: Full model.")


## Implementation in the `sftrack` package

The `sftrack` package deals with all elements presented above,
specifically **tracks** and **trajectories**. We'll begin with a brief
overview of an `sftrack` object. An `sftrack` object is a data object
that describes the movement of a subject, it has $(x,y)$ coordinates
(sometimes $z$), some measurement of time $t$ (clock time or a
sequence of integers), and some grouping variable that identifies the
subjects. For the spatial aspects of `sftrack`, we are using the
package `sf`, a powerful tool that lets us quickly calculate spatial
attributes and plot with ease with its full integration with `rgdal`.

An `sftrack` object has 4 parts to it, 3 of which are required:

 - **Geometry**: This is stored as an `sfc` object from `sf`. Accepts
   $(x,y,z)$ coordinates.
 - **Group**: This is the grouping variables, which contains at
   minimum an `id` field to identify the subject. Grouping variables
   can be turned on at various stages based on the 'active' grouping,
   allowing for the column to be more dynamic than a standard grouping
   column.
 - **Time**: Either a POSIX time object or an integer.
 - **Error**: (optional). This is the field with error data in it. At
   present the column name is being stored as an attribute, but it
   does not have any real functionality yet.

There are two different classes of objects: `sftrack` for **tracks**,
`sftraj` for **trajectories**.


### `sftrack` objects

An `sftrack` object stores tracking data (set of **tracks**) in a
`data.frame`, where the geometries are stored as a `POINT` for each
row.

```{r sftrack-overview, echo = FALSE}
library("sftrack")
data(raccoon)
raccoon$timestamp = as.POSIXct(raccoon$timestamp, tz = "EST5EDT")
my_sftrack <- as_sftrack(
  data = raccoon,
  coords = c("longitude","latitude"),
  time = "timestamp",
  group = "animal_id",
  crs = 4326)
head(my_sftrack)
plot(my_sftrack)

```


### `sftraj` object

An `sftraj` object stores movement data (set of **trajectories**) in a
`data.frame`, where the geometries are stored as a `LINESTRING` for
each row. The linestring is a line built from the two points at 
$t_1 → t_2$. This means that when an `sftraj` object is modified the
steps likely need to be recalculated. Internally, anytime the active
group is changed, then the geometry is recalculated.

```{r sftraj-overview, echo = FALSE}
my_sftraj <- as_sftraj(my_sftrack)
head(my_sftraj)
plot(my_sftraj)

```
