---
title: 'Getting started with irg'
author: 'Alec L. Robitaille'
date: '`r Sys.Date()`'
output: 
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{Getting started with irg}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r knitrsetup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = '#>',
  eval = FALSE
)
options(scipen = 9999)
```

The `irg` package opts for a tabular calculation of the instantaneous rate of green-up (IRG) as opposed to a raster based approach. Sampling MODIS imagery is left up to the user and a prerequisite for all functions. The main input (`DT`) for all functions is a [`data.table`](https://github.com/Rdatatable/data.table) of an NDVI time series. The sampling unit (`id`) is flexible (a decision for the user) though we would anticipate points or polygons, or maybe a pixel. All functions leverage the speed of `data.table` to efficiently filter, scale and model NDVI time series, and calculate IRG. 


## Installation
Install with CRAN

```{r, eval = FALSE}
# Install 
install.packages('irg')
```

or R-universe

```{r, eval = FALSE}
# Enable the robitalec universe
options(repos = c(
    robitalec = 'https://robitalec.r-universe.dev',
    CRAN = 'https://cloud.r-project.org'))

# Install 
install.packages('irg')
```

### Packages
`irg` depends on three packages (and `stats`):

* [`data.table`](https://github.com/Rdatatable/data.table) for all tabular processing 
* [`RcppRoll`](https://github.com/kevinushey/RcppRoll) for fast rolling medians in `filter_roll`. 
* [`chk`](https://github.com/pre-processing-r/chk) for internal checks. 

No external dependencies. 

### Input data
`irg` requires an NDVI time series in a  `data.table`. 

Though names can be different and specified at input, but the default names and required columns are:

* id: sampling unit identification (point id, polygon id, ...)
* yr: year of sample
* DayOfYear: julian day
* NDVI: sampled NDVI value
* SummaryQA: Quality reliability of pixel

SummaryQA details:

* 0: Good data, use with confidence
* 1: Marginal data, useful but look at detailed QA for more information
* 2: Pixel covered with snow/ice
* 3: Pixel is cloudy


Let's take a look at the example data. 

```{r extdata, eval = TRUE}
library(irg)
library(data.table)

ndvi <- fread(system.file('extdata', 'sampled-ndvi-MODIS-MOD13Q1.csv', package = 'irg'))

# or look at the help page
?ndvi
```

```{r printdata, eval = TRUE, echo = FALSE}
knitr::kable(ndvi[90:95])
```

#### data.frame
If your data is a `data.frame`, convert it by reference:

```{r setdt}
# Pretend
DF <- as.data.frame(ndvi)

# Convert by reference
setDT(DF)
```

#### Sampling NDVI
Though `irg` is not involved in the sampling step, it is important that the
input data matches the package's expectations.

We used the incredible [Google Earth Engine](https://earthengine.google.com/) to
sample MODIS NDVI (MOD13Q1.006). There are also R packages specific to MODIS
([`MODIStsp`](https://github.com/ropensci/MODIStsp)) and general purpose raster
operations ([`raster`](https://github.com/rspatial/raster)), and 
an R interface to Earth Engine [`rgee`](https://github.com/r-spatial/rgee).

Update: we recently added the `use_example_ee_script()` function which 
offers an example script for extracting NDVI in Earth Engine. There
are two versions, one for sampling MODIS MOD13Q1 and another for 
sampling Landsat 8.

#### Temporal extent
Filtering steps in `irg` use a baseline 'winterNDVI' and upper quantile as
described by Bischoff et al. (2012). These steps require multiple years of
sampled NDVI for each `id`. In addition, make sure to include all samples
throughout the year, leaving the filtering for `irg`.



## Workflow 
```{r functGraphViz, eval = TRUE, echo = FALSE, self_contained = FALSE}
library(DiagrammeR)
g <- grViz(
	"
	digraph irg_functions  {
	graph [rankdir=LR, compound=TRUE, fontsize = 28]

	node[shape=none, fontsize=28]

	subgraph cluster_filt{
	label= '1)'; labeljust='l';
	Filtering -> filter_ndvi [dir=none]
	filter_ndvi -> filter_qa [dir=none]
	filter_ndvi -> filter_winter [dir=none]
  filter_ndvi -> filter_roll [dir=none]
  filter_ndvi -> filter_top [dir=none]

	filter_top -> filter_roll -> filter_winter -> filter_qa [dir=back]
	{rank=same; filter_qa; filter_winter; filter_roll; filter_top}
	}

	subgraph cluster_scal{
  label= '2)' labeljust='l';

	Scaling -> scale_doy [dir=none]
	Scaling -> scale_ndvi [dir=none]
	}

	subgraph cluster_mod{
  label= '3)' labeljust='l';

	Modeling -> model_start [dir=none]
	Modeling -> model_params [dir=none]
	Modeling -> model_ndvi [dir=none]

	model_ndvi -> model_params -> model_start [dir=back]
	{rank=same; model_ndvi; model_params; model_start}
	}

	subgraph cluster_irg{
  label= '4)' labeljust='l';

	IRG -> calc_irg [dir=none]
	}

  Filtering -> Scaling -> Modeling -> IRG
	
	# irg -> Filtering
	# irg -> Scaling
	# irg -> Modeling
	# irg -> IRG

	}
	", width = 700, height = 600)
g
```




```{r, eval = TRUE, echo = FALSE}
fs <-
	data.table(functions = as.character(lsf.str('package:irg')))[, 
             arguments := paste(unlist(formalArgs(functions)), 
            									 collapse = ', ' ), 
             by = functions]
```

There are `r nrow(fs[grepl('filter', functions)])` filtering functions, 
`r nrow(fs[grepl('scale', functions)])` scaling functions, 
`r nrow(fs[grepl('model', functions)])` modeling functions and 
`r nrow(fs[grepl('irg', functions)])` IRG functions. 

The `irg::irg` function is a wrapper for all steps - 
filtering, scaling, modeling and calculating IRG in one step. 
At this point, only defaults. Here's 5 rows from the result. 

For options, head to the steps below. 

```{r, eval = TRUE}
out <- irg(ndvi)
```

```{r, eval = TRUE, echo = FALSE}
knitr::kable(out[between(t, 0.4, 0.5)][1:5, .(id, yr, t, fitted, irg)])
```


## Filtering
There are `r nrow(fs[grepl('filter', functions)])` filtering functions. 

```{r, echo = FALSE, eval = TRUE}
# fs[grepl('qa', functions), order := 1]
# fs[grepl('winter', functions), order := 2]
# fs[grepl('roll', functions), order := 3]
# fs[grepl('top', functions), order := 4]
knitr::kable(fs[grepl('filter', functions), .(functions, arguments)])
```


```{r, eval = FALSE}
# Load data.table
library(data.table)
library(irg)

# Read in example data
ndvi <- fread(system.file('extdata', 'sampled-ndvi-MODIS-MOD13Q1.csv', package = 'irg'))

# Filter NDVI time series
filter_qa(ndvi, qa = 'SummaryQA', good = c(0, 1))

filter_winter(ndvi, probs = 0.025, limits = c(60L, 300L),
							doy = 'DayOfYear', id = 'id')

filter_roll(ndvi, window = 3L, id = 'id', method = 'median')

filter_top(ndvi, probs = 0.925, id = 'id')

```




## Scaling
Two scaling functions are use to scale the day of year column 
and filtered NDVI time series between 0-1. 


```{r}
# Scale variables
scale_doy(ndvi, doy = 'DayOfYear')
scale_ndvi(ndvi)
```



## Modeling 
Three functions are used to model the NDVI times series to a double 
logistic curve, as described by Bischoff et al. (2012). 

$$fitted = \frac{1}{1 + e^ \frac{xmidS - t}{scalS}} - \frac{1}{1 + e^ \frac{xmidA - t}{scalA}}$$

Two options from this point are available: fitting NDVI and
calculating IRG for observed data only, or for the full year. 

To calculate for every day of every year, specify `returns = 'models'` 
in `model_params`, `observed = FALSE` in `model_ndvi` and assign the 
output of `model_ndvi`. 

```{r}
# Guess starting parameters
model_start(ndvi, id = 'id', year = 'yr')

# Double logistic model parameters given starting parameters for nls
mods <- model_params(
  ndvi,
  returns = 'models',
  id = 'id', year = 'yr',
  xmidS = 'xmidS_start', xmidA = 'xmidA_start',
  scalS = 0.05,
  scalA = 0.01
)

# Fit double log to NDVI
fit <- model_ndvi(mods, observed = FALSE)
```

Alternatively, to calculate for the observed data only, specify 
`returns = 'columns'` in `model_params` and  `observed = TRUE` in `model_ndvi`. 

```{r}
# Guess starting parameters
model_start(ndvi, id = 'id', year = 'yr')

# Double logistic model parameters given starting parameters for nls
model_params(
  ndvi,
  returns = 'columns',
  id = 'id', year = 'yr',
  xmidS = 'xmidS_start', xmidA = 'xmidA_start',
  scalS = 0.05,
  scalA = 0.01
)

# Fit double log to NDVI
model_ndvi(ndvi, observed = TRUE)
```




## IRG

$$IRG = \frac{e ^ \frac{t + xmidS}{scalS}}{2 scalS e ^ \frac{t + xmidS}{scalS} + scalS e ^ \frac{2t}{scalS} + scalS e ^ \frac{2midS}{scalS}}$$
Finally, calculate IRG:

```{r}
# Calculate IRG for each day of the year
calc_irg(fit)

# or for observed data
calc_irg(ndvi)
```

