---
title: "Tests for trends in time series"
author: Vyacheslav Lyubchich
date: "`r Sys.Date()`"
output: 
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
    number_sections: true
bibliography: vignrefs.bib
vignette: >
  %\VignetteIndexEntry{Tests for trends in time series}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---


```{r, echo = FALSE, message = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#")
library(funtimes)
# devtools::load_all(".") #remove this line
```

# Introduction

The majority of studies focus on detection of linear or monotonic trends, using  

* classical t-test (for linear trends) or
* rank-based Mann--Kendall test (for monotonic trends)  

typically under the assumption of uncorrelated data.

There exist two main problems:

* dependence effect, that is, the issue of inflating significance due to dependent observations when the test is developed for independent data (always check assumptions of the testing method!), and
* change points or regime shifts that affect the linear or monotonic trend hypothesis. For example, when testing the null hypothesis ($H_0$) of no trend against the alternative hypothesis ($H_1$) of linear trend, using t-test, it is easier to reject $H_0$ and accept $H_1$ in case A below, than in case B, and especially hard to reject $H_0$ in case C. Case C reminds us that a test with proper alternative hypothesis should be chosen, and that non-rejection of $H_0$ does not mean it is true.

These problems can be addressed by using tests for **non-monotonic trends** assuming that **observations can be autocorrelated**.

```{r}
set.seed(777)
n <- 100
Time <- c(1:n)
X0 <- arima.sim(list(order = c(1, 0, 0), ar = 0.5), n = n, n.start = 100, sd = 0.5)
X1 <- 2*Time/n + X0
X2 <- 2*(Time/n)^0.5 + X0
X3 <- 0.5*(Time - n/2)/n - 6*((Time - n/2)/n)^2 + X0
X <- as.data.frame(cbind(X0, X1, X2, X3))
```


```{r, echo = FALSE, fig.width = 7, dpi = 96}
library(ggplot2)
library(patchwork)
p1 <- ggplot(X, aes(x = Time, y = X1)) + geom_line() + theme_minimal()
p2 <- ggplot(X, aes(x = Time, y = X2)) + geom_line() + theme_minimal()
p3 <- ggplot(X, aes(x = Time, y = X3)) + geom_line() + theme_minimal()

p1 + p2 + p3 + 
  plot_annotation(
    title = 'Time series simulated for different alternative hypotheses',
    tag_levels = 'A'
  )
```

The time series above were simulated:  
A) `X1` with linear trend,  
B) `X2` with square root -- nonlinear monotonic -- trend, and   
C) `X3` with quadratic -- nonlinear non-monotonic -- trend,  
with stationary autocorrelated innovations `X0`: $X0_t = 0.5X0_{t-1} + e_t$, where $e_t \sim N(0, 0.5^2)$.

```{r, echo = FALSE}
ggplot(X, aes(x = Time, y = X0)) + geom_line() + theme_minimal()
```

Let's test these time series using the functions from package `funtimes`, using significance level $\alpha = 0.05$.

To install and load the package, run
```{r eval = FALSE}
install.packages("funtimes")
library(funtimes)
```


# Testing for presence of a trend

Function `notrend_test` tests the null hypothesis of no trend against different alternatives defined by the corresponding tests.


## Linear trend

Consider the following pair of hypotheses  
$H_0$: no trend  
$H_1$: linear trend  
that can be tested specifically using t-test.

Assuming the time series may be autocorrelated (which is the usual case with observational data), we apply sieve-bootstrap version of the **t-test**, by adapting the approach of @Noguchi_etal_2011:
```{r}
notrend_test(X0)
```
The large $p$-value correctly indicates that there is not enough evidence to reject the hypothesis of no trend in `X0` in favor of the alternative hypothesis of a linear trend. 

For the other time series, $p$-values are reported below:
```{r}
apply(X[,-1], 2, function(x) notrend_test(x)$p.value)
```
indicating that the null hypothesis of no trend could be rejected and hypothesis of a linear trend could be accepted for `X1` and `X2`. While `X3` has a trend (based on the way it was simulated and the time series plot above), the alternative hypothesis of a linear trend does not fit in this case, so the test for linear trend (t-test) failed to reject the null hypothesis.


## Monotonic trend

Since a linear trend is also a monotonic trend, we may expect seeing similar results when testing the following pair of hypotheses  
$H_0$: no trend  
$H_1$: monotonic trend  
using Mann--Kendall test.

Apply **Mann--Kendall test**, also with the sieve-bootstrap enhancement for potentially autocorrelated data; $p$-values are shown below:
```{r}
apply(X, 2, function(x) notrend_test(x, test = "MK")$p.value)
```
indicating that the null hypothesis of no trend could be rejected and hypothesis of a monotonic trend could be accepted for `X1` and `X2`. For `X0` and `X3`, the null hypothesis could not be rejected, because `X0` does not have a trend, and `X3` has a trend that does not match the alternative hypothesis.


## Any trend

If the interest is in testing for any, potentially non-monotonic trend, consider testing the following pair of hypotheses  
$H_0$: no trend  
$H_1$: any trend  
using local regression-based WAVK test [@Wang_etal_2008].

Apply **WAVK test**, also with the sieve-bootstrap enhancement for potentially autocorrelated data:
```{r}
apply(X, 2, function(x) notrend_test(x, test = "WAVK", 
                                     factor.length = "adaptive.selection")$p.value)
```
The results indicate that WAVK test was correct in non-rejecting the null hypothesis for `X0`, and correctly rejected it for the time series with trends `X1`, `X2`, and `X3`.

@Lyubchich_etal_2013_wavk originally implemented *hybrid* bootstrap to this test statistic, available from the `wavk_test` function described in the next section.


# Testing a specific parametric form of trend

Function `wavk_test` is developed for the following goodness-of-fit question [@Lyubchich_etal_2013_wavk]:  
$H_0$: trend is of form $f(\theta,t)$  
$H_1$: trend is not of form $f(\theta,t)$   
where $f$ belongs to a known family of smooth parametric functions, and $\theta$ are its parameters. 

**Note** Considering $f(\theta,t)$ being some polynomial function, non-rejection of the null hypothesis means that function $f(\theta,t)$ or its simpler form (lower-order polynomial) is sufficient for describing the trend in the tested time series. 

**Note** The case of $f(\theta,t) \equiv 0$ corresponds to testing for no trend (in other words, for a constant trend, same as in the previous section), and the following code differs only in the type of bootstrap used, 

* sieve bootstrap in `notrend_test` (WAVK statistic is calculated on original time series and simulated autoregressive series) and 
* hybrid bootstrap in `wavk_test` (WAVK statistic is calculated on time series after the trend $f(\theta,t)$ and autoregressive dependence are removed, and on simulated independent normal series)
```{r}
notrend_test(X0, test = "WAVK", factor.length = "adaptive.selection") # WAVK with sieve bootstrap
wavk_test(X0 ~ 0, factor.length = "adaptive.selection") # WAVK with hybrid bootstrap
```

To test a linear trend $f(\theta,t) = \theta_0 + \theta_1 t$, use
```{r}
wavk_test(X0 ~ t, factor.length = "adaptive.selection")
```
Note that the time sequence `t` is specified automatically within the function.

For the other time series, $p$-values are shown below:
```{r}
apply(X[,-1], 2, function(x) wavk_test(x ~ t, factor.length = "adaptive.selection")$p.value)
```

The function `poly` could also be used, for example, test quadratic trend $f(\theta,t) = \theta_0 + \theta_1 t + \theta_2 t^2$ and show the trend estimates using the argument `out = TRUE`:
```{r}
wavk_test(X3 ~ poly(t, 2), factor.length = "adaptive.selection", out = TRUE)
```


# Citation {-}

This vignette belongs to R package `funtimes`. If you wish to cite this page, please cite the package:
```{r}
citation("funtimes")
```


# References
