---
title: "Introduction to grangersearch"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to grangersearch}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Overview

The `grangersearch` package provides tools for performing Granger causality tests on pairs of time series. A Granger causality test asks whether past values of one time series help predict another.

```{r setup}
library(grangersearch)
```

## What is Granger Causality?

A variable X is said to **Granger-cause** Y if past values of X contain information that helps predict Y, above and beyond the information contained in past values of Y alone. This is not true causality in the philosophical sense, but rather predictive causality based on temporal precedence.
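In regression form, the definition compares two models for $Y_t$. The notation here (lag order $p$, coefficients $a_i$ and $b_i$, error terms $e_t$ and $u_t$) is introduced only for illustration and is not taken from the package:

$$
\begin{aligned}
\text{restricted:} \quad & Y_t = a_0 + \sum_{i=1}^{p} a_i Y_{t-i} + e_t \\
\text{unrestricted:} \quad & Y_t = a_0 + \sum_{i=1}^{p} a_i Y_{t-i} + \sum_{i=1}^{p} b_i X_{t-i} + u_t
\end{aligned}
$$

The coefficients are estimated separately in each model. X is said to Granger-cause Y when the null hypothesis $H_0: b_1 = \dots = b_p = 0$ is rejected. Swapping the roles of X and Y gives the test in the other direction.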

The test fits Vector Autoregressive (VAR) models and uses an F-test to compare a restricted model, which omits the lags of the candidate cause, against an unrestricted model that includes them.
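The restricted-versus-unrestricted comparison can be sketched in base R for a single direction (X to Y). This is an illustrative single-equation version under assumed simulated data, not the package's internal code; the names `x_sim`, `y_sim`, and `n_obs` are made up for the example.

```{r f-test-sketch}
# Illustrative base-R sketch; not the package's internal implementation.
set.seed(1)
n_obs <- 200
p <- 1  # lag order assumed for this sketch

# Simulate a stationary pair in which lagged x_sim drives y_sim
x_sim <- rnorm(n_obs)
y_sim <- 0.8 * c(0, x_sim[-n_obs]) + rnorm(n_obs, sd = 0.5)

# embed() puts the current value in column 1 and lags 1..p in later columns
ex <- embed(x_sim, p + 1)
ey <- embed(y_sim, p + 1)
y_now  <- ey[, 1]
y_lags <- ey[, -1, drop = FALSE]
x_lags <- ex[, -1, drop = FALSE]

# Restricted model: own lags only; unrestricted model: adds lags of x_sim
restricted   <- lm(y_now ~ y_lags)
unrestricted <- lm(y_now ~ y_lags + x_lags)

# F-test of the null hypothesis that all x_lags coefficients are zero
f_test <- anova(restricted, unrestricted)
f_test
p_value <- f_test[["Pr(>F)"]][2]
p_value < 0.05
```

A small p-value here means the lags of `x_sim` improve the prediction of `y_sim` beyond what its own lags provide.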

## Basic Usage

### Vector Input

The simplest way to use the package is with two numeric vectors:

```{r basic-usage}
# Generate example time series
set.seed(123)
n <- 100

# X is a random walk (non-stationary; used here only for illustration)
x <- cumsum(rnorm(n))

# Y depends on lagged X (so X should Granger-cause Y)
y <- c(0, x[1:(n-1)]) + rnorm(n, sd = 0.5)

# Perform the test
result <- granger_causality_test(x = x, y = y)
print(result)
```

### Detailed Summary

Use `summary()` for more detailed output:

```{r summary}
summary(result)
```

## Tidyverse Integration

The package supports tidyverse-style syntax, making it easy to use with data frames and pipes.

### Using with Data Frames

```{r tidyverse-df}
library(tibble)

# Create a tibble with time series
df <- tibble(
  price = cumsum(rnorm(100)),
  volume = c(0, cumsum(rnorm(99)))
)

# Use column names directly
result <- granger_causality_test(df, price, volume)
print(result)
```

### Using Pipes

```{r pipes}
# With base R pipe
df |>
  granger_causality_test(price, volume)
```

### Tidy Output

For programmatic access to results, use `tidy()`:

```{r tidy}
result <- granger_causality_test(x = x, y = y)
tidy(result)
```

Use `glance()` for a model-level summary:

```{r glance}
glance(result)
```

## Adjusting Parameters

### Lag Order

The `lag` parameter sets how many lagged values of each series enter the VAR model:

```{r lag}
# Using lag = 2
result_lag2 <- granger_causality_test(x = x, y = y, lag = 2)
print(result_lag2)
```
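One common way to choose a lag order is to compare candidate values with an information criterion. The sketch below does this with AIC in base R on assumed simulated data; it is a possible approach, not a feature of the package, and `x_sim`, `y_sim`, and `max_lag` are names made up for the example.

```{r lag-aic-sketch}
# Illustrative sketch: compare candidate lag orders by AIC (base R only).
set.seed(2)
n_obs <- 200
x_sim <- rnorm(n_obs)
# y_sim depends on x_sim two periods earlier
y_sim <- 0.6 * c(0, 0, x_sim[1:(n_obs - 2)]) + rnorm(n_obs, sd = 0.5)

max_lag <- 4
# Build the lag matrices once with max_lag lags so that every candidate
# model is fitted on the same rows and the AIC values are comparable
ex <- embed(x_sim, max_lag + 1)
ey <- embed(y_sim, max_lag + 1)

aic_by_lag <- sapply(seq_len(max_lag), function(p) {
  y_now  <- ey[, 1]
  y_lags <- ey[, 2:(p + 1), drop = FALSE]
  x_lags <- ex[, 2:(p + 1), drop = FALSE]
  AIC(lm(y_now ~ y_lags + x_lags))
})
names(aic_by_lag) <- paste0("lag_", seq_len(max_lag))
aic_by_lag

# Candidate lag order with the lowest AIC
best_lag <- which.min(aic_by_lag)
best_lag
```

The selected value could then be passed as the `lag` argument of `granger_causality_test()`.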

### Significance Level

Adjust the significance level with the `alpha` parameter:

```{r alpha}
# More conservative test with alpha = 0.01
result_strict <- granger_causality_test(x = x, y = y, alpha = 0.01)
print(result_strict)
```

## Interpreting Results

The function tests causality in **both directions**:

1. **X → Y**: Does X help predict Y?
2. **Y → X**: Does Y help predict X?

Possible outcomes:

- **Unidirectional causality**: Only one direction is significant
- **Bidirectional causality**: Both directions are significant
- **No causality**: Neither direction is significant
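The three outcomes follow directly from the two directional decisions. As a minimal sketch, the helper below (hypothetical, not part of `grangersearch`) maps a pair of logical flags to a label:

```{r classify-sketch}
# Hypothetical helper, not part of grangersearch: label the outcome
# from the two directional decisions.
classify_granger <- function(x_causes_y, y_causes_x) {
  if (x_causes_y && y_causes_x) {
    "bidirectional"
  } else if (x_causes_y) {
    "unidirectional: X -> Y"
  } else if (y_causes_x) {
    "unidirectional: Y -> X"
  } else {
    "none"
  }
}

classify_granger(TRUE, FALSE)
classify_granger(TRUE, TRUE)
classify_granger(FALSE, FALSE)
```

With a fitted result, the flags would come from `result$x_causes_y` and `result$y_causes_x`.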

### Accessing Individual Results

```{r components}
result <- granger_causality_test(x = x, y = y)

# Logical indicators
result$x_causes_y
result$y_causes_x

# P-values
result$p_value_xy
result$p_value_yx

# Test statistics
result$test_statistic_xy
```

## Example: Financial Data

A common application is testing whether one financial variable predicts another:

```{r finance-example}
set.seed(42)
n <- 250  # About one year of trading days

# Simulate stock returns
stock_returns <- rnorm(n, mean = 0.0005, sd = 0.02)

# Assume, for illustration, that volume leads returns:
# standardized volume adds a small effect to next-day returns
volume <- abs(rnorm(n, mean = 1000, sd = 200))
volume_effect <- c(0, 0.001 * as.numeric(scale(volume[1:(n-1)])))
returns_with_volume <- stock_returns + volume_effect

df <- tibble(
  returns = returns_with_volume,
  volume = volume
)

# Test if volume Granger-causes returns
result <- df |> granger_causality_test(volume, returns)
print(result)
```

## Important Notes

1. **Stationarity**: Granger causality tests assume stationary time series. Consider differencing non-stationary data.

2. **Lag selection**: The choice of lag order matters. Too few lags may miss dynamics; too many use up degrees of freedom and reduce power.

3. **Sample size**: More observations give more reliable results. The minimum number of observations is `2 * lag + 2`.

4. **Not true causality**: Granger causality indicates predictive relationships, not true causal mechanisms.
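The first and third notes can be illustrated with base R. The sketch below differences a simulated random walk and checks the series length against the `2 * lag + 2` minimum quoted above; the data and the names `walk`, `steps`, and `lag_order` are assumptions made for the example.

```{r stationarity-sketch}
# Illustrative sketch using base R only.
set.seed(3)
walk  <- cumsum(rnorm(120))  # random walk: non-stationary in levels
steps <- diff(walk)          # first differences: one observation shorter

# Lag-1 autocorrelation: typically close to 1 for a random walk in levels
# and close to 0 once the series has been differenced
acf(walk,  plot = FALSE)$acf[2]
acf(steps, plot = FALSE)$acf[2]

# Differencing costs one observation
length(walk)
length(steps)

# Check the differenced series against the minimum sample size
lag_order <- 2
min_obs <- 2 * lag_order + 2
length(steps) >= min_obs
```

The differenced series, rather than the levels, would then be passed to `granger_causality_test()`.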

## References

Granger, C. W. J. (1969). Investigating Causal Relations by Econometric Models and Cross-spectral Methods. *Econometrica*, 37(3), 424–438.