---
title: "Getting Started with Price Index Calculation using REPS"
author: ""
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with Price Index Calculation using REPS}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: ./REFERENCES.bib
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup, include=FALSE}
library(REPS)
data("data_constraxion")
```

## Introduction

The `calculate_hedonic_index()` function is the central entry point in `REPS` for computing price indices using various **hedonic-based methods**. It supports seven commonly used approaches:

- **Laspeyres** – hedonic double imputation base-weighted index  
- **Paasche** – hedonic double imputation current-weighted index  
- **Fisher** – geometric average of Laspeyres and Paasche (both hedonic double imputation)  
- **HMTS** – Hedonic Multilateral Time Series re-estimation Splicing  
- **Time Dummy** – single-regression hedonic index using time dummies  
- **Rolling Time Dummy** – chained index based on overlapping time-dummy regressions
- **Repricing** – quasi-repeat-sales method comparing observed vs predicted price changes

This vignette demonstrates how to apply each method using a consistent interface, making it easy to compare results across approaches.

The HMTS method implemented in `REPS` is a multilateral, time-series-based index that balances stability, limited revision, and early detection of turning points in the context of property price indices [@51c4602ed48c4adbb7b7d15176d2da7a].

For broader context and international guidelines on the compilation of property price indices, including traditional methods such as hedonic double imputation Laspeyres, Paasche, Fisher and Repricing, we refer to Eurostat's *Handbook on Residential Property Price Indices (RPPIs)* [@eurostat2013rppi]. For (Rolling) Time Dummy we refer to Hill et al. [@hill2018repricing; @hill2022rolling].


## Required Data

Before running any calculations, ensure that your dataset is available and contains the necessary variables:

```{r}
# Example dataset, loaded with data("data_constraxion") after library(REPS)
head(data_constraxion)
```

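The function is told which columns to use through the following arguments:

- `period_variable`: name of the column holding the time period  
- `dependent_variable`: name of the column to be explained, usually the price  
- `numerical_variables`: names of numerical explanatory columns, e.g., `floor_area`  
- `categorical_variables`: names of categorical explanatory columns, e.g., `neighbourhood_code`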
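
Before calling the index function, it can help to confirm that every column you intend to pass actually exists in the data. The following is a minimal base R sketch; `check_required_columns()` and `toy_data` are hypothetical names introduced here for illustration and are not part of `REPS`.

```r
# Hypothetical helper (not part of REPS): stop with a clear message
# if any of the required columns is missing from the data frame.
check_required_columns <- function(data, required) {
  missing_columns <- setdiff(required, names(data))
  if (length(missing_columns) > 0) {
    stop("Missing columns: ", paste(missing_columns, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy data frame standing in for your own dataset
toy_data <- data.frame(
  period = c("2015Q1", "2015Q2"),
  price = c(250000, 255000),
  floor_area = c(90, 110),
  neighbourhood_code = c("A", "B")
)

check_required_columns(
  toy_data,
  required = c("period", "price", "floor_area", "neighbourhood_code")
)
```

With your own data, pass the same column names that you plan to supply to `period_variable`, `dependent_variable`, `numerical_variables` and `categorical_variables`.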

Some numerical variables benefit from a log transformation before estimation. `floor_area`, for example, is often log-transformed so that its relationship with the dependent variable is closer to linear, the residual variance is more stable (homoscedastic), and extreme values carry less weight. All three effects help the regression assumptions hold.

Example of log-transforming `floor_area`:

```{r}
dataset <- data_constraxion
dataset$floor_area <- log(dataset$floor_area)
```
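
The effect of the transformation on linearity can be checked with a small simulation. The sketch below uses base R only. The simulated data frame `sim_log`, the coefficients used to generate it, and the choice of `log(price)` as dependent variable are all assumptions made for illustration; they are not taken from `data_constraxion`.

```r
# Simulated data: log(price) is linear in log(floor_area) by construction
set.seed(123)
n_obs <- 500
sim_log <- data.frame(floor_area = runif(n_obs, min = 30, max = 300))
sim_log$price <- exp(8 + 0.8 * log(sim_log$floor_area) + rnorm(n_obs, sd = 0.1))

# Same dependent variable, two specifications of floor_area
fit_raw <- lm(log(price) ~ floor_area, data = sim_log)
fit_log <- lm(log(price) ~ log(floor_area), data = sim_log)

# Compare goodness of fit of the two specifications
r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared
c(raw = r2_raw, log = r2_log)
```

On real data the comparison is not guaranteed to favour the log specification, so a residual plot (`plot(fit_log, which = 1)`) is a useful complementary check.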

## Using `calculate_hedonic_index()`

The `calculate_hedonic_index()` function provides a unified interface for estimating hedonic price indices. You select the method through the `method` argument, and the function handles the rest.

Supported methods:

- `"laspeyres"`: base-period weighted double imputation  
- `"paasche"`: current-period weighted double imputation  
- `"fisher"`: geometric average of Laspeyres and Paasche  
- `"hmts"`: Hedonic Multilateral Time Series with re-estimation and splicing  
- `"timedummy"`: single time-dummy regression (log-linear hedonic model with time factors)  
- `"rolling_timedummy"`: chained index from overlapping time-dummy regressions  
- `"repricing"`: compares observed vs predicted price growth in consecutive periods (quasi-repeat-sales)
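
To make the Laspeyres, Paasche and Fisher entries more concrete, here is a generic textbook sketch of hedonic double imputation in base R. It is not the code `REPS` runs internally. The simulated data frame `toy`, the helper `imputed_index()`, the single-covariate log-linear model and the use of geometric means are all assumptions made for illustration.

```r
# Simulated data for two periods; the true price level is 10% higher
# in the current period.
set.seed(1)
n <- 200
toy <- data.frame(
  period     = rep(c("base", "current"), each = n),
  floor_area = runif(2 * n, min = 40, max = 200)
)
toy$price <- exp(
  7 + 0.9 * log(toy$floor_area) +
    ifelse(toy$period == "current", log(1.10), 0) +
    rnorm(2 * n, sd = 0.05)
)

base_data    <- toy[toy$period == "base", ]
current_data <- toy[toy$period == "current", ]

# One hedonic regression per period
fit_base    <- lm(log(price) ~ log(floor_area), data = base_data)
fit_current <- lm(log(price) ~ log(floor_area), data = current_data)

# Double imputation: prices in both periods are predicted, never observed.
# The index is the geometric mean of the predicted price ratios.
imputed_index <- function(dwellings) {
  100 * exp(mean(predict(fit_current, newdata = dwellings) -
                   predict(fit_base, newdata = dwellings)))
}

laspeyres <- imputed_index(base_data)     # evaluated on base-period dwellings
paasche   <- imputed_index(current_data)  # evaluated on current-period dwellings
fisher    <- sqrt(laspeyres * paasche)    # geometric mean of the two

c(laspeyres = laspeyres, paasche = paasche, fisher = fisher)
```

The only difference between the Laspeyres-type and Paasche-type figures in this sketch is the set of dwellings on which the two fitted models are compared.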


### Example: Single Index Method - Time Dummy

```{r}
Tbl_TD <- REPS::calculate_hedonic_index(
  dataset = dataset,
  method = "timedummy",
  period_variable = "period",
  dependent_variable = "price",
  numerical_variables = c("floor_area", "dist_trainstation"),
  categorical_variables = c("dummy_large_city", "neighbourhood_code"),
  reference_period = 2015,
  number_of_observations = FALSE
)

head(Tbl_TD)
```
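
For intuition about what a time dummy index measures, the following base R sketch fits a single log-linear regression with period dummies on simulated data and converts the dummy coefficients into an index. This is the generic textbook construction and may differ in detail from what `REPS` does. The data frame `sim_td`, the period labels and the assumed price levels in `true_level` are illustrative assumptions.

```r
# Simulated data: four periods with known relative price levels
set.seed(42)
periods <- c("2015Q1", "2015Q2", "2015Q3", "2015Q4")
true_level <- c(1.00, 1.02, 1.05, 1.04)
n_per_period <- 150

sim_td <- data.frame(
  period = factor(rep(periods, each = n_per_period), levels = periods),
  floor_area = runif(length(periods) * n_per_period, min = 40, max = 200)
)
sim_td$price <- exp(
  7 + 0.9 * log(sim_td$floor_area) +
    log(true_level[as.integer(sim_td$period)]) +
    rnorm(nrow(sim_td), sd = 0.05)
)

# One pooled regression: characteristics plus a dummy for each period
fit_td <- lm(log(price) ~ period + log(floor_area), data = sim_td)

# Dummy coefficients are log price changes relative to the first period,
# so exponentiating them gives the index (first period = 100).
period_coefs <- coef(fit_td)[paste0("period", periods[-1])]
td_index <- data.frame(
  period = periods,
  index = 100 * exp(c(0, unname(period_coefs)))
)
td_index
```

Because all periods share one set of characteristic coefficients, adding a new period to the pooled regression can revise earlier index values, which is the motivation for rolling variants.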

### Example: Multiple Index Methods

```{r}
multi_result <- REPS::calculate_hedonic_index(
  method = c("fisher", "hmts", "laspeyres", "paasche", "repricing", "timedummy", "rolling_timedummy"),
  dataset = dataset,
  period_variable = "period",
  dependent_variable = "price",
  numerical_variables = c("floor_area", "dist_trainstation"),
  categorical_variables = c("neighbourhood_code", "dummy_large_city"),
  reference_period = "2015",
  number_of_observations = FALSE,
  periods_in_year = 4,
  number_preliminary_periods = 1,
  window_length = 4,
  production_since = NULL,
  resting_points = FALSE,
  imputation = FALSE
)

head(multi_result$fisher)
```
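
The idea of chaining overlapping time dummy regressions, which the `window_length` argument relates to, can be sketched in base R as follows. This uses a movement splice (each new window contributes only its last period-on-period change), which is one common textbook variant; the source does not state which splice `REPS` applies, so treat the chaining rule as an assumption. The data frame `sim_rtd`, the helper `window_index()` and the simulated 1% growth path are also illustrative.

```r
# Simulated data: eight periods with 1% price growth per period
set.seed(7)
periods <- paste0("P", 1:8)
true_level <- cumprod(c(1, rep(1.01, 7)))
n_per_period <- 100

sim_rtd <- data.frame(
  period = rep(periods, each = n_per_period),
  floor_area = runif(length(periods) * n_per_period, min = 40, max = 200)
)
sim_rtd$price <- exp(
  7 + 0.9 * log(sim_rtd$floor_area) +
    log(true_level[match(sim_rtd$period, periods)]) +
    rnorm(nrow(sim_rtd), sd = 0.02)
)

# Time dummy index for one window, relative to the window's first period
window_index <- function(window_periods) {
  window_data <- sim_rtd[sim_rtd$period %in% window_periods, ]
  window_data$period <- factor(window_data$period, levels = window_periods)
  fit <- lm(log(price) ~ period + log(floor_area), data = window_data)
  exp(c(0, unname(coef(fit)[paste0("period", window_periods[-1])])))
}

window_length <- 4

# The first window supplies the first `window_length` index values
rtd <- window_index(periods[1:window_length])

# Each later window extends the chain by its last period-on-period movement
for (t in (window_length + 1):length(periods)) {
  w <- window_index(periods[(t - window_length + 1):t])
  rtd[t] <- rtd[t - 1] * w[window_length] / w[window_length - 1]
}

rtd_index <- data.frame(period = periods, index = 100 * rtd)
rtd_index
```

Earlier index values are never recomputed in this scheme, so published figures stay fixed as new periods arrive.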

### Visualizing the Index

For quick and clear visualizations, the `plot_price_index()` utility function can be used to generate time-series plots of the calculated indices.

While we encourage users to create custom visualizations suited to their analytical needs, this built-in plotting function provides a convenient starting point for simple and consistent line plots.


```r
plot_price_index(multi_result)
```
```{r echo=FALSE, out.width="100%", fig.align="center"}
knitr::include_graphics("multi_index.png")
```
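
As a starting point for a custom visualization, the sketch below draws several index series with base R graphics. The list `toy_indices`, its values, and the column names `period` and `index` are assumptions for illustration; inspect the tables returned for your own data (for example with `str()`) and adapt the names accordingly.

```r
# Hypothetical index tables, one per method (values are made up)
toy_indices <- list(
  method_a = data.frame(
    period = c("2015Q1", "2015Q2", "2015Q3", "2015Q4"),
    index  = c(100.0, 101.5, 103.2, 104.0)
  ),
  method_b = data.frame(
    period = c("2015Q1", "2015Q2", "2015Q3", "2015Q4"),
    index  = c(100.0, 101.1, 103.9, 104.6)
  )
)

# One row per period, one column per method
index_matrix <- sapply(toy_indices, function(tbl) tbl$index)
rownames(index_matrix) <- toy_indices[[1]]$period

# Line plot with period labels on the x-axis and a legend per method
line_colours <- seq_len(ncol(index_matrix))
matplot(index_matrix, type = "l", lty = 1, col = line_colours,
        xaxt = "n", xlab = "Period", ylab = "Index")
axis(1, at = seq_len(nrow(index_matrix)), labels = rownames(index_matrix))
legend("topleft", legend = colnames(index_matrix),
       col = line_colours, lty = 1, bty = "n")
```

The sketch assumes all tables cover the same periods in the same order; series of different lengths would need to be merged on the period column first.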

## Summary

The `calculate_hedonic_index()` function streamlines access to multiple hedonic index methods via a consistent interface. This allows analysts to easily compare outputs and select the most appropriate method for their context.

## References
