---
vignette: >
  %\VignetteIndexEntry{Example: cumulative median}
  %\VignetteEngine{litedown::vignette}
  %\VignetteEncoding{UTF-8}
---

In this vignette we compare two different implementations of the
cumulative median. The cumstats package provides a naive method, which
calls the standard median function on each prefix of the data. Each
call is log-linear in the prefix length, so the total expected
complexity is log-quadratic in the number of data points. The
binsegRcpp package provides a different implementation that uses a
log-linear algorithm, previously described in [the 2017 NeurIPS
research paper Maximum Margin Interval Trees by Alexandre Drouin, Toby
Hocking, and Francois
Laviolette](https://proceedings.neurips.cc/paper/2017/hash/2288f691b58edecadcc9a8691762b4fd-Abstract.html).
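To make the naive approach concrete, here is a minimal pure-R sketch of the cumulative median on a small vector (the data values are arbitrary). Note that `median` re-sorts each prefix `x[1:k]` from scratch, which is what makes the total cost log-quadratic.

```r
## Naive cumulative median: median() is called once per prefix,
## re-sorting that prefix from scratch each time.
x <- c(3, 1, 4, 1, 5)
naive <- sapply(seq_along(x), function(k) median(x[1:k]))
naive
## [1] 3 2 3 2 3
```

For the even-length prefixes, `median` averages the two middle values, e.g. the prefix `c(3, 1)` gives 2.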

```{r}
## copied from library(cumstats), thanks @TimTaylor, https://github.com/tdhock/atime/issues/89
cummedian <- function(x) sapply(seq_along(x), function(k, z) median(z[1:k]), z = x)
expr.list <- c(
  if (requireNamespace("binsegRcpp")) atime::atime_grid(
    "binsegRcpp::cum_median"=binsegRcpp::cum_median(data.vec)),
  atime::atime_grid(
    cumsum=cumsum(data.vec),
    cummedian=cummedian(data.vec)))
atime.list <- atime::atime(
  N=2^seq(1, 20),
  setup={
    set.seed(1)
    data.vec <- rnorm(N)
  },
  result=TRUE,
  expr.list=expr.list,
  times=5)
plot(atime.list)
```

```{r}
(best.list <- atime::references_best(atime.list))
## try() to avoid the CRAN error "'from' must be a finite number" on
## https://www.stats.ox.ac.uk/pub/bdr/Rblas/README.txt, due to
## https://github.com/r-lib/scales/issues/307
try(plot(best.list))
```

Exercise for the reader: increase `seconds.limit` and max `N` until
you can clearly show that `binsegRcpp::cum_median` should be the
preferred method for computing the cumulative median.
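As a starting point, one way to modify the benchmark above might look like the following sketch (not evaluated here; the `seconds.limit` value and the exponent range for `N` are illustrative choices, not recommendations):

```r
## Sketch: re-run the benchmark with a larger time budget and larger N.
## seconds.limit = 1 and N up to 2^24 are illustrative assumptions.
atime.bigger <- atime::atime(
  N = 2^seq(1, 24),
  setup = {
    set.seed(1)
    data.vec <- rnorm(N)
  },
  seconds.limit = 1,
  expr.list = expr.list,
  times = 5)
plot(atime.bigger)
```

Larger `seconds.limit` lets the slower expressions run at larger `N` before being dropped, which makes the asymptotic gap between the two methods easier to see.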
