---
title: "Parameter Estimation of the Geometric Model of Visual Meteor Magnitudes"
date: "`r Sys.Date()`"
output:
    rmarkdown::html_vignette:
        toc: true
        fig_width: 6
        fig_height: 4
vignette: >
  %\VignetteIndexEntry{Parameter Estimation of the Geometric Model of Visual Meteor Magnitudes}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
library(vismeteor)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

The geometric model of visual meteor magnitudes is a commonly used statistical approach
to describe the magnitude distribution of a meteor shower. The observable magnitude distribution
of meteors is then
$$
P[M = m] \sim
\begin{cases}
  f(m_{\mathrm{lim}} - m)\, r^m, & \text{if } m_{\mathrm{lim}} - m > -0.5,\\[5pt]
  0 & \text{otherwise,}
\end{cases}
$$

where $m_{\mathrm{lim}}$ denotes the limiting (non-integer) magnitude of the observation,
and $m$ the integer meteor magnitude.
The function $f(\cdot)$ denotes the perception probability function.

The estimation of the population index r, briefly called the r-value,
is a common task in the evaluation of meteor magnitudes.
Here we demonstrate two methods to estimate this parameter.

First, we obtain some magnitude observations from the example data set,
which also includes the limiting magnitude.

```{r, echo=TRUE, results='hide'}
observations <- with(PER_2015_magn$observations, {
    idx <- !is.na(lim.magn) & sl.start > 135.81 & sl.end < 135.87
    data.frame(
        magn.id = magn.id[idx],
        lim.magn = lim.magn[idx]
    )
})
head(observations, 5) # Example values
```

```{r, echo=FALSE, results='asis'}
knitr::kable(head(observations, 5))
```

Next, the observed meteor magnitudes are matched with the corresponding observations.
This is necessary as we need the limiting magnitudes of the observations to determine
the r-value.

Using

```{r, echo=TRUE, results='hide'}
magnitudes <- with(new.env(), {
  magnitudes <- merge(
    observations,
    as.data.frame(PER_2015_magn$magnitudes),
    by = 'magn.id'
  )
  magnitudes$magn <- as.integer(as.character(magnitudes$magn))
  subset(magnitudes, (magnitudes$lim.magn - magnitudes$magn) > -0.5)
})
head(magnitudes, 5) # Example values
```

we obtain a data frame with the absolute observed frequencies `Freq` for each
observation of a magnitude class. The expression
`subset(magnitudes, (magnitudes$lim.magn - magnitudes$magn) > -0.5`
ensures that meteors fainter than the limiting magnitude are not used if they exist.

```{r, echo=FALSE, results='asis'}
knitr::kable(head(magnitudes[magnitudes$Freq>0,], 5))
```

This data frame contains a total of `r sum(magnitudes$Freq)` meteors.
This is a sufficiently large number to estimate the r-value.


## Maximum Likelihood Method

The maximum likelihood method can be used to estimate the r-value in an asymptotically
unbiased manner. For this, the function `dvmgeom()` is needed, which returns the
probability density of the observable meteor magnitudes when the r-value and the
limiting magnitudes are known.

The following algorithm estimates the r-value by maximizing the likelihood with
the `optim()` function. The function `ll()` returns the negative log-likelihood,
as `optim()` identifies a minimum.

```{r, echo=TRUE, results='hide'}
# maximum likelihood estimation (MLE) of r
result.ml <- with(magnitudes, {
    # log likelihood function
    ll <- function(r) -sum(Freq * dvmgeom(magn, lim.magn, r, log=TRUE))
    r.start <- 2.0 # starting value
    r.lower <- 1.2 # lowest expected value
    r.upper <- 4.0 # highest expected value
    # find minimum
    optim(r.start, ll, method='Brent', lower=r.lower, upper=r.upper, hessian=TRUE)
})
```

This gives the expected value and the variance of the r-value:

```{r, echo=TRUE}
print(result.ml$par) # mean of r
print(1/result.ml$hessian[1][1]) # variance of r
```

We can additionally visualize the likelihood function here.

```{r, echo=TRUE, results='hide'}
with(new.env(), {
  data.plot <- data.frame(r = seq(2.0, 2.8, 0.01))
  data.plot$ll <- mapply(function(r){
    with(magnitudes, {
      # log likelihood function
      sum(Freq * dvmgeom(magn, lim.magn, r, log = TRUE))
    })
  }, data.plot$r)
  data.plot$l <- exp(data.plot$ll - max(data.plot$ll))
  data.plot$l <- data.plot$l / sum(data.plot$l)
  brks <- seq(min(data.plot$r) - 0.02, max(data.plot$r) + 0.02, by = 0.02)
  plot(data.plot$r, data.plot$l,
    #breaks = brks,
    type = "l",
    col = "blue",
    xlab = "r",
    xaxt = "n",
    ylab = "likelihood"
  )
  xlabels = seq(min(round(data.plot$r, 1)) - 0.1, max(round(data.plot$r, 1)) + 0.1, by = 0.1)
  axis(
    side = 1,
    at = xlabels,
    labels = sprintf("%.1f", xlabels)
  )
  abline(v = result.ml$par, col = "red", lwd = 1)
})
```

The likelihood function is approximately a normal distribution in this case. This is important in this context because the variance of the estimated r-value is derived from the curvature (the second derivative at the maximum) of the log-likelihood function.


## By Rate

This method leverages the idea that we can estimate how many meteors would have been observed if the limiting magnitude had been reduced by `1`. This yields the statistic

$$
\mathbb{E}\!\left[\frac{1}{r}\right] = \frac{1}{n}\sum \frac{
    f(m_{\mathrm{lim}} - m - 1)
}{
    f(m_{\mathrm{lim}} - m)
}
$$

where $f(\cdot)$ denotes the perception probability function.

The appeal of this statistic is that the r-value can be estimated by taking a mean of the ratio above.
Very bright meteors are inherently grouped into a single magnitude class whenever their
perception probability is approximately `1.0`, which suppresses outliers in this part of the magnitude range.
The resulting estimator is therefore inexpensive to compute and directly comparable to the likelihood-based approaches discussed earlier.

We estimate $r = 1/a$ from the sample mean of $a$. Since $r$ is a function
of $a$, we apply the delta method to correct for bias and to compute the variance of $r$.
The delta method uses a Taylor expansion around the mean of $a$ to approximate the
distribution of $r$.

```{r, echo=TRUE, results='hide'}
result.rate <- with(magnitudes, {
    N <- sum(Freq)
    a <- vmperception(lim.magn - magn - 1)/vmperception(lim.magn - magn)
    a.mean <- as.numeric(weighted.mean(a, w = Freq))
    a.var <- as.numeric(cov.wt(cbind(a), wt = Freq)$cov) / N
    # apply the delta method and return the result
    list(
        'mean' = 1/a.mean - a.var/a.mean^3,
        'var' = a.var/a.mean^4
    )
})
```

This gives the expected value and the variance of the r-value:

```{r, echo=TRUE}
print(result.rate$mean) # mean of r
print(result.rate$var) # variance of r
```

Using the bootstrap method, it can be assessed whether the mean is normally distributed.

```{r, echo=TRUE, results='hide'}
# Bootstrapping Method
r.means <- with(magnitudes, {
  N <- sum(Freq)
  a <- vmperception(lim.magn - magn - 1)/vmperception(lim.magn - magn)
  replicate(50000, {
    s <- sample(a, size = N, replace = TRUE, prob = Freq)
    s.mean <- mean(s)
    s.var <- var(s)/N
    1/s.mean - s.var/s.mean^3
  })
})
```

The graphical representation indicates that this is indeed approximately the case.

```{r, echo=TRUE, results='show'}
with(new.env(), {
  r.sd <- sqrt(result.rate$var)
  r.min <- result.rate$mean - 3 * r.sd
  r.max <- result.rate$mean + 3 * r.sd
  r <- subset(r.means, r.means > r.min & r.means < r.max)
  brks <- seq(min(r) - 0.02, max(r) + 0.02, by = 0.02)
  hist(r,
    breaks = brks,
    col = "skyblue",
    border = "black",
    main = "Histogram of mean r",
    xlab = "r",
    xaxt = "n",
    ylab = "count"
  )
  xlabels = seq(min(round(r, 1)) - 0.1, max(round(r, 1)) + 0.1, by = 0.1)
  axis(
    side = 1,
    at = xlabels,
    labels = sprintf("%.1f", xlabels)
  )
  abline(v = result.rate$mean, col = "red", lwd = 1)
})
```


## Variance-Stabilizing Transformation as an Alternative Method

Estimation based on the maximum likelihood principle is computationally demanding.
As an alternative, a variance-stabilizing transformation can be applied.
This transformation maps meteor magnitudes onto a different scale,
yielding a distribution whose variance no longer depends on the parameter `r`.

The variance-stabilizing transformation has the following additional advantages:

- Since the variance does not depend on the r-value, the accuracy of estimators (e.g., maximum likelihood estimators) is easier to assess,
- The variance bound (Cramér–Rao lower bound) then often depends only on the sample size `n`, but not on the true r-value,
- Estimators such as the sample mean are homoscedastic, i.e., their dispersion is constant across the parameter space. As a result, many classical results of linear estimation theory (BLUE properties, Gauss–Markov theorem) hold without additional transformations,
- Because the variance is fixed, the calculation of standard errors is independent of the unknown r-value,
- Confidence intervals can be more easily standardized and are equally well calibrated across the entire parameter space,
- Test statistics (e.g., likelihood ratio or score tests) exhibit a more uniform distribution, since the variance does not need to be treated as an additional unknown. This improves test power and simplifies asymptotic approximations.

The resulting procedure is straightforward: it suffices to compute the mean of
the transformed meteor magnitudes, from which an estimate of the parameter `r` is obtained.

To convert the mean of the transformed values back to the original `r` scale, we apply
the delta method. This accounts for the nonlinearity of the back-transformation and
provides both a bias correction (second-order term).

```{r, echo=TRUE, results='show'}
result.vs <- with(magnitudes, {
  N <- sum(Freq)
  tm <- vmgeomVstFromMagn(magn, lim.magn)
  tm.mean <- sum(Freq * tm)/N
  tm.var <- sum(Freq * (tm - tm.mean)^2)/(N-1)
  tm.mean.var <- tm.var / N

  # Delta method: variance and variance of r
  r.hat <- vmgeomVstToR(tm.mean)
  dr_dtm <- vmgeomVstToR(tm.mean, deriv.degree = 1L)
  d2r_dtm2 <- vmgeomVstToR(tm.mean, deriv.degree = 2L)
  r.hat <- r.hat - 0.5 * d2r_dtm2 * tm.mean.var
  r.var <- dr_dtm^2 * tm.mean.var + 0.5 * d2r_dtm2^2 * tm.mean.var^2

  list('mean' = r.hat, 'var' = r.var)
})
```

Thus, one obtains the mean and the variance of the mean of `r``.

```{r, echo=TRUE, results='show'}
print(paste('r mean:', result.vs$mean))
print(paste('r var:', result.vs$var))
```

Using the bootstrap method, it can be assessed whether the mean is normally distributed.

```{r, echo=TRUE, results='hide'}
# Bootstrapping Method
r.means <- with(magnitudes, {
  N <- sum(Freq)
  tm <- vmgeomVstFromMagn(magn, lim.magn)
  replicate(50000, {
    s <- sample(tm, size = N, replace = TRUE, prob = Freq)
    s.mean <- mean(s)
    s.var <- var(s)/N

    r.hat <- vmgeomVstToR(s.mean)
    d2r_ds2 <- vmgeomVstToR(s.mean, deriv.degree = 2L)
    r.hat - 0.5 * d2r_ds2 * s.var
  })
})
```

The graphical representation indicates that this is indeed approximately the case.

```{r, echo=TRUE, results='show'}
with(new.env(), {
  r.sd <- sqrt(result.vs$var)
  r.min <- as.vector(result.vs$mean - 3 * r.sd)
  r.max <- as.vector(result.vs$mean + 3 * r.sd)
  r <- subset(r.means, r.means > r.min & r.means < r.max)
  brks <- seq(min(r) - 0.02, max(r) + 0.02, by = 0.02)
  hist(r,
    breaks = brks,
    col = "skyblue",
    border = "black",
    main = "Histogram of mean r",
    xlab = "r",
    xaxt = "n",
    ylab = "count"
  )
  xlabels = seq(min(round(r, 1)) - 0.1, max(round(r, 1)) + 0.1, by = 0.1)
  axis(
    side = 1,
    at = xlabels,
    labels = sprintf("%.1f", xlabels)
  )
  abline(v = result.vs$mean, col = "red", lwd = 1)
})
```

## Residual Analysis

So far, we have operated under the assumption that the real distribution of meteor magnitudes
is exponential and that the perception probabilities are accurate.
We now use the Chi-Square goodness-of-fit test to check whether the observed frequencies match
the expected frequencies. Then, using the estimated r-value, we retrieve the relative
frequencies `p` for each observation and add them to the data frame `magnitudes`:

```{r, echo=TRUE, results='hide'}
magnitudes$p <- with(magnitudes, dvmgeom(m = magn, lm = lim.magn, result.rate$mean))
```

We must also consider the probabilities for the magnitude class with the brightest meteors.

```{r, echo=TRUE, results='hide'}
magn.min <- min(magnitudes$magn)
```

The smallest magnitude class `magn.min` is `r magn.min`. In calculating the probabilities,
we assume that the magnitude class `r magn.min` contains meteors that are either brighter
or equally bright as `r magn.min` and thus use the function `pvmgeom()` to determine
their probability.

```{r, echo=TRUE, results='asis'}
idx <- magnitudes$magn == magn.min
magnitudes$p[idx] <- with(
    magnitudes[idx,],
    pvmgeom(m = magn + 1L, lm = lim.magn, result.rate$mean, lower.tail = TRUE)
)
```

This ensures that the probability of observing a meteor of any given magnitude is 100%.
This is known as the normalization condition. Accordingly, the Chi-Square goodness-of-fit test
will fail if this condition is not met.

We now create the contingency table `magnitutes.observed` for the observed meteor magnitudes
and its margin table.

```{r, echo=TRUE}
magnitutes.observed <- xtabs(Freq ~ magn.id + magn, data = magnitudes)
magnitutes.observed.mt <- margin.table(magnitutes.observed, margin = 2) 
print(magnitutes.observed.mt)
```

Next, we check which magnitude classes need to be aggregated so that each contains
at least 10 meteors, allowing us to perform a Chi-Square goodness-of-fit test.

The last output shows that meteors of magnitude class `0` or brighter must be combined into
a magnitude class `0-`. Meteors with a brightness less than `4` are grouped here in the
magnitude class `4+`, and a new contingency table magnitudes.observed is created:

```{r, echo=TRUE}
magnitudes$magn[magnitudes$magn <= 0] <- '0-'
magnitudes$magn[magnitudes$magn >= 4] <- '4+'
magnitutes.observed <- xtabs(Freq ~ magn.id + magn, data = magnitudes)
print(margin.table(magnitutes.observed, margin = 2))
```

We now need the corresponding expected relative frequencies

```{r, echo=TRUE}
magnitutes.expected <- xtabs(p ~ magn.id + magn, data = magnitudes)
magnitutes.row_freq <- margin.table(magnitutes.observed, margin = 1)
magnitutes.expected <- sweep(magnitutes.expected, 1, magnitutes.row_freq, `*`)
magnitutes.expected <- magnitutes.expected/sum(magnitutes.expected)
print(sum(magnitudes$Freq) * margin.table(magnitutes.expected, margin = 2))
```

and then carry out the Chi-Square goodness-of-fit test:

```{r, echo=TRUE, results='asis'}
chisq.test.result <- chisq.test(
    x = margin.table(magnitutes.observed, margin = 2),
    p = margin.table(magnitutes.expected, margin = 2)
)
```

As a result, we obtain the p-value:

```{r, echo=TRUE}
chi2.df <- chisq.test.result$parameter - 1
chi2.pval <- pchisq(chisq.test.result$statistic, df = chi2.df, lower.tail = FALSE)
print(chi2.pval)
```

If we set the level of significance at 5 percent, then it is clear that the p-value with
`r unname(chi2.pval)` is greater than 0.05. Thus, under the assumption that the
magnitude distribution follows an geometric meteor magnitude distribution and assuming that
the perception probabilities are correct (i.e., error-free or precisely known),
the assumptions cannot be rejected. However, the converse is not true; the assumptions
may not necessarily be correct. The total count of meteors here is too small for such
a conclusion.

To verify the p-value, we also graphically represent the Pearson residuals:

```{r, fig.show='hold'}
chisq.test.residuals <- with(new.env(), {
    chisq.test.residuals <- residuals(chisq.test.result)
    v <- as.vector(chisq.test.residuals)
    names(v) <- names(chisq.test.residuals)
    v
})
plot(
    chisq.test.residuals,
    main="Residuals of the chi-square goodness-of-fit test",
    xlab="m",
    ylab="Residuals",
    ylim=c(-3, 3),
    xaxt = "n"
)
abline(h=0.0, lwd=2)
axis(1, at = seq_along(chisq.test.residuals), labels = names(chisq.test.residuals))
```