---
title: "Troubleshooting model fits"
author: "Vikram B. Baliga"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Troubleshooting model fits}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

Automatically finding good initial values for parameters in a nonlinear model
(i.e. `stats::nls()`) is an art. Given that each of the formulas represented by
the `model` argument of `fit_gaussian_2D()` contains 5 to 7 parameters,
`stats::nls()` will often encounter singular gradients or step size errors. 

Code within `fit_gaussian_2D()` will first scan the supplied dataset to
guesstimate sensible initial parameters, which hopefully sidesteps these issues.
But there is no guarantee this strategy will always work.

This vignette will offer some guidance on what to do when `stats::nls()` fails
to converge, including the use of optional parameters in `fit_gaussian_2D()`
that are meant to help you address these issues.

Let's start by loading `gaussplotR` and getting some sample data loaded up:
```{r setup}
library(gaussplotR)

## Load the sample data set
data(gaussplot_sample_data)

## The raw data we'd like to use are in columns 1:3
samp_dat <-
  gaussplot_sample_data[,1:3]
```

## Singular gradients
One common problem is that of singular gradients. I will intentionally
comment out the next block of code because running it will produce the singular
gradient error, and generating errors in an R Markdown file will prevent its
rendering. Please un-comment the example below to see.

```{r singular_gradient}
## Un-comment this example if you'd like to see a singular gradient error
# gauss_fit_cir <-
#   fit_gaussian_2D(samp_dat,
#                   constrain_amplitude = TRUE,
#                   method = "circular")
```
The output from the above example should be:  
```{r error_msg_1}
#> Error in stats::nls(response ~ Amp_init * exp(-((((X_values - X_peak)^2)/(2 *  : 
#>   singular gradient
#> Called from: stats::nls(response ~ Amp_init * exp(-((((X_values - X_peak)^2)/(2 * 
#>     X_sig^2) + ((Y_values - Y_peak)^2)/(2 * Y_sig^2)))), start = c(X_peak = #> _peak_init, >
#>     Y_peak = Y_peak_init, X_sig = X_sig_init, Y_sig = Y_sig_init), 
#>     data = data, trace = verbose, control = list(maxiter = maxiter, 
#>         ...))
#> Error during wrapup: unimplemented type (29) in 'eval'
#> 
#> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
#> Error during wrapup: INTEGER() can only be applied to a 'integer', not a 'unknown type #> #29'
#> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
```


There are a couple tools in `gaussplotR` that can help you address this problem.

A good first step is to enable the optional argument `print_initial_params` in
`fit_gaussian_2D()` by setting it to `TRUE`. Again, please un-comment this next
block, since it will still produce an error:

```{r singular_gradient_print_params}
## Un-comment this example if you'd like to see a singular gradient error
# gauss_fit_cir <-
#   fit_gaussian_2D(samp_dat,
#                   constrain_amplitude = TRUE,
#                   method = "circular",
#                   print_initial_params = TRUE)
```

Though this block of code will not work, you will at least see something helpful
at the beginning of the error message:
```{r error_msg_2}
#> Initial parameters:
#>       Amp    X_peak    Y_peak     X_sig     Y_sig 
#> 25.725293 -2.000000  3.000000  2.482892  2.500000 
#> Error in stats::nls(response ~ Amp_init * exp(-((((X_values - X_peak)^2)/(2 *  : 
#>   singular gradient
#> Called from: stats::nls(response ~ Amp_init * exp(-((((X_values - X_peak)^2)/(2 * 
#>     X_sig^2) + ((Y_values - Y_peak)^2)/(2 * Y_sig^2)))), start = c(X_peak = #> _peak_init, >
#>     Y_peak = Y_peak_init, X_sig = X_sig_init, Y_sig = Y_sig_init), 
#>     data = data, trace = verbose, control = list(maxiter = maxiter, 
#>         ...))
#> Error during wrapup: unimplemented type (29) in 'eval'
#> 
#> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
#> Error during wrapup: INTEGER() can only be applied to a 'integer', not a 'unknown type #> #29'
#> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
```

Those first three lines indicate the initial values that were used. Often,
singular gradients will arise when initial values for parameters were poorly
chosen (sorry!). 

What you can do is supply your own set of initial values. To do this, use the 
optional argument `user_init` within `fit_gaussian_2D()`. You will need to 
supply a numeric vector that is of the same length as the number of parameters
for your chosen model. The values you supply must be provided in the same order
they appear in the Initial parameters message. They do not need to be named; 
values alone will suffice. 

This next example will work. I'll keep `print_initial_params` on too, since
it can be nice to see:
```{r no_singular_user_init}
## This should run with no errors
gauss_fit_cir_user <-
  fit_gaussian_2D(samp_dat,
                  constrain_amplitude = TRUE,
                  method = "circular",
                  user_init = c(25.72529, -2.5, 1.7, 1.3, 1.6),
                  print_initial_params = TRUE)
gauss_fit_cir_user
```

Note that although we are constraining the amplitude, the value of `Amp` must
still be provided (here it is `25.72529`). 

It may take some trial and error to find a set of `user_init` values that gets
your model to converge. It often makes sense to think about what values are
feasible for each parameter. For example, it should be relatively straightforward
to think of ranges of possible values for `X_peak` and `Y_peak`. I often find
that finding good initial values for the "spread" parameters (such as `X_sig`
and `Y_sig`) is the tough nut to crack, so I recommend tweaking those parameters
first.

## Additional control arguments to `nls()`
The `fit_gaussian_2D()` function also allows you to pass additional control 
arguments to `stats::nls.control()` via the `...` argument. 

To put this in more technical terms, arguments supplied to `...` are handled as:  
`stats::nls(control = list(maxiter, ...))`

Therefore, if you are interested in changing e.g. `minFactor` to `1/2048`:  
`fit_gaussian_2D(data, model, minFactor = 1/2048)`

See the Help file for `stats::nls.control()` for further guidance on what these
control arguments are.

Please also note that that tweaking `maxiter` should not be handled via `...`
but rather by the `maxiter` argument to `fit_gaussian_2D()`.


## Use parameter constraints with caution

Our scapegoat here is setting `constrain_amplitude = TRUE`. Often, when
constraining parameters in a nonlinear model, you'll find yourself in a scenario
where the QR decomposition of the gradient matrix is not of full column rank.

Constraining parameters (amplitude or orientation) will lead to poorer-fitting
Gaussians anyway, so these features should only be used if you have an *a
priori* reason to do so (see examples in Priebe et al. 2003^[Priebe NJ,
Cassanello CR, Lisberger SG. The neural representation of speed in macaque area
MT/V5. J Neurosci. 2003 Jul 2;23(13):5650-61. doi:
10.1523/JNEUROSCI.23-13-05650.2003.])

Turning off the `constrain_amplitude` constraint alleviates the problem in this
particular case:

```{r no_singular}
## This should run with no errors
gauss_fit_cir <-
  fit_gaussian_2D(samp_dat,
                  method = "circular")
gauss_fit_cir
```

Hope this helps!

🐢