---
title: "Allowing the lower asymptote parameter to vary freely"
author: "Thomas Matheis, Phineus Choi, Sam Butler, Mira Terdiman, Johanna Hardin"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Allowing the lower asymptote parameter to vary freely}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Introduction

There may be situations where we want to estimate the lower asymptote of $h_0$ freely in our model rather than assuming it always starts at zero, which is what **sicegar** assumes by default. 
For this purpose, the functions `fitAndCategorize()` and `figureModelCurves()` contain the argument `use_h0` (which has a default value set to `FALSE`). 
Setting the argument to `TRUE` results in the same process as usual, using functions ending in `_h0` instead of their default counterparts.
For example, the functions `multipleFitFunction()`, `doublesigmoidalFitFormula()`, `parameterCalculation()`, and `normalizeData()` have `_h0` counterparts, `multipleFitFunction_h0()`, `doublesigmoidalFitFormula_h0()`, `parameterCalculation_h0()`, and `normalizeData_h0()`.

```{r install_packages, echo=FALSE, warning=FALSE, results='hide',message=FALSE}

###*****************************
# INITIAL COMMANDS TO RESET THE SYSTEM
seedNo=14159
set.seed(seedNo)
###*****************************

###*****************************
require("sicegar")
require("dplyr")
require("ggplot2")
require("cowplot")
###*****************************
```

We will demonstrate the differences between letting $h_0$ be estimated freely and assuming it is fixed at zero, first generating data where $h_0$ is not zero:

```{r generate_data}
noise_parameter <- 1
reps <- 5
time <- rep(seq(3, 24, 3), reps)
mean_values <- doublesigmoidalFitFormula_h0(time,
                                       finalAsymptoteIntensityRatio = .3,
                                       maximum = 10,
                                       slope1Param = 1,
                                       midPoint1Param = 7,
                                       slope2Param = 1,
                                       midPointDistanceParam = 8,
                                       h0 = 3)
intensity <- rnorm(n = length(mean_values), mean = mean_values, sd = rep(noise_parameter, length(mean_values)))

dataInput <- data.frame(time, intensity)
ggplot(dataInput, aes(time, intensity)) + 
  geom_point() + 
  scale_y_continuous(limits = c(-1, 13), expand = expansion(mult = c(0, 0))) + 
  theme_bw()
```

## Fitting the models to the data

`fitAndCategorize()` can be applied to the data, first with default arguments and second by setting the argument `use_h0` to `TRUE`:

```{r}
fitObj_zero <- fitAndCategorize(dataInput,
                           threshold_minimum_for_intensity_maximum = 0.3,
                           threshold_intensity_range = 0.1,
                           threshold_t0_max_int = 1E10,
                           use_h0 = FALSE)   # Default

fitObj_free <- fitAndCategorize(dataInput,
                           threshold_minimum_for_intensity_maximum = 0.3,
                           threshold_intensity_range = 0.1,
                           threshold_t0_max_int = 1E10,
                           use_h0 = TRUE)
```

Using `figureModelCurves()`, we can visualize the differences between using the default arguments and letting $h_0$ be freely estimated.

```{r zero_free_plots, fig.height=4, fig.width=8}
# Double-sigmoidal fit with parameter related lines
fig_a <- figureModelCurves(dataInput = fitObj_zero$normalizedInput,
                                  doubleSigmoidalFitVector = fitObj_zero$doubleSigmoidalModel,
                                  showParameterRelatedLines = TRUE,
                                  use_h0 = FALSE)   # Default

fig_b <- figureModelCurves(dataInput = fitObj_free$normalizedInput,
                                  doubleSigmoidalFitVector = fitObj_free$doubleSigmoidalModel,
                                  showParameterRelatedLines = TRUE,
                                  use_h0 = TRUE)

plot_grid(fig_a, fig_b, ncol = 2) # function from the cowplot package
```

It is clear that in this situation, using the default arguments result in a worse fit than when $h_0$ is allowed to be estimated freely.

## Model fitting components (h0 free)

To fit and plot individual models using a freely estimated $h_0$, we must directly call the `_h0` counterparts of each **sicegar** function. We have already generated the data (with $h_0 = 2$), so now we can normalize the data.

```{r normalize_data}
normalizedInput_free <- normalizeData(dataInput = dataInput, 
                                 dataInputName = "doubleSigmoidalSample")
head(normalizedInput_free$timeIntensityData) # the normalized time and intensity data
```

We can now call `multipleFitFunction_h0()` on our data to be fitted, calculating additional parameters using `parameterCalculation_h0()`:

```{r}
# Fit the double-sigmoidal model
doubleSigmoidalModel_free <- multipleFitFunction_h0(dataInput=normalizedInput_free,
                                            model="doublesigmoidal")

doubleSigmoidalModel_free <- parameterCalculation_h0(doubleSigmoidalModel_free)
```

Now that we have obtained a fit, we can use `figureModelCurves()` to plot:

```{r plot_raw_fit, echo=TRUE, message=FALSE, warning=FALSE, comment=FALSE, fig.height=4, fig.width=6}
# double-sigmoidal fit
figureModelCurves(dataInput = normalizedInput_free,
                  doubleSigmoidalFitVector = doubleSigmoidalModel_free,
                  showParameterRelatedLines = TRUE,
                  use_h0 = TRUE)
```

## Model parameters

Recall that the original model parameters (which generated the data) are given as `finalAsymptoteIntensityRatio = 0.3, maximum = 10, slope1Param = 1, midPoint1Param = 7, slope2Param = 1, midPointDistanceParam = 8, h0 = 2`.

We can recover the parameter estimates from both of the `doubleSigmoidalModel` objects created above.
`fitObj_zero` does not return a value for $h_0$ (because it is not part of the estimation process).
When $h_0$ is allowed to vary freely, the full set of parameters are estimated to be much closer to the data generating parameters (as opposed to when $h_0 = 0$ is forced).

```{r}
fitObj_zero$doubleSigmoidalModel |>
  dplyr::select(finalAsymptoteIntensityRatio_Estimate, maximum_Estimate, slope1Param_Estimate, midPoint1Param_Estimate,
         slope2Param_Estimate, midPointDistanceParam_Estimate) |> 
  c()


fitObj_free$doubleSigmoidalModel |>
  dplyr::select(finalAsymptoteIntensityRatio_Estimate, maximum_Estimate, slope1Param_Estimate, midPoint1Param_Estimate,
         slope2Param_Estimate, midPointDistanceParam_Estimate, h0_Estimate) |> c()


```