---
title: "Modeling Howto"
author: "N. Kehrein and contributors"
date: "24 October, 2024"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
  github_document:
    toc: true
  word_document:
    toc: true
vignette: >
  %\VignetteIndexEntry{Modeling Howto}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  fig.path = "../doc/figures/howto-",
  comment = "#>"
)
```

```{r setup, include=FALSE}
library(cvasi)
library(dplyr)
#options(conflicts.policy=FALSE,warn.conflicts=FALSE)
```

This Howto provides instructions on how to address certain modeling challenges and
offers additional details and context for certain features of the package. A final 
section provides a complete worked out example.
For a more general overview, please refer to the [manual](cvasi-1-manual.html).


## How to access scenario properties

The package provides a number of `set*()` functions to modify scenario
properties such as `set_param()` and `set_times()`. Analogous `set*` functions
are not provided because, generally, it should not be necessary to retrieve data
from scenarios. However, it is possible (but not recommended) to access all
scenario settings via objects *slots*. A *slot* is the name for an object
attribute in *R*. The slots of an object can be accessed by using the `@`
operator. It behaves similar to the `$` operator on named lists:


```{r}
# Create a new scenario object
myscenario <- Lemna_Schmitt()

# Get model name
get_model(myscenario)

# Set a custom tag to identify the scenario
myscenario %>%
  set_tag("Lab experiment #1") -> myscenario

# Get custom tag
get_tag(myscenario)

# The tag is also displayed when printing scenario properties
myscenario

# Accessing scenario slots and their default values
myscenario@forcings.req # forcings required for effect calculations
myscenario@endpoints    # available effect endpoints
myscenario@control.req  # are control runs required for effect calculation?
```

The previous example displays some of the default values of a
*Lemna_Schmitt* scenario. The set of available slots depends on the model
type and is documented in the package help. For instance, scenario
properties shared by all models are documented in the effect scenario
class:

  
```{r, eval=FALSE}
# Call the help page of effect scenarios class
?scenarios
```

A scenario class inherits all slots from its ancestors. A notable class
which modifies simulation behavior and provides additional scenario properties
is `Transferable`: it provides capabilities
to consider biomass transfers at defined time points during simulation.
Details about its class slots and functionality are described in the help pages.

```{r, eval=FALSE}
# Call the help page of the biomass transfer class
?Transferable
```

## Using *tidyr* syntax

The *tidyr* syntax, popularized by the *tidyverse* packages in R, provides a 
coherent and efficient approach to data manipulation and analysis. The *tidyverse*,
which includes the widely used *dplyr* and *ggplot2* packages, follows a 
standardized grammar that makes code more readable and intuitive. *tidyr* 
syntax emphasizes the use of functions with clear and descriptive names. This 
makes it easier for users to understand and reproduce analyses. The `%>%` (pipe) 
operator is a key component of *tidyr* syntax and enables fluent and expressive
concatenation of operations. Overall, the introduction of 
*tidyr* syntax improves code readability, reproducibility, and collaboration, 
resulting in maintainable data analysis pipelines.

In brief, the advantages of *tidyr* syntax are:

- a series of statements can be combined to an intuitive workflow using the
  pipeline (`%>%`) operator
- a short cut for the pipe (`%>%`) operator is Ctrl+Shift+M
- pipelines reduce the need for intermediary variables (but thoughtful intermediates are recommended)
- *tidy* verbs generally take `list` data types as input and return these also as output
- some verbs enrich outputs with additional information, effectively extending
  the output data to a table

```{r}
# The example scenario `metsulfuron` based on the Lemna model by Schmitt et al. (2013)
# is modified by setting a new exposure series and initial state. Then, it is
# simulated.
  metsulfuron %>%
    set_noexposure() %>%  # set no exposure (i.e., a control run)
    set_init(c(BM = 50)) %>%  # set initial biomass
    simulate()
```

## Predictions

TKTD models have both species and substance specificity. If risks are identified
at Tier 1 (standard test species approach) and exposure duration is expected to 
be shorter than in standard tests, the simplest solution is to develop TKTD 
models for standard test species. However, if Tier 2A (geometric mean/evidence 
weighting approach) or Tier 2B (species sensitivity distribution approach) 
information is available, it may be more appropriate to develop TKTD models for 
a wider range of species to increase the accuracy of the risk assessment. 
Validated TKTD models for these different species can be used as an alternative 
to assess specific risks using available field exposure profiles. This includes 
the calculation of exposure profile specific LPx/EPx values (multiplication 
factor for a specific overall exposure profile causing x % lethality or effect) 
based on an appropriate aquatic exposure assessment.

```{r}
# Initialize the random numbers generator
set.seed(123)
# Generating a random exposure series spanning 14 days
random_conc <- runif(15, min=0, max=0.1)
exposure_profile <- data.frame(time=0:14, conc=random_conc)
# Run EPx calculations
minnow_it %>%
  set_exposure(exposure_profile) %>%  # set a specific exposure scenario
  epx()  # run EPx calculations
```

## Moving exposure windows

A moving time window is a computing technique used in data analysis. Data is 
systematically analysed within a window of fixed length that moves or slides 
through the data set, in our case an exposure time-series. The window captures a 
subset of successive data points, and as it moves through the data; simulations
and effect calculations are performed for each window.


`effect()` can report effect levels for all evaluated exposure windows
on demand:
```{r}
# Derive effect levels of all exposure windows
metsulfuron %>% 
  set_window(length=7, interval=1) %>%
  effect(max_only=FALSE)
```

The resulting table describes how effect levels change when the exposure
window moves along the exposure series. It is also possible to specify the marginal
effect threshold of reported results (this prevents overinterpretation of spurious 
effect levels originating from marginal numerical errors introduced during 
simulation), as in the following example:

```{r}
# Only report effect levels larger than 1e-5 = 0.001%
metsulfuron %>% 
  set_window(length=7, interval=1) %>%
  effect(max_only=FALSE, marginal_effect=1e-5)
```

The effect over all moving windows can be visualized using `ggplot`:

```{r}
set.seed(123)
# Generate a random exposure series spanning 14 days
ts <- data.frame(time = 0:14,
                 conc = runif(15, min=0, max=0.1))

# Run EPx calculations for a window length of 7 days and a step size of 1 day
metsulfuron %>%
  set_exposure(ts) %>%
  set_window(length=7, interval=1) %>%
  effect(max_only=FALSE) -> results
results

# Create a plot of effects over time
library(ggplot2)
ggplot(results) +
  geom_point(aes(dat.start, BM*100)) +
  labs(x="Start of window (day)", y="Effect on biomass (%)")
```
The effect plot shows the effect for the time point where each window starts.
Effects are not available, and therefore not plotted, for time points where the
window exceeds the simulated timeframe.

## Simulating biomass transfers

A transfer refers to an event where a certain amount of biomass (BM) is moved to
a new medium after a period of time. This feature replicates a procedure 
occurring e.g. in *Lemna* effect studies and may be necessary to recreate study 
results. At each transfer, a defined amount of biomass is transferred to a new 
medium. This is modeled by interrupting the simulation at a transfer time point, 
modifying the biomass level BM, and scaling affected compartments according to 
new biomass levels. Scaling of compartments depending on biomass, such as 
internal toxicant mass, is necessary to correctly reflect mass balances and 
concentrations over time.

Option 1: Regular intervals
```{r}
metsulfuron %>%
  set_init(c(BM=1)) %>%
  set_noexposure() %>%
  set_transfer(interval=3, biomass=1) %>%
  simulate() -> result
result

library(ggplot2)
ggplot(result) +
  geom_line(aes(time, BM)) +
  labs(x="Time (days)", y="Biomass (g_dw/m2)", title="Biomass transfer every three days")

```

Option 2: Custom time points and custom biomass
```{r}
metsulfuron %>%
  set_init(c(BM=1)) %>%
  set_noexposure() %>%
  set_transfer(times=c(3, 6), biomass=c(1, 0.5)) %>%
  simulate() -> result2

library(ggplot2)
ggplot(result2) +
  geom_line(aes(time, BM)) +
  labs(x="Time (days)", y="Biomass (g_dw/m2)", title="Biomass transfer at custom time points")
```

```{r, eval=FALSE}
# Call the help page of set_transfer
?set_transfer
```

## Fitting model parameters

Parameters of a model can be fitted (calibrated) against observed effect data.
To fit a scenario to observed effect data:

1. Combine the necessary inputs (model, exposure time-series, effect data).
2. Fit the scenario for the given parameters and observed effect data.

For the first step, two options are available:

* When only one exposure level (e.g., one experimental treatment) and 
corresponding effect dataset (effect data only contain observations corresponding 
to one exposure scenario) are of interest, then a scenario consisting of
model parameters and exposure time-series can directly be fitted to
effect data using the `calibrate()` function. 

* As a more complex and versatile option, scenarios and effect data are combined into
one or more *calibration sets*. Each *calibration set* has exactly one scenario that
describes the parameters of the experiment. If more than one *calibration set*
is defined, scenario parameters can differ between set, but don't have to,
A numerical weight can be assigned to each *calibration set* to control its
impact on the calculated error term and derived fitting procedure. 
All *calibration sets* are then passed to the `calibrate()` function.

The following describes an example which fits selected parameters of the
*Lemna* model by Schmitt *et al.* (2013) to observed frond numbers from
experiments with the herbicide *metsulfuron*: 

Option 1: direct calibration with one scenario and one effect dataset

```{r warning=FALSE}
library(dplyr)
# set k_phot_fix, k_resp and Emax to the defaults recommended in Klein et al. (2022)
# for Tier 2C studies. Set initial biomass to 12.0 (original data is in fronds
# rather than biomass, but for sake of simplicity, no conversion was performed).
control <- metsulfuron %>% 
  set_param(c(k_phot_fix=TRUE, k_resp=0, Emax=1)) %>% 
  set_init(c(BM=12)) %>%
  set_noexposure() %>%
  set_bounds(list(k_phot_max=c(0, 0.5)))
# `metsulfuron` is an example scenario based on Schmitt et al. (2013) and therefore
# uses the historical, superseded nomenclature by Schmitt et al. We recommend using
# the newer SETAC Lemna model by Klein et al. (2022) for current applications,
# see `Lemna_SETAC()` for details.

# Get control data from Schmitt et al. (2013)
obs_control <- schmitt2013 %>%
  filter(trial == "T0")

# Fit parameter `k_phot_max` to observed data: `k_phot_max` is a physiological
# parameter which is typically fitted to the control data
fit1 <- calibrate(
  x = control,
  par = c(k_phot_max = 1),
  data = obs_control,
  output = "BM",
  method="Brent" # Brent is recommended for one-dimensional optimization
)
fit1$par

# Update the scenario with fitted parameter and simulate it
fitted_growth <- control %>% 
  set_param(fit1$par) %>%
  set_bounds(list(
    EC50=c(0, 1000),
    b=c(0.1, 10), 
    P_up=c(0, 0.1)
  ))
sim_mean <- fitted_growth %>%
  simulate() %>%
  mutate(trial="control")

# Exposure and observations in long format for plotting
treatments <- obs_control %>%
  select(time, conc) %>%
  mutate(trial="control")
obs_mean <- obs_control %>%
  select(time, BM=obs) %>%
  mutate(trial="control")

# Plot results
plot_sd(
  model_base = fitted_growth,
  treatments = treatments,
  rs_mean = sim_mean,
  obs_mean = obs_mean
)
```

Option 2: Create a list of *calibration sets*  and then fit TK/TD model parameters
  on all datasets and exposure levels at the same time:

```{r warning=FALSE}
# Use all trials and exposure series from Schmitt et al. (2013) as input for fitting
fit2 <- calibrate(
  fitted_growth,
  data=schmitt2013,
  par=c(EC50=0.3, b=4.16, P_up=0.005),
  output="BM",
  method="L-BFGS-B",
  verbose=FALSE
)
fit2$par

# Update the scenario with fitted parameter and simulate all trials
fitted_tktd <- fitted_growth %>%
  set_param(fit2$par)

treatments <- schmitt2013 %>% select(time, conc, trial)
rs_mean <- fitted_tktd %>%
  batch(treatments) %>%
  simulate()

# Observations in long format for plotting
obs_mean <- schmitt2013 %>% select(time, obs, trial)

# Plot results
plot_sd(
  model_base = fitted_tktd,
  treatments = treatments,
  rs_mean = rs_mean,
  obs_mean = obs_mean
)
```

The resulting scenario with fitted parameters shows a very good fit with
the observed effects from experiments.

## Obtaining confidence intervals for parameters

After model parameters have been calibrated during model fitting against 
observational data, the parameter estimates obtained can be 
further evaluated by assessing the 95% confidence intervals for each parameter 
through likelihood profiling. 

First, the profiling is conducted using the `lik_profile()` function. Then, the 
profiles can be visualized using `plot()`.

```{r}
# using the calibration set and calibrated parameters from the previous Lemna 
# Schmitt example, a likelihood profiling is done
# We set parameter boundaries to constrain the likelihood profiling within these 
# boundaries (this is optional)

# Calibrate a model to a list of calibration sets
fit <- calibrate(
  fitted_growth,
  data=schmitt2013,
  par=c(EC50=0.3, b=4.16, P_up=0.005),
  output="BM",
  method="L-BFGS-B",
  verbose=FALSE
)

# Conduct profiling
res <- lik_profile(x = fitted_growth,
                   data = schmitt2013,
                   par = fit$par, # the parameter values after calibration
                   output = "BM", # the observational output against which to 
                                  # compare the model fit
                   bounds = list(EC50_int = list(0.1, 4), 
                                 b = list(1, 5),
                                 P = list(0.0001, 0.2)))
# Visualise profiling results
plot(res)

# Access 95% confidence intervals of profiled parameters
res$EC50$confidence_interval
res$b$confidence_interval

```

Optionally, we can adapt caliset weights and tags, for example, a study 
was done by 2 labs (A and B), where the data of lab 2 should 
be considered to have more weight. In this case, the `individual` argument
needs to be set to TRUE.

```{r}
# Create calisets
cs <- td2cs(fitted_growth, tox_data(schmitt2013))
# Define tags and weights
lab_tags <- c("A", "B", "B", "A", "B", "B", "B")
lab_weights <- c(1, 1.5, 1.5, 1, 1.5, 1.5, 1.5)

for(i in 1:length(cs)){
  cs[[i]]@tag <- lab_tags[i]
  cs[[i]]@weight <- lab_weights[i]
}

res_2 <- lik_profile(x = cs, # the calibration set
                   par = fit$par, # the parameter values after calibration
                   output = "BM", # the observational output against which to 
                   # compare the model fit
                   bounds = list(EC50_int = list(0.1, 4), 
                                 b = list(1, 5),
                                 P = list(0.0001, 0.2)),
                   individual = TRUE)
                   
plot(res_2)


```

## Assessing parameter correlations and identifiability

Relations between estimates of the calibrated parameters can be 
visualized using a parameter space explorer. This supports identifying potential
parameter correlations and identifiability issues.

The parameter space explorer draws random parameter sets from the 95% confidence
intervals obtained for each parameter during likelihood profiling, and evaluates 
the fit (likelihood) of each of these parameter sets by comparing the model with
the original parameter set against the model with the new, randomly draw set.
 
First, the parameter space exploration is conducted using the `explore_space()` 
function. Then, the space can be visualized using `plot()`.


```{r, eval=FALSE}
# Call the help page for more information about the parameter space explorer
?explore_space

```

```{r}

# Calibrate a model to a list of calibration sets
cs <- td2cs(fitted_growth, tox_data(schmitt2013))
fit <- calibrate(
  cs,
  par=c(EC50=0.3, b=4.16, P_up=0.005),
  output="BM",
  method="L-BFGS-B",
  verbose=FALSE
)
# conduct profiling
res <- lik_profile(x = cs, # the calibration set
                   par = fit$par, # the parameter values after calibration
                   output = "BM", # the observational output against which to 
                                  # compare the model fit
                   bounds = list(EC50_int = list(0.1, 4), 
                                 b = list(1, 5),
                                 P = list(0.0001, 0.2)))
# Conduct space exploration
res_space <- explore_space(
  x = cs,
  par = fit2$par,
  res = res, # output of the likelihood profiling function
  output = "BM")

# Visualize the parameter space
plot(res_space)

```

## Changes in parameter values over time

Sequences can be used to represent changing conditions over time, such as a
change in model parameters which would otherwise be constant. This can be
used to represent events such as a pump failure or change in temperature.

The function `sequence()` creates an object which represents a sequence of several
scenarios. The sequence is treated as a single scenario and each scenario is
simulated one after the other.


```{r}
# A base scenario is created for the whole period, but parameters are only
# valid until day 7
sc1 <- metsulfuron %>%
  set_times(0:14)

# A parameter change occurs at day 7: k_loss increases from 0 to 0.1 d-1
sc2 <- sc1 %>%
  set_param(c(k_loss=0.1))

# Combine both scenarios to a sequence with a break at *t = 7*:
# scenario `sc1` will be simulated for *t = [0, 7]*, and
# scenario `sc2` will be simulated for *t = [7, 14]*.
sq <- sequence(list(sc1, sc2), breaks=7)
simulate(sq)
```
For more information:

```{r, eval=FALSE}
# Call the help page of `sequence`
?sequence
```

## Decrease assessment runtime

There are multiple ways to decrease the runtime of a simulation. One option 
is to increase the maximum step length of the solver, `hmax`. It is an
optional argument which can be passed to either `simulate()`, `effect()`, or
`epx()`. The larger `hmax`, the faster the simulation generally completes. 
However, be careful: The larger `hmax`, the greater the risk that simulation 
results will be inaccurate and simulations may even fail due to numerical issues.

```{r}
# Simulations with a maximum solver step length of hmax=0.01
metsulfuron %>%
  set_times(0:7) %>%
  simulate(hmax=0.1)
```

The calculation of *EPx* values may take a lot of time if one or more of the following
conditions apply:

1) a large number of scenarios is assessed
2) the simulated exposure time-series are very long
3) moving exposure windows are considered
   which are very small in comparison to the length of the exposure series

The `epx()` function is implemented in a way that it can parallelize
calculations by using more than one CPU. To enable parallel processing,
a command like the following needs to preceed the call to `epx()`:

```{r eval=FALSE}
future::plan(future::multisession)
```

If the computer running the calculation has `n` physical CPU cores, then the
time necessary to calculate *EPx* values will (in the best case) decrease by
the same factor.
For more information on how to make use of parallelization in R, please refer to the
`future` package:

```{r eval=FALSE}
vignette("future-1-overview", package="future")
```


## Implementing custom models

The set of supported models can be extended by users as needed.
Experimental or one-time use models can be implemented in a user's script and
inserted into the package's workflow.

The starting point to add a new model is an implementation of the model's
rules and dynamics in *R* code. In most cases, this will be a code snippet
which describes the model's Ordinary Differential Equations (ODE):

```{r, warning=FALSE}
## An exemplary implementation of the GUTS-RED-SD TKTD model ##
# Model ODEs following the deSolve specification
sd_ode <- function(t, state, params) {
  with(as.list(c(state, params)), {
    dDw <- kd * (Cw(t) - Dw)            # dDw/dt, scaled damage
    dH <- kk * max(0, Dw - z)           # dH/dt, cumulative hazard
    
    list(c(dDw, dH))                    # return derivatives
  })
}
```

The previous example implements the model equations in a way that
can be processed by the `deSolve` package for numerical integration. For a
detailed description on how to use `deSolve`, please refer to its vignette:

```{r, eval=FALSE}
vignette("deSolve", package="deSolve")
```

With some additional scenario data such as initial state, output
time points, and model parameters, we are able to simulate our custom
TKTD model:

```{r, warning=FALSE}
## Properties of a sample scenario ##
init <- c(Dw=0, H=0)                         # initial state
times <- 0:5                                 # output time points [0,5]
param <- c(kd=22, hb=0.01, z=0.5, kk=0.08)   # model parameters
exp <- data.frame(time=0:5,                  # exposure time-series
                  conc=c(0,1,1,0.5,0.6,0.2))

# Create a linear interpolation of the exposure time-series
expf <- approxfun(x=exp$time, y=exp$conc, method="linear", rule=2)
# Extend parameter set by interpolated exposure series
paramx <- as.list(c(param, Cw=expf))

# Numerically solve the ODE
deSolve::ode(y=init, times=times, parms=paramx, func=sd_ode)
```

When we have made sure that our custom model can be simulated and works as expected,
we can continue with integrating it into the package's workflow. First, we will
define a new scenario class that identifies our custom model by a unique name.
Next, we have to tell the package how the models of this class can be simulated.
In a final step, we may have to describe how effects are calculated for these
models.

We start by defining a new scenario class that derives from a suitable ancestor.
In general, we can inherit from the base class `EffectScenario` which
provides the general scenario capabilities. However, if the
custom model is just a variant of an existing model category, then it will be
easier to derive from specialized scenario class that already provides certain
features such as effect calculation. The following
class tree gives an overview of the scenario classes in use:

    EffectScenario
    |    
    |   Transferable
    |___|__ Lemna
    |   |   |__ LemnaSchmitt
    |   |   |__ LemnaSetac
    |   |__ Magma
    |   |__ Algae
    |       |__ AlgaeWeber
    |       |__ AlgaeTKTD
    |       |__ AlgaeSimple 
    |
    |__ GutsRedSd
    |__ GutsRedIt
    |
    |__ Deb
        |__ DebAbj
        |__ DebTox

To give an example, to implement a variant of a *Lemna* model, it would be advisable
to derive from the class `Lemna` to benefit from already implemented features
such as effect endpoint calculation and simulation of biomass transfers.
For the custom GUTS-RED-SD model, the ideal choice would be to derive from
`GutsRedSd` to minimize the implementation overhead. However, we will
derive from the general `EffectScenario` class for the sake of our example and
create a scenario object:

```{r}
## Integrate a new model class into the package workflow ##
# Create a unique class that derives from 'EffectScenario'
setClass("MyGuts", contains="EffectScenario")

# Create an object of the new class and assign scenario properties
new("MyGuts", name="My custom model") %>%
  set_init(init) %>%
  set_times(times) %>%
  set_param(param) %>%
  set_exposure(exp, reset_times=FALSE) %>%
  set_endpoints("L") -> myscenario

myscenario
```

The object `myscenario` now carries all the settings and data which we also
used for our test simulation. In the next step, the `solver()` function will
be overloaded to handle objects
of the newly defined scenario class `MyGuts`. The adapted `solver()` function will
collect the properties of a given scenario object, call the ODE solver, and return
simulation results:

```{r}
# the actual function calling deSolve can have a different signature
solver_myguts <- function(scenario, ...) {
  # get relevant data from scenario
  init <- scenario@init
  param <- scenario@param
  times <- scenario@times
  exp <- scenario@exposure@series
  if(nrow(exp) == 1) { # extend exposure series to have at least two rows
    row2 <- exp[1,]
    row2[[1]] <- row2[[1]]+1
    exp <- rbind(exp, row2)
  }
  
  # Create a linear interpolation of the exposure time-series
  expf <- approxfun(x=exp[,1], y=exp[,2], method="linear", rule=2)
  # Extend parameter set by interpolated exposure series
  paramx <- as.list(c(param, Cw=expf))
  
  # pass everything to ODE solver for numerical integration
  as.data.frame(deSolve::ode(y=init, times=times, parms=paramx, func=sd_ode, ...))
}

## Overload the solver() function ##
# The functions signature, i.e. the number and names of its arguments, must stay as is
setMethod("solver", "MyGuts", function(scenario, ...) solver_myguts(scenario, ...))
```

Overloading an *S4* function is done using `setMethod()`. The first argument
to `setMethod()` identifies which function we want to overload, in this case it is
`solver`. The second argument, `MyGuts`, defines which object type our
overloaded function will accept. In this case, we want to provide an implementation
to simulate `MyGuts` scenarios. *R* will decide during runtime which of the
function candidates to execute when `solver()` is called based on the type
of the first argument to `solver()`. The third argument, `function(object, ...) solver_myguts(scenario=object, ...)`
forwards calls to an appropriate function which can handle
`MyGuts` objects. The function signature `function(object, ...)` must stay exactly 
as it is and must not be modified.

The code within the body of `solver_myguts()` is almost identical to the original
code used to simulate the prototyped model. We have to deal with one corner
case, though: some exposure time-series will contain only a single row, representing
constant exposure over time, but this would raise an error in `approxfun()`.
In order to avoid this error, we will duplicate the first row and append it to
the series. Technically, this issue should be handled in the prototyped code as
well but was left out for reasons of brevity.

The parameterized scenario object `myscenario` of class `MyGuts` can be
passed to the overloaded `simulate()` function:

```{r}
myscenario %>% simulate()
```


The custom model can now be simulated using the framework but it is still
missing effect endpoints. If, on the other hand, we had decided to inherit our
scenario class from a suitable ancestor, such as `GutsRedSd`, we could
have stopped at this point as the ancestor would provide routines to calculate
effects.

By default, any state variable is available
as an effect endpoint and the calculated effect would reflect the change in the
state variables' value at the end of the simulation. However, we need to
calculate the *survival* endpoint in a different manner for GUTS-RED type models.
To implement this specialized endpoint, we need to overload the
function `fx()` which calculates effect endpoints:

```{r}
## Overload effect endpoint calculation ##
# fx() is called by effect()
setMethod("fx", "MyGuts", function(scenario, ...) fx_myguts(scenario, ...))

# @param scenario Scenario object to assess
# @param ... any additional parameters
fx_myguts <- function(scenario, ...) {
  # simulate the scenario (it is already clipped to the moving exposure window)
  out <- simulate(scenario, ...)
  # only use model state at the end of the simulation
  end <- tail(out, 1)
  # length of the simulated period, first column of result contains time
  t <- tail(out[, 1], 1)- out[[1, 1]]
  # calculate survival according to EFSA Scientific Opinion on TKTD models
  # p. 33, doi:10.2903/j.efsa.2018.5377
  survival <- exp(-end$H) * exp(-scenario@param$hb * t)
  return(c("L"=survival))
}

# Derive effect levels for our sample scenario
myscenario %>% effect()
```

`fx()` is used by `effect()` to derive effect endpoints for each model.
By convention, any overload of the `fx()` function must return a named
numerical vector containing the effect endpoints for the provided scenario
and moving exposure window. The scenario will already be parameterized to
only simulate the current exposure window. The return value of `fx()` will then
be used to calculate effect levels. In case of models requiring a control
scenario, the effect level will generally be calculated as `1 - effect/control`. For models not
requiring a control, the overloaded `fx()` must return the final effect
value.


## Complete working example

In this section, a complete example is given to make predictions and 
evaluate toxic effects using a calibrated model, as might be conducted in a risk
assessment context. Specifically, the 
*Lemna* model will be set up to match the study of
Schmitt et al. (2003, doi: 10.1016/j.ecolmodel.2013.01.017) who exposed *Lemna*
to metsulfuron-methyl.

The model will be parameterized with the values as described in the study. Then,
the model will be inspected and used to make predictions for 
exposure scenarios, plot results, and get EPx calculations.


### Setting up a scenario
Setting up the model (i.e. creating a scenario) involves defining the parameters,
the environmental variables (external forcings and chemical exposure), and the initial
conditions. Further, a tag can be assigned to easily identify the 
scenario by name. Also, for primary producer models, a transfer of biomass
can be defined to match the experimental design of the study.

```{r}
## Get all model parameters
# Lemna model parameters taken from file 'mm2.r' included
# in supporting material of Schmitt et al. (2013)
param_study <- list(
  #     - Effect -
  Emax     = 0.784,   # [same as conc. data]  maximum effect concentration
  EC50     = 0.3,     # [same as conc. data]  Midpoint of effect curve
  b        = 4.16,    # [-]  Slope of effect curve
  #     - Toxicokinetics -
  P_up     = 0.0054,  # [cm/d]  Permeability for uptake
  AperBM   = 1000,    # [cm^2/g_dw]  Frond area/dry weight
  Kbm      = 0.75,    # []  Biomass(fw)-water partition coefficient
  P_Temp   = F,       # Switch for temperature dependence of cuticle permeability
  MolWeight = 381,    # [g/mol] Molmass of molecule (determines Q10_permeability)
  #     - Fate of biomass -
  k_phot_fix  = F,    # If True k_G_max is not changed by environmental factors
  k_phot_max  = 0.47, # [1/d]  Maximum photosynthesis rate
  k_resp   = 0.05,    # [1/d]  Respiration rate at ref. temperature
  k_loss   = 0.0,     # [1/d]  Some rate of loss (e.g. Flow rate)
  #     - Temperature dependence -
  Tmin     = 8.0  ,   # [°C]  Minimum growth temperature 
  Tmax     = 40.5 ,   # [°C]  Maximum growth temperature
  Topt     = 26.7 ,   # [°C]  Optimum growth temperature 
  t_ref    = 25,      # [°C]  Reference temperature for respiration rate
  Q10      = 2,       # [-]   Temperature dependent factor for respiration rate
  #     - Light dependence (Linear) -
  k_0      = 3 ,      # [1/d]  Intercept of linear part
  a_k      = 5E-5 ,   # [(1/d)/(kJ/m^2/d)]   Slope of linear part
  #     - Phosphorus dependence (Hill like dependence) -
  C_P      = 0.3,     # [mg/L]  Phosporus concentration in water
  CP50     = 0.0043,  # [mg/L]  P-conc. where growth rate is half
  a_P      = 1,       # []  Hill coefficient
  KiP      = 101,     # [mg/L]  P-inhibition constant for very high P-conc.
  #     - Nitrogen dependence (Hill like dependence) -
  C_N      = 0.6,     # [mg/L]  Nitrogen concentration in water
  CN50     = 0.034,   # [mg/L]  N-conc. where growth rate is half
  a_N      = 1,       # []  Hill coefficient
  KiN      = 604,     # [mg/L]  n-inhibition constant for very high P-conc.
  #     - Density dependence -
  BM50     = 176,     # [g_dw/m^2] Cut off BM
  #     - Others -
  mass_per_frond = 0.0001,  # [g_dw/frond]  Dry weight per frond
  BMw2BMd  = 16.7     # [g_fresh/g_dry]  Fresh- / dryweight 
)


## Define the forcing variables
forc_temp <- data.frame(t = 0, tmp = 12) # [°C]  Current temperature (may also be a table)
forc_rad  <- data.frame(t = 0, rad = 15000) # [kJ/m²/d]  Radiation  (may also be given as table)


## Define a simple exposure pattern
# t 0..6  concentration 1 ug/L
# t 7..14 concentration 0 ug/L
exposure <- data.frame(time = 0:14, 
                       conc = c(rep(1, 7), rep(0, 8))
                       )


## Set initial values 
# given in file 'mmc2.r' of Schmitt et al. (2013)
init <- c(
  BM       = 50,     # [g_dw/m^2]  Dry Biomass dry weight per m2
  E        = 1,      # (0-1)  (Toxic) Effect = Factor on growth rate  (Range: 0 - 1, 1=no effect)
  M_int    = 0       # [ug]   Amount of toxicant in biomass
)

## create a scenario object, containing the model (with parameters) and the exposure time-series
Lemna_Schmitt() %>%               # the Lemna model by Schmitt et al. (2013)
  set_tag("metsulfuron") %>%      # set a tage for the specific implementation of the model
  set_init(init) %>%              # set the starting values (as prepared above)
  set_param(param_study) %>%      # set the parameters (as prepared above)
  set_exposure(exposure) %>%      # set the exposure scenario (exposure time series)
  set_forcings(temp=forc_temp, rad=forc_rad) -> metsulfuron # set the external forcing 
# variables, and save everything as an object


```


### Simulating a scenario and plotting

```{r out.width = "100%"}
## simulate with model, under a range of different exposure scenarios
# create several exposure scenarios
exp_scen <- schmitt2013 %>% select(time, conc, trial)

# simulate for all these scenarios
results <- metsulfuron %>%
  batch(exp_scen) %>%
  simulate()

# plot results
plot_sd(
  model_base = metsulfuron,
  treatments = exp_scen,
  rs_mean = results,
)


## simulate with model, under a range of different exposure scenarios, and including 
## a biomass transfer
# simulate for all scenarios
results <- metsulfuron %>%
  set_transfer(interval = c(5), biomass = 10) %>% # implement a biomass transfer every 5 days
  batch(exp_scen) %>%
  simulate()

# plot results
plot_sd(
  model_base = metsulfuron,
  treatments = exp_scen,
  rs_mean = results,
)

```

