---
title: "R package **fdesigns**: optimal designs for functional models"
author: |
  Damianos Michaelides\footnote{Corresponding author: \texttt{dm3g15@soton.ac.uk}}, Antony Overstall, Dave Woods
date: ""
output: pdf_document
vignette: >
  %\VignetteIndexEntry{Introduction to fdesigns}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
geometry: margin=1.25in
---

# Using the **fdesigns** package

This vignette provides examples of the usage of the `R` package **fdesigns**. The package implements optimal experimental designs for functional models. The two main functions are `pflm()` and `pfglm()`. Both implement optimal experimental designs for models involving profile factors. The package also includes a support function `P()`, as well as printing and plotting functions for the resulting objects.  

- `pflm()` = parallel functional linear models (coordinate exchange algorithm).  
- `pfglm()` = parallel functional generalised linear models (coordinate exchange algorithm).  
- `P()` = constructs polynomials for profile factors.  

---

# Examples

The examples focus on the main functions `pflm()` and `pfglm()`. The support function `P()` is used inside the `formula` argument to specify polynomials. The functions are computationally efficient for multiple profile factors. Generalised models are more computationally expensive, so examples are kept simple.  

---

## FLM with one profile factor

In this example, a functional linear model with a single profile factor is considered.  

The profile factor is represented by a BS basis of degree 0 with 3 equally spaced interior knots, and time boundaries between 0 and 1. This gives 4 basis functions. The number of runs is 4.

```{r, eval=FALSE}
tbounds <- c(0, 1)
nruns <- 4
npf <- 1
dx <- c(0)
knotsx <- list(c(0.25, 0.50, 0.75))
nx <- rep(0, npf)
for (j in 1:npf) {
  nx[j] <- dx[j] + length(knotsx[[j]]) + 1
}
```

One hundred starting designs are considered. The argument `startd` is left to its default `NULL`. Thus, the starting designs are automatically generated within the function `pflm()`.  

The functional parameter is assumed to be a linear power series, and the goal is to find an A-optimal design. To achieve this, the argument `criterion` is set to `"SE"`, which corresponds to the squared error loss function. The smoothing parameter `lambda` is set to zero, i.e., an improper prior that reduces the criterion to A-optimality. All other arguments are kept at their default values.  

```{r, eval=FALSE}
example1a <- pflm(formula = ~ x1, nsd = 100, mc.cores = 1, 
                  npf = npf, tbounds = tbounds, 
                  nruns = nruns, dx = dx, knotsx = knotsx, 
                  pars = c("power"), db = c(1), 
                  knotsb = list(c()), criterion = "SE", 
                  lambda = 0) 
```

Printing the resulting `"flm"` object using the code:  

```{r, eval=FALSE}
print(example1a)
```

provides the summary of the outcome.

```
The number of profile factors is: 1
The number of runs is: 4
The loss function is: SE
The objective value is: 8.75
The number of iterations is: 5
The computing elapsed time is: 00:00:00
```

The final design is extracted using the code,

```{r, eval=FALSE}
example1a$design
```

and the outcome is a $4 \times 4$ design matrix.

```     
      [,1] [,2] [,3] [,4]
[1,]    1    1    1    1
[2,]    1    1   -1   -1
[3,]   -1   -1    1    1
[4,]   -1   -1    1    1
```
The optimal functions of the profile factor are plotted using the code, 

```{r, eval=FALSE}
par(mfrow = c(2,2))
plot(example1a, pf = 1)
```

where 1 corresponds to the profile factor to plot. 

Alternatively, suppose that the interest becomes D-optimality (Shannon Information loss function with an improper prior) and the functional parameter is represented by a quadratic power basis, with everything else remain unchanged, then the code is,

```{r, eval=FALSE}
example1b <- pflm(formula = ~ x1, nsd = 100, mc.cores = 1, 
                  npf = npf, tbounds = tbounds, 
                  nruns = nruns, dx = dx, knotsx = knotsx, 
                  pars = c("power"), db = c(2), 
                  knotsb = list(c()), criterion = "SI", 
                  lambda = 0) 
                 				
print(example1b)                 				
```

and the summary of the outcome is,

```
The number of profile factors is: 1
The number of runs is: 4
The loss function is: SI
The objective value is: 4.618802
The number of iterations is: 3
The computing elapsed time is: 00:00:00
\end{verbatim}
```

---

## FLM with one profile factor and roughness penalty {#packex2}

A single profile factor is still considered as in the previous section. The addition to the previous example is that the complexity of the parameter is penalised through a smoothing parameter $\lambda = 10$. The loss function is weighted squared error and the parameter basis is a quadratic power basis. The optimal design can be found using the code:

```{r eval=FALSE}
example2 <- pflm(formula = ~ x1, nsd = 100, mc.cores = 1, 
                 npf = 1, tbounds = c(0, 1), nruns = 4, 
                 dx = dx, knotsx = knotsx, 
                 pars = c("power"), db = c(2), 
                 knotsb = list(c()), 
                 criterion = "WSE", lambda = 10)

print(example2)
```

```
The number of profile factors is: 1
The number of runs is: 4
The loss function is: WSE
The objective value is: 1.416839
The number of iterations is: 5
The computing elapsed time is: 00:00:00
```

---

## FLM with one profile factor and three scalar factors

In this example, the FLM with a single profile factor and three scalar factors from Section \@ref(packex2) is considered.  
It is assumed that control of the profile factor is represented by a B-spline basis of degree zero with three equally spaced interior knots and time boundaries again being 0 and 1.  

Main effects of the scalar factors are considered. The number of runs is 12. The scalar factors are passed as profile factors to the function `pflm()`, with degree zero and no interior knots to specify that they are scalar factors. Thus, the argument `npf` is equal to 4.  

```{r eval=FALSE}
tbounds <- c(0, 1)
nruns <- 12
npf <- 4
dx <- c(0, 0, 0, 0)
knotsx <- list(c(0.25, 0.50, 0.75), c(), c(), c())
nx <- rep(0, npf)
for (j in 1:npf) {
  nx[j] <- dx[j] + length(knotsx[[j]]) + 1
}
```

Fifty starting designs are generated and passed to the function manually, with the bounds of the factors set to the defaults -1 and 1. The factors are named $x_1, x_2, x_3, x_4$ and must match the factors in the formula argument.

```{r eval=FALSE}
indd <- list()
startd <- list()
dlbound <- -1
dubound <- 1
nsd <- 50
for (c in 1:nsd) {
  set.seed(c)
  for (i in 1:npf) {
    indd[[i]] <- matrix(runif(nruns * nx[i], dlbound, dubound), 
                    nrow = nruns, ncol = nx[i])
    names(indd)[i] <- paste0("x", i, sep="")
  }
  startd[[c]] <- indd
}
```

The functional parameter is assumed to be represented by a linear power series and the scalar parameters are represented through a power series of degree zero. Moreover, the objective criterion is A-optimality, i.e., SE loss function with improper prior.

```{r eval=FALSE}
example3a <- pflm(formula = ~ x1 + x2 + x3 + x4, nsd = nsd, 
                  mc.cores = 1, npf = npf, tbounds = tbounds, 
                  nruns = nruns, startd = startd, dx = dx, 
                  knotsx = knotsx, 
                  pars = c("power", "power", "power", "power"), 
                  db = c(1, 0, 0, 0), 
                  knotsb = list(c(), c(), c(), c()), 
                  criterion = "A", lambda = 0, dlbound = dlbound, 
                  dubound = dubound, tol = 0.0001)
```

Printing the resulting `"flm"` object using the code:  

```{r, eval=FALSE}
print(example3a)
```

provides the summary of the outcome.

```
The number of profile factors is: 4
The number of runs is: 12
The loss function is: SE
The objective value is: 2.833333
The number of iterations is: 10
The computing elapsed time is: 00:00:01
```


---

## FGLM with one profile factor and quadrature approximation

A functional logistic model depending on a single profile factor is considered. Thus, the `family` choice in R is `"binomial"`.  

It is assumed that control of the profile factor is represented by a B-spline basis of degree zero and for this example, the choice of seven equally spaced interior knots is considered, with time boundaries being 0 and 1.   Thus, there are eight basis functions for the profile factor. The number of experimental runs is eight.  

```{r eval=FALSE}
tbounds <- c(0, 1)
nruns <- 8
npf <- 1
dx <- c(0)
knotsx <- list(c(0.125, 0.250, 0.375, 0.500, 
                 0.625, 0.750, 0.875))
nx <- rep(0, npf)
for (j in 1:npf) {
  nx[j] <- dx[j] + length(knotsx[[j]]) + 1
}
```

Fifty starting designs are considered, which are generated automatically within `pfglm()`.  
The bounds of the profile factor are set to the defaults.  

The functional parameter is assumed to be a linear power series.  
The objective is to identify pseudo-Bayesian A-optimal designs.  
The prior of the parameters is assumed to be normal, with mean zero and variance one.  
To approximate the expectation with respect to the prior, a normal quadrature scheme with abscissas and weights is applied.  
All other arguments are kept to their default values.  

```{r eval=FALSE}
example4 <- pfglm(formula = x1, nsd = 50, mc.cores = 1, 
                  npf = npf, tbounds = tbounds, 
                  nruns = nruns, dx = dx, knotsx = knotsx, 
                  pars = c("power"), db = c(1), 
                  knotsb = list(c()), criterion = "A", 
                  family = binomial, method = c("quadrature"), 
                  level = NULL, B = NULL, 
                  prior = list(mu = c(0), sigma2 = c(1)), 
                  dlbound = -1, dubound = 1)
```

Printing the resulting `"fglm"` object using the code:  

```{r eval=FALSE}
print(example4)
```

```
The number of profile factors is: 1
The number of runs is: 8
The objective criterion is: A-optimality
The objective value is: 21.64537
The number of iterations is: 5
The method of approximation is: quadrature
The family distribution and the link function are: binomial and logit
The computing elapsed time is: 00:00:00
```




---

## FGLM with one profile factor depending on main effect and MC approximation

In this example, the functional Poisson model depending on a single profile factor is considered. Thus, the `family` choice in **R** is `"poisson"`. It is assumed that control of the profile factor is represented by a B-spline basis of degree one, with the choice of four equally spaced interior knots. Thus, there are six basis functions for the profile factor. The time boundaries are 0 and 1, and the number of runs is 8.  

The model with main and quadratic effect of the profile factor is tackled. The parameters are assumed to be represented by a B-spline basis of degree one and a single knot at *t = 0.5*.  

```{r eval=FALSE}
tbounds <- c(0, 1)
nruns <- 8
npf <- 1
dx <- c(1)
knotsx <- list(c(0.20, 0.40, 0.60, 0.80))
nx <- rep(0, npf)
for (j in 1:npf) {
  nx[j] <- dx[j] + length(knotsx[[j]]) + 1
}
```

The use of 8 runs and B-spline basis for the parameters increases the complexity, and hence the computational expense.  

For this reason, the Monte Carlo approximation, with a normal prior (mean zero and variance two), is preferred. The `prior` argument, when `method = "MC"`, must be a function. The function used is:

```{r eval=FALSE}
set.seed(100)
prmc <- function(B, Q){
  matrix(rnorm(B * Q, mean = 0, sd = sqrt(2)), nrow = B, ncol = Q)
}
```

```{r eval=FALSE}
example5 <- pfglm(formula = ~ 1 + x1 + P(x1, 2), nsd = 1, mc.cores = 1, 
                  npf = 1, tbounds = tbounds, nruns = nruns, 
                  startd = NULL, dx = dx, knotsx = knotsx, 
                  pars = c("bspline", "bspline"), db = c(1, 1), 
                  knotsb = list(c(0.5), c(0.5)), lambda = 0, 
                  criterion = "D", family = poisson, method = c("MC"), 
                  level = NULL, B = 10000, prior = prmc, tol = 0.01)
```

with printed output,

```
The number of profile factors is: 1
The number of runs is: 8
The objective criterion is: D-optimality
The objective value is: 12.13859
The number of iterations is: 9
The method of approximation is: MC
The family distribution and the link function are: poisson and log
The computing elapsed time is: 00:00:16
```





