---
title: "Convenient fetching of EZbakR outputs: EZget()"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{EZget}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

This vignette shows how to use the `EZget()` function provided by `EZbakR`.
In cases where you have multiple tables of a particular type in your `EZbakRData`
object, this can greatly facilitate extracting the table of interest. As a part
of this vignette, I will also describe how an `EZbakRData` object is organized.

```{r setup}
library(EZbakR)
```

## EZbakRData objects

Let's first analyze some simulated data to generate an `EZbakRData` object whose
contents we can explore:

```{r}
simdata <- EZSimulate(nfeatures = 300, nreps = 2)

# Make initial EZbakRData object
ezbdo <- EZbakRData(simdata$cB, simdata$metadf)

# Estimate fractions twice, and don't overwrite the first analysis
# Second run will use different model; see EstimateFractions vignette for details
ezbdo <- EstimateFractions(ezbdo)
ezbdo <- EstimateFractions(ezbdo, strategy = 'hierarchical', overwrite = FALSE)

# Estimate kinetic parameters three times, varying the strategy and repeatID
# See EstimateKinetics vignettes for details.
ezbdo <- EstimateKinetics(ezbdo, repeatID = 1)
ezbdo <- EstimateKinetics(ezbdo, repeatID = 1, strategy = "shortfeed")
ezbdo <- EstimateKinetics(ezbdo, repeatID = 2, strategy = "shortfeed")
```


An `EZbakRData` object is a list that can contain the following items:

1. **cB**: The cB table you provided upon object creation.
2. **metadf**: The metadf table you provided upon object creation.
3. **fractions**: List of fractions estimates generated by `EstimateFractions()`.
4. **kinetics**: List of kinetic parameter estimates generated by `EstimateKinetics()`.
5. **averages**: List of parameter replicate averages generated by `AverageAndRegularize()`.
6. **comparisons**: List of comparisons of parameter averages, generated by `CompareParameters()`.
7. **dynamics**: List of dynamical systems model parameter estimates, generated by `EZDynamics()`.
8. **readcounts**: List of tables of read counts generated by various EZbakR functions.
9. **metadata**: List with elements corresponding to the lists of tables described
above. Describes various features of the tables so that they can be fetched
with `EZget()`.
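
To see which of these items are present in a given object, base R's `names()` is enough. This is a small sketch that assumes the `ezbdo` object created above; it only relies on `EZbakRData` objects being lists, as described in this section, and makes no assumption about which elements have been populated.

```{r}
# Top-level elements currently present in the object
names(ezbdo)

# Names of the individual fractions tables
names(ezbdo$fractions)

# Lists of tables that have metadata recorded for them
names(ezbdo$metadata)
```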

As an `EZbakRData` object is a list, its elements can be accessed in a few ways:

```{r}
# `$` notation:
ezbdo$fractions$feature

# `[[]]` notation with element names
ezbdo[['fractions']][['feature']]

# `[[]]` notation with numeric indices (positions depend on the object's element order)
ezbdo[[4]][[1]]

```
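
Numeric indices are the most fragile of these options, since they depend on the order in which elements were added to the object. A possible way to avoid hard-coding a position, assuming the `ezbdo` object from above, is to look the position up by name first. The variable `frac_pos` is introduced here purely for illustration.

```{r}
# Find the position of the fractions list instead of hard-coding it
frac_pos <- which(names(ezbdo) == "fractions")
frac_pos

# Check that index-based and name-based access return the same table
identical(ezbdo[[frac_pos]][["feature"]], ezbdo$fractions$feature)
```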


## Using EZget

`EZget()` provides an alternative strategy for getting a particular table.
It has two required arguments:

1. `obj`: The `EZbakRData` object you would like to get a table from.
2. `type`: The type of table you are looking for. Options are "fractions",
"kinetics", "readcounts", "averages", and "comparisons", which correspond to the
lists of tables described above.

Most of the remaining parameters are search criteria that you specify. The
full list can be seen in the function docs (`?EZget`). These all accept
strings or vectors of strings as input, and each table's metadata is checked to
see if the provided string is contained in the respective metadata slot. For example,
we can extract the kinetics table generated from the standard analysis like so:

```{r}
kinetics <- EZget(ezbdo,
                  type = "kinetics",
                  kstrat = "standard")
```
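
If you are unsure which values to search for, it can help to look at the stored metadata directly. The sketch below assumes that the metadata for kinetics tables sits in `ezbdo$metadata$kinetics`, mirroring the element names listed earlier; the exact layout of that element is not documented here, so `str()` is used to print whatever structure is present.

```{r}
# Print the metadata recorded for the kinetics tables
# (location assumed to be ezbdo$metadata$kinetics)
str(ezbdo$metadata$kinetics, max.level = 2)
```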

In some cases, multiple tables with the exact same metadata exist. For example,
the metadata for `fractions` tables is:

* The feature columns by which reads were grouped. This is "feature" for both of
our `fractions` tables.
* The mutational populations analyzed. This is "TC" for both of our `fractions`
tables.
* The fraction_design table used. This is the standard fraction_design for
a single mutation type analysis for both of our `fractions` tables.

Since we set `overwrite = FALSE` in our second run of `EstimateFractions()`, both
tables were saved. What distinguishes them is a final piece of metadata
saved for all tables: `repeatID`. This is a numerical ID that distinguishes
multiple instances of the same table. The ID is 1 for the first such object
created, 2 for the second, etc. Thus, the analysis with the standard mixture
model has a `repeatID` of 1, and the analysis with the hierarchical mixture
model has a `repeatID` of 2. We can therefore access the latter like so:

```{r}
h_fxn <- EZget(ezbdo, 
               type = 'fractions',
               repeatID = 2)
```
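
By the same logic, the first (standard mixture model) table should be retrievable with `repeatID = 1`. This sketch assumes the `ezbdo` and `h_fxn` objects from above; `s_fxn` is a name chosen here for illustration.

```{r}
# Fetch the fractions table from the first EstimateFractions() run
s_fxn <- EZget(ezbdo,
               type = 'fractions',
               repeatID = 1)

# Check whether the two runs produced identical tables
identical(s_fxn, h_fxn)
```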


There are three parameters that tune `EZget()`'s behavior. These are:

1. `returnNameOnly`: If `TRUE`, then only the names of the tables consistent with
the search criteria you specify will be returned. This will throw a warning
if more than one table passes your criteria, but it will not error in this
case. If `returnNameOnly` is `FALSE`, then an error is thrown
if more than one table matches your search criteria.
2. `exactMatch`: The `features` and `populations` arguments are the two arguments
that can be vectors of strings. Setting `exactMatch` to `TRUE` requires the
provided `features` and `populations` vectors to exactly match those in a table's
metadata for that table to be returned. The alternative (default) behavior is
that the provided `features` and `populations` only have to all be contained
in a table's metadata.
3. `alwaysCheck`: If only a single table of the relevant `type` is present in
your `EZbakRData` object, `EZget()` automatically returns that table without
checking to see if the search criteria match. If you set `alwaysCheck` to `TRUE`,
then the table is searched for as normal and will only be returned if its
metadata match the search criteria.
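
As a rough illustration of the first two options, the sketch below uses the `ezbdo` object from above. It relies only on the arguments named in this section; the values "feature" and "TC" are the features and populations described earlier for our `fractions` tables, and `fxn_names` and `exact_fxn` are names chosen here for illustration.

```{r}
# List the names of all matching fractions tables. Two tables match,
# so a warning is expected rather than an error.
fxn_names <- EZget(ezbdo,
                   type = "fractions",
                   returnNameOnly = TRUE)
fxn_names

# Require the table's features and populations to exactly match the
# provided vectors, and use repeatID to pick a single table
exact_fxn <- EZget(ezbdo,
                   type = "fractions",
                   features = "feature",
                   populations = "TC",
                   exactMatch = TRUE,
                   repeatID = 1)
```

`alwaysCheck` is not demonstrated here because it only changes behavior when a single table of the requested `type` exists, which is not the case for the `fractions` or `kinetics` tables in this object.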