---
title: "maldipickr"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{maldipickr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(maldipickr)
```

<!-- WARNING - This vignette is generated by {fusen} from dev/maldipickr.Rmd: do not edit by hand -->

## Quickstart

The `{maldipickr}` package helps microbiologists reduce duplicate/clonal bacteria from their cultures and eventually exclude previously selected bacteria. `{maldipickr}` achieve this feat by grouping together data from MALDI Biotyper and helps choose representative bacteria from each group using user-relevant metadata -- a process known as **cherry-picking**.

`{maldipickr}` cherry-picks bacterial isolates with MALDI Biotyper:

* [using taxonomic identification report](#using-taxonomic-identification-report)
* [using spectra data](#using-spectra-data)



### Using taxonomic identification report

First make sure `{maldipickr}` is installed and loaded, alternatively [follow the instructions to install the package](https://clavellab.github.io/maldipickr/index.html#installation).

Cherry-picking four isolates based on their taxonomic identification by the MALDI Biotyper is done in a few steps with `{maldipickr}`.


#### Get example data

We import an example Biotyper CSV report and glimpse at the table.


```{r quickstart_report_data, eval = TRUE}
report_tbl <- read_biotyper_report(
  system.file("biotyper_unknown.csv", package = "maldipickr")
)
report_tbl %>%
  dplyr::select(name, bruker_species, bruker_log) %>% knitr::kable()
```

#### Delineate clusters and cherry-pick

Delineate clusters from the identifications after filtering the reliable ones and cherry-pick one representative spectra.

Unreliable identifications based on the log-score are replaced by "not reliable identification", but stay tuned as they do not represent the same isolates!


```{r quickstart_report_filter, eval = TRUE}
report_tbl <- report_tbl %>%
  dplyr::mutate(
      bruker_species = dplyr::if_else(bruker_log >= 2, bruker_species,
                                      "not reliable identification")
  )
knitr::kable(report_tbl)
```

The chosen ones are indicated by `to_pick` column.


```{r quickstart_report_delineate, eval = TRUE}
report_tbl %>%
  delineate_with_identification() %>%
  pick_spectra(report_tbl, criteria_column = "bruker_log") %>%
  dplyr::relocate(name, to_pick, bruker_species) %>% 
  knitr::kable()
```

### Using spectra data

In parallel to taxonomic identification reports, `{maldipickr}` process spectra data.
Make sure `{maldipickr}` is installed and loaded, alternatively [follow the instructions to install the package](https://clavellab.github.io/maldipickr/index.html#installation).

Cherry-picking six isolates from three species based on their spectra data obtained from the MALDI Biotyper is done in a few steps with `{maldipickr}`.


#### Get example data

We set up the directory location of our example spectra data, but adjust for your requirements. We import and process the spectra which gives us a named list of three objects: spectra, peaks and metadata (more details in Value section of `process_spectra()`).



```{r quickstart_spectra_data, eval = TRUE}
spectra_dir <- system.file("toy-species-spectra", package = "maldipickr")

processed <- spectra_dir %>%
  import_biotyper_spectra() %>%
  process_spectra()
```

#### Delineate clusters and cherry-pick

Delineate spectra clusters using Cosine similarity and cherry-pick one representative spectra.
The chosen ones are indicated by `to_pick` column.


```{r quickstart_spectra_delineate, eval = TRUE}
processed %>%
  list() %>%
  merge_processed_spectra() %>%
  coop::tcosine() %>%
  delineate_with_similarity(threshold = 0.92) %>%
  set_reference_spectra(processed$metadata) %>%
  pick_spectra() %>%
  dplyr::relocate(name, to_pick) %>% 
  knitr::kable()
```

This provides only a brief overview of the features of `{maldipickr}`, browse the  other vignettes to learn more about additional features.


## Session information

```{r session, eval = TRUE}
sessionInfo()
```

