---
title: "Using data from non-B cells"
author: "Kenneth B. Hoehn"
date: "`r Sys.Date()`"
output:
  html_document:
    fig_height: 4
    fig_width: 7.5
    highlight: pygments
    theme: readable
    toc: yes
  pdf_document:
    dev: pdf
    fig_height: 4
    fig_width: 7.5
    highlight: pygments
    toc: yes
  md_document:
    fig_height: 4
    fig_width: 7.5
    preserve_yaml: no
    toc: yes
geometry: margin=1in
fontsize: 11pt
vignette: >
  %\VignetteIndexEntry{Using data from non-B cells}
  %\VignetteEncoding{UTF-8}  
  %\usepackage[utf8]{inputenc}
  %\VignetteEngine{knitr::rmarkdown}
---

While originally designed for B cells, Dowser also supports phylogenetic inference for non-B cells, especially cells evolving from a known ancestral sequence, such as tumor lineages.

If the sequences are from a single lineage, the only requirement for non-B cell data is that the sequences supplied to `formatClones` are aligned and stored in a data.frame with a column for sequences and a column for sequence IDs. If they are from multiple lineages, the lineages can be distinguished using the `clone_id` column.

In the code block below, we show how trees can be built using data with only sequence IDs, sequences, and germline sequences.

```{r, eval=FALSE, warning=FALSE, message=FALSE}
library(dowser)

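# Load the example data set bundled with dowser
data(ExampleAirr)

# Keep one clone and only the sequence ID, sequence, and germline columns
ExampleAirr <- dplyr::select(dplyr::filter(ExampleAirr, clone_id == "3128"),
  sequence_id, sequence_alignment, germline_alignment)

# Format the clone, using germline_alignment as the germline column
clones <- formatClones(ExampleAirr, germ = "germline_alignment")

# Build trees
trees <- getTrees(clones)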
```
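
For data that does not come from an AIRR-formatted file, the same input can be assembled by hand. The following is a minimal sketch in base R, assuming two hypothetical lineages: the sequences, IDs, and lineage labels are made up for illustration, and the final `formatClones` call is left as a comment because it simply mirrors the call shown above.

```r
# Hypothetical aligned sequences from two lineages (illustrative values only)
lineage_data <- data.frame(
  sequence_id = c("a1", "a2", "a3", "b1", "b2", "b3"),
  sequence_alignment = c("ATGCATGCATGC", "ATGCATGAATGC", "ATGTATGAATGC",
                         "GGCTTAACC", "GGCTTTACC", "GGATTTACC"),
  # Known ancestral sequence for each lineage, repeated for every row
  germline_alignment = c(rep("ATGCATGCATGC", 3), rep("GGCTTAACC", 3)),
  # clone_id separates the lineages
  clone_id = c(rep("lineageA", 3), rep("lineageB", 3)),
  stringsAsFactors = FALSE
)

# Sanity check: each sequence should be the same length as its germline
same_length <- nchar(lineage_data$sequence_alignment) ==
  nchar(lineage_data$germline_alignment)
stopifnot(all(same_length))

# Assumed next step, mirroring the call shown above:
# clones <- formatClones(lineage_data, germ = "germline_alignment")
```

The length check is only a quick guard against unaligned input; it does not verify that the alignment itself is sensible.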

Note that if the specified `v_call`, `j_call`, and `junction_length` columns are not found in the input data.frame, the `use_regions` option will be set to `FALSE`, as it applies only to BCR sequences. If not already present, the `clone_id` and `locus` columns will be added to the data.frame with values 0 and "N", respectively.

When using `getTimeTrees` or `getTimeTreesIterate`, a meaningful germline is not required. Instead, you can set `germline_alignment` to a series of N nucleotides the same length as the sequences in the `sequence_alignment` column, and set `include_germline=FALSE`.
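
A placeholder germline of this kind can be built with base R alone. The sketch below assumes a hypothetical data.frame named `timed_data` with made-up sequences; only the `germline_alignment` line is the point of the example, and the remaining columns needed for time tree building are not shown.

```r
# Hypothetical aligned sequences (illustrative values only)
timed_data <- data.frame(
  sequence_id = c("s1", "s2", "s3"),
  sequence_alignment = c("ATGCATGCATGC", "ATGCATGAATGC", "ATGTATGAATGC"),
  stringsAsFactors = FALSE
)

# Placeholder germline: one "N" for every position of each aligned sequence
timed_data$germline_alignment <- strrep("N", nchar(timed_data$sequence_alignment))
```

Because `strrep` is vectorized over its second argument, each row receives a string of Ns matching the length of its own sequence.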