---
title: "Getting started with ssw-r: Sequence alignment examples"
bibliography: ssw.bib
output:
  rmarkdown::html_document:
    toc: true
    number_sections: false
    highlight: "textmate"
    css: "custom.css"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Getting started with ssw-r: Sequence alignment examples}
---

```{r, include=FALSE}
knitr::opts_chunk$set(
  comment = "NA",
  collapse = TRUE
)

run <- if (ssw::is_installed_ssw_py()) TRUE else FALSE
knitr::opts_chunk$set(eval = run)
```

The R package ssw offers an R interface for
[SSW](https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library),
a fast implementation of the Smith-Waterman algorithm for sequence alignment
using SIMD, described in @zhao2013ssw. The package is currently built on
[ssw-py](https://pypi.org/project/ssw-py/).

```r
library("ssw")
```

A short read sequence:

```r
read <- "ACGT"
```

## Exact alignment

Align the read against the reference sequence (`TTTTACGTCCCCC`) and print the results:

```r
read |> align("TTTTACGTCCCCC")
```

```text
CIGAR start index 4: 4M
optimal_score: 8
sub-optimal_score: 0
target_begin: 4	target_end: 7
query_begin: 0
query_end: 3

Target:        4    ACGT    7
                    ||||
Query:         0    ACGT    3
```

Get specific results, such as the alignment scores:

```r
a <- read |> align("TTTTACGTCCCCC")
```

```r
a$alignment$optimal_score
```

```text
[1] 8
```

```r
a$alignment$sub_optimal_score
```

```text
[1] 0
```

## Deletion

```r
read |> align("TTTTACAGTCCCCC")
```

```text
CIGAR start index 4: 2M1D2M
optimal_score: 5
sub-optimal_score: 0
target_begin: 4	target_end: 8
query_begin: 0
query_end: 3

Target:        4    ACAGT    8
                    ||*||
Query:         0    AC-GT    3
```

## Insertion with gap open

```r
read |> align("TTTTACTCCCCC", gap_open = 3)
```

```text
CIGAR start index 4: 2M
optimal_score: 4
sub-optimal_score: 0
target_begin: 4	target_end: 5
query_begin: 0
query_end: 1

Target:        4    AC    5
                    ||
Query:         0    AC    1
```

## Insertion with no gap open penalty

```r
read |> align("TTTTACTCCCCC", gap_open = 0)
```

```text
CIGAR start index 4: 2M1I1M
optimal_score: 6
sub-optimal_score: 0
target_begin: 4	target_end: 6
query_begin: 0
query_end: 3

Target:        4    AC-T    6
                    ||*|
Query:         0    ACGT    3
```

## Specify start index

```r
a <- align("ACTG", "ACTCACTG", start_idx = 4)
a
```

```text
CIGAR start index 0: 4M
optimal_score: 8
sub-optimal_score: 0
target_begin: 0	target_end: 3
query_begin: 0
query_end: 3

Target:        0    ACTC    3
                    |||*
Query:         0    ACTG    3
```

Print the results from position 4:

```r
a |> print(start_idx = 4)
```

```text
CIGAR start index 0: 4M
optimal_score: 8
sub-optimal_score: 0
target_begin: 0	target_end: 3
query_begin: 0
query_end: 3

Target:        0    ACTG    3
                    ||||
Query:         0    ACTG    3
```

## Forced alignment

Enforce no gaps by increasing the penalty:

```r
a <- force_align("ACTG", "TTTTCTGCCCCCACG")
a
```

The results are truncated:

```text
CIGAR start index 4: 3M
optimal_score: 6
sub-optimal_score: 0
target_begin: 4	target_end: 6
query_begin: 1
query_end: 3

Target:        4    CTG    6
                    |||
Query:         1    CTG    3
```

Use `formatter()` to avoid the truncation:

```r
b <- a |> formatter()
b
```

```text
[[1]]
[1] "TTTTCTGCCCCCACG"

[[2]]
[1] "   ACTG"
```

Or pretty-print the formatted results directly:

```r
a |> formatter(print = TRUE)
```

```text
TTTTCTGCCCCCACG
   ACTG
```

## Acknowledgements

ssw-r is built upon the work of two outstanding projects:

1. [SSW](https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library) -
   Original C implementation. Author: Mengyao Zhao
1. [ssw-py](https://pypi.org/project/ssw-py/) - Python binding for SSW.
   Author: Nick Conway

We extend our sincere gratitude to Mengyao Zhao for developing the original
SSW library and to Nick Conway for maintaining the ssw-py package.
Their work forms the foundation of ssw-r.
While ssw-r does not directly incorporate code from these projects,
it serves as an R interface to their functionality. We encourage users to
visit the original repositories for more information about the underlying
implementation and to consider citing these works in publications that use ssw-r.

## References
