---
title: "mirai & bakerrr"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{mirai & bakerrr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Overview

This vignette compares two approaches to parallel processing in R: mirai and bakerrr. Both packages enable parallel execution of computationally intensive tasks, but with different design philosophies and usage patterns.

# Setup

We'll use a bootstrap calculation function that simulates long-running computations:

```{r setup}
library(mirai)
library(bakerrr)

long_stat_calc <- function(x, n_boot, sleep_time) {
  # x: numeric vector
  # n_boot: number of bootstraps
  # sleep_time: pause after each bootstrap (sec)

  if (!is.numeric(x)) stop("Input x must be numeric.")
  if (length(x) < 2) stop("Input x must have at least 2 values.")

  start_time <- Sys.time()
  boot_means <- numeric(n_boot)

  for (i in seq_len(n_boot)) {
    boot_means[i] <- mean(sample(x, replace = TRUE))
    if (sleep_time > 0) Sys.sleep(sleep_time)
  }

  end_time <- Sys.time()

  result <- list(
    boot_mean = mean(boot_means),
    boot_sd   = sd(boot_means),
    elapsed   = difftime(end_time, start_time, units = "secs")
  )
  class(result) <- "long_stat_calc"
  result
}

# Print method for easy reporting
print.long_stat_calc <- function(x, ...) {
  cat("Bootstrap Mean:", x$boot_mean, "\n")
  cat("Bootstrap SD:  ", x$boot_sd, "\n")
  cat("Elapsed Time:  ", x$elapsed, "seconds\n")
}
```

# Data Preparation

```{r data}
# Arguments for 10 parallel jobs
args_list <- list(
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002),
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002),
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002),
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002),
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002),
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002),
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002),
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002),
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002),
  list(x = rnorm(100), n_boot = 3000, sleep_time = 0.002)
)
```

# mirai Implementation

mirai provides a lightweight, async-focused approach:

```{r mirai}
# Clean slate
mirai::daemons(0)
set.seed(10)

mirai_timing <- system.time({
  mirai::daemons(6)  # Start 6 daemon processes

  res <- mirai::mirai_map(
    .x = list(
      rnorm(100), rnorm(100), rnorm(100), rnorm(100),
      rnorm(100), rnorm(100), rnorm(100), rnorm(100),
      rnorm(100), rnorm(100)
    ),
    .f = long_stat_calc,
    .args = list(n_boot = 3000, sleep_time = 0.002)
  )

  # Check progress and collect results
  res[.progress]
  mirai_results <- res[.flat]
})

print(mirai_timing)
mirai::daemons(0)  # Clean up

```

# bakerrr Implementation

bakerrr offers an object-oriented approach with built-in job management:

```{r bakerrr}
bakerrr_timing <- system.time({
  baker <- bakerrr::bakerrr(
    long_stat_calc,
    args_list = args_list,
    n_daemons = 6
    # Optional: bg_args = list(stdout = "out.log", stderr = "error.log") # nolint
  ) |>
    bakerrr::run_jobs(wait_for_results = TRUE)

  bakerrr_results <- baker@results
})

print(bakerrr_timing)
```

# Comparison

## Performance

Both approaches show similar performance for CPU-bound tasks, with actual timing dependent on:

 - Task complexity
 - Number of workers
 - System resources
 - Overhead differences

## API Design

### mirai:

 - Functional programming style
 - Explicit daemon management
 - Direct result collection
 - Minimal syntax

### bakerrr:

 - Object-oriented approach
 - Automatic resource management
 - Built-in logging options
 - Method chaining support

## Use Cases

### Choose mirai when:

 - You need fine-grained control over async operations
 - Working with streaming or reactive computations
 - Minimal dependencies are important
 - Direct integration with other async patterns

### Choose bakerrr when:

 - You prefer object-oriented workflows
 - Built-in logging and error handling are valuable
 - Working within larger application frameworks
 - Method chaining fits your coding style
 
# Results Inspection
 ``` {r}
# Both approaches return similar structured results
str(mirai_results[[1]])
str(bakerrr_results[[1]])

# Print first result from each method
print(mirai_results[[1]])
print(bakerrr_results[[1]])
 ```

# Conclusion

Both mirai and bakerrr provide effective parallel processing capabilities. The choice depends on your specific requirements:

 - mirai: Lightweight, functional, explicit control
 - bakerrr: Object-oriented, feature-rich, automatic management

For production workflows requiring robust error handling and logging, bakerrr may be preferable. For performance-critical applications needing minimal overhead, mirai could be the better choice.

# Session Info

``` {r}
sessionInfo()
```
