---
title: "General introduction: iBreakDown plots for Sinking of the RMS Titanic"
author: "Przemyslaw Biecek"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{General introduction: iBreakDown plots for Sinking of the RMS Titanic}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = FALSE,
  comment = "#>",
  fig.width = 7,
  fig.height = 3.5,
  warning = FALSE,
  message = FALSE
)
```

# Data for Titanic survival

Let's see an example for `iBreakDown` plots for survival probability of Titanic passengers.
First, let's see the data, we will find quite nice data from in the `DALEX` package (orginally `stablelearner`).

```{r}
library("DALEX")
head(titanic)
```

# Model for Titanic survival

Ok, now it's time to create a model. Let's use the Random Forest model.

```{r}
# prepare model
library("randomForest")
titanic <- na.omit(titanic)
model_titanic_rf <- randomForest(survived == "yes" ~ gender + age + class + embarked +
                                   fare + sibsp + parch,  data = titanic)
model_titanic_rf
```

# Explainer for Titanic survival

The third step (it's optional but useful) is to create a DALEX explainer for Random Forest model.

```{r}
library("DALEX")
explain_titanic_rf <- explain(model_titanic_rf, 
                      data = titanic[,-9],
                      y = titanic$survived == "yes", 
                      label = "Random Forest v7")
```

# Break Down plot with D3

Let's see Break Down for model predictions for 8 years old male from 1st class that embarked from port C.

```{r}
new_passanger <- data.frame(
  class = factor("1st", levels = c("1st", "2nd", "3rd", "deck crew", "engineering crew", "restaurant staff", "victualling crew")),
  gender = factor("male", levels = c("female", "male")),
  age = 8,
  sibsp = 0,
  parch = 0,
  fare = 72,
  embarked = factor("Southampton", levels = c("Belfast", "Cherbourg", "Queenstown", "Southampton"))
)
```

## Calculate variable attributions

```{r}
library("iBreakDown")
rf_la <- local_attributions(explain_titanic_rf, new_passanger)
rf_la
```

## Plot attributions with `ggplot2`

```{r}
plot(rf_la)
```

## Plot attributions with `D3`

```{r}
plotD3(rf_la)
```

## Calculate uncertainty for variable attributions

```{r}
rf_la_un <- break_down_uncertainty(explain_titanic_rf, new_passanger,
                         path = "average")
plot(rf_la_un)
```


## Show only top features

```{r}
plotD3(rf_la, max_features = 3)
```

## Force OX axis to be from 0 to 1

```{r}
plotD3(rf_la, max_features = 3, min_max = c(0,1), margin = 0)
```


