---
title: "How to use shapper for classification"
author: "Alicja Gosiewska"
date: "`r Sys.Date()`"
output: 
  html_document:
    toc: true
    toc_float: true
    number_sections: true
vignette: >
  %\VignetteEngine{knitr::knitr}
  %\VignetteIndexEntry{How to use shapper for classification}
  %\usepackage[UTF-8]{inputenc}
---

```{r setup, include=FALSE, eval=FALSE}
knitr::opts_chunk$set(echo = TRUE,
                      message = FALSE,
                      warning = FALSE)
```

# Introduction

The `shapper` is an R package which ports the `shap` python library in R. For details and examples see [shapper repository on github](https://github.com/ModelOriented/shapper) and [shapper website](https://modeloriented.github.io/shapper/).

SHAP (SHapley Additive exPlanations) is a method to explain predictions of any machine learning model. 
For more details about this method see [shap repository on github](https://github.com/slundberg/shap).

## Python library shap

To run shapper python library shap is required. It can be installed both by python or R. To install it throught R, you an use function `install_shap` from the `shapper` package.

```{r, eval = FALSE}
shapper::install_shap()
```


# Load data sets

The example usage is presented on the `HR` dataset from the R package `DALEX`. For more details see [DALEX github repository](https://github.com/ModelOriented/DALEX).

```{r eval = FALSE}
library("DALEX")
Y_train <- HR$status
x_train <- HR[ , -6]

```

# Let's build models

```{r eval = FALSE}
library("randomForest")
set.seed(123)
model_rf <- randomForest(x = x_train, y = Y_train)

library(rpart)
model_tree <- rpart(status~. , data = HR)
```

# Here shapper starts

First step is to create an explainer for each model. The explainer is an object that wraps up a model and meta-data.
```{r eval = FALSE}
library(shapper)

p_function <- function(model, data) predict(model, newdata = data, type = "prob")

ive_rf <- individual_variable_effect(model_rf, data = x_train, predict_function = p_function,
            new_observation = x_train[1:2,], nsamples = 50)


ive_tree <- individual_variable_effect(model_tree, data = x_train, predict_function = p_function,
            new_observation = x_train[1:2,], nsamples = 50)

```

```{r eval = FALSE}
ive_rf
```

# Plotting results

```{r eval = FALSE}
plot(ive_rf, bar_width = 4)
```
[](https://modeloriented.github.io/shapper/articles/shapper_classification_files/figure-html/plot1-1.png)

To see only attributions use option `show_predicted = FALSE`.

```{r eval = FALSE}
plot(ive_rf, show_predicted = FALSE, bar_width = 4)
```
[](https://modeloriented.github.io/shapper/articles/shapper_classification_files/figure-html/unnamed-chunk-6-1.png)

We can show many models on one grid.

```{r eval = FALSE}
plot(ive_rf, ive_tree, show_predicted = FALSE, bar_width = 4)
```
[](https://modeloriented.github.io/shapper/articles/shapper_classification_files/figure-html/unnamed-chunk-7-1.png)

## Let's filter data for plot


```{r eval = FALSE}
ive_rf_filtered <- ive_rf[ive_rf$`_ylevel_` =="fired", ]
shapper:::plot.individual_variable_effect(ive_rf_filtered)
```
[](https://modeloriented.github.io/shapper/articles/shapper_classification_files/figure-html/unnamed-chunk-8-1.png)