---
title: "breakDown plots for generalised linear models"
author: "Przemyslaw Biecek"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{breakDown plots for generalised linear models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This vignette uses the HR churn data (from https://www.kaggle.com/) to show how the `breakDown` package explains predictions from `glm` models.

The data set is included in the `breakDown` package as `HR_data`.

```{r}
library(breakDown)
head(HR_data, 3)
```

Now let's fit a logistic regression model for churn, which is recorded in the `left` variable.

```{r}
model <- glm(left ~ ., data = HR_data, family = "binomial")
```

But how can we understand which factors drive the prediction for a single observation?

With the `breakDown` package!

First, explanations on the scale of the linear predictor (the log-odds).

```{r, fig.width=7}
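library(ggplot2)
# Prediction for observation 11 on the link (log-odds) scale
predict(model, HR_data[11,], type = "link")

# Break that prediction down into per-variable contributions
explain_1 <- broken(model, HR_data[11,])
explain_1
plot(explain_1) + ggtitle("breakDown plot for the linear predictor")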
```
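
The additivity behind such a plot can be checked with base R alone. The chunk below is a minimal sketch that reuses `model` and `HR_data` from the chunks above and relies on `predict(..., type = "terms")` from `stats`. It is an assumption that these centred per-term values correspond to what `broken()` reports; the names `terms_11`, `reconstructed` and `link_11` are introduced here only for illustration.

```{r}
# Per-term contributions on the link (log-odds) scale.
# Base R centres each term, so the "constant" attribute holds the baseline.
terms_11 <- predict(model, HR_data[11, ], type = "terms")
round(terms_11, 3)

# Baseline plus the sum of all contributions should recover the linear predictor
reconstructed <- attr(terms_11, "constant") + sum(terms_11)
link_11 <- predict(model, HR_data[11, ], type = "link")
all.equal(unname(reconstructed), unname(link_11))
```

If the last line returns `TRUE`, the linear predictor for this observation splits exactly into a baseline plus one contribution per model term, which is the property a waterfall-style plot depends on.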

Next, explanations on the probability scale, with the intercept set as the origin.

```{r, fig.width=7}
predict(model, HR_data[11,], type = "response")

explain_1 <- broken(model, HR_data[11,], baseline = "intercept")
explain_1
plot(explain_1, trans = function(x) exp(x) / (1 + exp(x))) +
  ggtitle("Predicted probability of leaving the company") +
  scale_y_continuous(limits = c(0, 1), name = "probability", expand = c(0, 0))
```
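
The function passed as `trans` is the inverse logit, which maps the linear predictor to a probability. As a small sanity check, the sketch below again reuses `model` and `HR_data` from the chunks above; the helper `inv_logit` and the names `link_11` and `prob_11` are introduced here only for illustration.

```{r}
# Inverse logit: the same transformation as the one passed to `trans`
inv_logit <- function(x) exp(x) / (1 + exp(x))

link_11 <- predict(model, HR_data[11, ], type = "link")
prob_11 <- predict(model, HR_data[11, ], type = "response")

# Transforming the link-scale prediction should give the predicted probability
all.equal(inv_logit(link_11), prob_11)

# Base R offers the same function as stats::plogis()
all.equal(inv_logit(link_11), plogis(link_11))
```

Because the inverse logit is nonlinear, contributions that add up on the log-odds scale do not add up on the probability scale. The heights of individual steps in the transformed plot are therefore best read as cumulative changes in probability rather than as fixed per-variable effects.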



