---
title: "Weighted Statistics With table1"
author: "Benjamin Rich"
date: "`r Sys.Date()`"
output: 
  rmarkdown::html_vignette:
    css: [style.css, vignette.css]
    toc: true
vignette: >
  %\VignetteIndexEntry{Weighted Statistics With table1}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteDepends{boot,MatchIt}
  %\VignetteEncoding{UTF-8}
---

```{r echo=FALSE, message=FALSE, warning=FALSE, results='hide'}
library(table1, quietly=TRUE)
try(detach('package:printr', unload = TRUE), silent=TRUE) # Make sure printr is not loaded
```

## Introduction

Weighted descriptive statistics are required in some contexts, for instance, in
the analysis of survey data. This can be accomplished by using the provided
`weighted` wrapper class. Internally, this will cause the functions `wtd.mean`,
`wtd.var`, `wtd.quantile` and `wtd.table` from the `Hmisc` package (which is
optional in general but required to use this functionality) to be used in place
of their standard non-weighted counterparts.

## Example

We take an example from the `survey` package (note that this is for
illustration purposes only, it is not meant to be a real application):

```{r}
library(survey, quietly=TRUE)
data(myco)

myco$Leprosy <- factor(myco$leprosy, levels=1:0, labels=c("Leprosy Cases", "Controls"))

myco$AgeCat <- factor(myco$Age,
    levels=c(7.5,      12.5,       17.5,       22.5,       27.5,       32.5      ),
    labels=c("5 to 9", "10 to 14", "15 to 19", "20 to 24", "25 to 29", "30 to 34")
)

myco$ScarL <- as.logical(myco$Scar)

label(myco$Age) <- "Age"
units(myco$Age) <- "years"
label(myco$AgeCat) <- "Age Group"
label(myco$ScarL) <- "BCG vaccination scar"

table1(~ ScarL + Age + AgeCat | Leprosy, data=weighted(myco, wt), big.mark=",")
```

It also works in "transpose" mode:

```{r}
table1(~ Age + ScarL | Leprosy, data=weighted(myco, wt), transpose=T, big.mark=",")
```

For more flexibility, we may not want the weighting to be applied globally, but
only to some of the variables. We can do this as well, by using `weighted` on
individual variables:


```{r}
table1(~ weighted(ScarL, wt) + Age + AgeCat | Leprosy, data=myco, big.mark=",")
```

This implementation allows for simple weighted statistics, but does not
currently support more complex designs from the `survey` package like
stratified sampling or cluster sampling.


## The `weighted` and `indexed` classes

The `weighted` class is just a wrapper around a vector or `data.frame` that
adds a vector of weights as an attribute. These weights are carried along or
subsetted appropriately during operations like slicing or subsetting. See
`?weighted` for some examples.

The `indexed` class is similar, but it simply maintains the indices of a vector
(row indices for a `data.frame`) when a subset or slide is taken. This leads to
some interesting possibilities when we want to do more complex things.

The following example also comes from the `survey` package:

```{r}
data(api)

dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)

svyby(~api99+api00, ~stype, dclus1, svymean)

svytable(~sch.wide+stype, dclus1)
```

Using `table1`, the same results can be presented more beautifully:


```{r}
myrender <- function(x, name, ...) {
    if (is.numeric(x)) {
        r <- svymean(as.formula(paste0("~", name)), subset(dclus1, (1:nrow(dclus1)) %in% indices(x)))
        r <- c(Mean=as.numeric(r), SE=sqrt(attr(r, "var", exact=T)))
        r <- unlist(stats.apply.rounding(as.list(r), big.mark=","))
    } else {
        r <- svytable(as.formula(paste0("~", name)), subset(dclus1, (1:nrow(dclus1)) %in% indices(x)))
        r <- unlist(stats.apply.rounding(as.list(r), big.mark=",", digits=1, rounding.fn=round_pad))
    }
    c("", r)
}

apiclus1$stype2 <- factor(apiclus1$stype, levels=c("E", "M", "H"),
    labels=c("Elementary", "Middle School", "High School"))

label(apiclus1$api99)    <- "API in 1999"
label(apiclus1$api00)    <- "API in 2000"
label(apiclus1$sch.wide) <- "Met school-wide growth target?"

table1(~ api99 + api00 + sch.wide | stype2, indexed(apiclus1), render=myrender,
    render.strat=names)
```


