---
title: "votesmart"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{votesmart}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```


```{r setup}
library(votesmart)
```

The first step to using the `votesmart` package is to register an API key and store it in an environment variable by following [these instructions](https://github.com/decktools/votesmart#api-keys).

Let's make sure our API key is set. 

```{r}
# If our key is not registered in this environment variable,
# the result of `Sys.getenv("VOTESMART_API_KEY")` will be `""` (i.e. a string of `nchar` 0)
key <- Sys.getenv("VOTESMART_API_KEY")

key_exists <- (nchar(key) > 0)

if (!key_exists) knitr::knit_exit()
```


We'll also attach `dplyr` for working with dataframes.

```{r}
suppressPackageStartupMessages(library(dplyr))
conflicted::conflict_prefer("filter", "dplyr")
```

<br>

### Motivation

Some of these functions are necessary precursors to obtain data you might want. For instance, in order to get candidates' ratings by SIGs, you'll need to get `office_level_id`s in order to get `office_id`s, which is a required argument to get candidate information using `candidates_get_by_office_state`. We'll go through what might be a typical example of how you might use the `votesmart` package.

<br>

## Get Candidate Info

There are currently three functions for getting data on VoteSmart candidates: `candidates_get_by_lastname`, `candidates_get_by_levenshtein`, and `candidates_get_by_office_state`.

Let's search for former US House Rep [Barney Frank](https://en.wikipedia.org/wiki/Barney_Frank) using `candidates_get_by_lastname`.

<br>

From `?candidates_get_by_lastname`, this function's defaults are:

```
candidates_get_by_lastname(
  last_names,
  election_years = lubridate::year(lubridate::today()),
  stage_ids = "",
  all = TRUE,
  verbose = TRUE
)
```

Since the default election year is the current year and Barney Frank left office in 2013, we'll specify a few years in which he ran for office. 

```{r}
(franks <-
  candidates_get_by_lastname(
    last_names = "frank",
    election_years = c(2000, 2004)
  )
)
```

Looking at the `first_name` column, are a number of non-Barneys returned. We can next filter our results to Barney.

```{r}
(barneys <-
  franks %>%
  filter(first_name == "Barney") %>%
  select(
    candidate_id, first_name, last_name,
    election_year, election_state_id, election_office
  )
)
```

 The two rows returned correspond to the two `election_year`s we specified. Each candidate gets their own unique `candidate_id`, which we can `pull` out.

```{r}
(barney_id <-
  barneys %>%
  pull(candidate_id) %>%
  unique()
)
```


<br>

## Get Candidates' Ratings

One of the most powerful things about VoteSmart is its wealth of information about candidates' positions on issues as rated by a number of Special Interest Groups, or SIGs.

Given a `candidate_id`, we can ask for those ratings using `rating_get_candidate_ratings`.

```{r}
(barney_ratings <-
  rating_get_candidate_ratings(
    candidate_ids = barney_id,
    sig_ids = "" # All SIGs
  )
)
```

There are a lot of columns here because some ratings are tagged with multiple categories. 

```{r}
main_cols <- c("rating", "category_name_1", "sig_id", "timespan")
```

We'll filter to Barney's ratings on the environment using just the first category name.

```{r}
(barney_on_env <-
  barney_ratings %>%
  filter(category_name_1 == "Environment") %>%
  select(main_cols)
)
```

Something to be aware of is that some SIGs give ratings as letter grades:

```{r}
barney_ratings %>%
  filter(
    stringr::str_detect(rating, "[A-Z]")
  ) %>%
  select(rating, category_name_1)
```

But using just Barney's number grades, we can get his average rating on this category per `timespan`:

```{r}
barney_on_env %>%
  group_by(timespan) %>%
  summarise(
    avg_rating = mean(as.numeric(rating), na.rm = TRUE)
  ) %>%
  arrange(desc(timespan))
```


Keep in mind that these are ratings given by SIGs, which often have very different baseline stances on issues. For example, a pro-life group might give a candidate a rating of 0 whereas a pro-choice group might give that same candidate a 100.

```{r}
barney_ratings %>%
  filter(category_name_1 == "Abortion") %>%
  select(
    rating, sig_id, category_name_1
  )
```

<br>

## SIGs

When it comes to the Special Interest Groups themselves, the result of `rating_get_candidate_ratings` only supplies us with a `sig_id`. 

We can get more information about these SIGs given these IDs with `rating_get_sig`.

```{r}
(some_sigs <-
  barney_ratings %>%
  pull(sig_id) %>%
  unique() %>%
  sample(3)
)
```

```{r}
rating_get_sig(
  sig_ids = some_sigs
)
```

<br>

Or, if we don't yet know any `sig_id`s, we can get a dataframe of them with the function `rating_get_sig_list`. 

That function requires a vector of issue `category_ids`, however, so let's first get a vector of some `category_ids`.

```{r}
(category_df <-
  rating_get_categories(
    state_ids = NA # NA for national
  ) %>%
  distinct() %>%
  sample_n(nrow(.)) # Sampling so we can see multiple categories in the 10 rows shown here
)
```

Now we can get our dataframe of SIGs given some categories.

```{r}
(some_categories <- category_df$category_id %>% sample(3))
```

```{r}
(sigs <-
  rating_get_sig_list(
    category_ids = some_categories,
    state_ids = NA
  ) %>%
  select(sig_id, name, category_id, state_id) %>%
  sample_n(nrow(.))
)
```

We already have the category names corresponding to those `category_id`s in our `category_df`, so we can join `category_df` onto `sigs`s to attach `category_name_1`s to each of those SIGs.

```{r}
sigs %>%
  rename(
    sig_name = name
  ) %>%
  left_join(
    category_df,
    by = c("state_id", "category_id")
  ) %>%
  rename(
    category_name_1 = name
  ) %>%
  sample_n(nrow(.))
```



<br>

***

For more info or to report a bug to VoteSmart, please refer to the [VoteSmart API docs](http://api.votesmart.org/docs/index.html)! 

