Documentation for the Dyad Ratios Algorithm

James Stimson with light editing and updating by Dave Armstrong

General

extract() is an R function which implements the dyad ratios algorithm described in Stimson (2018) for constructing a latent time series from the raw materials of survey research marginals. It inputs data in archival form, as dated survey results, aggregates them into regular periods selected at run time, and extracts the latent dimension(s) which best account for the raw series.

Although its central mathematics is entirely different - based upon ratios of comparable observed series rather than covariance of “variables” - its behavior is strongly analogous to extraction of a “factor score” in standard (iterative) principal components analysis.

Usage

The routine is called from R by the command:

output<- extract(varname, date, index, ncases=NULL, 
  unit="A", mult=1, begindt=NA, enddt=NA, npass=1,
  smoothing=TRUE, endmonth=12)

Arguments

The first three arguments must be specified by the user, others are optional:

  1. varname is an \(n\)-length vector containing names given to input series. The routine assumes that any two observations with the same name are comparable and that any change in name signals noncomparability. Ratios are computed only between comparable observations.

  2. date is an \(n\)-length vector of dates, typically one of the dates of survey field work (e.g., first, last, or median day). This should be recorded in a date format. For example, if you had the string “2025-01-01”, you could use lubridate::ymd("2025-01-01") to convert it to a date format.

  3. index is an \(n\)-length vector of the numeric summary value of each result. It might be a percent or proportion responding in a single category (e.g., the “approve” response in presidential approval) or some multi-response summary, for example,

\[\text{Index} = \frac{\text{Percent Agree}}{\text{Percent Agree} + \text{Percent Disagree}}\]

Interpretation of the derived latent dimension is eased by coding the index so that polarity is the same across items (for example, if the concept being measured is liberalism, high values always stand for the more liberal response), but as in principal components analysis, the routine deals appropriately with polarity switches.

  4. ncases is an \(n\)-length vector giving the number of cases for each value in index, typically the sample size. It is used during aggregation to produce a weighted average when multiple readings fall in one aggregation period. If this issue does not arise, or if the user prefers an unweighted average, then ncases=NULL—or omitting the ncases variable—will ignore case weighting. In the case of a mix of available and missing ncases indicators, 0 or NA values are reset to 1000.

  5. unit is the aggregation period: “D” (daily), “M” (monthly), “Q” (quarterly), “A” (annual, the default), or “O” (other, for multi-year aggregation). mult is the number of years per period, used only with the unit option “O”.

  6. begindt is the beginning date for analysis. Default (NA) determines the beginning date from the earliest date in the data. Entries for beginning and ending dates use the ISOdate function. For example, to start an analysis in January, 1993, enter ISOdate(1993,1,1). (As always, R is case sensitive. So “ISO” must be caps and “date” lower case.)

  7. enddt is the ending date for analysis. Default (NA) determines the ending date from the latest date in the data.

Warning: The routine cannot determine the earliest or latest dates from items which are not actually used in the analysis. The criterion for usage is that an item must appear in more than one period after aggregation. So if the beginning or ending date is determined by an item which is discarded because it does not meet this criterion, the routine will fail.

  8. smoothing specifies whether exponential smoothing is applied to intermediate estimates during the iterative solution process. Default is TRUE.

  9. npass is the number of dimensions to be extracted; it can be 1 or 2 and defaults to 1.
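
The index construction described for the index argument can be done with simple arithmetic before calling extract(). A minimal sketch in base R; the marginals and column names here are invented for illustration:

```r
# Hypothetical survey marginals: percent agreeing and disagreeing,
# with a "don't know" remainder that the index ignores
marginals <- data.frame(
  agree    = c(45, 52, 38),
  disagree = c(40, 33, 47)
)

# Index = Agree / (Agree + Disagree)
marginals$index <- with(marginals, agree / (agree + disagree))
round(marginals$index, 3)
#> [1] 0.529 0.612 0.447
```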

Dating Considerations

The routine expects a date variable of R class Date. Import packages like foreign, readr, haven, and rio will generally try to turn date-looking variables into Date objects. The str() function will show whether your variable is a Date or a character. We can see from the output below that the date variable in the jennings dataset is indeed in a date format.

library(DyadRatios)
data(jennings)
str(jennings$date)
#>  Date[1:295], format: "1973-12-01" "1986-05-24" "1987-05-05" "1991-06-08" "1994-07-26" ...

If your variable is not already in a date format, you can convert it with functions from the lubridate package. For example, if your date variable is in the format “2025-01-01”, you can use the ymd() function as follows:

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
temp <- data.frame(survey_date = c("2025-01-01", "2024-10-13", "2020-05-13"))
str(temp)
#> 'data.frame':    3 obs. of  1 variable:
#>  $ survey_date: chr  "2025-01-01" "2024-10-13" "2020-05-13"
temp$survey_date <- ymd(temp$survey_date)
str(temp)
#> 'data.frame':    3 obs. of  1 variable:
#>  $ survey_date: Date, format: "2025-01-01" "2024-10-13" ...

See how the class changed from chr to Date.

If you happened to have three variables for month, day and year, you could paste them together and use the lubridate functions:

temp <- data.frame(year= c(2025, 2024, 2020), 
                   month = c(1, 10, 5), 
                   day = c(1, 13, 13))
temp$date <- lubridate::ymd(paste(temp$year, temp$month, temp$day, sep="-"))
str(temp)
#> 'data.frame':    3 obs. of  4 variables:
#>  $ year : num  2025 2024 2020
#>  $ month: num  1 10 5
#>  $ day  : num  1 13 13
#>  $ date : Date, format: "2025-01-01" "2024-10-13" ...

Warning: The lubridate functions will not handle impossible dates (for example, 1-32-05). They only decode dates that actually existed on past calendars or will exist on future ones (e.g., no February 29 unless the year is actually a leap year).

Ideally, the dates will be in a structured format. The lubridate functions will even parse string dates with words, e.g., “January 1, 1993” or “Aug 2, 2020” so long as the strings have month, day and year in the same position in the date.

temp <- data.frame(date = c("January 1, 1993", "Jan 1, 1993", "Aug 2, 2020"))
temp$date <- mdy(temp$date)

The formats produced by importing Excel or csv documents will not be identical to those produced by the lubridate functions, but they will still work with the extract() function, as their components can still be extracted by the lubridate functions month(), day(), and year().

temp <- data.frame(date = ymd("2025-01-01", "1990-01-01", "1970-01-01", "1950-01-01"))
xl_temp <- tempfile(fileext = ".xlsx")
csv_temp <- tempfile(fileext = ".csv")

rio::export(temp, xl_temp)
rio::export(temp, csv_temp)


temp_xl <- rio::import(xl_temp)
temp_csv <- rio::import(csv_temp)

str(temp_xl)
#> 'data.frame':    4 obs. of  1 variable:
#>  $ date: POSIXct, format: "2025-01-01" "1990-01-01" ...
str(temp_csv)
#> 'data.frame':    4 obs. of  1 variable:
#>  $ date: IDate, format: "2025-01-01" "1990-01-01" ...

Notice that from Excel you get a POSIXct variable and from csv you get an IDate object. Both are still equal, in integer form, to the dates produced by lubridate; all values pass the equivalence test.

all(temp$date == temp_xl$date)
#> [1] TRUE
all(temp$date == temp_csv$date)
#> [1] TRUE

Output

The extract function produces as output 8 categories of information:

  1. formula reproduces the user call
  2. setup supplies basic information about options and the iterative solution
  3. period is a list of the aggregation periods, for example, 2005.02 for February, 2005
  4. varname is a list in order of the variables actually used in the analysis, a subset of all those in the data.
  5. loadings are the item-scale correlations from the final solution. Their square is the validity estimate used in weighting.
  6. means and std.deviations document the item descriptive information, and
  7. latent is the estimated time series, the purpose of everything. The variable latent1 is for the first dimension and latent2 for the second, if applicable.
  8. Diagnostic Information: there are several additional objects returned in the output, including hist - the iteration history, totalvar - the total variance to be explained, var_exp - the variance explained by each dimension, dropped - any series dropped from the analysis, and smoothing - whether smoothing was applied during the iterative solution process.

Output Functions

The raw output object created at run time contains everything of interest, but in an inconvenient format. There are several functions that can be used to display results in different ways.

  1. plot displays a ggplot time series of the estimated dimension(s) on the y axis against time units on the x axis; it can be called with plot(object)

  2. print displays the output of the iterative solution process. This is what was previously printed with verbose=TRUE, an argument that has been removed in favor of a print method for the function output.

  3. summary displays information about the raw series used in the analysis. Under the heading “Variable Name Cases Loading Mean Std Dev” it lists each series used in the solution, giving the variable name, number of cases (after aggregation), dimension loadings, and the mean and standard deviation of the raw series.

  4. get_mood retrieves the period and latent dimension estimate(s) from the model object and returns them in a data frame.

Negative Correlations?

Correlations, in the case of time series, measure whether two series vary in or out of phase. Thus the cross-sectional interpretation of a negative correlation (that two items are inversely related) does not hold. It is not unusual to observe negative “loadings” in extract analyses. They mean only that items move out of phase, not that they are opposites.
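
A quick way to see this with toy data: a series and its mirror image move exactly out of phase, so their correlation is -1 even though both track the same underlying concept.

```r
# Two toy items tracking the same latent series, one of them out of phase
t <- 1:40
item_a <- sin(t / 4)
item_b <- -sin(t / 4)

# The correlation is -1, yet both items measure the same thing
cor(item_a, item_b)
#> [1] -1
```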

Model

Assume \(N\) survey results, each coded into a meaningful single indicator. These results consist of \(n\) subsets of comparable items measured at different times, \(t = 1, \ldots, T\). Designate each result \(x_{it}\), where the \(i\) subscript indicates item and \(t\) indicates aggregation period, \(1, \ldots, T\).

The starting assumption is that ratios, \(r_{i,t+k} = \frac{x_{i,t+k}}{x_{it}}\), of the comparable item \(i\) at different times will be meaningful indicators of the latent concept to be extracted. Metric information is thus lost, which is desirable because absent a science of question wording, we have no ability to compare different items. If there were no missing observations, then for each item \(i\), we could let \(r_{i1}=1.0\) and observe the complete set of ratios, \(r_{i2}, r_{i3}, \ldots, r_{iT}\). Then an average across the \(n\) items forms an excellent estimate of the latent concept \(\theta_{t}\):

\[\theta_{t} = \frac{\sum_{i=1}^{n}r_{it}}{n}\]
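
With complete data, this estimator is just a column average of item ratios. A toy base-R sketch with invented numbers:

```r
# Toy data: 3 comparable items, all observed in 4 periods (rows = items)
x <- rbind(c(50, 55, 60, 66),
           c(20, 22, 24, 26),
           c(70, 77, 84, 92))

# Ratios r_it = x_it / x_i1, so r_i1 = 1 for every item
r <- x / x[, 1]

# theta_t is the average ratio across the items in each period
theta <- colMeans(r)
round(theta, 3)
#> [1] 1.000 1.100 1.200 1.311
```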

But we do have missing x’s - and in the typical case it is most of them. We would still be in good shape if we had a common period, say period 1, which was available for all series. We could then form the sum from the \(k\) available items, \(k\leq n\), and divide by \(k\). But we also lack such a common period. That motivates a recursive approach.

Forward Recursion: Begin by selecting that subset of items which are available at time 1. For them we can form \(\hat{\theta}\) for \(t=1,\ldots, T\) setting \(\hat{\theta}_{1} = 1.0\) and calculating \(\hat{\theta}_{2}, \ldots, \hat{\theta}_{T}\) from whatever subsets of items are available. Now proceed to period 2, setting \(\hat{\theta}_{2}\) to that value estimated from period 1 and now, using the subset of items which include period 2, estimating \(\hat{\theta}_{3}, \ldots, \hat{\theta}_{T}\) from the assumption that \(\theta_{2} = \hat{\theta}_{2}\). By projecting \(\hat{\theta}_{2}\) forward in this manner, the estimates for periods 3 through \(T\) become comparable to what they would have been had period 1 information been available. This procedure is repeated one period at a time through \(T-1\), giving from 1 to \(T-1\) different estimates of each of the \(\theta_{t}\). An average of all of them becomes \(\hat{\theta}_{t}\).

Backward Recursion: It will be seen that forward recursion very heavily weights early information relative to later information. Period 1 contributes to all subsequent estimates, whereas period \(T-1\) contributes only to \(T\), and period \(T\) only to itself. Thus the direction of recursion matters. Employing the same procedure backward puts a different weight on the items and gives a comparable, but not identical, set of estimates. Thus a more efficient set of estimates, one weighting all information equally, can be gained by averaging the two recursive estimates. (And the correlation between forward and backward series becomes a reliability estimate.)
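
The forward recursion can be sketched in a stylized way. The toy function below is illustrative only, not the package's actual implementation, and the missing-data pattern is invented:

```r
# Forward recursion on an items-by-periods matrix with missing values (NA).
# Illustrative sketch only, not the package's actual code.
forward_recursion <- function(x) {
  nper <- ncol(x)
  # ests[b, t]: estimate of theta_t projected from base period b
  ests  <- matrix(NA_real_, nper - 1, nper)
  theta <- c(1, rep(NA_real_, nper - 1))  # theta_1 fixed at 1.0
  for (b in 1:(nper - 1)) {
    for (t in (b + 1):nper) {
      ok <- !is.na(x[, b]) & !is.na(x[, t])  # items seen in both periods
      if (any(ok))
        ests[b, t] <- mean(x[ok, t] / x[ok, b]) * theta[b]
    }
    # running estimate of the next base period from all projections so far
    theta[b + 1] <- mean(ests[1:b, b + 1], na.rm = TRUE)
  }
  # final estimate: average all available projections for each period
  c(1, colMeans(ests, na.rm = TRUE)[-1])
}

x <- rbind(c(10, 12, NA, 18),
           c( 5, NA,  7,  9))
forward_recursion(x)
#> [1] 1.0 1.2 1.4 1.8

# The backward recursion applies the same logic to the time-reversed series
# (note its result is normalized to the last period rather than the first)
rev(forward_recursion(x[, ncol(x):1]))
```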

Iterative Validity Estimation

As in iterated principal components, we both make assumptions about item validity and then, post hoc, have the ability to observe empirical estimates of validities (the square of item-scale correlations). At the beginning, validities are assumed to be 1.0 for all items. Then the empirically estimated validities become the assumed validities for the next iteration. This procedure is repeated until the difference between assumed and estimated validities is effectively zero for all items, with the maximum item discrepancy less than .001.
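
The stopping rule amounts to a fixed-point iteration. In the schematic sketch below, estimate_validities is a toy stand-in (a simple contraction toward made-up target values) for the real item-scale correlation step, so the loop runs and converges:

```r
# Toy stand-in for re-estimating validities; in the real algorithm this is
# the squared item-scale correlation from the current solution
estimate_validities <- function(v, target = c(0.90, 0.60, 0.80)) {
  (v + target) / 2  # a contraction, so the iteration converges
}

validities <- rep(1, 3)  # iteration 0: assume all items perfectly valid
repeat {
  est <- estimate_validities(validities)
  if (max(abs(est - validities)) < 0.001) break  # the .001 criterion
  validities <- est
}
round(validities, 2)
#> [1] 0.9 0.6 0.8
```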

Example

We illustrate the use of extract with data from Jennings et al. (2017), who graciously shared their data through Dataverse. The first few observations show the question names, dates, percentage distrusting government, and sample size.

library(DyadRatios)
data(jennings)
head(jennings)
#>       variable       date value    n
#> 1 govtrust_bsa 1973-12-01  15.2 1857
#> 2 govtrust_bsa 1986-05-24  11.8 1548
#> 3 govtrust_bsa 1987-05-05  11.2 1410
#> 4 govtrust_bsa 1991-06-08  14.1 1445
#> 5 govtrust_bsa 1994-07-26  21.2 1137
#> 6 govtrust_bsa 1996-06-01  23.5 1180

We can see how many questions there are in the data and how often each appears:

library(dplyr) 
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
jennings %>% 
  group_by(variable) %>% 
  tally() %>% 
  arrange(desc(n)) %>% 
  print(n=40)
#> # A tibble: 37 × 2
#>    variable          n
#>    <chr>         <int>
#>  1 eb_trustprl      30
#>  2 eb_trustgov      29
#>  3 govtrust_bsa     20
#>  4 h_govsys         20
#>  5 trust_mori2      18
#>  6 trust_mori1      17
#>  7 bsa_poltrust     15
#>  8 bsa_MPs          14
#>  9 bsa_votes        14
#> 10 bes_mpstrust      8
#> 11 bsa_parties       7
#> 12 ess_trustparl     7
#> 13 ess_trustpol      7
#> 14 cspl_pubstd       6
#> 15 h_parlsat         6
#> 16 pols_mori         6
#> 17 bes_parties       5
#> 18 bes_polmoney      5
#> 19 h_mpssat          5
#> 20 improp_g          5
#> 21 trustmps_m        5
#> 22 bes_nosay         4
#> 23 mpgain_m          4
#> 24 pollies_g         4
#> 25 polmor_g          4
#> 26 trustown_m        4
#> 27 bes_polstrust     3
#> 28 bsa_govtrust2     3
#> 29 govtrust_m        3
#> 30 pols_g            3
#> 31 bes_govtrust      2
#> 32 bes_wmtrust       2
#> 33 bsa_pols          2
#> 34 eb_polcor         2
#> 35 efficacy_g        2
#> 36 govtrust2_m       2
#> 37 spint_g           2

We can now estimate the dyad ratios algorithm:


jennings_out <- DyadRatios::extract(
  jennings,  
  varname_col = "variable", 
  date_col = "date", 
  index_col = "value", 
  n_col = "n", 
  agg_interval = "annual", 
  start_date = lubridate::ymd("1985-01-01", tz="GMT"), 
  end_date = max(jennings$date), 
  n_dim = 1, 
  smoothing = TRUE)

We can print the information about the estimates with:

print(jennings_out)
#> 
#> === Dyad Ratios Estimation ===
#> 
#> Aggregation:   annual
#> Period:        1985-01-01 to 2016-06-29
#> Time points:   32
#> Series:        35
#> Observations:  287
#> Dimensions:    1
#> Smoothing:     on
#>   Alpha (F):   0.500
#>   Alpha (B):   0.500
#> Iterations:    5
#> Eigenvalue:    3.556
#> Var. Explained:46.1%
#> 
#> Mood (first 10 periods):
#>   1985       44.38
#>   1986       43.95
#>   1987       43.20
#>   1988       43.83
#>   1989       44.15
#>   1990       44.96
#>   1991       45.34
#>   1992       41.96
#>   1993       46.65
#>   1994       49.74
#>   ... (22 more periods)

Our latent dimension explains about \(46\%\) of the variance in the raw series. We could also look at the loadings:

summary(jennings_out)
#> 
#> ================================================================
#> Dyad Ratios Estimation Report
#> ================================================================
#> 
#> Aggregation interval: annual
#> Period: 1985-01-01 to 2016-06-29
#> Time points:  32
#> Series used:  35
#> Total observations: 287
#> Dimensions extracted: 1
#> Exponential smoothing: on
#> Convergence tolerance: 0.0010
#> 
#> ----------------------------------------------------------------
#> Iteration History
#> ----------------------------------------------------------------
#> Iter   Converge     Crit      Reliability  Alpha_F   Alpha_B 
#> ------------------------------------------------------------ 
#>     1      0.32769     0.001        0.853     0.500     0.500
#>     2      0.03894     0.001        0.896     0.500     0.500
#>     3      0.00870     0.001        0.902     0.500     0.500
#>     4      0.00203     0.001        0.903     0.500     0.500
#>     5      0.00048     0.001        0.904     0.500     0.500
#> 
#> ----------------------------------------------------------------
#> Variable Loadings
#> ----------------------------------------------------------------
#> Variable                  N      Mean       Std      Dim1
#> -------------------------------------------------------
#> bes_govtrust              2    10.900     2.404     1.000 
#> bes_mpstrust              3    52.736     3.821     0.909 
#> bes_nosay                 4    51.425     5.836     0.988 
#> bes_parties               2    23.182     7.240     1.000 
#> bes_polmoney              3    58.553     1.265     0.261 
#> bes_polstrust             3    52.000     8.000     0.578 
#> bes_wmtrust               2    35.500     7.778     1.000 
#> bsa_MPs                  14    74.043     2.329     0.307 
#> bsa_govtrust2             3    61.667     0.874    -0.845 
#> bsa_parties               7    68.214     3.258     0.628 
#> bsa_pols                  2    43.500     3.536     1.000 
#> bsa_poltrust             15    91.920     1.588     0.326 
#> bsa_votes                14    72.907     3.917     0.601 
#> cspl_pubstd               6    21.500     9.607     0.918 
#> eb_polcor                 2    59.710     2.362    -1.000 
#> eb_trustgov              16    64.928     7.235     0.875 
#> eb_trustprl              17    59.327     7.333     0.928 
#> ess_trustparl             7    48.579     2.626     0.428 
#> ess_trustpol              7    61.850     2.531     0.120 
#> govtrust2_m               2    45.500     3.536     1.000 
#> govtrust_bsa             19    25.216     8.092     0.855 
#> govtrust_m                2    69.006     2.820     1.000 
#> h_govsys                 18    63.432     7.319     0.431 
#> h_mpssat                  5    38.000     3.464     0.599 
#> h_parlsat                 6    34.333     2.066     0.652 
#> improp_g                  5    57.600     9.072     0.902 
#> mpgain_m                  4    57.000    11.106     0.933 
#> pollies_g                 4    82.250     4.425     0.894 
#> polmor_g                  4    54.250    13.200     0.712 
#> pols_mori                 6    53.667     5.610     0.357 
#> spint_g                   2    72.000     7.071     1.000 
#> trust_mori1              16    73.375     3.686     0.244 
#> trust_mori2              17    75.647     3.297     0.525 
#> trustmps_m                4    68.501     6.758     0.746 
#> trustown_m                4    40.750     4.646    -0.502 
#> 
#> ----------------------------------------------------------------
#> Variance Accounting
#> ----------------------------------------------------------------
#> Eigenvalue estimate:    3.556
#> Variance explained:     46.1%

Some of these have much higher loadings than others. That means they are more reliable indicators of government (dis)trust. For example, govtrust_bsa (actual question from the BSA: “Do not trust British governments to place needs of the nation above interests of their own party?”) is a very reliable indicator, with readings in many time periods. On the other hand, bsa_MPs (actual question from the BSA: “Those we elect as MPs lose touch with people pretty quickly”) has a much lower loading, indicating it is a much less reliable indicator of (dis)trust. We can also make a plot of mood:

plot(jennings_out)

Finally, to retrieve the latent dimension and the periods, we can use the get_mood function:

ests <- get_mood(jennings_out)
ests
#>    period year month quarter     mood
#> 1       1 1985     0       0 44.38014
#> 2       2 1986     0       0 43.95388
#> 3       3 1987     0       0 43.19971
#> 4       4 1988     0       0 43.83217
#> 5       5 1989     0       0 44.14840
#> 6       6 1990     0       0 44.95633
#> 7       7 1991     0       0 45.33936
#> 8       8 1992     0       0 41.95610
#> 9       9 1993     0       0 46.64557
#> 10     10 1994     0       0 49.74351
#> 11     11 1995     0       0 56.14776
#> 12     12 1996     0       0 52.63219
#> 13     13 1997     0       0 53.61000
#> 14     14 1998     0       0 50.25706
#> 15     15 1999     0       0 44.23176
#> 16     16 2000     0       0 47.70124
#> 17     17 2001     0       0 48.64989
#> 18     18 2002     0       0 49.40556
#> 19     19 2003     0       0 51.68327
#> 20     20 2004     0       0 52.00664
#> 21     21 2005     0       0 50.90603
#> 22     22 2006     0       0 52.45453
#> 23     23 2007     0       0 51.53650
#> 24     24 2008     0       0 53.25603
#> 25     25 2009     0       0 57.93163
#> 26     26 2010     0       0 58.36906
#> 27     27 2011     0       0 58.64144
#> 28     28 2012     0       0 59.80209
#> 29     29 2013     0       0 60.12331
#> 30     30 2014     0       0 59.99269
#> 31     31 2015     0       0 55.23849
#> 32     32 2016     0       0 59.35633

Bootstrapping

Dave Armstrong added this facet of the analytical framework.

There is no inferential framework baked into Stimson’s original method. In fact, in my conversations with Jim about this point, he seemed wholly uninterested in the sampling uncertainty of the measure. Given that I think Jim has forgotten more about this stuff than I know, he may well have the right of it. That said, I think it is interesting, so I decided to write a bootstrapping algorithm to get a sense of the variability of the estimates. I would argue that we should not do a simple non-parametric bootstrap because that would change the temporal density of the data. Instead, we can use a parametric bootstrap, which allows us to get a sense of how variable the estimates are given the data that we have. The algorithm uses the following steps.

  1. Sample the number of cases as \(n^{(b)}_i \sim \text{Binomial}(\text{ncases}_i, \text{index}_i)\) for each observation in the model.
  2. Calculate the bootstrapped index value \(\text{index}_i^{(b)} = \frac{n^{(b)}_i}{\text{ncases}_i}\).
  3. Estimate mood using \(\text{index}_{i}^{(b)}\) and the original number of cases. In this step, when the data are normalized (i.e., the weighted mean and standard deviation are calculated), we use the weighted mean and standard deviation from the original data so that differences in the mean and standard deviation are not responsible for the variation.
  4. Repeat steps 1-3 for \(R\) bootstrap samples.
  5. Calculate the \(\frac{1-\text{level}}{2}\) and \(1-\frac{1-\text{level}}{2}\) quantiles of the bootstrap distribution.
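
Steps 1 and 2 amount to a parametric resample of each survey marginal with rbinom(). A minimal sketch; it assumes index is a proportion (percent values would need dividing by 100 first), and the numbers below are illustrative:

```r
set.seed(123)  # for reproducibility of the draw

# One parametric bootstrap draw of the index values (steps 1 and 2)
boot_index <- function(index, ncases) {
  n_b <- rbinom(length(index), size = ncases, prob = index)
  n_b / ncases
}

index  <- c(0.152, 0.118, 0.112)  # proportions, e.g. percent / 100
ncases <- c(1857, 1548, 1410)
boot_index(index, ncases)
```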

Here’s the bootstrapped analysis for the Jennings data.

data(jennings)
jennings_boot <- boot_dr(
  obj = jennings_out, 
  data = jennings, 
  R=50L, pw=TRUE)

The bootstrapped results (if pw=TRUE) contain two elements, estimates and pw. The estimates element contains the estimate and confidence intervals for the latent dimension. Here are the first few rows of the confidence intervals:

head(jennings_boot$estimates)
#>   period year month quarter     mood    lower    upper
#> 1      1 1985     0       0 44.38014 42.84321 45.68359
#> 2      2 1986     0       0 43.95388 42.35571 45.33672
#> 3      3 1987     0       0 43.19971 41.70016 44.17129
#> 4      4 1988     0       0 43.83217 42.56710 44.31481
#> 5      5 1989     0       0 44.14840 42.88156 44.72981
#> 6      6 1990     0       0 44.95633 43.50502 45.74109

The main use for this would be a plot like the following:

library(ggplot2)
ggplot(jennings_boot$estimates, aes(x=year, y=mood, ymin = lower, ymax=upper)) + 
  geom_ribbon( alpha=0.2) + 
  geom_line() + 
  labs(y="Mood (Distrust in Government)", x="Year") + 
  theme_bw()

If pw=TRUE, the pw element contains pairwise comparisons between all time periods. This is most useful for identifying where “significant” changes in mood happen. For example, you could look at the significant differences; we will just look at the first few here:

jennings_boot$pw %>% 
  filter(p_diff > .95) %>% 
  head()
#>   p1 p2       diff p_diff
#> 1  1  2 -0.4262574   1.00
#> 2  1  3 -1.1804240   1.00
#> 3  1  8 -2.4240390   1.00
#> 4  1  9  2.2654326   0.98
#> 5  1 10  5.3633783   1.00
#> 6  1 11 11.7676226   1.00

You could also make a plot of which differences are significant for all pairs.

library(tidyr)
#> 
#> Attaching package: 'tidyr'
#> The following object is masked from 'package:DyadRatios':
#> 
#>     extract
pwdiff <- jennings_boot$pw %>% 
  mutate(diff_sig = ifelse(p_diff > .95, diff, 0))

ggplot(pwdiff, aes(x=as.factor(p1), y=as.factor(p2), fill=diff_sig)) + 
  geom_tile(color="black") + 
  scale_fill_gradient2(low="#377eb8", mid="white", high="#e41a1c", na.value = "grey90") +
  labs(x="From", y="To", fill = "To - From") + 
  theme_classic() + 
  theme(axis.text.x = element_text(angle=90, vjust=0.5, hjust=1), 
        legend.position = "inside", 
        legend.position.inside = c(.85, .33))

In this instance, anywhere that is white indicates a non-significant difference. We can see from the plot above that there are many year-over-year changes that are statistically significant. We could print them out, arranged by size:

jennings_boot$pw %>% 
  filter(p_diff > .95 & p2 - p1 == 1) %>% 
  arrange(diff)
#>    p1 p2       diff p_diff
#> 1  14 15 -6.0253040   1.00
#> 2  30 31 -4.7541998   1.00
#> 3  11 12 -3.5155704   1.00
#> 4   7  8 -3.3832663   1.00
#> 5  13 14 -3.3529355   1.00
#> 6  20 21 -1.1006120   1.00
#> 7  22 23 -0.9180259   1.00
#> 8   1  2 -0.4262574   1.00
#> 9   4  5  0.3162299   0.98
#> 10 25 26  0.4374298   1.00
#> 11  3  4  0.6324598   0.98
#> 12 17 18  0.7556694   0.98
#> 13 16 17  0.9486429   1.00
#> 14 12 13  0.9778084   0.98
#> 15 27 28  1.1606568   1.00
#> 16 21 22  1.5484999   1.00
#> 17 23 24  1.7195283   1.00
#> 18 18 19  2.2777083   1.00
#> 19  9 10  3.0979457   1.00
#> 20 15 16  3.4694885   1.00
#> 21 31 32  4.1178444   1.00
#> 22 24 25  4.6756022   1.00
#> 23  8  9  4.6894716   1.00
#> 24 10 11  6.4042443   1.00

References

Jennings, W., N. Clarke, J. Moss, and G. Stoker (2017). “The Decline in Diffuse Support for National Politics: The Long View on Political Discontent in Britain.” Public Opinion Quarterly 81(3): 748-758.

Stimson, James A. (2018). “The Dyad Ratios Algorithm for Estimating Latent Public Opinion: Estimation, Testing, and Comparison to Other Approaches.” Bulletin of Sociological Methodology 137-138: 201-218.