# Contrasts and Linear Hypothesis Tests

In my time running the UCD Ecology Statistical Support Group and at the Davis R Users Group, one of the most common questions I get is “how do my treatment groups differ from one another?” This sort of question, usually referred to as a “contrast” falls under a more general envelope of “linear hypothesis test”. Here, I’ve put together a brief overview of hypothesis tests, contrasts, and the use of the `emmeans`

package to test them.

# Linear hypothesis tests

Linear hypothesis tests involve a *linear* combination of the components of a statistical model, usually to generate the estimated mean value of our response given some set of predictors.

In most cases, these estimated means will be:

- The response of a group in an ANOVA
- The response at some value of a predictor in regression

### What is meant by *linear combination*?

Mathematically speaking, linear operations used in hypothesis testing are either:

- Addition (X + Y)
- Scalar multiplication (cX)

### Contrasts

In particular, we are often interested in *contrasts*, which are comparisons between population means. By definition, a contrast is defined as a linear combination of means or regression coefficients where the *values of the linear combination* sum to zero:

\[ c_1\mu_1 + c_2\mu_2 + ... + c_p\mu_p\]

A comparison of two means, \(\mu_1\) and \(\mu_2\) can be expressed as \(c_1\mu_1 + c_2\mu_2\), where \(c_1 = 1\) and \(c_2 = -1\).

However, tests under the broad envelope of “linear hypothesis tests” are not just limited to contrasts.

### Constructing Linear Hypothesis Tests

To understand how we construct hypothesis tests using these components, we need to think about how most models are constructed.

Let’s consider a one-way ANOVA, comparing the estimated responses of three groups. I’ll use the `iris`

dataset here, for convenience. In this analysis, we might be interested in whether a certain feature of the irises, `Sepal.Width`

, differs across the three `Species`

.

The ANOVA model used in this case could be written as `lm(Sepal.Length ~ Species, data = iris)`

Given how R uses factor coding, the fitted model will consist of three coefficients:

- \(\beta_0\), the estimated Sepal Length of the first group, “setosa”
- \(\beta_1\), the estimated difference in Sepal Length between the second group, “versicolor”, and the first
- \(\beta_2\), the estimated difference in Sepal Length between the third group, “virginica”, and the first

Or, written mathematically:

\((\hat{\mu}_{setosa}) = \beta_0\)

\((\hat{\mu}_{versicolor}) = \beta_0 + \beta_1\)

\((\hat{\mu}_{virginica}) = \beta_0 + \beta_2\)

### What tests can we run with this data?

The most natural test to start with would be asking whether the groups differ from one another. Mathematically, this statement could be expressed as whether \(\hat \mu_{species 1} - \hat \mu_{species 2} = 0\)

Based on how this model is constructed, we can express this difference using coefficients:

\((\hat{\mu}_{setosa}) - (\hat{\mu}_{versicolor}) = (\beta_0) - (\beta_0 + \beta_1) = - \beta_1\)

\((\hat{\mu}_{setosa}) - (\hat{\mu}_{virginica}) = (\beta_0) - (\beta_0 + \beta_2) = - \beta_2\)

\((\hat{\mu}_{versicolor}) - (\hat{\mu}_{virginica}) = (\beta_0) - (\beta_0 + \beta_1) = \beta_1 - \beta_2\)

If we were to express these \(\beta\) coefficients as a vector \(\vec{\beta} = (\beta_0, \beta_1, \beta_2)\), we could write these tests as the following:

\((\hat {\mu}_{setosa}) - (\hat {\mu}_{versicolor}) = (0,1,0) * \vec{\beta}\)

\((\hat {\mu}_{setosa}) - (\hat {\mu}_{virginica}) = (0,0,1) * \vec{\beta}\)

\((\hat {\mu}_{versicolor}) - (\hat {\mu}_{virginica} = (0,1,-1) * \vec{\beta}\)

Where each hypothesis is a linear combination of coefficients, either added or subtrated from one another with a constant multiplier (1).

### How do I run these tests?

By design, some tests are baked in to `lm`

when using factor variables. The summary object will offer T-tests of whether each coefficient differs from zero, which can be used to determine whether the sepal width of the first group, setosa, differs from 0, and whether the other two groups differ from this first factor level.

```
iris_anova <- lm(Sepal.Width ~ Species, data = iris)
summary(iris_anova)
```

```
##
## Call:
## lm(formula = Sepal.Width ~ Species, data = iris)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.128 -0.228 0.026 0.226 0.972
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.42800 0.04804 71.359 < 2e-16 ***
## Speciesversicolor -0.65800 0.06794 -9.685 < 2e-16 ***
## Speciesvirginica -0.45400 0.06794 -6.683 4.54e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3397 on 147 degrees of freedom
## Multiple R-squared: 0.4008, Adjusted R-squared: 0.3926
## F-statistic: 49.16 on 2 and 147 DF, p-value: < 2.2e-16
```

`emmeans`

But how do we test the third hypothesis, whether the second and third groups differ from one another? Enter `emmeans`

– a useful package to generate “estimated marginal means” that can be used to test hypotheses.

By default, the `pairs()`

function quickly compares means between groups (identified by the `specs`

argument) in all possible combinations.

```
# Call emmeans on the model object, with one or more sets of columns to generate means for
iris_emm <- emmeans(iris_anova, specs = "Species")
# All pairwise contrasts of the "Species" column
pairs(iris_emm)
```

```
## contrast estimate SE df t.ratio p.value
## setosa - versicolor 0.658 0.0679 147 9.685 <.0001
## setosa - virginica 0.454 0.0679 147 6.683 <.0001
## versicolor - virginica -0.204 0.0679 147 -3.003 0.0088
##
## P value adjustment: tukey method for comparing a family of 3 estimates
```

But we could also do this in a more defined way, by specifying linear combinations of these estimated means, as shown before. The one twist is that `emmeans`

would like to define the linear combinations by estimated means, so the contrasts will take a “1” or “-1” for each mean that is being compared:

```
test1 <- c(1,-1,0) # mean of species 1 minus species 2
test2 <- c(1,0,-1) # mean of species 1 minus species 3
test3 <- c(0,1,-1) # mean of species 2 minus species 3
contrast(iris_emm,
method = list("1. Se - Ve" = test1,
"2. Se - Vi" = test2,
"3. Ve - Vi" = test3))
```

```
## contrast estimate SE df t.ratio p.value
## 1. Se - Ve 0.658 0.0679 147 9.685 <.0001
## 2. Se - Vi 0.454 0.0679 147 6.683 <.0001
## 3. Ve - Vi -0.204 0.0679 147 -3.003 0.0031
```

### More complicated tests

At times, we might have a specific biological hypothesis for how estimated means differ from one another, such as whether one mean is more than twice the value of another, or whether the difference between two groups is equal to a particular value.

In our fourth test, we can manipulate this vector to check whether the twice the mean sepal width of *setosa* differs from the sepal width of *virginica*.

```
test4 <- c(2, 0, -1)
contrast(iris_emm,
method = list("4. 2xSe - Vi" = test4))
```

```
## contrast estimate SE df t.ratio p.value
## 4. 2xSe - Vi 3.88 0.107 147 36.139 <.0001
```

In the fifth, we can test whether means differ by a specific value using the `offset`

argument. Here, the test asks whether *setosa* and *virginica* differ by a value of 3.5.

```
test5 <- c(0, 1, -1)
contrast(iris_emm,
method = list("5. Se - Vi" = test4),
offset = - 3.5)
```

```
## contrast estimate SE df t.ratio p.value
## 5. Se - Vi 0.382 0.107 147 3.556 0.0005
```

### P-value control

When running multiple hypothesis tests, remember that accounting for inflating family-wise error rate can be important. By default, the `pairs`

function in `emmeans`

will apply a Tukey contrast to account for all simultaneous pairwise comparisons.

`pairs(iris_emm)`

```
## contrast estimate SE df t.ratio p.value
## setosa - versicolor 0.658 0.0679 147 9.685 <.0001
## setosa - virginica 0.454 0.0679 147 6.683 <.0001
## versicolor - virginica -0.204 0.0679 147 -3.003 0.0088
##
## P value adjustment: tukey method for comparing a family of 3 estimates
```

Note that we can change this, if desired. The `adjust`

argument can apply multiple corrections, such as the *Bonferroni* correction.

```
contrast(iris_emm,
method = list("1. Se - Ve" = test1,
"2. Se - Vi" = test2,
"3. Ve - Vi" = test3),
adjust = "bonferroni")
```

```
## contrast estimate SE df t.ratio p.value
## 1. Se - Ve 0.658 0.0679 147 9.685 <.0001
## 2. Se - Vi 0.454 0.0679 147 6.683 <.0001
## 3. Ve - Vi -0.204 0.0679 147 -3.003 0.0094
##
## P value adjustment: bonferroni method for 3 tests
```

### A worked example - Frog Occupancy in Mountain Streams

`emmeans`

can handle far more complex model types than the one presented previously, including random effects models, generalized linear models, and even MCMC (Bayesian) methods.

Model parameters that consist of a mixture of numeric and categorical predictors can also be accounted for, using the same general syntax presented above (with some additional complexity). Here, we show how `emmeans`

might be used to test a number of hypotheses about frog occupancy across different mountain streams.

```
frogs <- read.csv("frogexample.csv",
header = TRUE,
stringsAsFactors = TRUE)
```

### Fitting a logistic GLMER

Terms:

- WaterDepth - Water Depth
- velo1 - Water Velocity
- substraten2 - Stream Substrate
- hab - Habitat Type
- site - Study Site (Random Effect)

```
occ_model <- glmer(occ ~ WaterDepth * velo1 + velo1 * substraten2 + hab + (1 | site),
data = frogs,
family = "binomial")
coef(occ_model)
```

```
## $site
## (Intercept) WaterDepth velo1 substraten2boul substraten2grav
## stream1 -2.6007168 -3.821035 7.286018 0.5542787 0.2701652
## stream2 -0.4757093 -3.821035 7.286018 0.5542787 0.2701652
## substraten2silt habPool habRiff WaterDepth:velo1
## stream1 1.430042 0.3435592 1.732065 1.039696
## stream2 1.430042 0.3435592 1.732065 1.039696
## velo1:substraten2boul velo1:substraten2grav velo1:substraten2silt
## stream1 -7.871831 -10.69519 -25.31216
## stream2 -7.871831 -10.69519 -25.31216
##
## attr(,"class")
## [1] "coef.mer"
```

Mathematically:

\[log( { \frac{y}{1-y}}) \sim \beta_0 + \beta_1WtrDepth + \beta_2Velo + \beta_{3-5}Substrate + \beta_{6-7}Hab + \beta_8(WaterDepth ~ * ~Velo) + \beta_{9-11}(Velo ~ * ~ Substrate) + \epsilon\]

### 1. Pairwise tests of a single factor - Do frogs prefer pools, cascades, or riffles?

\(\bar {Y}_{Pool} = \bar {Y}_{Cascade} = \bar {Y}_{Riffle}\)

\(\beta_0 = \beta_0 + \beta_{Cascade} = \beta_0 + \beta_{Riffle}\)

While we have multiple factors included in the model, individual pairwise tests of each factor individually are possible.

Note that `emmeans`

can provide results on the model scale (logit) or at the response scale (probability).

```
pairs(emmeans(occ_model,
specs = "hab"))
```

```
## contrast estimate SE df z.ratio p.value
## Casc - Pool -0.344 0.189 Inf -1.822 0.1625
## Casc - Riff -1.732 0.316 Inf -5.487 <.0001
## Pool - Riff -1.389 0.281 Inf -4.937 <.0001
##
## Results are averaged over the levels of: substraten2
## Results are given on the log odds ratio (not the response) scale.
## P value adjustment: tukey method for comparing a family of 3 estimates
```

```
pairs(emmeans(occ_model,
specs = "hab",
type = "response"))
```

```
## contrast odds.ratio SE df z.ratio p.value
## Casc / Pool 0.709 0.1338 Inf -1.822 0.1625
## Casc / Riff 0.177 0.0558 Inf -5.487 <.0001
## Pool / Riff 0.249 0.0702 Inf -4.937 <.0001
##
## Results are averaged over the levels of: substraten2
## P value adjustment: tukey method for comparing a family of 3 estimates
## Tests are performed on the log odds ratio scale
```

### 2. Interactions - Do trends in occupancy (as a function of velocity) differ between substrates?

\(\beta_0 + \beta_2 * Velo = \beta_0 + (\beta_2 + \beta_9) * Velo= \beta_0 + (\beta_2 + \beta_{10}) * Velo = \beta_0 + (\beta_2 + \beta_{11}) * Velo\)

What if we have a mix of predictors, both quantitative and categorical? `emtrends`

can be used to ask whether difference in the slope coefficient differs between groups.

```
occ_trend <- emtrends(occ_model,
specs = "substraten2", # Categorical variable
var = "velo1") # Quantitative variable
pairs(occ_trend)
```

```
## contrast estimate SE df z.ratio p.value
## bedr - boul 7.87 2.71 Inf 2.903 0.0193
## bedr - grav 10.70 2.85 Inf 3.754 0.0010
## bedr - silt 25.31 8.04 Inf 3.149 0.0089
## boul - grav 2.82 1.61 Inf 1.748 0.2987
## boul - silt 17.44 7.69 Inf 2.269 0.1054
## grav - silt 14.62 7.73 Inf 1.891 0.2317
##
## Results are averaged over the levels of: hab
## P value adjustment: tukey method for comparing a family of 4 estimates
```

`emtrends`

also provides some nice visual tools, such as `emmip`

, which returns a `ggplot`

object that can be subject to further modification, such as added standard errors

```
emmip(occ_model, substraten2 ~ velo1,
type = "response",
at = list(velo1 = seq(0, 1, by = .25))) +
geom_errorbar(aes(ymax = yvar + SE,
ymin = yvar - SE),
width = .1)
```

### 3. Multiple interactions – Substrate | Water Depth and Velocity

Tests between:

\(\beta_0 + \beta_1 * WtrDepth + \beta_2*Velo + \beta_{3-5}Substrate + \beta_8(WtrDepth ~ * ~ Velo) + \beta_{9-11}(Velo * Substrate)\)

For different substrates and values of Water Depth and Velocity

br()

In this model, we have multiple quantitative predictors that interact with categorical ones. What if we would like to compare occupancy across habitat types for a couple different stream velocities and water depths? Using the condition operator, `|`

, and the `at = list()`

assignment, we can generate estimated means and contrasts across a combination of these predictors.

```
water_comp <- emmeans(occ_model,
specs = ~ substraten2 | WaterDepth + velo1, # Alternative formulation
at = list(velo1 = c(0, 1),
WaterDepth = c(0, .2)),
type = "response")
pairs(water_comp)
```

```
## WaterDepth = 0, velo1 = 0:
## contrast odds.ratio SE df z.ratio p.value
## bedr / boul 1.00e+00 0.00e+00 Inf -2.740 0.0312
## bedr / grav 1.00e+00 0.00e+00 Inf -1.000 0.7495
## bedr / silt 0.00e+00 0.00e+00 Inf -4.622 <.0001
## boul / grav 1.00e+00 0.00e+00 Inf 1.260 0.5882
## boul / silt 0.00e+00 0.00e+00 Inf -3.191 0.0077
## grav / silt 0.00e+00 0.00e+00 Inf -4.209 0.0002
##
## WaterDepth = 0.2, velo1 = 0:
## contrast odds.ratio SE df z.ratio p.value
## bedr / boul 1.00e+00 0.00e+00 Inf -2.740 0.0312
## bedr / grav 1.00e+00 0.00e+00 Inf -1.000 0.7495
## bedr / silt 0.00e+00 0.00e+00 Inf -4.622 <.0001
## boul / grav 1.00e+00 0.00e+00 Inf 1.260 0.5882
## boul / silt 0.00e+00 0.00e+00 Inf -3.191 0.0077
## grav / silt 0.00e+00 0.00e+00 Inf -4.209 0.0002
##
## WaterDepth = 0, velo1 = 1:
## contrast odds.ratio SE df z.ratio p.value
## bedr / boul 1.51e+03 3.94e+03 Inf 2.797 0.0265
## bedr / grav 3.37e+04 9.22e+04 Inf 3.810 0.0008
## bedr / silt 2.35e+10 1.86e+11 Inf 3.023 0.0134
## boul / grav 2.20e+01 3.40e+01 Inf 2.042 0.1727
## boul / silt 1.56e+07 1.18e+08 Inf 2.189 0.1262
## grav / silt 6.99e+05 5.32e+06 Inf 1.769 0.2883
##
## WaterDepth = 0.2, velo1 = 1:
## contrast odds.ratio SE df z.ratio p.value
## bedr / boul 1.51e+03 3.94e+03 Inf 2.797 0.0265
## bedr / grav 3.37e+04 9.22e+04 Inf 3.810 0.0008
## bedr / silt 2.35e+10 1.86e+11 Inf 3.023 0.0134
## boul / grav 2.20e+01 3.40e+01 Inf 2.042 0.1727
## boul / silt 1.56e+07 1.18e+08 Inf 2.189 0.1262
## grav / silt 6.99e+05 5.32e+06 Inf 1.769 0.2883
##
## Results are averaged over the levels of: hab
## P value adjustment: tukey method for comparing a family of 4 estimates
## Tests are performed on the log odds ratio scale
```

Default plotting of this object can also be useful:

`plot(water_comp, CIs = TRUE) `