The paired-sample *t*-test (also called the dependent-samples
*t*-test) compares means between two sets of observations that
are NOT independent. For example, when you collect pre-test and
post-test scores from the same group of students, each pair of pre-test and
post-test scores comes from the same person (i.e., a within-subject
design). Individuals with high pre-test scores are also likely to get
high post-test scores, and those with low pre-test scores are likely
to have low post-test scores. In other words, the pre-test and post-test
scores are correlated; they are NOT independent. Therefore, the
assumption of *independence of observations* is violated, and we cannot
simply compare the pre- and post-test means as if they came from two
separate groups. Instead, we need to
*pair up* the pre- and post-test scores and calculate a
difference (*D*) score for each person.

Let’s consider an example.

```
p_id <- c(1:10) # participant's ID
pre <- c(3, 5, 8, 8, 6, 9, 2, 3, 7, 5) #pre-test scores
post <- c(5, 8, 9, 10, 5, 9, 4, 2, 10, 7) #post-test scores
dat <- data.frame(p_id, pre, post) # create a data frame
dat
```

```
## p_id pre post
## 1 1 3 5
## 2 2 5 8
## 3 3 8 9
## 4 4 8 10
## 5 5 6 5
## 6 6 9 9
## 7 7 2 4
## 8 8 3 2
## 9 9 7 10
## 10 10 5 7
```
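
The claim that pre- and post-test scores move together can be checked directly on the example data. A minimal sketch using base R's `cor()`, with the two score vectors re-entered so the block runs on its own (the name `r_pre_post` is just an illustrative choice):

```r
# Example scores from the data frame above
pre  <- c(3, 5, 8, 8, 6, 9, 2, 3, 7, 5)
post <- c(5, 8, 9, 10, 5, 9, 4, 2, 10, 7)

# Pearson correlation between the paired scores
r_pre_post <- cor(pre, post)
r_pre_post
```

For these data the correlation is strongly positive (about 0.84), which is why the two sets of scores cannot be treated as independent samples.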

We calculate the difference score (*D* = post - pre) for each
person.

```
dat$D <- dat$post - dat$pre # calculate D score for each participant
dat
```

```
## p_id pre post D
## 1 1 3 5 2
## 2 2 5 8 3
## 3 3 8 9 1
## 4 4 8 10 2
## 5 5 6 5 -1
## 6 6 9 9 0
## 7 7 2 4 2
## 8 8 3 2 -1
## 9 9 7 10 3
## 10 10 5 7 2
```

```
library(Hmisc) # the output below is from Hmisc's describe(), not psych's
describe(dat$D)
```

```
## dat$D
## n missing distinct Info Mean Gmd
## 10 0 5 0.927 1.3 1.711
##
## lowest : -1 0 1 2 3, highest: -1 0 1 2 3
##
## Value -1 0 1 2 3
## Frequency 2 1 1 4 2
## Proportion 0.2 0.1 0.1 0.4 0.2
```

On average, the scores went up by 1.3 points (*SD* =
1.49). We would like to test whether this increase is
statistically significant. The null hypothesis for this test is that
the pre- and post-test scores do not differ in the population, i.e., *post* -
*pre* = *D* = 0. Therefore, \[H_0: \mu_d = 0\].
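
The mean and standard deviation of *D* quoted above can also be obtained directly with base R. A minimal sketch, rebuilding the difference scores so it runs on its own (`mean_D` and `sd_D` are illustrative names):

```r
# Difference scores (post - pre) for the ten participants
pre  <- c(3, 5, 8, 8, 6, 9, 2, 3, 7, 5)
post <- c(5, 8, 9, 10, 5, 9, 4, 2, 10, 7)
D <- post - pre

mean_D <- mean(D) # average change
sd_D   <- sd(D)   # sample standard deviation (n - 1 in the denominator)
c(mean = mean_D, sd = sd_D)
```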

At this point, we only need to test whether one variable, *D*,
differs from zero, which is a one-sample *t*-test on the difference scores. Therefore, we can use the formula \[\begin{aligned}
t &= \frac{\bar{D} - \mu_d}{SE_D} \\
&= \frac{\bar{D} - 0}{SE_D} \\
&=\frac{\bar{D}}{SE_D}
\end{aligned}\].

We will need \(\bar{D}\) and its standard error. Recall that \(SE = SD/\sqrt{N}\).

```
n <- nrow(dat) # get sample size N
n
```

`## [1] 10`

```
se_D <- sd(dat$D)/sqrt(n) # calculate SE
se_D
```

`## [1] 0.4725816`

```
t <- mean(dat$D)/se_D # calculate t
t
```

`## [1] 2.750848`

Now we have the *t* value. To test whether it is significant,
we compare it to \(t_{critical}\) at the corresponding
*df*, which is *N* - 1 = 9. You can do this by looking up
a *t* table or using a *t*-test calculator on the
internet.
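
The lookup can also be done in R itself. The sketch below uses the base functions `qt()` and `pt()`; the two-tailed test and the significance level of .05 are assumptions made for illustration, and `t_obs`, `t_crit`, and `p_value` are illustrative names.

```r
t_obs <- 2.750848 # t value computed above
df    <- 9        # N - 1
alpha <- 0.05     # assumed significance level (two-tailed)

# Critical value: the t that cuts off alpha/2 in the upper tail
t_crit <- qt(1 - alpha / 2, df)

# Two-tailed p-value: probability of a |t| at least this large under H0
p_value <- 2 * pt(-abs(t_obs), df)

t_crit
p_value
abs(t_obs) > t_crit # TRUE means we reject H0 at the chosen alpha
```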

`t.test()`

Base R provides the `t.test()` function to make it
easier for us to conduct *t*-tests. For a paired *t*-test, you would
use `t.test(score1, score2, paired = TRUE)`. The function
will subtract `score2` from `score1` (i.e.,
`score1` - `score2`). Therefore, it makes
sense to put the post-test in the `score1` position.

`t.test(dat$post, dat$pre, paired = TRUE)`

```
##
## Paired t-test
##
## data: dat$post and dat$pre
## t = 2.7508, df = 9, p-value = 0.02245
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 0.2309462 2.3690538
## sample estimates:
## mean difference
## 1.3
```

The output includes the *t* value, its degrees of freedom
(*df*), and the *p*-value to help determine whether the mean difference is
statistically significantly different from 0. The function also gives
us the 95% CI for the mean of the difference scores, [0.23, 2.37].
Intervals constructed this way capture the population mean difference
in 95% of repeated samples. Note that the 95% CI does not include zero.
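
The confidence interval reported by `t.test()` can be reproduced by hand as \(\bar{D} \pm t_{critical} \times SE_D\). A minimal sketch under that formula, rebuilding the difference scores so it runs on its own (`ci` is an illustrative name):

```r
pre  <- c(3, 5, 8, 8, 6, 9, 2, 3, 7, 5)
post <- c(5, 8, 9, 10, 5, 9, 4, 2, 10, 7)
D <- post - pre

n      <- length(D)
se_D   <- sd(D) / sqrt(n)      # standard error of the mean difference
t_crit <- qt(0.975, df = n - 1) # two-tailed critical value for a 95% CI

# Lower and upper limits of the 95% CI
ci <- mean(D) + c(-1, 1) * t_crit * se_D
ci
```

The two limits should agree with the interval printed by `t.test()` above.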

Looking at the *p*-value and the 95% CI, we conclude that the
mean difference between post- and pre-test was greater than zero. **That
is, the post-test scores were significantly higher than the pre-test scores.**

**Note**: The paired-sample *t*-test is not limited
to pre- vs. post-testing. It can be used with other dependent samples.
For example, when we are studying twins, their genetics, personality,
childhood background, etc. are not independent. We could use a
paired-sample *t*-test to test, for example, whether older twins
are more responsible than younger twins.

When each person is represented in one row and each repeated
observation is recorded as a new variable, such as `pre` and
`post`, the dataset grows in columns, or *width*, with
repeated observations. We call data organized this way
*wide*-format data.

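```
p_id <- c(1:10) # participant's ID
pre <- c(3, 5, 8, 8, 6, 9, 2, 3, 7, 5) # pre-test scores (first time point)
post <- c(5, 8, 9, 10, 5, 9, 4, 2, 10, 7) # post-test scores (second time point)
dat2 <- data.frame(p_id, pre, post) # wide format: one row per participant
dat2
```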

```
## p_id pre post
## 1 1 3 5
## 2 2 5 8
## 3 3 8 9
## 4 4 8 10
## 5 5 6 5
## 6 6 9 9
## 7 7 2 4
## 8 8 3 2
## 9 9 7 10
## 10 10 5 7
```

`t.test(dat2$post, dat2$pre, paired = TRUE)`

```
##
## Paired t-test
##
## data: dat2$post and dat2$pre
## t = 2.7508, df = 9, p-value = 0.02245
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 0.2309462 2.3690538
## sample estimates:
## mean difference
## 1.3
```

However, there is another way to organize the data, where each
observation is recorded as a row. Each repeated observation creates a
new row. In this scheme, we need variables to identify
*when* each score was observed and whom it belongs to.

```
ob_id <- c(1:20) # observation ID
p_id <- c(1:10, 1:10) # participant ID
score <- c(3, 5, 8, 8, 6, 9, 2, 3, 7, 5, 5, 8, 9, 10, 5, 9, 4, 2, 10, 7) # all test scores.
time <- c(rep("pre", 10), rep("post", 10)) # First 10 were pre-test, last 10 were post-test.
dat2.long <- data.frame(ob_id, p_id, time, score)
knitr::kable(dat2.long)
```

| ob_id | p_id | time | score |
|---|---|---|---|
| 1 | 1 | pre | 3 |
| 2 | 2 | pre | 5 |
| 3 | 3 | pre | 8 |
| 4 | 4 | pre | 8 |
| 5 | 5 | pre | 6 |
| 6 | 6 | pre | 9 |
| 7 | 7 | pre | 2 |
| 8 | 8 | pre | 3 |
| 9 | 9 | pre | 7 |
| 10 | 10 | pre | 5 |
| 11 | 1 | post | 5 |
| 12 | 2 | post | 8 |
| 13 | 3 | post | 9 |
| 14 | 4 | post | 10 |
| 15 | 5 | post | 5 |
| 16 | 6 | post | 9 |
| 17 | 7 | post | 4 |
| 18 | 8 | post | 2 |
| 19 | 9 | post | 10 |
| 20 | 10 | post | 7 |

In this long-format data, the first 10 rows are the pre-test
scores, and the last 10 rows are the post-test scores. The
`time` column identifies when the observation
happened. The `p_id` column identifies which observations belong
to which participant. Repeated observations make the data grow in rows,
or *length*. Hence, it is called a *long* format.
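
Instead of typing the long data by hand, wide data can be converted to long format. The sketch below is one way to do it with base R's `reshape()`; the object name `dat_long` and the choice of `reshape()` over other reshaping tools are ours, not part of the original example.

```r
# Wide data: one row per participant
p_id <- 1:10
pre  <- c(3, 5, 8, 8, 6, 9, 2, 3, 7, 5)
post <- c(5, 8, 9, 10, 5, 9, 4, 2, 10, 7)
dat_wide <- data.frame(p_id, pre, post)

# Stack the pre and post columns into a single "score" column
dat_long <- reshape(
  dat_wide,
  direction = "long",
  varying   = c("pre", "post"), # columns holding the repeated measures
  v.names   = "score",          # name of the stacked score column
  timevar   = "time",           # new column recording the time point
  times     = c("pre", "post"), # labels written into the time column
  idvar     = "p_id"            # identifies the participant
)
rownames(dat_long) <- NULL # drop the "1.pre"-style row names
dat_long
```

The result has 20 rows, one per observation, with `p_id`, `time`, and `score` columns.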

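You can also use `t.test()` even if your data is in a long
format. However, you will need to change the code to
`t.test(y ~ x, data, paired = TRUE)`, where `y` is
your dependent variable and `x` is your independent variable
(i.e., the testing time: pre vs. post). The difference is computed as the
first factor level minus the second; because factor levels sort
alphabetically by default, here it is post - pre.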

```
dat2.long$time <- factor(dat2.long$time) #convert time into a factor
t.test(score ~ time, data = dat2.long, paired = TRUE)
```

```
##
## Paired t-test
##
## data: score by time
## t = 2.7508, df = 9, p-value = 0.02245
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 0.2309462 2.3690538
## sample estimates:
## mean difference
## 1.3
```
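
One caveat, offered as our understanding rather than something shown above: recent R releases (4.4.0 and later, as far as we are aware) reject `paired = TRUE` in the formula method of `t.test()`. A version-independent sketch is to pull the two sets of scores out of the long data and pass them as two vectors. The sketch assumes every participant has exactly one `pre` and one `post` row; `long_sorted`, `pre_scores`, `post_scores`, and `res` are illustrative names.

```r
# Rebuild the long-format data so the block runs on its own
dat2.long <- data.frame(
  p_id  = c(1:10, 1:10),
  time  = c(rep("pre", 10), rep("post", 10)),
  score = c(3, 5, 8, 8, 6, 9, 2, 3, 7, 5, 5, 8, 9, 10, 5, 9, 4, 2, 10, 7)
)

# Sort by participant so the i-th pre score pairs with the i-th post score
long_sorted <- dat2.long[order(dat2.long$p_id), ]
pre_scores  <- long_sorted$score[long_sorted$time == "pre"]
post_scores <- long_sorted$score[long_sorted$time == "post"]

# Same paired t-test as before, computed as post - pre
res <- t.test(post_scores, pre_scores, paired = TRUE)
res
```

Sorting by `p_id` first matters: the pairing is positional, so the two vectors must list participants in the same order.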

The effect size can be calculated with Cohen’s \(d = \frac{\bar{D}}{s_D}\), the mean difference divided by the standard deviation of the difference scores.

```
# Let's go back to the "dat" dataset.
d.manual <- mean(dat$D)/sd(dat$D)
d.manual
```

`## [1] 0.8698945`
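
Because \(t = \bar{D}/(s_D/\sqrt{N})\), this version of Cohen’s *d* can also be recovered from the *t* value as \(d = t/\sqrt{N}\). A short sketch checking that the two routes agree on the example data (`d_direct` and `d_from_t` are illustrative names):

```r
pre  <- c(3, 5, 8, 8, 6, 9, 2, 3, 7, 5)
post <- c(5, 8, 9, 10, 5, 9, 4, 2, 10, 7)
D <- post - pre
n <- length(D)

d_direct <- mean(D) / sd(D)             # definition used above
t_obs    <- mean(D) / (sd(D) / sqrt(n)) # paired t statistic
d_from_t <- t_obs / sqrt(n)             # same quantity, derived from t

c(direct = d_direct, from_t = d_from_t)
```

This is handy when a report gives only *t* and *N* for a paired design.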

`effectsize` package

The `effectsize` package provides a function to calculate
Cohen’s *d*.

```
library(effectsize)
effectsize::cohens_d(dat$post, dat$pre, paired = TRUE)
```

```
## Cohen's d | 95% CI
## ------------------------
## 0.87 | [0.12, 1.59]
```

```
# OR save a t-test as an R object and put it into the function.
mypair_t.test <- t.test(dat$post, dat$pre, paired = TRUE)
effectsize::cohens_d(mypair_t.test)
```

```
## Cohen's d | 95% CI
## ------------------------
## 0.87 | [0.12, 1.59]
```

Copyright © 2022 Kris Ariyabuddhiphongs