How to do a weighted T-test in R?

I have df1:

PopDens Score1 Group
93.53455 17.985288 B
137.13861 10.549394 A
35.98619 13.392857 A
89.69800 8.644537 B
16.27796 29.591635 A
25.33346 21.081301 F
89.69800 2.644537 C
46.27796 29.591635 A
25.33346 5.081301 B
36.27796 29.591635 A
1.33346 9.081301 B

I would like to perform a t-test between groups A and B, looking at the difference in mean Score1.

However, I want to weight the analysis so that rows with a larger PopDens have a stronger weight in the analysis. For example, I don't want the final row to have as much weight in the analysis as the second row because the population densities are very different.

How is this done?

2 Answers

What follows is a short summary of my thoughts and a quick search. I have never used a weighted t-test before, only weights in linear regression.

There is no single agreed definition of a weighted t-test. The difficulty is how the weights should enter the estimate of the standard error, because that is what the t statistic is built on. You can check out this discussion and maybe this paper on weights in linear regression.

So your data:

df = structure(list(PopDens = c(93.53455, 137.13861, 35.98619, 89.698,
16.27796, 25.33346, 89.698, 46.27796, 25.33346, 36.27796, 1.33346
), Score1 = c(17.985288, 10.549394, 13.392857, 8.644537, 29.591635,
21.081301, 2.644537, 29.591635, 5.081301, 29.591635, 9.081301
), Group = structure(c(2L, 1L, 1L, 2L, 1L, 4L, 3L, 1L, 2L, 1L,
2L), .Label = c("A", "B", "C", "F"), class = "factor")), class = "data.frame", row.names = c(NA,
-11L))

We subset on only A and B:

df = subset(df,Group %in% c("A","B"))

And we can compare the results of a t-test and lm:

coefficients(summary(lm(Score1 ~ Group, data = df)))
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept)  22.54343   3.653195  6.170881 0.0004580837
GroupB      -12.34532   5.479793 -2.252882 0.0589470215

t.test(df$Score1[df$Group == "B"], df$Score1[df$Group == "A"])
    Welch Two Sample t-test
data:  df$Score1[df$Group == "B"] and df$Score1[df$Group == "A"]
t = -2.404, df = 6.463, p-value = 0.05007
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -24.695931765   0.005282865
sample estimates:
mean of x mean of y
 10.19811  22.54343

The regression gives a p-value of 0.0589 for the difference of B from A, and the t-test gives 0.05007, so the two are not far apart.
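The small gap between the two p-values is most likely down to the variance assumption: lm pools the residual variance across groups, while t.test defaults to the Welch test. A minimal sketch to check this, assuming the same nine A/B rows as above (the data frame is rebuilt here so the snippet runs on its own):

```r
# The nine A/B rows from the answer's data, after subsetting
df <- data.frame(
  Score1 = c(17.985288, 10.549394, 13.392857, 8.644537, 29.591635,
             29.591635, 5.081301, 29.591635, 9.081301),
  Group  = factor(c("B", "A", "A", "B", "A", "A", "B", "A", "B"))
)

# p-value for the B vs A difference from the regression
fit  <- lm(Score1 ~ Group, data = df)
p_lm <- coef(summary(fit))["GroupB", "Pr(>|t|)"]

# Student's t-test with pooled variance, and the default Welch test
p_pooled <- t.test(Score1 ~ Group, data = df, var.equal = TRUE)$p.value
p_welch  <- t.test(Score1 ~ Group, data = df)$p.value

print(c(lm = p_lm, pooled = p_pooled, welch = p_welch))
```

If the assumption holds, the lm and pooled values agree and only the Welch value differs.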

Now for a weighted linear regression:

coefficients(summary(lm(Score1 ~ Group, data = df, weights = PopDens)))
             Estimate Std. Error    t value   Pr(>|t|)
(Intercept) 17.845885   3.780246  4.7208269 0.00215547
GroupB      -5.466244   5.727617 -0.9543663 0.37168503

You can see that the coefficients change: the estimates are pulled towards the samples with higher weights.
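To make the effect of the weights concrete: with a single grouping factor, the weighted regression coefficients should be the PopDens-weighted group means. The intercept is the weighted mean of A, and GroupB is the weighted mean of B minus that of A. A small sketch, again rebuilding the A/B rows so it runs on its own:

```r
# The nine A/B rows from the answer's data, after subsetting
df <- data.frame(
  PopDens = c(93.53455, 137.13861, 35.98619, 89.698, 16.27796,
              46.27796, 25.33346, 36.27796, 1.33346),
  Score1  = c(17.985288, 10.549394, 13.392857, 8.644537, 29.591635,
              29.591635, 5.081301, 29.591635, 9.081301),
  Group   = factor(c("B", "A", "A", "B", "A", "A", "B", "A", "B"))
)

# Weighted least squares with PopDens as the weights
wfit  <- lm(Score1 ~ Group, data = df, weights = PopDens)
coefs <- unname(coef(wfit))

# Weighted group means computed directly
is_a <- df$Group == "A"
is_b <- df$Group == "B"
wmean_a <- weighted.mean(df$Score1[is_a], df$PopDens[is_a])
wmean_b <- weighted.mean(df$Score1[is_b], df$PopDens[is_b])

print(c(intercept = coefs[1], wmean_A = wmean_a))
print(c(GroupB = coefs[2], wmean_B_minus_A = wmean_b - wmean_a))
```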

For the weighted t-test offered in package weights:

library(weights)
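# Note: this call passes Score1 itself as the weights, and the output below
# comes from this call. To weight by population density as the question asks,
# pass df$PopDens[df$Group=="A"] and df$PopDens[df$Group=="B"] instead.
wtd.t.test(x=df$Score1[df$Group=="A"], y=df$Score1[df$Group=="B"],
           weight=df$Score1[df$Group=="A"], weighty=df$Score1[df$Group=="B"],
           samedata=FALSE)
$test
[1] "Two Sample Weighted T-Test (Welch)"

$coefficients
   t.value         df    p.value
2.90701563 6.97938063 0.02283172

$additional
Difference     Mean.x     Mean.y   Std. Err
 13.468496  25.884728  12.416232   4.633101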

The weights in this function appear to be treated as frequency weights, but I am not sure. If you prefer this approach, it is worth reading the source code in detail, since the documentation does not make clear how the standard errors and related quantities are calculated.
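Since the definition is not settled, one option is to write the calculation out by hand so every choice is visible. The sketch below is only one possible definition, not what wtd.t.test does: the function name weighted_welch is hypothetical, and it assumes reliability-type weights, a weighted variance that reduces to var() under equal weights, and Kish's effective sample size in place of n. It uses PopDens as the weights.

```r
# Hypothetical helper: Welch-style test with reliability weights (an assumption)
weighted_welch <- function(x, wx, y, wy) {
  group_stats <- function(v, w) {
    m <- weighted.mean(v, w)
    # Weighted variance; equals var(v) when all weights are equal
    s2 <- sum(w * (v - m)^2) / (sum(w) - sum(w^2) / sum(w))
    n_eff <- sum(w)^2 / sum(w^2)  # Kish effective sample size
    list(mean = m, se2 = s2 / n_eff, n_eff = n_eff)
  }
  a <- group_stats(x, wx)
  b <- group_stats(y, wy)
  se <- sqrt(a$se2 + b$se2)
  t_value <- (a$mean - b$mean) / se
  # Welch-Satterthwaite degrees of freedom, using effective sample sizes
  dof <- (a$se2 + b$se2)^2 /
    (a$se2^2 / (a$n_eff - 1) + b$se2^2 / (b$n_eff - 1))
  c(t = t_value, dof = dof, p = 2 * pt(-abs(t_value), dof),
    diff = a$mean - b$mean, se = se)
}

# Scores and population densities for groups A and B from the answer's data
x  <- c(10.549394, 13.392857, 29.591635, 29.591635, 29.591635)  # Score1, A
wx <- c(137.13861, 35.98619, 16.27796, 46.27796, 36.27796)      # PopDens, A
y  <- c(17.985288, 8.644537, 5.081301, 9.081301)                # Score1, B
wy <- c(93.53455, 89.698, 25.33346, 1.33346)                    # PopDens, B

res <- weighted_welch(x, wx, y, wy)
print(res)

# Sanity check: with equal weights this should match the ordinary Welch test
unweighted <- weighted_welch(x, rep(1, length(x)), y, rep(1, length(y)))
ref <- t.test(x, y)
print(rbind(sketch = unweighted[c("t", "dof", "p")],
            t.test = c(ref$statistic, ref$parameter, ref$p.value)))
```

The equal-weights check only confirms that the sketch collapses to the standard Welch test. It says nothing about whether this weighting scheme is the right one for population densities.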

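If you have more than two groups, you could also do a weighted ANOVA with:

# aov is in the stats package, which is attached by default.
# df1 is the full, unsubsetted data frame from the question.
aov(Score1 ~ Group, data = df1, weights = PopDens)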

