I have the following question: A random sample of size 25 from a normal distribution has mean 47 and standard deviation 7. Based on $t$-statistics, can we say that the given information supports the conjecture that the mean of the population is 42?
I'm really confused about how a $t$-statistic is used to reject or fail to reject a hypothesis. An explanation would be really helpful. Thanks!
$\endgroup$
1 Answer
$\begingroup$
Two-Sided One-Sample T-Test
I just happened to have a normal dataset with $n=25, \bar X = 47, S = 7$ in my R session window.
Are data appropriate for a t test? Here is a summary of the data, computed by R:
summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  35.18   40.78   44.83   47.00   52.35   61.34
length(x); sd(x)
[1] 25 # sample size n = 25
[1] 7 # sample standard deviation S = 7.0
stripchart(x, pch="|")
Approximately symmetrical data with no far outliers; passes Shapiro-Wilk normality test with a P-value above $0.05 = 5\%.$
shapiro.test(x)
Shapiro-Wilk normality test
data: x
W = 0.96136, p-value = 0.4423
Data are close enough to normal for a t test to be valid.
R printout for the t test. Here is output from R for a one-sample t test of $H_0: \mu = 42$ against $H_a: \mu \ne 42.$
t.test(x, mu=42)
One Sample t-test
data: x
t = 3.5714, df = 24, p-value = 0.001543
alternative hypothesis: true mean is not equal to 42
95 percent confidence interval: 44.11054 49.88946
sample estimates:
mean of x
47
Interpretation of output. The P-value is $0.0015 < 0.05 = 5\%,$ so you would reject $H_0$ at the 5% level of significance. You could also reject at the 1% level.
The output also gives a 95% confidence interval (CI) $(44.11, 49.89),$ so we can be 95% confident that the true value of $\mu$ lies in that interval, which does not contain $\mu = 42.$
One interpretation of this CI is that it is an interval of "non-rejectable" null hypotheses, based on your data.
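As a rough illustration of that interpretation, the sketch below works only from the summary statistics in the question ($n = 25,$ $\bar X = 47,$ $S = 7$) and tests a handful of candidate null values. The particular values in `mu0` are arbitrary choices made for this illustration; they are not part of the original problem.

```r
# Summary statistics from the question
n    <- 25
xbar <- 47
s    <- 7

se   <- s / sqrt(n)                 # standard error of the mean
crit <- qt(0.975, df = n - 1)       # two-sided 5% critical value
ci   <- xbar + c(-1, 1) * crit * se # 95% confidence interval

# Candidate null values (hypothetical, chosen only for illustration)
mu0    <- c(42, 44, 45, 47, 49, 50)
t_stat <- (xbar - mu0) / se
reject <- abs(t_stat) > crit               # reject H0: mu = mu0 at the 5% level?
inside <- mu0 >= ci[1] & mu0 <= ci[2]      # is mu0 inside the 95% CI?

print(data.frame(mu0, t_stat = round(t_stat, 3), reject, inside_ci = inside))
```

In every row, a null value is rejected at the 5% level exactly when it falls outside the interval.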
Details you should know about the test. @PeterForeman has shown you how to compute the T-statistic. Except for the P-value, you should be able to reproduce everything else in the output by hand computation.
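For reference, here is a sketch of that hand computation from the summary statistics, using the usual one-sample formulas. The symbols $\mu_0 = 42$ (the hypothesized mean) and $t^* = 2.064$ (the 0.975 quantile of Student's t distribution with 24 DF, from a printed table) are notation introduced here.

$$T = \frac{\bar X - \mu_0}{S/\sqrt{n}} = \frac{47 - 42}{7/\sqrt{25}} = \frac{5}{1.4} \approx 3.5714, \qquad \text{DF} = n - 1 = 24.$$

$$\bar X \pm t^* \frac{S}{\sqrt{n}} = 47 \pm 2.064(1.4) \approx 47 \pm 2.89, \quad \text{giving } (44.11,\, 49.89).$$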
Exact P-values are given in computer printouts. By looking at a printed table of t, you should be able to 'bracket' the P-value. For example, my table has values 3.467 and 3.745 on line DF = 24, which bracket the T-statistic 3.5714. Looking at the top margin of my table, I see that the P-value must be between $2(0.0005) = 0.001$ and $2(0.001) = 0.002,$ which agrees with the value from R. [The 2s are because this is a 2-sided t test.]
You can get the exact P-value of this 2-sided test in R or other statistical software. It is the probability of a T statistic farther from $0$ than the observed $T = 3.5714.$ In R, where pt is the CDF of Student's t distribution, the following computation gets you very close to the P-value in the printout. (If the value of the reported T statistic is rounded, then the P-value may not match exactly, but only the first couple of decimal places matter for decision making.)
2 * (1 - pt(3.5714, 24))
[1] 0.001543522
To answer one of your questions in comments: From the printed t table, you can say that a critical value for rejecting at the 5% level is $c = 2.064.$ That is, you would reject at the 5% level if $|T| > 2.064,$ which it is.
The critical value cuts probability $0.025 = 2.5\%$ from the upper tail of Student's t distribution with DF = 24. In R, where qt is a quantile function (inverse CDF), you can get the 5% critical value as shown below. What is the critical value for a test at the 1% level of significance?
qt(.975, 24)
[1] 2.063899
Graphical summary. The figure below shows the density function of Student's t distribution with 24 DF. The vertical blue line shows the observed value of the T-statistic. The P-value is twice the area under the curve to the right of this line. Lower and upper critical values for a test at the 5% level are shown by vertical dotted orange lines; red lines (farther out) for a test at the 1% level.
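A plot along the lines of this description could be drawn in base R roughly as follows. This is a sketch, not the code that produced the original figure: the axis range, line widths, title, and the dotted style of the red lines are guesses.

```r
dof   <- 24        # degrees of freedom, n - 1
t_obs <- 3.5714    # observed T-statistic from the t.test output

crit_5 <- qt(0.975, dof)   # 5% two-sided critical value
crit_1 <- qt(0.995, dof)   # 1% two-sided critical value

# Density of Student's t distribution with 24 DF
curve(dt(x, dof), from = -4.5, to = 4.5, lwd = 2,
      xlab = "t", ylab = "Density",
      main = "Student's t density, DF = 24")
abline(h = 0, col = "gray")

# Observed statistic (blue) and critical values (orange: 5%, red: 1%)
abline(v = t_obs, col = "blue", lwd = 2)
abline(v = c(-crit_5, crit_5), col = "orange", lty = "dotted", lwd = 2)
abline(v = c(-crit_1, crit_1), col = "red", lty = "dotted")

# Two-sided P-value: twice the area to the right of the blue line
p_value <- 2 * pt(t_obs, dof, lower.tail = FALSE)
```

The blue line lies beyond both the orange and the red upper critical values, which matches rejecting at both the 5% and 1% levels.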
$\endgroup$