Week 5 Introduction to Hypothesis Testing Reading

Sherri Spriggs

An Introduction to Hypothesis Testing

What are you interested in learning about? Perhaps you’d like to know if there is a difference in average final grade between two different versions of a college class? Does the Fort Lewis women’s soccer team score more goals than the national Division II women’s average? Which outdoor sport do Fort Lewis students prefer the most? Do the pine trees on campus differ in mean height from the aspen trees? For all of these questions, we can collect a sample, analyze the data, then make a statistical inference based on the analysis. This means determining whether we have enough evidence to reject our null hypothesis (what was originally assumed to be true, until we prove otherwise). The process is called hypothesis testing.

Khan Academy has a really good video that introduces the hypothesis testing process: Khan Academy Hypothesis Testing. As you watch, please don’t get caught up in the calculations, because we will use SPSS to do them. We will also use SPSS p-values, instead of the Z-table referenced in the video, to make statistical decisions.

The Six-Step Process

Hypothesis testing requires very specific, detailed steps. Think of it as a mathematical lab report where you have to write out your work in a particular way. There are six steps that we will follow for ALL of the hypothesis tests that we learn this semester.

Six Step Hypothesis Testing Process
1. Research Question
2. Statistical Hypotheses
3. Decision Rule
4. Assumptions, Analysis, Graphs and Tables, Calculations
5. Statistical Decision
6. Interpretation

1. Research Question

All hypothesis tests start with a research question. This is literally a question that includes what you are trying to prove, like the examples earlier: Which outdoor sport do Fort Lewis students prefer the most? Is there sufficient evidence to show that the Fort Lewis women’s soccer team scores more goals than the national Division II women’s average?

In this step, besides literally being a question, you’ll want to include:

mention of your variable(s)
wording specific to the type of test that you’ll be conducting (mean, mean difference, relationship, pattern)
specific wording that indicates directionality (are you looking for a ‘difference’, are you looking for something to be ‘more than’ or ‘less than’ something else, or are you comparing one pattern to another?)

Example:

Consider this research question: Do the pine trees on campus differ in mean height from the aspen trees?

The wording of this research question clearly mentions the variables being studied. The independent variable is the type of tree (pine or aspen), and these trees are having their heights compared, so the dependent variable is height.
‘Mean’ is mentioned, so this indicates a test with a quantitative dependent variable.
The question also asks if the tree heights ‘differ’. This specific word indicates that the test being performed is a two-tailed (i.e. non-directional) test. More about the meaning of one/two-tailed will come later.

2. Statistical Hypotheses

A statistical hypothesis test has a null hypothesis: the status quo, what we assume to be true. Its notation is $H_0$, read as “H naught”. The alternative hypothesis, written $H_1$ or $H_A$, is what you are trying to prove (mentioned in your research question). All hypothesis tests must include a null and an alternative hypothesis. We also note which hypothesis test is being done in this step.

The notation for your statistical hypotheses will vary depending on the type of test that you’re doing. Writing statistical hypotheses is NOT the same as most scientific hypotheses. You are not writing sentences explaining what you think will happen in the study. Here is an example of what statistical hypotheses look like using the research question: Do the pine trees on campus differ in mean height from the aspen trees?

$H_0: \mu_{pine} = \mu_{aspen}$

$H_1: \mu_{pine} \ne \mu_{aspen}$

This particular notation will be explained in further detail later, but notice that there is both an $H_0$ and an $H_1$. The null hypothesis, $H_0$, is stating, in mathematical notation, that the tree height means ($\mu$) of each type of tree are assumed to be equal. The alternative hypothesis, $H_1$, is what we are trying to prove: that the tree height means are not the same, i.e. that they differ.
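For contrast with the two-tailed hypotheses above, a directional research question such as the soccer example from earlier (does the Fort Lewis team score more goals than the national average?) could be written along these lines. This is only a sketch: the subscripts FLC and national are labels assumed here for illustration, and the exact notation for each type of test will come from the later notes.

```latex
\begin{align*}
H_0 &: \mu_{FLC} = \mu_{national} \\
H_1 &: \mu_{FLC} > \mu_{national}
\end{align*}
```

The “>” in the alternative hypothesis is what makes this a one-tailed (directional) test, while the “≠” in the tree example makes that test two-tailed.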

3. Decision Rule

In this step, you state which alpha value you will use and, when appropriate, the directionality, or tail, of the test. You also write a statement: “I will reject the null hypothesis if p < alpha” (insert the actual alpha value here). Alpha is the level of significance: how willing we are to make the wrong statistical decision of rejecting a null hypothesis that is actually true. In this introductory class, it will be set to 0.05 or 0.01.

Example of a Decision Rule:

Let alpha=0.01, two-tailed. I will reject the null hypothesis if p<0.01.

4. Assumptions, Analysis and Calculations

Quite a bit goes on in this step. Assumptions for the particular hypothesis test must be checked. SPSS will be used to create appropriate graphs, statistics and test output tables. Where appropriate, calculations of the test’s effect size will also be done in this step.

All hypothesis tests have assumptions that we hope to meet. For example, tests with a quantitative dependent variable use one or more histograms to check whether the distribution is approximately normal and whether there are any obvious outliers. Each hypothesis test has different assumptions, so it is important to pay attention to the specific test’s requirements.
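In this class SPSS draws the histograms, but the idea behind the check can be sketched in a few lines of Python. The tree heights below are made-up values, and the function name `bin_counts` and the bin width of 5 are assumptions chosen only for illustration.

```python
def bin_counts(values, bin_width):
    """Count how many values fall in each bin of the given width.

    Bins are labeled by their lower edge. Empty bins between the smallest
    and largest value are kept so that gaps (possible outliers) stay visible.
    """
    lowest = (min(values) // bin_width) * bin_width
    highest = (max(values) // bin_width) * bin_width
    counts = {}
    edge = lowest
    while edge <= highest:
        counts[edge] = 0
        edge += bin_width
    for v in values:
        counts[(v // bin_width) * bin_width] += 1
    return counts


# Hypothetical tree heights in feet (not real data).
heights = [10, 11, 12, 12, 13, 13, 13, 14, 14, 15, 16, 29]

# Print a rough text histogram: one star per tree in each bin.
for edge, count in bin_counts(heights, 5).items():
    print(f"{edge:>3} to {edge + 5:<3} {'*' * count}")
```

In this made-up sample, the single value in the 25 to 30 bin, separated from the rest by an empty bin, is the kind of obvious outlier a histogram is meant to reveal.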

Required SPSS output will also depend on the test.

5. Statistical Decision

It is in Step 5 that we determine if we have enough statistical evidence to reject our null hypothesis. We will consult the SPSS p-value and compare it to our chosen alpha (from Step 3: Decision Rule).

Put very simply, the p-value is the probability that, if the null hypothesis is true, the results from another randomly selected sample will be as extreme as, or more extreme than, the results obtained from the given sample. In other words, a small p-value means that results like ours would be unlikely if chance alone were at work, as the null hypothesis assumes. Note that the p-value is not the probability that the null hypothesis is true. This concept will be discussed in much further detail in the class notes.
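One way to see the “as extreme or more extreme” idea is to count it directly. The sketch below is not the test SPSS will run for us; it is an exact permutation test, swapped in here only because it makes the definition concrete. It assumes the null hypothesis is true (tree type does not matter), tries every possible way of relabeling the trees, and reports the fraction of relabelings whose difference in means is at least as large as the one observed. The heights and the function name `exact_permutation_p_value` are hypothetical.

```python
from itertools import combinations


def exact_permutation_p_value(group_a, group_b):
    """Two-tailed p-value for a difference in means, by exact permutation.

    Under the null hypothesis the group labels are interchangeable, so every
    way of splitting the pooled data into groups of the same sizes is equally
    likely. The p-value is the share of splits whose absolute difference in
    means is as extreme as, or more extreme than, the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    as_extreme = 0
    n_splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        n_splits += 1
        sum_a = sum(pooled[i] for i in indices)
        # Small tolerance guards against floating-point rounding.
        if abs_mean_diff(sum_a) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_splits


# Hypothetical heights in feet (not real data).
pine = [10, 12, 14]
aspen = [20, 22, 24]

p_value = exact_permutation_p_value(pine, aspen)
print(f"p = {p_value:.2f}")
```

With these six made-up trees there are 20 possible relabelings, and only 2 of them produce a difference as large as the observed one, so p = 2/20 = 0.10.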

Based on this numerical comparison between the p-value and alpha, we’ll either reject or retain our null hypothesis. Note: You may NEVER ‘accept’ the null hypothesis. This is because it is impossible to prove a null hypothesis to be true, because you do not know the entire population.

Retaining the null means that you just don’t have enough evidence to support your alternative hypothesis, so you fall back to your null. (You retain the null when p is greater than or equal to alpha.)

Rejecting the null means that you did find enough evidence to support your alternative hypothesis. (You reject the null when p is less than alpha.)

Example of a Statistical Decision:

Retain the null hypothesis, because p=0.12 > alpha=0.01.

The p-value will come from SPSS output, and the alpha will have already been determined back in Step 3. You must be very careful when you compare the decimal values of the p-value and alpha. If, for example, you mistakenly think that p=0.12 < alpha=0.01, then you will make the incorrect statistical decision, which will likely lead to an incorrect interpretation of the study’s findings.
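The comparison itself is simple enough to write as a small function, which can be a handy way to double-check the decimals. This is a minimal sketch; the function name `statistical_decision` is an assumption made for illustration and is not part of SPSS.

```python
def statistical_decision(p_value, alpha):
    """Apply the decision rule: reject the null only when p < alpha."""
    if p_value < alpha:
        return "Reject the null hypothesis"
    # p greater than or equal to alpha: not enough evidence to reject.
    return "Retain the null hypothesis"


# The example from this step: p = 0.12 compared to alpha = 0.01.
print(statistical_decision(0.12, 0.01))  # prints "Retain the null hypothesis"
```

Note that a p-value exactly equal to alpha also leads to retaining the null, because the rule requires p to be strictly less than alpha.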

6. Interpretation

The interpretation is where you write up your findings. The specifics will vary depending on the type of hypothesis test you performed, but you will always include a plain English, contextual conclusion of what your study found (i.e. what it means to reject or retain the null hypothesis in that particular study). You’ll have statistics that you quote to support your decision. Some of the statistics will need to be written in APA style citation (the American Psychological Association style of citation). For some hypothesis tests, you’ll also include an interpretation of the effect size.
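As one small illustration of APA style, p-values are written without a leading zero, and very small values are reported as p < .001. The helper below is a sketch under those two conventions; the name `format_p_apa` and the choice of three decimal places are assumptions, and the notes for each test will give the exact write-up expected.

```python
def format_p_apa(p_value):
    """Format a p-value APA style: no leading zero, 'p < .001' when tiny."""
    if p_value < 0.001:
        return "p < .001"
    text = f"{p_value:.3f}"
    # Drop the leading zero, e.g. "0.120" becomes ".120".
    if text.startswith("0"):
        text = text[1:]
    return f"p = {text}"


print(format_p_apa(0.12))    # prints "p = .120"
print(format_p_apa(0.0004))  # prints "p < .001"
```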

Some hypothesis tests will also require an additional (non-parametric) test after the completion of your original test, if the test’s assumptions have not been met. These tests are also called “Post-Hoc tests”.

As previously stated, hypothesis testing is a very detailed process. Do not be concerned if you have read through all of the steps above, and have many questions (and are possibly very confused). It will take time, and a lot of practice to learn and apply these steps!

This Reading is just meant as an overview of hypothesis testing. Much more information is forthcoming in the various sets of Notes about the specifics needed in each of these steps. The Hypothesis Test Checklist (XLSX) [11KB] will be a critical resource for you to refer to during homework assignments and tests.

Student Course Learning Objectives

4. Choose, administer and interpret the correct tests based on the situation, including identification of appropriate sampling and potential errors

c. Choose the appropriate hypothesis test given a situation

d. Describe the meaning and uses of alpha and p-values

e. Write the appropriate null and alternative hypotheses, including whether the alternative should be one-sided or two-sided

f. Determine and calculate the appropriate test statistic (e.g. z-test, multiple t-tests, Chi-Square, ANOVA)

g. Determine and interpret effect sizes.

h. Interpret results of a hypothesis test

5. Use technology in the statistical analysis of data

6. Communicate in writing the results of statistical analyses of data

Attributions

Adapted from “Week 5 Introduction to Hypothesis Testing Reading” by Sherri Spriggs and Sandi Dang, licensed under CC BY-NC-SA 4.0.

License

Math 132 Introduction to Statistics Readings (Biology Version) Copyright © by Sherri Spriggs is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.