


The basic idea of the chi-squared test is to measure how well observed counts fit the counts a model predicts. It is used with categorical variables, as we will see.

Example (from the textbook)

We have a list of 256 CEOs and the month each was born in.

Month  Births
Jan    23
Feb    20
Mar    18
Apr    23
May    20
Jun    19
Jul    18
Aug    21
Sep    19
Oct    22
Nov    24
Dec    29

Now, we have to wonder: Is this special at all? Since a person is equally likely to be born in any month, we expect 256/12 ≈ 21.333 people in each month. So, could these counts arise solely from randomness?

Null Hypothesis

We've seen this kind of problem before: we need a null hypothesis. Our null hypothesis is that births are uniformly distributed across the months. Our alternative hypothesis is that they aren't.


As with all of our tests, we must check certain assumptions. 

Counted Data Assumption: Data are counts for categories. This determines whether we can actually use chi-squared.

Independence Assumption: The counts in the cells should be independent of one another. Checking this mostly amounts to checking that the sample is representative of the population.

Sample Size Assumption: We should expect at least 5 counts in each cell. (Here we expect about 21.3 per month, so we're fine.)

Now we take the total deviation and compare it to the chi-squared distribution, which gives us the p-value: the chance that we'd get a distribution at least as extreme as ours, assuming that birth dates really are evenly distributed.

$$\chi^2 \text{ (or deviation) } = \sum_{\text{all cells}}\dfrac{(Obs- Exp)^2}{Exp}$$

This formula accounts for two things: negative deviations (each difference is squared) and large expected values (each term is divided by its expected count). For our particular question, the statistic is $\boxed{5.094}$.

Before we graph the distribution and read off the p-value, we have to determine the degrees of freedom. This matters because the more categories we have, the more small errors can pile up: a total deviation of 5 spread over 12 months is unremarkable, but the same total over 2 groups would be a much greater error. For a one-variable test like this, the degrees of freedom are the number of categories minus one, so 12 − 1 = 11.
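The whole calculation can be sketched in pure Python. The month counts come from the table above; the `chi2_sf` helper is a standard incomplete-gamma recurrence for odd degrees of freedom, written out here only so the sketch needs no outside libraries (it is not from the textbook):

```python
import math

# Observed birth months of the 256 CEOs (Jan..Dec), from the table above.
observed = [23, 20, 18, 23, 20, 19, 18, 21, 19, 22, 24, 29]
n = sum(observed)                 # 256
expected = n / len(observed)      # 256/12 ≈ 21.333 under the uniform null

# Chi-squared statistic: sum of (Obs - Exp)^2 / Exp over all cells.
chi2 = sum((obs - expected) ** 2 / expected for obs in observed)
df = len(observed) - 1            # 12 categories -> 11 degrees of freedom

def chi2_sf(x, k):
    """Survival function P(X > x) for a chi-squared variable with odd df k,
    via the recurrence Q(a+1, z) = Q(a, z) + z^a e^{-z} / Gamma(a+1),
    starting from Q(1/2, z) = erfc(sqrt(z))."""
    z = x / 2
    q = math.erfc(math.sqrt(z))
    a = 0.5
    while a + 1 <= k / 2:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return q

p_value = chi2_sf(chi2, df)
print(round(chi2, 3), df, round(p_value, 3))   # chi2 ≈ 5.094 with 11 df
```

If scipy is available, `scipy.stats.chisquare(observed)` returns the same statistic and p-value in one call.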
I got this from here:

Surprisingly, or perhaps not surprisingly, we get a p-value of about 93%. This means that if births really were uniformly distributed, we'd see a deviation at least as large as ours about 93% of the time. So it is very reasonable that the variation in these counts is simply due to chance.

Be careful

There is no way to prove that births really are uniformly distributed. We can only reject the null hypothesis or fail to reject it.

Test of Homogeneity

I'm not going to work a full example, but it's similar to what we did before. Say we had data from 1990, 2000, and 2010 on what high-school students did after graduating: College, Job, Military, or Travel. In our first example, every expected count was the same, 21.3. Here, the expected count in each cell comes from the combined percentage (all three years pooled) applied to each year's total. The degrees of freedom are (R-1)(C-1), so (4-1)(3-1) = 6. We find the chi^2 value the same way.
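Using made-up numbers (the post doesn't give the actual counts), the expected-count bookkeeping looks like this:

```python
# Hypothetical counts for illustration only: rows = post-HS plan,
# columns = the years 1990, 2000, 2010.
table = {
    "College":  [320, 350, 400],
    "Job":      [150, 130, 110],
    "Military": [ 40,  35,  30],
    "Travel":   [ 20,  25,  30],
}

years = 3
row_totals = {plan: sum(counts) for plan, counts in table.items()}
col_totals = [sum(counts[j] for counts in table.values()) for j in range(years)]
grand_total = sum(row_totals.values())

# Expected count for a cell: (row total) * (column total) / (grand total),
# i.e. the pooled proportion for each plan applied to each year's total.
chi2 = 0.0
for plan, counts in table.items():
    for j in range(years):
        exp = row_totals[plan] * col_totals[j] / grand_total
        chi2 += (counts[j] - exp) ** 2 / exp

df = (len(table) - 1) * (years - 1)   # (R-1)(C-1) = (4-1)(3-1) = 6
print(round(chi2, 2), df)
```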

Test of Independence

A question like this arises if you ask, "Does going to a tattoo parlor affect getting Hepatitis C?" You'd collect data on one categorical variable, Hepatitis C status (has it / doesn't), and on a second, tattoo status: (1) done in a parlor, (2) done at home, or (3) no tattoo. Under the null hypothesis, the proportion with Hepatitis C is the same in each tattoo group, and the expected counts are computed just as before. The degrees of freedom are still (R-1)(C-1) = (3-1)(2-1) = 2. In a test of homogeneity, we have a single categorical variable measured on two or more populations. In a test of independence, we have two categorical variables measured on one population. That difference affects the conclusions we can draw.
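The same machinery applies here; the counts below are invented purely for illustration:

```python
# Hypothetical data: rows = tattoo status,
# columns = [Hepatitis C, no Hepatitis C].
counts = {
    "Parlor": [17,  35],
    "Home":   [ 8,  53],
    "None":   [22, 495],
}

col_totals = [sum(row[j] for row in counts.values()) for j in range(2)]
grand_total = sum(col_totals)

chi2 = 0.0
for group, row in counts.items():
    row_total = sum(row)
    for j in range(2):
        # Under independence, every group is expected to show the
        # overall Hepatitis C rate.
        exp = row_total * col_totals[j] / grand_total
        chi2 += (row[j] - exp) ** 2 / exp

df = (len(counts) - 1) * (2 - 1)   # (3-1)(2-1) = 2
print(round(chi2, 1), df)
```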

Types of Samples and Bias