Wednesday, March 28, 2012
Assignment 26
Some things to note:
So the equation is:
p-hat +/- z* sqrt( p-hat*(1 - p-hat) / n )
So we can see that this equation follows what we are used to. P-hat is like x-bar: the sample proportion. We know how to find z*- it hasn't changed! The next portion is just the standard error- just like we are used to!
Take note, though: I called it the standard error. Can you tell why? It's because we are using p-hat, not p0, since we do not have a hypothesis.
The confidence interval conclusion has not changed. (BE CAREFUL: We aren't talking about the true mean any more! Remember the proper parameter- check question 1 or last assignment for help.)
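If it helps to see the interval formula as a calculation, here is a minimal Python sketch. The function name and the numbers (120 successes out of 300, z* = 1.960 for 95% confidence) are made up for illustration; they are not from the assignment.

```python
from math import sqrt

def proportion_ci(p_hat, n, z_star):
    """Return (low, high) for p-hat +/- z* sqrt(p-hat*(1 - p-hat)/n)."""
    # Standard error: built from p-hat because there is no hypothesized p0
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

# Made-up example: 120 successes in a sample of 300, 95% confidence
low, high = proportion_ci(120 / 300, 300, 1.960)
print(f"({low:.3f}, {high:.3f})")
```

You would still check conditions and write the conclusion about the true proportion yourself.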
Question 5: Here's a hint: Whenever the question is "Why can't we" it means CHECK CONDITIONS!!
Question 6: Check out the equation on the right hand side of the equation sheet, in the row right under the heading "proportions".
Questions 9-12: We will be doing questions 13-16 in class, so these questions will be easier to answer after that!
Good Luck!
-Hillary
Wednesday, March 21, 2012
Exam 3 Review
What makes it challenging? It is a LOT of interpretation. Most people are comfortable with all the calculations. We test you on your understanding of the concepts and definitions.
Basically, this means know your definitions. Know them in and out. Know how to interpret them and recognize them.
Main Topics:
1. Tests of Significance
2. Confidence Interval Estimations
a. For both 1&2, need to know t and z tests, four step process
3. ANOVA
Side Topics
1. Type I/Type II errors
2. What type of procedure is this?
3. Two-sided confidence intervals
4. Sample size
5. Symbols
1. Tests of Significance
Definitions to KNOW:
- Test of significance: An outcome that is unlikely to happen if a claim is true is good evidence that the claim is not true. (This is the theory of a test of significance. Remember the coin example?)
- p-value: The probability of getting an x-bar as extreme or more extreme if the null hypothesis is true. (KNOW THIS. You will need to be able to INTERPRET this as well. Meaning if I give you an actual situation, you could put numbers into the right locations. You can see previous posts for a more in depth explanation of this).
- Parameter: The mean of what you are finding out about the population. Okay, so this isn't really a definition, but be comfortable writing parameters. (Remember we need the MEAN and the POPULATION).
- Null/Alt Hypothesis: Null Hypothesis: Statement of no change. Alternative: What we want to prove.
Obviously you need to be comfortable with every part of the four step process for a test of significance (for both t and z tests).
Write the parameter, null and alternative hypothesis and state the level of significance. I've talked about these previously, but make sure you can do them.
Conditions
For a Z-test the conditions are (and are met by):
1. Randomization: Met through SRS OR RAT.
2. Normality: Met through CLT OR graph displaying approximate normality.
3. Sigma is known: Yes or no. They give it to you or they don't.
For a t-test conditions are (and are met by):
1. Randomization: Met through SRS OR RAT.
2. Normality: Met through CLT OR graph displaying no extreme skewness or outliers.
For a z-test, we use the equation z = (x-bar - mu) / (sigma / sqrt(n)). This is called the test statistic. We then go to the z-table and get a value for the p-value.
Things to remember about how to find z-test p-values:
- One-sided test with Ha: Mu<#: Read p-value directly off table.
- One-sided test with Ha: Mu>#: 1-table value = P-value.
- Two-sided test with x-bar < null hypothesis mu: 2*(table value)= p-value.
- Two-sided test with x-bar > null hypothesis mu: 2*(1-table value)=p-value.
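The four table rules above can be sketched in Python. This is only an illustration: `NormalDist().cdf(z)` from the standard library stands in for the z-table (area to the left of z), and the function name and sample numbers are made up.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alternative):
    """Return (z, p_value). alternative is 'less', 'greater', or 'two-sided'."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    table_value = NormalDist().cdf(z)  # area to the left of z, like the z-table
    if alternative == "less":
        p_value = table_value                 # read directly off the table
    elif alternative == "greater":
        p_value = 1 - table_value             # 1 - table value
    elif alternative == "two-sided":
        # 2*(table value) when x-bar < mu0, 2*(1 - table value) when x-bar > mu0
        p_value = 2 * min(table_value, 1 - table_value)
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value

# Made-up numbers: x-bar = 52, null mu = 50, sigma = 4, n = 16
z, p = z_test(52, 50, 4, 16, "greater")
print(round(z, 2), round(p, 4))
```

Drawing the curve and shading the area is still the best way to check which rule applies.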
For a t-test, we use the equation t = (x-bar - mu) / (s / sqrt(n)). This is called the t test statistic. We then go to the t-table and get a value for p-value.
We like the t-table because it already accounts for whether the test is one-sided or two-sided, and whether Ha is greater than or less than. The basic process to find the p-value is as follows:
- Take your t test-statistic
- Find your degrees of freedom (df) (n-1)
- Enter the table on your df row.
- Find the two values that sandwich your t test stat.
- Follow those two columns down to the bottom.
- Decide if you have a one-sided or two-sided test
- Read the two p-value values off
- Say "Number on right < P-value < Number on left"
Conclude
- Compare p-value with alpha
- Reject/Fail to Reject Null (p-value < alpha, reject; p-value > alpha, fail to reject).
- Conclude in context.
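The compare-with-alpha step is simple enough to write as a tiny Python sketch. The function name is made up, and the context sentence is still something you have to write yourself.

```python
def decide(p_value, alpha):
    """Compare the p-value with alpha and return the decision about Ho."""
    if p_value < alpha:
        return "reject Ho"
    return "fail to reject Ho"

# Made-up p-values, both compared with alpha = 0.05
print(decide(0.03, 0.05))
print(decide(0.20, 0.05))
```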
2. Confidence Interval Estimation
Definitions to KNOW:
- What is a confidence interval: It is used to estimate the mean. Gives reasonable values for the mean, etc.
- Confidence Level: If the procedure were repeated many times, confidence level is the percentage of INTERVALS we would expect to contain the true mean. (This is an important one. Realize what confidence level is NOT: It is NOT how often our interval will contain mu, or x-bar or the percentage of time we are right).
- Margin of Error: the amount we expect our mean (mu) to differ from our sample mean (x-bar).
Procedure
Write the parameter, choose confidence level. Conditions are the same as for test of hypothesis.
Z confidence interval
Use equation x-bar +/- z* (sigma/sqrt(n)).
Finding z*
- Go to the t-table
- Find the row with your confidence level in it (top of chart)
- Follow it down to the third row from the bottom labeled "z*".
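As a cross-check on the table lookup, here is a hedged Python sketch of the z interval. It computes z* with the standard library's `NormalDist().inv_cdf` instead of reading the z* row; the function names and numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value z* for a two-sided interval, e.g. 0.95 gives about 1.960."""
    tail = (1 - confidence_level) / 2      # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)

def z_interval(x_bar, sigma, n, confidence_level):
    """x-bar +/- z* (sigma/sqrt(n)); sigma must be the KNOWN population value."""
    margin = z_star(confidence_level) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Made-up numbers: x-bar = 50, sigma = 4, n = 16, 95% confidence
low, high = z_interval(50, 4, 16, 0.95)
print(round(z_star(0.95), 3), round(low, 2), round(high, 2))
```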
t confidence interval
Use equation x-bar +/- t* (s/sqrt(n))
Finding t*
- Find degrees of freedom
- Go to where your df and your confidence level intersect
- That is your t*
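The t interval works the same way once you have t*. In this minimal Python sketch, t* is passed in by hand because you read it off the t-table; the function name and numbers are made up (t* = 2.064 is the table value for df = 24 at 95% confidence).

```python
from math import sqrt

def t_interval(x_bar, s, n, t_star):
    """x-bar +/- t* (s/sqrt(n)), with t* read from the t-table at df = n - 1."""
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Made-up numbers: x-bar = 20, s = 5, n = 25, so df = 24 and t* = 2.064 at 95%
low, high = t_interval(20, 5, 25, 2.064)
print(round(low, 3), round(high, 3))
```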
Conclude
Use the cookie-cutter answer to conclude for confidence intervals.
"We are _______% confident that the true mean _____________ lies between (_____,_____)"
a. Difference between Z and t tests and four step process
We basically stepped through the four step process for both confidence intervals and tests of significance above. Wouldn't hurt to go over the different sections, though.
Remember, we use a z-test if sigma (population standard deviation) is KNOWN, and we use a t-test if sigma is UNKNOWN.
For multiple choice, it's helpful to really remember whether you are using a t or z test. Remember, a z-test will give you a one-number p-value. A t-test will give you a range of values. Keep this in mind for your answers.
It may help you to remember the differences between the distributions (we talked about this briefly in class, check your notes). For example, the t-distribution has more area in the tails (less precise). As n increases, the t-distribution becomes more like the z-distribution in shape.
3. ANOVA (Analysis of Variance)
Anova is all about reading the output. Be sure you know how to:
- Write Hypothesis
- Find the p-value
- check conditions
- Conclude IN CONTEXT based on the confidence intervals
- Go over assignment 21. It was there for a reason.
- Know how to do the four step for both two-sample and matched pairs. (this includes things like, how does the parameter change? When do you do each one? What do you graph? What equations do you use? We went over all this in class).
Monday, March 19, 2012
Assignment 24
Assignment 23
Thursday, March 15, 2012
Assignment 22
Something I forgot to mention in class is that we call sigma/sqrt(n) the standard deviation of x-bar.
Well s/sqrt (n) = standard error.
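To see the difference side by side, here is a small Python sketch. The sample values and the "known" sigma of 2 are made up for illustration.

```python
from math import sqrt
from statistics import stdev

def sd_of_x_bar(sigma, n):
    """sigma/sqrt(n): uses the KNOWN population standard deviation."""
    return sigma / sqrt(n)

def standard_error(sample):
    """s/sqrt(n): uses the sample standard deviation s as an estimate."""
    return stdev(sample) / sqrt(len(sample))

sample = [4, 8, 6, 5, 7]                   # made-up data
print(round(sd_of_x_bar(2, len(sample)), 4))
print(round(standard_error(sample), 4))
```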
Questions 2-5 are just a simple one-sample t-test, so that should be simple enough.
Questions 9-15: What kind of procedure is this?
In this case, they didn't necessarily give you the differences, but you can clearly see that each student has TWO measurements: two thighs were hit with tennis balls. (And of course, I gave you this example in class, so that might have helped :) )
Be careful on calculating the t-test statistic. Remember we only care about the differences. You need to use statcrunch to do this.
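Here is a rough Python sketch of what the software is doing for matched pairs: take the differences first, then run a one-sample t on them. The data and function name are made up, and the null hypothesis is assumed to be a mean difference of 0.

```python
from math import sqrt
from statistics import mean, stdev

def matched_pairs_t(first, second):
    """Return (t, df) for Ho: mean difference = 0, using only the differences."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1

# Made-up data: two measurements on each of four subjects
first = [10, 12, 9, 11]
second = [8, 11, 7, 10]
t, df = matched_pairs_t(first, second)
print(round(t, 3), df)
```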
Good Luck!
Sunday, March 11, 2012
Assignment 21 + t-test Practice
A common mistake on this one is the "list and state how the conditions are met". This means you must STATE the condition (like Normality:) then after the colon, state how it is met for this particular problem.
Remember how the conditions change slightly for a t-test.
On part B, be careful with the t-test confidence interval: we are using t*, not Z*, which I think is a common mistake.
Okay, now onto t-test practice so you can actually solve the problems!
Let's do an example!
Hillary thinks that the statistics department isn't correctly stating the actual amount of late-fee money they receive from Stat 121 students. They claim that on average each test gives them 5,000 dollars. Hillary takes a simple random sample of 10 different testing periods over the last five years and gets a mean of 5,800 dollars and a standard deviation of 750 dollars. Alpha = 0.05. Assume test fees are normally distributed.
STATE: Is the true mean income earned by Stat 121 late fees greater than 5,000 dollars?
Okay. So there are a few things we notice here off the bat. First where is the standard deviation from? It says in the problem it is from the sample, meaning that we know S, not sigma. This means we will be doing a t-test. Also, the STATE lets us know what our hypothesis will end up being (greater than).
For the sake of this problem, we aren't going to go through the entire Plan or Solve steps, only because the point of this problem is to help you learn how to use the t-table.
Ho: Mu=5000
Ha: Mu > 5000
t = (5800 - 5000) / (750 / sqrt(10)) = 3.37
Now we go to the t chart. We need one more thing though before we use it: degrees of freedom. Remember, df= n-1.
So in this case, df= 10-1 = 9.
Go to the df = 9 row in the t-table. Find the two values that sandwich our t value.
I see the t* values 3.250 and 3.690, so the one-sided p-value is between 0.0025 and 0.005.
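Here is a hedged Python sketch of the same table lookup for this example. The df = 9 row is typed in by hand from a standard t-table (one-sided tail probabilities), so treat it as an assumption and check it against your own table; the function names are made up.

```python
from math import sqrt

# (one-sided tail probability, t*) pairs from the df = 9 row of a standard t-table
DF9_ROW = [(0.05, 1.833), (0.025, 2.262), (0.01, 2.821), (0.005, 3.250),
           (0.0025, 3.690), (0.001, 4.297), (0.0005, 4.781)]

def t_statistic(x_bar, mu0, s, n):
    """t = (x-bar - mu) / (s / sqrt(n))"""
    return (x_bar - mu0) / (s / sqrt(n))

def bracket_p_value(t, row):
    """Return (lower, upper) bounds on the one-sided p-value, or None if t is off the row."""
    for (p_left, t_left), (p_right, t_right) in zip(row, row[1:]):
        if t_left <= t < t_right:      # these two t* values sandwich t
            return p_right, p_left     # number on right < p-value < number on left
    return None

t = t_statistic(5800, 5000, 750, 10)
print(round(t, 2), bracket_p_value(t, DF9_ROW))
```

Both bounds are below alpha = 0.05, so with these numbers the decision would be to reject Ho.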
Assignment 20
Remember the definition for p-value:
"P-value is the probability of getting an x-bar as extreme or more extreme if the null hypothesis were true"
This definition has several pieces you can "sub in" values for. Let's try an example.
Let's do the example we talked about in class, the pink cookies from the vending machine. We get an x-bar of 650 calories, and we are testing:
Ho: Mu=600 cal
Ha: Mu>600 cal.
We calculate a p-value of 0.03. If the question asked us to interpret the p-value in context, we might say:
"The probability is 3% of getting a sample mean as high or higher than 650 calories if the true mean calories of the cookies were 600."
I highlighted the same colors of the sentence that correspond to the definition sentence. See how the main points are there and how you can recognize them? There are obviously different ways to re-arrange the sentence, but all the main parts have to be there.
The last questions (questions 8-11) are what we couldn't go over and you should have learned in class. Just some hints (we will go over what it REALLY means this Thursday.)
A type I error is REJECTING a TRUE null hypothesis.
A type II error is FAILING to REJECT a FALSE null hypothesis.
For example:
Ho: The cake is done.
Ha: The cake is not done.
In a type I error, we REJECT a null hypothesis that was actually TRUE. So, We would say that the cake is not done when the cake was actually done (meaning we left it in the oven and overcooked it).
In a type II error, we would take the cake out, but it wasn't done yet (because we failed to reject the null, but it was false).
alpha=probability of a type I error
beta=probability of a type II error.
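One way to convince yourself that alpha is the probability of a type I error is a quick simulation. This Python sketch is only an illustration under made-up assumptions: the null is TRUE by construction, sigma is known, Ha is "greater than", and all the numbers and the function name are invented.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_one_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of simulated z-tests that reject Ho even though Ho is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw the sample from a population where the null really is true
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / sqrt(n))
        p_value = 1 - NormalDist().cdf(z)   # Ha: mu > mu0
        if p_value < alpha:
            rejections += 1                 # rejected a TRUE null: type I error
    return rejections / trials

rate = type_one_error_rate(mu0=600, sigma=50, n=25, alpha=0.05, trials=4000)
print(rate)
```

The printed rate should land close to alpha, though it will wobble a little from run to run if you change the seed.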
I hope that helps you answer the questions, although it contains none of the explanation. I think it will make more sense once we go over it.
Good luck!
-Hillary
Monday, March 5, 2012
Assignment 18
Don't get confused by the wording on question 1. You know what the mean and standard deviation are of a sampling distribution of x-bar: remember, we always assume the null hypothesis is true.
Remember: Test-statistic = z-score.
I realize in question 6 that they do not give you an alpha. But even without an alpha, you should be able to answer the question. Which p-value gives us more evidence against the null (helps us accept the alternative)? What does it mean when p-value is low? When p-value is high? How do we get these p-values? If our test statistic (z) is further from the mean, does that give us a high or low p-value? DRAW IT OUT ON A GRAPH! It will help. I promise.
The rest of the questions step you through the process we talked about at the end of class.
Be careful on p-value in question 12: Our alternative hypothesis is greater than. What proportion do we want from the table?
Good Luck!
-Hillary
Assignment 17
Statistically significant means the result is unlikely to have happened by chance alone. In other words, if the p-value is less than alpha and we reject the null hypothesis, the result is statistically significant.
-Hillary
Thursday, March 1, 2012
Assignment 16
The key here is to remember that you only need to choose ONE POPULATION.
So, for example, you'd have one of these populations:
Middle-aged American women of a healthy weight and BMI that don't drink wine.
OR
Middle-aged American women of a healthy weight and BMI that do drink wine.
Once you choose one, roll with it for the rest of the time.
Thus when you write the parameter, just write about one of the populations (the one you chose). How does a parameter differ from the population? What word do we need to add? (Hint...it starts with "m").
Remember the difference between confidence level and confidence interval. Interval is what we actually report, LEVEL is that complicated definition we talked about. Here's a hint for question 5, you should NOT answer:
"That is the percentage of the time we will find mu in the interval". This is WRONG. Hopefully this WRONG answer will help you remember the right one :)
The last question we want to use our conclude cookie-cutter answer.
Good Luck!