Stat 121 Fall 2011

Tuesday, February 21, 2012

Need more Help?

I have another TA friend who writes a fantastic website, sort of like this blog, that is a great resource if you need more help! While my blog is homework focused, hers is more concept focused. If you are struggling with a concept, more often than not you will be able to find great example problems and PowerPoint slides on the topic. Check out Kiya's website!

https://sites.google.com/site/kiyabyustat/

I've also listed a link to her site in the sidebar :)

Happy Stat 121-ing!

-Hillary

Assignment 14

We will be going over Control Charts on Thursday (and doing questions 1-2).

Questions 5-7 are a great review of the difference between population distributions and sampling distributions of x-bar. Remember, if I were to take out ONE data point from a population graph, what would it represent? What if I were to take it out of a sampling distribution of x-bar? What would it represent then? (This should help with questions 6 and 7).

Questions 8-10 are a simple review of statistics versus parameters. I think question 10 poses the most challenge. Remember your key words for telling whether it is talking about a parameter. (Hint: things that are "known" or about "all" are generally parameters).

Finally, questions 11-13 were discussed at the end of lab last week. They relate to the central limit theorem and how the shape of the graph (as well as the mean) changes. Look over your class notes! (PS: These are VERY helpful questions to know for the exam! Be sure to understand them.)
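
If it helps to see the central limit theorem in action, here is a small simulation sketch in Python. The population is made up (a right-skewed exponential distribution with a mean of about 10, not data from the assignment), and the names `population` and `sample_means` are only for illustration.

```python
import random
import statistics

random.seed(121)  # fixed seed so the run is repeatable

# Made-up, right-skewed population (exponential, mean about 10).
population = [random.expovariate(1 / 10) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)


def sample_means(data, n, reps=2_000):
    """Take `reps` random samples of size n and return each sample's mean (x-bar)."""
    return [statistics.fmean(random.sample(data, n)) for _ in range(reps)]


for n in (1, 5, 30):
    means = sample_means(population, n)
    print(
        f"n={n:2d}  mean of x-bars={statistics.fmean(means):.2f}  "
        f"sd of x-bars={statistics.pstdev(means):.2f}  "
        f"sigma/sqrt(n)={pop_sd / n ** 0.5:.2f}"
    )
```

The mean of the x-bars should stay close to the population mean while their spread shrinks as n grows. Drawing a histogram of `means` for each n would show the shape getting closer to normal.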

-Hillary

Assignment 13

Sorry this is so late, folks! My computer broke over the weekend, which makes it a little difficult to write blog posts!

Luckily if you were in lab, Assignment 13 shouldn't have posed too much of a problem.

Questions 2-4: When you are doing these, remember what our "new" standard deviation is (aka what the standard deviation is for a sampling distribution of x-bar). Particularly on question 3, think about what you are solving for, and where that symbol appears in your equation. On this question, you won't be using an entire formula from your equation sheet. Adapt!
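
As a reminder, this is the standard formula for the standard deviation of the sampling distribution of x-bar. Check it against your equation sheet, since the notation used in class may differ slightly:

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
```

Here \sigma is the population standard deviation and n is the sample size.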

Questions 5-11 test your knowledge of the difference between the graph of individuals (population) and the graph of sample means (sampling distribution of x-bar). Be careful which equation you use!!

Question 14 is probably the hardest for students, but you definitely know how to do it! The key for this problem is labeling what you know. Write the equation down on your paper. Then list the variables you have to fill in:

mu, sigma, n, x-bar and z.

We clearly are solving for z to get a proportion. That means the values for mu, sigma, n and x-bar are given somewhere in this question. Find them! See how much easier this problem becomes once you label what you have? Then it is just plug and chug.

HINT: If it asks for the probability that the company's average loss will not exceed a given amount, we are looking for the left proportion :) (less than).
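
Here is a minimal sketch of the "label what you know, then plug and chug" approach in Python. The numbers are made up for illustration and are not the values from Question 14; swap in the ones you find in the problem.

```python
from math import sqrt
from statistics import NormalDist

# Made-up values for illustration; replace them with the ones from the problem.
mu = 100      # population mean
sigma = 20    # population standard deviation
n = 25        # sample size
x_bar = 106   # sample mean we are asking about

# Standard deviation of the sampling distribution of x-bar.
sd_x_bar = sigma / sqrt(n)

# z-score for a sample mean.
z = (x_bar - mu) / sd_x_bar

# "Will not exceed" means less than or equal to, so take the area to the LEFT of z.
left_proportion = NormalDist().cdf(z)

print(f"z = {z:.2f}, left proportion = {left_proportion:.4f}")
```

`NormalDist().cdf(z)` plays the role of the standard normal table: it returns the proportion to the left of z.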

Monday, February 13, 2012

Assignment 12

Probability is a pretty easy concept that I do not want to spend too much time on in class, since it will make up very little of your exam AND almost all of you have seen these concepts before.

The probability of an event must be between 0 and 1. This makes sense. The lowest a probability can be is 0, meaning something has a 0% chance of happening. And nothing can have more than a 100% chance of happening.

If there is a distribution of probabilities, they should all add up to 1. Again, this makes sense. As an example, let's say we got the probabilities of a college student at BYU having 0, 1, 2, 3, or 4+ roommates. The distribution may look like:

Roommates   | 0   | 1   | 2   | 3   | 4+
Probability | .05 | .10 | .10 | .30 | .45

As you notice, .05+.10+.10+.30+.45 = 1. This is because at BYU, you HAVE to fall into one of those options (you cannot have fewer than 0 roommates, and I have covered everything else in "4+"). So it has to encompass 100%.
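
If you want to double-check a distribution, the two rules above can be sketched as a quick Python check. The roommate probabilities come from the example; the function name `is_valid_distribution` is just something I made up for illustration.

```python
import math

# Probabilities from the roommate example above.
roommates = {"0": 0.05, "1": 0.10, "2": 0.10, "3": 0.30, "4+": 0.45}


def is_valid_distribution(probs):
    """Check both rules: each probability is in [0, 1] and they sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in probs.values())
    # isclose avoids false alarms from tiny floating-point rounding errors.
    total_is_one = math.isclose(sum(probs.values()), 1.0)
    return each_in_range and total_is_one


print(is_valid_distribution(roommates))
```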

Questions 1-4
Pretty straightforward. Choose the proportion that makes the most sense.

Questions 5-7
The key to these questions is writing out the right possibilities. Make sure you get every possible combination. I'll give you the first FOUR as a hint...but you need to come up with the rest.

GGG
GGB
GBG
BGG

The hard part for most people about these questions is the whole "x=2" thing. X stands for the number of girls a couple has. So "x=2" is asking how many arrangements have exactly two girls: no more, no less.

For question 7, remember all you know about probability. What are all the possible values of "X" (the number of girls in the combination)? Look at your arrangements. Make sure the probabilities add up to 1!
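
Once you have written your list by hand, you can check it with a short Python sketch like the one below. It assumes three children (as in the hint above) and that every arrangement is equally likely, which is my assumption for illustration. The names `arrangements` and `distribution` are made up too.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every birth order for three children, e.g. "GGB" = girl, girl, boy.
arrangements = ["".join(order) for order in product("GB", repeat=3)]

# X = number of girls in each arrangement.
girls_count = Counter(order.count("G") for order in arrangements)

# Assumes every arrangement is equally likely.
distribution = {
    x: Fraction(count, len(arrangements))
    for x, count in sorted(girls_count.items())
}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
print("total:", sum(distribution.values()))
```

Try it by hand first, then compare your arrangements and your probabilities for each value of X with what the script prints.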

The rest of the assignment is about parameters versus statistics, experimental design, and association versus causation. We have gone over the last two extensively in class, and they are a good review for the exam! Statistics versus parameters we will discuss in class.

Good Luck!

Assignment 11

We went over most of this assignment in my first lab, and the WHOLE assignment in my second lab, so you should be well equipped for this!

For questions 8-13, remember that marginal distributions deal with the MARGINS, so they use only the totals of the rows and columns. (You'll notice you have to compute the totals yourself.)

Conditional distributions are based on one condition. Block out the row/column you are interested in. Remember, that will be the SPECIFIC. (For example, "whether you buy or not" is not a specific, since there are nonbuyers and buyers. "Higher" IS a specific, because it isn't just "quality").
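
To make the difference concrete, here is a small Python sketch with a made-up two-way table. These counts are NOT the ones from the assignment, and the function names are only for illustration.

```python
# Made-up counts: rows are buyer status, columns are the quality rating.
table = {
    "buyer":    {"higher": 30, "same": 10, "lower": 10},
    "nonbuyer": {"higher": 20, "same": 20, "lower": 10},
}


def marginal_of_columns(table):
    """Marginal distribution of the column variable: column totals / grand total."""
    column_totals = {}
    for row in table.values():
        for column, count in row.items():
            column_totals[column] = column_totals.get(column, 0) + count
    grand_total = sum(column_totals.values())
    return {column: total / grand_total for column, total in column_totals.items()}


def conditional_given_column(table, column):
    """Conditional distribution of the row variable within one specific column."""
    column_total = sum(row[column] for row in table.values())
    return {name: row[column] / column_total for name, row in table.items()}


print(marginal_of_columns(table))
print(conditional_given_column(table, "higher"))
```

Notice that the marginal distribution divides by the grand total, while the conditional distribution divides by the total of the one column you blocked out.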

Remember: Be careful with the word CAUSE. What does that mean? When can we conclude causation?

Wednesday, February 8, 2012

Assignment 10

We went over everything you need for assignment ten last week, so here is some help. We won't be going over it this Thursday in Lab.

In one of my labs, I wasn't able to get to "r^2" (r-squared). The definition of r-squared is as follows:

"The percent variation in y, explained by x".

Realize that it is a percentage basically describing how much our x variable (explanatory) explains or describes our y variable (response). For example, let's use house price versus house size again. You can probably imagine what this would look like (draw it if it will help). House size is our explanatory variable and price is our response. It has a positive relationship because as house size increases, so does house price.

Now, let's say our r-value (correlation) for this is .8. To get r^2, we just square it. Thus, we get .64. r^2 is usually reported as a percentage, so we would say 64%.

According to the definition, (The percent variation in y, explained by x), this means "64% of variation in house price is explained by how big your house is".

This probably makes sense. A lot of how expensive a house is comes down to its size. But the other 36% could be explained by location, nearby schools, property, newness, etc. This should help with problem 6.

Problem 7 is the weird one I told you about. I'll step you through it. Remember, you'll never have to do this again.

You'll notice we have the variables "Sy and Sx" and "Y-bar and X-bar". Sx and X-bar refer to the standard deviation and mean of the x, or explanatory, variable. That means Sy and Y-bar refer to the standard deviation and mean of the y, or response, variable.

Looking at the problem, which is the explanatory and which is the response variable? Try to figure it out on your own first.

Did you get that the wife's height is the explanatory and the husband's height is the response? The clue here was that we were using the "regression line to predict the husbands height from the wife's height".

Knowing that, then it becomes easy. Sx=2.7 and x-bar=64. Sy=2.8 and y-bar=69.3. r=correlation coefficient, which is given.

Solve for b first, then plug it into the next equation.
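
Here is a sketch of those two steps in Python, assuming the two equations are the usual ones for a least-squares line, b = r * Sy / Sx and a = y-bar - b * x-bar (check your equation sheet). The r below is a placeholder I made up, NOT the value given in the problem, so plug in the real one.

```python
def regression_line(r, s_x, s_y, x_bar, y_bar):
    """Return (a, b) for y-hat = a + b*x using the summary statistics."""
    b = r * s_y / s_x        # slope: solve for this first
    a = y_bar - b * x_bar    # intercept: uses the b we just found
    return a, b


r_placeholder = 0.6  # made-up value; NOT the r from the assignment

a, b = regression_line(r_placeholder, s_x=2.7, s_y=2.8, x_bar=64, y_bar=69.3)
print(f"y-hat = {a:.2f} + {b:.3f} * x")
```

A handy check: the line always passes through the point (x-bar, y-bar), so plugging x = 64 into your finished equation should give back 69.3.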

For questions 9-12, make sure you follow the StatCrunch instructions on Blackboard. They will help you produce an output that will make answering these questions easy.

Question 12 is asking about something called "extrapolation" which you should have learned in lecture. Extrapolation means trying to predict a y for an x outside of the range of your data. For example, let's use the example of credit hours versus hours of sleep at night.

Let's say we only collected data up until 15 credit hours. We could NOT use our line to predict someone who was taking 18 credit hours. Why not? Because we would not know what the regression line was doing after 15 credit hours.
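
One way to picture this is a prediction function that refuses to extrapolate. This is only a sketch: the line (intercept 9.0, slope -0.1) and the lower end of the range (6 credit hours) are made up for illustration, and only the upper limit of 15 comes from the example above.

```python
def predict_sleep(credit_hours, intercept, slope, min_x, max_x):
    """Predict y from the regression line, but only inside the observed x range."""
    if not (min_x <= credit_hours <= max_x):
        raise ValueError(
            f"{credit_hours} credit hours is outside the data range "
            f"({min_x} to {max_x}); predicting here would be extrapolation."
        )
    return intercept + slope * credit_hours


# Made-up regression line and data range.
LINE = dict(intercept=9.0, slope=-0.1, min_x=6, max_x=15)

print(predict_sleep(12, **LINE))   # inside the range, so this is fine
try:
    predict_sleep(18, **LINE)      # outside the range
except ValueError as err:
    print("Refused:", err)
```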

Be careful on Question 14 - make sure you are talking about differences in the students not the environments.

Good Luck!
-Hillary

Saturday, February 4, 2012

Assignment 9

We talked about most of this in class. Be careful on number 5: how are slope and correlation related? Think about it. Is it possible that one graph could have points NOT close together and another graph have points that ARE close together, yet both "best fit lines" have the same slope?

Another hint is that r values can only take on certain numbers. What number is the slope?
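
If you want to experiment with this idea, here is a Python sketch with two made-up data sets built around the same line, one with points close to it and one with points far from it. The data and the helper name `slope_and_r` are my own, for illustration only.

```python
import math


def slope_and_r(xs, ys):
    """Least-squares slope and correlation r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5, 6]
# Noise pattern chosen so it does not tilt the line (it is uncorrelated with x).
pattern = [1, -1, 0, 0, -1, 1]

tight = [2 * x + 0.5 * e for x, e in zip(xs, pattern)]  # points close to the line
loose = [2 * x + 4.0 * e for x, e in zip(xs, pattern)]  # points far from the line

for label, ys in (("tight", tight), ("loose", loose)):
    slope, r = slope_and_r(xs, ys)
    print(f"{label}: slope = {slope:.2f}, r = {r:.2f}")
```

Both data sets give a slope of 2, but r is much closer to 1 for the tight one. Compare that slope with the values r is allowed to take.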

Questions 7-9 we talked about in class. Look back to your notes on correlation, "r". Each one of the rules we discussed falls under one of these phrases.

The correlation coefficient:

has no units
is affected by outliers
can only be computed between quantitative variables
only describes linear relationships
is between -1 and 1

Question 12 uses the slope definition. Here is a reminder of that definition:

"The average change in y for every one unit increase in x".

All of the variables highlighted in red can be exchanged for the ones in your specific circumstance.

Good Luck!
-Hillary