What is a theoretical sampling distribution? Empirical sampling distribution?
Theoretical - the distribution of results we would get if we calculated a statistic for an infinite number of samples, derived from logic or mathematical formulas
Empirical - frequency distribution of observed scores
What is meant when we say that a sampling distribution has been estimated by simulation?
It means the distribution was built by repeatedly drawing random samples (usually by computer), calculating the statistic for each, and tabulating the results. We rely on this because:
1. theoretical results are asymptotic: they tell us the shape of the distribution we would get as sample size tends to infinity
2. sampling distributions are central to statistical inference: they let us calculate how often random variation across samples would produce a result in the tail of the distribution
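A minimal sketch (not from the deck) of estimating a sampling distribution by simulation: draw many random samples from a skewed population, calculate the mean of each, and look at the distribution of those means.

```python
# Minimal sketch, assuming numpy is available; the population here is hypothetical.
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=10, size=100_000)   # a skewed "population"

# Draw 5,000 samples of size 50 and record the mean of each one.
sample_means = [rng.choice(population, size=50).mean() for _ in range(5_000)]

# The spread of these simulated means estimates the standard error of the mean.
print(np.mean(sample_means), np.std(sample_means))
```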
What is standard error?
the standard deviation of a sampling distribution
if the sampling distribution is normal, about 68% of sample means will lie within 1 standard error of the true mean, and 95% within 1.96 (roughly 2) standard errors
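A minimal sketch (hypothetical data) of the standard error of the mean computed from a single sample: the sample standard deviation divided by the square root of the sample size.

```python
import numpy as np

sample = np.array([12., 15., 9., 22., 18., 14., 20., 11., 16., 13.])  # hypothetical scores
se = sample.std(ddof=1) / np.sqrt(len(sample))   # SE = s / sqrt(N)
print(se)
# If the sampling distribution is normal, about 68% of sample means fall within
# 1 SE of the true mean and about 95% within 1.96 SE.
```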
what does the central limit theorem tell us?
the theorem that the sampling distribution of the mean becomes normal as sample size increases, whatever the shape of the population distribution.
Appropriate sampling distribution for the mean from small sample
t distribution
appropriate sampling distribution for the mean from large sample
normal
appropriate sampling distribution for the test of independence for crosstabulation
chi square
what are 3 ways the t distribution resembles the normal
1. symmetrical
2. unimodal
3. as sample size rises, they become more normal, their central peaks rise and their tails lighten
2 ways t-distribution differs from the normal
1. low central peak
2. heavy tails
how does the chi-square distribution with one DF differ from a normal distribution? why?
chi-square values can only be positive, because all negative values in the normal distribution become positive once they are squared
why does the standard error for a mean become greater as sample size becomes smaller?
the standard error equals the standard deviation divided by the square root of the sample size (SE = s/√N), so as N gets smaller we divide by a smaller number and the standard error grows; small samples also give less stable estimates of the standard deviation. As sample size increases, the standard error shrinks.
in what sense are there many t distributions? many chi squared distributions?
t - many as in there is a whole family of t distributions but convention is to refer to them as a single distribution
chi - also a family: there is a different chi-square distribution for each number of degrees of freedom, whether it is used for simple (one-variable) distributions or for crosstabulations
for chi square, how does the sampling distribution change with DF? why does this matter?
as DF increases, the shape, central tendency, and dispersion of the distribution shift
the mean increases by 1 as DF increases by 1, and the distribution becomes more symmetric; this matters because we must compare our result to the chi-square distribution with the right DF to judge significance
for t distributions, why must we be concerned about DF?
the degrees of freedom define the shape of the t distribution, so we must use the one that matches our sample size
suppose someone calculates the standard error of the mean for some variable, and obtains 1.0. Assuming simple random sampling within what range of the true mean would 95% of the sample means lie?
95% of sample means will lie within ±1.96 (approximately 2) of the true mean, since the standard error is 1.0
what is a suggested rule of thumb for when the sampling distribution of a proportion can be treated as normal? what did graphs suggest about the accuracy of this rule of thumb?
Np and N(1-p) should both exceed 10; graphs suggest the sampling distributions become roughly symmetrical (normal) once these values exceed 10
what does the law of large numbers tell us?
as samples grow larger, the statistics calculated from them tend toward the results we would obtain if we had data for the full population
Difference between a null hypothesis and a research hypothesis
null states no difference between groups, or no association between variables
research states there is a difference between groups (sometimes predicting which group will be higher), or that there is an association between variables
what is a two-tailed test? one-tailed? situations when we would use them?
two - two critical regions in a distribution; extreme results in either tail are significant
- normal and t distributions
one - one critical region, results are significant only if they fall in the tail of the distribution specified by the research hypothesis
- chi-square and f distributions
what is a type 1 error? how can we reduce the chances of one?
mistaken rejection of a true null hypothesis (false positive)
we can reduce the chances of one by adopting a more stringent significance level (e.g., .01 rather than .05)
what is a type 2 error? what can we do to reduce the chances of one?
failure to reject a false null hypothesis (false negative)
taking larger samples and reducing standard errors
what is statistical power? why is it important?
the likelihood that a significance test will detect an effect when there actually is one
important because low power warns us that the sample size may be too small to detect real effects
what is a confidence interval? how are they constructed?
ranges within which a given proportion of sample results can be expected to fall
constructed by starting from the observed result and placing bounds (typically ±1.96 standard errors for a 95% interval) around it
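A minimal sketch (hypothetical data) of constructing a 95% confidence interval by placing bounds of ±1.96 standard errors around the observed mean.

```python
import numpy as np

sample = np.array([52., 61., 47., 58., 66., 50., 63., 55., 59., 49.])  # hypothetical scores
mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(len(sample))

# 95% confidence interval: observed result plus and minus 1.96 standard errors.
print(mean - 1.96 * se, mean + 1.96 * se)
```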
in graphing regression results, why are confidence bands wider at the low and high ends of the predictor's range, away from the centroid?
confidence bands are typically wider for low and high values of the predictor than for values near the mean, because the fitted line is anchored at the centroid: uncertainty about the slope pivots the line around the mean of x, so predictions far from the mean are less certain
what fundamental question does Bayesian inference try to answer that differs from the question posed in standard inference?
how likely it is that the true difference between groups, or the true association between variables, lies in a specific range
in standard inference we only ask whether we can reject the null hypothesis
write the equation for bayes' theorem. what do the various elements in the formula mean?
Bayes' theorem: P(A/D) = P(D/A)P(A) / P(D)
P(A) prob. of characteristic A (the prior)
P(D) prob. of a result in the data
P(A/D) prob. of having A, given the result in the data (the posterior)
P(D/A) prob. of a result in the data, given characteristic A
The theorem follows because P(AD) = P(A/D)P(D) and P(DA) = P(D/A)P(A) describe the same combination of events; setting the two products equal and dividing by P(D) gives the formula.
what is the bayes factor? how does it help us estimate the P(A/D)?
the ratio P(D/A)/P(D); it is the factor by which we multiply the prior P(A) to get the posterior P(A/D)
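A minimal sketch of Bayes' theorem and the Bayes factor as defined above; all of the probabilities here are hypothetical numbers, not taken from the deck.

```python
p_A = 0.10           # P(A): prior probability of characteristic A (hypothetical)
p_D_given_A = 0.80   # P(D/A): probability of the data result, given A (hypothetical)
p_D = 0.25           # P(D): overall probability of the data result (hypothetical)

bayes_factor = p_D_given_A / p_D     # the factor by which we multiply the prior
p_A_given_D = p_A * bayes_factor     # P(A/D): the posterior probability
print(bayes_factor, p_A_given_D)     # 3.2 and 0.32
```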
What is the difference between frequentist and a personal probability?
frequentist - probability is the long-run relative frequency of an event, based on large numbers of observations gathered over time
personal - a degree of belief held by an individual, based on past experience and judgment
difference between a standard confidence interval and a bayesian credible interval?
standard - the estimate plus or minus a margin based on the variation in that estimate (its standard error); in repeated sampling, a given proportion (e.g., 95%) of intervals built this way would contain the true value
bayesian - there is a 95% probability that the true (unknown) estimate would lie within the interval, given the evidence provided by the observed data
what are 3 replies those who accept personal probabilities have given to the argument that allowing them risks drawing unrealistic conclusions from unrealistic priors?
1. data from a moderately large sample will swamp any reasonable prior
2. if people report their priors, others can redo the analysis
3. serious data analyses tend to use vague priors that assume no more than modest prior knowledge
What is a PRE measure?
proportional reduction in error
these include lambda, gamma, and somers' d
it tells us how much we can reduce our errors in predicting outcomes if we know how two variables are linked
What is lambda?
used for nominal measures
proportion by which we can reduce our errors in guessing a case's score on the dependent variable if we know how the dependent is linked to the independent
what is gamma?
designed for ordinal measures
we try to predict whether pairs of cases suggest a positive or negative relationship between the variables
what is Somers' d
modification of gamma
addition of Ty to the denominator, it represents pairs of cases tied to the dependent variable, but not to the independent
What is Yule's Q?
special case of gamma
numerically identical to gamma; the difference is in usage: Yule did not restrict his statistic to ordered variables, so Q can be used for nominal dichotomies
In the formula for r, what is the function of the SDs in the denominator?
dividing by the SDs standardizes the covariance, ensuring that r remains within the range from -1 to +1; the sign of r then tells us whether the association is positive, negative, or absent (no linear association)
why is the numerator of the formula for r called the covariance?
because it is similar in form to the variance, except that instead of squaring one variable's deviations it multiplies the deviations of X and Y together (the two variables "co-vary")
In the formula for r, how is evidence of a positive association tallied up? of a negative association?
positive - if both variables score above their means, (Xi-Xbar) and (Yi-Ybar) are both positive; if both score below their means, both are negative; either way the product (Xi-Xbar)(Yi-Ybar) is positive
negative - if X is above its mean and Y below, or vice versa, their product will be negative, suggesting a negative association
starting from a formula for r that does not use algebraic notation, show what happens when the variables are standardized
each variable's SD becomes 1
so r becomes the mean of the products of the standardized (z-score) values of x and y
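A minimal sketch (hypothetical data) showing Pearson's r computed as the covariance divided by the two SDs, and equivalently as the mean product of the standardized (z-score) variables.

```python
import numpy as np

x = np.array([2., 4., 5., 7., 9., 11.])
y = np.array([3., 5., 4., 8., 9., 12.])

cov = np.mean((x - x.mean()) * (y - y.mean()))   # the covariance
r = cov / (x.std() * y.std())                    # divide by the SDs

zx = (x - x.mean()) / x.std()                    # standardize: each SD becomes 1
zy = (y - y.mean()) / y.std()
print(r, np.mean(zx * zy))                       # the two values agree
```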
what are 2 ways to interpret pearson's r?
1. based on the fact that squaring it yields r^2, a PRE measure
2. based on what happens if we standardized the variables in bivariate regression
difference between Spearman's rho and Pearson's r
for rho, instead of using the observed values of x and y, we use their ranked positions (1 to N)
what do we do before calculating rho if more than one case lies in a category?
we suppose that with finer measurement they could be distinguished, and we give each case the median rank that the set would then occupy
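A minimal sketch (hypothetical data with ties): tied cases share the middle rank of their set, and Spearman's rho is then Pearson's r computed on the ranks. The scipy helpers used here are assumptions about the toolkit, not part of the deck.

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

x = np.array([1, 2, 2, 3, 5, 5, 5, 8])
y = np.array([2, 1, 4, 4, 6, 7, 9, 9])

rx, ry = rankdata(x), rankdata(y)        # ties get the middle rank of their set
r_on_ranks = np.corrcoef(rx, ry)[0, 1]   # Pearson's r on the ranks
rho, p = spearmanr(x, y)
print(r_on_ranks, rho)                   # the two agree
```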
what is a scatterplot? what are some alternatives?
a plot of points showing each case's values on two variables, which lets us see the extent of correlation
alternatives - boxplot, bar chart
what is a moving average
a calculation that smooths a trend by averaging a series of overlapping subsets of the full data set
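A minimal sketch (hypothetical series) of a moving average: each point in the result is the mean of an overlapping window of neighbouring points.

```python
import numpy as np

series = np.array([3., 5., 4., 8., 7., 9., 12., 11., 13., 15.])  # hypothetical data
window = 3
smoothed = np.convolve(series, np.ones(window) / window, mode='valid')
print(smoothed)   # one value per overlapping 3-point subset of the series
```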
what are 2 advantages of a bar chart over a line graph? two disadvantages?
advantages - bars make it easier to estimate a value on the y-axis, greater visual impact
dis - breaks up a smooth trend line, high ink-to-information ratio
what is a mosaic plot? why are rectangles in the plot different sizes? why do we care about the "Pearson residuals"
a plot in which each cell of a crosstabulation is represented by a rectangle whose area is proportional to the number of cases in the cell; the rectangles are shaded differently to display residuals
Pearson residuals tell us the difference between observed and expected cell counts
why are some cells in a mosaic chart patterned or shaded differently? why might we be interested in a particularly light/dark cell?
standardized residuals are displayed by shading and patterns
darker shading marks heavier cells (larger residuals) and lighter shading marks lighter cells; we care about particularly light or dark cells because large residuals point to where the observed counts differ most from the expected counts
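A minimal sketch (hypothetical 2x2 crosstabulation) of the Pearson residuals that a mosaic plot shades: (observed - expected) / sqrt(expected) for each cell.

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[30, 10],
                     [20, 40]])            # hypothetical crosstabulation
chi2, p, dof, expected = chi2_contingency(observed)

residuals = (observed - expected) / np.sqrt(expected)
print(residuals)   # large absolute values mark unusually heavy or light cells
```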
what measure do we typically use to identify heavy cells? how is it related to chi square? what levels of measurement are we typically interested in?
we can use a crosstabulation
crosstabulations break the data down by the two variables, and a chi-square test tells us whether the results of the crosstabulation are statistically significant
nominal and ordinal
what is an association plot? what is the difference between rectangles above/below the line?
used when we do not need to show residuals, just draw attention to cells that are heavy/light
heavy cells are darker and their rectangles sit above the line; light cells are shaded lighter and sit below the line
what are conditional tables? what is another name for them?
tables showing the association between two variables separately within each category of a third variable (the test factor)
also called partial tables
how do conditional tables "control for" third variables?
within each subtable the third variable (the test factor) is held constant, so it cannot be responsible for the association between the other two variables in that table
what, for the Columbia school, was a "test factor"?
a third variable that was controlled to see how associations may change
what is a practical problem in breaking a sample down by many variables at once? what is one way to try to get around this problem and what difficulty arises if we take this route?
we will be left with too few cases in some subtables
we could collapse the test factors, for example by making them dichotomous, but then finer distinctions within the test factor can no longer be identified
term: specification
exists when the association between 2 variables is different for subsamples with different values of a third variable
term: moderation
when the relationship between 2 variables changes as the value of a third changes; the third variable is the moderator
term: distortion
exists when the relationship between 2 variables is reversed when we control for a third
term: spurious relationship
the observed correlation between 2 variables exists because each is affected by a common cause
term: intervening variable
a variable that comes between the independent and dependent variables in a causal chain: the independent variable affects it, and it in turn affects the dependent variable
term: mediator
another name for an intervening variable: it describes how (through what mechanism) two variables relate
what is a doubledecker? what does the width of the bars tell us?
a plot that displays two variables at the same time by breaking the bars for one variable down within the categories of another
the width of the bars is proportional to the size of the category
final exam = 30+2.2*(study hours) what does 30 tell us? the 2.2?
30 - tells us that someone who did not study at all is predicted to get a final exam mark of 30
2.2 - marks rise on average by 2.2 for each additional hour of study
in the general formula y=a+bx, what are "a" and "b" called? give 2 names for b
a - intercept
b - slope/coefficient
final mark = 60+0.3*(math anxiety) what does 60 tell us? the 0.3?
60 - someone with no math anxiety is predicted to get a score of 60
0.3 - marks rise on average by 0.3 for each additional point on the anxiety scale
explain the principle of least square through which a regression line is chosen
we must choose the line that minimizes the sum of the squared distances between scores on the dependent variable and scores predicted for them
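A minimal sketch (hypothetical data, echoing the study-hours example above) of fitting the least-squares line y = a + b*x with numpy.

```python
import numpy as np

hours = np.array([0., 2., 4., 6., 8., 10.])       # hypothetical study hours
marks = np.array([31., 36., 40., 42., 48., 52.])  # hypothetical final marks

b, a = np.polyfit(hours, marks, 1)   # slope and intercept that minimize squared errors
predicted = a + b * hours
print(a, b, np.sum((marks - predicted) ** 2))   # intercept, slope, sum of squared errors
```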
two helpful by-products of our usual methods of choosing a bivariate regression line?
r^2 - how much variance in y is accounted for by x
the standard error of estimate, which tells us the typical size of our errors of prediction
is the standard error of estimate equal to the variance of our errors of prediction? if not, what does it equal?
no - the variance of the errors of prediction is called the error variance
the standard error of estimate is the square root of the error variance, i.e. the standard deviation of the errors of prediction
if errors of prediction are normally distributed, within what range will 95% of them lie?
95% of the errors will lie within 1.96 standard errors of estimate of zero; equivalently, 95% of observations will lie within about 2 standard errors of estimate of the regression line
show why pearsons r is equal to the regression coefficient for standardized variables
the numerator of b (the covariance of x and y) is the same as the numerator of r
the denominator of b is the variance of x, while the denominator of r is SxSy; when the variables are standardized both SDs equal 1, so the denominators match and b = r
what is the generic interpretation for b? what does it become when the variables are standardized?
b gives us the average number of units of change in y for a unit of change in x
when the variables are standardized, b becomes beta; beta gives us the average number of SDs of change in y for one SD of change in x
if a predictor is dichotomous, what does b tell us?
b gives us the average difference between the 2 groups on the dependent variable
what is truncation? when might we apply it in regression?
truncation means cutting off part of a variable's range (for example, capping or dropping cases beyond some point); we might apply it in regression when the trend line flattens out over part of the range of x, so that a straight line fits the remaining range
what might we do to deal with accelerating curve?
also called an exponential curve
take the log of the dependent variable, for example presenting logged income rather than income, which straightens the accelerating curve
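A minimal sketch (hypothetical data) of straightening an accelerating curve: take the log of the dependent variable and fit a straight line to the logged values.

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5., 6.])
income = np.array([20., 27., 36., 50., 66., 90.])   # hypothetical, roughly exponential

b, a = np.polyfit(x, np.log(income), 1)   # fit a line to logged income
print(a, b)   # b is the change in logged income per unit change in x
```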
how do we typically interpret the coefficient for a spline? why are these created?
it gives us the change in slope at a point we call the knot
splines deal with data where different slopes hold over different ranges of x values
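A minimal sketch (hypothetical data) of a linear spline with one knot: the extra term max(0, x - knot) lets the slope change at the knot, and its coefficient is that change in slope.

```python
import numpy as np

x = np.array([0., 1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([1., 2., 3., 4., 5., 7., 9., 11., 13.])   # slope roughly doubles after x = 4
knot = 4.0

X = np.column_stack([np.ones_like(x), x, np.maximum(0, x - knot)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)   # intercept, slope before the knot, change in slope at the knot
```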
when does b give us an estimate of how much % change we get in y for a unit of change in x?
when the dependent variable has been logged and b < .20, b gives us the approximate proportional (%) increase in y for a unit of change in x
what is an alternative to the quadratic curve? what is the advantage of the alternative?
dummy variables
allows us to give a verbal comparison of the reference category with each of the others
what principle is used to select the equation in multiple regression? 3 advantages of choosing this way?
principle of least squares
1) passes through centroid
2) it neither over- nor underestimates the DV on average
3) it leads to several measures of how well we are doing in predicting the DV
difference between r^2 and R^2
r^2 - the proportion of variance in y accounted for by a single x (bivariate regression)
R^2 - the proportion of variance in the DV accounted for by all the IVs together (multiple regression)
in multiple regression, how do we obtain coefficients that show the effects of an IV independent of other IVs?
each coefficient is estimated with the other IVs statistically held constant, so b for an IV gives the average change in the DV for a unit change in that IV when the other IVs in the equation are unchanged. For example, if b for years worked is 1000, salary rises by $1000 for each additional year worked, with the other predictors held constant.
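A minimal sketch (simulated, hypothetical data) of the point above: in multiple regression each coefficient is estimated net of the other predictors, so the fitted slopes recover each IV's own effect even when the IVs are correlated.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)                     # hypothetical first predictor
x2 = 0.5 * x1 + rng.normal(size=200)          # a second predictor correlated with the first
y = 2.0 + 3.0 * x1 + 1.5 * x2 + rng.normal(size=200)

X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)   # close to [2.0, 3.0, 1.5]: each slope holds the other IV constant
```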