PSC 103A Exam 1

0.0(0)
studied byStudied by 0 people
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
Card Sorting

1/73

encourage image

There's no tags or description

Looks like no tags are added yet.

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

74 Terms

1
New cards

descriptive statistics

- details characteristics of a sample

- mean, median, mode, sd, var

- collecting, organizing, summmarizing data

2
New cards

inferential statistics

- generalize characteristics to a population

- testing hypothesis

- examine relations among variables

- make predicitions

3
New cards

parameter

value of interest in population (mean, sd, variance, correlation)

4
New cards

statistic

value computed from sample

5
New cards

measurement

- quantify attributes in a construct (can make good or weak inferences)

6
New cards

examples of measurements hard to quantify

attachment, self-esteem, personality, self-regulation

7
New cards

Measurement Scales

- nominal

- ordinal

- interval

- ratio

8
New cards

nominal

categorical (classification) -- grouping individual observations into qualitative classes

- Hair color (blonde, gray, brown, black, etc.)

- Nationality (Kenyan, British, Chinese, etc.)

9
New cards

ordinal

order/ranking (classification, order)

- order of race finishings, likert scale

10
New cards

interval

number/grouping denotes magnitude of differences among observations (classification, order, equal intervals)

- IQ scores, temperature,

11
New cards

ratio

like interval but with logical zero point (classification, order, equal intervals, true zero)

- age, height, weight, speed

12
New cards

univariate

bivariate

multivariate

1 variable

2 variables

3 or more variables

13
New cards

Categorical Variables

qualitative, no numerical property

- binary (2 options)

- nominal (greater than 2 options and NO order)

- ordinal (greater than 2 options and ordered)

14
New cards

Numerical Variables

quanitative, numerical responses

- discrete (numerical values limited to isolated points on number line) -- ex. # of pets

- continuous (numerical values NOT limited to isolated points on number line) -- ex. age, height

15
New cards

What data visualization is best for identifying outliers and spread?

histograms and boxplots

16
New cards

What data visualization is best for describing relationships between variables?

scatterplots (ex. stress x performance)

17
New cards

detect outliers by using ________ and handle them by ___________

- boxplots, z scores

- remove, transform, keep w justification

18
New cards

boxplot shows

- line can also show mean

<p>- line can also show mean</p>
19
New cards

mode

most repeated value

- not intrinsic, needs to be determined

- mode is midpoint in frequency distrubution

- very UNRELIABLE source of info

- highly resistent measure

- sensitive to size and number of class intervals

- good for nominal variables

20
New cards

median

midway point

- highly resistant

- affected by change in value of a case

- good for "bad" distributions like US incomes

- good for arbitary ceiling/floor distributions

21
New cards

mean

average

- most commonly used measure of central tendency

- for inference and descriptions

- BEST estimator of parameter

- NON resistent

- sum of signed deviations equals 0 (di= xi - x [ sample mean])

22
New cards

when a distribution is ___________, the mean and median are equal. (opposite isnt always true)

symmetric

23
New cards

when a distribution is __________, the bulk of cases fall into the upper part of the distribution and the median is _______ than the mean.

negatively skewed (left), greater

<p>negatively skewed (left), greater</p>
24
New cards

when a distribution is __________, the bulk of cases fall into the lower part of the distribution and the median is _______ than the mean.

positively skewed (right), lower

<p>positively skewed (right), lower</p>
25
New cards

mean, median, mode for a normal distribution?

equal and centered

<p>equal and centered</p>
26
New cards

mean, median, mode for a positively skewed distribution?

mode less than median less than mean

<p>mode less than median less than mean</p>
27
New cards

mean, median, mode for a negatively skewed distribution?

mean less than median less than mode

<p>mean less than median less than mode</p>
28
New cards

"best guess" to make smallest absolute error?

median

29
New cards

"best guess" to be absolutely right?

mode

30
New cards

mean as "best guess" produces...

an error d

- average error is 0 over all cases

31
New cards

dispersion/spread

observations depart from central tendency

32
New cards

range

spread of distribution

range = high score - low score

33
New cards

deviation

departure of a case from given statistic

- from the mean: di = xi - x [sample mean]

- from the median: d'i = xi - Md

34
New cards

variance

spread of variability (average of the deviations about the mean)

- variance = 0 if all cases in distrubution are the same

- minimizies the sum of squared deviations

<p>spread of variability (average of the deviations about the mean)</p><p>- variance = 0 if all cases in distrubution are the same</p><p>- minimizies the sum of squared deviations</p>
35
New cards

standard deviation

square root of variance

- index of variability

<p>square root of variance</p><p>- index of variability</p>
36
New cards

Interquartile Range

measure of variation not sensitive to outliers

--calc: split data into 2 sets, get median in each set, and subtract Q3 - Q1

<p>measure of variation not sensitive to outliers</p><p>--calc: split data into 2 sets, get median in each set, and subtract Q3 - Q1</p>
37
New cards

less variability makes any guess _____ precise

more

38
New cards

normal distrubution

- theoretical distribution of variables as a symmetrical bell-shaped graph

- x axis is specific values x takes

- y axis is density function f(x)

- differs in location (mean) and scale (sd)

- most descriptive factors mean and variance/sd

39
New cards

standard normal distribution

mean = 0

sd = 1

40
New cards

qualities of normal distribution

- density function symmetric about mean

- mean is also mode and median

- 68% of area under curve is 1 SD away from mean

- 95% of area under curve is 2 SDs away from mean

- 99% of of area under curve is 3 SDs away from mean

41
New cards

properties of normal distribution

- distribution of sample means x is normal irrespective to size of N

- mean = mu

- standard error: SE = sd/sqrt(N)

- sample mean and sample variance independent IF population distribution is normal

42
New cards

a continuous variable has an ____________ number of possible outcomes

infinite

43
New cards

suppose you know the exact distrubution of a variable. Whats the probability of observing any particular value in the distribution?

ZERO

- if possible values are infinite, probability is ratio of something to infinity = 0 so we cant define probability for each outcome

44
New cards

density function

measures probability of each value

45
New cards

standardized score (z score)

- shows the relative status of a score in a distribution

- shows number of SDs above/below the mean

z = (xi - x[sample mean]) / s

- mean of z score should be 0 and sd should be 1

- does not change form of distribution

<p>- shows the relative status of a score in a distribution</p><p>- shows number of SDs above/below the mean</p><p>z = (xi - x[sample mean]) / s</p><p>- mean of z score should be 0 and sd should be 1</p><p>- does not change form of distribution</p>
46
New cards

What is the normal probability of a score of x=57.5 given a mean of 50 and sd of 5?

z = (57.5 - 50) / 5 = 1.5

-- from table, 1.5 sds from mean = .993 which is the probability of observing a z value less than or equal to 1.5 in a normal dist

- if negative, do 1 - .933 to get .067

47
New cards

if you get a score of 85 on two exam and the average score for both exams was 76. Did you perform equally well on both exams?

No, we need to know the spread of the distribution/variability

48
New cards

Central Limit Theorem

if population has finite mean mu and finite variance σ^2, the distrubution of sample means from samples of N observations makes normal distribution with mean mu and variance (σ^2)/N as sample size increases

---> when N is large, sampling distribution of the mean is normal

-- closer to normal original pop is, closer to normal sample dist of any size is

49
New cards

Sampling Distribution

collected data/statistics from a sample to try to "guess" what that statistic is in the population

--> theoretical probability distribution showing the relation between possible values of given statistic and their probability for samples of size N from population

50
New cards

what does a sampling distribution imply?

implies that the frequency distribution is not made from individual observations but from statistics of an infinite number of random samples of size N

51
New cards

How can we say that what we obtained from the sample reflects the characteristic of the population?

- bc of the central limit theorem which states that if the sample is random and large enough, the sampling distribution of the statistic will be centered at the population statistic

- if population has finite mean mu and variance σ^2, the sample distribution means approaches a normal distribution with mean mu and variance (σ^2)/N as sample size increases

52
New cards

x axis and y axis of sampling distribution

x axis represents values of sample statistic

y axis represents probability density

53
New cards

if the distribution of the original population is normal, the sampling distribution of the mean...

will also be normal regardless of the sample size

54
New cards

if we know the variability of the means, we know... (2 things)

1. by how much each sample of size n is expected to vary

2. how far we will be on average from the population mean by computing the sample mean

---> allows us to make inferences

55
New cards

the sample mean is an ________ estimator of mu

unbiased

56
New cards

the sampling distribution of the mean _______ match the population distribution

will not exactly

57
New cards

variance of sampling distribution

will be smaller than the populations variance by a factor of 1/N

--> σ^2 of sample = (σ^2 of pop) / N

<p>will be smaller than the populations variance by a factor of 1/N</p><p>--&gt; σ^2 of sample = (σ^2 of pop) / N</p>
58
New cards

SD of the sampling distribution

standard error (SE)

-- σ of sample = (σ of pop) / sqrt(N)

--> denotes how far on average sample mean is from the population mean

<p>standard error (SE)</p><p>-- σ of sample = (σ of pop) / sqrt(N)</p><p>--&gt; denotes how far on average sample mean is from the population mean</p>
59
New cards

the larger the sample size, the ______ the variance of the sampling distribution of the means, thus, the ______ probable it is that the sample mean approaches the population mean

- smaller the variance

- more probable

60
New cards

if the statistic is unbiased, the mean of the sampling distribution is ______

the true parameter value

61
New cards

if the statistic is consistent, as sample size increases, sampling variability _____

decreases

62
New cards

Properties of Sampling Distribution of the Mean (4 things)

1. expected value of the sample mean is the pop mean

2. variance of the sampling distribution of the mean will be smaller than the pop variance by 1/N

3. if the pop mean is normal, the sample mean is normal regardless of sample size

4. when N is large, sampling dist of the mean is normal regardless of pop dist

63
New cards

Estimators

formula or method for combining values occuring in data to infer pop parameters

64
New cards

Properties of Estimators

criteria to judge how effective stat is in inferring pop stat

1. statistic G is unbiased estimator of 0 when expected value of G over all samples is 0

--> sample mean UNbiased

--> sample variance biased

2. statistic G is consistent when the probability of G approaching 0 increased as sample size increases

--> sample mean and sample variance are consistent

3. statistic G is efficient estimator of 0 if there is no other estimator that has a smaller mean squared error (MSE)

--> MSE captures sampling variance and bias

4. statistic G efficient relative to statistic H (both unbiased) so that the stat with the smaller sampling variance (smaller SE) is more efficient

--> mean more efficient than median

5. statistic G is sufficient estimator of parameter 0 if it contains all info about 0

--> sample mean is sufficient when data is normal dist

65
New cards

what estimators feature most/all of the properties?

sample mean and sample variance

66
New cards

how to put sample mean into standard form?

z = (sample mean - mu) / SE

or

z = (sample mean - mu) / (σ/sqrt(N))

<p>z = (sample mean - mu) / SE</p><p>or</p><p>z = (sample mean - mu) / (σ/sqrt(N))</p>
67
New cards

Example: given a population with mu = 20 and simga = 5, how likely is it to draw a mean = 30 with N = 2?

z = (30 - 20) / (5/sqrt(2))

68
New cards

what is the minimum variance estimator

sample mean

69
New cards

confidence intervals for the mean

areas of precision/uncertainty around point estimate

70
New cards

95% of the Z-values will fall between -1.96 and +1.96 means that...

95% of all possible sample means will fall within 1.96 SEs of the population mean

71
New cards

99% of the Z-values will fall between -2.58 and +2.58 means that...

99% of all possible sample means will fall within 2.58 SEs of the population mean

72
New cards

100% Confidence Interval for the mean

sample mean - infinityσx <= mu <= sample mean + infinityσx

73
New cards

when would you be interested in studying variability?

- check reliability/consistency (low variability = more dependable)

- to make inferences (affects confidence intervals and error margins)

74
New cards

variance vs standard deviation of sampling distribution

Variance

-- σ^2/N

-- how much sample mean varies squared from pop mean

Standard Deviation

-- σ/sqrt(N)

-- average amount sample mean deviates from pop mean