Stats Unit 1

57 Terms

New cards

variable

a characteristic that can vary in value among subjects in a sample or a population - its values must be mutually exclusive and collectively exhaustive

New cards

values

categories, possible options for responses

New cards

mutually exclusive

events or categories that cannot occur at the same time - each subject belongs to only one category, no ambiguity

New cards

collectively exhaustive

everyone should be able to fit into one category, no one left over

New cards

qualitative (categorical)

scale of measurement is a set of categories that differ in quality, not quantity or magnitude - categories may be unordered (nominal) or ordered (ordinal)

New cards

quantitative (numerical)

scale of measurement is a set of numerical values that differ in quantity or magnitude (can be ranked, and differences between values are meaningful)

New cards

discrete variable

can only assume separate, countable values (typically integers) - no fractional values in between

New cards

continuous variable

can assume any real value, including fractions - only limited by precision of instrument

New cards

nominal variable

qualitative/categorical, unordered and discrete (ex: hair, religion, color)

New cards

ordinal variable

qualitative/categorical, ordered and discrete (ex: preference for food, army ranks)

New cards

interval variable

quantitative/numerical, discrete or continuous - uniform intervals between adjacent values, arbitrary 0, subtraction and addition make sense (ex: calendar year, degrees Celsius)

New cards

ratio variable

quantitative/numerical - has non-arbitrary true zero that means a complete absence of something, multiplication and division make sense (ex: height, number of siblings)

New cards

cross-sectional data

observations on different individual units at the same point in time (ex: the current presidential approval rating)

New cards

time series data

observations on a variable over time (ex: how the Amazon stock price varies year after year)

New cards

pooled cross sections

data from multiple years based on different cross-sectional samples of the same population - take a cross section of individuals and ask them a question, and do this year after year with new cross sections each time

New cards

panel or longitudinal survey

time series for each cross-sectional member in a data set - choose a cross-section of individuals and ask them the same questions over a time period (ex: Terman's termites)

New cards

tables

units of analysis are placed in rows, variables in columns

New cards

bar charts

qualitative data, use categories

New cards

ogive

uses first column of table as x-axis and cumulative frequency or percentage as vertical axis - will always trend upward or plateau, will never dip down
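
As a rough illustration of why an ogive never dips, the sketch below builds cumulative frequencies and cumulative percentages from a frequency table. The class counts are made-up numbers, not data from this unit.

```python
from itertools import accumulate

# Hypothetical frequency table: number of observations in each ordered class.
counts = [4, 7, 5, 3, 1]

# Running total of the counts: this is what the ogive plots on the vertical axis.
cum_freq = list(accumulate(counts))

# Same curve expressed as a percentage of all observations.
total = cum_freq[-1]
cum_pct = [100 * c / total for c in cum_freq]

print(cum_freq)  # [4, 11, 16, 19, 20]
print(cum_pct)   # [20.0, 55.0, 80.0, 95.0, 100.0]
```

Because counts cannot be negative, each running total is at least as large as the one before it, so the curve can only rise or stay flat.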

New cards

stem and leaf plots

no loss of data, can be rotated to show spread of data

New cards

histograms

quantitative data, gives frequency of ordered data - all bins on horizontal axis should have the same width, use Sturges' rule to calculate the number of bins
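
A minimal sketch of the bin-count calculation, assuming the common form of Sturges' rule, k = 1 + log2(n), rounded up to a whole number. The function names here are hypothetical helpers, and a course may state the rule in the equivalent form 1 + 3.322 log10(n) or round differently.

```python
import math

def sturges_bins(n: int) -> int:
    """Suggested number of histogram bins for n observations (Sturges' rule)."""
    if n < 1:
        raise ValueError("need at least one observation")
    # k = 1 + log2(n), with the logarithm rounded up to a whole number.
    return 1 + math.ceil(math.log2(n))

def equal_bin_width(values, k: int) -> float:
    """Width of each bin when the data range is split into k equal bins."""
    return (max(values) - min(values)) / k

print(sturges_bins(100))  # 8
print(equal_bin_width([10, 14, 22, 30, 42], 4))  # 8.0
```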

New cards

things to watch out for in visual displays

dramatic title, 3D and rotated graphs, gratuitous effects, appeal to authority figures, vague/no source, estimated data, funky axis scaling, non-zero origin

New cards

descriptive research questions

describe the problem (how many, what)

New cards

explanatory research questions

explain why/how the problem is occurring

New cards

theory

answer to a "how" or "why" question or speculative idea offered as an explanation - somewhat contested, becomes a law after it's been repeatedly verified

New cards

concepts

abstract ideas that are the building blocks of a theory (ex: religion, success)

New cards

hypothesis

theory that has been made concrete (replace concepts in a theory with variables)

New cards

instrument

measurement device like a survey, test, scale, ruler

New cards

unit of analysis

the entity about which we collect information - characteristics/properties of these entities are called variables

New cards

unit of measurement

units used to record measurements of a variable (ex: dollars, inches)

New cards

robust/resistant statistics

statistics not strongly affected by outliers (median, mode)

New cards

mode

most common value - can be determined for nominal, ordinal, and interval-ratio data, may have more than one mode for a set of data

New cards

median

the 50th percentile, can only be determined for ordinal and interval-ratio data

New cards

mean

average - can only be calculated for interval-ratio data, takes into account the value of each item in a set of data (not resistant, can be affected by outliers), cannot be determined for grouped data if there's an open class
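
To see the difference between resistant and non-resistant statistics, the sketch below compares mode, median, and mean on a small made-up data set before and after one value is replaced with an outlier. It uses Python's standard `statistics` module.

```python
from statistics import mean, median, multimode

# Made-up scores; the second list swaps the largest value for an outlier.
scores = [2, 3, 3, 4, 5]
with_outlier = [2, 3, 3, 4, 50]

for label, data in (("original", scores), ("with outlier", with_outlier)):
    # multimode returns a list because a data set may have more than one mode.
    print(label, "mode:", multimode(data), "median:", median(data), "mean:", mean(data))
```

The mode and median stay at 3 in both lists, while the mean moves from 3.4 to 12.4, which is why the mean is described as not resistant.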

New cards

trimmed means

calculate the mean after removing the same number of lowest and highest values (ex: remove lowest three and highest three numbers)
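
One way to sketch a trimmed mean, assuming the same number of values `k` is dropped from each end. The helper name and the sample data are illustrative only.

```python
def trimmed_mean(values, k: int) -> float:
    """Mean of the values after dropping the k lowest and k highest."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must leave at least one value")
    # Sort first so the slice removes the extremes, not arbitrary positions.
    kept = sorted(values)[k:len(values) - k]
    return sum(kept) / len(kept)

print(trimmed_mean([1, 2, 3, 4, 100], 1))  # 3.0
```

Here the untrimmed mean would be 22.0, so dropping one value from each end removes most of the outlier's pull.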

New cards

range

max-min, can be misleading if there are outliers

New cards

average deviation from the mean

calculate how far away on average the values are from the mean - numbers below the mean will have a negative distance, so this value will always equal zero because positive and negative deviations cancel out, misleadingly suggesting that there is no variation

New cards

average absolute deviation from the mean

take distance of each value from the mean, but put it into absolute value before averaging - solves sign problem and gives you a non-negative number in the original units

New cards

variance

take distance of each value from the mean and then square it before averaging- solves sign problem but gives you units squared

New cards

standard deviation

the square root of variance - take distance of each value from the mean and then square it before averaging, then take square root of your result
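
The four dispersion measures above can be compared side by side on one small made-up data set. This sketch follows the cards literally and averages by dividing by n (the population form); a course that uses sample formulas would divide the squared deviations by n - 1 instead.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
n = len(data)
mean = sum(data) / n

# Signed distance of each value from the mean.
deviations = [x - mean for x in data]

avg_dev = sum(deviations) / n                      # signs cancel out
avg_abs_dev = sum(abs(d) for d in deviations) / n  # original units
variance = sum(d ** 2 for d in deviations) / n     # squared units
std_dev = math.sqrt(variance)                      # back to original units

print(avg_dev, avg_abs_dev, variance, std_dev)  # 0.0 1.5 4.0 2.0
```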

New cards

coefficient of variation

std dev/mean * 100%, helps us assess which of two or more interval-ratio variables has more variation (smaller CV = less variation)

New cards

standard unit (z-score)

(x-mean)/std dev, tells us by how many standard deviations a value lies above or below the mean of the data set - helps standardize data and makes it easier to compare

New cards

use CV when

comparing two or more variables and want to know which has more variation, or when comparing two or more groups with respect to a single variable and want to know which has relatively more dispersion

New cards

use z-score when

comparing two or more individual values of different variables and want to know which value is relatively more extreme or exceptional
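
A short sketch of both tools, using the formulas from the cards above. The function names and all of the numbers are hypothetical, chosen only to show one comparison of each kind.

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """CV as a percentage: std dev / mean * 100."""
    return std_dev / mean * 100

def z_score(x: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# CV: which variable has relatively more variation?
cv_a = coefficient_of_variation(std_dev=5, mean=50)    # about 10%
cv_b = coefficient_of_variation(std_dev=10, mean=200)  # about 5%
print("A varies more" if cv_a > cv_b else "B varies more")

# z-score: which individual value is more exceptional within its own distribution?
z_exam_1 = z_score(85, mean=70, std_dev=10)  # 1.5
z_exam_2 = z_score(80, mean=60, std_dev=8)   # 2.5
print("exam 2 score is more extreme" if abs(z_exam_2) > abs(z_exam_1) else "exam 1 score is more extreme")
```

Note that variable B has the larger standard deviation but the smaller CV, and the lower raw score (80) has the larger z-score, which is the point of standardizing before comparing.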

New cards

empirical rule

works well for bell-shaped distributions - about 68% of the data fall within one standard deviation of the mean, about 95% within two, and about 99.7% within three
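
The usual empirical-rule percentages (roughly 68%, 95%, and 99.7%) can be checked against an exact normal curve. This sketch uses `statistics.NormalDist` from the Python standard library (3.8+); `share_within` is a hypothetical helper name, and real data that are only approximately bell-shaped will match these figures only approximately.

```python
from statistics import NormalDist

def share_within(k: float, dist: NormalDist = NormalDist()) -> float:
    """Share of a normal distribution lying within k standard deviations of its mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```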

New cards

histogram skew

based on where the tail of histogram lies - if tail goes to left, you have a left skew; if tail goes to right, you have a right skew

New cards

combinations

order doesn't matter (AB=BA)

New cards

permutation

order matters (AB ≠ BA)
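
To make the AB versus BA distinction concrete, the sketch below counts and lists the ways to choose 2 letters out of 5. It relies on `math.comb` and `math.perm` (Python 3.8+) and on `itertools`; the letters are arbitrary placeholders.

```python
from itertools import combinations, permutations
from math import comb, perm

letters = "ABCDE"

# Counting only: choose 2 of 5.
n_combinations = comb(5, 2)  # order ignored
n_permutations = perm(5, 2)  # order counted

# Listing the actual selections shows where the extra permutations come from.
combo_list = ["".join(c) for c in combinations(letters, 2)]
perm_list = ["".join(p) for p in permutations(letters, 2)]

print(n_combinations, n_permutations)  # 10 20
print("BA" in combo_list, "BA" in perm_list)  # False True
```

Each unordered pair such as AB appears twice among the permutations (AB and BA), so there are twice as many permutations as combinations when choosing 2 items.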

New cards

random experiment

experiment must have two or more outcomes, and there must be uncertainty as to which outcome will occur (ex: flipping a coin, drawing a card)

New cards

sample space (s)

set of all basic outcomes (ex: heads, tails)

New cards

basic outcome

one of the possible results from a random experiment (ex: getting heads)

New cards

events

a combination of one or more basic outcomes, typically represented by uppercase letters (ex: Event A = rolling an even number on a die)

New cards

empirical estimation

estimate the probability of an event from its observed relative frequency in data - necessary when we have no prior knowledge of events, hard to figure out with just logic but can be done with data

New cards

law of large numbers

as the number of trials (sample size) increases, the observed relative frequency of an event gets closer to its true probability
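
A small simulation can illustrate the law of large numbers: the relative frequency of heads in simulated fair coin flips tends to settle near 0.5 as the number of flips grows. This is only a sketch; the function name, the seed, and the flip counts are arbitrary choices.

```python
import random

def relative_frequency_of_heads(n_flips: int, seed: int = 0) -> float:
    """Simulate n_flips fair coin flips and return the share that came up heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

The probability of heads itself never changes; what improves with more flips is how closely the observed share matches it.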

New cards

classical probability

don't need actual data, can reason it out logically

New cards

subjective probability

necessary when a repeatable random experiment is not available, reflects personal judgement or expert opinion about the likelihood of an event - often when an event is new and we don't have past data to work from

New cards

probability tree

to find the probability of a basic outcome, multiply the probabilities along the branches leading to that outcome
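
As a worked example of multiplying along branches, the sketch below follows one path of a two-level tree: drawing two aces in a row from a standard 52-card deck without replacement. The scenario and the helper name `path_probability` are illustrative assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import prod

def path_probability(branch_probs):
    """Probability of one path through a tree: the product of its branch probabilities."""
    return prod(branch_probs, start=Fraction(1))

# Branch 1: first card is an ace (4 aces among 52 cards).
# Branch 2: second card is an ace, given the first was (3 aces among 51 cards).
two_aces = path_probability([Fraction(4, 52), Fraction(3, 51)])

print(two_aces)  # 1/221
```

The second branch uses a conditional probability (3/51 rather than 4/52) because the first draw changes what is left in the deck.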