Lecture 5 - Populations, Samples and Variables

57 Terms

1

What is a population?

A collection of things having some quantifiable characteristic in common

Objects, animals, people

A theoretical concept, often can't be counted

Dynamic, constant movement in and out

The group we want to study

2

How do we define populations?

Inclusion criteria

Exclusion criteria

Confounders

3

What are inclusion criteria?

The common characteristics of the population

4

What are exclusion criteria?

Characteristics that members of the population must not have

5

What are confounders?

Characteristics of a population that could potentially affect study results

Things to watch for

Related to risk factor and outcome

6

What are inferential statistics?

Based on studies of populations

Results should apply to all individuals belonging to that population

Take care when extrapolating results from studies on a specific population to those who don't meet the inclusion and exclusion criteria for that population

7

What is internal validity?

The study was done right: its design and conduct support its conclusions

8

What is external validity?

The study means something: its results can be generalized beyond the sample

9

What is a sample?

The group we actually study

A group of individuals that represents the population

10

What 2 concepts are essential to sample selection?

Random sampling

Law of independence

11

What is random sampling?

Every member of the population has an equal chance of being selected

12

What is the law of independence?

Selection of one member does not influence the chance of choosing any other

13

An independent, random sample is chosen in such a way that...

Every possible combination of size N has an equal chance of being selected
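
The idea of an independent, random sample can be sketched in Python. This is a minimal illustration, not part of the lecture: the population of 100 animal ID numbers, the sample size of 10, and the fixed seed are all assumptions made for the example.

```python
import random

# Hypothetical population: ID numbers for 100 animals.
population = list(range(1, 101))

# A fixed seed (an assumption, for repeatability only).
rng = random.Random(42)

# random.sample draws without replacement, and every possible
# subset of size n is equally likely to be chosen.
n = 10
sample = rng.sample(population, n)

print(sorted(sample))
```

Because the draw is made without replacement, no animal can appear twice in the sample.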

14

What is a parameter?

A numerical (true) value that summarizes data for the population

What value we want to know

Stable, even though the population itself is dynamic

Not constant

15

What is a statistic?

The estimate of the parameter in a sample

The value for our study

16

What is a confidence interval?

Statistic +/- margin of error

Based on our study, what we think the range is for the parameter
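
As a rough sketch of "statistic +/- margin of error", the snippet below computes a 95% confidence interval for a mean. The weights are made-up data, and the 1.96 multiplier is the normal-approximation value, which is an assumption; the lecture does not say which multiplier it uses.

```python
import math
import statistics

# Hypothetical sample: body weights (kg) of 12 dogs.
weights = [22.1, 25.4, 19.8, 30.2, 27.5, 24.0,
           21.3, 26.8, 23.9, 28.1, 20.7, 25.0]

n = len(weights)
mean = statistics.mean(weights)        # the statistic
sd = statistics.stdev(weights)         # sample standard deviation
standard_error = sd / math.sqrt(n)

# Assumption: 1.96 is the normal-approximation multiplier for 95%.
# For a sample this small, a t multiplier (about 2.20 for 11 degrees
# of freedom) would give a slightly wider interval.
margin_of_error = 1.96 * standard_error

ci_low = mean - margin_of_error
ci_high = mean + margin_of_error

print(f"mean = {mean:.2f} kg, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```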

17

With a 95% CI... you are ___ that 95% of __ will produce an __ around the __ that contains the __ for that population

Confident

Samples

Interval

Statistic

Parameter

18

A larger sample size gets...

More of the population you are sampling

A statistic that is closer to the parameter

Narrower confidence interval
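
To see why a larger sample narrows the interval, here is a small sketch. The function name `margin_of_error`, the standard deviation of 3.0, and the 1.96 multiplier are all assumptions chosen for illustration.

```python
import math

def margin_of_error(sd, n, z=1.96):
    """Margin of error for a mean under a normal approximation."""
    return z * sd / math.sqrt(n)

sd = 3.0  # hypothetical standard deviation, held fixed

# The margin shrinks with the square root of n, so quadrupling
# the sample size halves the margin of error.
for n in (10, 40, 160):
    print(n, round(margin_of_error(sd, n), 3))
```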

19

Each piece of data we collect is called a...

Variable

Each variable is measured in each member of the sample

Ex: weight, breed, BCS

20

How do we describe variables?

The type of variable (categorical, quantitative, qualitative)

The scale of measurement (binary, nominal, ordinal, interval, ratio)

The distribution of the values (normal, also called Gaussian or parametric; or nonparametric)

21

All 3 descriptions of variables combine to tell us...

What statistics we can use

22

What are qualitative variables?

Freeform, no structure

23

What are quantitative variables?

Numbers that reflect a value

24

What are the 2 types of quantitative variables?

Discrete: a limited set of countable values (e.g., litter size)

Continuous: extensive range of possible values (weight)

25

What are categorical variables?

No numerical value even if the categories have been assigned a number

26

What are the types of categorical variables?

Binary or dichotomous: 2 choices (Y/N, T/F, +/-) (intact)

Multinomial: more than 2 choices (breed)

Ordinal: more than 2 choices and ordered (BCS)

27

What are scales found in quantitative data?

Interval: continuous variables, difference between values is consistent (age in years, weight in kg)

Ratio: compare variables using different scales, relative distance from 1, uncommon in stats (test positivity)

28

How do we describe continuous variables?

Distribution

29

What is a distribution?

The pattern you see when you graph the frequency of the variable's different values

30

What is a histogram?

A bar graph showing the frequency of each value or range of values (bin)
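
A text-only histogram can be sketched as follows. The weights and the 5 kg bin width are assumptions made for the example.

```python
from collections import Counter

# Hypothetical sample: body weights (kg) of 12 dogs.
weights = [22.1, 25.4, 19.8, 30.2, 27.5, 24.0,
           21.3, 26.8, 23.9, 28.1, 20.7, 25.0]

bin_width = 5  # assumed bin width in kg

# Map each weight to the lower edge of its bin, then count per bin.
bins = Counter(int(w // bin_width) * bin_width for w in weights)

# Print one row per bin, with one '#' per observation.
for start in sorted(bins):
    print(f"{start:>2} to <{start + bin_width} kg | {'#' * bins[start]}")
```

The shape of the rows is the distribution: here most values fall in the two middle bins.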

31

What are things to look for in distribution?

Skew and outliers

32

What is skew?

Whether the distribution is symmetric

If it is not symmetric, it is skewed: the values stretch farther in one direction than the other

Right skew means a tail to the right (positive skewness) and left skew means a tail to the left (negative skewness)
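
One way to check the direction of skew numerically is sketched below, using made-up values with a single large value in the right tail. Comparing mean and median is only a rule of thumb; the moment-based skewness shown here is one common definition among several.

```python
import statistics

# Hypothetical data with a long right tail.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]

mean = statistics.mean(values)
median = statistics.median(values)

# Moment-based skewness: the average cubed z-score.
# Positive means a tail to the right, negative a tail to the left.
pop_sd = statistics.pstdev(values)
skewness = sum(((v - mean) / pop_sd) ** 3 for v in values) / len(values)

print(f"mean = {mean}, median = {median}, skewness = {skewness:.2f}")
```

The tail pulls the mean (4.7) above the median (3), which matches a right skew.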

33

What is an outlier?

A point that differs markedly from the rest of the data

Error is the most common reason

34

What are the 2 ways we can describe distributions?

Central tendency

Variability

35

What are the measures of central tendency?

Mode: most common value

Median: 50th percentile, half higher and half lower

Mean: the average
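
The three measures can be computed with Python's standard `statistics` module. The body condition scores below are invented for the example.

```python
import statistics

# Hypothetical body condition scores (BCS) for 8 animals.
scores = [4, 5, 5, 5, 6, 6, 7, 9]

mode = statistics.mode(scores)      # most common value
median = statistics.median(scores)  # middle of the sorted values
mean = statistics.mean(scores)      # sum divided by count

print(f"mode = {mode}, median = {median}, mean = {mean}")
```

With an even number of values, the median is the average of the two middle values.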

36

What are the measures of variability?

Standard deviation: roughly the typical distance of values from the mean (the square root of the variance)

Variance: standard deviation squared

Range

Percentiles
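
These measures can also be sketched with the `statistics` module. The five weights are made up, and the quartiles use the module's default method, which is only one of several percentile conventions.

```python
import statistics

# Hypothetical weights (kg).
weights = [20, 22, 24, 26, 28]

variance = statistics.variance(weights)  # sample variance (divides by n - 1)
sd = statistics.stdev(weights)           # square root of the variance
value_range = max(weights) - min(weights)

# Quartiles are the 25th, 50th, and 75th percentiles.
quartiles = statistics.quantiles(weights, n=4)

print(f"variance = {variance}, sd = {sd:.3f}, range = {value_range}")
print(f"quartiles = {quartiles}")
```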

37

What measure of central tendency should be used when a distribution is all over the place?

Mode

38

What measure of central tendency should be used when a distribution is skewed, but not all over the place?

Median

39

If we have a normal distribution how will our measures of central tendency look?

They will be the same

40

What are normal/gaussian/parametric distributions?

Symmetrically distributed

Describe with mean and SD

Mean = median = mode

95% of observations are within ~2 SD of the mean
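
The "within about 2 SD" rule can be checked by simulation. This is a sketch under assumed settings: 10,000 draws from a normal distribution with mean 50 and SD 10, using a fixed seed.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed, an assumption for repeatability

# Simulate 10,000 draws from a normal distribution (mean 50, SD 10).
values = [rng.gauss(50, 10) for _ in range(10_000)]

mean = statistics.mean(values)
sd = statistics.stdev(values)

# Fraction of observations within 2 SD of the mean.
within = sum(1 for v in values if abs(v - mean) <= 2 * sd)
fraction = within / len(values)

print(f"fraction within 2 SD: {fraction:.3f}")
```

The printed fraction should land close to 0.95, with small differences from run to run if the seed is changed.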

41

What are nonparametric distributions?

Non-symmetrical

Describe with median and percentiles

42

How do we describe distributions with categorical data?

Describe with counts and percentages
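
Counts and percentages for a categorical variable can be tallied as in this sketch. The list of breeds is invented for the example.

```python
from collections import Counter

# Hypothetical breed recorded for each of 8 animals.
breeds = ["Labrador", "Poodle", "Labrador", "Beagle",
          "Labrador", "Poodle", "Beagle", "Labrador"]

counts = Counter(breeds)
total = len(breeds)

# Percentage of the sample in each category.
percentages = {breed: 100 * count / total for breed, count in counts.items()}

for breed, count in counts.most_common():
    print(f"{breed}: {count} ({percentages[breed]:.0f}%)")
```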

43

How do we describe qualitative distributions?

Thematic analysis (repetitions)

Quotations (examples)

44

What are descriptive statistics?

Who is in the study and what are their characteristics

Describe each variable by type, scale, and distribution

Using tables and graphs

45

Why are descriptive statistics beneficial?

Have an idea of who the population is

Helps you compare your study sample to your target population

46

What are we evaluating in our inferential statistics?

Make comparisons with the data set (are values different between groups, is an event more common in one group)

Draw conclusions based on the results, hypothesis testing

Ask the question, is X associated with Y?

47

What is variable X?

Explanatory variable

AKA independent variable, predictor variable, risk factor, exposure of interest

May be associated with the response variable without necessarily causing a change in it

48

What is variable Y?

Response variable

AKA outcome variable, dependent variable, event of interest

49

What is linear regression?

Continuous predictor, continuous outcome

y = mx + b

y = outcome

x = predictor

m = slope

b = intercept (the value of y when x is 0)

50

m = 0 means...

There is no linear relationship between x and y

51

For every 1 unit change in x...

There will be, on average, an m-unit change in y
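
A least-squares fit of y = mx + b can be sketched in plain Python. The helper name `fit_line` and the age and weight data are assumptions for illustration; the cards do not specify a fitting method, and ordinary least squares is used here because it is the usual default.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope (m) and intercept (b) for y = mx + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = co-variation of x and y divided by variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b

# Hypothetical data: age in months (x) and weight in kg (y).
ages = [1, 2, 3, 4, 5]
weights = [5.1, 6.9, 9.2, 10.8, 13.0]

m, b = fit_line(ages, weights)
print(f"weight = {m:.2f} * age + {b:.2f}")
```

Here the slope of about 1.97 reads as: each additional month of age goes with roughly 1.97 kg more weight, on average.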

52

What is multivariable regression?

Creates an equation to explain the relationship between the outcome, predictors, and other things like confounders

Allows us to consider multiple risk factors simultaneously

Often requires a larger data set

53

What is logistic regression?

Similar to linear, outcome is now dichotomous

Regression coefficients can be exponentiated to produce odds ratios

Slope coefficients are on the log-odds scale, so they are hard to interpret on their own
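
The link between a logistic regression coefficient and an odds ratio can be sketched as follows. The intercept of -1.0 and the coefficient of 0.7 are made-up values, not fitted results, and the helper names are hypothetical.

```python
import math

# Hypothetical fitted values on the log-odds scale.
intercept = -1.0
coefficient = 0.7

def predicted_probability(x):
    """Logistic model: probability of the outcome at predictor value x."""
    return 1 / (1 + math.exp(-(intercept + coefficient * x)))

def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

# Exponentiating the coefficient gives the odds ratio per 1-unit change in x.
odds_ratio = math.exp(coefficient)

# The same ratio appears when comparing the odds at x = 1 and x = 0.
ratio_from_model = odds(predicted_probability(1)) / odds(predicted_probability(0))

print(f"odds ratio = {odds_ratio:.3f}")
```

An odds ratio of about 2 means the odds of the outcome roughly double for each 1-unit increase in x.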

54

Almost all statistics can be done using...

Generalized linear models

55

What do we need to know for generalized linear models?

The question we are asking

The distribution of the outcome variable

Any confounders we want to control for

Any population structure/clustering to account for

56

Probability is...

Between 0 (never going to happen) and 1 (guaranteed to happen)

57

P-value is...

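The probability of seeing a result at least as extreme as the one observed if there were truly no effect; we use it to judge whether what we observed is real or could have happened by chance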
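
A small worked example may help. Suppose, hypothetically, that 9 of 10 animals improve on a treatment, and that "no effect" means each animal has a 50% chance of improving. The one-sided p-value is the chance of 9 or more improving under that assumption. The scenario, the 50% null value, and the helper name `binomial_tail` are all assumptions for illustration.

```python
import math

def binomial_tail(n, k, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(
        math.comb(n, i) * p ** i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )

# One-sided p-value: 9 or more improvements out of 10 if the
# true chance of improvement were 50%.
p_value = binomial_tail(10, 9)

print(f"p-value = {p_value:.4f}")
```

The result is 11/1024, about 0.011, so an outcome this extreme would be unusual if the treatment truly had no effect.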