Lecture 4: Probability: Sampling, Conditional Probability, Sensitivity, Specificity, PPV and NPV


49 Terms

1
New cards

Dichotomous Graphical Summary

Bar chart

2
New cards

Dichotomous Numerical Summary

Frequency table

3
New cards

Categorical Graphical Summary

Bar chart

4
New cards

Categorical Numerical Summary

Frequency table

5
New cards

Ordinal Graphical Summary

Histogram

6
New cards

Ordinal Numerical Summary

Frequency table + cumulative frequency

7
New cards

Discrete Graphical Summary

Boxplot

8
New cards

Discrete Numerical Summary

Frequency table + cumulative frequency

9
New cards

Continuous Graphical Summary

Boxplot

10
New cards

Continuous Numerical Summary

SD, mean, variance, median, IQR, mode

11
New cards

SD, Mean, Variance, Range

More affected by outliers

12
New cards

Role of Probability

  • Probabilities are numbers that reflect the likelihood that a particular event occurs

  • Statistical inference involves making generalizations or inferences about unknown population parameters based on sample statistics

  • A population parameter is any summary measure computed on a population (e.g., the population mean, which is denoted μ; the population variance, which is denoted σ²)

13
New cards

In General, (Role Of Probability)

  • Select a sample from the population of interest

  • Measure the characteristic under study

  • Summarize this characteristic in our sample

  • Make inferences about the population based on what we observe in the sample

14
New cards

Probability Basics

  • Probability reflects the likelihood that an outcome will occur

  • 0 ≤ probability ≤ 1

  • Probability of 0 means no chance that a particular event will occur

  • Probability of 1 indicates that an event is certain to occur

15
New cards

Sampling

Population size = N, sample size = n

  • When we select a sample from a population, we want that sample to be representative of the population

16
New cards

Two Main Types of Sampling

  • Probability sampling: each member of the population has a known probability of being selected

  • Non-probability sampling: members of the population are selected without the use of probability, so their selection probabilities are unknown

17
New cards

Probability sampling: each member of the population has a known probability of being selected

If we select subjects at random (e.g. by simple random sampling), then each subject has the same probability of being selected. This means each subject is equally likely to be selected.

18
New cards

Probability Sampling

  • Simple random sampling

  • Systematic sampling

  • Stratified sample

  • Cluster sampling

  • Multistage sampling

19
New cards

Simple random sampling

  • Need to build a sampling frame

  • Select n individuals at random (on any single draw, each individual has the same probability, 1/N, of being selected)

  • Most useful with small population
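
A minimal sketch of simple random sampling, assuming the sampling frame is a plain Python list of subject IDs. The frame contents, the function name `simple_random_sample`, and the seed are hypothetical and only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members from the sampling frame without replacement."""
    rng = random.Random(seed)      # seeded generator so the draw is reproducible
    return rng.sample(frame, n)    # every subset of size n is equally likely

# Hypothetical sampling frame: subject IDs 1..100 (N = 100)
frame = list(range(1, 101))
sample = simple_random_sample(frame, n=10, seed=0)
print(sample)
```

Because the draw is without replacement, no subject can appear in the sample twice.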

20
New cards

Need to build a sampling frame

A complete list or enumeration of all N members of the population

21
New cards

Systematic sampling

Start with the sampling frame; determine the sampling interval (N/n); select the first person at random from the first N/n individuals in the frame, then select every (N/n)th person thereafter
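
The procedure can be sketched as follows, assuming the frame is an ordered list and rounding the interval N/n down to a whole number when N is not a multiple of n. The name `systematic_sample` and the example frame are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first interval, then every k-th member."""
    N = len(frame)
    k = N // n                     # sampling interval N/n (rounded down; assumes n <= N)
    rng = random.Random(seed)
    start = rng.randrange(k)       # random position among the first k members
    return [frame[start + i * k] for i in range(n)]

# Hypothetical sampling frame: subject IDs 1..100, so the interval is 100/10 = 10
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=1)
print(sample)
```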

22
New cards

Stratified sample

Organize population into mutually exclusive groups or strata

23
New cards

Organize population into mutually exclusive groups or strata

  • These groups are different from each other (e.g. demographic groups)

  • Individuals within each group are similar to each other

  • Select individuals at random within each stratum
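
A minimal sketch of stratified sampling, assuming the strata are already given as a dictionary mapping a stratum name to its members and that the same number of individuals is drawn from each stratum. The stratum names and sizes are made up for illustration.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of fixed size within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical, mutually exclusive strata (e.g. age groups) holding subject IDs
strata = {
    "age_18_39": list(range(0, 50)),
    "age_40_64": list(range(50, 120)),
    "age_65_plus": list(range(120, 150)),
}
sample = stratified_sample(strata, n_per_stratum=5, seed=0)
print(sample)
```

Drawing an equal number per stratum is only one possible allocation; drawing in proportion to stratum size is another common choice.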

24
New cards

Cluster sampling

When clusters exist which are very similar

25
New cards

When clusters exist which are very similar

  • Groups are similar to each other (natural groups, e.g. neighborhoods, zip code)

  • Within each group subjects may be quite different

  • Then, we select specific clusters and sample everyone within them
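
A minimal sketch of cluster sampling, assuming the clusters are chosen at random (the card only says "specific clusters", so random selection is an assumption here) and that everyone in a chosen cluster is included. The neighborhood names and member IDs are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose whole clusters at random and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # pick cluster names
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical natural groups (neighborhoods) holding subject IDs
clusters = {
    "north": [1, 2, 3, 4],
    "south": [5, 6, 7],
    "east": [8, 9, 10, 11, 12],
    "west": [13, 14],
}
sample = cluster_sample(clusters, n_clusters=2, seed=0)
print(sample)
```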

26
New cards

Multistage sampling

Combine types of sampling techniques

27
New cards

Non-Probability Sampling

  • Used in practice because it is sometimes not possible to generate a sampling frame

  • Convenience sampling

  • Quota sampling

28
New cards

Convenience sampling

  • Non-probability sample (not for inference)

  • For preliminary data

  • Not representative

29
New cards

Quota sampling

  • Select a predetermined number of individuals into sample from groups of interest

  • Similar to stratified sampling in that the groups are non-overlapping and different from each other; however, the quotas do not have to represent the population, and individuals within each group are selected by convenience sampling

30
New cards

Sampling variability

  • We make inferences about a large number of individuals in a population based on a study of only a small fraction of the population (i.e., the sample)

  • If a study is replicated or repeated on another sample from the population, it is possible that we might observe slightly different results (slightly different sample)

  • From any given population, there are many different samples that can be selected. The results based on each sample can vary, and this variability is called sampling variability

  • When we make estimates about population parameters based on sample statistics, it is extremely important to quantify the precision in our estimates

  • The probability distribution of a statistic produced by repeatedly selecting samples of the same size and computing the desired statistic is called the sampling distribution, e.g., sampling distribution of the sample mean
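
The idea of a sampling distribution can be illustrated with a small simulation. This is only a sketch: the population below is synthetic (normally distributed values with mean 120 and SD 15, chosen arbitrarily), and the sample size and number of repeats are assumptions.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of N = 10,000 measurements
population = [rng.gauss(120, 15) for _ in range(10_000)]
mu = statistics.mean(population)          # population parameter

def sample_means(population, n, repeats, rng):
    """Repeatedly draw samples of size n and record each sample mean."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]

# Approximate the sampling distribution of the sample mean for n = 25
means = sample_means(population, n=25, repeats=1000, rng=rng)

print("population mean:", round(mu, 2))
print("mean of sample means:", round(statistics.mean(means), 2))
print("SD of sample means:", round(statistics.stdev(means), 2))
```

The sample means differ from one sample to the next, which is the sampling variability described above, and their spread is much narrower than the spread of individual values in the population.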

31
New cards

Conditional probability

Probability of an outcome in a specific subpopulation or subsample
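
As a sketch, a conditional probability can be computed by restricting attention to the subgroup and taking the proportion with the outcome. The records below (smoking status and disease status) are invented purely for illustration, as is the helper name `conditional_probability`.

```python
def conditional_probability(records, outcome, condition):
    """P(outcome | condition): proportion with the outcome within the subgroup."""
    subgroup = [r for r in records if condition(r)]
    if not subgroup:
        raise ValueError("no records satisfy the condition")
    return sum(1 for r in subgroup if outcome(r)) / len(subgroup)

# Hypothetical data: 100 smokers (30 with disease), 200 non-smokers (20 with disease)
records = (
    [{"smoker": True, "disease": True}] * 30
    + [{"smoker": True, "disease": False}] * 70
    + [{"smoker": False, "disease": True}] * 20
    + [{"smoker": False, "disease": False}] * 180
)

p = conditional_probability(
    records,
    outcome=lambda r: r["disease"],
    condition=lambda r: r["smoker"],
)
print(p)  # 30 / 100 = 0.3
```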

32
New cards

Sensitivity and specificity

Screening tests are not used to make medical diagnoses but instead to identify individuals most likely to have a certain condition.

Some examples are PSA test for prostate cancer, mammograms for breast cancer, and serum and ultrasound assessments for prenatal diagnosis.

33
New cards

Evaluate the performance of the screening test (with dichotomous results)

  • Test comes back as one of two responses

  • But test classification can be different from the truth

34
New cards

Test comes back as one of two responses

  1. Positive (+), i.e., according to the test, you have the disease

  2. Negative (-), i.e., according to the test, you do not have the disease

35
New cards

But test classification can be different from the truth

  1. You actually have the disease (+)

  2. You actually don’t have the disease (-)

36
New cards

Sensitivity

True positive fraction

Probability that a diseased person screens positive = P (screen + | disease)

Ability of test to correctly identify those with disease

37
New cards

Specificity

True negative fraction

Probability that a disease-free person screens negative = P (screen - | disease free)

Ability of test to correctly identify those without disease

38
New cards

Positive; disease

True positive (TP) = have disease and test positive

39
New cards

Positive; no disease

False positive (FP) = do not have disease but test positive

40
New cards

Negative; disease

False negative (FN) = have disease but test negative

41
New cards

Negative; no disease

True negative (TN) = do not have disease and test negative

42
New cards

Sensitivity =

TP / (TP + FN)

43
New cards

Specificity =

TN / (FP + TN)
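
A short sketch of both formulas, using the four cell counts of the 2×2 table of screening result against true disease status. The counts are hypothetical.

```python
def sensitivity(tp, fn):
    """P(screen + | disease) = TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """P(screen - | disease free) = TN / (FP + TN)."""
    return tn / (fp + tn)

# Hypothetical screening results for 1,000 people
tp, fp, fn, tn = 90, 40, 10, 860

sens = sensitivity(tp, fn)   # 90 / 100
spec = specificity(tn, fp)   # 860 / 900
print(round(sens, 3), round(spec, 3))
```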

44
New cards

P (disease | screen positive) asks

What is the probability that I have the disease if my screening test comes back positive?

45
New cards

P (disease | screen positive) =

Positive predictive value (PPV)

46
New cards

P (disease free | screen negative) =

Negative predictive value (NPV)

47
New cards

Positive Predictive Value

TP / (TP + FP)

48
New cards

Negative Predictive Value

TN / (FN + TN)
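
The predictive values condition on the screening result rather than on true disease status, so the denominators are the totals of people who screened positive or negative. A minimal sketch with hypothetical counts:

```python
def positive_predictive_value(tp, fp):
    """P(disease | screen +) = TP / (TP + FP)."""
    return tp / (tp + fp)

def negative_predictive_value(tn, fn):
    """P(disease free | screen -) = TN / (FN + TN)."""
    return tn / (fn + tn)

# Hypothetical screening results for 1,000 people
tp, fp, fn, tn = 90, 40, 10, 860

ppv = positive_predictive_value(tp, fp)   # 90 / 130
npv = negative_predictive_value(tn, fn)   # 860 / 870
print(round(ppv, 3), round(npv, 3))
```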

49
New cards

Independence

Two events, A and B, are independent if P (A | B) = P (A) or if P (B | A) = P (B)
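
The definition can be checked numerically by comparing P(A | B) with P(A). The sketch below works from counts; the function name and the numbers are hypothetical, and a small tolerance is used because the probabilities are floating-point values.

```python
import math

def independent_from_counts(n_total, n_a, n_b, n_a_and_b):
    """Return True if P(A | B) equals P(A), up to floating-point tolerance."""
    p_a = n_a / n_total              # P(A)
    p_a_given_b = n_a_and_b / n_b    # P(A | B)
    return math.isclose(p_a, p_a_given_b, rel_tol=1e-9)

# Hypothetical counts among 200 people: 80 have A, 50 have B
print(independent_from_counts(200, 80, 50, 20))  # P(A) = 0.4, P(A | B) = 0.4
print(independent_from_counts(200, 80, 50, 35))  # P(A) = 0.4, P(A | B) = 0.7
```

In real data the two estimates will rarely match exactly even when the events are independent, so in practice the comparison is made with a statistical test rather than an equality check.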