39 Terms
1
PROBABILITY
Appropriately conducting and interpreting biostatistical applications requires attention to a number of important issues: ● Clearly defining the objective or research question; ● Choosing an appropriate study design; ● Selecting a representative sample, and ensuring that the sample is of sufficient size; ● Carefully collecting and analyzing the data; ● Producing appropriate summary measures or statistics; ● Generating appropriate measures of effect or association; ● Quantifying uncertainty; ● Appropriately accounting for relationships among characteristics; and ● Limiting inferences to the appropriate population.
2
Probability theory
The collection and summarization of data and the drawing of inferences require knowledge of biostatistics principles grounded in _____.
3
Probability sampling
each member of the population has a known probability of being selected.
4
Non-probability sampling
members of the population are selected without the use of a random (probability-based) mechanism, so their chances of selection are unknown.
5
Probability sampling
includes simple random sampling, systematic sampling, and stratified sampling.
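The three probability sampling schemes named in this card can be sketched in a few lines. This is only an illustration: the population of IDs 1 to 100, the two strata, the sample sizes, and the function names are all assumptions made for the example, not part of the card set.

```python
import random

def simple_random_sample(population, k, rng):
    """Simple random sampling: every subset of size k is equally likely."""
    return rng.sample(population, k)

def systematic_sample(population, k, rng):
    """Systematic sampling: random start, then every step-th member."""
    step = len(population) // k      # sampling interval (assumes k <= len(population))
    start = rng.randrange(step)      # random starting position within the first interval
    return population[start::step][:k]

def stratified_sample(strata, k_per_stratum, rng):
    """Stratified sampling: a simple random sample drawn within each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)                   # fixed seed so the sketch is repeatable
population = list(range(1, 101))         # hypothetical subject IDs 1..100
srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strata = {"low": population[:50], "high": population[50:]}  # hypothetical strata
strat = stratified_sample(strata, 5, rng)
```

In each scheme the chance of selecting any given member can be worked out in advance, which is what separates these methods from convenience or quota sampling.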
6
Non-probability sampling
includes convenience sampling and quota sampling.
7
Probability
refers to the proportion of times an event is expected to occur in the long run. It is a number that reflects the likelihood that a particular event, such as sampling a particular individual from a population into a sample, will occur. It is a mathematical tool that helps us understand and address chance. It permits dealing with random variability in constructive ways.
8
Random variable
is a numerical quantity that takes on different values depending on chance. This is not very different from the initial definition of a variable, except that the process that produces the value of the variable is random.
9
Discrete random variables
form a countable set of possible values.
10
Continuous random variables
form an unbroken continuum of possible values.
11
Event
is an outcome or set of outcomes for a random variable.
12
Binomial probability distributions
are the most common type of probability distribution that applies to discrete random variables in which: ● There are n independent observations. ● Each observation can be characterized as a "success" or "failure". ● The probability of "success" for each observation is a constant p.
13
Binomial random variables
have two parameters: n (number of observations) and p (probability of success for each observation). The cumulative probability of an event is the probability of observing a given value x or less. Again, we focus on events where the outcome is binomial (dichotomous), denoted by "success" or "failure" as mentioned above.
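As a small worked example of the two parameters and of cumulative probability, the sketch below computes the probability of exactly x successes and of x or fewer successes. The values n = 4 and p = 0.5 and the function names are assumptions chosen for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x): probability of exactly x successes in n independent observations."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(x, n, p):
    """P(X <= x): cumulative probability of observing x successes or fewer."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p = 4, 0.5                            # hypothetical parameters
prob_exactly_2 = binomial_pmf(2, n, p)   # 6 * 0.5**4 = 0.375
prob_at_most_2 = binomial_cdf(2, n, p)   # 0.0625 + 0.25 + 0.375 = 0.6875
```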
14
Normal (Gaussian) distributions
are the most common type of probability distribution that applies to continuous random variables. They are recognized by their symmetry, bell shape, points of inflection at population mean minus one standard deviation and population mean plus one standard deviation, and horizontal asymptotes as they approach the X axis on either side.
15
Z-distribution
Many random variables encountered in nature are not Normal. However, many of them can be made approximately Normal by re-expressing them with a mathematical transformation such as logs, exponents, powers, roots, and quadratics. To determine Normal probabilities, values are first standardized, which transforms them to the standard Normal scale with mean 0 and standard deviation 1.
16
Z-score
tells how far a value falls from the mean, in standard deviation units.
17
Positive z-scores
Values that are larger than the mean.
18
Negative z-scores
Values that are smaller than the mean.
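Standardizing a value and reading off a Normal probability might look like the sketch below. The mean of 100 and standard deviation of 15 are hypothetical values chosen for the example; `NormalDist` is from Python's standard `statistics` module.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd

mean, sd = 100, 15                 # hypothetical population mean and SD
z_above = z_score(130, mean, sd)   # 2.0: value larger than the mean, positive z
z_below = z_score(85, mean, sd)    # -1.0: value smaller than the mean, negative z

# Probability that a standard Normal variable falls at or below z_above
p_below = NormalDist().cdf(z_above)
```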
19
Statistical inference
the act of using data in a particular sample to make generalizations about the population from which it came.
20
Standard error of the mean
The standard deviation of the sampling distribution of the sample mean. It measures the precision of the sample mean as an estimate of the population mean. It is inversely proportional to the square root of the sample size.
21
Square root law
Because of it, the standard error of the mean gets smaller and smaller as the sample size gets larger and larger. Therefore, sample means based on large n are more likely to fall close to the true value of the population mean than means based on small n (all other things being equal).
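The square root law can be seen numerically with a short sketch. The standard deviation of 10 and the sample sizes are assumed values for illustration. Each fourfold increase in n halves the standard error.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean: sd divided by the square root of n."""
    return sd / sqrt(n)

sd = 10.0   # hypothetical population standard deviation
se_by_n = {n: standard_error(sd, n) for n in (4, 16, 64, 256)}
# n = 4 -> 5.0, n = 16 -> 2.5, n = 64 -> 1.25, n = 256 -> 0.625
```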
22
Central limit theorem
When sample size n is large, the sampling distribution of the sample mean tends toward Normality even when the population is not Normal.
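A small simulation can illustrate the theorem. The sketch below assumes a strongly skewed population (exponential with mean 1), a sample size of 50, and 2000 repeated samples. All of these are choices made for the example. If the sample means are close to Normal, they should center near 1, have a spread near 1/sqrt(50), and roughly 68% of them should fall within one standard error of the population mean.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(42)   # fixed seed so the simulation is repeatable
n = 50                    # size of each sample (assumed)
replicates = 2000         # number of repeated samples (assumed)

# Draw repeated samples from a skewed (exponential, mean 1) population
# and record the mean of each sample.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(replicates)
]

theoretical_se = 1.0 / sqrt(n)        # population SD is 1 for this exponential
center = mean(sample_means)           # should be close to 1
spread = stdev(sample_means)          # should be close to theoretical_se
within_one_se = sum(
    abs(m - 1.0) <= theoretical_se for m in sample_means
) / replicates                        # should be roughly 0.68 if near Normal
```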
23
Hypothesis testing
uses a deductive procedure to judge claims about parameters. A specific hypothesis is made about a population parameter, and a sample statistic is used to assess how consistent the data are with that hypothesis.
24
Small P-value
indicates that observed data are unlikely to have come from the distribution suggested by the null hypothesis.
25
Alpha
sets the standard for how extreme the data must be before we can reject the null hypothesis and is predetermined when designing studies.
26
P-value
is computed from the data and indicates how extreme the data are under the null hypothesis.
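The relationship between alpha and the P-value can be sketched with a one-sample z test. This assumes the population standard deviation is known, which is a simplification. The sample mean, null mean, standard deviation, and sample size below are hypothetical numbers.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, sd, n):
    """Two-sided P-value for a one-sample z test (population sd assumed known)."""
    z = (sample_mean - null_mean) / (sd / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05   # chosen before the data are collected
p_value = two_sided_p_value(sample_mean=104.0, null_mean=100.0, sd=15.0, n=100)
reject_null = p_value < alpha   # reject only if the data are extreme enough
```

Here the test statistic is 4 / 1.5, about 2.67 standard errors from the null value, so the P-value is below 0.05 and the null hypothesis is rejected at that alpha.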
27
Type I error
an erroneous rejection of a true null hypothesis.
28
Type II error
an erroneous retention of a false null hypothesis.
29
Estimation
We use sample statistics to produce estimates about unknown population parameters.
30
Point estimation
provides a single estimate of the parameter.
31
Interval estimation
provides a range of values (a confidence interval) that seeks to capture the parameter.
32
Confidence level of a confidence interval
refers to the success rate of the method in capturing the parameter it seeks.
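A point estimate and an interval estimate can be computed side by side, as in the sketch below. It assumes a known population standard deviation (a z-based interval) and uses hypothetical numbers. The function name is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sd, n, level=0.95):
    """Confidence interval for a population mean (population sd assumed known)."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    margin = z_star * sd / sqrt(n)                      # margin of error
    return sample_mean - margin, sample_mean + margin

point_estimate = 104.0   # hypothetical sample mean
low, high = z_confidence_interval(point_estimate, sd=15.0, n=100)
# Roughly 104 plus or minus 2.94, i.e. about (101.06, 106.94)
```

The 95% refers to the method: across many repeated samples, about 95% of intervals built this way would capture the population mean.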
33
Systematic errors
come in these three general forms: confounding, information bias, and selection bias.
34
Confounding
is especially problematic in non-experimental studies. It derives from the mixing together of the effects of the explanatory variable and extraneous variables lurking in the background.
35
Confounders
Extraneous variables that cause confounding.
36
Information bias
arises from defects in measurement. With categorical data, this corresponds to misclassification of the explanatory variable or response variable or both. Such misclassifications may be either differential or non-differential.
37
Differential misclassifications
affect some groups more than others. These biases can lead to false positive or false negative results.
38
Non-differential misclassification
occurs to the same extent in the groups being compared. These biases tend to push results either toward the null hypothesis (no association) or not at all.
39
Selection bias
arises from the manner in which subjects are selected for the study. A lack of eligibility criteria and/or the deliberate assignment of study participants into groups to bring about expected results are examples of sources of selection bias.