Biostats unit 2

0.0(0)

Studied by 0 people

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Card Sorting

1/49

There's no tags or description

Looks like no tags are added yet.

Study Analytics

Name	Mastery	Learn	Test	Matching	Spaced

No study sessions yet.

50 Terms

New cards

Probability distribution

a statistical function that describes all the possible values and likelihoods that a random variable can take within a given range.

New cards

Combination

If a set consists of n objects, and we wish to form a subset of x objects from these n objects, without regard to the order of the objects in the subset

New cards

Nonnegative function

a probability distribution (sometimes called a probability density function) of the continuous random variable X if the total area bounded by its curve and the x -axis is equal to 1 and if the subarea under the curve bounded by the curve, the x -axis, and perpendiculars erected at any two points a and b give the probability that X is between the points a and b.

New cards

Sampling distribution

The distribution of all possible values that can be assumed by some statistic, computed from samples of the same size randomly drawn from the same population

New cards

Standard error

The square root of the variance of the sampling distribution

-use SE for sample means ) —> Z=x-u/SE

New cards

Statistical inference

1) allows us to answer probability questions about sample statistic

2) Provide necessary theory for making the statistical inference procedures valid

the procedure by which we reach a conclusion about a population on the basis of the information contained in a sample drawn from that population.

New cards

Point estimate

a single numerical value used to estimate the corresponding population parameter.

-the point estimate of the population mean is the sample mean

New cards

Interval estimate

consists of two numerical values defining a range of values that, with a specified degree of confidence, most likely includes the parameter being estimated.

New cards

Sampled population

the population from which one actually draws a sample.

New cards

Target population

the population about which one wishes to make an inference.

New cards

p hat (p̂)

sample proportion

New cards

population proportion

New cards

Z score equation

New cards

T-distribution

a continuous probability distribution that generalizes the standard normal distribution. Like the latter, it is symmetric around zero and bell-shaped.

<p><strong><em>a continuous probability distribution that generalizes the standard normal distribution</em></strong><span>. Like the latter, it is symmetric around zero and bell-shaped.</span></p>

New cards

Chi square distribution

used for variances and frequencies

New cards

F-distribution

used for inferences on 2 unknown population variances

New cards

Estimator

unbiased if the expected value of the estimator is the value of the population parameter

New cards

Normal curve

If P(z ≤ z₁) < 0.5 → Left side of the curve → Negative z1
If P(z ≤ z₁) > 0.5 → Right side of the curve → Positive z1
If P(z ≤ z₁) = 0.5 → z1=0 (exactly at the mean).

<ul><li><p class=""><strong>If P(z ≤ z₁) < 0.5</strong> → Left side of the curve → Negative z1</p></li><li><p class=""><strong>If P(z ≤ z₁) > 0.5</strong> → Right side of the curve → Positive z1</p></li><li><p class=""><strong>If P(z ≤ z₁) = 0.5</strong> → z1=0 (exactly at the mean).</p></li></ul><p></p>

New cards

Mean symbol for pop

New cards

Standard deviation symbol

New cards

Z-score formula

o=for single data point

SE=for sample mean

New cards

Probability of success symbol

New cards

Probability distribution

All probabilities (P(X=x)) must be non-negative.
The total sum of probabilities (∑P(X=x)) must equal 1.

New cards

Bernoulli trial

When a random process or experiment called a trial can result in only one of two mutually exclusive outcomes (like heads or tails, dead or alive, sick or well)

New cards

The Bernoulli process

1) Each trial results in 1 of 2 possibly mutually exclusive outcomes one is denoted success and the other is failure

2) The probability of success, denoted by p, remains constant from trial to trial. The probability of failure is 1-p and if often denoted by q

3) The trials are independent, that is the outcome of any particular trial is not affected by the outcome of any other trial

New cards

Combinations

When order does not matter, those subsets are

New cards

Permutation

When order does matter, the arrangement is called

New cards

Binomial distribution

applies to an infinite or finite population where the sampling is done with replacement

New cards

Continuous random variable

one that can assume any value within a specified interval

Exp: weight, height, amount of drug in system, amount of HDL in blood

New cards

Normal distribution

1) Symmetric about mean

2) Mean=median=mode

3) Total area under curve above the x-axis is 1 sq-unit (since it’s a probability distribution) since it is symmetric 50% of the area is to left of the mean & 50% of the area is to the right of the mean

4) If we draw perpendicular lines 1 standard dev. on either side of the mean, the the area enclosed is about 68%. If we go out 2 standard dev. on each side we encompass about 95%

If we go out 3—> we get 99.7% of the area

(n)(p) >= 5

n(1-p) >=5

<p>1) Symmetric about mean</p><p>2) Mean=median=mode</p><p>3) Total area under curve above the x-axis is 1 sq-unit (since it’s a probability distribution) since it is symmetric 50% of the area is to left of the mean & 50% of the area is to the right of the mean</p><p>4) If we draw perpendicular lines 1 standard dev. on either side of the mean, the the area enclosed is about 68%. If we go out 2 standard dev. on each side we encompass about 95%</p><p>If we go out 3—> we get 99.7% of the area</p><p>(n)(p) >= 5</p><p>n(1-p) >=5</p>

New cards

Empirical rule

-Completely determined by its parameters

-u determines location/center, the horizontal position number line

-o determines spread/width of the bell, how narrow/wide the bell is

-the smaller the value of sigma, the narrower the bell and vice versa

<p>-Completely determined by its parameters</p><p>-u determines location/center, the horizontal position number line</p><p>-o determines spread/width of the bell, how narrow/wide the bell is</p><p>-the smaller the value of sigma, the narrower the bell and vice versa</p>

New cards

Sample proprtion

p hat

New cards

How do we create a sampling distribution of a discrete finite population?

1) From the finite pop of size N, randomly draw all possible samples of size n

2) Compute the sample statistic for each sample

3) Create a frequency distribution with the first column being the distinct values of the sample statistic and the second with the frequencies of these values

New cards

Discrete variable

a type of variable that can only take specific, separate values. Think of it as something you can count—like the number of books on a shelf, the number of people in a room, or the number of cars in a parking lot. You can't have half a car or 2.7 people, so the values are distinct and not continuous.

New cards

Central Limit Theorem

when you take a large enough sample from any population (no matter its shape), the distribution of the sample means will tend to look like a normal distribution (bell-shaped curve), even if the original population isn’t normally distributed. The sample means will have the same average (μ\mu) as the population and a smaller spread (standard error).

It’s the reason we can use normal distributions to make predictions and draw conclusions in statistics!

New cards

.90

1.645

New cards

.95

1.96

New cards

.99

2.576

New cards

Convert to Z-score

New cards

Working with counts

If you're counting how many individuals, use n×p and square root of np (1-p)

<p>If you're counting <strong>how many individuals</strong>, use <strong>n×p and square root of np (1-p)</strong></p>

New cards

Working with proportions

If you're looking at the proportion of a sample, use p and square root of p(1−p)/n

<p>If you're looking at <strong>the proportion of a sample</strong>, use <strong>p and square root of p(1−p)/n</strong></p>

New cards

Sample mean

x bar

New cards

Confidence interval equation

New cards

Degrees of freedom

df=n-1

New cards

T distribution vs. Z distribution

Use a z-distribution when:

The population standard deviation is known:
- You can calculate a z-score because you have precise information about the population's standard deviation.
- For example, in quality control analysis, population parameters are often known.
Sample size is large:
- Typically, n>30n > 30 is considered large enough for the Central Limit Theorem to apply, making the z-distribution appropriate.
Data is normally distributed or approximately normal:
- Z-distributions assume normality and are suitable when the data meets this condition.

Use a t-distribution when:

The population standard deviation is unknown:
- You use the sample standard deviation as an estimate, which introduces variability. The t-distribution accounts for this.
Sample size is small:
- When n≤30n \leq 30, the t-distribution is more appropriate because smaller sample sizes are more prone to variability.
Data is normal or approximately normal:
- Like the z-distribution, the t-distribution assumes normality but is more forgiving for small samples.

New cards

Precision of the estimate

-refers to margin of error

New cards

Variance of a binomial distribution

New cards

Expected value

E(x) = (n) (p)

New cards

When the sample size decreases…

the margin of error increases, which causes the confidence interval to widen.

New cards

Confidence interval for proportion