Midterm Review

What is an independent variable (IV)?

A variable that affects or determines a dependent variable.

What is a dependent variable (DV)?

A variable whose value depends on another variable (the IV); it is typically the outcome.

What is a nominal scale?

A scale that assigns names or categories with no meaningful order

What is an ordinal scale?

A scale where numbers represent ordered categories, but differences between values may not be meaningful.

What are examples of the Nominal scale?

Examples: colors, race/ethnicity, biological sex, yes/no

What are examples of the Ordinal scale?

Examples: rating level of agreement from 1 to 5, pH, Richter scale, NYC restaurant grades, Homeland Security warning levels

What is an interval scale?

A scale with equal differences between values, but no true zero point.

What are examples of the Interval scale?

Example: A 10-degree difference in temperature means the same thing whether it is the difference between 30 and 40 degrees or between 100 and 110 degrees.
Another example: Years (e.g., 2013).

What is a ratio scale?

A scale with both equal differences between values and a meaningful zero point (e.g., height, weight).

What is a discrete variable?

A variable that can only take on distinct, separate values, usually whole numbers (e.g., number of students). It can include a decimal in special situations (e.g., sneaker size 5.5).

What is a continuous variable?

A variable that can take any value within a range (e.g., height).

What is a frequency and percent distribution table for nominal data?

It organizes nominal data into categories with the corresponding frequencies and percentages.

Can pie charts be used for nominal data?

Yes, pie charts can be used as long as the categories are mutually exclusive and exhaustive, so that the slices add up to 100%.

What types of graphs can be used for ordinal, interval, or ratio data?

Frequency and percent distribution tables, stem-and-leaf plots, histograms, and line graphs.

What is a histogram used for?

A histogram displays ordinal, interval, or ratio data. The lack of space between bars indicates continuity and reflects the ordered nature of the variable.

When is a line graph most appropriate?

A line graph is most often used when two variables are involved, one of which is time.

What is a stem-and-leaf plot?

A plot where each data value is split into two parts: the "stem" and the "leaf." The stem represents the first part (usually digits), and the leaf represents the remaining part of the value.

What is symmetry in data?

Symmetry occurs when the data are evenly distributed on either side of the mean.

What is cumulative percent?

The percent of values at or below a certain point in a distribution.

What is a percentile rank?

The percentage of values that fall below a given value.

What is an example of a percentile rank?

If you're in the 90th percentile, 90% of values fall below your score.
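
As a rough illustration of the definition above, this Python sketch computes a percentile rank as the percentage of values strictly below a given score. The function name `percentile_rank` and the scores are made up for this example, and some textbooks also count half of any tied values, which this sketch does not do.

```python
def percentile_rank(values, score):
    """Percentage of values that fall strictly below `score`."""
    below = sum(1 for v in values if v < score)
    return 100 * below / len(values)


# Hypothetical exam scores, for illustration only.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
rank = percentile_rank(scores, 95)
print(rank)  # 80.0, because 8 of the 10 scores are below 95
```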

How do you calculate percentages?

Divide the given amount by the total amount and multiply by 100.
Example: (60/75) x 100 = 80%.

How do you calculate cumulative percents?

Add the percentage of the current row to the cumulative percentage of the previous rows.
Example: If Row 1 is 15% and Row 2 is 24.6%, the cumulative percent for Row 2 is 39.6%.
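
The row-by-row rule above can be sketched in Python as a running total. The helper name `frequency_table` and the counts are assumptions made for this example; the categories must be listed in their natural order for the cumulative column to make sense.

```python
def frequency_table(counts):
    """Return (category, frequency, percent, cumulative percent) rows."""
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    for category, freq in counts.items():
        percent = 100 * freq / total
        cumulative += percent  # current row added to all previous rows
        rows.append((category, freq, round(percent, 1), round(cumulative, 1)))
    return rows


# Hypothetical ratings on a 1-4 ordinal scale, listed in order.
counts = {1: 3, 2: 5, 3: 8, 4: 4}
table = frequency_table(counts)
for row in table:
    print(row)
```

With these counts the percent column is 15, 25, 40, 20 and the cumulative column is 15, 40, 80, 100; the last cumulative percent is always 100.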

What is the mode?

The mode is the data value that occurs most often. It is valid for nominal, ordinal, interval, and ratio level data, as well as for both discrete and continuous data.

In what situation might the mode describe the center of a distribution?

The mode describes the center when the distribution is unimodal, meaning the most frequent value reflects the center.

When might the mode not reflect the center of a distribution?

When the distribution is bimodal or multimodal, the mode may not describe the center as well.
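
To make the unimodal versus bimodal contrast concrete, here is a small Python sketch using the standard library's `statistics.multimode` (Python 3.8+). Both data sets are invented for illustration.

```python
from statistics import median, multimode

# Hypothetical data sets, for illustration only.
unimodal = [1, 2, 2, 3, 3, 3, 4, 4, 5]
bimodal = [1, 1, 1, 2, 3, 4, 5, 5, 5]

unimodal_modes = multimode(unimodal)
bimodal_modes = multimode(bimodal)

print(unimodal_modes, median(unimodal))  # [3] 3 -> the mode sits at the center
print(bimodal_modes, median(bimodal))    # [1, 5] 3 -> neither mode is central
```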

What is the median?

The median is the value below which 50% of all data values fall. It is also known as the second quartile (Q2), fifth decile (D5), or 50th percentile (C50).

Is the median affected by outliers?

No, it is robust and is not significantly affected by outliers.

How do you find the median in a data set?

Arrange the values in order. If the number of values is odd, the median is the middle value. If even, the median is the average of the two middle values.
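
The odd/even rule above can be written as a short Python function. This is a minimal sketch with made-up data; the standard library's `statistics.median` follows the same rule.

```python
def median(values):
    """Middle value if the count is odd, mean of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = median([7, 1, 5, 3, 9])   # sorted: 1 3 5 7 9 -> 5
even_median = median([7, 1, 5, 3])     # sorted: 1 3 5 7 -> (3 + 5) / 2 = 4.0
print(odd_median, even_median)
```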

What is the mean?

The mean is the arithmetic average of all data values in the distribution. It represents the 'balance point' of the data.

How do you calculate the mean?

Add all the data points together and divide by the number of values.

Why is the mean referred to as the 'center of balance'?

Because the sum of the distances from the mean for values below it equals the sum of the distances for values above it.
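
The balance property can be checked numerically. The following Python sketch uses invented values and simply compares the total distance below the mean with the total distance above it.

```python
# Hypothetical data, chosen so the arithmetic is easy to follow.
values = [2, 4, 6, 8, 30]

mean = sum(values) / len(values)                    # 50 / 5 = 10.0
below = sum(mean - v for v in values if v < mean)   # 8 + 6 + 4 + 2 = 20.0
above = sum(v - mean for v in values if v > mean)   # 30 - 10 = 20.0

print(mean, below, above)
```

Note that the single large value (30) balances four smaller ones, which is also why one outlier can pull the mean so far.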

What happens to the mean when outliers are present in a data set?

Outliers can distort the mean, pulling it towards the extreme values and making it less representative of the data.

What happens to the mean, median, and mode in a unimodal and symmetric distribution?

In a unimodal and symmetric distribution, the mean, median, and mode are all equal.

What do measures of dispersion tell us?

Measures of dispersion provide information on the spread of the data. Examples include the range, interquartile range (IQR), variance, and standard deviation (SD).

What questions do measures of dispersion answer?

They help determine the limits (min/max), the central distribution of scores, and whether the data is symmetrical or skewed.

What are the four most common measurements of dispersion?

Range, Interquartile Range (IQR), Variance, and Standard Deviation (SD).

Can the range and IQR be used for ordinal data?

Yes, the range and IQR can be used for ordinal data, but variance and SD are for interval or ratio data only.

What is the range?

The range is the simplest measure of dispersion, calculated by subtracting the minimum value from the maximum value.

Why is the range considered problematic?

The range is sensitive to outliers, as it only considers the two extreme values in a data set.

What is the interquartile range (IQR)?

The range between the first quartile (Q1) and the third quartile (Q3), representing the middle 50% of the data.

Why is the IQR considered more stable than the range?

The IQR is resistant to outliers because it focuses on the central portion of the data, rather than the extremes.
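
A quick way to see the difference in stability is to compute both measures with and without an outlier. This Python sketch uses `statistics.quantiles` with `method="inclusive"`, which is one of several quartile conventions and an assumption here; a course or software package may compute Q1 and Q3 slightly differently. The data are invented.

```python
from statistics import quantiles


def data_range(values):
    """Maximum minus minimum."""
    return max(values) - min(values)


def iqr(values):
    """Q3 minus Q1, using the 'inclusive' quartile convention (an assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # 9 replaced by an extreme value

print(data_range(clean), iqr(clean))                # 8 4.0
print(data_range(with_outlier), iqr(with_outlier))  # 89 4.0
```

The range jumps from 8 to 89 while the IQR stays at 4.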

What is standard deviation (SD)?

Roughly the average distance between each score and the mean, used to measure the spread of the data. Formally, it is the square root of the variance.

What does a large SD indicate?

A large SD indicates that the data points are widely spread around the mean, while a small SD suggests that the data points are closely clustered around the mean.

Can SD ever be negative?

No, SD is never negative. It is zero only when every value in the data set is identical.

How is SD commonly reported?

SD is usually reported alongside the mean, for example: M(SD) = 30(4).

How does SD relate to outliers?

SD is highly sensitive to outliers, which can increase the value of SD significantly.

What does it mean when scores deviate from the mean?

It means that each score differs from the mean, either slightly or dramatically, which contributes to the SD value.

What is variance?

Variance is the square of the standard deviation, representing the spread of the data in squared units.

Why do we use SD instead of variance?

SD is more interpretable because it is in the same units as the original data, while variance is in squared units (e.g., IQ² instead of IQ).
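
The relationship between variance and SD can be traced step by step. This Python sketch uses the population formulas (dividing by n), which is an assumption; many courses use the sample formulas, which divide by n - 1 and give slightly larger values. The scores are invented.

```python
from math import sqrt

# Hypothetical scores, for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)                        # 40 / 8 = 5.0
squared_deviations = [(x - mean) ** 2 for x in scores]  # 9, 1, 1, 1, 0, 0, 4, 16
variance = sum(squared_deviations) / len(scores)        # 32 / 8 = 4.0 (squared units)
sd = sqrt(variance)                                     # 2.0 (original units)

print(mean, variance, sd)
```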

What is a Z-score?

A Z-score represents how many standard deviations a data point is from the mean.

What does a Z-score of 0 indicate?

A Z-score of 0 indicates that the data point is exactly at the mean.

How are positive and negative Z-scores interpreted?

Positive Z-scores indicate values above the mean, and negative Z-scores indicate values below the mean.
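
The three Z-score cards above follow from one formula, z = (value - mean) / SD. A minimal Python sketch, assuming a hypothetical scale with mean 100 and SD 15:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Hypothetical scale with mean 100 and SD 15.
above = z_score(130, 100, 15)    # 2.0  -> two SDs above the mean
at_mean = z_score(100, 100, 15)  # 0.0  -> exactly at the mean
below = z_score(85, 100, 15)     # -1.0 -> one SD below the mean

print(above, at_mean, below)
```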

What is the 5-number summary in a box plot?

The 5-number summary includes the minimum, Q1 (first quartile), median (Q2), Q3 (third quartile), and the maximum.

What does the box in a box plot represent?

The box represents the middle 50% of the data, from Q1 to Q3.

What do the lines extending from the box (whiskers) represent in a box plot?

The whiskers extend to the minimum and maximum values that are not considered outliers.
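
The cards do not say how outliers are identified. One common convention, assumed here, flags values more than 1.5 × IQR below Q1 or above Q3. This Python sketch computes the 5-number summary and the whisker ends under that assumption, using the `inclusive` quartile method of `statistics.quantiles`; the helper name `box_plot_summary` and the data are hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values, k=1.5):
    """5-number summary plus whisker ends under the assumed k * IQR rule."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    spread = q3 - q1
    low_fence, high_fence = q1 - k * spread, q3 + k * spread
    inside = [v for v in ordered if low_fence <= v <= high_fence]
    outliers = [v for v in ordered if v < low_fence or v > high_fence]
    return {
        "five_number": (ordered[0], q1, q2, q3, ordered[-1]),
        "whiskers": (inside[0], inside[-1]),  # most extreme non-outliers
        "outliers": outliers,
    }


summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 90])
print(summary["five_number"])  # (1, 3.0, 5.0, 7.0, 90)
print(summary["whiskers"])     # (1, 8)
print(summary["outliers"])     # [90]
```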

Where is the median located in a box plot?

The median is marked within the box at Q2.

What is positive skew?

Positive skew occurs when the majority of the data is on the left, with the tail extending to the right.

What is negative skew?

Negative skew occurs when the majority of the data is on the right, with the tail extending to the left.

What effect does an outlier have on a distribution's skewness?

Outliers can cause skewness, as they pull the mean toward the extreme values, making the distribution asymmetric.

How do outliers affect the mean and median?

The mean is more affected by outliers, while the median is more resistant to them.

What are categorical (nominal) data?

Categorical data are variables that can be divided into distinct groups, with no inherent order or ranking.

What are important characteristics of categorical data?

Categories should be mutually exclusive and exhaustive, meaning there is no overlap between categories and all possible categories are included.

Why are strict definitions necessary for categorical variables? (With example)

Strict definitions ensure clarity and consistency, particularly for complex variables like drug use or survey responses on sensitive topics.

Example: Survey results showed that 59% of a student sample did not consider oral sex to be "sex," illustrating the importance of clear definitions.

What are crosstabs used for?

Crosstabs are used to compare two variables to calculate risk, typically between a factor or predictor and a disease or outcome.

What is relative risk (RR)?

Relative risk is a comparison of the risk of developing a disease or outcome in one group with the risk in another group.

What does an RR of 1 mean?

An RR of 1 means there is no association between exposure and the disease or outcome.

What does an RR greater than 1 indicate?

An RR greater than 1 indicates an increased risk of disease or outcome for those exposed.

What does an RR less than 1 indicate?

An RR less than 1 indicates a decreased risk of disease or outcome for those exposed.

What is an odds ratio (OR)?

The odds ratio compares the odds of an event occurring in one group to the odds of it occurring in another group.

How does OR differ from RR?

Unlike RR, which compares probabilities (risks), OR compares odds and does not depend on row totals.

How do you calculate the relative risk for lung cancer among smokers vs. non-smokers?

Risk for smokers = a/(a+b) = 0.023
Risk for non-smokers = c/(c+d) = 0.004
RR = 0.023/0.004 = 5.75
Smokers have almost 6 times the risk of developing lung cancer compared to non-smokers.

How do you calculate the odds ratio for lung cancer among smokers vs. non-smokers?

Odds for smokers = a/b = 0.024
Odds for non-smokers = c/d = 0.004
OR = 0.024/0.004 = 6.0
Smokers have 6 times the odds of developing lung cancer compared to non-smokers.
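
Both calculations can be run from a single 2×2 table. The cards give only the resulting risks and odds, not the cell counts, so the counts in this Python sketch are hypothetical values chosen to reproduce risks of 0.023 and 0.004. Here a and b are smokers with and without lung cancer, and c and d are non-smokers with and without lung cancer.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds in the exposed group divided by odds in the unexposed group."""
    return (a / b) / (c / d)


# Hypothetical counts (not from the source) matching risks of 0.023 and 0.004.
a, b = 23, 977   # smokers: with / without lung cancer
c, d = 4, 996    # non-smokers: with / without lung cancer

rr = relative_risk(a, b, c, d)
odds = odds_ratio(a, b, c, d)
print(round(rr, 2))    # 5.75
print(round(odds, 2))  # 5.86
```

With unrounded odds the ratio comes out near 5.86 rather than 6.0; the card's value of 6.0 results from dividing the rounded odds 0.024 by 0.004. When the outcome is rare, as here, OR and RR are close.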

What are the four possible outcomes in a diagnostic test?

True Positive (TP) – test is positive, and the subject has the disease.
False Negative (FN) – test is negative, but the subject has the disease.
False Positive (FP) – test is positive, but the subject does not have the disease.
True Negative (TN) – test is negative, and the subject is free of the disease.

What is sensitivity in diagnostic tests?

Sensitivity is the ability of a test to correctly identify those with the disease (True Positives). Formula: TP/(TP+FN).

What is specificity in diagnostic tests?

Specificity is the ability of a test to correctly identify those without the disease (True Negatives). Formula: TN/(TN+FP).
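
The two formulas above can be applied directly to the four outcome counts. The counts in this Python sketch are invented for illustration.

```python
def sensitivity(tp, fn):
    """Share of people with the disease who test positive: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of people without the disease who test negative: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical screening results: 100 people with the disease, 900 without.
tp, fn = 90, 10
tn, fp = 870, 30

sens = sensitivity(tp, fn)   # 90 / 100 = 0.9
spec = specificity(tn, fp)   # 870 / 900, about 0.967
print(sens, round(spec, 3))
```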

What is the number needed to treat (NNT)?

NNT represents the number of people who need to receive a treatment to prevent one adverse event or achieve one positive outcome.
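
The card does not give a formula. NNT is usually computed as 1 divided by the absolute risk reduction (the risk without treatment minus the risk with treatment), rounded up to a whole person. This Python sketch assumes that convention; the function name and the trial counts are hypothetical.

```python
from fractions import Fraction
from math import ceil


def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """1 / absolute risk reduction, rounded up (a common convention)."""
    control_risk = Fraction(control_events, control_n)
    treated_risk = Fraction(treated_events, treated_n)
    arr = control_risk - treated_risk  # absolute risk reduction
    if arr <= 0:
        raise ValueError("treatment does not reduce risk in these data")
    return ceil(1 / arr)


# Hypothetical trial: 10 of 100 untreated and 6 of 100 treated have the event.
nnt = number_needed_to_treat(10, 100, 6, 100)
print(nnt)  # 25, because the risk drops by 4 percentage points (1 / 0.04)
```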

What are examples of NNT?

324 girls must be vaccinated with Gardasil® to prevent 1 case of cervical cancer.
175 people must be vaccinated to prevent 1 case of shingles.
Fewer than 50 children must be vaccinated for influenza to prevent 1 case.

What is a rate in statistics?

A rate describes the frequency of an event occurring, often expressed per 1,000 or 100,000 people.

Formula:
Rate = (Number of events / Total population) x k
Where k is a multiplier (e.g., 1,000 or 100,000).

What is the incidence rate?

The incidence rate is the number of new cases of a disease over a specific time period, usually one year, relative to the population at risk.
Formula: Incidence Rate = (New cases / Population at risk).

What is the prevalence rate?

The prevalence rate is the total number of cases (new and existing) at a specific point in time.
Formula: Prevalence Rate = (All cases / Total population).

What is morbidity rate?

Morbidity rate refers to the rate of disease in a population. It can refer to incidence or prevalence rates.

What is mortality rate?

Mortality rate is the rate of death in a population during a specific period, often expressed per 1,000 or 100,000 people.

What is birth rate?

Birth rate is the number of live births per 1,000 people in a population during a specific period.

How is a crude mortality rate calculated?

The crude mortality rate is calculated by dividing the number of deaths by the total population and multiplying by a given factor (usually 1,000 or 100,000).

Example: In 1980, California had a population of 23 million and 190,247 deaths. The crude mortality rate would be:
Crude Mortality Rate = (190,247 / 23,000,000) x 1,000 ≈ 8.27 deaths per 1,000 people.

How is the incidence rate of AIDS calculated?

In 2000, there were 41,311 new cases of AIDS in the US. With a population of 281,500,000, the incidence rate would be:
Incidence Rate = (41,311 / 281,500,000) x 100,000 ≈ 14.68 new cases per 100,000 people.
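
Both worked examples use the same general rate formula, (events / population) × k. A minimal Python sketch, with a hypothetical helper named `rate`, reproduces them:

```python
def rate(events, population, k):
    """Events per k people: (events / population) * k."""
    return events / population * k


# Figures taken from the two examples in the cards.
crude_mortality = rate(190_247, 23_000_000, 1_000)   # California, 1980
aids_incidence = rate(41_311, 281_500_000, 100_000)  # US, 2000

print(round(crude_mortality, 2))  # 8.27 deaths per 1,000 people
print(round(aids_incidence, 2))   # 14.68 new cases per 100,000 people
```

The multiplier k only changes the scale of the result; 1,000 suits common events such as deaths, and 100,000 suits rarer ones.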