Biostatistics for the Health Sciences - Everything for 1st Test

120 Terms

New cards

Nominal

order of categories irrelevant (also called unordered)

New cards

Ordinal

order of categories is meaningful (also called ordered)

New cards

Binary

Special case of categorical variable - only 2 possible values (also called dichotomous)

New cards

Discrete

Values equal to integers

New cards

Continuous

Values on a continuum

New cards

Is nominal data categorical or ordinal?

Categorical

New cards

Is blood type categorical or ordinal?

Categorical

New cards

Is "died of cancer" binary or nonbinary?

Binary

New cards

What are the types of categorical data?

Nominal, ordinal, binary

New cards

What are the types of quantitative data?

Discrete and continuous

New cards

Continuous examples

Blood pressure

Weight

Age

Lead concentration

New cards

Discrete examples

Number of babies out of 100 births who have low birth weight

Number of admissions to the emergency room

New cards

For a discrete variable, it isn’t sensible to

consider a value between two numbers (e.g. 1.5 heart attacks doesn’t make sense)

New cards

Although a continuous variable may be measured in whole numbers, it is still sensible

to consider a value between two numbers (age – 16.5 years old)

New cards

A quantitative measurement may be categorized and treated as a categorical variable

for the purpose of summaries (>25 years etc.)

New cards

Categorical variables are sometimes called ____, especially in stats classes

factors

New cards

A categorical variable with an inherent logical ordering (e.g. age brackets) may be treated as

nominal in some analyses

New cards

Categorical data is usually summarized by:

The proportion (percent) of observations in each of the categories
The number in each category (frequency / count)
Important to provide the totals (denominators of percentages)
N or n is often used to represent the total
N = sample size
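
As a rough illustration of the summary described above, the sketch below counts each category, converts the counts to percentages, and reports the total n used as the denominator. The blood-type values and the function name `summarize_categorical` are made up for the example.

```python
from collections import Counter

def summarize_categorical(values):
    """Return ({category: (count, percent)}, n) for a list of category labels."""
    counts = Counter(values)
    n = len(values)  # total sample size: the denominator of every percentage
    summary = {cat: (k, 100 * k / n) for cat, k in counts.items()}
    return summary, n

# Hypothetical blood types for 10 people (illustrative data only)
blood_types = ["O", "A", "O", "B", "A", "O", "AB", "O", "A", "O"]
summary, n = summarize_categorical(blood_types)
for cat, (count, pct) in sorted(summary.items()):
    print(f"{cat}: {count}/{n} ({pct:.0f}%)")
```

Printing the count together with n keeps the denominator visible, which is the point of the "provide the totals" reminder.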

New cards

Pie graphs represent a

categorical variable pictorially

New cards

Pie graphs display

data as a percentage of the whole

New cards

Pie graphs require

proportional reasoning

New cards

Pie graphs are especially difficult with

Ordinal data

New cards

Pie graphs are

not the best way to summarize data, but are common in media/non-expert reports

New cards

Pie graphs are most appropriate when / work better when

Most appropriate when the aim is to convey the relative size of the parts of a whole

Better when there are not too many categories (3-7)

New cards

Bar graphs present a

summary measure for each category by a bar

New cards

BARGRAPHS: For an ordinal categorical variable, order the bars

in the order of the categories

New cards

BARGRAPHS: For a nominal categorical variable, choose

an ordering that aids understanding (for ex, alphabetical or lowest to highest)

New cards

For quantitative data, we are usually interested in

The distribution of the observations

What are the most common or average values (center of the data?)
How spread out are the data? (variability of the data)
Are there some values far from the bulk of the data (outliers?)

New cards

Strategies for distribution of quantitative data

Visualize the distribution of the data with a graph
Summarize key aspects of the distribution with descriptive statistics, numerical descriptions of the center and spread of the data

New cards

Bar graphs can be

Stacked

New cards

Stacked bar graph methods

Can do totals, percent out of 100, or other

New cards

The histogram is a

graphical display of the distribution of quantitative data

New cards

Histogram: Horizontal scale (x) corresponds

to the values of the quantitative variable

New cards

Histogram: The x-axis is broken into a

contiguous series of sub-intervals (“classes” or “bins”)

New cards

Histogram: Bars are drawn that indicate the

frequencies or percentages of observations within each interval

New cards

Histogram:

The area of each bar corresponds to the number of observations in each bin
If all bins have the same width, then the heights of the bars also correspond to the number of observations in each bin

New cards

Information from a histogram

Typical values
How much variation is present
Shape of the distribution
Unusual or outlying values
Approximate frequency or percentage in a given range

New cards

Histograms can be sensitive to bin width (or cut-points)

Rule of thumb:

use number of bins = square root of number of observations
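
A minimal sketch of the square-root rule of thumb. Rounding to the nearest whole number is an assumption here, since the rule as stated does not say how to round, and `sqrt_rule_bins` is just an illustrative name.

```python
import math

def sqrt_rule_bins(n_observations):
    """Rule-of-thumb number of histogram bins: about sqrt(n), at least 1."""
    # Rounding to the nearest integer is an assumption; rounding up is also common.
    return max(1, round(math.sqrt(n_observations)))

print(sqrt_rule_bins(100))  # 100 observations
print(sqrt_rule_bins(33))   # 33 observations
```

It is still worth trying a few nearby bin counts, because histograms can change shape with the bin width.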

New cards

In a histogram, each observation is given

one unit of area

New cards

With unequal bin widths, easier

to use a bar graph

New cards

Unequal intervals are really a

bar chart – does not necessarily show shape of distribution

New cards

Similarly for an ordinal variable, it is more common to use the term

bar chart (concept of the distribution not the same as for quantitative variables)

New cards

We reserve the term histogram for

quantitative variables

New cards

Opposite of everyday use

New cards

One difference between a histogram and a bar graph is that, in a histogram, the number of observations is shown

by the area of each bar (the area over the bin), not by its height alone

New cards

A reason you would have a longer (wider) interval is

fewer observations

New cards

If you have the data, use

equal size bins

New cards

In a proper histogram, each observation is

given one unit of area so the histogram reflects the shape of the distribution
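
One way to see the "one unit of area per observation" idea with unequal bin widths is to set each bar's height to count / width, so that height x width gives back the count. The ages, bin edges, and the function name `bar_heights` below are hypothetical.

```python
def bar_heights(values, edges):
    """Bar heights such that height * bin width = number of observations in the bin.

    Bins are [lo, hi), except the last bin, which also includes its upper edge.
    """
    heights = []
    last = len(edges) - 2
    for i, (lo, hi) in enumerate(zip(edges, edges[1:])):
        if i == last:
            count = sum(lo <= v <= hi for v in values)
        else:
            count = sum(lo <= v < hi for v in values)
        heights.append(count / (hi - lo))
    return heights

ages = [1, 2, 3, 4, 12, 18, 25, 40]   # hypothetical ages
edges = [0, 5, 10, 20, 50]            # unequal bin widths: 5, 5, 10, 30
heights = bar_heights(ages, edges)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
total_area = sum(h * w for h, w in zip(heights, widths))
print(heights, total_area)
```

The widest bin (20 to 50) holds two observations but gets a short bar, so the picture still reflects where the data are dense.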

New cards

In a bar graph, observations may not always be given the same unit of area so the bar graph

may not reflect the shape of the distribution

New cards

Histogram needs to show

the SHAPE OF THE DISTRIBUTION (this matters more than having equal-sized bins)

New cards

Histograms for

quantitative data and to represent shape

New cards

Stem and leaf plot:

another graph for quantitative data;

New cards

STEM AND LEAF:

Decimal point is 1 digit to the left of |

New cards

(S&L) 0 | 2 2 3 =

0.02, 0.02, 0.03
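
A small sketch of how such a plot could be built when the decimal point is one digit to the left of the bar, so each leaf is a hundredth. The function name `stem_and_leaf`, the `leaf_unit` parameter, and the two extra data values are assumptions for illustration; it only handles non-negative values.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=0.01):
    """Return stem-and-leaf rows as strings, one leaf digit per observation."""
    stems = defaultdict(list)
    for v in sorted(values):
        units = round(v / leaf_unit)        # e.g. 0.02 -> 2 when leaf_unit is 0.01
        stems[units // 10].append(units % 10)
    return [f"{stem} | {' '.join(map(str, leaves))}"
            for stem, leaves in sorted(stems.items())]

lines = stem_and_leaf([0.02, 0.02, 0.03, 0.11, 0.15])
for line in lines:
    print(line)
```

Reading the first row back with the decimal point one digit left of the bar gives 0.02, 0.02, 0.03, matching the card above.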

New cards

Numerical summaries for quantitative data

Central tendency and variation

New cards

Central tendency

The “middle” of the data

New cards

Variation

How “spread out” the data is

New cards

X bar

average/mean

New cards

Median

Middle point; half bigger half smaller

New cards

Mode

Most common value in the dataset

New cards

X bar = formula

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

$x_i$ is the ith observation in the data

New cards

Reason why a value can be the mode in a lead detection study

It could be the lowest level that can be detected (everything below the detection limit is recorded at that value)

New cards

Mean is sensitive to

Outliers

New cards

For right (positively) skewed data:

mean > median

New cards

For left (negatively) skewed data:

median > mean

New cards

Median is ___ to outliers

resistant

New cards

If the distribution is symmetric

Mean = median (approximately, in sample data)
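
The effect of an outlier on the mean versus the median can be checked with Python's standard `statistics` module. The numbers below are made up; the single large value plays the role of a right-skewing outlier.

```python
import statistics

data = [1, 2, 2, 3, 4]          # hypothetical measurements
with_outlier = data + [30]      # same data plus one large outlier

mean_before = statistics.mean(data)
median_before = statistics.median(data)
mode_before = statistics.mode(data)

mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before, mode_before)
print(mean_after, median_after)
```

The mean moves a long way toward the outlier while the median barely changes, which is the sense in which the median is resistant and mean > median for right-skewed data.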

New cards

Range

Smallest and largest, sometimes shown as difference between them

New cards

Interquartile range

25th and 75th percentiles, sometimes shown as difference between them

New cards

Computing the 25th percentile (for the IQR), n = 33

33 x 0.25 = 8.25, so choose the 9th value (in sorted order)

This is the value with 25% of the data below it
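
The position rule above can be sketched as follows. The notes only cover the case where n x p is not a whole number (round up and take that value); averaging the two neighbouring values when n x p is a whole number is an assumption added here, and other conventions exist. `percentile_value` is an illustrative name and expects 0 < p < 1.

```python
import math

def percentile_value(sorted_values, p):
    """Percentile by position: compute n * p, round up if it is not a whole number."""
    n = len(sorted_values)
    position = n * p
    if position != int(position):
        k = math.ceil(position)            # e.g. 8.25 -> 9th value
        return sorted_values[k - 1]        # lists are 0-indexed
    k = int(position)
    # Assumption (not from the notes): average the k-th and (k+1)-th values.
    return (sorted_values[k - 1] + sorted_values[k]) / 2

data = list(range(1, 34))                  # 33 hypothetical sorted values: 1, 2, ..., 33
q1 = percentile_value(data, 0.25)
q3 = percentile_value(data, 0.75)
print(q1, q3, q3 - q1)
```

With 33 values, the 25th percentile is the 9th value and the 75th percentile is the 25th value, so the IQR is the difference between them.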

New cards

Range and standard deviation

are sensitive to outliers

New cards

IQR is

less sensitive to outliers

New cards

B&W: top of box =

75th percentile

New cards

B&W: line in middle of box =

median

New cards

B&W: line in bottom of box =

25th percentile

New cards

B&W: whisker at top

largest value less than Q3 + 1.5 IQR

New cards

B&W: whisker at bottom

Smallest value greater than Q1 - 1.5IQR

New cards

B&W: dots

Outliers
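
Putting the box-and-whisker cards together: the sketch below takes Q1 and Q3 as given, computes the 1.5 x IQR fences, and reports where the whiskers end and which points are drawn as dots. The data are hypothetical, and values exactly on a fence are treated as inside, which is an assumption.

```python
def box_plot_parts(values, q1, q3):
    """Whisker ends and outliers using the Q1 - 1.5*IQR and Q3 + 1.5*IQR fences."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    return {
        "whisker_bottom": min(inside),   # smallest value still inside the fences
        "whisker_top": max(inside),      # largest value still inside the fences
        "outliers": outliers,            # drawn as individual dots
    }

values = [1, 10, 11, 12, 13, 14, 15, 16, 40]   # hypothetical data, n = 9
parts = box_plot_parts(values, q1=11, q3=15)   # Q1 = 3rd value, Q3 = 7th value
print(parts)
```

Note that the whiskers stop at actual data values, not at the fences themselves.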

New cards

Can use ___ box plots to show categories

Side by side

New cards

Cumulative incidence

Is the proportion (fraction) of individuals newly acquiring the disease (outcome) over a specified period of time

New cards

Cumulative incidence =

number of new cases / number at risk

New cards

Contingency table

Summarizes the information from two categorical variables (think treatment, cold-yes, cold-no, and total)
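
As an illustration, a contingency table can be tallied from one (group, outcome) pair per person, and the cumulative incidence in each group is then new cases divided by the number at risk. The records, group labels, and the function name `contingency_table` are made up for the example.

```python
from collections import Counter

def contingency_table(records):
    """records: iterable of (group, outcome) pairs -> {group: {outcome: count, 'total': n}}."""
    counts = Counter(records)
    table = {}
    for (group, outcome), k in counts.items():
        row = table.setdefault(group, {"total": 0})
        row[outcome] = row.get(outcome, 0) + k
        row["total"] += k
    return table

# Hypothetical trial: 10 treated and 10 placebo participants
records = ([("treated", "cold")] * 2 + [("treated", "no cold")] * 8
           + [("placebo", "cold")] * 4 + [("placebo", "no cold")] * 6)
table = contingency_table(records)

# Cumulative incidence = number of new cases / number at risk, within each group
risk_treated = table["treated"]["cold"] / table["treated"]["total"]
risk_placebo = table["placebo"]["cold"] / table["placebo"]["total"]
print(table)
print(risk_treated, risk_placebo)
```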

New cards

Risk factor

Variable that may increase or decrease the chance (risk) of outcome

New cards

Difference between treatment, risk factor, and exposure

Treatment is just the type of intervention, but it can be a risk factor if it increases/decreases risk; exposure is just whether they were exposed (which gives exposed / unexposed categories)

New cards

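Relative risk formula

Risk treated / risk untreated: $RR = \frac{a/(a+b)}{c/(c+d)}$ (same a, b, c, d as in the risk difference formula)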

New cards

Relative risk table

	Outcome yes	Outcome no	Total
Treated	a	b	a + b
Untreated	c	d	c + d

New cards

Relative risk is

Risk of treated over risk of untreated (each risk can be at most 1)

Summary measure of association between risk factor and outcome; 0 <= RR < infinity

New cards

If RR <1

Treatment is associated with lower risk of outcome

New cards

If RR > 1

Treatment is associated with higher risk of outcome

New cards

If RR = 1

No association of treatment with outcome

New cards

Depending on study design, RR may have a

Causal interpretation:

If RR < 1, treatment lowers risk; treatment is beneficial

If RR >1, treatment increases risk

If RR = 1, no effect on outcome

New cards

RR = 1.1

10% higher risk

New cards

RR = 2.5

150% higher risk

New cards

RR = 0.6

40% lower risk
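
The three examples above follow one pattern: the percent change in risk is |RR - 1| x 100, "higher" if RR > 1 and "lower" if RR < 1. A minimal sketch, assuming the a, b, c, d layout implied by the risk difference formula in these notes (a = treated with outcome, b = treated without, c = untreated with, d = untreated without); the function names and counts are illustrative.

```python
def relative_risk(a, b, c, d):
    """RR = risk in treated / risk in untreated = (a / (a + b)) / (c / (c + d))."""
    return (a / (a + b)) / (c / (c + d))

def describe_rr(rr):
    """Turn a relative risk into a 'percent higher / lower risk' phrase."""
    pct = round(abs(rr - 1) * 100, 1)
    if rr > 1:
        return f"{pct:g}% higher risk"
    if rr < 1:
        return f"{pct:g}% lower risk"
    return "no association"

for rr in (1.1, 2.5, 0.6, 1):
    print(rr, "->", describe_rr(rr))

# Hypothetical counts: 2 of 10 treated and 4 of 10 untreated have the outcome
rr_example = relative_risk(2, 8, 4, 6)
print(rr_example, "->", describe_rr(rr_example))
```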

New cards

Risk difference formula

Risk treated - risk untreated: $RD = \frac{a}{a+b} - \frac{c}{c+d}$
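
A short sketch of the risk difference using the same a, b, c, d counts as the formula above. The counts are hypothetical, and the two extreme cases are included only to show where the limits of -1 and 1 come from.

```python
def risk_difference(a, b, c, d):
    """RD = risk in treated - risk in untreated = a / (a + b) - c / (c + d)."""
    return a / (a + b) - c / (c + d)

# Hypothetical counts: 2 of 10 treated and 4 of 10 untreated have the outcome
rd_example = risk_difference(2, 8, 4, 6)

# Extreme cases: everyone treated has the outcome and no one untreated does, and the reverse
rd_max = risk_difference(10, 0, 0, 10)
rd_min = risk_difference(0, 10, 10, 0)
print(rd_example, rd_max, rd_min)
```

A negative value, as in the first example, means the treated group has the lower risk.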

New cards

Risk difference, RD

Summary measure of association between risk factor and outcome

New cards

__ <= RD <= __

-1, 1

(-100% to 100% when expressed as percentages)

New cards

If RD < 0

Treatment is associated with lower risk of outcome