Module 1 – Graphing Data (Biostatistics for Evidence Based Practice) Vocabulary

  • Bar Chart

    • Purpose: one of the graphing methods used to describe data (listed under Graphing); it compares categories of a categorical variable, drawing one bar per category.
  • Histogram

    • Purpose: shows the distribution of a quantitative variable by grouping data into bins.
    • Example from slides: bins labeled as 1-5, 6-10, 11-15, 16-20, 21-25 on the x-axis; y-axis represents frequency or count.
    • Visual cues: The slide displays a histogram with a vertical axis showing frequencies (e.g., values like 0, 10, 20, 30, 40, 50) and multiple bars corresponding to the bins.
    • Context note: The histogram example is tied to dataset visuals used in the module (the slide shows institution labels and axis values).
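The binning behind a histogram like this can be sketched in a few lines. This is a minimal illustration, assuming integer data and the 1-5, 6-10, ... bins described above; the function name `bin_counts` and the `scores` values are made up for the example and do not come from the slides.

```python
def bin_counts(values, start=1, width=5, n_bins=5):
    """Count integer values in consecutive bins such as 1-5, 6-10, 11-15."""
    counts = {}
    for b in range(n_bins):
        low = start + b * width
        counts[f"{low}-{low + width - 1}"] = 0
    for v in values:
        idx = (v - start) // width          # which bin the value belongs to
        if 0 <= idx < n_bins:               # values outside all bins are ignored
            low = start + idx * width
            counts[f"{low}-{low + width - 1}"] += 1
    return counts


# Hypothetical data for illustration only
scores = [2, 4, 7, 8, 9, 12, 13, 13, 17, 22, 25]

# Text rendering: bin label on the left, one '#' per observation
for label, count in bin_counts(scores).items():
    print(f"{label:>5} | {'#' * count} ({count})")
```

The bar heights in a histogram are exactly these counts; changing `width` changes how smooth or jagged the distribution looks.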
  • Stem-and-Leaf Plot

    • Purpose: a quick view of the data that preserves the original values while showing the distribution.
    • Structure shown: the slide presents a stem-and-leaf layout with multiple lines of leaves, demonstrating how each data value is split into a stem (leading digits) and a leaf (trailing digit).
    • Use: useful for small data sets to quickly assess shape, center, and spread while retaining actual data values.
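The stem/leaf split described above can be sketched as follows. This is a minimal example, assuming non-negative integers with the last digit as the leaf; the helper name `stem_and_leaf` and the data are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: [leaves]} for non-negative integers (leaf = last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)   # stem = leading digits, leaf = trailing digit
    return dict(plot)


# Hypothetical data for illustration only
data = [23, 25, 31, 31, 38, 42, 47, 47, 49, 56]
plot = stem_and_leaf(data)

# Print every stem in range, including empty ones, so gaps stay visible
for stem in range(min(plot), max(plot) + 1):
    leaves = "".join(str(d) for d in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Because every leaf is an actual digit from the data, the original values can be read back from the plot, which a histogram does not allow.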
  • Frequency Table

    • Purpose: summarizes data by exact values and their frequencies.
    • Columns shown:
      • Final Exam Score (value categories)
      • Frequency (how many observations fall into each score category)
      • Percent (frequency as a percentage of all observations, including any missing cases)
      • Valid Percent (frequency as a percentage of the valid, non-missing cases only)
      • Cumulative Percent (running total of Percent or Valid Percent across categories)
    • Example data outline: scores are listed in increments (e.g., 35, 40, 45, …, 100), with 100 observations in total.
    • Total row: shows overall totals, e.g., Frequency = 100, Percent = 100.0, Valid Percent = 100.0, Cumulative Percent = 100.0.
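The columns above can be computed with a short sketch. It assumes that missing values are coded as `None` and that Cumulative Percent accumulates Valid Percent; both are assumptions for this example, as are the function name `frequency_table` and the data.

```python
from collections import Counter


def frequency_table(values):
    """Rows of (value, frequency, percent, valid_percent, cumulative_percent).

    None marks a missing observation (an assumption of this sketch).
    """
    total = len(values)
    valid = [v for v in values if v is not None]
    if not valid:
        raise ValueError("no valid (non-missing) observations")
    counts = Counter(valid)
    rows, running = [], 0
    for value in sorted(counts):
        freq = counts[value]
        running += freq
        rows.append((
            value,
            freq,
            100 * freq / total,          # Percent: denominator includes missing
            100 * freq / len(valid),     # Valid Percent: missing excluded
            100 * running / len(valid),  # Cumulative Percent of valid cases
        ))
    return rows


# Hypothetical scores with two missing observations
scores = [35, 40, 40, 45, 45, 45, 50, None, None, 50]
for row in frequency_table(scores):
    print("{:>5} {:>4} {:>6.1f} {:>6.1f} {:>6.1f}".format(*row))
```

With no missing data, Percent and Valid Percent are identical, which matches a table where both columns total 100.0.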
  • Frequency Distribution

    • Purpose: describes how frequently data points occur in a dataset and how the data are distributed overall.
    • Shape descriptors presented:
      • Leptokurtic (thin): a sharply peaked distribution (high kurtosis).
      • Mesokurtic: a moderate peak, like that of the normal distribution.
      • Platykurtic (flat): a flatter, broader peak (low kurtosis).
    • Normal curve overlay: a normal distribution curve is drawn over the observed distribution so the two can be compared.
    • Skewness concepts:
      • Positive skew: the tail extends to the right; most values sit at the low end and high values are less frequent.
      • Negative skew: the tail extends to the left; most values sit at the high end and low values are less frequent.
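Skewness and kurtosis can also be put into numbers. The sketch below uses the simple moment-based definitions (third and fourth central moments divided by powers of the variance); statistical packages often apply small-sample corrections, so their output may differ slightly. The function name and data are hypothetical.

```python
def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # variance (population form)
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; shape is undefined")
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3              # 0 for a normal distribution
    return skew, excess_kurtosis


# Hypothetical data: a long right tail gives positive skew
right_tailed = [1, 1, 2, 2, 3, 10]
skew, kurt = skew_and_excess_kurtosis(right_tailed)
print(f"skew = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

Read the signs as follows: skew above 0 means a right tail and below 0 a left tail; excess kurtosis above 0 is leptokurtic, near 0 mesokurtic, and below 0 platykurtic.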
  • Graphing - Normal Curve, Skewness, and Kurtosis Labels

    • Normal Curve: the classic bell-shaped distribution used for comparison.
    • Positive Skew vs Negative Skew: descriptors for asymmetry of the distribution.
    • Kurtosis terms included: Leptokurtic, Mesokurtic, Platykurtic to describe the peak sharpness and tail heaviness.
  • Boxplot

    • Components shown on the slide:
      • Outlier: a data point that falls beyond the whiskers, commonly more than 1.5 × IQR below Q1 or above Q3.
      • Whiskers: lines extending from the quartiles to the smallest and largest values within 1.5 × IQR of the box (or a similar criterion).
      • 25th percentile (Q1): the lower quartile.
      • 75th percentile (Q3): the upper quartile.
      • Median: middle value of the data set.
      • Mean: average value (shown in some boxplots as a dot or other special symbol).
    • Boxplots by day of the week (as suggested by the day labels): the slide appears to show one boxplot per category (e.g., Friday, Monday, Saturday, Sunday, Thursday, Tuesday, Wednesday) to compare distributions across categories.
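The boxplot components listed above can be computed directly. This is a minimal sketch using the 1.5 × IQR rule and Python's `statistics.quantiles` with the "inclusive" method; quartile conventions differ between tools, so other software may report slightly different Q1 and Q3. The function name `boxplot_summary` and the data are hypothetical.

```python
import statistics


def boxplot_summary(values):
    """Quartiles, whisker ends, outliers (1.5 * IQR rule), and mean."""
    data = sorted(values)
    # "inclusive" is one of several quartile conventions (an assumption here)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),     # smallest value inside the fences
        "whisker_high": max(inside),    # largest value inside the fences
        "outliers": [x for x in data if x < low_fence or x > high_fence],
        "mean": statistics.mean(data),
    }


# Hypothetical data with one unusually large value
summary = boxplot_summary([2, 4, 4, 5, 6, 7, 8, 9, 30])
for name, value in summary.items():
    print(f"{name:>12}: {value}")
```

Running the same function once per category (for example, once per day of the week) gives the numbers behind a set of side-by-side boxplots.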
  • Central Tendency and Dispersion (key themes across graphs)

    • Central Tendency: measures that describe a data set by a single value representing the center.
    • Dispersion: measures that describe the spread or variability of the data.
    • How graphs support interpretation: choice of graph affects understanding of center and spread (e.g., mean vs median, presence of outliers, and tail behavior).
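A small numeric comparison shows why the choice between mean and median matters when an outlier is present. This is an illustrative sketch with made-up data; the helper name `describe` is hypothetical.

```python
import statistics


def describe(values):
    """Two measures of center and two measures of spread."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),      # sample standard deviation
        "range": max(values) - min(values),
    }


# Hypothetical data: the second list adds a single extreme value
typical = [4, 5, 5, 6, 7]
with_outlier = typical + [40]

for label, data in [("typical", typical), ("with outlier", with_outlier)]:
    stats = describe(data)
    print(label, {k: round(v, 2) for k, v in stats.items()})
```

The single extreme value moves the mean and standard deviation substantially, while the median barely changes, which is why the median is often preferred for skewed data.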
  • Normality, Skewness, and Kurtosis – Practical implications

    • Normal Curve reference helps assess whether data are approximately normally distributed.
    • Skewness affects the choice of statistical tests (e.g., parametric tests assume normality; nonparametric tests may be more appropriate for skewed data).
    • Kurtosis describes tail heaviness and peakedness; heavy tails make extreme values more likely, which can make normal-based standard errors and confidence intervals less reliable, especially in small samples.
  • Connections to foundational principles and real-world relevance

    • Graphing data is foundational for Evidence Based Practice (EBP): visualization guides interpretation and decision-making.
    • Understanding distribution shapes informs test selection and data transformation needs in research and clinical settings.
    • Recognizing outliers (via boxplots) prompts consideration of data quality, measurement error, or real but rare phenomena.
  • Ethical, philosophical, and practical implications

    • Accurate representation of data through graphs reduces misinterpretation and supports transparent reporting.
    • Acknowledge and handle missing data appropriately (e.g., distinguish Percent, which is based on all cases, from Valid Percent, which excludes missing cases) to avoid biased conclusions.
  • Quick reference formulas (LaTeX)

    • Mean: \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
    • Median: with the values sorted so that x_{(k)} is the k-th smallest, the median is the middle value if n is odd and the average of the two middle values if n is even: \text{Median} = \begin{cases} x_{\left(\frac{n+1}{2}\right)} & \text{if } n \text{ is odd} \\ \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2} & \text{if } n \text{ is even} \end{cases}
    • Quartiles and IQR: Q1 = \text{25th percentile},\; Q3 = \text{75th percentile},\; \text{IQR} = Q3 - Q1
    • Normal distribution (example): X \sim N(\mu,\sigma^2)
  • Summary takeaways

    • Use bar charts for categorical comparisons, histograms for distribution of a continuous variable, stem-and-leaf for quick data inspection and retention of data values, frequency tables for exact counts, frequency distributions to assess shape, and boxplots for a concise view of center, dispersion, and outliers.
    • Interpret normality, skewness, and kurtosis to inform analysis choices and data preparation in evidence-based practice.