Quantitative Data Analysis Notes

Introduction

  • After data collection, researchers organize and analyze data to clarify study results.
  • Data analysis considerations:
    • Type of design
    • Type of data collected
    • Hypothesis and/or research question
  • Statistics are extensively used in nursing and health research.

Quantitative Data

  • Descriptive Statistics: Summary statistics that organize data to give meaning and facilitate insight.
    • Example: describing the sample (age, education level, gender).
  • Inferential Statistics: Statistics that allow inference from a sample statistic to a population parameter.
    • Researchers estimate how reliably they can make predictions and generalize findings.

Statistics: Levels of Measurement

  • Four levels of data in statistics
  • Levels of measurement: Assignment of numbers to variables based on statistical rules.
    • Nominal
    • Ordinal
    • Interval
    • Ratio

Levels of Measurement

  • Nominal
    • Classified in mutually exclusive categories.
    • No ranking within categories.
    • Example:
      • Gender
      • Marital status
      • Religious affiliation
      • Ethnicity
  • Ordinal
    • Data are mutually exclusive and exhaustive and sorted on relative ranking of variables.
    • Example: Education level
      • High school graduate
      • College certificate
      • Bachelor's degree
      • Master’s degree

Levels of Measurement (Continued)

  • Interval
    • Mutually exclusive, exhaustive categories with ranking order, and equal distances between intervals.
    • No absolute zero point.
    • Example: Temperature
      • 20.0°C – 24.9°C
      • 25.0°C – 29.9°C
  • Ratio
    • Highest level of measurement.
    • Mutually exclusive, exhaustive categories with ranking order, equal spacing between intervals, and a continuum of values.
    • Examples: Weight, length, and volume.
    • Absolute zero exists (absence of weight).

Types of Analysis

  • Frequency Distribution: Number of times each event occurs is counted; data grouped into categories.
    • The frequency of each group reported.
    • Sample: Scores of 9 students released by Dr. D: 14, 14, 15, 15, 16, 17, 17, 17, 20
    • Frequency distribution by marks:
      • Score | Frequency
      • ----- | -----
      • 14 | 2
      • 15 | 2
      • 16 | 1
      • 17 | 3
      • 20 | 1
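The tally above can be reproduced with a short script. This is a minimal sketch using Python's standard library; the function name `frequency_distribution` is an illustrative choice, not something from the course material.

```python
from collections import Counter

# Scores of the 9 students from the example above.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

def frequency_distribution(values):
    """Return (value, count) pairs sorted by value."""
    counts = Counter(values)          # counts how often each value occurs
    return sorted(counts.items())     # sort by score for a readable table

for score, frequency in frequency_distribution(scores):
    print(f"{score} | {frequency}")
```

Each printed row pairs a score with the number of students who earned it.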

Measures of Central Tendency

  • Mean: Average calculated by summing values and dividing by the # of values.
    • Example: (14 + 14 + 15 + 15 + 16 + 17 + 17 + 17 + 20) / 9 = 145 / 9 ≈ 16.11
  • Median: Midpoint in a set of values (50% of distribution falls below, 50% above).
    • Example: 16
  • Mode: Most frequently occurring score in the distribution.
    • Example: 17
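As a quick check of the three examples, the same values can be computed with Python's built-in `statistics` module. This is only a sketch for verifying the arithmetic on the nine scores used in these notes.

```python
import statistics

# Scores of the 9 students from the frequency distribution example.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

mean = statistics.mean(scores)      # sum of values divided by the number of values
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequently occurring score

print(round(mean, 2), median, mode)
```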

Normal Distribution

  • A theoretical concept where interval or ratio data group themselves about a midpoint, closely approximating the normal curve.
  • Mean, median, and mode are equal.

Distribution Curves

  • Positive Skew (Right-Sided Skew): Mean is usually greater than the median.
  • Negative Skew (Left-Sided Skew): Mean is usually less than the median.
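A rough way to see the direction of skew is to compare the mean with the median. The sketch below uses two small HYPOTHETICAL data sets invented for illustration; the helper `skew_direction` is an assumed name, and the comparison is a rule of thumb that usually (not always) holds.

```python
import statistics

def skew_direction(values):
    """Rule of thumb: compare mean and median to guess the skew direction."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"   # tail pulls the mean to the right
    if mean < median:
        return "negative"   # tail pulls the mean to the left
    return "none"           # mean equals median, as in a symmetric distribution

# HYPOTHETICAL values, not from any study.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]       # one large value stretches the right tail
left_tailed = [1, 17, 18, 18, 18, 19, 19, 20]  # one small value stretches the left tail

print(skew_direction(right_tailed), skew_direction(left_tailed))
```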

Types of Data Analysis

  • Range: Difference between the highest and lowest scores.
    • Example: 20 - 14 = 6
    • Reported with other measures of variability.
    • Simplest but most unstable measure of variability.
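The sketch below computes the range for the nine example scores and then shows why it is called unstable: adding one extreme value changes it sharply. The extra score of 2 is HYPOTHETICAL, and `score_range` is an illustrative name.

```python
# Scores of the 9 students from the earlier example.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

def score_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)

print(score_range(scores))        # highest (20) minus lowest (14)

# Adding a single HYPOTHETICAL extreme score of 2 changes the range on its own.
print(score_range(scores + [2]))  # highest (20) minus lowest (2)
```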

Types of Data Analysis (Continued)

  • Percentile: Percentage of cases a given score exceeds.
    • Median is the 50th percentile.
    • A score in the 90th percentile is exceeded by only 10% of scores.
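Using the definition above (the percentage of cases a given score exceeds), a percentile rank can be sketched as follows. The function name `percentile_rank` is an assumption made for illustration, and other definitions of percentile exist that treat ties differently.

```python
# Scores of the 9 students from the earlier example.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

def percentile_rank(values, score):
    """Percentage of values that the given score exceeds."""
    below = sum(1 for v in values if v < score)  # cases strictly lower than the score
    return 100 * below / len(values)

# A score of 20 exceeds 8 of the 9 scores; a score of 14 exceeds none.
print(round(percentile_rank(scores, 20), 1))
print(percentile_rank(scores, 14))
```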

Types of Data Analysis (Continued)

  • Standard Deviation: Average variability in a set of scores or the scores’ average deviation from the mean.
    • Looks at how the data is spread across the data set.
    • https://www.youtube.com/watch?v=MRqtXL2WX2M (3.5 min)
    • In a normally distributed data set, about 68% of the data fall within one (1) standard deviation of the mean.

Standard Deviation

  • A standard deviation (σ) measures data dispersion relative to the mean, calculated via a statistical formula.
    • It represents the average distance from the mean.
    • Low standard deviation: data clustered around the mean.
    • High standard deviation: data more spread out.
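To make the calculation concrete, the sketch below works through the standard deviation of the nine example scores step by step and compares it with Python's `statistics` module. It assumes the scores are treated as a whole population (dividing by n); the sample version (dividing by n - 1) is shown for comparison.

```python
import math
import statistics

# Scores of the 9 students from the earlier example.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

mean = sum(scores) / len(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]  # squared distance of each score from the mean

# Population standard deviation: divide the sum of squares by n.
population_sd = math.sqrt(sum(squared_deviations) / len(scores))

# Sample standard deviation: divide by n - 1 instead.
sample_sd = math.sqrt(sum(squared_deviations) / (len(scores) - 1))

print(round(population_sd, 2), round(sample_sd, 2))
```

The hand calculation agrees with `statistics.pstdev(scores)` and `statistics.stdev(scores)` respectively.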

Levels of Measurement and Analysis

  • Nominal Data
    • Lowest level; each observation is placed in only one category (e.g., marital status).
    • Analysis:
      • Mode
      • Frequency distribution
  • Ordinal Data
    • Places values into categories with an order (e.g., educational level).
    • Analysis:
      • Mode and median
      • Rank-order correlation coefficients
      • Range
      • Percentile

Levels of Measurement and Analysis (Continued)

  • Interval Data
    • Categories with equal distances (e.g., points on a scale).
    • Analysis:
      • Mean, median, and mode
      • Range
      • Percentile
      • Standard deviation
  • Ratio Data
    • Highest level; allows for a true zero (e.g., weight, height, volume).
    • Analysis:
      • Mean, median, and mode
      • Range
      • Percentile
      • Standard deviation

Decision Tree for Statistical Analysis

  • Is the study quantitative?
    • If yes, proceed with quantitative analysis.
    • If no (qualitative), see chapters 7 and 8.
  • Flowchart for selecting appropriate descriptive statistics based on the level of measurement (Nominal, Ordinal, Interval, Ratio).
    • Nominal: Frequency distribution, Mode
    • Ordinal: Range, Percentile, Mode, Median
    • Interval: Mean, Mode, Median, Range, Percentile, Standard deviation
    • Ratio: Mean, Mode, Median, Range, Percentile, Standard deviation
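The flowchart branches above can be written as a simple lookup. This is a sketch that only transcribes the four branches listed in these notes; the names `DESCRIPTIVE_STATS` and `descriptive_stats_for` are illustrative assumptions.

```python
# Descriptive statistics per level of measurement, copied from the list above.
DESCRIPTIVE_STATS = {
    "nominal": ["frequency distribution", "mode"],
    "ordinal": ["range", "percentile", "mode", "median"],
    "interval": ["mean", "mode", "median", "range", "percentile", "standard deviation"],
    "ratio": ["mean", "mode", "median", "range", "percentile", "standard deviation"],
}

def descriptive_stats_for(level):
    """Return the descriptive statistics listed for a level of measurement."""
    key = level.strip().lower()
    if key not in DESCRIPTIVE_STATS:
        raise ValueError(f"Unknown level of measurement: {level!r}")
    return DESCRIPTIVE_STATS[key]

print(descriptive_stats_for("Ordinal"))
```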

Inferential Statistics

  • Combines mathematical processes with logic to test hypotheses about populations using data from probability samples.
    • Purpose:
      • Estimate the probability that sample statistics accurately reflect the population parameter.
      • Test a hypothesis about a population.

Inferential Statistics: Parameters and Statistics

  • Parameter: A characteristic of a population.
    • A well-defined set with specific properties.
  • Statistic: A characteristic of a sample used to estimate population parameters.
    • Example: Survey of 100 heart failure patients shows an average knowledge score of 72%.
      • This represents the sample’s average knowledge level.
      • Researchers use this to identify knowledge deficits and improve teaching plans.

Inferential Statistics: Parametric vs. Non-Parametric Tests

  • Parametric Tests: Statistical procedures used when three assumptions are present.
    • Sample from the population has a normal distribution.
    • Level of measurement is interval or ratio with a normal distribution.
    • Sample obtained through random sampling.
  • Non-Parametric Tests: Statistical procedures used when:
    • Sample from the population does not have a normal distribution.
    • Level of measurement is nominal or ordinal.
    • Sample obtained through non-random sampling.

Inferential Statistics: Hypothesis Testing

  • Hypothesis (H1): Formal statement of the expected relationship between variables in a specified population.
  • Null Hypothesis (H0): States no relationship between variables; used for testing and interpreting statistical outcomes.
    • Example: No significant differences in IV catheter patency between flushes with 2ml normal saline vs. 2ml heparinized saline.

Hypothesis Testing: Scientific vs. Null Hypotheses

  • For a quantitative study, the researcher(s) will develop two hypotheses:
  • Scientific Hypothesis (H1): IV catheters flushed with 2ml of heparinized saline will have greater patency than those flushed with 2ml of normal saline.
    • Directional hypothesis
  • Null Hypothesis (H0): There will be no significant differences in the duration of IV patency between those flushed with 2ml normal saline and those flushed with 2ml of heparinized saline.
    • Indicates no differences will occur between the two variables or groups being studied.

Hypothesis Testing: Statistical Procedures

  • The null hypothesis is tested using statistical procedures.
  • If no difference is found between the control and intervention groups (or variables), the null hypothesis is retained, and any variation in the findings is attributed to chance.
  • If there is a difference between the groups, the null hypothesis is rejected.
  • A second analysis determines if the difference is significant enough to declare the scientific hypothesis correct.
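One way to see the logic of testing a null hypothesis is an exact permutation test, sketched below. This is a swapped-in technique chosen because it needs only the standard library; it is not the procedure named in these notes. The patency durations are HYPOTHETICAL numbers invented for illustration, and `permutation_p_value` is an assumed name.

```python
from itertools import combinations
from statistics import mean

# HYPOTHETICAL patency durations in hours (invented, not study data).
saline = [40, 42, 45, 47, 50]
heparin = [55, 58, 60, 62, 65]

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Under H0 the group labels are interchangeable, so try every relabelling.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if difference >= observed - 1e-9:  # at least as extreme as what was observed
            extreme += 1
    return extreme / count

alpha = 0.05
p = permutation_p_value(saline, heparin)
decision = "reject H0" if p < alpha else "fail to reject H0"
print(p, decision)
```

The p-value is the share of relabellings that produce a difference at least as large as the observed one; a small share suggests the difference is unlikely to be due to chance alone.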

Hypothesis Testing: Rejecting the Null Hypothesis

  • If the null hypothesis (H0) is rejected, a relationship exists between the variables.
    • Example: IV catheters flushed with 2ml of heparinized saline had increased patency compared to those flushed with 2ml of normal saline.
    • Statistical procedure determines if a relationship exists.
  • This testing is subject to two types of errors:
    • Type I
    • Type II

Type I and Type II Errors

  • Type I Error: Rejection of the null hypothesis when it is true.
    • More serious; the researcher states relationships exist when they do not.
    • Consumers consider instrument reliability and validity.
  • Type II Error: Accepting the null hypothesis when it is false.
    • Can occur if the sample is too small.

Significance Level (Alpha Level)

  • Before statistical analysis, the level of significance or alpha level is determined.
    • The probability of making a Type I error.
    • The minimum accepted level of significance in nursing research is 0.05.
    • That is, if the null hypothesis were true and the study were repeated 100 times, the decision to reject the null hypothesis would be wrong about 5 times out of 100.
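The meaning of alpha can be illustrated with a small simulation. The sketch below assumes two groups drawn from the same normal population (so the null hypothesis is true by construction) with a known standard deviation of 1, and uses a two-sided z test with the critical value 1.96. The function name, sample size, and number of trials are illustrative assumptions.

```python
import random

def type_i_error_rate(trials=2000, n=30, z_critical=1.96, seed=1):
    """Share of simulated studies that wrongly reject a true null hypothesis."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    standard_error = (2 / n) ** 0.5    # SE of the difference in means when sd = 1 in both groups
    rejections = 0
    for _ in range(trials):
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]  # same population as group_a
        z = (sum(group_a) / n - sum(group_b) / n) / standard_error
        if abs(z) > z_critical:        # "significant" at alpha = 0.05, two-sided
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(rate)
```

Because both groups come from the same population, every rejection is a Type I error, and the rejection rate should land close to 0.05.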

Adjusting the Alpha Level

  • Researchers can set probability at 0.01 for a smaller risk of incorrectly rejecting a true null hypothesis (the decision to reject the null hypothesis would be wrong 1 time out of 100 trials).
  • Researchers will select an alpha level depending on how important it is not to make an error.

Practical vs. Statistical Significance

  • Practical and statistical significance are not the same.
  • Statistical significance: it is unlikely that the findings occurred by chance.
    • If the level of significance is set at 0.05, a true null hypothesis will be wrongly rejected only about 5% of the time; in the other 95% of cases the statistical tests lead the researcher to correctly retain it.
  • Magnitude of significance is vital to the outcome of data analysis.
  • Practical significance: examines the practical value that the study contributes.
    • Example: if heparinized saline maintains IV catheter patency longer than normal saline, the finding has value to practice: IV access is maintained longer, with fewer IV sticks and more IV treatments delivered.

Types of Inferential Statistical Tests

  • Researchers use different parametric and non-parametric tests to determine:
    • Differences between means (average):
      • Examples: t-test and ANOVA
    • Presence of a relationship:
      • Examples: Pearson r, Wilcoxon matched pairs test, the signed rank test and multiple regression

Testing for Differences: Algorithm

  • Is the research question asking for a difference?
    • If yes, proceed to determine the number of groups.
    • If no, the research question is asking for a relationship (refer to the other algorithm).
  • One group or more than one group?
    • Two groups:
      • Interval measure? t test
      • Nominal or ordinal measure? Chi-square
    • One group:
      • Interval measure? Correlated t test, ANOVA
      • Nominal or ordinal measure? Sign test, Kolmogorov-Smirnov, Signed rank, Mann-Whitney U

Testing for a Relationship: Algorithm

  • Is the research question asking for a relationship?
    • If yes, determine the number of variables.
    • If no, the research question is asking for a difference (refer to the other algorithm).
  • Two variables or more than two variables?
    • Two variables:
      • Interval measure? Pearson product moment correlation, Point-biserial
      • Nominal or ordinal measure? Phi coefficient, Kendall's tau, Spearman's rho, Contingency coefficient
    • More than two variables:
      • Interval measure? Multiple regression, Path analysis, Canonical correlation, Discriminant function analysis
      • Nominal or ordinal measure? Logistic regression
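As an example of the first test listed for two interval-level variables, the sketch below computes a Pearson product moment correlation by hand. The paired values (hours of teaching and knowledge scores) are HYPOTHETICAL and invented for illustration; `pearson_r` is an assumed name.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and sums of squared deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# HYPOTHETICAL data: hours of patient teaching and knowledge scores (%).
teaching_hours = [1, 2, 3, 4, 5]
knowledge_scores = [50, 55, 65, 70, 72]

r = pearson_r(teaching_hours, knowledge_scores)
print(round(r, 2))
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.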

Conclusion: Evaluating Data Analysis

  • When examining the data analysis, ask yourself:
    • Is the data analysis (testing) appropriate for the:
      • Research question or hypothesis?
      • Design of the study?
      • Methods used in the study?
      • Type of data collected?
    • Clues to the appropriate test must come from the research question or hypothesis.
    • Look at the findings to determine if they are appropriate and applicable to the patient population and practice setting.

Review for Descriptive Statistics

  • Were appropriate descriptive statistics used?
  • What level of measurement is used for each major variable?
  • Is the sample size large enough to prevent one extreme score from affecting the summary statistics used?
  • What descriptive statistics are reported?
  • Were these descriptive statistics appropriate to the level of measurement for each variable?
  • Are appropriate summary statistics provided for each major variable?

Review for Inferential Statistics

  • Does the level of measurement enable the use of parametric statistics?
  • Is the sample size large enough to use parametric statistics?
  • Are the results for each of the hypotheses presented clearly and appropriately?
  • Are the results clear?
  • Is a distinction made between practical significance and statistical significance?

Summary

  • Review key points and critical thinking questions at the end of the chapter.
  • Questions/concerns
  • Email: binchj@algonquincollege.com