Quantitative Data Analysis Notes
Quantitative Data Analysis
Introduction
- After data collection, researchers organize and analyze data to clarify study results.
- Data analysis considerations:
- Type of design
- Type of data collected
- Hypothesis and/or research question
- Statistics are extensively used in nursing and health research.
Quantitative Data
- Descriptive Statistics: Summary statistics that organize data to give meaning and facilitate insight.
- Example: describing the sample (age, education level, gender).
- Inferential Statistics: Statistics that allow inference from a sample statistic to a population parameter.
- Researchers estimate how reliably they can make predictions and generalize findings.
Statistics: Levels of Measurement
- Four levels of data in statistics
- Levels of measurement: Assignment of numbers to variables based on statistical rules.
- Nominal
- Ordinal
- Interval
- Ratio
Levels of Measurement
- Nominal
- Classified in mutually exclusive categories.
- No ranking within categories.
- Example:
- Gender
- Marital status
- Religious affiliation
- Ethnicity
- Ordinal
- Data are mutually exclusive and exhaustive and sorted on relative ranking of variables.
- Example: Education level
- High school graduate
- College certificate
- Bachelor's degree
- Master’s degree
Levels of Measurement (Continued)
- Interval
- Mutually exclusive, exhaustive categories with ranking order, and equal distances between intervals.
- No absolute zero point.
- Example: Temperature
- 20.0°C – 24.9°C
- 25.0°C – 29.9°C
- Ratio
- Highest level of measurement.
- Mutually exclusive, exhaustive categories with ranking order, equal spacing between intervals, and a continuum of values.
- Examples: Weight, length, and volume.
- Absolute zero exists (absence of weight).
Types of Analysis
- Frequency Distribution: Number of times each event occurs is counted; data grouped into categories.
- The frequency of each group reported.
- Sample: Scores of 9 students released by Dr. D: 14, 14, 15, 15, 16, 17, 17, 17, 20
- Frequency distribution by marks:
- Score | Frequency
- ----- | -----
- 14 | 2
- 15 | 2
- 16 | 1
- 17 | 3
- 20 | 1
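The tally above can be reproduced with a short Python sketch. It uses the nine scores from the notes; the variable names are illustrative only.

```python
from collections import Counter

# The nine student scores listed in the notes.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

# Count how many times each score occurs.
frequency = Counter(scores)

# Print one row per distinct score, lowest score first.
print("Score | Frequency")
for score in sorted(frequency):
    print(f"{score} | {frequency[score]}")
```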
Measures of Central Tendency
- Mean: Average calculated by summing values and dividing by the # of values.
- Example: (14+14+15+15+16+17+17+17+20) / 9 = 145 / 9 ≈ 16.11
- Median: Midpoint in a set of values (50% of distribution falls below, 50% above).
- Mode: Most frequently occurring score in the distribution.
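As a rough check on the three measures, the sketch below applies Python's standard `statistics` module to the same nine scores used earlier in the notes.

```python
import statistics

# The nine student scores from the frequency distribution example.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

mean_score = statistics.mean(scores)      # sum of values divided by the number of values
median_score = statistics.median(scores)  # middle value once the scores are sorted
mode_score = statistics.mode(scores)      # most frequently occurring score

print(f"Mean: {mean_score:.2f}")
print(f"Median: {median_score}")
print(f"Mode: {mode_score}")
```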
Normal Distribution
- A theoretical concept where interval or ratio data group themselves about a midpoint, closely approximating the normal curve.
- Mean, median, and mode are equal.
Distribution Curves
- Positive Skew (Right-Sided Skew): Mean is usually greater than the median.
- Negative Skew (Left-Sided Skew): Mean is usually less than the median.
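A small illustration of how skew pulls the mean away from the median. Both data sets below are made up for this sketch and are not taken from the notes.

```python
import statistics

# Hypothetical data: one very high value creates a positive (right) skew.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Hypothetical data: one very low value creates a negative (left) skew.
left_skewed = [1, 17, 18, 18, 19, 19, 19, 20]

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

# The extreme value drags the mean toward the tail; the median barely moves.
print(f"Right skew: mean {right_mean} vs. median {right_median}")
print(f"Left skew: mean {left_mean} vs. median {left_median}")
```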
Types of Data Analysis
- Range: Difference between the highest and lowest scores.
- Example: (20−14=6)
- Reported with other measures of variability.
- Simplest but most unstable measure of variability.
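The sketch below computes the range for the nine scores and then shows why the range is called unstable: a single extreme score changes it sharply. The extra score of 35 is a hypothetical value added only for illustration.

```python
# The nine student scores from the notes.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

# Range = highest score minus lowest score.
score_range = max(scores) - min(scores)

# Hypothetical: add one extreme score to see how much the range moves.
scores_with_outlier = scores + [35]
range_with_outlier = max(scores_with_outlier) - min(scores_with_outlier)

print(f"Range: {score_range}")
print(f"Range with one extreme score added: {range_with_outlier}")
```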
Types of Data Analysis (Continued)
- Percentile: Percentage of cases a given score exceeds.
- Median is the 50th percentile.
- A score in the 90th percentile is exceeded by only 10% of scores.
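One simple way to express the definition above in Python is to count the share of cases that fall below a given score. The helper name `percentile_rank` is an assumption for this sketch; statistical packages use several slightly different percentile conventions.

```python
def percentile_rank(score, all_scores):
    """Percentage of cases in all_scores that the given score exceeds."""
    below = sum(1 for value in all_scores if value < score)
    return 100 * below / len(all_scores)

# The nine student scores from the notes.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

# The top score exceeds 8 of the 9 cases; the lowest score exceeds none.
top_rank = percentile_rank(20, scores)
bottom_rank = percentile_rank(14, scores)

print(f"Score 20 exceeds {top_rank:.1f}% of cases")
print(f"Score 14 exceeds {bottom_rank:.1f}% of cases")
```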
Types of Data Analysis (Continued)
- Standard Deviation: Average variability in a set of scores or the scores’ average deviation from the mean.
- Looks at how the data is spread across the data set.
- https://www.youtube.com/watch?v=MRqtXL2WX2M (3.5 min)
- In a normal distribution, about 68% of the data fall within one (1) standard deviation of the mean.
Standard Deviation
- A standard deviation (σ) measures data dispersion relative to the mean, calculated via a statistical formula.
- It represents the average distance from the mean.
- Low standard deviation: data clustered around the mean.
- High standard deviation: data more spread out.
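A minimal sketch of the calculation, using the nine scores from earlier in the notes. It shows both the population formula (divide by n) and the sample formula (divide by n − 1), since the notes do not say which is intended, and it checks the 68% figure against the standard normal curve.

```python
import statistics
from statistics import NormalDist

# The nine student scores from the notes.
scores = [14, 14, 15, 15, 16, 17, 17, 17, 20]

# Population standard deviation: squared deviations from the mean divided by n.
population_sd = statistics.pstdev(scores)
# Sample standard deviation: squared deviations from the mean divided by n - 1.
sample_sd = statistics.stdev(scores)

# Share of a normal distribution lying within one standard deviation of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(f"Population SD: {population_sd:.2f}")
print(f"Sample SD: {sample_sd:.2f}")
print(f"Within one SD of the mean: {within_one_sd:.1%}")
```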
Levels of Measurement and Analysis
- Nominal Data
- Lowest level; each observation is placed in only one category (e.g., marital status).
- Analysis:
- Mode
- Frequency distribution
- Ordinal Data
- Places values into categories with an order (e.g., educational level).
- Analysis:
- Mode and median
- Rank-order coefficients of correlation
- Range
- Percentile
Levels of Measurement and Analysis (Continued)
- Interval Data
- Categories with equal distances (e.g., points on a scale).
- Analysis:
- Mean, median, and mode
- Range
- Percentile
- Standard deviation
- Ratio Data
- Highest level; allows for a true zero (e.g., weight, height, volume).
- Analysis:
- Mean, median, and mode
- Range
- Percentile
- Standard deviation
Decision Tree for Statistical Analysis
- Is the study quantitative?
- If yes, proceed with quantitative analysis.
- If no (qualitative), see chapters 7 and 8.
- Flowchart for selecting appropriate descriptive statistics based on the level of measurement (Nominal, Ordinal, Interval, Ratio).
- Nominal: Frequency distribution, Mode
- Ordinal: Range, Percentile, Mode, Median
- Interval: Mean, Mode, Median, Range, Percentile, Standard deviation
- Ratio: Mean, Mode, Median, Range, Percentile, Standard deviation
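The flowchart can be read as a simple lookup from level of measurement to the descriptive statistics listed for it. The sketch below encodes exactly the four rows above; the names `DESCRIPTIVE_STATISTICS` and `descriptive_statistics_for` are made up for illustration.

```python
# Descriptive statistics listed in the notes for each level of measurement.
DESCRIPTIVE_STATISTICS = {
    "nominal": ["Frequency distribution", "Mode"],
    "ordinal": ["Range", "Percentile", "Mode", "Median"],
    "interval": ["Mean", "Mode", "Median", "Range", "Percentile", "Standard deviation"],
    "ratio": ["Mean", "Mode", "Median", "Range", "Percentile", "Standard deviation"],
}

def descriptive_statistics_for(level):
    """Return the statistics the notes list for a level of measurement."""
    key = level.strip().lower()
    if key not in DESCRIPTIVE_STATISTICS:
        raise ValueError(f"Unknown level of measurement: {level!r}")
    return DESCRIPTIVE_STATISTICS[key]

print(descriptive_statistics_for("Ordinal"))
```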
Inferential Statistics
- Combines mathematical processes with logic to test hypotheses about populations using data from probability samples.
- Purpose:
- Estimate the probability that sample statistics accurately reflect the population parameter.
- Test a hypothesis about a population.
Inferential Statistics: Parameters and Statistics
- Parameter: A characteristic of a population.
- A well-defined set with specific properties.
- Statistic: A characteristic of a sample used to estimate population parameters.
- Example: Survey of 100 heart failure patients shows an average knowledge score of 72%.
- This represents the sample’s average knowledge level.
- Researchers use this to identify knowledge deficits and improve teaching plans.
Inferential Statistics: Parametric vs. Non-Parametric Tests
- Parametric Tests: Statistical procedures used when three assumptions are present.
- Sample from the population has a normal distribution.
- Level of measurement is interval or ratio with a normal distribution.
- Sample obtained through random sampling.
- Non-Parametric Tests: Statistical procedures used when:
- Sample from the population does not have a normal distribution.
- Level of measurement is nominal or ordinal.
- Sample obtained through non-random sampling.
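Read as a rule of thumb, the two lists say that a parametric test applies only when all three assumptions hold. The sketch below is one way to express that rule; the function name and argument names are hypothetical, and real test selection involves more judgment than a three-way check.

```python
def choose_test_family(normally_distributed, level_of_measurement, random_sampling):
    """Return 'parametric' only if all three assumptions from the notes are met."""
    interval_or_ratio = level_of_measurement.strip().lower() in {"interval", "ratio"}
    if normally_distributed and interval_or_ratio and random_sampling:
        return "parametric"
    # Any unmet assumption points toward a non-parametric procedure.
    return "non-parametric"

# All three assumptions met.
print(choose_test_family(True, "ratio", True))
# Ordinal data fail the level-of-measurement assumption.
print(choose_test_family(True, "ordinal", True))
```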
Inferential Statistics: Hypothesis Testing
- Hypothesis (H1): Formal statement of the expected relationship between variables in a specified population.
- Null Hypothesis (H0): States no relationship between variables; used for testing and interpreting statistical outcomes.
- Example: No significant differences in IV catheter patency between flushes with 2ml normal saline vs. 2ml heparinized saline.
Hypothesis Testing: Scientific vs. Null Hypotheses
- For a quantitative study, the researcher(s) will develop two hypotheses:
- Scientific Hypothesis (H1): IV catheters flushed with 2ml of heparinized saline will have greater patency than those flushed with 2ml of normal saline.
- Null Hypothesis (H0): There will be no significant differences in the duration of IV patency between those flushed with 2ml normal saline and those flushed with 2ml of heparinized saline.
- Indicates no differences will occur between the two variables or groups being studied.
Hypothesis Testing: Statistical Procedures
- The null hypothesis is tested using statistical procedures.
- If no difference occurs between the control and intervention groups (or variables), the null hypothesis is retained, and the findings are attributed to chance.
- If there is a difference between the groups then the null hypothesis is rejected.
- A second analysis determines if the difference is significant enough to declare the scientific hypothesis correct.
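To make the procedure concrete, the sketch below tests a null hypothesis of no difference between two groups. Two things are assumptions here: the patency durations are invented for illustration and do not come from any study, and the test is an exact two-sided permutation test, chosen because it needs only the Python standard library, rather than the t test the notes mention later.

```python
from itertools import combinations
from statistics import mean

# Hypothetical patency durations in hours (illustrative data only).
heparin_hours = [72, 80, 76, 90, 85, 78, 88, 82]
saline_hours = [60, 65, 70, 58, 66, 72, 62, 68]

observed_difference = mean(heparin_hours) - mean(saline_hours)

pooled = heparin_hours + saline_hours
group_size = len(heparin_hours)
other_size = len(pooled) - group_size
pooled_total = sum(pooled)

# Under H0 the group labels are interchangeable, so examine every possible
# way of splitting the pooled values into two groups of the original sizes.
extreme_splits = 0
total_splits = 0
for indices in combinations(range(len(pooled)), group_size):
    group_sum = sum(pooled[i] for i in indices)
    difference = group_sum / group_size - (pooled_total - group_sum) / other_size
    # Count splits at least as extreme as the observed one (two-sided).
    if abs(difference) >= abs(observed_difference) - 1e-12:
        extreme_splits += 1
    total_splits += 1

p_value = extreme_splits / total_splits
alpha = 0.05
reject_null = p_value < alpha

print(f"Observed difference in means: {observed_difference} hours")
print(f"p-value: {p_value:.5f}")
print("Reject H0" if reject_null else "Retain H0")
```

The p-value is the proportion of relabelings that produce a difference at least as large as the one observed; when it falls below the chosen alpha level, the null hypothesis is rejected.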
Hypothesis Testing: Rejecting the Null Hypothesis
- If the null hypothesis (H0) is rejected, the researcher concludes that a relationship exists between the variables.
- Example: IV catheters flushed with 2ml of heparinized saline had increased patency compared to those flushed with 2ml of normal saline.
- Statistical procedure determines if a relationship exists.
- This testing is subject to two types of errors:
Type I and Type II Errors
- Type I Error: Rejection of the null hypothesis when it is true.
- More serious; the researcher states relationships exist when they do not.
- Consumers consider instrument reliability and validity.
- Type II Error: Accepting the null hypothesis when it is false.
- Can occur if the sample is too small.
Significance Level (Alpha Level)
- Before statistical analysis, the level of significance or alpha level is determined.
- The probability of making a Type I error.
- The minimum accepted level of significance in nursing research is 0.05.
- This means that if the study were done 100 times and the null hypothesis were true, the decision to reject the null hypothesis would be wrong about 5 times out of 100.
Adjusting the Alpha Level
- Researchers can set probability at 0.01 for a smaller risk of incorrectly rejecting a true null hypothesis (the decision to reject the null hypothesis would be wrong 1 time out of 100 trials).
- Researchers will select an alpha level depending on how important it is not to make an error.
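The "wrong 5 times out of 100" reading of alpha can be checked by simulation. In the sketch below, both groups are drawn from the same population, so the null hypothesis is true by construction and every rejection is a Type I error. The two-sample z test with a known standard deviation, the group size, the number of trials, and the seed are all assumptions made to keep the example short.

```python
import random
from statistics import NormalDist

def two_sided_z_test_p(group_a, group_b, sigma):
    """p-value of a two-sample z test, assuming a known common sigma."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    standard_error = sigma * (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (mean_a - mean_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

def type_i_error_rate(alpha, trials=2000, group_size=30, seed=1):
    """Share of simulated studies that wrongly reject a true null hypothesis."""
    rng = random.Random(seed)
    false_rejections = 0
    for _ in range(trials):
        # Both groups come from the same population, so H0 is true.
        group_a = [rng.gauss(0, 1) for _ in range(group_size)]
        group_b = [rng.gauss(0, 1) for _ in range(group_size)]
        if two_sided_z_test_p(group_a, group_b, sigma=1.0) < alpha:
            false_rejections += 1
    return false_rejections / trials

rate_at_05 = type_i_error_rate(alpha=0.05)
rate_at_01 = type_i_error_rate(alpha=0.01)

print(f"Type I error rate at alpha = 0.05: {rate_at_05:.3f}")
print(f"Type I error rate at alpha = 0.01: {rate_at_01:.3f}")
```

The simulated rates should land near the chosen alpha levels, and the stricter 0.01 level never rejects more often than 0.05 on the same simulated studies.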
Practical vs. Statistical Significance
- Practical and statistical significance are not the same.
- A statistically significant finding is one that is unlikely to have occurred by chance.
- If the level of significance is set at 0.05, then when the null hypothesis is true there is a 95% chance that the researcher will correctly retain it based on the statistical tests performed on the data.
- Magnitude of significance is vital to the outcome of data analysis.
- Practical significance – examines the practical value that the study contributes.
- If heparinized saline maintains IV catheter patency longer than normal saline, the finding has value to practice: IV access is maintained longer, with fewer IV sticks and increased IV treatments.
Types of Inferential Statistical Tests
- Researchers use different parametric and non-parametric tests to determine:
- Differences between means (average):
- Examples: t-test, ANOVA, and the Wilcoxon matched-pairs signed-rank test (a non-parametric test of difference)
- Presence of a relationship:
- Examples: Pearson r and multiple regression
Testing for Differences: Algorithm
- Is the research question asking for a difference?
- If yes, proceed to determine the number of groups.
- If no, the research question is asking for a relationship (refer to the other algorithm).
- One group or more than one group?
- Two groups:
- Interval measure? t test
- Nominal or ordinal measure? Chi-square
- One group:
- Interval measure? Correlated t test, ANOVA
- Nominal or ordinal measure? Sign test, Kolmogorov-Smirnov, Signed rank, Mann-Whitney U
Testing for a Relationship: Algorithm
- Is the research question asking for a relationship?
- If yes, determine the number of variables.
- If no, the research question is asking for a difference (refer to the other algorithm).
- Two variables or more than two variables?
- Two variables:
- Interval measure? Pearson product moment correlation, Point-biserial
- Nominal or ordinal measure? Phi coefficient, Kendall's tau, Spearman's rho, Contingency coefficient
- More than two variables:
- Interval measure? Multiple regression, Path analysis, Canonical correlation, Discriminant function analysis
- Nominal or ordinal measure? Logistic regression
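The relationship algorithm above amounts to a two-step lookup: number of variables first, then level of measurement. The sketch below encodes only the branches listed in the notes; the names `RELATIONSHIP_TESTS` and `relationship_tests` are hypothetical, and ratio data are rejected because the algorithm as written does not mention them.

```python
# Candidate tests from the notes, keyed by (number of variables, measure).
RELATIONSHIP_TESTS = {
    ("two", "interval"): [
        "Pearson product moment correlation", "Point-biserial",
    ],
    ("two", "nominal or ordinal"): [
        "Phi coefficient", "Kendall's tau", "Spearman's rho",
        "Contingency coefficient",
    ],
    ("more than two", "interval"): [
        "Multiple regression", "Path analysis", "Canonical correlation",
        "Discriminant function analysis",
    ],
    ("more than two", "nominal or ordinal"): ["Logistic regression"],
}

def relationship_tests(number_of_variables, level_of_measurement):
    """Return the tests the notes list for a relationship question."""
    if number_of_variables < 2:
        raise ValueError("A relationship needs at least two variables.")
    level = level_of_measurement.strip().lower()
    if level not in {"nominal", "ordinal", "interval"}:
        raise ValueError(f"Level not covered by the algorithm: {level_of_measurement!r}")
    count_key = "two" if number_of_variables == 2 else "more than two"
    measure_key = "interval" if level == "interval" else "nominal or ordinal"
    return RELATIONSHIP_TESTS[(count_key, measure_key)]

print(relationship_tests(2, "ordinal"))
print(relationship_tests(4, "interval"))
```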
Conclusion: Evaluating Data Analysis
- When examining the data analysis, ask yourself:
- Is the data analysis (testing) appropriate for the:
- Research question or hypothesis?
- Design of the study?
- Methods used in the study?
- Type of data collected?
- Clues to the appropriate test must come from the research question or hypothesis.
- Look at the findings to determine if they are appropriate and applicable to the patient population and practice setting.
Review for Descriptive Statistics
- Were appropriate descriptive statistics used?
- What level of measurement is used for each major variable?
- Is the sample size large enough to prevent one extreme score from affecting the summary statistics used?
- What descriptive statistics are reported?
- Were these descriptive statistics appropriate to the level of measurement for each variable?
- Are appropriate summary statistics provided for each major variable?
Review for Inferential Statistics
- Does the level of measurement enable the use of parametric statistics?
- Is the sample size large enough to use parametric statistics?
- Are the results for each of the hypotheses presented clearly and appropriately?
- Are the results clear?
- Is a distinction made between practical significance and statistical significance?
Summary
- Review key points and critical thinking questions at the end of the chapter.
- Questions/concerns
- Email: binchj@algonquincollege.com