Statistics Tests, Intro to Calibration Curves

Statistical Significance and Testing

Two sets of measurements on the same quantity typically differ in average and standard deviation.
Statistical tools determine the probability of a conclusion.
Accept conclusions with a high probability of being correct.
Reject conclusions with a high probability of being incorrect.
Null hypothesis: Two datasets are drawn from populations with the same properties.
- Observed differences arise only from random variation in measurements.
- Reject the null hypothesis if there is less than a 5% probability of observing the experimental results when the two populations have the same value.
The null hypothesis is used with several tests.
- F-test: compares standard deviation ( $s$ ).
- t-test: compares mean ( $m$ ).

F-Test: Comparison of Standard Deviations

Determines if standard deviations of two measurement sets are statistically different.
Formula: $F = \frac{s_1^2}{s_2^2}$, where $s_1 \ge s_2$.
- The larger standard deviation is placed in the numerator to ensure $F \ge 1$.
Decision Rule:
- If $F_{\text{calculated}} > F_{\text{table}}$, reject the null hypothesis.
- This indicates a less than 5% chance that the two data sets came from populations with the same population standard deviation.
- The difference is considered significant.
Degrees of freedom for $n$ measurements are $n - 1$ .

Example Problem 1: F-Test

Context: Testing for unethical injection of NaHCO3 in horses to neutralize lactic acid.
Objective: Certify a new instrument by comparing its standard deviation to that of an original instrument.
Data:
- Original instrument:
  - Mean: 36.14 mM
  - Standard deviation: 0.28 mM
  - Number of measurements: 10
- Substitute instrument:
  - Mean: 36.20 mM
  - Standard deviation: 0.47 mM
  - Number of measurements: 4
Question: Is the standard deviation from the substitute instrument significantly different from the original instrument?
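The comparison asked for here can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: the variable names are illustrative, and the critical value 5.08 is hard-coded (it is the table value these notes quote for this example when the data are revisited in Example Problem 4).

```python
# F-test sketch for the two bicarbonate instruments (summary statistics from the text).
s_original, n_original = 0.28, 10      # standard deviation (mM), number of measurements
s_substitute, n_substitute = 0.47, 4

# Put the larger standard deviation in the numerator so that F >= 1.
(s_big, n_big), (s_small, n_small) = sorted(
    [(s_original, n_original), (s_substitute, n_substitute)], reverse=True
)
f_calculated = s_big**2 / s_small**2
df_numerator = n_big - 1
df_denominator = n_small - 1

# Two-tailed critical value (alpha = 0.05) for 3 and 9 degrees of freedom.
# Hard-coded assumption taken from a table, e.g. Excel's F.INV.RT(0.025, 3, 9).
f_table = 5.08

reject_null = f_calculated > f_table
print(f"F = {f_calculated:.2f}, df = ({df_numerator}, {df_denominator}), reject: {reject_null}")
```

Because the calculated F is below the table value, the null hypothesis is retained: the two standard deviations are not significantly different at this confidence level.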

F-Test Critical Values (Two-Tailed)

Table of critical F-values for a two-tailed test with α = 0.05.
Degrees of freedom: Numerator ( $s_1$ ) and denominator ( $s_2$ ).
If $F_{\text{calculated}} > F_{\text{table}}$, reject the null hypothesis that $\sigma_1 = \sigma_2$.
Excel function: F.INV.RT(probability, degree_freedom1, degree_freedom2).
- =F.INV.RT(0.025, 7, 6) reproduces F = 5.70.
- =F.INV.RT(0.05, 7, 6) gives F = 4.21 for 90% confidence (two-tailed) or 95% confidence (one-tailed).

iClicker Question: F-Test Interpretation

True or False: If $F_{\text{calculated}} > F_{\text{table}}$ at the 95% confidence level, there's less than a 5% chance the data sets come from populations with the same standard deviation.
Answer: True

Calculating Confidence Intervals

Student's t-test is used to compare results from different experiments.
It evaluates the probability that an observed experimental result agrees with a known value.
If we repeat $n$ measurements many times and compute mean and standard deviation, the 95% confidence interval would include the true population mean in 95% of the sets.
Interpretation: We are 95% confident that the true mean lies within the confidence interval.

Table 4-4: Values of Student’s t

Table provides t-values for different degrees of freedom and confidence levels.
Used in calculating confidence intervals.
σ may be substituted for s if you have extensive experience with a particular method and have determined its “true” population standard deviation.
Values of t in this table apply to two-tailed tests.
Excel function: T.INV.2T(probability, deg_freedom).
- For 12 degrees of freedom and 95% confidence, =T.INV.2T(0.05, 12) gives t = 2.179.

Example Problem 2

The carbohydrate content of a glycoprotein is found to be 12.6, 11.9, 13.0, 12.7, and 12.5 wt% in replicate analyses.
Find the 50% and 90% confidence intervals for the carbohydrate content.
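One way to work this problem is with the usual confidence-interval expression $\bar{x} \pm ts/\sqrt{n}$. The sketch below assumes that expression; the Student's t values for 4 degrees of freedom are hard-coded from a standard two-tailed table rather than taken from these notes, and the variable names are illustrative.

```python
import math
import statistics

carbohydrate_wt_percent = [12.6, 11.9, 13.0, 12.7, 12.5]
n = len(carbohydrate_wt_percent)
mean = statistics.mean(carbohydrate_wt_percent)
s = statistics.stdev(carbohydrate_wt_percent)   # sample standard deviation (n - 1)

# Student's t for n - 1 = 4 degrees of freedom, keyed by confidence level (%).
# Assumption: values copied from a standard two-tailed t table.
t_values = {50: 0.741, 90: 2.132}

# Half-width of each confidence interval: t * s / sqrt(n).
half_widths = {level: t * s / math.sqrt(n) for level, t in t_values.items()}
for level, half_width in half_widths.items():
    print(f"{level}% CI: {mean:.2f} +/- {half_width:.2f} wt%")
```

The 90% interval is wider than the 50% interval, as expected: a higher confidence level requires a larger range around the same mean.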

Reporting Uncertainty

Report uncertainty as standard deviation or confidence interval.
Reduce uncertainty by making more measurements.

t-Test: Comparison of Means

Determines if the means of two measurement sets are statistically different.
If $t_{\text{calculated}} > t_{\text{table}}$, reject the null hypothesis.
- Implies less than 5% chance that the two datasets came from populations with the same population mean.
- The difference is considered significant.
Random variations in measurements typically cause the means of two sets to differ. The t-test helps determine if this difference is statistically significant.

t-Test: Comparison of Means - Cases

Case 1: Compare to a known value:
- Measure a quantity multiple times.
- Obtain $\bar{x}$ and $s$ .
- Compare $\bar{x}$ to the accepted answer, $\mu$ .
Case 2: Compare $\bar{x}_1$ to $\bar{x}_2$ with replicate samples:
- Measure a quantity multiple times by two different methods.
- Obtain $\bar{x}_1 \pm s_1$ and $\bar{x}_2 \pm s_2$.
- Do $\bar{x}_1$ and $\bar{x}_2$ agree within experimental uncertainty?
Case 3: Compare two methods where samples are not duplicated:
- Measure sample A once by method 1 and once by method 2.
- Measure sample B once by method 1 and once by method 2.
- Do the two methods agree within experimental uncertainty?

Example Problem 3: Case 1

Comparing with “known” Value
A coal sample is certified to contain 3.19 wt% sulfur.
A new analytical method measures values of 3.29, 3.22, 3.30, and 3.23 wt% sulfur, giving a mean of 3.26 wt% and a standard deviation of 0.041 wt%.
Does the answer using the new method agree with the known answer?
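A sketch of the Case 1 comparison, assuming the common form $t_{\text{calculated}} = \frac{|\bar{x} - \mu|}{s}\sqrt{n}$ (the notes do not write this formula out). The table value of t for 3 degrees of freedom is hard-coded from a standard two-tailed table, and the variable names are illustrative.

```python
import math
import statistics

known_value = 3.19                       # certified wt% sulfur
measurements = [3.29, 3.22, 3.30, 3.23]  # wt% sulfur from the new method

n = len(measurements)
mean = statistics.mean(measurements)
s = statistics.stdev(measurements)       # sample standard deviation (n - 1)

# How many standard deviations of the mean separate the result from the known value.
t_calculated = abs(mean - known_value) / s * math.sqrt(n)

# Two-tailed 95% Student's t for n - 1 = 3 degrees of freedom.
# Assumption: value copied from a standard t table.
t_table = 3.182

significantly_different = t_calculated > t_table
print(f"t = {t_calculated:.2f} vs t_table = {t_table}: different = {significantly_different}")
```

With these numbers the calculated t is slightly above the table value, so the new method's result differs from the certified value at the 95% confidence level, though not by a wide margin.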

Example Problem 4: Case 2a

Comparing Replicate Measurements When Standard Deviations Are Not Significantly Different
Recall: HCO3- in horse blood is measured after each race.
- Original instrument:
  - Mean: 36.14 mM
  - Standard deviation: 0.28 mM
  - Number of measurements: 10
- Substitute instrument:
  - Mean: 36.20 mM
  - Standard deviation: 0.47 mM
  - Number of measurements: 4
$F_{\text{calculated}} = \frac{s_1^2}{s_2^2} = \frac{(0.47)^2}{(0.28)^2} = 2.82$
$F_{\text{calculated}} = 2.82 < F_{\text{table}} = 5.08$, so $s_1$ and $s_2$ are not significantly different.
Do the means of the two methods differ?
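When the standard deviations are not significantly different, a common approach is to pool them before computing t. The sketch below assumes the standard pooled formulas, which these notes do not write out: $s_{\text{pooled}} = \sqrt{\frac{s_1^2(n_1-1) + s_2^2(n_2-1)}{n_1+n_2-2}}$ and $t_{\text{calculated}} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{\text{pooled}}}\sqrt{\frac{n_1 n_2}{n_1+n_2}}$.

```python
import math

mean_1, s_1, n_1 = 36.14, 0.28, 10   # original instrument (mM)
mean_2, s_2, n_2 = 36.20, 0.47, 4    # substitute instrument (mM)

# Pooled standard deviation: each variance is weighted by its degrees of freedom.
df = n_1 + n_2 - 2
s_pooled = math.sqrt((s_1**2 * (n_1 - 1) + s_2**2 * (n_2 - 1)) / df)

t_calculated = abs(mean_1 - mean_2) / s_pooled * math.sqrt(n_1 * n_2 / (n_1 + n_2))

# Two-tailed 95% Student's t for 12 degrees of freedom, i.e. T.INV.2T(0.05, 12).
t_table = 2.179

means_differ = t_calculated > t_table
print(f"s_pooled = {s_pooled:.3f} mM, t = {t_calculated:.2f}, differ: {means_differ}")
```

The calculated t is far below the table value, so under these assumptions the two instruments' means are not significantly different.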

Example Problem 5: Case 2b

Comparing Replicate Measurements When Standard Deviations Are Significantly Different
Dry air is composed of ~1/5 oxygen and ~4/5 nitrogen.
Nitrogen was measured in two experiments (at constant temperature, pressure, and volume):
- Mass of N2 after removing O2 from air
- Mass of N2 generated from chemical decomposition
Do the means of the two methods differ?
Lord Rayleigh and the Discovery of Argon

t-Test with Unequal Standard Deviations

If standard deviations of the two methods differ (F-test):
$t_{\text{calculated}} = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Degrees of freedom:
- $\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
Round degrees of freedom to the nearest integer.
Compare $t_{\text{calculated}}$ to $t_{\text{table}}$ at 95% confidence using appropriate degrees of freedom.
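The two formulas above translate directly into a small function. This is a sketch: the function name `welch_t` is hypothetical, and because the nitrogen data are not reproduced in these notes, the call reuses the horse-blood summary statistics purely to exercise the arithmetic (the F-test did not actually call for this form in that example).

```python
import math

def welch_t(mean_1, s_1, n_1, mean_2, s_2, n_2):
    """Return (t_calculated, degrees_of_freedom) for unequal standard deviations."""
    v_1 = s_1**2 / n_1   # squared standard uncertainty of the first mean
    v_2 = s_2**2 / n_2   # squared standard uncertainty of the second mean
    t_calculated = abs(mean_1 - mean_2) / math.sqrt(v_1 + v_2)
    dof = (v_1 + v_2) ** 2 / (v_1**2 / (n_1 - 1) + v_2**2 / (n_2 - 1))
    return t_calculated, round(dof)   # degrees of freedom rounded to nearest integer

# Illustrative inputs only: summary statistics from the two bicarbonate instruments.
t_calc, dof = welch_t(36.14, 0.28, 10, 36.20, 0.47, 4)
print(f"t = {t_calc:.3f} with {dof} degrees of freedom")
```

The returned degrees of freedom are what you would use to look up the table value of t at 95% confidence.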

Case 3: Paired t-Test for Comparing Individual Differences

Do the methods give the same answer?
Use two methods to make single measurements on several different samples.
No measurement is duplicated.
$t_{\text{calculated}} = \frac{|\bar{d}|}{s_d / \sqrt{n}}$, where $\bar{d}$ is the average difference, $s_d$ is the standard deviation of the differences, and $n$ is the number of paired samples.
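A minimal sketch of the paired calculation follows. The paired results are hypothetical numbers invented for illustration (no paired data set appears in these notes), and the table value of t for 3 degrees of freedom is hard-coded from a standard two-tailed table.

```python
import math
import statistics

# Hypothetical paired results: each sample is measured once by each method.
method_1 = [1.0, 2.0, 3.0, 4.0]
method_2 = [1.1, 2.3, 3.2, 4.0]

# Work with the per-sample differences, not the two columns separately.
differences = [a - b for a, b in zip(method_1, method_2)]
n = len(differences)
d_bar = statistics.mean(differences)   # average difference
s_d = statistics.stdev(differences)    # standard deviation of the differences

t_calculated = abs(d_bar) / (s_d / math.sqrt(n))

# Two-tailed 95% Student's t for n - 1 = 3 degrees of freedom (assumed table value).
t_table = 3.182
methods_differ = t_calculated > t_table
print(f"t = {t_calculated:.2f}, methods differ: {methods_differ}")
```

Pairing removes the sample-to-sample variation, so the test asks only whether the average difference between methods is distinguishable from zero.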

iClicker Question: t-Test Interpretation

A t-test compares results from two different methods. If the difference is significant, this indicates:
A. There is a systematic error in at least one of the methods.
D. There is less than a 5% chance that the two sets of data were drawn from populations with the same population mean.
Correct Answer: D

One-Tailed and Two-Tailed Significance Tests

Two-tailed tests:
- A two-tailed t-test rejects the null hypothesis if the certified value lies in the outer 5% of the area under the curve (2.5% in each tail).
One-tailed tests:
- Compare mean with regulatory limit.
- 5% region lies only on one side of the certified mean.
Drinking water example: Concerned if arsenic (As) levels exceed the limit.
- EPA maximum permissible level = 10 µg As/L
- Water samples: 10.06, 10.12, 10.19, and 10.04 µg As/L; $\bar{x} = 10.1025 \pm 0.0675$
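The arsenic example can be sketched as a one-tailed test, because only values above the limit are of concern. The sketch assumes the same t expression as a comparison with a known value, and takes the one-tailed 95% table value for 3 degrees of freedom from the two-tailed 90% column of a standard t table (a hard-coded assumption, not a number given in these notes).

```python
import math
import statistics

limit = 10.0                              # EPA maximum permissible level, µg As/L
arsenic = [10.06, 10.12, 10.19, 10.04]    # measured water samples, µg As/L

n = len(arsenic)
mean = statistics.mean(arsenic)
s = statistics.stdev(arsenic)

# Signed difference: only an excess over the limit counts in a one-tailed test.
t_calculated = (mean - limit) / s * math.sqrt(n)

# One-tailed 95% Student's t for 3 degrees of freedom
# (same number as the two-tailed 90% entry; assumed table value).
t_table_one_tailed = 2.353

exceeds_limit = t_calculated > t_table_one_tailed
print(f"mean = {mean:.4f}, s = {s:.4f}, t = {t_calculated:.2f}, exceeds: {exceeds_limit}")
```

Under these assumptions the calculated t is larger than the one-tailed table value, so the mean arsenic level is significantly above the limit at 95% confidence.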

Grubbs Test: Check for Outliers

Used to determine whether to discard a discrepant data point (outlier).
Decision Rule:
- If $G_{\text{calculated}} > G_{\text{table}}$, reject the null hypothesis.
- Implies less than 5% chance that the suspicious data point belongs to the same population as the other measurements.
- The difference is considered significant.
When a data point is far from others:
- Check your notebook for recorded observations.
- Discard any data point based on recorded faulty procedure (“blunder”).

Grubbs Test Details

$G_{\text{calculated}} = \frac{|x_{\text{questionable}} - \bar{x}|}{s}$, where $\bar{x}$ and $s$ are calculated from all data points, including the questionable one.

Example Problem 6: Grubbs Test

Volumes for replicate titrations (mL): 28.54, 28.39, 28.47, 27.68
Should we reject the outlier?

iClicker Question: Grubbs Test Application

Molarity determination of NaOH by titration against KHP:
- Results: 0.1025 M, 0.1087 M, 0.1100 M, 0.1052 M, and 0.0997 M.
- Mean: 0.1052 M
- Standard deviation: 0.0043 M
Can 0.0997 M be discarded as an outlier?
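Both Grubbs questions (the titration volumes and the NaOH molarities) can be checked with one helper. This is a sketch: the function name `grubbs` is hypothetical, and the critical values for 4 and 5 observations at 95% confidence are hard-coded from a standard Grubbs table that is not reproduced in these notes.

```python
import statistics

# Critical G at 95% confidence, keyed by number of observations.
# Assumption: values copied from a standard Grubbs table.
G_TABLE_95 = {4: 1.463, 5: 1.672}

def grubbs(values, questionable):
    """Return (G_calculated, G_table, discard) for one suspect data point."""
    mean = statistics.mean(values)      # mean includes the suspect point
    s = statistics.stdev(values)        # so does the standard deviation
    g_calculated = abs(questionable - mean) / s
    g_table = G_TABLE_95[len(values)]
    return g_calculated, g_table, g_calculated > g_table

titration = grubbs([28.54, 28.39, 28.47, 27.68], 27.68)
naoh = grubbs([0.1025, 0.1087, 0.1100, 0.1052, 0.0997], 0.0997)
print(f"Titration: G = {titration[0]:.3f} vs {titration[1]}, discard: {titration[2]}")
print(f"NaOH:      G = {naoh[0]:.3f} vs {naoh[1]}, discard: {naoh[2]}")
```

With these table values, the 27.68 mL titration narrowly exceeds the critical G and may be discarded, while 0.0997 M falls below it and should be retained.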

Method of Least Squares

Compares response to known quantities (standards).
Calibration curve: response vs. known standards.
Work in linear region (usually).
Method of least squares: “best” straight line through scattered experimental data.
Equation: $y = mx + b$ (quantify unknown from signal).

Finding the Equation of the Line (Least Squares)

Assumptions:
- Uncertainty in y values is much greater than in x values ( $s_y \gg s_x$ ).
- Uncertainties of all y values are similar.
Minimize vertical deviations between points and line.
Vertical deviation: $d_i = y_i - \hat{y} = y_i - (m x_i + b)$
Minimize sum of squared deviations to eliminate sign influence.

Determinants for Least Squares

$m = \frac{\begin{vmatrix} \sum x_i y_i & \sum x_i \\ \sum y_i & n \end{vmatrix}}{\begin{vmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{vmatrix}} = \frac{n(\sum x_i y_i) - (\sum x_i)(\sum y_i)}{n(\sum x_i^2) - (\sum x_i)^2}$
$b = \frac{\begin{vmatrix} \sum x_i^2 & \sum x_i y_i \\ \sum x_i & \sum y_i \end{vmatrix}}{\begin{vmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{vmatrix}} = \frac{(\sum x_i^2)(\sum y_i) - (\sum x_i y_i)(\sum x_i)}{n(\sum x_i^2) - (\sum x_i)^2}$
where $m$ is the slope, $b$ is the intercept, and $n$ is the number of points.
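The least-squares slope and intercept reduce to a handful of sums. The sketch below computes the slope as $[n\sum x_i y_i - \sum x_i \sum y_i]/D$ and the intercept as $[\sum x_i^2 \sum y_i - \sum x_i y_i \sum x_i]/D$, with $D = n\sum x_i^2 - (\sum x_i)^2$. The function name and the four data points are hypothetical, chosen only for illustration.

```python
def least_squares_line(x, y):
    """Return (slope m, intercept b) of the least-squares line y = m*x + b."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi**2 for xi in x)
    d = n * sum_x2 - sum_x**2                 # common denominator D
    m = (n * sum_xy - sum_x * sum_y) / d      # slope
    b = (sum_x2 * sum_y - sum_xy * sum_x) / d # intercept
    return m, b

# Hypothetical points for illustration only.
m, b = least_squares_line([1, 3, 4, 6], [2, 3, 4, 5])
print(f"y = {m:.4f} x + {b:.4f}")
```

A quick sanity check on any such fit: the line must pass through the point $(\bar{x}, \bar{y})$.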

Example Problem 7

Use the data provided to construct a line of best fit with least squares.

Reliability of Least Square Parameters

Estimate uncertainties in $m$ and $b$ using uncertainty analysis.
Estimate $\sigma_y$ by calculating $s_y$. Then,
- $s_y = \sqrt{\frac{\sum d_i^2}{n-2}}$
- $\sigma_m = s_y \sqrt{\frac{n}{n(\sum x_i^2) - (\sum x_i)^2}}$
- $\sigma_b = s_y \sqrt{\frac{\sum x_i^2}{n(\sum x_i^2) - (\sum x_i)^2}}$
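These three expressions can be evaluated alongside the fit itself. The sketch below is self-contained; the function name and the four data points are hypothetical and serve only to exercise the formulas.

```python
import math

def fit_with_uncertainty(x, y):
    """Least-squares line plus s_y and the standard uncertainties of m and b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi**2 for xi in x)
    d = n * sum_x2 - sum_x**2
    m = (n * sum_xy - sum_x * sum_y) / d
    b = (sum_x2 * sum_y - sum_xy * sum_x) / d
    # Vertical deviations d_i = y_i - (m*x_i + b).
    # Fitting two parameters (m and b) leaves n - 2 degrees of freedom.
    sum_d2 = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    s_y = math.sqrt(sum_d2 / (n - 2))
    sigma_m = s_y * math.sqrt(n / d)
    sigma_b = s_y * math.sqrt(sum_x2 / d)
    return m, b, s_y, sigma_m, sigma_b

# Hypothetical points for illustration only.
m, b, s_y, sigma_m, sigma_b = fit_with_uncertainty([1, 3, 4, 6], [2, 3, 4, 5])
print(f"m = {m:.3f} +/- {sigma_m:.3f}, b = {b:.3f} +/- {sigma_b:.3f}, s_y = {s_y:.3f}")
```

For these points the uncertainty in the slope appears in the second decimal place, so reporting the slope as 0.62 ± 0.05 would be consistent with the advice to let the uncertainty set the significant figures.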
Use calculated uncertainty to determine significant figures.

Calibration Curves

Calibration curve shows the response of an analytical method to known quantities of analyte.
Standard solutions contain known concentrations of analyte.
Blank solutions contain all reagents and solvents used in the analysis, but contain no deliberately added analyte.
A spectrophotometer measures the absorbance of light (y-axis), which is proportional to the quantity of protein analyzed (x-axis).

Constructing a Linear Calibration Curve

Prepare known samples of analyte to cover the range (0 to 150%) of concentrations expected for unknowns.
Tabulate amount of analyte in each standard and response.
Subtract the average absorbance of the blank solutions from each measured absorbance (corrected absorbance).
Blanks measure the response of the procedure when no analyte is present.
Make a graph of corrected absorbance vs. quantity of analyte.
Inspect the graph for linearity, outliers, and consistent uncertainty in y.
Use the least-squares procedure to find the best straight line through the linear portion of the data.
If you analyze an unknown at a future time, run a blank at that time. Subtract the new blank signal from the unknown to correct.

Example Problem 8

An unknown protein sample gave an absorbance of 0.406, and a blank had an absorbance of 0.104.
Table columns: amount of protein (µg), absorbance of independent standards, corrected absorbance.

iClicker Question

Calculate the amount of the unknown protein in µg.
Answer: A. 11 µg

Linear Response

The linear range of an analytical method is the analyte concentration range over which response is proportional to concentration.
Dynamic range is the concentration range over which there is a measurable response to analyte, even if the response is not linear.
Calibration procedures with a linear response are preferred.
Corrected analytical signal $\propto$ quantity of analyte.
It is possible to obtain valid results beyond the linear region by fitting with a nonlinear equation.

Non-Linear Calibration Curves

Consider fitting all data points with a quadratic equation:
$y = (-1.17 \times 10^{-4})x^2 + (0.01855)x - 0.0007$
Insert $y = 0.375$ into the equation and rearrange to the form $ax^2 + bx + c = 0$.
Solve for x: $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
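The rearrangement and solution can be sketched as follows, using the quadratic coefficients quoted above. The variable names are illustrative. Which root to report depends on the range covered by the standards, which is not stated in this section, so the sketch returns both.

```python
import math

# Coefficients of the quadratic calibration fit y = a*x**2 + b*x + c0 (from the text).
a = -1.17e-4
b = 0.01855
c0 = -0.0007

y_unknown = 0.375
c = c0 - y_unknown          # move y to the left side: a*x**2 + b*x + c = 0

discriminant = b**2 - 4 * a * c
# Both solutions of the quadratic formula, smallest first.
roots = sorted(
    (-b + sign * math.sqrt(discriminant)) / (2 * a) for sign in (+1, -1)
)
print(f"x = {roots[0]:.1f} or {roots[1]:.1f}")
```

The two roots are roughly 24 and 135 in the units of x; only a root that falls inside the range spanned by the calibration standards is physically meaningful.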

Propagation Uncertainty with Calibration Curves

An unknown with a corrected absorbance of $y = 0.302$ had a protein content of $x = 18.2(4) \mu g$ .

Proper Practice

Always make a graph of your data!
A graph helps you reject bad data, prompts you to repeat a measurement, or shows that a straight line is not appropriate.
All three data sets were fit to $y = 0.5x + 3$
It is not reliable to extrapolate any calibration curve beyond the measured range of standards.
At least six calibration concentrations and two replicate measurements of each unknown are recommended.
Make each standard solution from a certified material.
Avoid serial dilution of a single stock solution (serial dilution propagates systematic error).
Measure calibration solutions in random order, not in consecutive order of increasing concentration.

Spreadsheet Instructions

Excel has functions called SLOPE and INTERCEPT and the function LINEST to do linear regressions.

Adding Error Bars on Bar Graph

Excel 2016 instructions:
To insert error bars for uncertainties in column D:
- Double-click in the graph to get the Chart Tools/Design ribbon.
- Click Add Chart Element and select Error Bars.
- Choose More Error Bar Options.
- Click the Error Bar Options menu.
- Click Custom and Specify Value.
- For Positive Error Value, highlight D4:D9.
- For Negative Error Value, highlight D4:D9.
- Click OK; error bars appear on the chart.
- Click on one x error bar and press Delete to remove all x error bars.