Numerical Computation and Algorithms

0.0(0)

Studied by 0 people

Call Kai

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Knowt Play

Card Sorting

1/40

There's no tags or description

Looks like no tags are added yet.

Last updated 3:26 PM on 6/13/26

Name	Mastery	Learn	Test	Matching	Spaced	Call with Kai

No analytics yet

Send a link to your students to track their progress

41 Terms

New cards

a numerical method usually produces an approximation, not the exact answer
if the error becomes too large:
- the result becomes unreliable
- engineering conclusions may be incorrect
main types of numerical error:
1. roundoff error
2. truncation error
also important:
- error accumulation
- numerical stability
- convergence testing

Numerical Error

New cards

computers store real numbers using floating point representation
a floating point number contains:
- sign bit
- exponent
- significand (mantissa)
common precisions:
Precision
Bits
Approx. Accuracy
Single precision
32-bit
~7 decimal digits
Double precision
64-bit
~16 decimal digits
python/NumPy usually use:
- float32 → single precision
- float64 → double precision

Precision	Bits	Approx. Accuracy
Single precision	32-bit	~7 decimal digits
Double precision	64-bit	~16 decimal digits

floating point representation

New cards

some numbers can be represented exactly
example:
- 6.5 is exact in binary floating point
however, many decimal numbers cannot be represented exactly
example:
- 0.1
stored approximately as:
- 0.100000001490116119…
this creates representation error before calculations even begin

representation error

New cards

absolute error:
- measures the direct difference:
  |x−fl(x)|
- where:
  - x = exact value
  - fl(x) = floating point approximation
relative error
- scales the error relative to the size of the number:
  |x−fl(x)∣|/|x|
- relative error is usually more useful because it behaves like a percentage error

absolute and relative error

New cards

machine epsilon (ϵ_m) is the maximum relative error caused by floating point representation
|x−fl(x)∣|/|x| <= ϵ_m
IEE 754 Values:
Precision
Machine Epsilon
Single precision
(2^{-23} \approx 1.19 \times 10^{-7})
Double precision
(2^{-52} \approx 2.22 \times 10^{-16})
smaller epsilon = higher precision

Precision	Machine Epsilon
Single precision	(2^{-23} \approx 1.19 \times 10^{-7})
Double precision	(2^{-52} \approx 2.22 \times 10^{-16})

machine epsilon

New cards

normal arithmetic laws do not always hold in floating point arithmetic
associativity failure:
- normally:
  (a+b)+c=a+(b+c)
- but floating point rounding can make the results different
important implication
- never compare floating point numbers using:
```
a == b
```
- instead use:
```
abs(a-b) < tolerance
```

floating point arithmetic problems

New cards

occurs when ading number with very different magnitudes
example:
- 1.2257 + 249.237
the smaller number loses significant digits when exponents are aligned
result:
- small contributions may partially disappear
- this is called absorption

absorption error

New cards

after arithmetic operations, the result may require more digits than available precision
the extra digits must be rounded away
example:
- 425 + 670 = 1095
with only 3-digit precision:
- 1.095×10³
- becomes: 1.09×10³
information is lost due to rounding

roundoff error

New cards

multiplication often requires approximately double the significand bits for exact storage
example:
6.954 × 8.212 = 57.106248
if only 4-digit precision is available:
- 57.10
- 57.11
must be used instead
this introduces rounding error

multiplication error

New cards

occurs when subtracting two nearly equal numbers
example:
102.0577 - 102.055 = 0.0027
many leading digits cancel out
small representation errors because huge relative errors
why it is dangerous:
- the final answer may have:
  - very few accurate digits
  - extremely large relative error

catastrophic cancellation

New cards

round by chop:
- simply cuts off extra digits
- example: 1.299 → 1.2
round to nearest:
- rounds to the closest machine number
- 1.299 → 1.3
- comparison:
  - more accurate
  - slightly more computationally expensive
  - used in most modern systems

rounding strategies

New cards

truncation error occurs because numerical methods simplify or approximate mathematics
even with perfect arithmetic, the approximation itself causes error
example: Leibniz series for pi
pi = 4∑^∞_k=0(-1)^k/(2k+1)
the series is infinite, so we truncate it after N terms:

π≈4∑_k=0^N (−1)^k/(2k+1)

because the inifinite series is cut short:
- approximation error remains
- this is truncation error

truncation error

New cards

errors build up during repeated calculations
sources include:
- representation error
- absorption
- roundoff
- cancellation
- truncation
iterative algorithms are especially vulnerable because they perform many operations
lower precision (e.g. half precision) accumulates error faster

error accumulation

New cards

a good iterative numerical method should:
- gradually approach the true solution
- reduce error as iterations increase
example:
- increasing N in the Leibniz series improves approximation of pi

convergence of numerical methods

New cards

used to determine whether iterations are approaching a stable solution
absolute convergence test:
∣y_k−y_k−1∣<ϵ_t
- characteristics:
  - sensitive for large values

relative convergence test:
∣y_k−y_k−1∣/∣y_k∣<ϵ_t
- characteristics:
  - sensitive for small values
uniform convergence test:
∣y_k−y_k−1∣/1+∣y_k∣<ϵ_t
- characteristics:
  - combines benefits of both tests
  - often the most robust
- where:
  - k = iteration number
  - ϵ_t = tolerance

numerical convergence tests

New cards

many numerical methods are iterative (they repeatedly improve an approximation)
a method is said to converge if the solution approaches the true answer as:
- the number of iterations increases, or
- the step size decreases
example: Leibniz series for pi
pi = 4∑^∞_k=0(-1)^k/(2k+1)
the approximation gets closer to the true value of pi

why convergence matters

New cards

error can be checked directly:
|π−π_N|≤ϵt
where:
- ϵ_t= chosen tolerance

when the exact solution is known

New cards

for many engineering ODEs the true solution is unavailable
instead:
1. solve using step size h
2. solve again using a smaller step size
3. compare the results
if the difference becomes sufficiently small, convergence is assumed

when the exact solution is unknown

New cards

absolute test
|y_k−y_k−1|<ϵ_t
relative test
|y_k−y_k−1|/|yk|<ϵt
uniform test
|y_k−y_k-1|/1+|y_k|
where:
- k = iteration number
- ϵ_t= tolerance

common convergence tests

New cards

absolute test:
- good for: small-valued solution
- problem: sensitive to large values

relative test:
- good for: large-valued solutions
- problem: unstable when solution approaches zero
uniform test:
- combines both approaches:
  - behaves like absolute test for small values
  - behaves like relative test for large values
  - most robust convergence test

advantages and disadvantages of convergence tests

New cards

General form:

dy/dt=f(t,y)

where:

t = independent variable
y = dependent variable
f(t,y) = derivative function

Goal:

Determine y(t)

Initial condition y(0)
Symbol
Meaning
(t)
Independent variable
(y)
Dependent variable
(h)
Step size
(k)
Iteration number
(k=0)
Initial condition
(t^{(k+1)})
Next time step
(y^{(k)})
Numerical solution at step k

Symbol	Meaning
(t)	Independent variable
(y)	Dependent variable
(h)	Step size
(k)	Iteration number
(k=0)	Initial condition
(t^{(k+1)})	Next time step
(y^{(k)})	Numerical solution at step k

first-order ODEs

New cards

Runge-Kutta methods are numerical techniques used to solve ODEs
they estimate the solution by evaluating derivatives at several locations within each step
general form:
y^(k+1)=y^(k)+h∑α_if_i
where:
- fi = derivative evaluations
- αi= weights
two categories:
- Explicit RK Methods:
  - current derivative evaluations depend only on previous ones
  - examples:
    - euler
    - improved euler
    - RK4
- Implicit RK Methods:
  - derivative evaluations depend on future unknown values

Runge-Kutta (RK) Methods

New cards

The simplest RK Method
Formula:
y^(k+1)=y(^k)+hf₀
where:
interpretation:
- uses:
  - one derivative evaluation
  - evaluated at the start of the step
- essentially assumes:
  - slope remains constant throughout the step
characteristics:
- advantages:
  - very simple
  - very cheap computationally
- disadvantages:
  - least accurate
  - larger truncation error
  - can become unstable

Euler Method

New cards

also called a predictor-corrector method
uses two derivative evaluations
step 1: predictor
- calculate initial slope:
  f₀=f(t^(k),y^(k))
- predict endpoint:
  y^∗=y^(k)+hf₀
step 2: corrector
- evaluate slope at predicted endpoint
  f₁=f(t^(k)+h,y^∗)

final update:
y^(k+1)=y^(k)+h(f₀+f₁/2)
characteristics:
- advantages:
  - more accurate than Euler
  - better estimate of average slope
- disadvantages:
  - twice the computational cost

Improved Euler Method

New cards

uses four derivative evaluations
RK4 Equations:
- first slope:
  f₀=f(t,y)
- second slope:
  f₁=f(t+h/2,y+hf₀/2)
- third slope:
  f₂ = f(t + h/2, y + hf₁/2)
- fourth slope:
  f₃ = f(t + h, y + hf₂)
- final update:
  y^(k+1) = y^(k) + h(f₀ + 2f₁ + 2f₂ + f₃/6)
why RK4 works well:
- uses:
  - start of interval
  - middle of interval
  - middle again
  - end of interval
    to estimate the average slope
- produces very high accuracy for reasonable cost

Classical RK4 Method

New cards

a compact way of writing Runge-Kutta methods
contains:
- β values:
  - nodes: wher derivatives are evaluated
- γ values:
  - RK matrix: how derivative evaluations depend on each other
- α values:
  - weights: used in final weighted average
important property:
- for explicit RK methods:
  - the RK matrix must be:
    - lower triangular
    - zero along diagonal
  - reason: each derivaive can only depend on previously computed derivatives

Butcher Tableau

New cards

error caused by simplifying mathematics
occurs even with perfect arithmetic
RK methods only sample the derivative at finite locations
the actual derivative may vary continuously between those locations
therefore some information is lost

truncation error

New cards

Taylor Series:
y(t+h)=y(t)+hdt/dy+(h²/2!)(d²y/dt²)+⋯
euler keeps only the first derivative term
everything else is ignored
therefore:
local error = O(h²)

Taylor Series and Euler Error

New cards

local error
- error introduced in ONE step
- for euler:
  ϵ_l∝h²
- halving h:
  - local error reduced by factor 4
global error
- total accumulated error after many steps
- for euler:
  ϵ_g∝h
- halving h:
  - global error reduced by factor 2

local vs global error

New cards

the higher the order:
- faster convergence
- greater accuracy
for the same step size

Method	Derivatives	Order	Local Error
Euler	1	1st	(O(h^2))
Improved Euler	2	2nd	(O(h^3))
RK4	4	4th	(O(h^5))

more derivative evaluations usually improve accuracy
however:
number of derivatives does not equal order
higher-order methods can be constructed in many ways

order of accuracy

New cards

two appraoches:
method 1: smaller step size
- reduce h
- advantages:
  - more accurate
- disadvantages:
  - more iterations
  - more computation time
method 2: higher-order method
- use:
  - improved euler
  - RK4
- advantages:
  - greater accuracy
- disadvantages:
  - more derivative evaluations per step

reducing numerical error

New cards

a method is stable if its numerical solution behaves similarly to the true solution
stable solution
if exact solution decays:
- numerical solution also decays

unstable solution:
- if exact solution decays:
  - numerical solution grows
  - this causes error to dominate

numerical stability

New cards

consider:
dt/dy=ay
exact solution:
y=Ce^at
if a<0:
- the true solution decays toward zero
euler update:
y^(k+1)=(1+ha)y^(k)
- after k steps:
  y^(k)=(1+ha)^ky⁽⁰⁾

stability analysis of euler method

New cards

for stability:
|1+ha|<1
which gives:
−2<ha<0
therefore:
h< -2/a
(for a<0)
stability regions:
Condition
Behaviour
(-2<ha<0)
Stable
(ha=-2)
Neutral
(ha<-2)
Unstable

Condition	Behaviour
(-2<ha<0)	Stable
(ha=-2)	Neutral
(ha<-2)	Unstable

euler stability criterion

New cards

sampled data: measurements of a continuous process taken at discrete points
independent variable: t (e.g. time)
dependent variable: y
for n data points: t_i are measurement points, y_i = y(t_i) are the measured values
common questions answered using sampled data:
- what is the relationship between y and y?
- whast is y at an unmeasured point (interpolation)?
- what is y outside the measured range (extrapolation)?
three key tools: regression, interpolation, extrapolation

sampled data

New cards

concept:

regression fits a single function (commonly a polynomial) that approximate the overall trend of the data
general m-order polynomial:
f(t)=a₀+a₁t+a₂t²+⋯+a_mt^m=∑^m_k=0a_kt^k
linear regression = special case where m = 1
goal: choose the lowest-order polynomial that fits the data reasonably well

goodness of fit — sum of squared differences (R²):

R²=ⁱ⁼⁰∑_n−1(y(t_i)−f(t_i))²

smaller R² = better fit
least-squares regression = finding the m + 1 coefficients a₀, …, a_m that minimise R²

minimising R²:

treat R² as a multivariable function of the coefficients of a₀, … a_m
find stationary points by setting all partial derivatives to zero:
∂(R²)/∂a₀=∂(R²)/∂a₁=⋯=∂(R²)/∂a_m=0
the gives m + 1 equations for m + 1 unknown coefficients

Matrix Form — The Vandermonde System:

the resulting system can be written as Ax = b:

A = Vandermonde (coefficient) matrix
b = vector built from sums involved y_i
solve Ax = b with a linear algebra solver → get coefficients a₀, …, a_m. Once found, evaluate f(t_j) to approximate y(t_j) at any desired point.

Worked Example (8 data points):

first-order (linear) fit: poor fit, large R² = 277
higher-order fit (m = n - 1 = 7): passes through every data point exactly, R² = 0

Overfitting:

R² = 0 does not mean the model is “best” — it may just be overfitting
warning signs/checks for overfitting:
- does the function make physical sense?
- are the polynomial coefficients unusually large?
- are there many parameters relative to the number of data points
aim: a model that is reasonable, not just one that minimises R².

regression

New cards

General Idea:

interpolation fits n - 1 piecewise functions f_i(t), each exactly matching the data at the endpoints of its subinterval:
- f₀(t₀) = y(t₀), f₀(t₁) = y(t₁)
- f₁(t₁) = y(t₁), f₁(t₂) = y(t₂)
- … and so on
to estimate y(t_j) for t_i <= t_j <= t_i+1, use f_i(t_j).

Linear Interpolation:

fits a first-order polynomial (straight line) in each subinterval:
f_i(t) = y_i + (y_i+1 - y_i/t_i+1 - t_i)(t - t_i)
simple but only C⁰continuous (line has “kinks” at data points)

Natural Cubic Splines:

fits a cubic polynomial in each of the n - 1 subintervals:
f_i(t)=a_i+b_i(t−t_i)+c_i(t−t_i)²+d_i(t−t_i)³
first derivative:
f_i′(t)=b_i+2c_i(t−t_i)+3d_i(t−t_i)²
second derivative:
f_i′′(t)=2c_i+6d_i(t−t_i)
“natural” condition: second derivative = 0 at the two endpoints of the dataset
provides C⁰, C¹, and C² continuity → much smoother curve than linear interpolation

Building the System of 4(n-1) equations

for n-1 cubics, there are 4(n-1) unknown coefficients (a_i, b_i, c_i, d_i), the equations come from:
1. function value matching (2(n-1) equations)
  - left end: a_i = y_i
  - right end: a_i + b_ih_i + c_ih_i² + d_ih_i³ = y_i+1, where h_i = t_i+1 - t_i
2. first derivative continuity (n-2 equations) — between adjacent cubics at shared points:
  b_i+2c_ih_i+3d_ih_i²−b_i+1=0
3. second derivative continuity (n-2 equations):
  2c_i+ 6d_ih_i- 2c_i+1 = 0
4. natural boundary conditions (2 equations) — zero curvature at first and last points:
  - 2c₀ = 0
  - 2c_n-2 + 6d_n-2h_n-2 = 0
total: 4(n-1) equations → solved as Ax = b for all the cubic coefficients

Interpolation

New cards

extrapolation: estimating y outside the range of the sampled data (vs. interpolation, which is within the range).
any interpolating or regression function g(x) can in principle be used as the extrapolating function
key danger: accuracy of extrapolation is not bounded by known data — be very cautious
example: a linear fit extrapolates “reasonably”, but the high-order (order-7) polynomial that fit the data perfectly (R² = 0) blows up wildly outside the data range — a classic overfitting/extrapolation hazard

extrapolation

New cards

5.1 Overview

Used to numerically approximate I = ^b_a∫y(t) dt using sampled data
closed rules: integration limits are known data points (t₀ = a, t_n-1 = b) — the focus of this course
open rules: limits are not known data points (not covered here)

Trapezoid Rule:

approximates the integrand with a straight line between a and b

^b_a∫y(t)dt≈h/2(y(a)+y(b)),h=b−a

geometrically: area of the trapezoid under the line connecting the endpoints

5.3 Simpson’s 1/3 Rule

fits a quadratic through a, midpoint, and b:

∫^a_by(t)dt≈h/3[y(a)+4y(a+b/2)+y(b)],h=(b−a)/2

requires the function value at the midpoint

5.4 Simpson’s 3/8 Rule

fits a cubic through 4 points (a, two interior points, b):
^b_a∫y(t)dt≈3h/8[y(a)+3y(2a+b/3)+3y(a+2b/3)+y(b)],h=3b−a
requires two additonal interior points

5.5 Limitations:

assumes the integrand is known at specific points — may not match available sampled data
accuracy suffers when:
- h (step size) is large, or
- the function varies significantly between the limits

Newton-Cotes Rules (Numerical Integration)

New cards

6.1 Idea

split [a,b] into multiple subintervals and sum the integral approximation over each:
I = I₁ + I₂ + I₃ + …
increasing the number of subintervals increases accuracy (e.g. example showed % error dropping from -26.3% with 2 subintervals to -2.5% with 7 subintervals)

6.2 Composite Trapezoidal Rule

each subinterval contributes:
I_i=(t_i+1−t_i)(y(t_i)+y(t_i+1))/2
sum over all n-1 subintervals:
^b_a∫y(t)dt≈ ⁿ⁻²_i=0∑t_i+1−t_i/2(y(t_i)+y(t_i+1))

Composite Simpson’s 1/3 Rule

Requires fitting a quadratic per subinterval → needs 3 points per subinterval
preconditions:
- uniform step size
- odd number of data points (so subintervals can pair up correctly)
adjacent subintervals share only their endpoints:
- subinterval 0: y(t₀), y(t₁), y(t₂)
- subinterval 1: y(t₂), y(t₃), y(t₄)
- etc.
formula:

^b_^a∫y(t)dt≈h_i/3=0 ^{⌊n/2⌋−1}_i=0∑[y(t_2i)+4y(t_2i+1)+y(t_2i+2)]

a composite Simpson’s 3/8 rule also exists (not derived in these slides)

Composite Rules

New cards

7.1 Overview

approximates an integral via:
1. normalising (change of variable) the integral to the interval [-1,1]
2. evaluating the integrand at specific Gauss points, weight appropriately
for low-order polynomial, this can give an exact result (zero error)

7.2 General Formula

¹_-1∫f(ξ)dξ=ϵ+ ^G−1_g=0∑w_gf(ξ_g)

ϵ = numerical error
ξ_g = Gauss points
w_g = weights (sum to 2, the width of [−1,1])
G = number of Gauss points

7.3 Three-point Guass Quadrature (example scheme)

ξ0=−sqrt(3/5),ξ1=0,ξ2=sqrt(3/5) w₀=5/9,w₁=8/9,w₂=5/9

7.4 Worked Example

exact integral: ¹_-1∫3x²dx=2
using 3-point quadrature with f(x) = 3x²:
I = 5/9(3)(sqrt(3/5))² + 8/3(3)(0)² +5/9(3)(sqrt(3/5))² = 1 + 0
result is exact (ϵ=0) — demonstrates Gaussian quadrature’s strength for polynomial integrands

Gaussian Quadrature