Notes
What is functional data?
Functional data are data where each observation is a function, typically a smooth curve, surface, or anything that varies continuously over a domain such as time, space, or frequency.
Key assumption of functional data
The key assumption is smoothness: y_ij = x_i(t_ij) + ε_ij, where ε_ij is measurement error, t lies in a continuum (usually time), and x_i(t) is smooth. The functional data are the functions x_i(t).
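To make the indices in the observation model concrete, here is a minimal sketch that simulates discrete noisy records from smooth curves. The sine curve, the noise level 0.1, and the names `smooth_curve` and `observations` are illustrative assumptions, not part of the notes.

```python
import math
import random

def smooth_curve(t, phase):
    """Hypothetical smooth process x_i(t); each curve i gets its own phase."""
    return math.sin(2.0 * math.pi * t + phase)

random.seed(1)
n_curves = 3
observations = []  # one (times, values) pair per curve i
for i in range(n_curves):
    phase = 0.2 * i
    n_points = 5 + i  # curves need not share the same number of points
    # Observation times t_ij are irregular and differ between curves.
    times = sorted(random.random() for _ in range(n_points))
    # y_ij = x_i(t_ij) + eps_ij, with Gaussian measurement error (assumed sd 0.1).
    values = [smooth_curve(t, phase) + random.gauss(0.0, 0.1) for t in times]
    observations.append((times, values))
```

Each record is a set of (t_ij, y_ij) pairs; the functional datum is the underlying curve x_i(t), not the pairs themselves.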
Necessities for functional data? (5)
must believably derive from a smooth process
process should not be easily parameterizable (should not be able to write down a formula)
enough data to resolve the essential features of the process (peaks, zero-crossings, speed... will depend on application)
some repetition in the process
do not need equally-spaced or perfect measurements
Longitudinal vs Functional data
Because they are densely recorded, functional data have customarily been modeled with a nonparametric approach.
Often smoothness of the functions is assumed.
Longitudinal data have traditionally been modeled by a parametric approach, such as a linear mixed-effects model. However, it may not be easy to spot the pattern due to sparsity of and noise in the longitudinal data.
Both longitudinal and functional data may be observed with noise (measurement errors).
A strength of the FDA approach is its ability to handle noise.
Discrete to Functional Data
Allow evaluation of record at any time point (especially if observation times are not the same across records).
Evaluate rates of change.
Reduce noise.
Allow registration onto a common time-scale.
From Discrete to Functional Data
Basis Expansion
Fourier Basis Examples
Φ(t) = (1,sin(ωt), cos(ωt))
Φ(t) = (1,sin(ωt), cos(ωt),sin(2ωt), cos(2ωt),sin(3ωt), cos(3ωt))
Φ(t) = (1,sin(ωt), cos(ωt), . . . ,sin(6ωt), cos(6ωt))
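The three examples above differ only in the number of harmonics (1, 3, and 6). A minimal sketch of evaluating such a basis and a basis expansion follows; the function names and the choice ω = 2π/365 (an annual cycle measured in days) are assumptions for illustration.

```python
import math

def fourier_basis(t, n_harmonics, omega):
    """Return (1, sin(wt), cos(wt), ..., sin(Kwt), cos(Kwt)) evaluated at t."""
    values = [1.0]
    for k in range(1, n_harmonics + 1):
        values.append(math.sin(k * omega * t))
        values.append(math.cos(k * omega * t))
    return values

def evaluate_expansion(coefs, t, omega):
    """Evaluate x(t) = sum_k c_k * phi_k(t) for an odd-length coefficient list."""
    n_harmonics = (len(coefs) - 1) // 2
    basis = fourier_basis(t, n_harmonics, omega)
    return sum(c * p for c, p in zip(coefs, basis))

omega = 2.0 * math.pi / 365.0  # assumed period of 365 days
# Basis sizes for the three examples: 2K + 1 functions for K harmonics.
sizes = [len(fourier_basis(10.0, k, omega)) for k in (1, 3, 6)]
```

Once the coefficients are known, the curve can be evaluated at any t, not only at the observation times.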
Fourier Basis Advantages
Only alternative to monomial bases until the middle of the 20th century
Excellent computational properties, especially if the observations are equally spaced.
Natural for describing periodic data, such as the annual weather cycle
However, the basis functions are periodic; this can be a problem if the data are, for example, growth curves. The Fourier basis is still the first choice in many fields, such as signal analysis, even when the data are not periodic.
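One reason for the good computational properties with equally spaced observations is that the basis columns become orthogonal on such a grid, so the cross-product matrix is diagonal. The sketch below checks this numerically; the grid size of 12 points, the 3 harmonics, and all names are illustrative assumptions, and the property requires the number of harmonics to stay below half the number of points.

```python
import math

def fourier_basis(t, n_harmonics, period=1.0):
    """Return (1, sin(wt), cos(wt), ..., sin(Kwt), cos(Kwt)) with w = 2*pi/period."""
    omega = 2.0 * math.pi / period
    values = [1.0]
    for k in range(1, n_harmonics + 1):
        values.append(math.sin(k * omega * t))
        values.append(math.cos(k * omega * t))
    return values

n_points = 12
n_harmonics = 3
n_basis = 2 * n_harmonics + 1
# Equally spaced times covering exactly one period (right endpoint excluded).
times = [j / n_points for j in range(n_points)]
design = [fourier_basis(t, n_harmonics) for t in times]

# Cross-product (Gram) matrix of the basis columns.
gram = [[sum(row[a] * row[b] for row in design) for b in range(n_basis)]
        for a in range(n_basis)]
max_off_diagonal = max(abs(gram[a][b])
                       for a in range(n_basis) for b in range(n_basis) if a != b)
diagonal = [gram[a][a] for a in range(n_basis)]
```

Here the diagonal is n_points for the constant and n_points / 2 for each sine and cosine, so least-squares coefficients reduce to simple projections.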
Splines
Splines are polynomial segments joined end-to-end
Segments are constrained to be smooth at the join
The points at which the segments join are called knots
The order m (order = degree + 1) of the polynomial segments and the location of the knots define the system.
B-splines are a particularly useful means of incorporating the constraints.
Properties of B-Splines
Number of basis functions = order + number of interior knots.
Derivatives up to m − 2 are continuous.
B-spline basis functions are positive over at most m adjacent intervals → fast computation for even thousands of basis functions.
Sum of all B-splines in a basis is always 1; can fit any polynomial of order m.
Most popular choice is order 4, implying continuous second derivatives. Second derivatives have straight-line segments.
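Several of these properties can be checked numerically. The sketch below evaluates a B-spline basis with the standard Cox-de Boor recursion; it is an illustration written for these notes, not the fda package's implementation, and the interval [0, 1], the three interior knots, and the function names are assumptions.

```python
def bspline_basis(t, interior_knots, order, lower=0.0, upper=1.0):
    """Evaluate all B-spline basis functions of the given order at t."""
    # Full knot vector: each boundary knot is repeated `order` times.
    knots = [lower] * order + list(interior_knots) + [upper] * order

    def basis(i, m):
        # Order-1 functions are indicators of the knot intervals.
        if m == 1:
            if knots[i] <= t < knots[i + 1]:
                return 1.0
            # Close the last non-empty interval so that t == upper is covered.
            if t == upper and knots[i] < knots[i + 1] == upper:
                return 1.0
            return 0.0
        value = 0.0
        left_width = knots[i + m - 1] - knots[i]
        if left_width > 0:  # skip terms with zero-width support
            value += (t - knots[i]) / left_width * basis(i, m - 1)
        right_width = knots[i + m] - knots[i + 1]
        if right_width > 0:
            value += (knots[i + m] - t) / right_width * basis(i + 1, m - 1)
        return value

    n_basis = len(knots) - order  # equals order + number of interior knots
    return [basis(i, order) for i in range(n_basis)]

interior = [0.25, 0.5, 0.75]
order = 4
grid = [j / 20 for j in range(21)]
rows = [bspline_basis(t, interior, order) for t in grid]

n_basis = len(rows[0])
max_nonzero = max(sum(1 for v in row if v > 0) for row in rows)
max_sum_error = max(abs(sum(row) - 1.0) for row in rows)
```

With order 4 and 3 interior knots this gives 7 basis functions, at most 4 of them are nonzero at any point, and their values sum to 1 everywhere on the interval.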
B-Splines: Choosing Knots and Order
The order of the spline should be at least k + 2 if you are interested in k derivatives.
Knots are often equally spaced (a useful default)
But there are two important rules:
Place more knots where you know there is strong curvature, and fewer where the function changes slowly.
Be sure there is at least one data point in every interval.
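The second rule is easy to check before fitting. A minimal sketch, assuming the knot list includes the two boundary points and that a data point lying exactly on a knot counts for both neighbouring intervals; the function name and the example values are hypothetical.

```python
def empty_knot_intervals(knots, data_times):
    """Return the (left, right) knot intervals that contain no data point."""
    empty = []
    for left, right in zip(knots[:-1], knots[1:]):
        if not any(left <= t <= right for t in data_times):
            empty.append((left, right))
    return empty

knots = [0.0, 0.25, 0.5, 0.75, 1.0]   # boundaries plus three interior knots
data_times = [0.05, 0.1, 0.3, 0.9]    # no observation between 0.5 and 0.75
gaps = empty_knot_intervals(knots, data_times)
```

A non-empty result suggests removing or moving knots, since the spline is not constrained by any data on those intervals.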
Other Bases
The fda library in R also allows the following bases:
Constant: φ(t) = 1, the simplest of all.
Power: t^λ1, t^λ2, t^λ3, ...; the powers are distinct but not necessarily integers or positive.
Exponential: e^(λ1 t), e^(λ2 t), e^(λ3 t), ...
Other possible bases include
Wavelets: especially for sharp, local features.
Empirical: we will investigate functional principal components.
Designer: see our section on dynamic models; tailoring a basis to the data (if you know something about the data) can be much more efficient.
Choosing the Number of Basis Functions
Choosing the Number of Basis Functions Tradeoff
Trade-off: too many basis functions over-fit the data and reflect errors of measurement.
Too few basis functions fail to capture interesting features of the curves.
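The trade-off can be seen in a small simulation. The sketch below fits Fourier bases of increasing size to noisy samples of a known curve and records both the residual sum of squares and the squared error against the true curve. The true curve, the noise level 0.3, the seed, and all names are assumptions for illustration. The fit uses simple projections, which equals least squares here only because the grid is equally spaced over one full period, making the basis columns orthogonal.

```python
import math
import random

def fourier_basis(t, n_harmonics, period=1.0):
    """Return (1, sin(wt), cos(wt), ..., sin(Kwt), cos(Kwt)) with w = 2*pi/period."""
    omega = 2.0 * math.pi / period
    values = [1.0]
    for k in range(1, n_harmonics + 1):
        values.append(math.sin(k * omega * t))
        values.append(math.cos(k * omega * t))
    return values

def fit_by_projection(times, y, n_harmonics):
    """Least-squares fitted values, assuming orthogonal basis columns."""
    design = [fourier_basis(t, n_harmonics) for t in times]
    columns = list(zip(*design))
    # With orthogonal columns each coefficient is an independent projection.
    coefs = [sum(p * v for p, v in zip(col, y)) / sum(p * p for p in col)
             for col in columns]
    return [sum(c * p for c, p in zip(coefs, row)) for row in design]

random.seed(0)
n = 40
times = [j / n for j in range(n)]
# Assumed true curve: first harmonic plus a smaller second harmonic.
truth = [math.sin(2 * math.pi * t) + 0.5 * math.cos(4 * math.pi * t) for t in times]
y = [x + random.gauss(0.0, 0.3) for x in truth]

results = {}  # n_harmonics -> (residual SS, squared error against the truth)
for n_harmonics in range(1, 9):
    fitted = fit_by_projection(times, y, n_harmonics)
    rss = sum((a - b) ** 2 for a, b in zip(y, fitted))
    err = sum((a - b) ** 2 for a, b in zip(truth, fitted))
    results[n_harmonics] = (rss, err)

for n_harmonics, (rss, err) in results.items():
    print(n_harmonics, round(rss, 3), round(err, 3))
```

The residual sum of squares keeps falling as harmonics are added, but the error against the true curve is large with one harmonic (a feature is missed) and grows again beyond two harmonics as the extra functions fit noise.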
Bias and Variance Trade-Off
Mean Squared Error
Cross Validation
Least Squares
Linear Regression on Basis Functions
Smoothing Penalties
What do we mean by Smoothness?
Some things are fairly clearly smooth:
constants
straight lines
What we really want to do is eliminate small “wiggles” in the data
The D Operator
The Roughness of Derivatives
The Smoothing Spline Theorem
Computing the Smoothing Spline
Calculating the Penalized Fit
More General Smoothing Penalties
A Very General Notion
Linear Smooths and Degrees of Freedom
Choosing the Smoothing Parameter
Generalized Cross Validation
Understanding the Distribution of Collections of Functions
Variance
Mechanics of PCA
Functional PCA
Re-Interpretation
Why Orthogonality?
PCA and Karhunen-Loève
Computing FPCA
Displays of PCA
Varimax Rotations
Defining New Inner Products
fPCA with Multivariate Functions
Smoothing and fPCA
Including Derivatives
A New Measure of Size
Size and Orthogonality
Scalar to Function: Identification
Scalar to Function: Smoothing
Scalar to Function: Calculating
Scalar to Function: Confidence Intervals
Multivariate and Mixed Functional Linear Regression
Multivariate and Mixed Functional Linear Regression Calculations
Principal Components Regression
Functional PCR
Problems of Inference
Permutation Tests
Diagrammatic representation of permutation test
Functional Linear Regression and Permutation F-Tests
Permutation t-Tests
Max t
Functional Response Models
Response is a set of curves y_i(t), i = 1, ..., n.
Covariates may be group labels, scalar values, or functions.
Partitioning Effects
Permutation Test
Functional Response Models – scalar covariate
Functional Covariates: Concurrent Linear Model
Mechanics
Smoothing and Confidence Intervals
Confidence Intervals
Functional Response Models in General
Estimating a Coefficient Function
Estimating B
Interpretation
Some Useful Restrictions