Parametric Modeling and System Identification
Parametric Modeling
- ARX Model Structure
- System identification with noise.
- Data: (u(t),y(t))
- ARX model:
- A(q^{-1}) y(t) = B(q^{-1}) u(t) + e(t)
- [1 + a_1 q^{-1} + … + a_{na} q^{-na}] y(t) = [b_0 + b_1 q^{-1} + … + b_{nb} q^{-nb}] u(t) + e(t)
- q^{-1} is the unit-delay operator: q^{-1} y(t) = y(t−1).
- y(t) is the output at time t.
- u(t) is the input at time t.
- e(t) is the noise at time t.
- a_i are the coefficients of the polynomial A.
- b_i are the coefficients of the polynomial B.
- Regression form:
- y(t) = ϕ^T(t) θ + e(t)
- ϕ(t) is the regressor vector (past inputs and outputs).
- θ is the parameter vector.
- Regressors: ϕ(t)
- Parameters: θ
- The polynomials A and B determine the parameter vector θ.
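To make the regression form concrete, here is a small numerical sketch (all coefficients hypothetical) checking that the ARX difference equation and the form y(t) = ϕ^T(t)θ + e(t) agree:

```python
import numpy as np

# Hypothetical ARX example with na = 2, nb = 1, i.e.
# y(t) = -a1 y(t-1) - a2 y(t-2) + b0 u(t) + b1 u(t-1) + e(t)
a1, a2, b0, b1 = -1.5, 0.7, 1.0, 0.5
theta = np.array([a1, a2, b0, b1])

rng = np.random.default_rng(0)
N = 50
u = rng.standard_normal(N)
e = 0.1 * rng.standard_normal(N)

y = np.zeros(N)                     # zero initial conditions
for t in range(2, N):
    y[t] = -a1 * y[t - 1] - a2 * y[t - 2] + b0 * u[t] + b1 * u[t - 1] + e[t]

def phi(t):
    # Regressor vector: negated past outputs, then current/past inputs.
    return np.array([-y[t - 1], -y[t - 2], u[t], u[t - 1]])

# Regression form y(t) = phi(t)^T theta + e(t) holds for any t >= 2.
t = 10
print(np.isclose(y[t], phi(t) @ theta + e[t]))  # True
```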
ARX Representation
- Equivalent Form
- ŷ(t; θ) = ϕ^T(t) θ
- θ = [a_1 … a_{na}  b_0 … b_{nb}]^T
- ϕ(t) = [−y(t−1) … −y(t−na)  u(t) … u(t−nb)]^T
- Multiple Linear Regression
- J(θ) = (1/N) Σ_{t=0}^{N−1} ϵ^2(t; θ)
- One-step-ahead prediction error: ϵ(t; θ) = y(t) − ŷ(t; θ)
- Prediction error.
- Mean-squared error (MSE) cost function.
Cost Function and Linear System
- Expressing the cost function:
- J(θ) = (1/N) Σ_{t=0}^{N−1} ϵ^2(t; θ)
- Setting the gradient of J to zero at the minimum:
- θ̂ = argmin_θ J(θ) ⟹ Σ_{t=0}^{N−1} ϕ(t) ϵ(t; θ) = 0
- This leads to a linear system of equations.
- Solution:
- θ̂_N = R^{-1}(N) f(N)
- R(N) = Σ_{t=0}^{N−1} ϕ(t) ϕ^T(t)
- f(N) = Σ_{t=0}^{N−1} ϕ(t) y(t)
- If the data is "rich enough", R(N) will be full rank (invertible).
- R(N) is a p×p matrix, where p = na + nb + 1 is the number of parameters.
- f(N) is a p×1 vector.
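A minimal sketch (hypothetical first-order ARX system) of forming R(N) and f(N) from data and solving the resulting linear system:

```python
import numpy as np

# Simulate a hypothetical ARX(1,0) system: y(t) = -a1 y(t-1) + b0 u(t) + e(t)
rng = np.random.default_rng(1)
N = 2000
a1, b0 = -0.8, 2.0
u = rng.standard_normal(N)
e = 0.05 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a1 * y[t - 1] + b0 * u[t] + e[t]

# Stack regressors phi(t)^T = [-y(t-1), u(t)] as rows, for t = 1..N-1.
Phi = np.column_stack([-y[:-1], u[1:]])
R = Phi.T @ Phi                       # R(N): p x p
f = Phi.T @ y[1:]                     # f(N): p x 1
theta_hat = np.linalg.solve(R, f)     # numerically preferable to forming R^{-1}
print(theta_hat)                      # close to the true [a1, b0] = [-0.8, 2.0]
```

Solving `R θ = f` with `np.linalg.solve` avoids explicitly inverting R(N), which is both cheaper and better conditioned.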
Ordinary Least Squares (OLS) Estimation
- LS Estimate
- θ̂_N = R^{-1}(N) f(N)
- Consistency: Is OLS a "good" estimate?
- Assumption: The data-generating process is an ARX model.
- y(t) = ϕ^T(t) θ_0 + e(t)
- θ_0 represents the ground-truth parameters.
- Ideally, we want θ̂_N = θ_0.
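Consistency can be probed numerically. A sketch with a made-up first-order ARX system, refitting OLS on progressively longer records:

```python
import numpy as np

# Hypothetical ARX(1,0) system with white noise e(t).
rng = np.random.default_rng(2)
a1, b0 = 0.5, 1.0
theta0 = np.array([a1, b0])

def ols_arx(N):
    """Simulate N samples and return the OLS estimate of [a1, b0]."""
    u = rng.standard_normal(N)
    e = 0.5 * rng.standard_normal(N)
    y = np.zeros(N)
    for t in range(1, N):
        y[t] = -a1 * y[t - 1] + b0 * u[t] + e[t]
    Phi = np.column_stack([-y[:-1], u[1:]])
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ y[1:])

for N in (100, 1000, 10000):
    err = np.linalg.norm(ols_arx(N) - theta0)
    print(N, err)   # error typically shrinks roughly like 1/sqrt(N)
```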
OLS and Estimation Error
- Substituting the Data Generating Process
- θ̂_LS = R^{-1}(N) Σ_{t=0}^{N−1} ϕ(t) y(t)
- θ̂_LS = R^{-1}(N) Σ_{t=0}^{N−1} ϕ(t) [ϕ^T(t) θ_0 + e(t)]
- θ̂_LS = θ_0 + R^{-1}(N) Σ_{t=0}^{N−1} ϕ(t) e(t)
- Estimation Error
- θ̃_N = θ̂_N − θ_0 = R^{-1}(N) f_e(N)
- f_e(N) = Σ_{t=0}^{N−1} ϕ(t) e(t)
Consistency of OLS
- Definition of Consistency
- An estimate θ̂_N is consistent if lim_{N→∞} θ̂_N = θ_0.
- Question: Is OLS consistent?
- Analysis
- Consistency requires lim_{N→∞} R^{-1}(N) f_e(N) = 0.
- lim_{N→∞} (1/N) R(N) = lim_{N→∞} (1/N) Σ_{t=0}^{N−1} ϕ(t) ϕ^T(t)
- Every entry of this limit matrix is a (cross-)covariance of the signals appearing in ϕ(t) = [−y(t−1) … −y(t−na)  u(t) … u(t−nb)]^T.
- Covariance Matrices
- R_y(τ) = lim_{N→∞} (1/N) Σ_{t=0}^{N−1} y(t+τ) y(t), and analogously R_u, R_yu, R_ye, R_ue for the input, cross-, and noise covariances.
Covariance Matrix and Open-Loop Experiment
- lim_{N→∞} (1/N) R(N) = [ R_yy  −R_yu ; −R_uy  R_uu ] in block form, where R_yy = [R_y(i−j)] is the na×na output-covariance block, R_uu = [R_u(i−j)] is the (nb+1)×(nb+1) input-covariance block, and R_yu = R_uy^T collects the cross-covariances R_yu(·); the minus signs on the off-diagonal blocks come from the −y(t−i) entries of ϕ(t).
- lim_{N→∞} (1/N) f_e(N) = lim_{N→∞} (1/N) Σ_{t=0}^{N−1} ϕ(t) e(t) = [−R_ye(1) … −R_ye(na)  R_ue(0) … R_ue(nb)]^T = 0
- R_ye(τ) = 0 for τ ≥ 1 because e(t) is white noise: y(t−τ) depends only on noise up to time t−τ, which is uncorrelated with e(t).
- R_ue(τ) = 0 for any open-loop experiment, since the input is then generated independently of the noise.
- Hence OLS is consistent when the data truly come from an ARX model.
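The vanishing of (1/N) f_e(N) can be checked by simulation. A sketch with white e, an open-loop input, and hypothetical first-order coefficients:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000
a1, b0 = -0.7, 1.0
u = rng.standard_normal(N)       # open loop: u generated independently of e
e = rng.standard_normal(N)       # white noise
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a1 * y[t - 1] + b0 * u[t] + e[t]

# phi(t) = [-y(t-1), u(t)] stacked as rows, for t = 1..N-1.
Phi = np.column_stack([-y[:-1], u[1:]])
f_e_over_N = Phi.T @ e[1:] / N
print(f_e_over_N)  # both entries near 0: -R_ye(1) = 0 (e white), R_ue(0) = 0 (open loop)
```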
Regularization and Prediction Error
- Regularization
- J(θ) = MSE(θ) + λ‖θ‖²
- Adds bias but reduces variance; helpful when N is close to the number of parameters p.
- Prediction
- ŷ(t|t−1) = E[y(t) | Ω(t−1)]
- ϵ(t) = y(t) − ŷ(t|t−1)
- Ω(t−1) is the information set up to time t−1.
- ϵ(t) is the "new" information in y(t) that did not exist in Ω(t−1).
- If the model learned well, ϵ(t) should be uncorrelated with Ω(t−1).
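A closed-form ridge sketch: minimizing J(θ) = MSE(θ) + λ‖θ‖² over the regression samples gives θ̂ = (R(N) + nλI)^{-1} f(N), where n is the number of samples in the sum. All data and coefficients below are hypothetical:

```python
import numpy as np

# Hypothetical ARX(1,0) data with deliberately few samples.
rng = np.random.default_rng(4)
N = 30                                   # small sample: variance control matters
a1, b0 = -0.6, 1.0
u = rng.standard_normal(N)
e = 0.3 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a1 * y[t - 1] + b0 * u[t] + e[t]

Phi = np.column_stack([-y[:-1], u[1:]])  # N-1 regression samples
n, p = Phi.shape
lam = 0.1
R = Phi.T @ Phi
f = Phi.T @ y[1:]
theta_ls = np.linalg.solve(R, f)                          # plain LS
theta_ridge = np.linalg.solve(R + n * lam * np.eye(p), f)  # ridge
print(theta_ls, theta_ridge)             # ridge estimate is shrunk toward zero
```

Increasing λ monotonically shrinks ‖θ̂‖; λ = 0 recovers ordinary least squares.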
Prediction Error Framework
- General Principle
- Good Model = Good Prediction
- Minimize prediction error.
- V_N(θ) = (1/N) Σ_{t=1}^{N} ϵ^2(t, θ)
- For ARX Models
- Prediction error should be uncorrelated with past data.
- Σ_{t=1}^{N} ϵ(t) ϕ(t) = 0
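This orthogonality holds exactly at the LS optimum (it is just the normal equations). A quick numerical check on hypothetical data:

```python
import numpy as np

# Hypothetical ARX(1,0) data.
rng = np.random.default_rng(5)
N = 200
a1, b0 = -0.5, 1.0
u = rng.standard_normal(N)
e = 0.2 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a1 * y[t - 1] + b0 * u[t] + e[t]

Phi = np.column_stack([-y[:-1], u[1:]])
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ y[1:])
eps = y[1:] - Phi @ theta_hat          # one-step prediction errors
print(Phi.T @ eps)                     # ~ [0, 0], up to floating-point round-off
```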
Model Mismatch
- Question: What happens if we fit an ARX model to data coming from a non-ARX system?
- Scenario: True system is not ARX.
- y(t) = ϕ^T(t) θ_0 + v(t)
- v(t) is colored noise.
- Model: ARX model.
- y(t; θ) = ϕ^T(t) θ + e(t)
- Model Mismatch
- True system (ARMAX): A_0(q^{-1}) y(t) = B_0(q^{-1}) u(t) + C_0(q^{-1}) e(t)
- Fitted model (ARX): A(q^{-1}) y(t) = B(q^{-1}) u(t) + e(t)
- We can always re-arrange into an ARX-looking form, except that the noise won't be white.
Consequences of Model Mismatch
- LS Estimate
- θ̂_LS = R^{-1}(N) f(N)
- θ̂_LS = θ_0 + R^{-1}(N) f_v(N), with f_v(N) = Σ_{t=0}^{N−1} ϕ(t) v(t)
- Does lim_{N→∞} R^{-1}(N) f_v(N) = 0?
- Not generally zero.
- Bias if v is colored.
- θ̂ is not consistent even if the experiment is open-loop: ϕ(t) contains past outputs, and these are correlated with the colored noise v(t).
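A sketch of this bias (hypothetical ARMAX system with MA(1) noise, fit as ARX by OLS): the colored noise v(t) is correlated with the regressor −y(t−1), so the estimate of a_1 converges to the wrong value even with abundant data:

```python
import numpy as np

# Hypothetical ARMAX system: C0 = 1 + c1 q^-1 colors the noise.
rng = np.random.default_rng(6)
N = 50_000
a1, b0, c1 = -0.7, 1.0, 0.9
u = rng.standard_normal(N)
e = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    v = e[t] + c1 * e[t - 1]          # v(t): colored (MA(1)) noise
    y[t] = -a1 * y[t - 1] + b0 * u[t] + v

Phi = np.column_stack([-y[:-1], u[1:]])
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ y[1:])
print(theta_hat[0], "vs true a1 =", a1)  # a1 estimate is biased away from -0.7
print(theta_hat[1], "vs true b0 =", b0)  # b0 is still fine (u uncorrelated with v)
```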
Instrumental Variable (IV) Method
- System and Model
- Assumed system: y(t) = ϕ^T(t) θ_0 + v(t)
- Model: y(t; θ) = ϕ^T(t) θ + e(t)
- Instead of minimizing the MSE, impose the sample orthogonality condition:
- Σ_{t=0}^{N−1} ζ(t) ϵ(t; θ) = 0
- ζ(t) is the instrumental variable.
- In the limit, the instrument must satisfy E[ζ(t) ϵ(t; θ_0)] = 0.
- For now, assume the instrumental variable (IV) ζ(t) is given.
IV Estimation
- Cost Function
- J(θ) = (1/N) Σ_{t=0}^{N−1} ζ(t) [y(t) − ϕ^T(t) θ], set to zero rather than minimized
- θ̂_IV = [Σ_{t=0}^{N−1} ζ(t) ϕ^T(t)]^{-1} [Σ_{t=0}^{N−1} ζ(t) y(t)]
- Asymptotic Analysis
- lim_{N→∞} (1/N) Σ_{t=0}^{N−1} ζ(t) v(t) = 0
- lim_{N→∞} θ̂_IV = θ_0 + [R_ζϕ]^{-1} [R_ζv]
Designing the Instrumental Variable
- lim_{N→∞} θ̂_IV = θ_0 + [R_ζϕ]^{-1} [R_ζv]
- We want:
- R_ζv = 0: ζ should be uncorrelated with v.
- R_ζϕ invertible, and ideally well-conditioned: ζ should be correlated with ϕ.
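Putting it together (sketch, hypothetical ARMAX system): delayed inputs serve as instruments. With open-loop white-noise u, ζ(t) = [u(t−1), u(t)] is uncorrelated with the colored noise v but correlated with ϕ(t) = [−y(t−1), u(t)] through the plant, so the IV estimate recovers the true parameters where OLS is biased:

```python
import numpy as np

# Same kind of hypothetical ARMAX (colored-noise) data that biases plain OLS.
rng = np.random.default_rng(7)
N = 200_000
a1, b0, c1 = -0.7, 1.0, 0.9
u = rng.standard_normal(N)
e = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a1 * y[t - 1] + b0 * u[t] + (e[t] + c1 * e[t - 1])

# Regressors phi(t) = [-y(t-1), u(t)] and instruments zeta(t) = [u(t-1), u(t)],
# stacked as rows for t = 2..N-1.
Phi = np.column_stack([-y[1:-1], u[2:]])
Z = np.column_stack([u[1:-1], u[2:]])

# theta_IV = (Z^T Phi)^{-1} Z^T y
theta_iv = np.linalg.solve(Z.T @ Phi, Z.T @ y[2:])
print(theta_iv)   # close to [a1, b0] = [-0.7, 1.0], unlike the OLS estimate
```

Here R_ζϕ is invertible because u(t−1) reaches −y(t−1) through b_0, while u(t) matches itself; richer instrument choices (e.g. more input lags) trade conditioning against variance.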