FDA PP Answers


61 Terms

1

T/F - Fourier series can never be used to smooth non-periodic data.

FALSE; they can be used, since they can accommodate departures from periodicity

2

T/F - The degrees-of-freedom of a penalised smoother is always less than the number of basis functions used to smooth the data.

TRUE; for λ > 0 the penalty constrains the coefficients, so the effective degrees of freedom fall below the number of basis functions

3

T/F - When performing function-on-function regression, if one uses a concurrent model the slope parameter is a scalar.

FALSE; it is a function, β(t)

4

T/F - The mean squared error is always smaller than variance.

FALSE; MSE = variance + bias², so it is greater than or equal to the variance

5

T/F - If we are using a harmonic acceleration roughness penalty, the resulting x(t) becomes exactly periodic as λ → 0.

FALSE; it becomes exactly periodic as λ → ∞

6

T/F - For a B-spline one can increase the number of basis functions, either by increasing the number of knots or by increasing the order of spline.

TRUE; nbasis = number of internal knots + order

7
term image

i and iii

8
term image

(v); rows = number of time points, columns = number of basis functions = internal knots (11) + order (4) = 15

9
term image

i and iii

10

Let tempfd be a functional data object obtained by using the smooth.basis function in the fda package.

The code plot(deriv.fd(tempfd$fd))

will plot the

first derivative of the curve

11

Suppose you have observed data at 81 equally spaced time points on a single curve. The dataset is given by y0, y1, y2, . . . , y80 corresponding to time-points t = 0, . . . , 80.

(a) If you are using a saturated Fourier basis, how many basis functions do you need to use?

(b) Write the expressions for the first 3 and the last 2 basis functions in the saturated Fourier basis.

(c) Write the R code using functions from fda to create the saturated Fourier series

(d) Can you obtain a continuous fourth derivative of the fitted curve if you just used 3 Fourier basis to fit the curve? Justify your answer.

(a) 81

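(b) First three: 1, sin(ωt), cos(ωt); last two: sin(40ωt), cos(40ωt), where ω = 2π/T and T is the period

(c) timerange = c(0, 80); create.fourier.basis(timerange, 81)

(d) Yes; Fourier basis functions are infinitely differentiable, so the fitted curve has continuous derivatives of every order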

12

i. What are the dimensions of c, Φ, R and λ?

ii. Write the expression of R in terms of the basis function Φ(t) for a harmonic acceleration penalty. Hint: You do not need to workout the actual derivatives

iii. Name the functions in the fda library that you need to use to fit the penalised smoother to the data.

iv. Let yˆ be the fitted value of y using the penalised fit with harmonic acceleration penalty. Express yˆ in terms of y, Φ, R and λ and argue that yˆ is a linear smoother.

(i) c → 81 × 1, Φ → 81 × 81, R → 81 × 81 and λ → 1 × 1
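
The card answers only part (i). For parts (ii) and (iv), a hedged sketch under the usual penalised least-squares setup is given here. It assumes the criterion is (y − Φc)ᵀ(y − Φc) + λ cᵀRc, that Φ(t) denotes the column vector of basis functions, and that the harmonic acceleration operator is L = ω²D + D³ with ω = 2π/T; these symbols are assumptions, since the card's own setup is not visible.

```latex
\begin{aligned}
R &= \int \big[L\Phi(t)\big]\big[L\Phi(t)\big]^{\top}\,dt,
  \qquad L\Phi(t) = \omega^{2} D\Phi(t) + D^{3}\Phi(t), \\
\hat{c} &= (\Phi^{\top}\Phi + \lambda R)^{-1}\Phi^{\top} y, \\
\hat{y} &= \Phi\hat{c}
  = \underbrace{\Phi(\Phi^{\top}\Phi + \lambda R)^{-1}\Phi^{\top}}_{S_{\lambda}}\, y .
\end{aligned}
```

Because S_λ depends on Φ, R and λ but not on y, ŷ is a linear function of y, which is what makes it a linear smoother. For part (iii), the fda functions typically involved are create.fourier.basis, vec2Lfd (to define L), fdPar and smooth.basis.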

13

Describe the steps, specifying the relevant functions from the fda library, that you would need to test the null hypothesis that the rate of change in growth in the two types of chicken is the same.

• Smooth the data by choosing an appropriate basis. Since we are modelling growth curves, one should use a B-spline basis or a monotone-increasing basis: create.bspline.basis()

• Use smooth.basis to smooth the data, using either a penalised or an unpenalised estimator

• Take the derivative of the growth curves for the two groups using the function deriv.fd

• To test whether the two rates of growth are the same, one can use the function tperm.fd to perform a two-sample t-test for functional objects (null: the two groups are the same; alternative: they differ)

• Alternatively, one can use a regression setting with the group as an indicator variable and test the hypothesis β(t) = 0 using the function Fperm.fd

14

(a) Log total CO2 emission on the cement production curve and latitude.

knowt flashcard image
15

(b) A concurrent model of CO2 emission curve on the cement production curve and latitude.

knowt flashcard image
16

(c) A full functional linear model of CO2 emission curve on the cement production curve and latitude.

knowt flashcard image
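
The answers to (a), (b) and (c) are images. As a hedged sketch in assumed notation (y_i or y_i(t) for the CO2 response of country i, x_i(t) for its cement production curve, z_i for its latitude; none of these symbols come from the source), the three models are usually written along these lines:

```latex
\begin{aligned}
\text{(a) scalar on function:}\quad
  \log y_i &= \alpha + \int \beta(t)\, x_i(t)\, dt + \gamma z_i + \varepsilon_i, \\
\text{(b) concurrent:}\quad
  y_i(t) &= \alpha(t) + \beta(t)\, x_i(t) + \gamma(t)\, z_i + \varepsilon_i(t), \\
\text{(c) full functional:}\quad
  y_i(t) &= \alpha(t) + \int \beta(s,t)\, x_i(s)\, ds + \gamma(t)\, z_i + \varepsilon_i(t).
\end{aligned}
```

The slope is a curve β(t) in (a) and (b) and a surface β(s, t) in (c), which matches the T/F cards on concurrent and historical models.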
17

T/F - A cubic spline basis with no internal knots is the same as a polynomial basis

TRUE; with no internal knots it reduces to a third-order (cubic) polynomial fit

18

T/F - The degrees-of-freedom of a penalised smoother increases with increase in the magnitude of the penalty parameter.

FALSE; the degrees-of-freedom of a penalised smoother decrease as the penalty parameter increases
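
A small numerical sketch of this behaviour, written in base R with the splines package rather than fda so that it runs without extra installs. The time grid, the knots, the λ values and the midpoint-rule approximation of the second-derivative penalty matrix R are all illustrative assumptions, not part of the original card.

```r
library(splines)  # ships with R; used here as a stand-in for fda's basis tools

tt     <- seq(0, 10, length.out = 21)      # 21 illustrative time points
norder <- 4                                # cubic B-spline (order 4)
inner  <- c(2, 4, 6, 8)                    # internal knots (hypothetical choice)
knots  <- c(rep(0, norder), inner, rep(10, norder))
nbasis <- length(inner) + norder           # internal knots + order = 8

# Phi: basis functions evaluated at the observation times (21 x 8)
Phi <- splineDesign(knots, tt, ord = norder)

# R = integral of D2phi(t) D2phi(t)', approximated by a midpoint rule
h   <- 0.01
mid <- (seq_len(1000) - 0.5) * h
D2  <- splineDesign(knots, mid, ord = norder, derivs = rep(2L, length(mid)))
R   <- h * crossprod(D2)                   # 8 x 8

# Degrees of freedom = trace of the smoother matrix S_lambda
df_lambda <- function(lambda) {
  S <- Phi %*% solve(crossprod(Phi) + lambda * R, t(Phi))
  sum(diag(S))
}

lambdas <- c(0, 0.1, 10, 1000)
dfs <- sapply(lambdas, df_lambda)
print(round(dfs, 3))
```

With λ = 0 the trace equals the number of basis functions, and it shrinks towards 2, the dimension of this penalty's null space (straight lines), as λ grows.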

19

T/F - When performing function on function regression if one uses a historical model the slope parameter is a surface.

TRUE; Surface

20

T/F - Fitting smooth curves is just linear regression using basis functions as independent variables

TRUE

21

T/F - The basis functions of a Fourier series of any order are orthogonal to each other

TRUE

22

Suppose we are smoothing the temperature data observed every 5 days in 2010 of 50 cities in the UK using a B-spline of order 3 and knots placed every time point. Let Φ be the evaluation of the basis functions Φ(t) at the observed time-points. The dimension of Φ is:

73 × 75; rows = number of time points (365/5 = 73), columns = number of basis functions = internal knots (72) + order (3) = 75

23

Let waterfd be a functional data object obtained by using the smooth.basis function in the fda package.

The code plot(deriv.fd(waterfd$fd,2)) will plot the:

i. smooth curve

ii. smooth curve with linetype 2

iii. first derivative of the curve

iv. second derivative of the curve

iv

24

Suppose you have observed data at 11 equally spaced time points on a single curve.

The dataset is given by y0, y1, y2, . . . , y10 corresponding to time-points t = 0, . . . , 10

(a) If you are using a cubic B-spline with internal knots at t = 3, 6, 8, how many basis functions do you have?

(b) Using this example or in general, prove that you cannot fit an unpenalised cubic spline if you put a knot at every time-point.

(c) What is the maximum number of adjacent intervals on which each basis function of a cubic spline can have positive support?

(d) The R code from fda library to create b-spline takes the following argument (rangeval=___, nbasis=___, norder=___, breaks=___) Write the code for defining a cubic spline basis with knots at every time point using

(i) Only the arguments (rangeval=___, nbasis=___, norder=___)

(ii) Only the arguments (rangeval=___, nbasis=___, breaks=___)

(iii) Only the arguments (rangeval=___, norder=___, breaks=___)

(e) What order of spline should you use if you wish to calculate the third order derivative of the curve

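(a) 4 + 3 = 7 (order + number of internal knots)

(b) A knot at every time point gives 9 internal knots, so the number of basis functions is 9 + 4 = 13, but there are only 11 data points; with more coefficients than observations the unpenalised least-squares fit is not unique

(c) 4, the same as the order

(d) (i) a1 = create.bspline.basis(rangeval = c(0, 10), nbasis = 13, norder = 4)

(ii) a2 = create.bspline.basis(rangeval = c(0, 10), nbasis = 13, breaks = 0:10)

(iii) a3 = create.bspline.basis(rangeval = c(0, 10), norder = 4, breaks = 0:10)

(e) order 5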
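
Parts (a) and (b) can be checked numerically. This sketch uses splines::splineDesign from base R as a stand-in for create.bspline.basis plus evaluation (an assumption made only to keep it self-contained); make_phi, Phi_a and Phi_b are hypothetical names.

```r
library(splines)  # ships with R

tt     <- 0:10    # the 11 observation times
norder <- 4       # cubic B-spline

# Evaluate a cubic B-spline basis on [0, 10] with the given internal knots
make_phi <- function(inner) {
  knots <- c(rep(0, norder), inner, rep(10, norder))
  splineDesign(knots, tt, ord = norder)
}

Phi_a <- make_phi(c(3, 6, 8))  # part (a): 3 internal knots
Phi_b <- make_phi(1:9)         # part (b): a knot at every time point

print(dim(Phi_a))              # rows = time points, columns = basis functions
print(dim(Phi_b))
rank_b <- qr(Phi_b)$rank       # cannot exceed the 11 rows
print(rank_b)
```

Since the rank of Phi_b cannot exceed its 11 rows, the 13 × 13 matrix ΦᵀΦ is singular and the unpenalised normal equations have no unique solution.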

25
term image

i. Write the expression of R for a fourth derivative penalty

ii. What are the dimensions of c, Φ, R and λ?
iii. Name the functions in the fda library that you need to use to fit the penalised smoother to the data.

iv. State at least two approaches of determining an optimal value of λ

26

Suppose we have data on the daily covid cases (over 6 months), with similar testing capacity, from 30 countries, 10 from each of the 3 continents Asia, Europe and South America. We wish to find out if there is a difference in how the disease has progressed in the three continents. To account for differences in population and in the date of the first case of the disease, we should ideally look at the rate of change.

Describe the steps, specifying the relevant functions from the fda library, that you would need to test the null hypothesis that the rate of change of covid cases in the three continents is similar.

• Smooth the data by choosing an appropriate basis. One can use any basis, e.g. create.bspline.basis()

• Use smooth.basis to smooth the data, using either a penalised or an unpenalised smoother

• Take the derivative of the smoothed curve using the function deriv.fd for each of the 30 countries

• Use a regression setting with the group as an indicator variable and test the hypothesis µ1(t) = µ2(t) = µ3(t) using the function Fperm.fd

27

i. Deaths as a functional object on the total monthly number of cases for each of the six months and the proportion of the population above the age of 80.

knowt flashcard image
28

ii. A concurrent model of Deaths as a functional object on the smoothed daily number of cases and the proportion of the population above the age of 80.

knowt flashcard image
29

iii. A full functional linear model of Deaths as a functional object on the smoothed daily number of cases and the proportion of the population above the age of 80.

knowt flashcard image
30

(b) Given the fact that deaths follow cases, and there is a lag between the number of cases and the number of deaths, do you think the concurrent model in part (ii) or the full functional model in (iii) is appropriate? Justify your answer, propose an alternative model and write it out as a functional linear model.

knowt flashcard image
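
One common way to encode a lag is a historical functional linear model, in which deaths at time t depend only on cases over a recent window. The following is a hedged sketch, not the card's own answer (which is an image): y_i(t) is assumed to be the death curve, x_i(s) the smoothed case curve, z_i the proportion aged over 80, and δ a hypothetical maximum lag.

```latex
y_i(t) = \alpha(t)
  + \int_{\max(0,\,t-\delta)}^{t} \beta(s,t)\, x_i(s)\, ds
  + \gamma(t)\, z_i + \varepsilon_i(t)
```

By contrast, the concurrent model uses only x_i(t) at the same time t, and the full functional model integrates over all s, including times s > t that lie in the future of the deaths being explained.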
31

T/F - An order 4 B-spline basis with exactly one internal knot is the same as a cubic polynomial basis.

FALSE; It would be true if there was not an internal knot

32

T/F - The degrees-of-freedom of a penalised smoother with the penalty parameter λ = 0 is exactly same as the number of basis functions.

TRUE; If penalty parameter is non-zero dof < number of basis

33

T/F - When performing a functional regression of a single response function on another functional explanatory variable, if one uses a concurrent model the slope parameter is a surface.

FALSE; function

34

T/F - For a Fourier basis one can increase the number of basis functions, either by increasing the number of knots or by increasing the order of spline.

FALSE; true for B-splines

35

T/F - Functional principal components are always orthogonal to each other

FALSE; Not orthogonal for penalised versions

36

T/F - For any linear smoother both OCV (ordinary cross validation) and GCV (generalised cross validation) have a closed form expression.

TRUE
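
For reference, the closed forms usually quoted for a least-squares-type linear smoother ŷ = S_λ y (such as the penalised basis smoother) are sketched here. The notation S_λ, S_{λ,ii} and n is assumed, not taken from the card.

```latex
\begin{aligned}
\mathrm{OCV}(\lambda) &= \frac{1}{n}\sum_{i=1}^{n}
   \left(\frac{y_i - \hat{y}_i}{1 - S_{\lambda,ii}}\right)^{2}, \\
\mathrm{GCV}(\lambda) &= \frac{\tfrac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^{2}}
   {\big(1 - \operatorname{trace}(S_{\lambda})/n\big)^{2}} .
\end{aligned}
```

GCV replaces each diagonal element S_{λ,ii} in the OCV formula by the average trace(S_λ)/n, so neither criterion requires refitting the model n times.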

37

(b) Suppose we are temporally smoothing the weekly total covid cases, for 52 weeks in 2021 for 20 cities in the UK using a B-spline of order 5 and knots placed every week.

Let Φ be the evaluation of the basis functions Φ(t) at the observed timepoints of a particular city. The dimension of Φ is …. × ….

(c) The B-spline basis functions in part (b) will be positive over at most … adjacent intervals.

(d) As the knots belonging to the B-spline in part (b) are distinct, it will have continuous derivative up to degree ...

(b) 52 × 56; rows = number of time points (52), columns = number of basis functions = internal knots (51) + order (5) = 56

(c) 5

(d) 3

38

(e) Let covidfd be a functional data object obtained by using the smooth.basis function in the fda package.

The code plot(deriv.fd(covidfd$fd,3)) will plot the …

Third derivative of covidfd

39

(f) The function … in the fda package will allow us to evaluate the value of the functional object covidfd at 10 distinct time points.

eval.fd

40

i. Log total GDP on HS curve and Income

knowt flashcard image
41

ii. A concurrent model of GDP curve on HS curve adjusting for the level of INC of the country.

knowt flashcard image
42

iii. A full functional linear model of GDP curve on the HS curve adjusting for the level of INC of the country.

knowt flashcard image
43

iv. A functional linear model of GDP curve on the yearly HS totals adjusting for the level of INC of the country.

knowt flashcard image
44

Instead of using the HS curve, if you choose to use the first 4 functional Principal Components score of the HS curve as explanatory variables and the GDP curve as the response variable, how will your analysis differ from part a (iii) in terms of the model and dimensions of β’s.

knowt flashcard image
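
The answer is an image. A hedged sketch of how the model typically changes, in assumed notation (y_i(t) for the GDP curve, f_{ik} for the score of country i on the k-th principal component of the HS curves, INC_i for income):

```latex
y_i(t) = \alpha(t) + \sum_{k=1}^{4} \beta_k(t)\, f_{ik}
       + \gamma(t)\,\mathrm{INC}_i + \varepsilon_i(t)
```

The scores f_{ik} are scalars, so this is a function-on-scalar regression: each β_k(t) is a curve in t, whereas the full functional linear model of part a (iii) has a single surface β(s, t).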
45

Make a list of the R functions and describe the steps you will need to implement part (b) using the fda package.

create.fourier.basis or create.bspline.basis, smooth.basis, pca.fd, fRegress

• Smooth both curves using smooth.basis with a basis from create.fourier.basis or create.bspline.basis

• Perform a PCA on the HS curves using pca.fd

• Look at the scree plot/variation explained and retain k components

• Calculate the principal component scores

• Perform a function-on-scalar (multivariate) regression of the GDP curve on the retained principal component scores using fRegress

46

Why is a classical two-sample t-test not appropriate for comparing growth curves (i.e. functional data, not scalar data)?

  • Classical t-tests require scalar/vector inputs.

  • Applying them pointwise causes correlation and multiple testing issues.

  • Distributional assumptions on entire curves are hard to verify.

  • There is an infinite hypothesis testing problem when working with functions.
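
A permutation test on the maximum pointwise statistic is one way around the multiple-testing issue, and is in the spirit of what tperm.fd does. The sketch below uses base R on simulated, discretised curves; the data, group sizes, number of permutations and the helper tstat are all illustrative assumptions.

```r
set.seed(1)
tt <- seq(0, 1, length.out = 50)
n1 <- 10; n2 <- 10

# Simulated curves (rows = curves, columns = time points);
# group 2 has a shifted mean function
g1 <- t(replicate(n1, sin(2 * pi * tt) + rnorm(50, sd = 0.3)))
g2 <- t(replicate(n2, sin(2 * pi * tt) + 0.5 + rnorm(50, sd = 0.3)))
X   <- rbind(g1, g2)
grp <- rep(1:2, c(n1, n2))

# Pointwise two-sample t statistic at every time point
tstat <- function(X, grp) {
  a <- X[grp == 1, , drop = FALSE]
  b <- X[grp == 2, , drop = FALSE]
  (colMeans(a) - colMeans(b)) /
    sqrt(apply(a, 2, var) / nrow(a) + apply(b, 2, var) / nrow(b))
}

# Observed maximum |t| over time, and its permutation distribution
Tobs  <- max(abs(tstat(X, grp)))
Tnull <- replicate(200, max(abs(tstat(X, sample(grp)))))
pval  <- mean(Tnull >= Tobs)
print(pval)
```

Shuffling the group labels keeps each curve intact, so the correlation across time points is preserved, and taking the maximum over time gives a single test for the whole curve.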

47

Why are the pointwise critical values different for each time point?

Even though n1 and n2 stay the same across time:

  • The variability in the data (sample variances) at each time point changes.

  • This causes the approximate degrees of freedom to change.

  • Hence, the critical values of the t-test differ pointwise.

48

Hypothesis Test for Mean Growth Curves

knowt flashcard image
49

How would you modify the code tperm.fd(hgtmfd,hgtffd) to test the hypothesis that the rate of growth of boys and girls are the same?

tperm.fd(deriv.fd(hgtmfd,1),deriv.fd(hgtffd,1))

50

Suppose you are restricted from using functional principal component techniques. Show the steps for obtaining the projection of the curves X1(t), . . . , Xn(t) on the first 5 principal component directions using techniques from multivariate principal components. At each step specify the dimension of the matrices.

knowt flashcard image
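
The answer is an image. One plausible route, sketched in base R on simulated curves: evaluate each curve on a common grid of p points, centre the resulting n × p matrix, and take its singular value decomposition. The grid size, the simulated data and the variable names are assumptions.

```r
set.seed(2)
n  <- 30                                   # number of curves
p  <- 101                                  # number of grid points
tt <- seq(0, 1, length.out = p)

# Step 1: curves evaluated on the grid, X is n x p (simulated here)
X <- t(replicate(n, rnorm(1) * sin(2 * pi * tt) +
                    rnorm(1) * cos(2 * pi * tt) + rnorm(p, sd = 0.1)))

# Step 2: subtract the pointwise mean curve, Xc is n x p
Xc <- sweep(X, 2, colMeans(X))

# Step 3: SVD of Xc; columns of V are the principal directions
sv <- svd(Xc)
V  <- sv$v[, 1:5]                          # p x 5

# Step 4: project the centred curves, scores is n x 5
scores <- Xc %*% V

# Proportion of variance explained by the first 5 components
prop5 <- sum(sv$d[1:5]^2) / sum(sv$d^2)
print(dim(scores))
print(prop5)
```

The same steps with eigen(cov(X)) in place of svd(Xc) give a p × p covariance matrix and the same p × 5 matrix of directions up to sign.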
51
term image

In general σ(s, t) represents a continuous surface in two dimensions, so we need an infinite sum to represent all possible variance-covariance functions.

52

Derive an expression for the proportion of variance explained by the first k principal components.

knowt flashcard image
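
The answer is an image. A hedged sketch of the standard argument, assuming the covariance function has the eigen-expansion below with eigenvalues ρ_1 ≥ ρ_2 ≥ … and orthonormal eigenfunctions ξ_j (the symbols ρ_j and ξ_j are assumptions):

```latex
\begin{aligned}
\sigma(s,t) &= \sum_{j=1}^{\infty} \rho_j\, \xi_j(s)\, \xi_j(t), \\
\text{total variance} &= \int \sigma(t,t)\, dt
   = \sum_{j=1}^{\infty} \rho_j \int \xi_j(t)^{2}\, dt
   = \sum_{j=1}^{\infty} \rho_j, \\
\text{proportion explained by the first } k &=
   \frac{\sum_{j=1}^{k} \rho_j}{\sum_{j=1}^{\infty} \rho_j} .
\end{aligned}
```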
53

Show that the first functional principal component is orthogonal to the sum of k following principal components.

knowt flashcard image
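
The answer is an image. A hedged sketch, assuming orthonormal principal component functions ξ_1, ξ_2, … (so that the integral of ξ_i ξ_j is 1 if i = j and 0 otherwise) and reading "the k following components" as ξ_2, …, ξ_{k+1}:

```latex
\int \xi_1(t) \Big[\sum_{j=2}^{k+1} \xi_j(t)\Big] dt
  = \sum_{j=2}^{k+1} \int \xi_1(t)\, \xi_j(t)\, dt
  = \sum_{j=2}^{k+1} 0 = 0
```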
54

(d) Denoting µ(t) as the mean of X1(t), . . . , Xn(t), find an expression for the principal component score of the j-th observation on the i-th principal component.

knowt flashcard image
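
The answer is an image. The usual expression, sketched with ξ_i(t) assumed to denote the i-th principal component function and f_{ij} the score, is:

```latex
f_{ij} = \int \xi_i(t)\, \big[ X_j(t) - \mu(t) \big]\, dt
```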
55

What will be the dimensions of the principal components score matrix for the data on the first 5 functional principal components?

n rows and 5 columns

56

Functional data analysis can be applied to study not only temporal data but also spatial data, making it a versatile approach for understanding patterns varying over a continuum.

TRUE; the same techniques apply, with the domain changed from time to space (or another continuum)

57

The degrees-of-freedom of a penalised smoother is less than or equal to the number of basis functions used to smooth the data.

TRUE; penalisation adds constraints, so the degrees of freedom are smaller than or equal to the number of basis functions

58

When performing functional regression the response variable can be a scalar, vector or function.

TRUE; FDA regression has different techniques for each of these situations

59

When regressing a scalar random variable on a functional covariate, we always need to penalize the fitted slope function.

TRUE; Otherwise it will not be identifiable

60

If we are using a harmonic acceleration roughness penalty, the resulting x(t) becomes exactly periodic when the penalty parameter λ = 1.

FALSE; it is only as λ → ∞ that x(t) has no contribution from the data and is defined entirely by the penalty function, which is periodic

61

i. Keeping k and λ fixed, if the order is increased to p + 1 then the degrees of freedom changes to d + 1.

ii. Keeping p and λ fixed, if the number of knots is decreased to k − 2 then the degrees of freedom will be less than or equal to d.

iii. Keeping p and k fixed, if the penalty parameter λ is increased then the degrees of freedom will be greater than or equal to d.

iv. Fixing λ = 1, if the order is increased to p + 1 and the number of knots is decreased to k − 1 the degrees of freedom remains unchanged.

i) FALSE; because the smoother is penalised, the increase in degrees of freedom may be less than 1

ii) TRUE; the degrees of freedom decrease as the number of knots decreases

iii) FALSE; the degrees of freedom d decrease as the penalty increases

iv) FALSE; nbasis = number of internal knots + order, but nbasis equals the degrees of freedom only for unpenalised estimators, i.e. only for λ = 0