DAPAB W4L2/W4D3: Mixed model using all relationships and ReML

Last updated 5:51 PM on 4/24/26

44 Terms

1

Main traditional reasons for mixed model

Non-independence: estimate the fixed effects while accounting for all sources of noise, so that P-values and standard errors (SE) are valid

2

Estimating variance components: reasons for ReML

ANOVA does not work for variance component estimation (VCE) with unbalanced data

Use ReML

3

Traditional reasons for mixed models and ReML

Recovery of inter-block information with incomplete blocks

Use random block effects for identifiability of fixed effects of interest

4

In animal breeding, interest is in the

random breeding values → BLUPs

and in the additive genetic variance in non-designed populations ($\sigma_A^2$) → ReML

The fixed effects merely serve as correction factors (herd, year, season, age)

The BLUEs of fixed effects are usually not of main interest in animal breeding

5

Mixed models and ReML in animal breeding

Interest is in the random effects

  • Breeding values

  • Gene effects

  • → need the variance components

Unbalanced data

  • unequal family sizes

General pedigrees

  • Related sires and dams

  • Any type of relatedness

  • Use all family information, e.g. all Dutch cows

Non-random samples

  • selection of superior individuals

Non-random mating

Include genomic information

  • Genomic breeding values

    • GWAS

6

Mixed models and ReML in plant breeding

Interest is often in the fixed effects

Unbalanced data (e.g. incomplete blocks)

Include genomic information

  • GWAS

7

Fixed or random

Simplicity

  • Fixed effects are simpler, models converge better (least squares instead of ReML)

Conceptual reasons

  • Is your model term a random sample of a population?

  • Sex effect: NO, there are only two sexes → sex is fixed

  • Pen: YES, there could be many different pens → pen is random

Avoid false significance in hypothesis testing (avoid too small P-values)

  • For example: does sex affect body weight of pigs?

    • Include sufficient noise in the model to avoid false significance

    • Test for sex effect: fit pen, batch etc as random effects

Avoid overestimation of effects (BLUPs)

  • Random effects (BLUPs) get shrinkage → estimates are conservative

  • Fixed effects (BLUEs) get no shrinkage → suffer from the winner's curse

  • Fixed effects are greedy

  • Fixed effects absorb too much variation when classes are small

Partial confounding, fixed effects have priority over random effects

  • Recovery of inter-block information → blocks must be random

    • Fixed variety effect has priority over the random block effect

    • So inter-block info can be utilized

Relationships/correlations

  • Random effects can be correlated (e.g. related animals)

  • → genetic effects are treated as random
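
The shrinkage point on this card can be illustrated with a minimal sketch. It assumes a one-way model with a known overall mean and known variance components; the function name and the numbers are made up for illustration. The BLUE of a class effect is the raw deviation of the class mean, while the BLUP multiplies that deviation by n / (n + λ), with λ the ratio of residual to class variance.

```python
def blue_and_blup(class_mean, overall_mean, n_records, var_class, var_resid):
    """Return (BLUE, BLUP) of one class effect in a one-way model.

    Assumes the overall mean and both variance components are known.
    """
    deviation = class_mean - overall_mean      # fixed-effect estimate: no shrinkage
    lam = var_resid / var_class                # variance ratio lambda
    shrink = n_records / (n_records + lam)     # shrinkage factor between 0 and 1
    return deviation, shrink * deviation


# The same observed deviation (+10) is trusted more as the class gets larger.
for n in (2, 20, 200):
    blue, blup = blue_and_blup(class_mean=110.0, overall_mean=100.0,
                               n_records=n, var_class=4.0, var_resid=36.0)
    print(n, blue, round(blup, 2))
```

With small classes the BLUP is pulled strongly toward zero, whereas the BLUE stays at the full deviation regardless of class size.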

8

Real life: why did we not simply skip ANOVA?

ANOVA requires many assumptions that never hold

  • Real data are never balanced

  • Sires and dams tend to be related

  • Preselection is common in breeding populations

ANOVA shows where the information comes from → helps understanding of what we do

ReML is far more powerful, but we do not see where the information comes from

9

For balanced designs, ANOVA and ReML are

Equivalent

10

The general mixed model for genetic analysis

knowt flashcard image
11

The general mixed model for genetic analysis

Y is a vector → the variance of y is a matrix V

knowt flashcard image
12

Animal model: the simplest mixed model for genetic analysis

Residuals are identically distributed
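
A minimal sketch of how breeding values could be predicted under the animal model, assuming the form y = Xb + Zu + e with var(u) = Aσ²_A and var(e) = Iσ²_e and with known variance components. It solves Henderson's mixed model equations; the function name, the three-animal pedigree and the records are hypothetical.

```python
import numpy as np


def solve_animal_model(y, X, Z, A, var_a, var_e):
    """Solve the mixed model equations; return (BLUE of b, BLUP of u)."""
    lam = var_e / var_a                               # variance ratio
    A_inv = np.linalg.inv(A)
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * A_inv]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return sol[:n_fixed], sol[n_fixed:]


# Hypothetical data: unrelated sire (1) and dam (2) with one offspring (3).
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
y = np.array([10.0, 12.0, 14.0])
X = np.ones((3, 1))          # overall mean as the only fixed effect
Z = np.eye(3)                # one record per animal
var_a, var_e = 1.0, 1.0      # assumed known variance components

b_hat, u_hat = solve_animal_model(y, X, Z, A, var_a, var_e)
print(b_hat, u_hat)
```

In practice the variance components are not known and would be estimated with ReML first; this sketch only covers the prediction step.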

13

The u-effects (breeding values) of related animals are

Positively correlated, related animals have similar breeding values

14

About the (co)variances of the genetic effects, var(u)

knowt flashcard image
15

Relationship matrix

Matrix of relationships between all individuals in the analysis

  • Pedigree relationship matrix A

  • Genomic relationship matrix G
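
As an illustration of the pedigree relationship matrix A, here is a sketch of the standard tabular method. It assumes animals are numbered so that parents come before their offspring, and that base-generation animals (unknown parents, coded as None) are unrelated and not inbred. The function name and the example pedigree are hypothetical.

```python
import numpy as np


def pedigree_relationship_matrix(sires, dams):
    """Tabular method for A; animals are 0..n-1 with parents listed first."""
    n = len(sires)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sires[i], dams[i]
        # Diagonal: 1 + inbreeding = 1 + half the relationship between the parents
        parents_known = s is not None and d is not None
        A[i, i] = 1.0 + (0.5 * A[s, d] if parents_known else 0.0)
        for j in range(i):
            # Off-diagonal: mean of j's relationships with the two parents of i
            a_js = A[j, s] if s is not None else 0.0
            a_jd = A[j, d] if d is not None else 0.0
            A[i, j] = A[j, i] = 0.5 * (a_js + a_jd)
    return A


# Animals 0 and 1 are founders, 2 and 3 are full sibs, 4 is from mating 2 x 3.
sires = [None, None, 0, 0, 2]
dams = [None, None, 1, 1, 3]
A = pedigree_relationship_matrix(sires, dams)
print(A)
```

The inbred animal 4 gets a diagonal element above 1, and full sibs 2 and 3 get a relationship of 0.5.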

16

The pedigree relationship matrix A

knowt flashcard image
17

Base generations

knowt flashcard image
18

Example of a pedigree relationship matrix A

knowt flashcard image
19

Summary: the simplest mixed model for genetic analysis

knowt flashcard image
20

Models for special cases

Sire model

Common environmental effects (litter effects)

Repeated records and permanent non-genetic (environmental) effects

21

The sire model

Sometimes we don’t know the mother of an individual, while the father is known (group mating and natural birth, e.g. sheep). Or the number of animals is very large, so that the full model becomes too big to fit

Then we may fit the genetic effect of the father (sire) instead of the animal itself

22

The sire model - interpretation

Interpretation

  • similar to the half sib model

  • Offspring of the same sire are half sibs → var(sire) = cov(half sibs)

  • Cov(half-sibs) = 1/4 var(A)

    • The sire variance is an estimate of a quarter of the additive genetic variance
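
The interpretation above implies a simple conversion from sire-model variance components to heritability. A minimal sketch, with a hypothetical function name and made-up variance estimates:

```python
def heritability_from_sire_model(var_sire, var_resid):
    """Heritability implied by sire-model variance components."""
    var_additive = 4.0 * var_sire          # sire variance = 1/4 additive variance
    var_phenotypic = var_sire + var_resid  # residual holds the remaining 3/4 plus environment
    return var_additive / var_phenotypic


h2 = heritability_from_sire_model(var_sire=2.5, var_resid=37.5)
print(h2)
```

Because the sire variance is multiplied by four, any inflation of the sire variance is also multiplied by four in the heritability.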

23

The sire model - assumptions

  1. Without genotypes or pedigree: $\mathrm{var}(\mathbf{s}) = \mathbf{I}\sigma_s^2$

    • the sires are assumed unrelated

    • the sires are assumed to be a random sample of the population

  2. The dams

    • are assumed to be a random sample of the population

    • are unrelated to the sires

    • are mated at random to the sires

    • each contribute a single offspring (no full sibs in the data)

Same assumptions as for the HS design with ANOVA (but no need for balanced data)

24

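What would happen if we apply a sire model to a pig population?
Pig data contain full sibs

We overestimate heritability: full sibs within a sire family resemble each other more than half sibs do, and the sire model attributes that extra resemblance to the sire variance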

25

Common environmental effects - piglets born in the same litter

Are full sibs (usually)

Developed in the same uterus

Suppose we apply the ordinary animal model to data on weaning weight: $\mathbf{y} = \mathbf{Xb} + \mathbf{Zu} + \mathbf{e}$

26

Solution for common environmental effects

The common environmental model: $\mathbf{y} = \mathbf{Xb} + \mathbf{Zu} + \mathbf{Z}_c\mathbf{e}_c + \mathbf{e}$

$\mathbf{e}_c$: vector of random effects common to litter mates (common environment)

$\mathbf{Z}_c$: matrix linking observations on individuals to their birth litter

Litter effects are assumed independent: $\mathrm{var}(\mathbf{e}_c) = \mathbf{I}\sigma_E^2$
Genetic effects are correlated via the pedigree or genotypes: $\mathrm{var}(\mathbf{u}) = \mathbf{A}\sigma_A^2$

The common litter effect also captures (most of) the dominance variance, therefore we don’t usually fit a separate dominance effect
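
A small sketch of the phenotypic covariance matrix implied by the common environmental model, V = ZAZ'σ²_A + Z_c Z_c'σ²_c + Iσ²_e. The four piglets, their relationships and the variance values are hypothetical; `var_c` is the name used here for the litter (common environment) variance.

```python
import numpy as np

# Hypothetical data: two litters of two full sibs, both litters by the same sire,
# so piglets in different litters are half sibs (relationship 0.25).
A = np.array([[1.00, 0.50, 0.25, 0.25],
              [0.50, 1.00, 0.25, 0.25],
              [0.25, 0.25, 1.00, 0.50],
              [0.25, 0.25, 0.50, 1.00]])
litter_of = np.array([0, 0, 1, 1])      # birth litter of each piglet

n = len(litter_of)
Z = np.eye(n)                           # one record per piglet
Zc = np.zeros((n, litter_of.max() + 1))
Zc[np.arange(n), litter_of] = 1.0       # links each record to its litter

var_a, var_c, var_e = 4.0, 1.0, 5.0     # assumed variance components
V = var_a * (Z @ A @ Z.T) + var_c * (Zc @ Zc.T) + var_e * np.eye(n)
print(V)
```

Only littermates share the litter variance, so the half-sib covariance across litters carries the information that is free of litter effects.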

27

We can separate litter and genetics when

The sire is not completely confounded with litter, thus half sibs (HS) provide information on $\sigma_A^2$

The different dams may be related, but their litter effects are independent

  • Therefore the ReML likelihood depends on $\sigma_A^2$ and we can estimate it

  • With FSHS ANOVA we could not separate genetic and non-genetic dam effects

But

  • The full-sib covariance is completely confounded with litter, thus litter effects reduce precision (and power) to estimate $\sigma_A^2$

28

Repeated records - some traits we may record multiple times on the same animal

Body weight at different ages, milk yield each week or month, litter size in different parities

29

Suppose we fit the ordinary model to repeated records on body weight

The similarity between repeated records of the same individual (i) is assumed fully genetic

But what if

  • the animal was sick or poorly fed when it was young → permanent environmental effects (developmental effects)
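
To make the distinction concrete, here is a sketch that assumes the usual repeatability model, in which the covariance between two records of the same animal is the additive genetic variance plus a permanent environmental variance. The function and argument names and the numbers are illustrative assumptions, not taken from the slides.

```python
def repeated_record_parameters(var_a, var_pe, var_e):
    """Heritability and repeatability under a permanent environment model."""
    var_p = var_a + var_pe + var_e          # phenotypic variance of one record
    return {
        "heritability": var_a / var_p,
        "repeatability": (var_a + var_pe) / var_p,
    }


params = repeated_record_parameters(var_a=3.0, var_pe=2.0, var_e=5.0)
print(params)
```

The gap between repeatability and heritability is the share of resemblance between repeated records that is permanent but not genetic.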

30

Repeated records - permanent environment model

knowt flashcard image
31

Repeated records - example

knowt flashcard image
32

Repeated records and common environment

Data consists of the following:

  • For each litter, we have weaning weight of each piglet

    • Within a litter, piglets experience a common environment (Ec)

    • But piglets in different litters of the same sow are also nursed by the same mother (Ep)

knowt flashcard image
33

Genetic relationships

knowt flashcard image
34

Genomic relationships

Replace the pedigree relationship matrix A with the genomic relationship matrix G

→ more powerful than the pedigree, but otherwise similar to the pedigree-based model

35

Genomic relationships and the G-matrix

If we have genotyped the animals:

Use the observed covariance between the markers of two individuals

$\mathrm{cov}(u_i, u_j) = \mathrm{cov}_{\text{markers}}\,\sigma_A^2$

Genomic prediction and G matrix

MARKERS = SNP-markers

36

SNP-markers

m marker loci

Each locus has 2 alleles, e.g. A and T

We count one of the two alleles, say A

Diploids → counts are 0,1 or 2, TT=0, AT and TA are 1, AA=2

p is the frequency of the counted allele: count ~ Bin(n = 2, p)

The mean allele count is 2p

The variance in allele count = 2p(1-p)
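
The mean and variance above can be checked directly from the three genotype probabilities. A minimal sketch, assuming the binomial (Hardy-Weinberg) distribution stated on this card; the function name is made up.

```python
def allele_count_moments(p):
    """Mean and variance of the allele count (0, 1 or 2) for allele frequency p."""
    probs = {0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2}   # TT, AT/TA, AA
    mean = sum(count * prob for count, prob in probs.items())
    var = sum((count - mean) ** 2 * prob for count, prob in probs.items())
    return mean, var


p = 0.3
mean, var = allele_count_moments(p)
print(mean, 2 * p)              # both equal 2p up to rounding
print(var, 2 * p * (1 - p))     # both equal 2p(1-p) up to rounding
```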

37

Genomic relationships: the observed covariance between markers of animals

The common way to calculate a covariance from a sample

knowt flashcard image
38

Genomic relationships: the observed covariance between markers of animals

With markers - covariance

We want the covariance between two individuals, say i and j, due to all the m markers

x and y are the allele counts in the two individuals, with values in M = {0, 1, 2}

For a specific locus (l), the mean of x and y is two times the allele frequency: $\bar{x}_l = \bar{y}_l = 2p_l$

There are many loci, so n - 1 ≈ n; ignore the -1
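
The description above can be sketched as a marker covariance between two individuals: centre each allele count by 2p at that locus, multiply, and average over loci (dividing by the number of loci rather than that number minus one). The function name and the genotypes are hypothetical.

```python
import numpy as np


def marker_covariance(x, y, p):
    """Covariance between the allele counts of two individuals across loci."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    # Subtract the locus mean 2p from both individuals, then average the products
    return np.mean((x - 2.0 * p) * (y - 2.0 * p))


x = [2, 1, 0, 1]                 # allele counts of individual i at 4 loci
y = [2, 0, 0, 1]                 # allele counts of individual j
p = [0.5, 0.5, 0.5, 0.5]         # allele frequencies
cov_ij = marker_covariance(x, y, p)
print(cov_ij)
```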

39

Genomic relationships: the observed covariance between markers of animals

The numerator of the relationship between i and j

knowt flashcard image
40

Genomic relationships: the observed covariance between markers of animals

Next, we want to write var(u) as: $\mathrm{var}(\mathbf{u}) = \mathbf{G}\sigma_u^2$

where $\mathbf{G}\sigma_u^2$ is a covariance matrix

Hence, G is a standardized covariance matrix

We have to standardize (divide) the covariance by the marker variance

The resulting genomic relationship is shown in yellow

Gij is the similarity of the markers of i and j, measured as a standardized covariance
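
One common way to implement this standardization is sketched below: centre the allele counts by 2p, take cross-products between individuals, and divide by the summed marker variance Σ 2p(1-p). Whether this matches the exact formula on the original slide is an assumption; the function name and genotypes are hypothetical.

```python
import numpy as np


def genomic_relationship_matrix(M, p=None):
    """G from an (individuals x markers) matrix of allele counts 0/1/2."""
    M = np.asarray(M, dtype=float)
    if p is None:
        p = M.mean(axis=0) / 2.0               # allele frequencies from the sample
    p = np.asarray(p, dtype=float)
    W = M - 2.0 * p                            # centre every marker by its mean 2p
    marker_variance = np.sum(2.0 * p * (1.0 - p))
    return W @ W.T / marker_variance


M = np.array([[2, 1, 0, 1],
              [2, 0, 0, 1],
              [0, 1, 2, 1]])
G = genomic_relationship_matrix(M, p=np.full(4, 0.5))
print(G)
```

Passing `p` explicitly fixes the reference population; leaving it out uses the genotyped individuals themselves.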

41

A and G do not depend on the trait, but only on the

Pedigree (A) or on the genotypes (G)

42

The genomic relationship matrix (G)

Note:

  • G has both positive and negative values

  • Off-diagonals Gij: relationships between different individuals

  • positive values → individuals are more similar than average

  • negative values → individuals are less similar than average

Diagonal elements Gii: self-relationships = 1 + inbreeding coefficient

  • smaller than 1 → individual is less inbred (homozygous) than average

  • larger than 1 → individual is more inbred (homozygous) than average

In contrast to the pedigree relationships (A), negative relationships and negative inbreeding can exist

This is because G measures allele sharing relative to the population average

Large sample in HWE: mean off-diagonal $\bar{G}_{ij} = 0$, mean diagonal $\bar{G}_{ii} = 1$

43

Base generations

If we fit a genetic LMM, we get an estimate of the additive genetic variance

But to which individuals or subpopulation does this additive genetic variance refer?

GENOTYPES

In G, we subtract 2p when calculating the covariance → the mean genomic relationship is zero in the population we used to calculate p → with genomic relationships, the additive genetic variance refers to the population we used to calculate p; usually these are the genotyped individuals
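
A quick numerical check of this statement, using simulated genotypes (random binomial counts with a fixed seed; the sample sizes and allele frequency are arbitrary). The standardization by Σ 2p(1-p) is an assumed, commonly used form. When p is estimated from the genotyped animals themselves, the elements of G average exactly zero, so the mean off-diagonal is slightly negative in a finite sample and approaches zero as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(1)
n_animals, n_markers = 50, 200
M = rng.binomial(2, 0.3, size=(n_animals, n_markers)).astype(float)

p = M.mean(axis=0) / 2.0                   # frequencies from the genotyped animals
W = M - 2.0 * p                            # centring makes every marker column sum to zero
G = W @ W.T / np.sum(2.0 * p * (1.0 - p))

mean_all = G.mean()
diag_mean = np.trace(G) / n_animals
off_mean = (G.sum() - np.trace(G)) / (n_animals * (n_animals - 1))
print(mean_all, diag_mean, off_mean)
```

Using allele frequencies from a different population (for example an earlier generation) would shift the mean relationship away from zero and change the population to which the variance refers.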

44

Summary - mixed model using all relationships and ReML

knowt flashcard image