MCMC Output
- Introduces how to work with MCMC output.
- Uses Google Colab for the modeling demonstrations.
Deterministic Variables in Models
- Deterministic Variables: Transformations introduced without additional uncertainty.
- Generative Model Treatment:
- Explicit and implicit treatment of deterministic variables.
- Example:
- θ̂ ∼ Beta(a, b)
- â = θ̂(κ − 2) + 1
- b̂ = (1 − θ̂)(κ − 2) + 1
- θ_i ∼ Beta(â, b̂)
- y_i ∼ Binomial(n_i, θ_i)
- Tracking variables in JAGS (Just Another Gibbs Sampler).
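The generative process above can be forward-simulated as a sanity check. A minimal Python sketch using only the standard library (the function name `simulate_hierarchical` and the hyperparameter values are illustrative, not from the notes):

```python
import random

def simulate_hierarchical(a, b, kappa, n, M, seed=0):
    """Forward-simulate: theta_hat ~ Beta(a, b); a_hat, b_hat are
    deterministic transforms; theta_i ~ Beta(a_hat, b_hat);
    y_i ~ Binomial(n, theta_i)."""
    rng = random.Random(seed)
    theta_hat = rng.betavariate(a, b)
    # Deterministic variables: no additional randomness enters here.
    a_hat = theta_hat * (kappa - 2) + 1
    b_hat = (1 - theta_hat) * (kappa - 2) + 1
    theta = [rng.betavariate(a_hat, b_hat) for _ in range(M)]
    # Binomial draw as a sum of Bernoulli trials (stdlib has no binomial).
    y = [sum(rng.random() < th for _ in range(n)) for th in theta]
    return theta_hat, theta, y

theta_hat, theta, y = simulate_hierarchical(a=2, b=2, kappa=20, n=50, M=5)
```

Note that κ must exceed 2 for â and b̂ to be valid Beta parameters.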
JAGS Model for Deterministic Variables
- Model structure defined for JAGS:
model {
  theta_hat ~ dbeta(a, b)
  a_transformed <- theta_hat * (kappa - 2) + 1
  b_transformed <- (1 - theta_hat) * (kappa - 2) + 1
  for (i in 1:M) {
    theta[i] ~ dbeta(a_transformed, b_transformed)
    y[i] ~ dbin(theta[i], n[i])
  }
}
- Convenience: Transformations are not required but simplify the model.
Generative and Graphical Models
- Explicit vs. Implicit representation:
- Explicit: Tracks values of â, b̂, θ̂.
- Implicit: Depends on parameters without explicit tracking.
Mixture Models
- Importance: Used for datasets not well-represented by standard distributions (e.g., Gaussian).
- Framework:
- Many datasets are mixtures of distributions.
- Commonly used in Bayesian clustering.
- Example:
- π ∼ Dirichlet(α)
- θ_k ∼ p(θ_k)
- z_i ∣ π ∼ Categorical(π)
- x_i ∣ z_i, θ ∼ p(x_i ∣ θ_{z_i})
- Gaussian mixture model as common example.
- Model representation for Gaussian mixture:
model {
  pi ~ ddirch(alpha)
  for (k in 1:K) {
    mu[k] ~ dnorm(170, 0.01)
    sigma[k] ~ dunif(0.0, 10.0)
    tau[k] <- pow(sigma[k], -2)
  }
  for (i in 1:n) {
    z[i] ~ dcat(pi[])
    x[i] ~ dnorm(mu[z[i]], tau[z[i]])
  }
}
- Performance: Inference may require substantial computation time.
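The generative story behind the Gaussian mixture (draw a component label, then draw from that component's Gaussian) can be sketched in plain Python; the function name and parameter values are illustrative, chosen to roughly match the priors in the JAGS model above:

```python
import random

def sample_gmm(pi, mu, sigma, n, seed=0):
    """Draw n points from a Gaussian mixture: first pick a component
    z_i ~ Categorical(pi), then draw x_i ~ Normal(mu[z_i], sigma[z_i])."""
    rng = random.Random(seed)
    K = len(pi)
    z = [rng.choices(range(K), weights=pi)[0] for _ in range(n)]
    x = [rng.gauss(mu[zi], sigma[zi]) for zi in z]
    return z, x

# Two components, e.g. heights in cm with distinct subpopulation means.
z, x = sample_gmm(pi=[0.4, 0.6], mu=[160.0, 180.0], sigma=[5.0, 5.0], n=1000)
```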
Challenges in Mixture Models
- Symmetry Issues: Component labels can be permuted without changing the likelihood (label switching), making the posterior multi-modal and complicating inference.
- Posterior Predictive: Each component's density must be scaled by its mixture weight.
- p(x_i ∣ μ, σ, π) = ∑_{k=1}^{K} π_k Gaussian(x_i ∣ μ_k, σ_k)
- One strategy reduces the symmetry by enforcing an ordering on the μ values:
model {
  for (k in 1:K) {
    mu0[k] ~ dnorm(170, 0.01)
  }
  mu <- sort(mu0)
}
Bayesian Linear Regression
- Framework: Relationships modeled using known predictors.
- Generalized Linear Model (GLM) encompasses regression and classification paradigms.
- Linear Regression:
- Model used for predicting continuous variables:
- Single predictor: y = w_0 + w_1 x
- Multiple predictors combined as:
- y = w_0 + w_1 x_1 + … + w_K x_K = w_0 + ∑_{k=1}^{K} w_k x_k
- Error Representation:
- y = wᵀx + ϵ, where ϵ denotes the error term.
Bayesian Inference
- Focus on learning the weights, which quantify the relationships between predictors and responses.
- Practical application of Bayesian inference facilitated through JAGS for model setup and execution:
model {
  w0 ~ dnorm(0, 1e-3)
  w1 ~ dnorm(0, 1e-3)
  sigma ~ dunif(0, 100)  # prior on the noise scale; without it tau is undefined
  tau <- pow(sigma, -2)
  for (i in 1:n) {
    mu[i] <- w0 + w1 * x[i]
    y[i] ~ dnorm(mu[i], tau)
  }
}
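The generative assumption y = w_0 + w_1 x + ϵ can be checked outside JAGS. The sketch below simulates data and recovers the weights with closed-form least squares; these are point estimates only, not the Bayesian posterior, and all names and values are illustrative:

```python
import random

def simulate_and_fit(w0, w1, n, noise_sd, seed=0):
    """Simulate y_i = w0 + w1*x_i + eps_i, then recover the weights
    with ordinary least squares (closed form for a single predictor)."""
    rng = random.Random(seed)
    x = [rng.uniform(-3, 3) for _ in range(n)]
    y = [w0 + w1 * xi + rng.gauss(0, noise_sd) for xi in x]
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Slope: covariance of x and y over variance of x; intercept from means.
    w1_hat = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
              / sum((xi - xbar) ** 2 for xi in x))
    w0_hat = ybar - w1_hat * xbar
    return w0_hat, w1_hat

w0_hat, w1_hat = simulate_and_fit(w0=1.0, w1=2.0, n=2000, noise_sd=0.5)
```

With plenty of data and modest noise, the estimates should land close to the true weights.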
Logistic Regression as a Special Case
- Regression types transition smoothly within Bayesian frameworks:
- Logistic regression swaps in a Bernoulli likelihood with an inverse-logit (sigmoid) link function.
- Mixture models and linear models serve essential roles in data science.
- Applications extend to various domains requiring analysis of complex datasets including clustering and prediction tasks.
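The Bernoulli-likelihood / inverse-logit construction can be sketched in a few lines of Python (function names and parameter values are illustrative):

```python
import math
import random

def inv_logit(t):
    """Inverse-logit (sigmoid) link: maps the linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

def simulate_logistic(w0, w1, x, seed=0):
    """Bernoulli likelihood: y_i ~ Bernoulli(inv_logit(w0 + w1 * x_i))."""
    rng = random.Random(seed)
    return [int(rng.random() < inv_logit(w0 + w1 * xi)) for xi in x]

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = simulate_logistic(0.0, 3.0, xs)
```

The only changes from linear regression are the link function and the likelihood; the linear predictor w_0 + w_1 x is unchanged.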