Artificial Intelligence Exam Notes

MCMC Output

  • Introduces how to process and interpret MCMC output.
  • Examples are run in Google Colab.

Deterministic Variables in Models

  • Deterministic Variables: Transformations introduced without additional uncertainty.
  • Generative Model Treatment:
    • Explicit and implicit treatment of deterministic variables.
    • Example:
    • $\hat{\theta} \sim \text{Beta}(a, b)$
    • $\hat{a} = \hat{\theta}(\kappa - 2) + 1$
    • $\hat{b} = (1 - \hat{\theta})(\kappa - 2) + 1$
    • $\theta_i \sim \text{Beta}(\hat{a}, \hat{b})$
    • $y_i \sim \text{Binomial}(n_i, \theta_i)$
  • Tracking variables in JAGS (Just Another Gibbs Sampler).
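The generative process above can be sketched as a forward sampler in Python with NumPy. The hyperparameter values (`a`, `b`, `kappa`, and the group sizes `n`) are illustrative assumptions, not values from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative hyperparameters (assumed for this sketch)
a, b = 2.0, 2.0              # prior on the group-level mode theta_hat
kappa = 20.0                 # concentration; must exceed 2 for the transform
n = np.array([15, 20, 25])   # Binomial trial counts per group

# theta_hat ~ Beta(a, b): group-level mode
theta_hat = rng.beta(a, b)

# Deterministic transforms: (mode, concentration) -> Beta shape parameters
a_hat = theta_hat * (kappa - 2) + 1
b_hat = (1 - theta_hat) * (kappa - 2) + 1

# theta_i ~ Beta(a_hat, b_hat), then y_i ~ Binomial(n_i, theta_i)
theta = rng.beta(a_hat, b_hat, size=len(n))
y = rng.binomial(n, theta)
```

Note that `a_hat` and `b_hat` carry no randomness of their own: given `theta_hat` and `kappa`, they are fully determined, which is exactly what makes them deterministic variables.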

JAGS Model for Deterministic Variables

  • Model structure defined for JAGS:
  model {
    theta_hat ~ dbeta(a, b)
    a_transformed <- theta_hat * (kappa - 2) + 1
    b_transformed <- (1 - theta_hat) * (kappa - 2) + 1
    for (i in 1:M) {
        theta[i] ~ dbeta(a_transformed, b_transformed)
        y[i] ~ dbin(theta[i], n[i])
    }
  }
  • Convenience: Transformations are not required but simplify the model.

Generative and Graphical Models

  • Explicit vs. Implicit representation:
    • Explicit: Tracks values of $\hat{a}$, $\hat{b}$, $\hat{\theta}$.
    • Implicit: Depends on parameters without explicit tracking.

Mixture Models

  • Importance: Used for datasets not well-represented by standard distributions (e.g., Gaussian).
  • Framework:
    • Many datasets are mixtures of distributions.
    • Commonly used in Bayesian clustering.
    • Example:
    • $\theta_k \sim p(\theta_k)$
    • $z_i \mid \pi \sim \text{Categorical}(\pi)$
    • $x_i \mid z_i, \theta \sim p(x_i \mid \theta_{z_i})$
  • Gaussian mixture model as common example.
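The generative story of a Gaussian mixture can be sketched with a forward sampler: draw a latent component assignment for each observation, then draw the observation from that component. The weights, means, and standard deviations below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative mixture parameters (assumed for this sketch)
pi = np.array([0.4, 0.6])        # mixture weights, must sum to 1
mu = np.array([160.0, 175.0])    # component means
sigma = np.array([5.0, 7.0])     # component standard deviations

n = 1000
# z_i ~ Categorical(pi): which component generated observation i
z = rng.choice(len(pi), size=n, p=pi)
# x_i | z_i ~ Gaussian(mu[z_i], sigma[z_i])
x = rng.normal(mu[z], sigma[z])
```

Marginally over `z`, the data follow the weighted sum of component densities; the latent `z` is what Bayesian clustering tries to infer.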

Inference with JAGS for Mixture Models

  • Model representation for Gaussian mixture:
  model {
    pi ~ ddirch(alpha)
    for (k in 1:K) {
        mu[k] ~ dnorm(170, 0.01)
        sigma[k] ~ dunif(0.0, 10.0)
        tau[k] <- pow(sigma[k], -2)
    }
    for (i in 1:n) {
        z[i] ~ dcat(pi[])
        x[i] ~ dnorm(mu[z[i]], tau[z[i]])
    }
  }
  • Performance: Inference may require substantial computation time.

Challenges in Mixture Models

  • Symmetry Issues: Permuting the component labels leaves the likelihood unchanged (label switching), so the posterior is multi-modal, which complicates inference.
  • Posterior Predictive: Must scale densities of components correctly.
    • $p(x_i \mid \mu, \sigma, \pi) = \sum_{k=1}^K \pi_k \, \text{Gaussian}(x_i \mid \mu_k, \sigma_k)$
  • One strategy reduces these symmetries by enforcing an ordering on the $\mu$ values:
  model {
    for (k in 1:K) {
        mu0[k] ~ dnorm(170, 0.01)
    }
    mu <- sort(mu0)
  }
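The posterior-predictive density above is a weighted sum of component densities, with each Gaussian scaled by its mixture weight. A minimal NumPy sketch (the parameter values in the example call are assumptions):

```python
import numpy as np

def mixture_density(x, pi, mu, sigma):
    """p(x | mu, sigma, pi) = sum_k pi_k * Gaussian(x | mu_k, sigma_k)."""
    x = np.asarray(x, dtype=float)[..., None]   # broadcast over components
    comp = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return (pi * comp).sum(axis=-1)             # weight and sum the components

# Example: evaluate the density at a few points (illustrative parameters)
vals = mixture_density(np.array([150.0, 170.0, 190.0]),
                       pi=np.array([0.4, 0.6]),
                       mu=np.array([160.0, 175.0]),
                       sigma=np.array([5.0, 7.0]))
```

Because each component density integrates to 1 and the weights sum to 1, the mixture density itself integrates to 1, which is the scaling property the notes warn about.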

Bayesian Linear Regression

  • Framework: Models the relationship between known predictors and a response.
  • Generalized Linear Model (GLM) encompasses regression and classification paradigms.
  • Linear Regression:
    • Model used for predicting continuous variables:
    • Single predictor: $y = w_0 + w_1 x$
    • Multiple predictors combined as:
      • $y = w_0 + w_1 x_1 + \dots + w_K x_K = w_0 + \sum_{k=1}^K w_k x_k$
  • Error Representation:
    • $y = \mathbf{w}^\top \mathbf{x} + \epsilon$, where $\epsilon$ denotes the error term.
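A quick NumPy sketch of this model: simulate data from $y = w_0 + w_1 x + \epsilon$ with Gaussian noise, then recover the weights by ordinary least squares (the maximum-likelihood estimate under Gaussian error). The true weight values are assumptions for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative true weights and noise scale (assumed for this sketch)
w0_true, w1_true = 1.5, -2.0
sigma = 0.5

n = 200
x = rng.uniform(-3, 3, size=n)
eps = rng.normal(0.0, sigma, size=n)    # epsilon: Gaussian error term
y = w0_true + w1_true * x + eps

# Design matrix [1, x]; least squares recovers (w0, w1)
X = np.column_stack([np.ones(n), x])
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The Bayesian treatment below replaces this point estimate with a posterior distribution over the weights.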

Bayesian Inference

  • Focus on learning the weights, which quantify the relationships between predictors and responses.
  • Practical application of Bayesian inference facilitated through JAGS for model setup and execution:
  model {
    w0 ~ dnorm(0, 1e-3)
    w1 ~ dnorm(0, 1e-3)
    sigma ~ dunif(0.0, 100.0)  # prior on the noise scale, so tau is defined
    tau <- pow(sigma, -2)
    for (i in 1:n) {
        mu[i] <- w0 + w1 * x[i]
        y[i] ~ dnorm(mu[i], tau)
    }
  }
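When the noise precision $\tau$ is treated as known, Gaussian priors on the weights are conjugate and the posterior over $\mathbf{w}$ is available in closed form: posterior precision $A = \Lambda_0 + \tau X^\top X$ and posterior mean $m = A^{-1} \tau X^\top y$. A NumPy sketch, with prior precision $10^{-3}$ matching the `dnorm(0, 1e-3)` priors above and all data values assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data (assumed values for this sketch)
n = 100
x = rng.uniform(0, 5, size=n)
sigma = 1.0
tau = sigma ** -2                        # noise precision, assumed known here
y = 0.5 + 2.0 * x + rng.normal(0.0, sigma, size=n)

X = np.column_stack([np.ones(n), x])     # design matrix [1, x]
prior_prec = 1e-3 * np.eye(2)            # matches the dnorm(0, 1e-3) priors

# Conjugate Gaussian posterior over w = (w0, w1)
A = prior_prec + tau * X.T @ X           # posterior precision matrix
m = np.linalg.solve(A, tau * X.T @ y)    # posterior mean
post_cov = np.linalg.inv(A)              # posterior covariance
```

JAGS is needed precisely when this conjugacy breaks down, e.g. when $\sigma$ also gets a prior as in the model above.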

Logistic Regression as a Special Case

  • Bayesian frameworks make it straightforward to move between regression types:
    • Logistic regression employs a Bernoulli likelihood and inverse logit link function.
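Concretely, the linear predictor stays the same; only the likelihood and link change. A NumPy sketch of the logistic generative model, with illustrative weight values:

```python
import numpy as np

def inv_logit(eta):
    """Inverse logit (sigmoid) link: maps the linear predictor to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-eta))

rng = np.random.default_rng(0)

# Illustrative weights (assumed for this sketch)
w0, w1 = -1.0, 2.0
x = rng.uniform(-3, 3, size=500)

p = inv_logit(w0 + w1 * x)    # success probability per observation
y = rng.binomial(1, p)        # y_i ~ Bernoulli(p_i)
```

Swapping the Gaussian likelihood for a Bernoulli one and inserting the inverse-logit link is all that distinguishes this from the linear-regression model above.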

Final Remarks

  • Mixture models and linear models serve essential roles in data science.
  • Applications extend to various domains requiring analysis of complex datasets including clustering and prediction tasks.