Artificial Intelligence Exam Notes

MCMC Output

  • Introduces how to process and interpret MCMC output.
  • Examples are run in Google Colab.

Deterministic Variables in Models

  • Deterministic Variables: Transformations introduced without additional uncertainty.
  • Generative Model Treatment:
    • Explicit and implicit treatment of deterministic variables.
    • Example:
    • $\hat{\theta} \sim \text{Beta}(a, b)$
    • $\hat{a} = \hat{\theta}(\kappa - 2) + 1$
    • $\hat{b} = (1 - \hat{\theta})(\kappa - 2) + 1$
    • $\theta_i \sim \text{Beta}(\hat{a}, \hat{b})$
    • $y_i \sim \text{Binomial}(n_i, \theta_i)$
  • Tracking variables in JAGS (Just Another Gibbs Sampler).
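The generative process above can be sketched as a forward sampler in Python with NumPy. The hyperparameter values (`a`, `b`, `kappa`, and the group sizes `n`) are illustrative assumptions, not values from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative hyperparameters (assumed for this sketch)
a, b = 2.0, 2.0              # prior on the group-level mode theta_hat
kappa = 20.0                 # concentration; must exceed 2 for the transform
n = np.array([15, 20, 25])   # Binomial trial counts per group

# theta_hat ~ Beta(a, b): group-level mode
theta_hat = rng.beta(a, b)

# Deterministic transforms: (mode, concentration) -> Beta shape parameters
a_hat = theta_hat * (kappa - 2) + 1
b_hat = (1 - theta_hat) * (kappa - 2) + 1

# theta_i ~ Beta(a_hat, b_hat), then y_i ~ Binomial(n_i, theta_i)
theta = rng.beta(a_hat, b_hat, size=len(n))
y = rng.binomial(n, theta)
```

Note that `a_hat` and `b_hat` carry no randomness of their own: given `theta_hat` and `kappa`, they are fully determined, which is exactly what makes them deterministic variables.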

JAGS Model for Deterministic Variables

  • Model structure defined for JAGS:
  model {
    theta_hat ~ dbeta(a, b)
    a_transformed <- theta_hat * (kappa - 2) + 1
    b_transformed <- (1 - theta_hat) * (kappa - 2) + 1
    for (i in 1:M) {
        theta[i] ~ dbeta(a_transformed, b_transformed)
        y[i] ~ dbin(theta[i], n[i])
    }
  }
  • Convenience: Transformations are not required but simplify the model.

Generative and Graphical Models

  • Explicit vs. Implicit representation:
    • Explicit: Tracks values of $\hat{a}$, $\hat{b}$, $\hat{\theta}$.
    • Implicit: Depends on parameters without explicit tracking.

Mixture Models

  • Importance: Used for datasets not well-represented by standard distributions (e.g., Gaussian).
  • Framework:
    • Many datasets are mixtures of distributions.
    • Commonly used in Bayesian clustering.
    • Example:
    • $\theta_k \sim p(\theta_k)$
    • $z_i \mid \pi \sim \text{Categorical}(\pi)$
    • $x_i \mid z_i, \theta \sim p(x_i \mid \theta_{z_i})$
  • Gaussian mixture model as common example.
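The generative story of a Gaussian mixture can be sketched with a forward sampler: draw a latent component assignment for each observation, then draw the observation from that component. The weights, means, and standard deviations below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative mixture parameters (assumed for this sketch)
pi = np.array([0.4, 0.6])        # mixture weights, must sum to 1
mu = np.array([160.0, 175.0])    # component means
sigma = np.array([5.0, 7.0])     # component standard deviations

n = 1000
# z_i ~ Categorical(pi): which component generated observation i
z = rng.choice(len(pi), size=n, p=pi)
# x_i | z_i ~ Gaussian(mu[z_i], sigma[z_i])
x = rng.normal(mu[z], sigma[z])
```

Marginally over `z`, the data follow the weighted sum of component densities; the latent `z` is what Bayesian clustering tries to infer.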

Inference with JAGS for Mixture Models

  • Model representation for Gaussian mixture:
  model {
    pi ~ ddirch(alpha)
    for (k in 1:K) {
        mu[k] ~ dnorm(170, 0.01)
        sigma[k] ~ dunif(0.0, 10.0)
        tau[k] <- pow(sigma[k], -2)
    }
    for (i in 1:n) {
        z[i] ~ dcat(pi[])
        x[i] ~ dnorm(mu[z[i]], tau[z[i]])
    }
  }
  • Performance: Inference may require substantial computation time.

Challenges in Mixture Models

  • Symmetry Issues: Permuting the component labels leaves the likelihood unchanged (label switching), so the posterior is multi-modal, which complicates inference.
  • Posterior Predictive: Must scale densities of components correctly.
    • $p(x_i \mid \mu, \sigma, \pi) = \sum_{k=1}^K \pi_k \, \text{Gaussian}(x_i \mid \mu_k, \sigma_k)$
  • One strategy reduces these symmetries by enforcing an ordering on the $\mu$ values:
  model {
    for (k in 1:K) {
        mu0[k] ~ dnorm(170, 0.01)
    }
    mu <- sort(mu0)
  }
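The posterior-predictive density above is a weighted sum of component densities, with each Gaussian scaled by its mixture weight. A minimal NumPy sketch (the parameter values in the example call are assumptions):

```python
import numpy as np

def mixture_density(x, pi, mu, sigma):
    """p(x | mu, sigma, pi) = sum_k pi_k * Gaussian(x | mu_k, sigma_k)."""
    x = np.asarray(x, dtype=float)[..., None]   # broadcast over components
    comp = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return (pi * comp).sum(axis=-1)             # weight and sum the components

# Example: evaluate the density at a few points (illustrative parameters)
vals = mixture_density(np.array([150.0, 170.0, 190.0]),
                       pi=np.array([0.4, 0.6]),
                       mu=np.array([160.0, 175.0]),
                       sigma=np.array([5.0, 7.0]))
```

Because each component density integrates to 1 and the weights sum to 1, the mixture density itself integrates to 1, which is the scaling property the notes warn about.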

Bayesian Linear Regression

  • Framework: Models the relationship between known predictors and a response.
  • Generalized Linear Model (GLM) encompasses regression and classification paradigms.
  • Linear Regression:
    • Model used for predicting continuous variables:
    • Single predictor: $y = w_0 + w_1 x$
    • Multiple predictors combined as:
      • $y = w_0 + w_1 x_1 + \dots + w_K x_K = w_0 + \sum_{k=1}^K w_k x_k$
  • Error Representation:
    • $y = \mathbf{w}^\top \mathbf{x} + \epsilon$, where $\epsilon$ denotes the error term.
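A quick NumPy sketch of this model: simulate data from $y = w_0 + w_1 x + \epsilon$ with Gaussian noise, then recover the weights by ordinary least squares (the maximum-likelihood estimate under Gaussian error). The true weight values are assumptions for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative true weights and noise scale (assumed for this sketch)
w0_true, w1_true = 1.5, -2.0
sigma = 0.5

n = 200
x = rng.uniform(-3, 3, size=n)
eps = rng.normal(0.0, sigma, size=n)    # epsilon: Gaussian error term
y = w0_true + w1_true * x + eps

# Design matrix [1, x]; least squares recovers (w0, w1)
X = np.column_stack([np.ones(n), x])
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The Bayesian treatment below replaces this point estimate with a posterior distribution over the weights.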

Bayesian Inference

  • Focus on learning the weights, which quantify the relationships between predictors and responses.
  • Practical application of Bayesian inference facilitated through JAGS for model setup and execution:
  model {
    w0 ~ dnorm(0, 1e-3)
    w1 ~ dnorm(0, 1e-3)
    sigma ~ dunif(0.0, 100.0)  # prior on the noise scale, so tau is defined
    tau <- pow(sigma, -2)
    for (i in 1:n) {
        mu[i] <- w0 + w1 * x[i]
        y[i] ~ dnorm(mu[i], tau)
    }
  }
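When the noise precision $\tau$ is treated as known, Gaussian priors on the weights are conjugate and the posterior over $\mathbf{w}$ is available in closed form: posterior precision $A = \Lambda_0 + \tau X^\top X$ and posterior mean $m = A^{-1} \tau X^\top y$. A NumPy sketch, with prior precision $10^{-3}$ matching the `dnorm(0, 1e-3)` priors above and all data values assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data (assumed values for this sketch)
n = 100
x = rng.uniform(0, 5, size=n)
sigma = 1.0
tau = sigma ** -2                        # noise precision, assumed known here
y = 0.5 + 2.0 * x + rng.normal(0.0, sigma, size=n)

X = np.column_stack([np.ones(n), x])     # design matrix [1, x]
prior_prec = 1e-3 * np.eye(2)            # matches the dnorm(0, 1e-3) priors

# Conjugate Gaussian posterior over w = (w0, w1)
A = prior_prec + tau * X.T @ X           # posterior precision matrix
m = np.linalg.solve(A, tau * X.T @ y)    # posterior mean
post_cov = np.linalg.inv(A)              # posterior covariance
```

JAGS is needed precisely when this conjugacy breaks down, e.g. when $\sigma$ also gets a prior as in the model above.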

Logistic Regression as a Special Case

  • Bayesian frameworks make it straightforward to move between regression types:
    • Logistic regression employs a Bernoulli likelihood and inverse logit link function.
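Concretely, the linear predictor stays the same; only the likelihood and link change. A NumPy sketch of the logistic generative model, with illustrative weight values:

```python
import numpy as np

def inv_logit(eta):
    """Inverse logit (sigmoid) link: maps the linear predictor to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-eta))

rng = np.random.default_rng(0)

# Illustrative weights (assumed for this sketch)
w0, w1 = -1.0, 2.0
x = rng.uniform(-3, 3, size=500)

p = inv_logit(w0 + w1 * x)    # success probability per observation
y = rng.binomial(1, p)        # y_i ~ Bernoulli(p_i)
```

Swapping the Gaussian likelihood for a Bernoulli one and inserting the inverse-logit link is all that distinguishes this from the linear-regression model above.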

Final Remarks

  • Mixture models and linear models serve essential roles in data science.
  • Applications extend to various domains requiring analysis of complex datasets including clustering and prediction tasks.