What is the "curse of dimensionality" in the context of probability?
It refers to the fact that representing a full joint distribution of d binary variables requires 2^d - 1 independent parameters. The number of parameters grows exponentially with the number of variables.
If you have 3 binary variables (X, Y, Z), how many terms are needed to represent their full joint distribution?
2^3 = 8 probabilities must be listed (or 2^3 - 1 = 7 independent parameters, since the probabilities must sum to 1).
What is a joint probability distribution?
A function that gives the probability of every possible combination of values for a set of random variables.
What is a random variable?
A variable whose value is determined by the outcome of a random phenomenon (e.g., Rain = Yes or No).
What is a binary variable?
A random variable that can only take two possible values, such as 0/1, True/False, or Yes/No.
Write the Chain Rule of Probability for n variables.
P(X1, X2, …, Xn) = P(X1) * P(X2|X1) * P(X3|X1, X2) * … * P(Xn|X1, X2, …, Xn-1)
What does the symbol P(A|B) represent?
The conditional probability of event A occurring given that event B has already occurred.
In the chain rule, what does the final term P(Xn | X1, X2, …, Xn-1) represent?
The probability of the last variable Xn, conditioned on the specific values of all the variables that came before it.
In the jar example with 3 red and 1 blue ball, what does P(B1 = red) mean and what is its value?
It is the probability that the first ball drawn is red. Its value is 3/4.
In the jar example, what does P(B2 = red | B1 = red) mean and what is its value?
It is the probability that the second ball is red, given that the first ball drawn was red. Its value is 2/3.
In the jar example, what does the term "without replacement" mean?
It means that after the first ball is drawn, it is not put back into the jar before the second draw, which changes the probabilities for the second draw.
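The jar example above can be checked with a minimal sketch: apply the chain rule P(B1=red, B2=red) = P(B1=red) * P(B2=red | B1=red), then cross-check by enumerating every ordered pair of draws without replacement.

```python
from fractions import Fraction
from itertools import permutations

# Jar with 3 red balls and 1 blue ball, drawn without replacement.
balls = ["red", "red", "red", "blue"]

# Chain rule: P(B1=red, B2=red) = P(B1=red) * P(B2=red | B1=red)
p_b1_red = Fraction(3, 4)
p_b2_red_given_b1_red = Fraction(2, 3)
p_joint = p_b1_red * p_b2_red_given_b1_red  # 3/4 * 2/3 = 1/2

# Cross-check by enumerating all ordered draws of two balls.
draws = list(permutations(balls, 2))
p_enumerated = Fraction(sum(d == ("red", "red") for d in draws), len(draws))

print(p_joint, p_enumerated)  # both 1/2
```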
What are the two main components of a Bayesian Network?
1) A Directed Acyclic Graph (DAG) whose nodes are random variables, and 2) a Conditional Probability Table (CPT) for each node, giving its distribution conditioned on its parents.
In a Bayesian Network graph, what do the nodes and edges represent?
Nodes represent random variables. Edges (arrows) represent direct influence or a "causal" relationship from a parent node to a child node.
What does DAG stand for and what are its properties?
DAG stands for Directed Acyclic Graph. It is "directed" meaning edges have arrows, and "acyclic" meaning there are no cycles (you cannot follow arrows and return to your starting point).
What is the fundamental factorization rule for a Bayesian Network?
P(x1, x2, …, xn) = ∏ P(xi | parents(xi)) for i=1 to n. (The joint distribution is the product of each node's conditional probability given its parents).
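The factorization rule can be sketched on the Wet Grass network used later in this deck (Cloudy → Sprinkler, Cloudy → Rain, Sprinkler/Rain → Wet Grass). The CPT numbers below are illustrative assumptions, not values from the deck; the point is that multiplying each node's conditional probability yields a valid joint distribution.

```python
from itertools import product

# Toy network: Cloudy -> Sprinkler, Cloudy -> Rain, (Sprinkler, Rain) -> WetGrass.
# CPT values are made-up for illustration.
p_c = {True: 0.5, False: 0.5}                      # P(Cloudy)
p_s = {True: 0.1, False: 0.5}                      # P(Sprinkler=T | Cloudy)
p_r = {True: 0.8, False: 0.2}                      # P(Rain=T | Cloudy)
p_w = {(True, True): 0.99, (True, False): 0.9,     # P(WetGrass=T | Sprinkler, Rain)
       (False, True): 0.9, (False, False): 0.0}

def joint(c, s, r, w):
    """P(c, s, r, w) = P(c) * P(s|c) * P(r|c) * P(w|s, r)."""
    ps = p_s[c] if s else 1 - p_s[c]
    pr = p_r[c] if r else 1 - p_r[c]
    pw = p_w[(s, r)] if w else 1 - p_w[(s, r)]
    return p_c[c] * ps * pr * pw

# The probabilities of all 2^4 assignments must sum to 1.
total = sum(joint(*a) for a in product([True, False], repeat=4))
print(total)
```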
In the Bayesian Network factorization, what does "parents(xi)" refer to?
The set of nodes in the graph that have a direct arrow pointing into node xi.
In the Naïve Bayes model, what is the key assumption encoded in the graph?
All input features (evidence variables X_i) are conditionally independent of each other given the class label Y.
Write the factorization formula for a Naïve Bayes model.
P(Y, X1, X2, …, Xn) = P(Y) * ∏ P(Xi | Y) for i=1 to n.
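A minimal Naïve Bayes sketch using this factorization: the class names, feature names, and all probabilities below are hypothetical, chosen only to show how P(Y) * ∏ P(Xi | Y) is computed and normalized into a posterior.

```python
# Toy Naive Bayes classifier with illustrative, made-up probabilities.
p_y = {"spam": 0.3, "ham": 0.7}                      # prior P(Y)
p_x_given_y = {                                      # P(X_i = True | Y) per feature
    "spam": {"free": 0.8, "winner": 0.6},
    "ham":  {"free": 0.1, "winner": 0.05},
}

def posterior(features):
    """P(Y | X) is proportional to P(Y) * prod_i P(X_i | Y); normalize at the end."""
    scores = {}
    for y, prior in p_y.items():
        score = prior
        for name, present in features.items():
            p = p_x_given_y[y][name]
            score *= p if present else 1 - p
        scores[y] = score
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}

print(posterior({"free": True, "winner": True}))  # "spam" dominates for these inputs
```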
What is a Conditional Probability Table (CPT)?
A table that shows the probability distribution over a node's values for each possible combination of its parents' values.
In the Wet Grass example, which variables are the parents of "Wet Grass"?
Sprinkler and Rain.
In the Wet Grass example, how many independent parameters does the CPT for "Wet Grass" have and why?
It has 4 independent parameters: one probability P(W = true | S, R) for each of the 2^2 = 4 combinations of its parents' values (Sprinkler and Rain).
How many parameters would a full joint distribution require for 4 binary variables?
2^4 - 1 = 15 independent parameters.
How many total parameters does the Wet Grass Bayesian Network require, and how many does it save?
It requires 9 parameters (C:1, S:2, R:2, W:4). It saves 15 - 9 = 6 parameters compared to the full joint distribution.
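The parameter-count comparison above can be sketched in a few lines: a binary node with k binary parents needs 2^k independent parameters, one per parent combination.

```python
# Parameter counting for networks of binary variables.
def full_joint_params(n):
    """Independent parameters in a full joint over n binary variables."""
    return 2 ** n - 1

def bn_params(num_parents_per_node):
    """Each binary node needs 2^(#parents) independent parameters."""
    return sum(2 ** k for k in num_parents_per_node)

# Wet Grass network: Cloudy (0 parents), Sprinkler (1), Rain (1), WetGrass (2).
full = full_joint_params(4)      # 15
bn = bn_params([0, 1, 1, 2])     # 1 + 2 + 2 + 4 = 9
print(full, bn, full - bn)       # 15 9 6
```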
Define unconditional independence.
Two variables A and B are independent if P(A, B) = P(A) * P(B), or equivalently, P(A|B) = P(A). Knowing B tells you nothing about A.
Define conditional independence.
Two variables A and B are conditionally independent given a third variable C if P(A, B | C) = P(A | C) * P(B | C). This is written as (A ⟂ B) | C.
What is the notation for "A is independent of B given C"?
(A ⟂ B) | C
How many parameters are needed for the full joint P(A,B,C) if all are binary, using the chain rule P(C)P(A|C)P(B|A,C)?
1 + 2 + 4 = 7 parameters.
How many parameters are needed for the joint if A is independent of B given C, i.e., P(C)P(A|C)P(B|C)?
1 + 2 + 2 = 5 parameters.
State the Local Markov Property in your own words.
A node is conditionally independent of its non-descendants, given its parents.
In the Local Markov Property, what are considered "non-descendants" of a node X?
All nodes that are not descendants of X (i.e., not its children, grandchildren, etc.) and not X itself. X's parents are technically non-descendants too, but they form the conditioning set.
Explain the intuition behind the Local Markov Property.
"I only need to know my immediate parents to ignore my non-descendants." Once you know the parents' values, other unrelated nodes provide no extra information.
What is a Markov Blanket?
The set of nodes that completely shields a node X from the rest of the network. X is independent of all other nodes given its Markov blanket.
What three groups of nodes make up the Markov blanket of a node X?
1) The parents of X, 2) the children of X, and 3) the other parents of X's children (the co-parents).
Explain the intuition behind the Markov Blanket.
"I only need to know my Markov blanket (parents, children, and co-parents) to ignore everything else in the network."
What is D-separation?
A graphical criterion used to determine if two sets of nodes are conditionally independent given a third set of observed nodes.
What does it mean if two nodes A and C are d-separated by a set of observed nodes Z?
It means that A and C are conditionally independent given Z. (A ⟂ C | Z)
What are the three basic structures that determine if a path is blocked?
The Chain (A → B → C), the Fork / common cause (A ← B → C), and the Collider / common effect (A → B ← C).
In a Chain (A → B → C), when is the path blocked?
The path is blocked if the middle node B is in the observed set Z.
In a Fork (A ← B → C), when is the path blocked?
The path is blocked if the middle node B is in the observed set Z.
In a Collider (A → B ← C), when is the path blocked?
The path is blocked if neither the middle node B nor any of its descendants is in the observed set Z. Observing B (or any descendant of B) unblocks the path.
What is the "explaining away" phenomenon?
It occurs in a common effect structure. If the effect is observed, confirming one cause reduces the probability of the other cause, as it "explains away" the observation.
In the structure S (Sprinkler) → W (Wet Grass) ← R (Rain), what happens to the independence of S and R if W is observed?
If W is observed, S and R become dependent (explaining away).
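Explaining away can be demonstrated numerically with a minimal sketch of the collider S → W ← R. The priors and CPT values are made-up assumptions; the comparison shows that once W = true is observed, also learning R = true lowers the probability of S.

```python
from itertools import product

# Toy collider: Sprinkler -> WetGrass <- Rain, with illustrative CPTs.
p_s_true, p_r_true = 0.3, 0.3                       # priors P(S=T), P(R=T)
p_w = {(True, True): 0.99, (True, False): 0.9,      # P(W=T | S, R)
       (False, True): 0.9, (False, False): 0.0}

def joint(s, r, w):
    ps = p_s_true if s else 1 - p_s_true
    pr = p_r_true if r else 1 - p_r_true
    pw = p_w[(s, r)] if w else 1 - p_w[(s, r)]
    return ps * pr * pw

def prob(query, given):
    """P(query | given) by summing the joint over consistent assignments."""
    num = den = 0.0
    for s, r, w in product([True, False], repeat=3):
        a = {"S": s, "R": r, "W": w}
        p = joint(s, r, w)
        if all(a[k] == v for k, v in given.items()):
            den += p
            if all(a[k] == v for k, v in query.items()):
                num += p
    return num / den

p_s_given_w = prob({"S": True}, {"W": True})
p_s_given_w_r = prob({"S": True}, {"W": True, "R": True})
print(p_s_given_w, p_s_given_w_r)  # learning R=True lowers the probability of S
```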
Write the factorization for a simple direct influence A → B.
P(A, B) = P(A) * P(B|A)
Write the factorization for an indirect influence (chain) A → B → C.
P(A, B, C) = P(A) * P(B|A) * P(C|B)
Write the factorization for a common cause (fork) A ← B → C.
P(A, B, C) = P(B) * P(A|B) * P(C|B)
Write the factorization for a common effect (collider) A → B ← C.
P(A, B, C) = P(A) * P(C) * P(B|A, C)
In a common cause structure (A ← B → C), what is the independence relationship if B is given?
A and C are independent given B. (A ⟂ C | B)
In a chain structure (A → B → C), what is the independence relationship if B is given?
A and C are independent given B. (A ⟂ C | B)
In a common effect structure (A → B ← C), what is the independence relationship if B is NOT given?
A and C are independent. (A ⟂ C)
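The chain case (A ⟂ C | B) among the structures above can be verified by enumeration. The CPT numbers are illustrative assumptions; once B is fixed, the conditional probability of C is the same whether or not A is also fixed.

```python
from itertools import product

# Chain A -> B -> C with made-up CPTs; check (A ⟂ C | B) numerically.
p_a = 0.4                                   # P(A=T)
p_b = {True: 0.7, False: 0.2}               # P(B=T | A)
p_c = {True: 0.9, False: 0.1}               # P(C=T | B)

def joint(a, b, c):
    pa = p_a if a else 1 - p_a
    pb = p_b[a] if b else 1 - p_b[a]
    pc = p_c[b] if c else 1 - p_c[b]
    return pa * pb * pc

def cond(num_fixed, den_fixed):
    """P(num_fixed | den_fixed) by enumeration over the full joint."""
    num = den = 0.0
    for a, b, c in product([True, False], repeat=3):
        vals = {"A": a, "B": b, "C": c}
        p = joint(a, b, c)
        if all(vals[k] == v for k, v in den_fixed.items()):
            den += p
            if all(vals[k] == v for k, v in num_fixed.items()):
                num += p
    return num / den

# With B observed, C no longer depends on A: the two conditionals match.
lhs = cond({"C": True}, {"B": True, "A": True})
rhs = cond({"C": True}, {"B": True, "A": False})
print(lhs, rhs)  # both equal P(C=T | B=T) = 0.9
```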
State one advantage of Bayesian Networks related to visualization.
They provide a graphical representation that offers a visual and intuitive way to understand complex relationships between variables.
State one advantage of Bayesian Networks related to knowledge types.
They can combine prior knowledge (from experts) with statistical information from data.
What is a key disadvantage of Bayesian Networks regarding knowledge?
They often require prior knowledge to specify many probabilities, which can be difficult to obtain.
What is a key disadvantage of Bayesian Networks regarding computation?
Performing exact inference can be computationally intractable (very difficult or impossible) for large, complex networks.
Write the formula for the Chain Rule of Probability.
P(X1, X2, …, Xn) = P(X1) * P(X2|X1) * P(X3|X1, X2) * … * P(Xn|X1, X2, …, Xn-1)
Write the general factorization formula for a Bayesian Network.
P(x1, x2, …, xn) = ∏ P(xi | parents(xi))
Write the factorization formula for a Naïve Bayes model.
P(Y, X1, X2, …, Xn) = P(Y) * ∏ P(Xi | Y)
Write the formula for unconditional independence.
P(A, B) = P(A) * P(B)
Write the formula for conditional independence.
P(A, B | C) = P(A | C) * P(B | C)
Write the formula for the Local Markov Property.
X ⟂ Non-descendants | Parents(X)
Write the formula for the Markov Blanket property.
X ⟂ All other nodes | MarkovBlanket(X)