Flashcards reviewing key concepts from lecture notes, including power laws, normal distribution, association rules, and boosting.
What is a Power Law?
A relationship of the form P(x) ∝ x^(-a), where a is the exponent that controls how quickly the probability drops.
Can you give examples of where Power Laws are commonly observed?
Virtual-world examples include money and wealth, sales, web traffic, clicks, YouTube views, and word frequency.
When does the asymmetry increase, for example when considering incomes?
When there are more sales, more views, or more wealth.
What type of tail does a Power Law distribution have?
Heavy-tailed.
How do you produce a straight line when graphing a power law distribution?
Plotting the logarithm of frequency versus the logarithm of value (a log-log plot).
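Taking logarithms of both sides of P(x) ∝ x^(-a) gives log P(x) = -a · log x + c, so the points fall on a line whose slope is -a. The sketch below checks this on made-up data; the helper name `loglog_slope` and the choice a = 2 are assumptions for illustration, not part of the lecture notes.

```python
import math

def loglog_slope(values, freqs):
    """Least-squares slope of log(freq) against log(value)."""
    xs = [math.log(v) for v in values]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical data: frequencies that follow x^(-a) exactly, with a = 2.
a = 2.0
values = list(range(1, 101))
freqs = [v ** -a for v in values]

slope = loglog_slope(values, freqs)
print(slope)  # close to -a, up to floating-point rounding
```

With real, noisy data the fitted slope only estimates -a, and the fit is usually poor in the sparse far tail.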
What is the 80/20 rule?
80 percent of the effects come from 20 percent of the causes.
Can you give examples of where Normal Distributions are commonly observed?
Physical world examples include weight, height, cholesterol, and blood counts.
How do observations relate to the mean in a Normal Distribution?
Observations cluster closely around the mean, and the odds of a deviation decline faster and faster as you move away from the average.
What type of tail does a Normal Distribution have?
Thin-tailed.
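To make "thin-tailed" versus "heavy-tailed" concrete, the sketch below compares how fast the tail probability shrinks in each case. It assumes a standard normal variable and, for the power law, a Pareto variable whose tail is P(X > x) = (x_min / x)^alpha (this matches a density exponent a = alpha + 1). The function names and the parameter values are illustrative assumptions, and the two scales are not directly comparable; only the rate of decay is the point.

```python
import math

def normal_tail(k):
    """P(Z > k) for a standard normal variable Z."""
    return 0.5 * math.erfc(k / math.sqrt(2))

def pareto_tail(x, alpha=1.0, x_min=1.0):
    """P(X > x) for a Pareto variable with scale x_min and tail index alpha."""
    return (x_min / x) ** alpha if x >= x_min else 1.0

# Normal: k standard deviations above the mean. Pareto: k times the minimum.
for k in (2, 5, 10):
    print(k, normal_tail(k), pareto_tail(k))
```

The normal tail collapses toward zero within a few standard deviations, while the Pareto tail only shrinks in proportion to a power of x, so extreme values stay plausible.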
In a normal distribution, what do the mean and standard deviation represent?
The mean is the center of the bell curve, and the standard deviation controls the width.
What is the purpose of Association Rules?
To discover relationships among variables in large databases.
What is an itemset?
A collection of one or more items.
What is a transaction in the context of association rules?
A set of items bought together (shopping basket).
What is a frequent itemset?
Something that appears in a dataset more frequently than a predefined threshold.
What does 'Support' measure in association rules?
The proportion of transactions in the dataset that contain a particular itemset.
What is the formula for Support?
Support(A → B) = P(A ∩ B)
What does 'Confidence' measure in association rules?
How often items in B appear in transactions that contain A.
What is the formula for Confidence?
Confidence(A → B) = P(B | A) = Support(A ∩ B) / Support(A)
What does 'Lift' measure in association rules?
A measure of how much more likely B is to occur when A has occurred, compared to B occurring independently.
What is the formula for Lift?
Lift(A → B) = Confidence(A → B) / Support(B)
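The three formulas can be sketched on a toy basket dataset. The transactions and function names below are made up for illustration. Note that the event "A and B both occur" corresponds to a transaction containing every item of A and of B, which is why the code tests for the union of the two itemsets.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, a, b):
    """Support of A and B together, divided by the support of A."""
    return support(transactions, set(a) | set(b)) / support(transactions, a)

def lift(transactions, a, b):
    """Confidence of A -> B divided by the support of B."""
    return confidence(transactions, a, b) / support(transactions, b)

# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

print(support(transactions, {"bread", "milk"}))       # 3 of 5 baskets
print(confidence(transactions, {"bread"}, {"milk"}))  # 3 of the 4 bread baskets
print(lift(transactions, {"bread"}, {"milk"}))        # confidence / support(milk)
```

In this toy data bread appears in 4 of 5 baskets and so does milk, so the lift of bread → milk comes out slightly below 1.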
What does a Lift greater than 1 indicate?
A and B occur together more often than expected if they were independent.
What does a Lift equal to 1 indicate?
A and B are independent.
What does a Lift less than 1 indicate?
A and B occur together less often than expected (negative correlation).
What is the purpose of the Apriori Algorithm?
To identify all frequent itemsets in a dataset.
What is pruning in the Apriori algorithm?
Skipping larger itemsets when any of their subsets has already been found to be infrequent.
What threshold does the Apriori algorithm require?
A minimum support threshold.
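A minimal sketch of the level-by-level search, assuming transactions are given as Python sets. The function name `apriori` and the toy baskets are illustrative assumptions; this is an unoptimized teaching version, not a reference implementation.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop a candidate if any (k-1)-subset is infrequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count support only for the candidates that survived pruning.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
    return frequent

# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
result = apriori(baskets, min_support=0.4)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Here {bread, milk, butter} is never counted: its subset {milk, butter} appears in only 1 of 5 baskets, so the candidate is pruned before any pass over the data.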
How do confidence and support relate?
The confidence of a rule can never be less than its support: Confidence(A → B) = Support(A ∩ B) / Support(A), and Support(A) is at most 1.
What association rule term is equivalent to likelihood ratio?
Lift.
What do Bayesian rules do?
Update beliefs with new evidence.
What do association rules do?
Find co-occurrences.
What are prior odds?
The ratio of the probability of a hypothesis being true to it being false, before seeing evidence.
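In odds form, a Bayesian update multiplies the prior odds by the likelihood ratio P(E | H) / P(E | not H) to get the posterior odds. The sketch below works through one update; the function names and the numbers (a prior probability of 0.2 and a likelihood ratio of 3) are made up for illustration.

```python
def prior_odds(p_hypothesis):
    """Odds that the hypothesis is true versus false, before seeing evidence."""
    return p_hypothesis / (1.0 - p_hypothesis)

def posterior_odds(prior, likelihood_ratio):
    """Odds after the evidence: prior odds times the likelihood ratio."""
    return prior * likelihood_ratio

def odds_to_probability(odds):
    """Convert odds back into a probability."""
    return odds / (1.0 + odds)

# Hypothetical inputs.
p_h = 0.2   # prior probability of the hypothesis
lr = 3.0    # evidence is 3 times as likely if the hypothesis is true

before = prior_odds(p_h)             # 0.2 / 0.8
after = posterior_odds(before, lr)   # three times the prior odds
print(before, after, odds_to_probability(after))
```

A likelihood ratio above 1 raises the odds, a ratio below 1 lowers them, and a ratio of exactly 1 leaves the belief unchanged.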
What is boosting?
A machine learning technique that combines many simple models into one powerful model.
Does boosting rely on random sampling?
No. It uses the full dataset.
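As one concrete illustration, the sketch below uses AdaBoost with one-threshold "stumps" on 1-D data. AdaBoost is chosen here as an example; the notes do not name a specific boosting algorithm, and all function names and data are assumptions. Every round trains on all examples and changes only their weights.

```python
import math

def stump_predict(x, threshold, polarity):
    """Predict polarity for x <= threshold, and the opposite label otherwise."""
    return polarity if x <= threshold else -polarity

def best_stump(xs, ys, weights):
    """Pick the threshold/polarity stump with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n  # every example is used; none is sampled away
    model = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this stump
        model.append((alpha, threshold, polarity))
        # Raise the weight of misclassified examples, lower the rest.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in model)
    return 1 if score >= 0 else -1

# Hypothetical labels that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

model = adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

A single stump misclassifies at least three of these ten points, while the weighted vote of three stumps fits all of them, which is the "many simple models into one powerful model" idea in miniature.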