What is the goal of association rule analysis?
Discovering associations among feature values.
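What are some use cases of association rule analysis, and how do they relate to it?
Market basket analysis (finding products that are frequently bought together) and fraud detection (finding combinations of feature values that frequently occur together with fraudulent cases).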
What does the approach for association rule analysis look like? Explain each step in detail.
1 data preparation
convert features to categorical variables
convert categorical variables to dummies
2 method
Apply the apriori algorithm
Calculate the support of each individual item and keep only those items whose support is above a user-specified minimum support threshold.
Generate larger itemsets from frequent itemsets and again only keep those that match the threshold.
Generate association rules for each frequent itemset and only keep those that meet a minimum confidence (and/or) lift threshold.
3 interpretation
Interpret an association rule "if A then B" (assuming it meets all the required thresholds)
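The method steps above can be sketched in plain Python. This is a minimal illustration rather than an optimized implementation: the toy transactions, the function names `support`, `apriori` and `generate_rules`, and the thresholds 0.5 and 0.7 are all assumptions chosen for the example.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of items it contains.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "butter", "eggs"},
]


def support(itemset, transactions):
    """Share of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Return {frequent itemset: support}, built level by level."""
    items = {item for t in transactions for item in t}
    # Step 1: keep single items that meet the minimum support.
    current = {frozenset([i]) for i in items
               if support(frozenset([i]), transactions) >= min_support}
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        # Step 2: join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Drop candidates that have an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = {c for c in candidates
                   if support(c, transactions) >= min_support}
    return frequent


def generate_rules(frequent, min_confidence):
    """Step 3: split each frequent itemset into antecedent -> consequent."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                consequent = itemset - antecedent
                confidence = supp / frequent[antecedent]
                lift = confidence / frequent[consequent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, supp, confidence, lift))
    return rules


frequent = apriori(transactions, min_support=0.5)
rules = generate_rules(frequent, min_confidence=0.7)
for a, b, supp, conf, lift in rules:
    print(f"{set(a)} -> {set(b)}: "
          f"support={supp:.2f}, confidence={conf:.2f}, lift={lift:.2f}")
```

With these assumed thresholds, {milk, butter} (support 0.4) is dropped at the second level, so the candidate {bread, milk, butter} is discarded without its support ever being counted.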
Association rule analysis converts interval variables to categorical variables and creates contingency tables to find combinations. What is a problem that can arise with this procedure and how is it solved?
If the number of features is large, the number of contingency tables we need increases very quickly. To solve this problem, each categorical variable is recoded into dummy variables, one dummy for each category, that indicate whether that category is part of a transaction or not.
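As a rough illustration of this recoding, the sketch below bins an interval variable into categories and then creates one 0/1 dummy per category. The records, the `age_bin` cut-offs and the `feature=value` column naming are hypothetical choices made for the example.

```python
# Hypothetical records with one interval feature (age) and one
# categorical feature (payment).
records = [
    {"age": 23, "payment": "card"},
    {"age": 41, "payment": "cash"},
    {"age": 67, "payment": "card"},
]


def age_bin(age):
    """Convert the interval variable to a categorical one (assumed cut-offs)."""
    if age < 30:
        return "young"
    if age < 60:
        return "middle"
    return "senior"


# Step 1: every feature becomes categorical.
categorical = [
    {"age": age_bin(r["age"]), "payment": r["payment"]} for r in records
]

# Step 2: one dummy column per (feature, category) pair.
columns = sorted({f"{f}={v}" for row in categorical for f, v in row.items()})
dummies = []
for row in categorical:
    present = {f"{f}={v}" for f, v in row.items()}
    dummies.append({col: int(col in present) for col in columns})

print(columns)
for row in dummies:
    print(row)
```

Each row of `dummies` can then be read as a transaction whose items are the columns set to 1.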
What is the support?
Measures the relative frequency of an item(set) or association rule in the whole dataset.
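A minimal sketch of this calculation, assuming transactions are stored as Python sets. The data and the function name `support` are made up for illustration.

```python
# Hypothetical toy data: each transaction is the set of items it contains.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "butter", "eggs"},
]


def support(itemset, transactions):
    """Share of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


support_bread = support({"bread"}, transactions)               # 4 of 5
support_bread_milk = support({"bread", "milk"}, transactions)  # 3 of 5
print(support_bread, support_bread_milk)
```

Because an itemset is a set, `{"bread", "milk"}` and `{"milk", "bread"}` give the same support.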
Does the support of an association rule depend on order?
No. The support of a rule is the relative frequency of transactions that contain both A and B, so "if A then B" and "if B then A" have the same support.
Describe the antecedent.
The set of item(s) that appear on the left-hand side of the association rule and represent the condition or "if" part of the rule. E.g., in "if A happens, then B is likely to happen as well", A is the antecedent.
Describe the consequent.
The set of item(s) that appear on the right-hand side of the association rule and represent the outcome or "then" part of the rule. E.g., in "if A happens, then B is likely to happen as well", B is the consequent.
What is the coverage of an association rule?
The support for its antecedent.
What is the expected confidence of an association rule?
The support for its consequent.
What is the confidence of an association rule and how should we interpret this?
Its support divided by its coverage. It can be interpreted as the conditional probability of B given A: it tells you how often items in B appear in transactions that already contain A.
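A small worked example of this ratio, using made-up transactions and exact fractions so the arithmetic is easy to follow. The function names are assumptions for illustration.

```python
from fractions import Fraction

# Hypothetical toy data: each transaction is the set of items it contains.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "butter", "eggs"},
]


def support(itemset, transactions):
    """Exact share of transactions that contain every item in `itemset`."""
    count = sum(set(itemset) <= t for t in transactions)
    return Fraction(count, len(transactions))


def confidence(antecedent, consequent, transactions):
    """Support of the rule divided by the coverage (support of the antecedent)."""
    rule_support = support(set(antecedent) | set(consequent), transactions)
    coverage = support(antecedent, transactions)
    return rule_support / coverage


conf_bread_butter = confidence({"bread"}, {"butter"}, transactions)  # (3/5) / (4/5)
conf_butter_bread = confidence({"butter"}, {"bread"}, transactions)  # (3/5) / (3/5)
print(conf_bread_butter, conf_butter_bread)
```

Unlike support, confidence depends on the direction of the rule: here every transaction with butter also contains bread, but only three of the four transactions with bread contain butter.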
What is the lift of an association rule and how should we interpret this?
The lift of an association rule is its confidence divided by its expected confidence. If lift = 1, there is no association between A and B: they occur together as often as you would expect by chance. If lift > 1, there is a positive association between A and B, meaning that the two occur together more often than we would otherwise expect. If lift < 1, there is a negative association: the two occur together less often than expected.
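The sketch below computes lift on made-up transactions, using exact fractions. The data and function names are assumptions for illustration.

```python
from fractions import Fraction

# Hypothetical toy data: each transaction is the set of items it contains.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "butter", "eggs"},
]


def support(itemset, transactions):
    """Exact share of transactions that contain every item in `itemset`."""
    count = sum(set(itemset) <= t for t in transactions)
    return Fraction(count, len(transactions))


def lift(antecedent, consequent, transactions):
    """Confidence divided by expected confidence (support of the consequent)."""
    rule_support = support(set(antecedent) | set(consequent), transactions)
    confidence = rule_support / support(antecedent, transactions)
    expected_confidence = support(consequent, transactions)
    return confidence / expected_confidence


lift_butter_bread = lift({"butter"}, {"bread"}, transactions)  # 1 / (4/5)
lift_bread_milk = lift({"bread"}, {"milk"}, transactions)      # (3/4) / (4/5)
print(lift_butter_bread, lift_bread_milk)
```

In this toy data butter and bread are positively associated (lift 5/4), while bread and milk occur together slightly less often than chance would suggest (lift 15/16).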
Explain what the apriori property is.
The apriori property tells us that if an item(set) is frequent, all of its subsets must also be frequent. Conversely, if an item(set) is not frequent, none of its supersets can be frequent. This lets the algorithm efficiently reduce the search space because it can skip many combinations that cannot possibly be frequent (meet the threshold).
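The property follows from the fact that adding an item to an itemset can never increase its support. A brute-force check on a small made-up dataset illustrates this; the data, the threshold of 0.5 and all names are assumptions for the example.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of items it contains.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "butter", "eggs"},
]
items = sorted({i for t in transactions for i in t})


def support(itemset):
    """Share of transactions that contain every item in `itemset`."""
    return sum(set(itemset) <= t for t in transactions) / len(transactions)


# Enumerate every non-empty itemset (only feasible for tiny examples).
all_itemsets = [frozenset(c)
                for k in range(1, len(items) + 1)
                for c in combinations(items, k)]

# Adding items never increases support.
monotone = all(support(big) <= support(small)
               for small in all_itemsets
               for big in all_itemsets
               if small < big)

min_support = 0.5  # assumed threshold
frequent = {s for s in all_itemsets if support(s) >= min_support}

# Every proper subset of a frequent itemset is frequent.
subsets_ok = all(sub in frequent
                 for s in frequent
                 for k in range(1, len(s))
                 for sub in map(frozenset, combinations(s, k)))

# No superset of an infrequent itemset is frequent.
supersets_ok = all(big not in frequent
                   for small in all_itemsets if small not in frequent
                   for big in all_itemsets if small < big)

print(monotone, subsets_ok, supersets_ok)
```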
What are some of the limitations to association rule analysis and the apriori algorithm?
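Applicable only to a restrictive form of data (features must be recoded as dummy variables).
The number of association rules increases exponentially as the minimum support threshold t decreases.
Rules with high confidence or lift but low support will not be discovered.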
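The last two limitations can be seen even on a tiny made-up dataset. The sketch below counts frequent itemsets by brute force for several assumed thresholds; the data and names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of items it contains.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "butter", "eggs"},
]
items = sorted({i for t in transactions for i in t})


def support(itemset):
    """Share of transactions that contain every item in `itemset`."""
    return sum(set(itemset) <= t for t in transactions) / len(transactions)


all_itemsets = [frozenset(c)
                for k in range(1, len(items) + 1)
                for c in combinations(items, k)]

# Number of frequent itemsets for each assumed support threshold.
counts = {t: sum(support(s) >= t for s in all_itemsets)
          for t in (0.8, 0.6, 0.4, 0.2)}
print(counts)

# A rule with perfect confidence but low support: eggs -> bread.
eggs_bread_support = support({"eggs", "bread"})
eggs_bread_confidence = eggs_bread_support / support({"eggs"})
print(eggs_bread_support, eggs_bread_confidence)
```

Lowering the threshold from 0.8 to 0.2 takes the number of frequent itemsets from 2 to all 15 possible itemsets, and the rule "if eggs then bread" is only found at the lowest threshold even though its confidence is 1.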