Q: What is Association Rule Mining?
A: A data mining technique used to find associations or relationships between items in a dataset, often applied to market-basket analysis or recommender systems.
Q: What is a Market-Basket Transaction?
A: A data set representing multiple transactions, each containing a set of items bought together. The goal is to find associations between these items.
Q: What is an Itemset?
A: A group of items of interest within a dataset. For example, an itemset might be {Milk, Diapers, Beer}.
Q: Define an Association Rule.
A: A rule that suggests a relationship between itemsets, written X → Y, where X is the antecedent and Y is the consequent. Example: {Diapers} → {Beer}.
Q: What does Support Count (σ) measure?
A: The number of transactions that include a specific itemset.
Q: How do you calculate Support (s)?
A: Support is the fraction of transactions that contain the itemset X:
s(X) = σ(X) / N, where σ(X) is the support count of X and N is the total number of transactions.
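As a minimal sketch of the two definitions above, the snippet below counts and normalizes over a small market-basket dataset. The five transactions and the helper names `support_count` and `support` are assumptions made for illustration; they do not come from the cards.

```python
# Hypothetical five-transaction dataset, invented for illustration.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return support_count(itemset, transactions) / len(transactions)

x = {"Milk", "Diapers", "Beer"}
print(support_count(x, transactions))  # 2
print(support(x, transactions))        # 0.4
```

The subset test `itemset <= t` is what makes this a count of transactions containing the whole itemset, not just any one of its items.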
Q: What does Confidence (c) measure?
A: The strength of an association rule X → Y: how often the items in Y appear in transactions that contain X. It is computed as s(X ∪ Y) / s(X) and ranges from 0 to 1.
Q: What does a Confidence of 0.67 indicate for the rule {Milk, Diapers} → {Beer}?
A: It means that about 67% of the transactions that include Milk and Diapers also include Beer.
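A confidence of 0.67 can be reproduced with a short calculation. This is a sketch only: the five transactions and the `confidence` helper are hypothetical, chosen so that the rule {Milk, Diapers} → {Beer} comes out at 2/3.

```python
# Hypothetical five-transaction dataset, invented for illustration.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

def count(itemset):
    """Support count: transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent):
    """Confidence of antecedent -> consequent: count(X and Y) / count(X)."""
    return count(antecedent | consequent) / count(antecedent)

c = confidence({"Milk", "Diapers"}, {"Beer"})
print(round(c, 2))  # 0.67 (2 of the 3 Milk-and-Diapers transactions have Beer)
```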
Q: What is Lift in association rule mining?
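A: A metric that compares how often X and Y appear together with how often they would appear together if they were independent: Lift(X → Y) = c(X → Y) / s(Y) = s(X ∪ Y) / (s(X) · s(Y)).
Q: What do Lift values greater than and less than 1 mean?
A: Lift > 1 suggests a stronger-than-random (positive) association, Lift < 1 indicates that the items co-occur less often than chance would predict (a negative association), and Lift = 1 means X and Y appear independent.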
Q: What does a Lift greater than 1 imply?
A: It indicates that X and Y occur together more often than expected by chance, showing a positive association.
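To make the lift comparison concrete, here is a small sketch that divides a rule's confidence by the support of its consequent. The dataset and the `lift` helper are assumptions for illustration, not part of the original cards.

```python
# Hypothetical five-transaction dataset, invented for illustration.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

def count(itemset):
    """Support count: transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def lift(antecedent, consequent):
    """Lift = confidence(X -> Y) / support(Y), written with raw counts."""
    n = len(transactions)
    both = count(antecedent | consequent)
    return (both * n) / (count(antecedent) * count(consequent))

value = lift({"Milk", "Diapers"}, {"Beer"})
print(round(value, 2))  # 1.11, slightly above 1: a mild positive association
```

Working from counts instead of pre-divided fractions keeps the arithmetic in one division, which avoids accumulating floating-point error.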
Q: In the Beer and Diapers example, what was the surprising finding?
A: Data analysis revealed that young American males who buy diapers on Friday afternoons are also likely to buy beer—an unexpected correlation.
Q: What type of analysis was used in the Amazon Recommender System example?
A: Association rule mining to suggest items based on current product views, improving recommendations.
Q: What is the significance of setting thresholds in association rule mining?
A: Thresholds for support, confidence, and lift help filter out weaker rules, ensuring only strong and actionable rules are considered.
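The filtering idea can be sketched as a brute-force search that keeps only rules clearing all three thresholds. This is not an optimized algorithm such as Apriori; the dataset, the `mine_rules` function, and the threshold values are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Hypothetical five-transaction dataset, invented for illustration.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

def count(itemset, transactions):
    return sum(1 for t in transactions if itemset <= t)

def mine_rules(transactions, min_support, min_confidence, min_lift, max_size=3):
    """Enumerate every rule X -> Y over itemsets up to `max_size` items
    and keep those that meet all three thresholds."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            both = count(set(combo), transactions)
            # Skip itemsets that never occur or are too rare.
            if both == 0 or both / n < min_support:
                continue
            # Split the itemset into every antecedent/consequent pair.
            for k in range(1, size):
                for lhs in combinations(combo, k):
                    x = set(lhs)
                    y = set(combo) - x
                    conf = both / count(x, transactions)
                    lift = conf / (count(y, transactions) / n)
                    if conf >= min_confidence and lift >= min_lift:
                        rules.append((sorted(x), sorted(y), both / n, conf, lift))
    return rules

rules = mine_rules(transactions, min_support=0.4, min_confidence=0.6, min_lift=1.0)
for x, y, s, c, l in rules:
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

With these thresholds, {Bread} → {Milk} is dropped despite a confidence of 0.75, because its lift falls below 1.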
Q: Why should you not blindly follow the numbers in association rule mining?
A: Numbers alone may not capture the context or relevance of associations, so interpretations should be validated with domain knowledge.
Q: What does a Lift of less than 1 indicate about the association between two items?
A: It suggests that the items are less likely to occur together than expected by chance, indicating a negative relationship.
Q: What are the three main metrics in association rule mining?
A: Support, Confidence, and Lift.
Q: Can you have high confidence but low lift? Why?
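A: Yes. If the consequent Y is already common across all transactions, confidence will be high for almost any antecedent, so high confidence alone does not imply a strong association. Dividing by s(Y) then gives a lift near or below 1.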
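A quick numeric sketch shows how this happens. The counts below are invented purely for illustration (a store where nearly every basket contains Bread); none of them come from the cards.

```python
# Hypothetical counts, invented for illustration only.
n_transactions = 100
n_bread = 90    # baskets containing Bread, a very common item
n_butter = 10   # baskets containing Butter
n_both = 8      # baskets containing both Butter and Bread

# Rule: {Butter} -> {Bread}
confidence = n_both / n_butter                           # 8 / 10
lift = (n_both * n_transactions) / (n_butter * n_bread)  # 800 / 900

print(confidence)      # 0.8
print(round(lift, 2))  # 0.89
```

A confidence of 0.8 looks strong, but Bread already appears in 90% of baskets, so buying Butter makes Bread slightly less likely than the baseline.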
Q: How can association rule mining be applied outside of retail?
A: It can be used to analyze service usage, for example evaluating the effect of Netflix subscriptions on Cable TV subscriptions, where a lift below 1 points to a negative association.
Q: What does it mean if 40% of transactions contain an itemset {Milk, Diapers, Beer}?
A: The support for that itemset is 0.4, indicating a moderate frequency of occurrence in the data.