Association Rule Mining

20 Terms

1

Q: What is Association Rule Mining?

A: A data mining technique used to find associations or relationships between items in a dataset, often applied to market-basket analysis or recommender systems.

2

Q: What is a Market-Basket Transaction?

A: A data set representing multiple transactions, each containing a set of items bought together. The goal is to find associations between these items.

3

Q: What is an Itemset?

A: A group of items of interest within a dataset. For example, an itemset might be {Milk, Diapers, Beer}.

4

Q: Define an Association Rule.

A: A rule that suggests a relationship between itemsets, represented as X → Y, where X is the antecedent and Y is the consequent. Example: {Diapers} → {Beer}.

5

Q: What does Support Count (σ) measure?

A: The number of transactions that include a specific itemset.

6

Q: How do you calculate Support (s)?

A: Support is the fraction of transactions that contain the itemset X:

s(X) = σ(X) / N, where σ(X) is the support count of X and N is the total number of transactions.
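
As a minimal sketch of the two definitions above, the snippet below computes support count and support in plain Python. The five transactions are a made-up toy dataset (an assumption for illustration, not data from these cards), and the function names `support_count` and `support` are hypothetical.

```python
# Toy market-basket data: an illustrative assumption, not taken from the cards.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]


def support_count(itemset, transactions):
    """Support count σ(X): number of transactions containing every item in X."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Support s(X): support count divided by the total number of transactions."""
    return support_count(itemset, transactions) / len(transactions)


print(support_count({"Milk", "Diapers", "Beer"}, transactions))  # 2
print(support({"Milk", "Diapers", "Beer"}, transactions))        # 0.4
```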

7

Q: What does Confidence (c) measure?

A: The strength of an association rule: how often the items in Y appear in transactions that contain X, i.e. c(X → Y) = σ(X ∪ Y) / σ(X). It ranges from 0 to 1.

8

Q: What does a Confidence of 0.67 indicate for the rule {Milk, Diapers} → {Beer}?

A: It means that 67% of the time when a transaction includes Milk and Diapers, it also includes Beer.
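
One way to see where a figure like 0.67 can come from is the sketch below. It assumes a made-up five-transaction dataset (not data from these cards) in which three baskets contain Milk and Diapers and two of those also contain Beer; the helper names are hypothetical.

```python
# Toy market-basket data: an illustrative assumption, not taken from the cards.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]


def support_count(itemset, transactions):
    """Number of transactions containing every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def confidence(antecedent, consequent, transactions):
    """Confidence of X -> Y: σ(X ∪ Y) / σ(X)."""
    x = set(antecedent)
    xy = x | set(consequent)
    return support_count(xy, transactions) / support_count(x, transactions)


c = confidence({"Milk", "Diapers"}, {"Beer"}, transactions)
print(round(c, 2))  # 0.67
```

Note that this sketch divides by σ(X) without a guard, so it assumes the antecedent occurs in at least one transaction.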

9

Q: What is Lift in association rule mining?

A: A metric that measures how much more often X and Y occur together than would be expected if they were independent: lift(X → Y) = confidence(X → Y) / support(Y).

10

Q: What do Lift values greater than 1 and less than 1 mean?

A: Lift > 1 suggests a stronger-than-random association, while Lift < 1 indicates a weaker or negative relationship. A Lift of about 1 suggests the itemsets are roughly independent.
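
The interpretation above can be checked numerically with a small sketch. It assumes lift is computed as confidence(X → Y) / support(Y), and it uses a made-up five-transaction dataset (not data from these cards); the function names are hypothetical.

```python
# Toy market-basket data: an illustrative assumption, not taken from the cards.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]


def support(itemset, transactions):
    """Fraction of transactions containing every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def lift(antecedent, consequent, transactions):
    """Lift of X -> Y: confidence(X -> Y) / support(Y)."""
    x, y = set(antecedent), set(consequent)
    conf = support(x | y, transactions) / support(x, transactions)
    return conf / support(y, transactions)


value = lift({"Milk", "Diapers"}, {"Beer"}, transactions)
print(round(value, 2))  # 1.11, slightly above 1: a mild positive association
```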

11

Q: What does a Lift greater than 1 imply?

A: It indicates that X and Y occur together more often than expected by chance, showing a positive association.

12

Q: In the Beer and Diapers example, what was the surprising finding?

A: Data analysis revealed that young American males who buy diapers on Friday afternoons are also likely to buy beer—an unexpected correlation.

13

Q: What type of analysis was used in the Amazon Recommender System example?

A: Association rule mining, which suggests items based on the products a customer is currently viewing in order to improve recommendations.

14

Q: What is the significance of setting thresholds in association rule mining?

A: Thresholds for support, confidence, and lift help filter out weaker rules, ensuring only strong and actionable rules are considered.
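
A rough sketch of threshold filtering follows. It brute-forces only single-item rules (one item → one item) over a made-up five-transaction dataset, which is a simplification; practical tools use dedicated algorithms such as Apriori. The threshold values and all names here are assumptions chosen for illustration.

```python
from itertools import permutations

# Toy market-basket data: an illustrative assumption, not taken from the cards.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

# Hypothetical thresholds; sensible values depend on the data and the domain.
MIN_SUPPORT = 0.4
MIN_CONFIDENCE = 0.7
MIN_LIFT = 1.0  # keep only rules with lift strictly above this


def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


items = sorted(set().union(*transactions))
strong_rules = []
for x, y in permutations(items, 2):      # every ordered pair of distinct items
    s_xy = support({x, y})
    if s_xy < MIN_SUPPORT:
        continue                         # too rare to be actionable
    conf = s_xy / support({x})
    rule_lift = conf / support({y})
    if conf >= MIN_CONFIDENCE and rule_lift > MIN_LIFT:
        strong_rules.append((x, y, s_xy, conf, rule_lift))

for x, y, s, c, l in strong_rules:
    print(f"{{{x}}} -> {{{y}}}  support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

On this toy data, rules such as {Bread} → {Milk} clear the support and confidence thresholds but are dropped by the lift threshold, because Milk is already present in most baskets.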

15

Q: Why should you not blindly follow the numbers in association rule mining?

A: Numbers alone may not capture the context or relevance of associations, so interpretations should be validated with domain knowledge.

16

Q: What does a Lift of less than 1 indicate about the association between two items?

A: It suggests that the items are less likely to occur together than expected by chance, indicating a negative relationship.

17

Q: What are the three main metrics in association rule mining?

A: Support, Confidence, and Lift.

18

Q: Can you have high confidence but low lift? Why?

A: Yes. If the consequent Y is already very common, confidence can be high simply because Y appears in most transactions anyway. Lift divides confidence by the support of Y, so it stays near or below 1 in that case.
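
A small worked example may make this concrete. The counts below are invented for illustration (they do not come from these cards): Bread appears in 9 of 10 baskets, so the rule {Jam} → {Bread} gets high confidence even though buying Jam makes Bread slightly less likely than usual.

```python
# Hypothetical baskets, invented for illustration only.
transactions = (
    [{"Jam", "Bread"}] * 4   # Jam and Bread together
    + [{"Jam"}] * 1          # Jam without Bread
    + [{"Bread"}] * 5        # Bread without Jam
)


def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


confidence = support({"Jam", "Bread"}) / support({"Jam"})  # 0.4 / 0.5
lift = confidence / support({"Bread"})                     # 0.8 / 0.9

print(round(confidence, 2))  # 0.8
print(round(lift, 2))        # 0.89
```

Confidence of 0.8 looks strong in isolation, but lift below 1 shows the rule does no better than the baseline popularity of Bread.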

19

Q: How can association rule mining be applied outside of retail?

A: In analyzing service usage, such as evaluating the effect of Netflix subscriptions on Cable TV, using metrics like lift to identify negative associations.

20

Q: What does it mean if 40% of transactions contain an itemset {Milk, Diapers, Beer}?

A: The support for that itemset is 0.4, indicating a moderate frequency of occurrence in the data.