Lecture Notes on Power Laws, Normal Distribution, Association Rules, and Boosting

Flashcards reviewing key concepts from lecture notes, including power laws, normal distribution, association rules, and boosting.

33 Terms

1

What is a Power Law?

A relationship of the form P(x) ∝ x^(−a), where a is the exponent that controls how quickly the probability drops as x grows.

2

Can you give examples of where Power Laws are commonly observed?

Virtual-world examples include money and wealth, sales, web traffic, YouTube views, word frequency, and clicks.

3

When does the asymmetry increase when considering incomes?

When there are more sales, more views, or more wealth.

4

What type of tail does a Power Law distribution have?

Heavy-tailed.

5

How do you produce a straight line when graphing a power law distribution?

Plotting the logarithm of frequency versus the logarithm of value (a log-log plot).
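
To illustrate why logarithmic axes straighten a power law, here is a minimal sketch. The exponent a = 2.0, the sample values, and the helper names are all assumptions made for illustration. Taking logs of P(x) = c * x^(−a) gives log P = log c − a * log x, so the fitted slope on log-log axes should come out close to −a.

```python
import math

def power_law(x, a, c=1.0):
    """Unnormalized power law c * x**(-a); c is an arbitrary scale factor."""
    return c * x ** (-a)

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

a = 2.0                              # hypothetical exponent
values = [1, 2, 4, 8, 16, 32, 64]    # hypothetical values
freqs = [power_law(x, a) for x in values]

# Take logs of BOTH axes; the points then lie on a line with slope -a.
log_x = [math.log10(x) for x in values]
log_f = [math.log10(f) for f in freqs]
loglog_slope = slope(log_x, log_f)
print(loglog_slope)
```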

6

What is the 80/20 rule?

80 percent of the effects come from 20 percent of the causes.

7

Could you provide examples of where Normal Distributions are commonly observed?

Physical world examples include weight, height, cholesterol, and blood counts.

8

How do observations relate to the mean in a Normal Distribution?

Observations cluster closely around the mean, and the odds of a deviation decline faster and faster as you move away from the average.

9

What type of tail does a Normal Distribution have?

Thin-tailed.

10

In a normal distribution, what do the mean and standard deviation represent?

The mean is the center of the bell curve, and the standard deviation controls the width.
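
As a rough illustration of how the standard deviation sets the width, the sketch below uses NormalDist from Python's standard statistics module. The height figures (mean 170 cm, standard deviations of 10 cm and 5 cm) are invented for the example, and prob_within is a hypothetical helper.

```python
from statistics import NormalDist

def prob_within(dist, k):
    """Probability of landing within k standard deviations of the mean."""
    lo = dist.mean - k * dist.stdev
    hi = dist.mean + k * dist.stdev
    return dist.cdf(hi) - dist.cdf(lo)

wide = NormalDist(mu=170, sigma=10)    # hypothetical heights in cm
narrow = NormalDist(mu=170, sigma=5)   # same center, half the width

within_1 = prob_within(wide, 1)   # about 0.683
within_2 = prob_within(wide, 2)   # about 0.954
within_3 = prob_within(wide, 3)   # about 0.997

# The probability within k standard deviations is the same for both curves;
# what changes is the interval: 160 to 180 cm versus 165 to 175 cm.
narrow_within_1 = prob_within(narrow, 1)
print(within_1, within_2, within_3, narrow_within_1)
```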

11

What is the purpose of Association Rules?

To discover relationships among variables in large databases.

12

What is an itemset?

A collection of one or more items.

13

What is a transaction in the context of association rules?

A set of items bought together (shopping basket).

14

What is a frequent itemset?

An itemset that appears in the dataset at least as often as a predefined minimum support threshold.

15

What does 'Support' measure in association rules?

The proportion of transactions in the dataset that contain a particular itemset.

16

What is the formula for Support?

Support(A → B) = P(A ∩ B), the fraction of transactions that contain every item in both A and B.
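
A small sketch of the support calculation, assuming transactions are stored as Python sets. The five shopping baskets and the function name are made up for illustration.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

# Support of the rule {diapers} -> {beer}: baskets holding both items.
rule_support = support({"diapers", "beer"}, transactions)
print(rule_support)  # 3 of 5 baskets
```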

17

What does 'Confidence' measure in association rules?

How often items in B appear in transactions that contain A.

18

What is the formula for Confidence?

Confidence(A → B) = P(B | A) = Support(A → B) / Support(A)
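
The sketch below works through the confidence formula on invented baskets. It divides raw counts rather than supports, which gives the same ratio because the number of transactions cancels. It also shows that confidence is directional: A → B and B → A generally differ. All names and data are hypothetical.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from transaction counts."""
    base = count(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = count(set(antecedent) | set(consequent), transactions)
    return both / base

conf_diapers_beer = confidence({"diapers"}, {"beer"}, transactions)  # 3 / 4
conf_beer_diapers = confidence({"beer"}, {"diapers"}, transactions)  # 3 / 3
print(conf_diapers_beer, conf_beer_diapers)
```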

19

What does 'Lift' measure in association rules?

A measure of how much more likely B is to occur when A has occurred, compared to B occurring independently.

20

What is the formula for Lift?

Lift(A → B) = Confidence(A → B) / Support(B)

21

What does a Lift greater than 1 indicate?

A and B occur together more often than would be expected if they were independent.

22

What does a Lift equal to 1 indicate?

A and B are independent.

23

What does a Lift less than 1 indicate?

Negative association: A and B occur together less often than would be expected if they were independent.
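
To make the three lift cases concrete, here is a sketch on eight invented baskets, constructed so that one rule lands above 1, one exactly at 1, and one below 1. Exact fractions are used to avoid rounding noise. The data and function names are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical baskets built to produce lifts of 2, 1, and 1/2.
transactions = [
    {"chips", "salsa", "milk"},
    {"chips", "salsa", "milk"},
    {"chips", "salsa"},
    {"chips", "tea"},
    {"milk", "tea"},
    {"milk", "tea"},
    {"tea"},
    {"bread"},
]

def support(itemset, transactions):
    """Exact fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def lift(antecedent, consequent, transactions):
    """Confidence(A -> B) divided by Support(B).

    Raises ZeroDivisionError if either side never occurs.
    """
    both = support(set(antecedent) | set(consequent), transactions)
    confidence = both / support(antecedent, transactions)
    return confidence / support(consequent, transactions)

lift_salsa = lift({"chips"}, {"salsa"}, transactions)  # 2: bought together more than expected
lift_milk = lift({"chips"}, {"milk"}, transactions)    # 1: independent
lift_tea = lift({"chips"}, {"tea"}, transactions)      # 1/2: bought together less than expected
print(lift_salsa, lift_milk, lift_tea)
```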

24

What is the purpose of the Apriori Algorithm?

To identify all frequent itemsets in a dataset.

25

What is pruning in the Apriori algorithm?

Skipping larger candidate itemsets when any of their subsets has already been found to be infrequent.

26

What threshold does the Apriori algorithm require?

A minimum support threshold.
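
The Apriori ideas in the last three cards (find all frequent itemsets, prune by infrequent subsets, apply a minimum support threshold) can be sketched roughly as follows. This is a simplified teaching version, not an optimized or canonical implementation. The baskets and the 0.6 threshold are invented.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join: merge frequent (k-1)-itemsets into k-item candidates.
        previous = list(current)
        candidates = {a | b for a, b in combinations(previous, 2) if len(a | b) == k}
        current = {}
        for cand in candidates:
            # Prune: skip the candidate if any (k-1)-subset is infrequent,
            # without ever counting it against the transactions.
            if any(frozenset(sub) not in frequent for sub in combinations(cand, k - 1)):
                continue
            s = support(cand)
            if s >= min_support:
                current[cand] = s
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(transactions, min_support=0.6)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

In this data {bread, beer} is infrequent, so the candidate {bread, diapers, beer} is discarded by the prune step before its support is ever counted.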

27

How do confidence and support relate?

The confidence of a rule can never be less than its support: Confidence(A → B) = Support(A → B) / Support(A), and Support(A) is at most 1.

28

What association rule term is equivalent to likelihood ratio?

Lift. Like a likelihood ratio, it compares how likely B is given A with how likely B is at baseline.

29

What do Bayesian rules do?

Update beliefs with new evidence.

30

What do association rules do?

Find co-occurrences.

31

What are prior odds?

The ratio of the probability of a hypothesis being true to it being false, before seeing evidence.
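
A short sketch of prior odds and how they update, using the odds form of Bayes' rule (posterior odds = prior odds × likelihood ratio). The prior of 1/5 and the likelihood ratio of 4 are invented numbers, and the helper names are hypothetical.

```python
from fractions import Fraction

def odds(p):
    """Convert a probability into odds in favor: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("probability must be in [0, 1)")
    return p / (1 - p)

def probability(odds_value):
    """Convert odds in favor back into a probability."""
    return odds_value / (1 + odds_value)

prior = Fraction(1, 5)           # hypothetical belief before any evidence
prior_odds = odds(prior)         # 1/4, i.e. 1 to 4 in favor

likelihood_ratio = Fraction(4)   # hypothetical: evidence is 4x as likely if the hypothesis is true
posterior_odds = prior_odds * likelihood_ratio
posterior = probability(posterior_odds)
print(prior_odds, posterior_odds, posterior)
```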

32

What is boosting?

A machine learning technique that combines many simple models into one powerful model.

33

Does boosting rely on random sampling?

No. It uses the full dataset rather than random samples.
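
As one concrete illustration, the sketch below implements a bare-bones AdaBoost with one-dimensional threshold "stumps" as the simple models. AdaBoost is only one boosting algorithm and is chosen here as an assumption; the notes do not say which variant the lecture covered. Every round trains on the full dataset, and the per-example weights (not random sampling) steer each new stump toward earlier mistakes. The toy data is invented so that no single stump can classify it.

```python
import math

def stump_predict(x, threshold, polarity):
    """A stump outputs polarity at or below the threshold, -polarity above."""
    return polarity if x <= threshold else -polarity

def train_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, rounds):
    """Return a list of (alpha, threshold, polarity) weighted stumps."""
    n = len(xs)
    weights = [1 / n] * n            # every example starts equally important
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # more accurate stumps get a bigger say
        ensemble.append((alpha, threshold, polarity))
        # Increase weights of misclassified examples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: labels follow + + - - + +, which no single stump can fit.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]

one_stump = adaboost(xs, ys, rounds=1)
three_stumps = adaboost(xs, ys, rounds=3)
single_preds = [predict(one_stump, x) for x in xs]
boosted_preds = [predict(three_stumps, x) for x in xs]
print(single_preds)
print(boosted_preds)
```

On this toy data a single stump misclassifies two of the six points, while the weighted vote of three stumps classifies all six correctly.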