Vocabulary flashcards covering key concepts in recommender systems, matrix factorization, feedback types, distance metrics, and community detection in graphs.
User–Item Rating Matrix
A matrix where each row is a user, each column is an item, and entries contain ratings when available.
Missing Values
Unobserved user–item interactions represented as empty cells in the rating matrix.
Sparsity
The condition where most user–item ratings are missing, leading to a sparse matrix.
Impact of Sparsity
Makes similarity estimates noisy because users share few co-rated items.
Long-Tail Distribution
Popular items form a small 'head,' while many niche items form a large 'tail.'
User–User Collaborative Filtering
Predict ratings using users who have similar rating patterns.
Item–Item Collaborative Filtering
Predict ratings using items that are similar to those a user has rated.
Neighborhood
A set of users or items deemed similar based on similarity metrics.
Cold-Start Problem
Difficulty recommending items or users with insufficient historical data.
Latent Factors
Low-dimensional vectors representing hidden traits of users and items.
Global Mean
Overall average rating across all user–item pairs.
User Bias
Tendency of a user to rate higher or lower than average.
Item Bias
Tendency of an item to receive higher or lower ratings than average.
Latent Space
The learned embedding space where users and items are represented as vectors.
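The cards above (global mean, user bias, item bias, latent factors) combine into the standard biased matrix-factorization prediction, r̂(u, i) = μ + b_u + b_i + p_u · q_i. A minimal sketch with small random factors (all dimensions and values here are hypothetical toy choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): 4 users, 5 items, 2 latent factors.
n_users, n_items, k = 4, 5, 2

mu = 3.5                              # global mean rating
b_u = rng.normal(0, 0.1, n_users)     # user biases
b_i = rng.normal(0, 0.1, n_items)     # item biases
P = rng.normal(0, 0.1, (n_users, k))  # user latent factor vectors
Q = rng.normal(0, 0.1, (n_items, k))  # item latent factor vectors

def predict(u, i):
    """Biased MF prediction: mu + b_u + b_i + p_u . q_i."""
    return mu + b_u[u] + b_i[i] + P[u] @ Q[i]

print(predict(0, 2))
```

With small biases and factors, predictions stay near the global mean; training would fit b_u, b_i, P, and Q to the observed ratings.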
RMSE
Root mean squared error: the square root of the average squared deviation between predicted and true ratings.
Precision@k
Fraction of top-k recommended items that are relevant.
Recall@k
Fraction of relevant items that appear in the top-k recommendations.
Hit-Rate
Fraction of users for whom at least one relevant item appears in the top-k recommendations.
NDCG@k
Ranking metric that assigns higher weight to correctly ranked relevant items near the top.
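The ranking metrics above can be sketched directly from their definitions. A minimal implementation (the item lists here are hypothetical examples; NDCG uses the common 1/log2(rank+2) discount for binary relevance):

```python
import math

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for it in recommended[:k] if it in relevant) / k

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    return sum(1 for it in recommended[:k] if it in relevant) / len(relevant)

def ndcg_at_k(recommended, relevant, k):
    """DCG with 1/log2(rank+2) gains, normalized by the ideal ranking."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, it in enumerate(recommended[:k]) if it in relevant)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0

recs = ["a", "b", "c", "d"]   # hypothetical ranked recommendations
rel = {"b", "d", "e"}         # hypothetical relevant set
print(precision_at_k(recs, rel, 4))  # 2 of 4 recommended are relevant -> 0.5
```

Note how NDCG, unlike precision, rewards placing hits nearer the top of the list.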
Rating Prediction Task
Predicting explicit numerical ratings.
Ranking Task
Ordering items by predicted relevance rather than predicting exact rating values.
Explicit Feedback
Direct user-provided ratings or evaluations.
Implicit Feedback
Behavioral signals such as clicks, views, or watch time.
Noisy Feedback
Implicit signals that do not directly reflect true preference strength.
Exposure Bias
Observed behavior depends on what users were shown, not all available items.
Position Bias
Higher-ranked items receive more attention regardless of true relevance.
Missing-Not-Negative
Absence of interaction is not the same as disliking an item.
Similarity Measure
Quantifies how similar two users or items are (e.g., cosine similarity).
Cosine Similarity
Measures angle-based similarity between vectors.
Euclidean Distance
Measures straight-line distance between vectors.
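A minimal sketch contrasting the two measures on hypothetical rating vectors, showing why cosine similarity (angle-based, magnitude-invariant) can disagree with Euclidean distance:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

u = [4, 5, 1]   # hypothetical ratings by one user on three co-rated items
v = [8, 10, 2]  # a second user rating the same items twice as high

# Cosine ignores magnitude: parallel vectors score 1.0.
print(cosine_similarity(u, v))
# Euclidean distance treats the same pair as far apart.
print(euclidean_distance(u, v))
```

This is why cosine similarity is a common default for user–user and item–item collaborative filtering, where overall rating scale differs between users.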
Curse of Dimensionality
Distance metrics become less meaningful in high-dimensional spaces.
Feature Scaling
Adjusting feature magnitudes to ensure equal influence in distance-based models.
k Value (k-NN)
Number of neighbors; low k risks high variance, high k risks high bias.
Linear Separability
Existence of a linear boundary that perfectly separates classes.
Perceptron Convergence
The perceptron algorithm is guaranteed to converge in finitely many updates if and only if the data is linearly separable.
Decision Boundary
A hyperplane that divides classes.
Order Dependence
Perceptron updates depend on the sequence of training examples.
Margin
Distance between the decision boundary and the nearest data points.
Support Vectors
Data points that lie on or near the margin and define the decision boundary.
Soft-Margin SVM
Allows some misclassification to improve generalization.
Kernel Trick
Method for learning non-linear boundaries by computing inner products in a high-dimensional feature space without explicitly mapping the data.
RBF Kernel
A popular kernel whose similarity decays exponentially with the squared Euclidean distance between inputs.
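The RBF kernel is commonly written k(x, y) = exp(−γ‖x − y‖²). A minimal sketch (the points and the γ value are hypothetical):

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: similarity decays exponentially with squared distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Identical points have maximal similarity 1.0; distant points approach 0.
print(rbf_kernel([0.0, 0.0], [0.0, 0.0]))  # 1.0
print(rbf_kernel([0.0, 0.0], [3.0, 4.0]))  # exp(-0.5 * 25), near zero
```

Larger γ makes the similarity fall off faster, giving a more local (and more overfitting-prone) decision boundary.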
Overfitting
Model fits training data too closely and performs poorly on unseen data.
Underfitting
Model is too simple and fails to capture important patterns.
Train/Validation/Test Split
Partitioning data to train, tune, and evaluate a model.
Cross-Validation
Repeated training/testing on multiple splits for more reliable evaluation.
Regularization
Penalizing model complexity to reduce overfitting.
L2 Regularization
Penalizes large parameter values via squared magnitude.
Community
Group of nodes densely connected internally and sparsely connected externally.
Modularity
Metric comparing the density of within-community edges against the density expected in a random graph with the same node degrees.
Modularity Resolution Limit
Modularity may fail to detect small but real communities.
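Modularity for a partition can be computed as Q = Σ_c [e_c/m − (d_c/2m)²], where e_c counts edges inside community c, d_c sums the degrees of its nodes, and m is the total edge count. A minimal sketch on a hypothetical toy graph of two triangles joined by one bridge edge:

```python
from collections import Counter

def modularity(edges, community):
    """Newman modularity: Q = sum_c [ e_c/m - (d_c / 2m)^2 ]."""
    m = len(edges)
    internal = Counter()   # e_c: edges with both endpoints in community c
    degree = Counter()     # d_c: total degree of nodes in community c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2
               for c in degree)

# Two triangles (0,1,2) and (3,4,5) joined by the bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}  # the triangles
bad = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B"}   # a random split

print(round(modularity(edges, good), 3))  # 0.357
print(round(modularity(edges, bad), 3))   # -0.214
```

The natural two-triangle partition scores much higher than the arbitrary split, which is exactly the signal Girvan–Newman-style methods optimize when choosing where to cut the dendrogram.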
Edge Betweenness
Number of shortest paths that pass through an edge.
Girvan–Newman Algorithm
Detects communities by iteratively removing edges with highest betweenness.
Bridge Edge
An edge whose removal disconnects parts of the network.
Bottleneck
Node or edge that many shortest paths depend on.
Affiliation Graph Model
Model where nodes belong to multiple communities and connect based on shared memberships.
Overlapping Communities
Communities where nodes can have more than one membership.
Overlap Region
Area where nodes with multiple affiliations show higher connectivity.
Connection Probability
The probability that two nodes form an edge; in the affiliation graph model it increases with the number of communities they share.