Bias & Feedback Loops in Music Recommendations

Introduction
Objective: Examine how record labels influence music recommendations and how they affect recommender systems. Work in this area has traditionally focused on:

Artist name
Track title
Album title
User ID
Listening context (for example, what was played)

Recently, there has been growing interest in factors such as popularity and gender, but record labels need attention too.

Multi-Stage Web Crawling: Collect record label info for albums and link each label to one of the three major companies (Universal, Sony, Warner) or classify it as Independent.
Datasets for Analysis:
- Spotify Million Playlist Dataset: 1 million playlists from US users.
- LFM-2b Dataset: Listening profiles of Last.fm users.

The label-enriched datasets help reveal characteristics and biases in music recommendations.
Feedback Loop Simulation: Explore how recommendations might change the distribution of record labels.

Major vs. Independent labels:
- Major labels (like Sony, Universal, Warner) heavily influence streaming platforms.
- Independent labels have a tougher time competing with major labels.
- Recommendations affect which music gets noticed and played more.

Step 1: Preprocessing - Gather basic record label info using the Spotify API.
Step 2: Mapping Trivial Cases - Identify clear matches (like Universal Group).
Step 3: Label Crawling from Discogs - Collect structured metadata through an API.
Step 4: Label Crawling from Wikipedia - Get unstructured label info (like parent company).
Step 5: Interim Mapping - Use the newly crawled info to map additional labels.
Step 6: Copyright Classification - Check copyright data for accuracy.
Step 7: Final Mapping - Classify remaining unknown labels as Independent.
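
The cascade in Steps 2 to 7 can be sketched roughly as follows. This is a minimal illustration, not the pipeline itself: the keyword table `MAJOR_KEYWORDS`, the `parent_of` dictionary (standing in for parent-company links crawled from Discogs and Wikipedia), and the function names are all hypothetical, and real label names need far more careful matching than a single keyword.

```python
import re
from typing import Dict, Optional

# Hypothetical keyword table (assumption): one keyword per major.
MAJOR_KEYWORDS = {
    "Universal": ["universal"],
    "Sony": ["sony"],
    "Warner": ["warner"],
}

def normalize(name: str) -> str:
    """Lowercase and replace punctuation with spaces so variants compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()

def map_trivial(text: str) -> Optional[str]:
    """Step 2: return a major if one of its keywords is a token of the text."""
    tokens = normalize(text).split()
    for major, keywords in MAJOR_KEYWORDS.items():
        if any(k in tokens for k in keywords):
            return major
    return None

def classify(label: str, parent_of: Dict[str, str], copyright_text: str = "") -> str:
    """Trivial match, then parent links, then copyright text, then Independent."""
    seen = set()
    current: Optional[str] = label
    # Steps 2-5: follow crawled parent links until a major name shows up.
    # `seen` guards against cycles in the crawled data.
    while current is not None and normalize(current) not in seen:
        seen.add(normalize(current))
        major = map_trivial(current)
        if major:
            return major
        current = parent_of.get(normalize(current))
    # Step 6: fall back to the copyright notice.
    major = map_trivial(copyright_text)
    if major:
        return major
    # Step 7: anything still unknown is treated as Independent.
    return "Independent"
```

For example, with `parent_of = {"interscope records": "Universal Music Group"}`, calling `classify("Interscope Records", parent_of)` follows the parent link and returns `"Universal"`, while a label with no match anywhere falls through to `"Independent"`.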

Calculate the Simpson index to measure track diversity in playlists.
Findings:
- Most playlists show high diversity.
- Trends indicate that lower diversity correlates with greater prominence of major labels.
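
As a rough illustration of the Simpson index used above: for category proportions p_i, the index is the sum of p_i squared, and 1 minus that sum is a common diversity score (higher means more diverse). Which variant the analysis uses is not stated here, so the sketch below is one assumption; the function names are hypothetical, and the input can be any list of categories per playlist entry (track IDs, or the label category of each track).

```python
from collections import Counter

def simpson_index(items):
    """Probability that two draws (with replacement) fall in the same category."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute the index of an empty playlist")
    return sum((c / total) ** 2 for c in counts.values())

def simpson_diversity(items):
    """1 - Simpson index: 0 for a single category, approaching 1 for many."""
    return 1.0 - simpson_index(items)

# Toy playlist described by the label category of each track.
playlist = ["Universal", "Universal", "Sony", "Independent"]
print(simpson_diversity(playlist))  # proportions 0.5, 0.25, 0.25 -> 0.625
```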

Simulation of Recommendations:
- Use Alternating Least Squares (ALS) for collaborative filtering recommendations.
- Simulate top recommendations and user interactions.
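
A minimal sketch of such a feedback loop is shown below, using NumPy. It rests on several assumptions that are not taken from the study: ALS is implemented as plain regularized least squares on a dense 0/1 matrix (not the confidence-weighted implicit-feedback variant), users accept each recommendation with a fixed probability `accept_prob`, and all function and parameter names are hypothetical.

```python
import numpy as np

def als(R, factors=8, reg=0.1, iters=10, seed=0):
    """Unweighted ALS on a dense 0/1 matrix R (users x items)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, factors))
    V = rng.normal(scale=0.1, size=(n_items, factors))
    ridge = reg * np.eye(factors)
    for _ in range(iters):
        # Fix V and solve the ridge regression for all users at once.
        U = np.linalg.solve(V.T @ V + ridge, V.T @ R.T).T
        # Fix U and solve for all items.
        V = np.linalg.solve(U.T @ U + ridge, U.T @ R).T
    return U, V

def simulate(R, item_label, rounds=5, top_k=3, accept_prob=0.5, seed=0):
    """Retrain, recommend, accept at random, and track label shares per round."""
    rng = np.random.default_rng(seed)
    R = R.astype(float).copy()
    item_label = np.asarray(item_label)
    labels = sorted(set(item_label.tolist()))
    history = []
    for _ in range(rounds):
        U, V = als(R, seed=seed)
        scores = U @ V.T
        scores[R > 0] = -np.inf  # do not re-recommend consumed items
        top = np.argsort(-scores, axis=1)[:, :top_k]
        for u in range(R.shape[0]):
            for i in top[u]:
                # Assumed user model: accept with fixed probability.
                if np.isfinite(scores[u, i]) and rng.random() < accept_prob:
                    R[u, i] = 1.0
        # Share of all interactions that belongs to each label category.
        per_item = R.sum(axis=0)
        history.append(
            {l: float(per_item[item_label == l].sum() / per_item.sum()) for l in labels}
        )
    return history
```

Calling `simulate(R, item_label)` with a user-item matrix `R` and one label category per item returns one dictionary of label shares per round; comparing the first and last entries shows whether the loop drifts toward the major labels in a given dataset.
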
Results:
- MPD showed stable distributions without clear feedback loops.
- LFM-2b showed major labels becoming over-represented over iterations, even when independent labels were strong initially.

Initial findings show record labels have a complicated role in recommendation biases.
More research is needed to understand how popularity biases influence recommendations and diversity.
Assessing fairness in music recommendations remains an important long-term goal.