Q1: What libraries are imported at the start of the project?
A1: pandas, numpy, matplotlib.pyplot, TfidfVectorizer, cosine_similarity, linear_kernel.
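Q2: Why is pd.set_option("display.max_colwidth", 200) used?
A2: To raise the column display width to 200 characters, so long text columns like overviews or taglines are not cut off at the much shorter default width (text beyond 200 characters is still truncated).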
Q3: How is the dataset loaded efficiently?
A3: By selecting only the needed columns (name, overview, tagline, up to 8 genres, and numeric stats).
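
A minimal sketch of column-selective loading, assuming the data is a CSV file and that the genre columns are named genres[0].name through genres[7].name (the cards only say "up to 8 genres"). The inline CSV text and the homepage column are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file; only some genre columns exist here.
csv_text = (
    "id,name,overview,tagline,genres[0].name,vote_average,homepage\n"
    "1,Show A,A detective story.,,Crime,7.5,http://example.com\n"
)

# Assumed naming scheme for the (up to) eight genre columns.
genre_cols = [f"genres[{i}].name" for i in range(8)]
wanted = {
    "name", "overview", "tagline",
    "vote_average", "vote_count", "popularity",
    "number_of_episodes", "number_of_seasons",
    *genre_cols,
}

# A callable usecols keeps only wanted columns and does not fail
# when some of them are absent from the file.
df = pd.read_csv(io.StringIO(csv_text), usecols=lambda c: c in wanted)
```

Passing a callable rather than a fixed list is a design choice for this sketch: a list would raise an error if any listed column were missing.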
Q4: What does fillna("") do during preprocessing?
A4: Replaces missing values with empty strings for safe concatenation.
Q5: How are genres combined into one field?
A5: By joining all genres[i].name columns into a single string column called genres_combined.
Q6: What is the purpose of the content column?
A6: It concatenates overview + tagline + genres into one text field for text analysis.
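
The preprocessing in A4 to A6 could look roughly like the sketch below. The toy rows, the space separator, and the use of only two genre columns are assumptions for illustration; the cards do not state how the original code joins the pieces.

```python
import pandas as pd

# Toy frame standing in for the real dataset (hypothetical rows).
df = pd.DataFrame({
    "name": ["Show A", "Show B"],
    "overview": ["A detective story.", None],
    "tagline": [None, "Laugh out loud."],
    "genres[0].name": ["Crime", "Comedy"],
    "genres[1].name": ["Drama", None],
})

genre_cols = [c for c in df.columns if c.startswith("genres[")]
text_cols = ["overview", "tagline"] + genre_cols

# Replace missing values with empty strings so concatenation never yields NaN.
df[text_cols] = df[text_cols].fillna("")

# Join the non-empty genre names of each row into one space-separated string.
df["genres_combined"] = df[genre_cols].apply(
    lambda row: " ".join(g for g in row if g), axis=1
)

# Concatenate overview, tagline, and genres into a single text field.
df["content"] = (
    df["overview"] + " " + df["tagline"] + " " + df["genres_combined"]
).str.strip()
```

Without the fillna step, adding a string to a missing value would produce a missing result, and the whole content cell would be lost for that row.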
Q7: How is genre distribution analyzed?
A7: By tokenizing genres_combined, counting occurrences, and plotting the top 20 genres.
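
One way to do the counting step, sketched with the standard library. The sample strings and the helper name top_genres are hypothetical, and whitespace tokenizing is an assumption (it would split multi-word genre names).

```python
from collections import Counter

# Hypothetical genres_combined values, one string per show.
genres_combined = ["Crime Drama", "Comedy", "Drama Mystery", ""]

def top_genres(rows, n=20):
    """Split each row on whitespace and return the n most common tokens."""
    counts = Counter()
    for row in rows:
        counts.update(row.split())
    return counts.most_common(n)

top = top_genres(genres_combined)
```

The resulting (genre, count) pairs can be fed directly to a bar chart.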
Q8: Which numeric features are used for correlation analysis?
A8: vote_average, vote_count, popularity, number_of_episodes, number_of_seasons.
Q9: How is the correlation heatmap created?
A9: By converting numeric columns to numbers, computing .corr(), and visualizing with a heatmap.
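
A rough sketch of those three steps. The sample values are invented, and drawing the heatmap with matplotlib's imshow is an assumption: the cards do not say which plotting call the project uses.

```python
import pandas as pd
import matplotlib.pyplot as plt

numeric_cols = [
    "vote_average", "vote_count", "popularity",
    "number_of_episodes", "number_of_seasons",
]

# Hypothetical rows; vote_average is stored as text with one bad value.
df = pd.DataFrame({
    "vote_average": ["7.5", "8.1", "6.0", "n/a"],
    "vote_count": [100, 250, 40, 10],
    "popularity": [12.0, 30.5, 5.2, 1.1],
    "number_of_episodes": [20, 62, 8, 10],
    "number_of_seasons": [2, 5, 1, 1],
})

# Convert to numbers; values that cannot be parsed become NaN.
numeric = df[numeric_cols].apply(pd.to_numeric, errors="coerce")

# Pairwise Pearson correlation (NaN entries are skipped pair by pair).
corr = numeric.corr()

# Draw the matrix as a colour grid with labelled axes.
fig, ax = plt.subplots(figsize=(6, 5))
im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(numeric_cols)))
ax.set_xticklabels(numeric_cols, rotation=45, ha="right")
ax.set_yticks(range(len(numeric_cols)))
ax.set_yticklabels(numeric_cols)
fig.colorbar(im, ax=ax, label="Pearson correlation")
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) to view the figure.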
Q10: What does TfidfVectorizer(stop_words="english") do?
A10: Converts text into numerical vectors based on word importance, removing common stop words.
Q11: What is stored in tfidf_matrix?
A11: The TF-IDF representation of all shows’ content text.
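
A small sketch of building tfidf_matrix and a pairwise similarity matrix from it. The three content strings are invented, and the name cosine_sim is an assumption; the cards list linear_kernel among the imports but do not show how it is called.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Hypothetical content strings, one per show.
content = [
    "a detective solves crimes in the city crime drama",
    "a detective hunts a killer crime mystery",
    "friends share an apartment comedy",
]

# One row per show, one column per vocabulary term (sparse matrix).
tfidf = TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(content)

# TfidfVectorizer L2-normalises rows by default, so the plain dot product
# computed by linear_kernel equals the cosine similarity.
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)
```

Here the first two shows share the terms "detective" and "crime", so they score higher with each other than either does with the third.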
Q12: Why is an indices mapping created?
A12: To quickly map show names to their row index in the dataframe.
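
A possible shape for that mapping, assuming it is a pandas Series keyed by name. Keeping only the first row for a duplicated name is a choice made for this sketch, not something the cards specify.

```python
import pandas as pd

# Hypothetical frame with one duplicated show name.
df = pd.DataFrame({"name": ["Show A", "Show B", "Show A"]})

# Series whose index is the show name and whose values are row positions.
indices = pd.Series(df.index, index=df["name"])

# Keep only the first row for each name so a lookup returns a single integer.
indices = indices[~indices.index.duplicated(keep="first")]

row = indices["Show B"]
```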
Q13: How does the basic recommend function work?
A13:
Q14: What improvement is added in the second recommend function?
A14: A fuzzy search fallback that suggests close matches if the title is not found.
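
The cards do not spell out the function body, so the following is only a sketch of how a lookup with a fuzzy fallback might work. The titles, the similarity values, the return format, and the use of difflib.get_close_matches are all assumptions for illustration.

```python
import difflib

# Hypothetical titles and precomputed cosine similarities (row i vs. column j).
names = ["Dark Harbor", "Harbor Patrol", "Sunny Flat"]
indices = {name: i for i, name in enumerate(names)}
cosine_sim = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

def recommend(title, top_n=2):
    """Return the most similar shows, or close title matches if not found."""
    if title not in indices:
        # Fuzzy fallback: suggest known titles that resemble the query.
        suggestions = difflib.get_close_matches(title, names, n=3, cutoff=0.6)
        return {"found": False, "suggestions": suggestions}
    idx = indices[title]
    # Rank every show by its similarity to the requested one.
    scores = sorted(enumerate(cosine_sim[idx]), key=lambda p: p[1], reverse=True)
    # Drop the show itself and keep the top_n remaining entries.
    top = [(names[i], s) for i, s in scores if i != idx][:top_n]
    return {"found": True, "recommendations": top}
```

Calling recommend("Dark Harbr") takes the fallback branch and suggests the correctly spelled title instead of raising an error.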
Q15: How are recommendations visualized?
A15: With a horizontal bar chart showing the cosine similarity of the top similar shows.
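
A minimal plotting sketch for such a chart, using matplotlib's barh. The recommendation pairs and the chart title are made up for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical (title, cosine similarity) pairs, most similar first.
recs = [("Harbor Patrol", 0.8), ("Sunny Flat", 0.1)]
titles = [t for t, _ in recs]
scores = [s for _, s in recs]

fig, ax = plt.subplots(figsize=(6, 3))
bars = ax.barh(titles, scores)

# barh draws the first item at the bottom; invert so the best match is on top.
ax.invert_yaxis()
ax.set_xlabel("Cosine similarity")
ax.set_title("Shows most similar to the query")
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) to view the figure.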