Lecture 15: Text analytics

17 Terms

1

Q: What are some applications of text mining?

Spam filters, search engine relevancy, social media summarization, essay grading, author attribution, AI-written news stories.

2

Q: What format is used for tidy text mining in R?

Tidy text format: a tibble with one token (for example, one word) per row.

3

Q: What function in R splits text into individual words?

unnest_tokens() from the tidytext package.

4

Q: What does unnest_tokens(word, text) do?

Splits the text column into one word per row, stored in a new column named word. By default it also lowercases the words and strips punctuation.
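
As a minimal sketch of this step, the snippet below tokenizes the two example sentences used later in this set. The data frame name text_df and the line column are assumptions made for illustration; only unnest_tokens(word, text) comes from the card.

```r
library(dplyr)     # provides tibble() and the %>% pipe
library(tidytext)  # provides unnest_tokens()

# Hypothetical two-line corpus: one row per line of text
text_df <- tibble(
  line = 1:2,
  text = c("I hate the dentist", "I love candy")
)

# One row per word; the line column is carried along, the text column is dropped
tidy_words <- text_df %>%
  unnest_tokens(word, text)

print(tidy_words)
```

Each of the seven words ends up on its own row, lowercased, next to the number of the line it came from.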

5

Q: What is a stopword in text mining?

A common word like “the” or “and” that is usually removed because it carries little meaning.

6

Q: How do you remove stopwords in R?

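Use anti_join(stop_words): dplyr's anti_join() with the stop_words dataset from tidytext keeps only the rows whose word is not in the stopword list.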
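
A short sketch of stopword removal on tokenized text, assuming the same hypothetical text_df as a starting point. The join is done on the word column, which is the column name that tidytext's stop_words dataset uses.

```r
library(dplyr)
library(tidytext)

# Hypothetical input, one row per line of text
text_df <- tibble(
  line = 1:2,
  text = c("I hate the dentist", "I love candy")
)

# Tokenize, then drop every row whose word appears in stop_words
content_words <- text_df %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word")

print(content_words)
```

Common words such as "i" and "the" are filtered out, leaving the words that carry the content.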

7

Q: What command counts word frequencies after removing stopwords?

count(word, sort = TRUE).
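
Put together with the earlier steps, a word-frequency pipeline might look like the sketch below. The three sentences are made up for illustration so that one word repeats.

```r
library(dplyr)
library(tidytext)

# Hypothetical corpus in which "candy" appears twice
text_df <- tibble(
  line = 1:3,
  text = c("I love candy", "Candy is sweet", "I hate the dentist")
)

word_counts <- text_df %>%
  unnest_tokens(word, text) %>%          # one word per row, lowercased
  anti_join(stop_words, by = "word") %>% # remove stopwords
  count(word, sort = TRUE)               # frequency table, most common first

print(word_counts)
```

The result has one row per distinct word and a column n holding its count, with the most frequent word in the first row.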

8

Q: What are the three major sentiment datasets mentioned?

AFINN, Bing, and NRC.

9

Q: What does the get_sentiments("afinn") function do?

Returns the AFINN lexicon as a table that maps each word to an integer sentiment score from -5 (most negative) to +5 (most positive).

10

Q: What does a negative value in AFINN sentiment scores indicate?

A negative or unpleasant sentiment.

11

Q: How do you compute average sentiment by line in R?

unnest_tokens() ➔ inner_join() with the sentiment lexicon ➔ group_by(line) ➔ summarize(mean(value)).
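
The pipeline above can be sketched as follows. To keep the example self-contained, it uses a tiny hand-made lexicon called toy_lexicon in place of get_sentiments("afinn"); that table and its scores are assumptions for illustration, chosen to have the same word and value columns as AFINN.

```r
library(dplyr)
library(tidytext)

# Hypothetical input, one row per line of text
text_df <- tibble(
  line = 1:2,
  text = c("I hate the dentist", "I love candy")
)

# Stand-in for get_sentiments("afinn"); scores are illustrative only
toy_lexicon <- tibble(
  word  = c("hate", "love"),
  value = c(-3, 3)
)

line_sentiment <- text_df %>%
  unnest_tokens(word, text) %>%               # one word per row
  inner_join(toy_lexicon, by = "word") %>%    # keep only words with a score
  group_by(line) %>%
  summarize(mean_sentiment = mean(value))     # average score per line

print(line_sentiment)
```

Because inner_join() keeps only the words found in the lexicon, a line with no scored words drops out of the result entirely rather than receiving a score of zero.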

12

Q: What is an example of text for sentiment analysis?

“I hate the dentist”, “I love candy”.

13

Q: What is the goal of clustering in text analytics?

Group data points without using labels.

14

Q: What clustering algorithm is mentioned?

K-means clustering.

15

Q: How does K-means clustering work?

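Start from K initial cluster centers, then repeat two steps until the assignments stop changing: assign each point to its nearest center, and recompute each center as the mean of the points assigned to it.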
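
To make the loop concrete, here is a hand-rolled sketch in base R on made-up 2-D data. The data, the choice of K = 2, and the fixed starting centers are all assumptions for illustration; in practice R's built-in kmeans() function would normally be used instead.

```r
set.seed(1)

# Hypothetical data: two well-separated blobs of 20 points each
pts <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2)
)

k <- 2
# Assumption: start from one point in each blob so no cluster ends up empty
centers <- pts[c(1, 21), , drop = FALSE]
assignment <- rep(0L, nrow(pts))

repeat {
  # Assignment step: squared distance from every point to every center
  d2 <- sapply(seq_len(k), function(j) rowSums(sweep(pts, 2, centers[j, ])^2))
  new_assignment <- max.col(-d2, ties.method = "first")  # index of nearest center

  # Stop once no point changes cluster
  if (all(new_assignment == assignment)) break
  assignment <- new_assignment

  # Update step: move each center to the mean of its assigned points
  for (j in seq_len(k)) {
    centers[j, ] <- colMeans(pts[assignment == j, , drop = FALSE])
  }
}

print(table(assignment))
```

On data this cleanly separated, the loop settles after a couple of passes with each blob in its own cluster.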

16

Q: What are strengths of K-means clustering?

Simple to compute and easy to explain.

17

Q: What are weaknesses of K-means clustering?

Requires choosing K beforehand and only finds convex-shaped clusters.
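
The first weakness can be illustrated with base R's kmeans() on made-up data. The three-blob dataset below is an assumption for illustration; the point is that the algorithm returns exactly as many clusters as it is asked for, whether or not that number matches the data.

```r
set.seed(1)

# Hypothetical data: three well-separated blobs of 20 points each
pts <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2)
)

# K is an input: the same data yields 2 clusters or 3, depending on the request
fit2 <- kmeans(pts, centers = 2, nstart = 10)
fit3 <- kmeans(pts, centers = 3, nstart = 10)

print(table(fit2$cluster))
print(table(fit3$cluster))
```

With K = 2 the algorithm has to merge two of the three blobs, and nothing in its output flags that the chosen K was too small.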