Spam Filtering / Text Classification

23 Terms

1

language identification

the process of determining the language of a given text

2

supervised learning

a type of machine learning where a model is trained using labeled data

3

training/testing data

data used to train a machine learning model and evaluate its performance
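
A minimal sketch of how labeled examples might be divided into training and testing sets. The function name `train_test_split`, the 25% test fraction, and the fixed seed are assumptions made for illustration, not part of the card.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and split it into (train, test)."""
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labeled messages: (text, label)
examples = [(f"message {i}", "spam" if i % 2 else "ham") for i in range(8)]
train, test = train_test_split(examples)
print(len(train), len(test))  # 6 2
```

The model is fit only on `train`; `test` is held back so that performance is measured on messages the model has not seen.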

4

document classification

the task of assigning a document to one or more predefined categories

5

binary classification

a classification task with two possible outcomes

6

multi-class classification

a classification task with more than two possible categories

7

Bayes Rule

a mathematical formula used to update probabilities based on new evidence
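
As a worked illustration, Bayes' rule can be written as P(H | E) = P(E | H) P(H) / P(E). The sketch below applies it to a spam example; the function name `posterior` and all the probabilities are made-up values chosen for illustration.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(H | E) given P(H), P(E | H) and P(E | not H)."""
    # Total probability of the evidence, summed over H and not-H.
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    if evidence == 0:
        raise ValueError("the evidence has zero probability")
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of mail is spam, the word "free" appears in
# 50% of spam messages and in 5% of non-spam messages.
p = posterior(prior=0.2, likelihood=0.5, likelihood_given_not=0.05)
print(round(p, 3))  # 0.714
```

Seeing the word raises the estimated spam probability from the prior of 0.2 to roughly 0.71.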

8

Naive Bayes

a probabilistic classifier based on Bayes’ Theorem with an assumption of independence among features
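
A small sketch of a word-count Naive Bayes classifier with add-one (Laplace) smoothing. The helper names `train_nb` and `predict_nb` and the four toy messages are assumptions for illustration; real filters train on far more data.

```python
import math
from collections import Counter


def train_nb(docs):
    """docs is a list of (text, label). Returns the counts used at prediction."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-probability score."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are skipped
            # Independence assumption: per-word log likelihoods are summed.
            numerator = word_counts[label][word] + 1
            score += math.log(numerator / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training messages
docs = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train_nb(docs)
print(predict_nb(model, "free money"))      # spam
print(predict_nb(model, "monday meeting"))  # ham
```

Working in log space avoids multiplying many small probabilities together, which would underflow on long messages.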

9

logistic regression

a statistical model used for binary classification problems
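
A minimal sketch of the prediction step of logistic regression: a weighted sum of the features is passed through the sigmoid function to give a probability. The weights, bias, and feature meanings below are hypothetical; in practice they are learned from training data.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function, mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Probability of the positive class for one feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical features: [count of "free", count of "meeting"]
weights = [2.0, -1.5]
bias = -0.5
print(predict_proba(weights, bias, [2, 0]) > 0.5)  # True
print(predict_proba(weights, bias, [0, 2]) > 0.5)  # False
```

A message is typically labeled positive when the returned probability exceeds a chosen threshold such as 0.5.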

10

false positives

incorrectly identifying a non-relevant instance as relevant

11

false negatives

failing to identify a relevant instance
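
To make false positives and false negatives concrete, the sketch below counts both for a spam filter, treating "spam" as the relevant (positive) class. The function name `count_errors` and the labels are made up for illustration.

```python
def count_errors(y_true, y_pred, positive="spam"):
    """Return (false_positives, false_negatives) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    # False positive: predicted positive, but the true label is not.
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # False negative: truly positive, but predicted as something else.
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    return fp, fn


# Hypothetical true labels and filter predictions
y_true = ["spam", "ham", "ham", "spam", "ham", "ham"]
y_pred = ["spam", "spam", "ham", "ham", "spam", "ham"]
print(count_errors(y_true, y_pred))  # (2, 1)
```

Here two legitimate messages were flagged as spam (false positives) and one spam message slipped through (a false negative).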

12

character n-gram

a sequence of N consecutive characters used in text analysis
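
A short sketch of extracting character n-grams from a string. The function name `char_ngrams` is an assumption for illustration.

```python
def char_ngrams(text, n):
    """Return every run of n consecutive characters in text, left to right."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(char_ngrams("spam", 2))  # ['sp', 'pa', 'am']
print(char_ngrams("spam", 3))  # ['spa', 'pam']
```

Counts of such n-grams are a common feature for language identification, since frequent character sequences differ between languages.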

13

spam

unwanted or unsolicited messages, typically emails

14

spam filter

a system used to detect and block spam messages

15

blacklist

a list of entities that are blocked from accessing a system or service

16

whitelist

a list of approved entities that are allowed access to a system or service
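
A minimal sketch of how a blacklist and a whitelist might be consulted for an incoming sender. The addresses are hypothetical, and checking the whitelist first is an assumed design choice rather than a fixed rule.

```python
def check_sender(sender, whitelist, blacklist):
    """Return 'allow', 'block', or 'unknown' for a sender address."""
    sender = sender.lower()
    if sender in whitelist:  # assumption: the whitelist takes precedence
        return "allow"
    if sender in blacklist:
        return "block"
    return "unknown"


# Hypothetical lists
whitelist = {"colleague@example.com"}
blacklist = {"offers@spam.example"}

print(check_sender("Colleague@example.com", whitelist, blacklist))  # allow
print(check_sender("offers@spam.example", whitelist, blacklist))    # block
print(check_sender("stranger@example.org", whitelist, blacklist))   # unknown
```

Senders on neither list would typically be passed on to another method, such as a statistical filter.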

17

rule-based filtering

a spam detection approach using manually crafted rules
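
A small sketch of rule-based filtering with regular expressions. The three rules, their names, and the threshold of one matching rule are invented for illustration; real rule sets are larger and tuned by hand.

```python
import re

# Each rule is (description, compiled pattern). All rules are hypothetical.
RULES = [
    ("mentions free money", re.compile(r"\bfree money\b", re.IGNORECASE)),
    ("many exclamation marks", re.compile(r"!{3,}")),
    ("asks for password", re.compile(r"\bpassword\b", re.IGNORECASE)),
]


def matched_rules(message):
    """Return the descriptions of every rule whose pattern occurs in message."""
    return [name for name, pattern in RULES if pattern.search(message)]


def is_spam(message, threshold=1):
    """Flag the message when at least `threshold` rules match."""
    return len(matched_rules(message)) >= threshold


print(matched_rules("Get FREE MONEY now!!!"))
# ['mentions free money', 'many exclamation marks']
print(is_spam("See you at lunch"))  # False
```

Because the rules are written by hand, they are easy to inspect but must be updated whenever spammers change their wording.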

18

spam probability

the likelihood that a given message is spam

19

statistical filtering

a spam detection method based on statistical analysis of message content

20

hand-crafted features

features manually designed by experts for machine learning models

21

kitchen sink features

an approach that includes many features without filtering for relevance

22

sparse features

features that have many zero or missing values

23

dense features

features that have mostly nonzero values and provide rich information
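
To contrast sparse and dense features, the sketch below builds a word-count vector over a small vocabulary. Most entries are zero, so the same information can be stored as a mapping of only the nonzero positions. The vocabulary and function names are assumptions for illustration.

```python
def dense_bow(text, vocab):
    """Word counts as a full-length list with one slot per vocabulary term."""
    words = text.lower().split()
    return [words.count(term) for term in vocab]


def sparse_bow(text, vocab):
    """The same counts, keeping only nonzero entries as {index: count}."""
    return {i: c for i, c in enumerate(dense_bow(text, vocab)) if c != 0}


# Hypothetical vocabulary; real vocabularies have many thousands of terms.
vocab = ["free", "money", "meeting", "monday", "prize", "lunch"]

print(dense_bow("free free money", vocab))   # [2, 1, 0, 0, 0, 0]
print(sparse_bow("free free money", vocab))  # {0: 2, 1: 1}
```

With a realistic vocabulary, almost every slot of the full-length vector is zero for any single message, which is why word-count features are usually stored in a sparse form.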
