AI and Data Science

Last updated 5:45 AM on 4/29/26

45 Terms

1

Mode

most frequent value in the dataset

2

Standard deviation

spread of a dataset around its mean; the square root of the variance
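
A minimal sketch of the mode and standard deviation cards, using Python's standard `statistics` module. The sample values are made up for illustration.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

most_common = statistics.mode(values)   # most frequent value
mean = statistics.mean(values)

# Population standard deviation: divides the squared deviations by n.
pop_sd = statistics.pstdev(values)

# Sample standard deviation: divides by n - 1 (the convention R's sd() uses).
sample_sd = statistics.stdev(values)
```

The two standard deviations differ only in the divisor, so they converge as the dataset grows.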

3

Quartiles

three cut points that divide the sorted data into 4 equal-sized sections: 1st quartile (Q1); median (Q2); and 3rd quartile (Q3)
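
A small sketch of computing the three quartile cut points with `statistics.quantiles`. The data is hypothetical, and the exact Q1 and Q3 values depend on the interpolation method, so other tools may report slightly different numbers for the same data.

```python
import statistics

# Hypothetical data, already sorted for readability.
data = [1, 2, 3, 4, 5, 6, 7]

# n=4 asks for the three cut points that split the data into four sections.
# The default method is "exclusive"; method="inclusive" is the alternative.
q1, q2, q3 = statistics.quantiles(data, n=4)

# The middle cut point is the median.
median = statistics.median(data)

# Interquartile range: the width of the box in a boxplot.
iqr = q3 - q1
```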

4

Boxplots

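min and max at the ends of the whiskers (many implementations cap the whiskers at 1.5 x IQR and plot points beyond them as outliers); 1st and 3rd quartiles are the box ends; median is the line in the box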

5

Histogram

bar graph showing frequency on the y axis and binned values on the x axis

6

Probability of an event

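P(A): a number between 0 and 1; for equally likely outcomes, (outcomes in which A occurs) / (total outcomes)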

7

Probability of independent events

P(A & B) = P(A) x P(B)

8

Conditional probability

P(B|A) = P(B & A)/P(A)

9

Posterior Probability

P(A|B) = P(B|A) x P(A) / P(B) (Bayes' theorem): the probability of A after observing evidence B
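
A worked example of the posterior formula, sketched in Python. The spam-filter framing and every probability below are assumptions made up for illustration: A is "message is spam" and B is "message contains a given word".

```python
# Assumed inputs (hypothetical numbers).
p_spam = 0.20              # P(A): prior probability of spam
p_word_given_spam = 0.50   # P(B|A): likelihood of the word in spam
p_word_given_ham = 0.05    # P(B|not A): likelihood of the word in non-spam

# P(B) by the law of total probability.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# P(A|B) = P(B|A) x P(A) / P(B)
p_spam_given_word = p_word_given_spam * p_spam / p_word
```

With these numbers the posterior is 0.10 / 0.14, roughly 0.71, so seeing the word raises the spam probability from 20% to about 71%.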

10

Likelihood Table

table of conditional probabilities P(feature value | class), built from frequency counts; used by naive Bayes

11

Laplace estimator

Add a small count (typically 1) to every cell of the frequency table so that no conditional probability is zero
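
A minimal sketch of the Laplace estimator applied to one row of a frequency table. The function name `laplace_probabilities` and the counts are hypothetical.

```python
def laplace_probabilities(counts, laplace=1):
    """Turn raw counts into probabilities after adding `laplace` to each count."""
    total = sum(counts.values()) + laplace * len(counts)
    return {value: (count + laplace) / total for value, count in counts.items()}

# Hypothetical counts: how often a word appears in 20 spam messages.
word_in_spam = {"yes": 0, "no": 20}

# Without smoothing, P(yes | spam) would be 0 and would zero out
# the whole naive Bayes product.
smoothed = laplace_probabilities(word_in_spam)
```

Here `smoothed["yes"]` is 1/22 rather than 0, and the probabilities still sum to 1.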

12

Entropy

Measure of randomness (impurity) in a set of class values; for two classes, a 50/50 split gives E = 1 and a pure set gives E = 0
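
A short sketch of the entropy calculation, assuming the usual base-2 definition E = -sum(p x log2(p)) over the class proportions p. The labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    proportions = [count / n for count in Counter(labels).values()]
    return -sum(p * math.log2(p) for p in proportions)

mixed = entropy(["yes", "no"] * 5)   # 50/50 split of two classes
pure = entropy(["yes"] * 4)          # only one class present
```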

13

SSE

Sum of squared errors: the sum of the squared differences between actual and predicted values
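
As a quick illustration, SSE can be computed in one line. The actual and predicted values are made up.

```python
# Hypothetical actual and predicted values.
actual = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 8.0]

# Square each error, then add them up.
sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
```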

14

Covariance matrix

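square matrix whose (i, j) entry is the covariance between features i and j; the diagonal holds each feature's variance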

15

Correlation matrix

table of pairwise correlation coefficients between features; 1 = perfect positive relationship; 0 = no linear relationship; -1 = perfect negative relationship
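
A sketch of one cell of a correlation matrix, assuming the Pearson coefficient is meant. The helper name `pearson` and the data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sum((x - mean_x) ** 2 for x in xs)
    dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(dev_x * dev_y)

xs = [1, 2, 3, 4]
rising = pearson(xs, [2, 4, 6, 8])    # y increases with x
falling = pearson(xs, [8, 6, 4, 2])   # y decreases with x
```

A full correlation matrix repeats this for every pair of features.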

16

Confusion Matrix

table of predicted vs. actual classes, with counts of true positives; false positives; true negatives; and false negatives
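
A minimal sketch of tallying the four cells of a two-class confusion matrix. The labels are hypothetical.

```python
from collections import Counter

# Hypothetical actual and predicted labels for six examples.
actual    = ["pos", "pos", "neg", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "pos", "pos", "neg"]

# Count each (actual, predicted) combination.
cells = Counter(zip(actual, predicted))

tp = cells[("pos", "pos")]  # true positives
fn = cells[("pos", "neg")]  # false negatives
fp = cells[("neg", "pos")]  # false positives
tn = cells[("neg", "neg")]  # true negatives
```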

17

Matthews Correlation Coefficient

18

Kappa statistic

Agreement between predicted and actual values, adjusted for the agreement expected by chance alone
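
A worked sketch of Cohen's kappa from confusion matrix counts, assuming the standard formula kappa = (po - pe) / (1 - pe). The counts are made up.

```python
# Hypothetical confusion matrix counts for 100 test examples.
tp, fn, fp, tn = 40, 10, 5, 45
total = tp + fn + fp + tn

# Observed agreement: the proportion of predictions that match the actual class.
p_observed = (tp + tn) / total

# Expected chance agreement, from the row and column totals.
p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2

kappa = (p_observed - p_expected) / (1 - p_expected)
```

With these counts the observed agreement is 0.85, the chance agreement is 0.50, and kappa comes out to about 0.70.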

19

Sensitivity

Proportion of actual positives that are correctly classified: TP / (TP + FN); also called the true positive rate

20

Specificity

Proportion of actual negatives that are correctly classified: TN / (TN + FP); also called the true negative rate

21

Precision

Proportion of predicted positives that are truly positive: TP / (TP + FP)

22

Recall

Same as sensitivity: TP / (TP + FN); in search, the fraction of relevant results that were retrieved

23

F-measure

Harmonic mean of precision and recall: F = 2 x precision x recall / (precision + recall)
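
The sensitivity, specificity, precision, recall, and F-measure cards can be tied together in one short sketch. The confusion matrix counts are hypothetical.

```python
# Hypothetical confusion matrix counts.
tp, fn, fp, tn = 40, 10, 5, 45

sensitivity = tp / (tp + fn)   # share of actual positives found
specificity = tn / (tn + fp)   # share of actual negatives found
precision = tp / (tp + fp)     # share of predicted positives that are right
recall = sensitivity           # recall is another name for sensitivity

# F-measure: harmonic mean of precision and recall.
f_measure = 2 * precision * recall / (precision + recall)
```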

24

ROC curve

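Plot of the true positive rate (sensitivity) against the false positive rate (1 - specificity) across classification thresholds; shows the tradeoff between detecting true positives and avoiding false positives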

25

AUC

area under the ROC curve; 0.5 = no better than random guessing; 1.0 = perfect classifier
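
A sketch of AUC using its ranking interpretation rather than integrating the plotted curve: the fraction of (positive, negative) pairs in which the positive example gets the higher score. The function name and scores are hypothetical.

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical predicted scores for actual positives and actual negatives.
positives = [0.9, 0.8, 0.4]
negatives = [0.7, 0.3, 0.2]

area = auc(positives, negatives)
```

Here 8 of the 9 pairs are ordered correctly, so the area is about 0.89.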

26

subset()

return the rows (and optionally columns) of a data frame that match a condition

27

lapply()

Apply a function to each element of a list; returns a list

28

sample()

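Draw a random sample (without replacement by default) from a vector; sample(n, k) picks k random values from 1:n, which is commonly used to generate random row indices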

29

tm_map()

apply a transformation to every document in a text corpus (package tm), e.g. removing stop words or stemming words

30

DocumentTermMatrix()

build a matrix from a text corpus (package tm) with documents as rows, terms as columns, and term counts as cells

31

prop.table()

Convert a table of counts into proportions for each category; multiply by 100 for percentages

32

model()

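general R formula interface for training a model, where model stands in for the actual training function: model(class ~ predictors, data = train)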

33

pairs()

Scatterplot matrix: draws a scatterplot for every pair of features in a dataset

34

pairs.panels()

Enhanced scatterplot matrix from package psych; adds correlation coefficients, histograms, and fitted curves

35

Normalization

rescale features to a common range so that features with larger values do not dominate (bias) the classification

36

Min-Max Normalization

rescale each value to the range 0 to 1 using the feature's min and max: (x - min) / (max - min)
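
A minimal sketch of min-max normalization. The function name and values are hypothetical, and the sketch assumes the feature is not constant (max greater than min).

```python
def min_max(values):
    """Rescale values to the range 0..1; assumes max(values) > min(values)."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

# Hypothetical feature values.
scaled = min_max([10, 20, 30, 50])
```

The smallest value maps to 0, the largest to 1, and everything else falls in between in proportion.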

37

Z-Score Standardization

(x - mean) / standard deviation; shifts the mean to 0 and sets the standard deviation to 1
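
A short sketch of z-score standardization. The function name and values are hypothetical; the sample standard deviation (n - 1 divisor) is used here, which matches R's scale(), but the population version is also common.

```python
import statistics

def z_scores(values):
    """Subtract the mean, then divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical feature values.
scaled = z_scores([10, 20, 30, 40, 50])
```

Unlike min-max normalization, the result has no fixed minimum or maximum, so extreme values remain visible as large positive or negative scores.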

38

Dummy Coding

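Take a nominal feature and create binary indicator features coded 0 and 1, one per level (an n-level feature needs only n - 1 indicators).

Discretization (binning), by contrast, converts numerical data into a limited number of levels.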
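
A small sketch of dummy coding a three-level nominal feature. The function name, the `is_` prefix, and the color levels are hypothetical; the first level is treated as the reference level and gets no indicator of its own.

```python
def dummy_code(value, levels):
    """Return one 0/1 indicator per level, skipping the first (reference) level."""
    return {f"is_{level}": int(value == level) for level in levels[1:]}

# Hypothetical nominal feature with three levels.
levels = ["red", "green", "blue"]

green_row = dummy_code("green", levels)
red_row = dummy_code("red", levels)   # reference level: all indicators are 0
```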

39

Thresholding

Numerical values above a threshold are coded 1; all other values are coded 0

40

Imputation

Fill in missing values with an estimate, such as the feature's mean or median, or a value predicted from other features

41

Enhancing Performance

42

Meta-learning

Techniques for learning how to learn more effectively, e.g. by combining and managing the predictions of multiple models

43

Bagging

Draw bootstrap samples (random samples with replacement) of the training data and train an unstable learner (e.g. a decision tree) on each; let the resulting models vote on a class
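
A toy sketch of bagging under simplifying assumptions: a one-feature decision stump stands in for the decision tree, and the data, function names, and number of bags are all hypothetical.

```python
import random
from collections import Counter

def train_stump(sample):
    """Pick the threshold t (from the sample's x values) with the fewest
    errors for the rule: predict 1 if x > t, else 0."""
    def errors(t):
        return sum((x > t) != bool(label) for x, label in sample)
    return min((x for x, _ in sample), key=errors)

def bagging_fit(data, n_bags, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [train_stump(rng.choices(data, k=len(data))) for _ in range(n_bags)]

def bagging_predict(thresholds, x):
    """Let every stump vote and return the majority class."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical training data: class 1 whenever x >= 5.
data = [(x, int(x >= 5)) for x in range(10)]
model = bagging_fit(data, n_bags=25)

low_class = bagging_predict(model, 0.0)
high_class = bagging_predict(model, 9.5)
```

Each stump sees a slightly different sample, so individual thresholds vary, but the vote smooths out that instability.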

44

Boosting

Generate multiple weak learners; Each learner is trained on a complementary portion of the data to capture examples that are difficult to classify (adaptive boosting); Use a weighted vote between models based on past performance

45

Random Forest

Train an ensemble of decision trees, each on a bootstrap sample of the data and a random subset of the features; allow the trees to vote on the predicted class.