Data Science and AI deck

50 Terms

Mean

The sum of the values divided by how many values there are.

Median

Middle value when the data is sorted; with an even count, the average of the two middle values.

Mode

Most frequent value in a dataset.

Range

Maximum value minus minimum value in a dataset.

Variance

Average squared distance from the mean.

Standard deviation

Square root of variance; typical spread from the mean.
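
The summary statistics on these cards can be checked with a minimal sketch using Python's standard `statistics` module. The sample values are made up for illustration, and the population forms (`pvariance`, `pstdev`) are used because the variance card describes an average over all values; a sample variance would divide by n - 1 instead.

```python
import statistics

# Hypothetical sample (made up for illustration).
data = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(data)            # sum divided by count
median = statistics.median(data)        # middle of the sorted values
mode = statistics.mode(data)            # most frequent value
value_range = max(data) - min(data)     # maximum minus minimum
variance = statistics.pvariance(data)   # average squared distance from the mean
std_dev = statistics.pstdev(data)       # square root of the variance

print(mean, median, mode, value_range, variance, std_dev)
```

With an even number of values, `statistics.median` averages the two middle values, here 4 and 5.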

Correlation

Strength and direction of a linear relationship.
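
One common way to measure this is the Pearson correlation coefficient. The card does not name a specific coefficient, so Pearson is an assumption here, and `pearson_r` and the sample lists are hypothetical names chosen for this sketch.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both spreads, between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(spread_x * spread_y)

# Hypothetical data: one series rises with x, the other falls.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

r_up = pearson_r(hours, rising)     # close to +1: strong positive linear relationship
r_down = pearson_r(hours, falling)  # close to -1: strong negative linear relationship
print(r_up, r_down)
```

The sign gives the direction and the magnitude gives the strength of the linear relationship.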

Correlation vs causation

Correlation does not prove one variable causes another.

Conditional probability P(A|B)

Probability of A given B occurred.

Independent events

A and B are independent if P(A|B)=P(A).
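
A small sketch can check both this card and the conditional probability card by enumerating two fair dice. It uses the standard identity P(A|B) = P(A and B) / P(B); the dice setup and the helper names (`prob`, `cond_prob`) are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event): share of outcomes where the event holds."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B)."""
    return prob(lambda o: a(o) and b(o)) / prob(b)

def first_is_six(o):
    return o[0] == 6

def second_is_six(o):
    return o[1] == 6

def total_is_twelve(o):
    return o[0] + o[1] == 12

# Independent: knowing the first die does not change the second.
print(cond_prob(second_is_six, first_is_six), prob(second_is_six))

# Not independent: a first-die six makes a total of twelve far more likely.
print(cond_prob(total_is_twelve, first_is_six), prob(total_is_twelve))
```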

Normal distribution

Bell-shaped distribution defined by mean and standard deviation.

Expected value

Weighted average outcome based on probabilities.
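
As a worked example, the expected value of one roll of a fair six-sided die can be sketched as below. The die and the `expected_value` helper are assumptions chosen for illustration.

```python
from fractions import Fraction

def expected_value(distribution):
    """Sum of each outcome multiplied by its probability."""
    return sum(value * p for value, p in distribution.items())

# Fair die: each face 1..6 has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

ev = expected_value(fair_die)
print(ev)  # 7/2
```

The result, 3.5, is not a face of the die: an expected value is a long-run average, not necessarily a possible outcome.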

Discrete variable

Takes separate, countable values, like number of clicks.

Continuous variable

Takes any value within a range, like time or temperature.

Histogram use

Shows distribution of a numeric variable.

Scatterplot use

Shows relationship between two numeric variables.

Boxplot use

Shows median, quartiles, spread, and potential outliers.
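
The numbers a boxplot draws can be computed directly. This sketch assumes the common 1.5 × IQR rule for flagging outliers, which the card does not specify, and uses made-up data. Quartile conventions differ between tools, so other software may report slightly different Q1 and Q3.

```python
import statistics

# Hypothetical sample; 30 is planted as an obvious outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 30]

# Three cut points that split the sorted data into quarters.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1                   # interquartile range: spread of the middle half
lower_fence = q1 - 1.5 * iqr    # assumed 1.5 * IQR rule
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, median, q3, iqr, outliers)
```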

Data cleaning purpose

Fix errors and improve data quality for analysis and models.

Common data issues

Missing values, duplicates, inconsistent formats, outliers, and bad sources.

Structured data

Fixed schema like rows and columns in tables.

Unstructured data

Text, images, audio, and video without a fixed schema.

Semi-structured data

JSON or XML with flexible fields.
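
To illustrate "flexible fields", here is a minimal sketch with two JSON records that do not share the same keys. The records and field names are hypothetical.

```python
import json

# Hypothetical records: the second has an "email" field, the first has "tags".
raw = (
    '[{"id": 1, "name": "Ada", "tags": ["admin"]},'
    ' {"id": 2, "name": "Lin", "email": "lin@example.com"}]'
)
records = json.loads(raw)

# The set of fields is discovered from the data, not fixed in advance.
fields = sorted({key for record in records for key in record})

# Missing fields must be handled explicitly; .get returns None when absent.
emails = [record.get("email") for record in records]

print(fields)
print(emails)
```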

Primary key

Uniquely identifies a row in a table.

Foreign key

References a primary key in another table to link tables.

SQL SELECT

Retrieves the chosen columns from the rows of a table.

SQL WHERE

Filters rows by condition.

SQL GROUP BY

Groups rows for aggregation with functions like COUNT, SUM, and AVG.

SQL HAVING

Filters grouped results after aggregation.
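
The SELECT, WHERE, GROUP BY, and HAVING cards can be seen working together in one query. This sketch runs it through Python's built-in `sqlite3` module on an in-memory database; the `orders` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ana", 20.0), ("ana", 35.0), ("ben", 5.0), ("ben", 50.0), ("cy", 12.0)],
)

rows = conn.execute(
    """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    WHERE amount >= 10          -- filters individual rows first
    GROUP BY customer           -- then groups the surviving rows
    HAVING COUNT(*) >= 2        -- then filters whole groups
    ORDER BY customer
    """
).fetchall()

print(rows)
```

Only `ana` is returned: `ben` loses one order to WHERE, so his group has a single row left and HAVING drops it.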

INNER JOIN

Returns rows with matching keys in both tables.
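
A short sketch of an inner join, again using Python's built-in `sqlite3` module. The `customers` and `orders` tables are hypothetical; `orders.customer_id` is a foreign key referencing the primary key `customers.id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 35.0), (12, 2, 5.0);
    """
)

joined = conn.execute(
    """
    SELECT customers.name, orders.amount
    FROM customers
    INNER JOIN orders ON orders.customer_id = customers.id
    ORDER BY orders.id
    """
).fetchall()

print(joined)
```

`Cy` has no orders, so no row for that customer appears in the result.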

Pandas used for

Data wrangling, cleaning, and analysis with DataFrames.

NumPy used for

Fast numerical arrays and mathematical operations.

PyTorch used for

Building and training deep learning models.

Generative AI

AI that produces new content like text, images, and code.

LLM limitation

Can hallucinate and be confidently wrong.

Train validation test split

Training set fits the model; validation set tunes it; test set evaluates generalization.
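
One simple way to make such a split is to shuffle once and slice. The 70/15/15 proportions, the fixed seed, and the `split_dataset` helper are assumptions for this sketch, not values from the card.

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle a copy of the data, then cut it into three non-overlapping parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # everything left over
    return train, val, test

# Hypothetical dataset: 100 example ids.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Every example lands in exactly one part, so the test set stays unseen while the model is fitted and tuned.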

Supervised learning

Learns from labeled examples to predict targets.

Unsupervised learning

Finds structure in unlabeled data like clusters.

Reinforcement learning

Learns actions through rewards and penalties.

Overfitting

Great training performance but poor new-data performance.

Underfitting

Model too simple; poor performance even on training data.

Confusion matrix

Counts true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) for a classifier.

Precision

TP / (TP + FP): the share of predicted positives that are actually positive.

Recall

TP / (TP + FN): the share of actual positives that the model finds.
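
The confusion matrix, precision, and recall cards fit together in a few lines. This sketch assumes binary labels with 1 as the positive class; the label lists and the `confusion_counts` helper are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

# Hypothetical true labels and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)   # of the predicted positives, how many were right
recall = tp / (tp + fn)      # of the actual positives, how many were found

print(tp, fp, tn, fn, precision, recall)
```

Here precision (0.6) is lower than recall (0.75) because the model raises two false alarms but misses only one positive.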

Linear regression

Predicts a continuous numeric value by fitting a linear function of the features.
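
For a single feature, the fit can be sketched with the ordinary least-squares formulas. Least squares is the usual fitting method but is an assumption here, since the card does not name one; `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5 + intercept   # predicted value for an unseen x = 5

print(slope, intercept, prediction)
```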

Logistic regression

Predicts probability for classification.
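
The prediction step can be sketched as a weighted sum passed through the sigmoid function. The weights and bias below are hypothetical values picked by hand, not a trained model, and the 0.5 decision threshold is a common default rather than part of the card.

```python
import math

def predict_proba(features, weights, bias):
    """Probability of the positive class: sigmoid of a linear score."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical, hand-picked parameters (not learned from data).
weights = [1.5, -2.0]
bias = 0.0

p_borderline = predict_proba([2.0, 1.5], weights, bias)   # score is 0
p_confident = predict_proba([4.0, 0.0], weights, bias)    # score is 6

# Turn a probability into a class label with an assumed 0.5 threshold.
label = 1 if p_confident >= 0.5 else 0

print(p_borderline, p_confident, label)
```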

Decision tree

Splits data by feature rules to classify or predict.

K-means clustering

Groups data into k clusters by assigning each point to its nearest cluster center.
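
A minimal sketch of the usual assign-then-update loop (Lloyd's algorithm) on one-dimensional data. The points, the starting centroids, and the fixed iteration count are assumptions for illustration; real implementations typically work in many dimensions and stop when assignments no longer change.

```python
def kmeans_1d(points, centroids, n_iters=10):
    """Repeat: assign each point to its nearest centroid, then move centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(n_iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical data with two obvious groups, and k = 2 starting centroids.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])

print(centroids)
print(clusters)
```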

Algorithmic bias

Systematic unfair outcomes due to data or model choices.

Ethical AI practice

Inclusive data, bias audits, transparency, accountability.
