Data Science & AI Flashcards

71 Terms

1
Standard Deviation / What do larger values mean?

A measure of how far values spread around the mean; larger values mean higher variability.

2
Variance of 0

All values are identical.
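
A small sketch of the two cards above, using Python's standard `statistics` module. The three lists are made-up illustration data, not part of the deck.

```python
import statistics

identical = [5, 5, 5, 5]      # hypothetical data: no spread at all
tight = [9, 10, 10, 11]       # hypothetical data: values close to the mean
wide = [0, 5, 15, 20]         # hypothetical data: values far from the mean

# Population variance of identical values is 0.
var_identical = statistics.pvariance(identical)

# Population standard deviation grows as values spread out from the mean.
sd_tight = statistics.pstdev(tight)
sd_wide = statistics.pstdev(wide)

print(var_identical, sd_tight, sd_wide)
```

Both `tight` and `wide` have a mean of 10, so the difference in standard deviation comes only from spread.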

3
Gaussian Distribution

The normal distribution: a symmetric bell-shaped curve defined by its mean and standard deviation.

4
% within 1 SD in normal distribution

About 68%.

5
Continuous Variable

Can take any value within a range, e.g., height or temperature.

6
Discrete Variable

Takes separate, countable values, e.g., number of students.

7
Expected Value of a Fair Six-Sided Die Roll

3.5
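
Where 3.5 comes from, as a short sketch. It assumes a fair six-sided die, so each face has probability 1/6; the variable names are illustrative.

```python
from fractions import Fraction

faces = range(1, 7)          # faces 1 through 6
p = Fraction(1, 6)           # assumption: a fair die, equal probability per face

# Expected value = sum of (value * probability) over all outcomes.
expected = sum(face * p for face in faces)

print(float(expected))       # 3.5
```

The result is not a value the die can show; it is the long-run average over many rolls.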

8
Importance of Gaussian Distribution

Used in modeling errors and natural variation.

9
68-95-99.7 Rule

In a normal distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 SDs of the mean.
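
The rule can be checked numerically. This sketch uses the standard identity P(|Z| <= k) = erf(k / sqrt(2)) for a standard normal variable Z; the helper name `prob_within` is illustrative.

```python
import math

def prob_within(k):
    """Probability that a normal value lies within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

# Percentage of values within 1, 2, and 3 standard deviations.
coverage = {k: round(100 * prob_within(k), 1) for k in (1, 2, 3)}
print(coverage)
```

The printed percentages are close to 68, 95, and 99.7, which is why the rule is stated with rounded figures.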

10
Outliers Affect Mean

Outliers pull the mean toward themselves, shifting it upward or downward.
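
A quick illustration with made-up salary figures (in thousands): one extreme value moves the mean a lot, while the median barely changes.

```python
import statistics

salaries = [40, 42, 45, 47, 50]       # hypothetical values
with_outlier = salaries + [500]       # add one extreme value

mean_before = statistics.mean(salaries)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(salaries)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)        # mean jumps sharply
print(median_before, median_after)    # median shifts only slightly
```

This is one reason the median is often reported alongside the mean for skewed data.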

11
Parameter vs Statistic

Parameter = describes a whole population; statistic = computed from a sample.

12
Positive Covariance

The variables tend to move in the same direction: when one is above its mean, the other tends to be as well.
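
A minimal sketch of sample covariance computed by hand. The study-hours and score values are invented for illustration, and `sample_covariance` is a hypothetical helper name.

```python
def sample_covariance(xs, ys):
    """Average product of deviations from the means (n - 1 denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

hours = [1, 2, 3, 4, 5]            # hypothetical study hours
scores = [52, 55, 61, 64, 68]      # hypothetical exam scores

cov = sample_covariance(hours, scores)
print(cov)                         # positive: the two rise together
```

Reversing one list makes the variables move in opposite directions, and the same function then returns a negative value.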

13
Best Visualization for Distribution

Histogram or boxplot.

14
Best Chart for Categories

Bar chart.

15
Boxplot vs Histogram

Boxplot = summary stats; histogram = frequency distribution.
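
The numbers behind each chart can be sketched without plotting. The data and the bin width of 3 are arbitrary choices for illustration; `statistics.quantiles` uses its default method here, and other tools may compute quartiles slightly differently.

```python
import statistics
from collections import Counter

data = [2, 3, 3, 4, 5, 5, 5, 6, 7, 9, 12]   # hypothetical values

# Boxplot: a five-number summary (min, Q1, median, Q3, max).
q1, median, q3 = statistics.quantiles(data, n=4)
five_number = (min(data), q1, median, q3, max(data))

# Histogram: how many values fall into each bin.
bin_width = 3
bins = Counter((x // bin_width) * bin_width for x in data)

print(five_number)
print(sorted(bins.items()))   # (bin start, count) pairs
```

The summary compresses the data to five numbers, while the bin counts keep the shape of the distribution.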

16
Multivariate Data Example

Data with multiple features, e.g., height and weight.

17
Multiple Linear Regression Use

Predict a continuous outcome from two or more predictor variables.

18
Why Clean Data

Removes errors, improves accuracy.

19
Duplicate Record Issue

Distorts analysis results.
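
A small sketch of the distortion, using invented order records. It assumes a duplicate means an exact repeat of the whole record.

```python
records = [
    ("A001", 120),
    ("A002", 80),
    ("A002", 80),    # accidental duplicate of the previous record
    ("A003", 100),
]

# The duplicate inflates the total.
total_raw = sum(amount for _, amount in records)

# dict.fromkeys keeps the first occurrence of each record, in order.
unique = list(dict.fromkeys(records))
total_clean = sum(amount for _, amount in unique)

print(total_raw, total_clean)   # 380 300
```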

20
Low-Quality Data Issue

Produces unreliable AI results.

21
Imputation Meaning

Filling in missing data.
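
One common approach is mean imputation, sketched below with made-up ages. This is only one of several strategies (median, mode, or model-based filling are others).

```python
import statistics

ages = [25, None, 31, 40, None, 28]   # None marks a missing value (hypothetical data)

# Compute the mean from the observed values only.
observed = [a for a in ages if a is not None]
fill_value = statistics.mean(observed)

# Replace each missing entry with that mean.
imputed = [fill_value if a is None else a for a in ages]

print(imputed)
```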

22
Classification Algorithms

Decision trees, logistic regression, SVM.

23
Decision Trees Use

Classify or predict by splitting data along a sequence of feature-based decisions.

24
Feature Selection Role

Removes irrelevant or redundant features to improve model performance.

25
Correlation vs Causation

Correlation = two variables are statistically associated; causation = one variable directly produces a change in the other.

26
Data Normalization Importance

Puts features on a common scale so that no feature dominates a model just because of its units.
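
Min-max scaling is one simple way to normalize, sketched here. The helper name `min_max_scale` and both feature lists are illustrative assumptions; standardization (z-scores) is a common alternative.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

heights_cm = [150, 160, 170, 180, 190]                # hypothetical feature
incomes = [20_000, 35_000, 50_000, 65_000, 80_000]    # hypothetical feature

print(min_max_scale(heights_cm))
print(min_max_scale(incomes))
```

The two features differ by orders of magnitude in raw units, yet both end up in the range 0 to 1 after scaling.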

27
Real-World Data Science Use

Predict sales, detect fraud, diagnose diseases.

28
Purpose of SQL

Query and manage databases.

29
SQL Query for All Employees

SELECT * FROM employees;
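
To try the query, here is a sketch using Python's built-in `sqlite3` module with an in-memory database. The table columns and the two rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary database, nothing written to disk

# Hypothetical schema and rows, just so the query has something to return.
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees (name, dept) VALUES (?, ?)",
    [("Ada", "Engineering"), ("Grace", "Research")],
)

# The card's query: every column of every row.
rows = conn.execute("SELECT * FROM employees;").fetchall()
conn.close()

for row in rows:
    print(row)
```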

30
Library for Data Manipulation

Pandas.

31
NumPy Used For

Numerical computation.

32
Pandas Used For

DataFrames, data manipulation.

33
PyTorch Used For

Deep learning models.

34
Data Wrangling

Transforming and cleaning raw data.

35
R vs Python

R = stats focus; Python = general-purpose.

36
Relational Database

Organizes data in linked tables.

37
Primary key

Unique identifier per record.
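
A sketch of what "unique" means in practice, using Python's built-in `sqlite3` module. The `students` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students VALUES (1, 'Ada')")

try:
    # Reusing student_id 1 violates the primary key constraint.
    conn.execute("INSERT INTO students VALUES (1, 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
conn.close()

print(duplicate_rejected, row_count)   # True 1
```

The database refuses the second insert, so the table still holds exactly one record for that key.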

38
Generative AI

AI that creates new data (text, images, etc.).

39
Generative AI examples

ChatGPT, DALL·E.

40
Limitation of generative AI

Can generate incorrect or biased content.

41
Ethical concern

Deepfakes, misinformation.

42
Computer vision

AI for image/video understanding.

43
NLP used for

Language processing tasks.

44
LLM

Large Language Model.

45
Hallucination in LLMs

The model generates false or made-up information and presents it as fact.

46
Bias in LLMs

Reflects biased training data.

47
Symbolic vs neural AI

Symbolic = rules, Neural = learned patterns.

48
Machine learning

AI that learns from data to make predictions.

49
Training vs testing vs validation

Training = fit the model; testing = final evaluation on unseen data; validation = tune hyperparameters and compare models.
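
A minimal sketch of a three-way split. The 70/15/15 proportions, the fixed seed, and the helper name `split_dataset` are assumptions for illustration; real projects often use a library routine instead.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle rows, then cut them into train, validation, and test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)        # seeded so the split is repeatable
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]            # whatever remains
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))       # 70 15 15
```

Each row lands in exactly one part, so the test set stays unseen during fitting and tuning.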

50
Supervised learning feature

Labeled data.

51
Unsupervised learning example

K-means clustering.
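
A toy one-dimensional k-means, to show that no labels are needed. The points, starting centroids, and the function name `kmeans_1d` are illustrative; this is a simplified sketch, not a production implementation.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]     # hypothetical unlabeled data
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)    # [1.5, 9.5]
print(clusters)
```

The algorithm finds the two groups on its own, using only distances between values.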

52
Reinforcement learning

Learning via rewards/punishments.

53
Real-world reinforcement example

Game-playing AI, robotics.

54
Neural network

Layers of interconnected nodes, loosely inspired by the brain.

55
Deep learning

Neural networks with many layers.

56
Overfitting

Model memorizes the training data, including noise, and generalizes poorly to new data; mitigated with regularization.

57
Learning rate

Controls the step size of each parameter update during training.
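
A sketch of the effect using plain gradient descent on f(x) = x**2. The function, starting point, step count, and the three learning rates are all chosen for illustration.

```python
def gradient_descent(learning_rate, steps=20, start=10.0):
    """Minimize f(x) = x**2, whose gradient is 2*x. The minimum is at x = 0."""
    x = start
    for _ in range(steps):
        x -= learning_rate * 2 * x    # step against the gradient
    return x

small = gradient_descent(0.01)        # too small: still far from 0 after 20 steps
good = gradient_descent(0.3)          # reasonable: very close to 0
too_large = gradient_descent(1.1)     # too large: overshoots and diverges

print(small, good, too_large)
```

Too small a rate learns slowly, while too large a rate makes each update overshoot the minimum.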

58
Speech recognition

Technology that converts spoken language into text.

59
Knowledge representation

A method for encoding information for AI reasoning.

60
Reasoning in AI

Using logic or inference to draw conclusions from data.

61
Example of reasoning in AI

Diagnosing diseases based on symptoms.

62
Data privacy

Protecting personal data from unauthorized access.

63
AI transparency importance

Ensures trust and accountability in AI systems.

64
Algorithmic bias

Systematic unfairness due to biased data or design.

65
AI ethics

Guidelines for responsible development and use of AI.

66
Example of AI ethical concern

Facial recognition used without consent.

67
Data literacy

Ability to read, work with, analyze, and argue with data.

68
Raw data

Unprocessed information collected from sources.

69
Dataset

A structured collection of related data.

70
Data visualization importance

Helps interpret patterns and insights easily.

71
Metadata

Data describing other data, such as format or source.