cram data science and ai

64 Terms

1

Mean

Sum of all values divided by the number of values (the arithmetic average)

2

most affected by outliers

3

Median

Middle value of a sorted dataset; with an even count, the average of the two middle values

4

Mode

Most frequently occurring value

5

Range

Difference between maximum and minimum values

6

Standard deviation

Measure of how far values typically spread from the mean; the square root of the variance

7

Variance

Average of the squared deviations from the mean; the square of the standard deviation
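
Worked example for the Mean, Median, Mode, Range, Standard deviation, and Variance cards. This is a minimal sketch using Python's standard `statistics` module; the sample values are made up, and the population forms (`pvariance`, `pstdev`) are an assumption, since the cards do not say whether population or sample statistics are meant.

```python
import statistics

# Hypothetical sample; the value 40 is a deliberate outlier.
data = [2, 4, 4, 5, 7, 9, 40]

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequently occurring value
value_range = max(data) - min(data)     # maximum minus minimum
variance = statistics.pvariance(data)   # population variance
std_dev = statistics.pstdev(data)       # population standard deviation

# Dropping the outlier moves the mean far more than the median.
trimmed = data[:-1]
mean_shift = mean - statistics.mean(trimmed)
median_shift = median - statistics.median(trimmed)

print(f"mean={mean:.2f} median={median} mode={mode} range={value_range}")
print(f"variance={variance:.2f} std_dev={std_dev:.2f}")
print(f"mean shift={mean_shift:.2f} median shift={median_shift:.2f}")
```

The last line illustrates why the mean is the measure most affected by outliers.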

8

Covariance

Measure of how two variables change together
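
Worked example for the Covariance card. This is a sketch of the population covariance (dividing by n is an assumption; the sample form divides by n - 1); the function name and the study-hours data are hypothetical.

```python
def covariance(xs, ys):
    """Population covariance: mean product of paired deviations from the means."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: hours studied and the resulting test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

cov = covariance(hours, scores)
print(f"covariance = {cov:.1f}")  # positive: the two variables rise together
```

A positive result means the variables tend to increase together, a negative result means one tends to fall as the other rises, and a value near zero suggests no linear relationship.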

9

Normal distribution

Bell-shaped symmetric distribution defined by mean and standard deviation

10

Expected value

Long-run average outcome of a random variable over many repeated trials; the probability-weighted sum of its possible values
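
Worked example for the Expected value card. The fair six-sided die is an assumed example, not part of the original card; the sketch weights each outcome by its probability and sums the results.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each face with probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [Fraction(1, 6)] * 6

# Expected value = sum of (outcome * probability of that outcome).
expected_value = sum(x * p for x, p in zip(outcomes, probabilities))
print(expected_value)  # 7/2
```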

11

Continuous variable

Variable that can take any value within a range, such as height

12

Discrete variable

Variable that takes separate, countable values, such as number of items

13

Complement of an event

The event that the original event does not occur; its probability is 1 minus the probability of the original event

14

Histogram

Chart showing frequency distribution of data

15

Boxplot

Graph displaying the median, quartiles, and outliers

16

Scatterplot

Graph showing relationship between two variables

17

Overfitting

Model learns noise in the training data instead of general patterns and performs poorly on new data

18

Underfitting

Model is too simple and doesn’t learn important patterns

19

Confusion matrix

Table used to evaluate classification models

20

Precision

Fraction of predicted positives that were actually positive: TP / (TP + FP)

21

Recall

Fraction of actual positives that the model identified correctly: TP / (TP + FN)
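
Worked example for the Confusion matrix, Precision, and Recall cards. This is a minimal sketch in plain Python; the label lists are invented for illustration, with 1 meaning positive and 0 meaning negative.

```python
# Hypothetical binary labels: 1 = positive, 0 = negative.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

pairs = list(zip(actual, predicted))

# The four cells of a binary confusion matrix.
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives

precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were found

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here the model finds 3 of the 4 actual positives (recall 0.75), but only 3 of its 5 positive predictions are right (precision 0.60).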

22

Coefficient of variation

Standard deviation divided by the mean; a unitless spread measure useful for comparing datasets with different units or scales
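
Worked example for the Coefficient of variation card. This sketch assumes the population standard deviation and uses made-up height and weight samples; the helper name `coefficient_of_variation` is hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (unitless)."""
    return statistics.pstdev(values) / statistics.mean(values)

# Hypothetical samples measured in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 60, 70, 80, 85]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

# The ratios are comparable even though the units differ.
print(f"height CV={cv_height:.3f} weight CV={cv_weight:.3f}")
```

Because the units cancel, the two ratios can be compared directly; in this sample the weights vary more, relative to their mean, than the heights do.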

23

Data cleaning

Correcting or removing inaccurate, inconsistent, duplicated, or missing data

24

Data normalization

Scaling features into a similar range
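
Worked example for the Data normalization card. Min-max scaling is one common way to do this (an assumption; the card does not name a method). The helper `min_max_scale` and the sample heights are hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature measured in centimetres.
heights_cm = [150, 160, 170, 180, 190]
scaled = min_max_scale(heights_cm)
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```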

25

Regression

Predicts continuous numerical outcomes

26

Classification

Predicts categories or labels

27

SQL

Language used to manage and query relational databases

28

Primary key

Unique identifier for a database record

29

Pandas

Python library for data tables and data analysis

30

NumPy

Python library for numerical computing and arrays

31

Matplotlib

Python library for data visualization

32

TensorFlow

Framework for building and training deep learning models

33

Jupyter Notebook

Interactive tool for writing and running data science code in the browser

34

Artificial intelligence

Computers performing tasks that normally require human intelligence

35

Generative AI

AI that creates new content such as text, images, or audio

36

AI subfields

Categories including computer vision, NLP, robotics, and human-AI interaction

37

Large language model

AI system trained to predict and generate text

38

Supervised learning

Learning based on labeled input and output data

39

Unsupervised learning

Learning patterns or groups from unlabeled data

40

Reinforcement learning

Learning through rewards and penalties while interacting with an environment

41

Training data

Data used to teach a model patterns

42

Validation data

Data used to tune model hyperparameters

43

Test data

Data used to evaluate final model performance
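
Worked example for the Training data, Validation data, and Test data cards. This is a sketch of one way to split a dataset; the 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are all assumptions chosen for illustration.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle a copy of rows, then cut it into train, validation, and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is held out
    return train, val, test

# Hypothetical dataset of 100 row identifiers.
rows = list(range(100))
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))  # 70 15 15
```

The model is fit on `train`, hyperparameters are tuned against `val`, and `test` is used only once, for the final evaluation.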

44

Neural network

Model made of interconnected nodes that learn complex patterns

45

K-means clustering

Unsupervised algorithm that groups similar data points into k clusters, assigning each point to its nearest cluster mean (centroid)
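
Worked example for the K-means clustering card. This is a bare-bones sketch of the usual assign-then-update loop on 2-D points, not a production implementation; the starting centroids are supplied by hand (an assumption, since real libraries choose them automatically), and the points are made up.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for x, y in points:
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            clusters[distances.index(min(distances))].append((x, y))
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                ))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters

# Hypothetical points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

No labels are supplied anywhere, which is what makes the algorithm unsupervised: the groups emerge from distances alone.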

46

Deep learning

Machine learning using large multi-layer neural networks

47

Predicate logic

Formal logic built from predicates, variables, and quantifiers, producing statements computers can interpret and reason with

48

Bayesian network

Probabilistic model using nodes and directed edges to represent relationships

49

Knowledge representation

Storing information in ways computers can understand and reason about

50

Expert system

AI that uses rules and reasoning to solve problems

51

Algorithmic bias

Systematically unfair patterns in AI output, often caused by biased training data

52

GDPR

European Union law protecting data privacy and data usage transparency

53

Transparency

Clearly informing users how data is collected, processed, and used

54

Data literacy

Ability to read, work with, analyze, and communicate data

55

Structured data

Data organized in tables and fixed formats

56

Unstructured data

Data without a fixed format, such as images, text, or emails

57

Qualitative data

Descriptive, non-numeric information

58

Quantitative data

Numerical, measurable information

59

Streaming data

Real-time, continuously arriving data

60

Batch data

Data processed in groups at scheduled intervals

61

Metadata

Data that describes other data

62

Data wrangling

Transforming and preparing raw data for analysis

63

Data science process

Steps including collecting, cleaning, analyzing, modeling, and interpreting data
