Intro to Data Science Summary

16 Terms

1

What is Data Science?

An interdisciplinary field combining math, statistics, computer science, and domain knowledge to extract insights from data.

2

What’s the difference between AI, ML, and Data Science?

  • AI = Machines performing tasks requiring human-like intelligence.

  • ML = Subset of AI that learns patterns from data.

  • Data Science = Field that collects, prepares, and analyzes data to extract insights; it often uses ML/AI as tools and supplies the data used to train them.

3

What are the 5 components of the Data Science process?

Data Collection, Cleaning, Analysis, Visualization, Decision Making.

4

What are the 7 Vs of Big Data?

Volume, Velocity, Variety, Veracity, Value, Variability, Visualization.

5

Give an example of each data type:

  • Structured: SQL table, Excel sheet.

  • Semi-structured: JSON, XML.

  • Unstructured: Image, video, audio, free-text.
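The three data types can be told apart by how much schema they carry. Below is a minimal sketch using only the Python standard library; the sample records (names, ids, the sentence) are made up for illustration.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields.
csv_text = "id,name,age\n1,Ana,34\n2,Ben,29\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: self-describing keys, but records may differ in shape.
json_text = '[{"id": 1, "name": "Ana", "tags": ["vip"]}, {"id": 2, "name": "Ben"}]'
records = json.loads(json_text)

# Unstructured: no schema; any structure has to be inferred from the content.
free_text = "Ana called on Monday to ask about her order."
words = free_text.split()

print(rows[0]["name"])          # every row has a "name" column
print(records[1].get("tags"))   # the second record has no "tags" key
print(len(words))               # raw text only yields tokens, not fields
```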

6

Why is Data Quality important?

High-quality data builds trust, improves decision making, and prevents misleading results.

7

Name 3 data cleaning techniques for structured data.

Remove duplicates, handle missing values, fix incorrect entries.
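As a rough illustration of these three techniques, here is a small sketch in plain Python. The table, its column names, and the rule that a negative age is invalid are all assumptions chosen for the example; dropping incomplete rows is only one of several ways to handle missing values.

```python
raw = [
    {"id": 1, "city": "Paris", "age": 34},
    {"id": 1, "city": "Paris", "age": 34},     # exact duplicate
    {"id": 2, "city": "paris ", "age": None},  # inconsistent spelling, missing age
    {"id": 3, "city": "Lyon", "age": -5},      # impossible value
]

def clean(rows):
    """Drop exact duplicates and fix obviously incorrect entries."""
    seen = set()
    out = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:          # remove duplicates
            continue
        seen.add(key)
        fixed = dict(row)
        fixed["city"] = fixed["city"].strip().title()  # fix inconsistent entries
        if fixed["age"] is not None and fixed["age"] < 0:
            fixed["age"] = None  # treat an impossible value as missing
        out.append(fixed)
    return out

cleaned = clean(raw)
# Handle missing values: here, by keeping only complete rows.
complete = [row for row in cleaned if row["age"] is not None]
print(cleaned)
print(complete)
```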

8

What’s “imputation”?

Filling missing data with estimated values (e.g., mean, median).
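A minimal sketch of mean and median imputation with the standard library; the age values are made up, and None stands in for a missing entry.

```python
from statistics import mean, median

ages = [34, None, 29, 41, None, 90]

# Estimate the fill value from the observed (non-missing) entries only.
observed = [a for a in ages if a is not None]

mean_filled = [a if a is not None else mean(observed) for a in ages]
median_filled = [a if a is not None else median(observed) for a in ages]

print(mean_filled)
print(median_filled)
```

The median is less sensitive to the outlier (90) than the mean, which is one reason it is often preferred for skewed columns.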

9

Name 2 techniques for cleaning unstructured data.

Text cleaning (remove stop words, fix spelling), image preprocessing (resizing, normalization).
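Both techniques can be sketched in a few lines of plain Python. The stop-word list below is a tiny hypothetical one (real lists are much longer), and the image is represented as a flat list of 8-bit pixel values rather than a real image file.

```python
import re

# Tiny illustrative stop-word list; an assumption, not a standard list.
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to", "in"}

def clean_text(text):
    """Lowercase, strip punctuation, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def normalize_pixels(pixels):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return [p / 255 for p in pixels]

print(clean_text("The quality of the data is KEY to a good model!"))
print(normalize_pixels([0, 51, 255]))
```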

10

What are epochs, batch size, and learning rate?

  • Epoch: One full pass through the dataset during training.

  • Batch size: Number of samples per training step.

  • Learning rate: How much the model’s weights change per update.
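To see how the three settings interact, here is a toy mini-batch gradient descent loop that fits a single weight w to the relation y = 2x. The dataset, the one-weight model, and the specific values of epochs, batch_size, and learning_rate are assumptions picked so the example converges.

```python
# Toy dataset where the true relationship is y = 2x.
data = [(x, 2.0 * x) for x in range(1, 9)]

epochs = 50            # full passes over the dataset
batch_size = 4         # samples used per training step
learning_rate = 0.01   # scales how far w moves on each update

w = 0.0
steps = 0
for epoch in range(epochs):
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # Gradient of mean squared error (w*x - y)^2 with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= learning_rate * grad
        steps += 1

print(steps)  # epochs * (dataset size / batch size) updates in total
print(w)
```

With 8 samples and a batch size of 4, each epoch performs 2 updates, so 50 epochs give 100 training steps.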

11

What’s the difference between training, validation, and test data?

  • Training set: Teaches the model.

  • Validation set: Used during training to tune hyperparameters and detect overfitting.

  • Test set: Evaluates final model performance.
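One common way to produce the three sets is a shuffled split. The sketch below assumes a 70/15/15 ratio and a fixed seed; both are illustrative choices, and split_dataset is a hypothetical helper, not a library function.

```python
import random

def split_dataset(items, train=0.7, val=0.15, seed=0):
    """Shuffle once, then cut into train / validation / test slices."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    train_set = items[:n_train]
    val_set = items[n_train:n_train + n_val]
    test_set = items[n_train + n_val:]  # whatever remains
    return train_set, val_set, test_set

train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

The slices never overlap, which is the point: the test set must stay unseen until the final evaluation.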

12

Image Recognition vs Object Detection?

  • Recognition: Identifies what is in an image.

  • Detection: Identifies what and where by drawing bounding boxes.

13

What are Precision and Recall?

  • Precision: Correct positive predictions / all positive predictions.

  • Recall: Correct positive predictions / all actual positives.
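Both ratios follow directly from counts of true positives (TP), false positives (FP), and false negatives (FN). A minimal sketch for binary labels, where 1 means positive; the label lists are made up for illustration.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # TP / all predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # TP / all actual positives
    return precision, recall

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Here TP = 2, FP = 1, and FN = 2, so precision is 2/3 and recall is 2/4.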

14

What is F1 Score?

Harmonic mean of Precision and Recall → balances both.
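The harmonic mean is F1 = 2PR / (P + R). A short sketch; the example inputs are arbitrary and show how F1 punishes an imbalance that the plain average would hide.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

balanced = f1_score(0.5, 0.5)
lopsided = f1_score(0.9, 0.1)  # arithmetic mean would also be 0.5
print(balanced, lopsided)
```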

15

What is mAP@50-95?

Mean Average Precision averaged over IoU thresholds from 0.50 to 0.95 in steps of 0.05; it rewards detections that are both correctly classified and tightly localized.
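The sketch below only illustrates the threshold sweep, not a full evaluator: it lists the ten IoU thresholds, shows at which of them a single predicted box would count as a match, and averages per-threshold AP values. The helper names and the 0.72 IoU are hypothetical; computing the AP at each threshold (from ranked predictions) is deliberately left out.

```python
# The ten IoU thresholds: 0.50, 0.55, ..., 0.95.
thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

def thresholds_matched(iou_value):
    """Thresholds at which a box with this IoU counts as a correct match."""
    return [t for t in thresholds if iou_value >= t]

def mean_over_thresholds(ap_by_threshold):
    """Average already-computed AP values, one per IoU threshold."""
    return sum(ap_by_threshold[t] for t in thresholds) / len(thresholds)

print(thresholds)
print(thresholds_matched(0.72))  # a loosely placed box fails the strict thresholds
```

A box with IoU 0.72 counts as correct at the five thresholds up to 0.70 and as a miss at the five stricter ones, which is why this metric is harder to score well on than mAP at a single 0.50 threshold.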

16

What does IoU stand for and what does it measure?

Intersection over Union: the area of overlap between the predicted and ground-truth bounding boxes divided by the area of their union (0 = no overlap, 1 = perfect match).
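A minimal sketch of the computation for axis-aligned boxes. The (x_min, y_min, x_max, y_max) box format is an assumption; other tools may use (x, y, width, height) instead.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap area 1, union area 7
```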