FDA - Exploratory Data Analytics, week 1


29 Terms

1

Categorical (qualitative) data

Data without a numeric value; each observation is a label or category

2

Categorical (qualitative) data → Nominal data

Two or more outcomes with no natural order (blond, brown, green hair)

3

Categorical (qualitative) data → Ordinal data

Two or more outcomes with a natural order (good, medium, bad)
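
The difference between nominal and ordinal data shows up as soon as you sort. A minimal sketch, assuming the order bad < medium < good from the card; the sample ratings and the names ORDER and RANK are made up for illustration:

```python
# Explicit order for the ordinal categories from the card (worst to best).
ORDER = ["bad", "medium", "good"]
RANK = {label: position for position, label in enumerate(ORDER)}

ratings = ["good", "bad", "medium", "good"]  # made-up sample

# Sorting by rank respects the natural order of the categories.
by_rank = sorted(ratings, key=lambda label: RANK[label])

# Plain sorting treats the labels as text and orders them alphabetically.
alphabetical = sorted(ratings)

print(by_rank)       # ['bad', 'medium', 'good', 'good']
print(alphabetical)  # ['bad', 'good', 'good', 'medium']
```

For nominal data such as hair colour there is no meaningful ORDER list to write down, so only counting and comparing for equality make sense.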

4

Categorical (qualitative) data → Dichotomous data

Two possible outcomes (true/false, yes/no)

5

Numerical (quantitative) data

With a numeric value

6

Numerical (quantitative) data → Continuous data

Can take any value within a range

7

Numerical (quantitative) data → Continuous data → Interval data

No fixed zero point (clock time, birth year)

8

Numerical (quantitative) data → Continuous data → Ratio data

With fixed zero point (money, distance, time duration)

9

Numerical (quantitative) data → Discrete data

Can only take certain values (number of...)

10

Cyclic data

Has a circular order and allows some numeric operations (days of week, wind direction)
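
One way to see what "circular order" means in practice is to use modular arithmetic. This is only a sketch; the helper names days_later and angle_difference are hypothetical, and wind direction is assumed to be measured in degrees from 0 to 360:

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def days_later(day, n):
    """Return the weekday that falls n days after `day`, wrapping around the week."""
    return DAYS[(DAYS.index(day) + n) % len(DAYS)]

def angle_difference(a, b):
    """Smallest difference in degrees between two directions on a circle."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

print(days_later("Sat", 3))       # Tue
print(angle_difference(350, 10))  # 20
```

Note that a plain subtraction would report 340 degrees between 350 and 10, even though the two wind directions are only 20 degrees apart.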

11

One-dimensional data

One value

Example: a histogram of people's heights. If 8 people have a height of 1.80 m, then 1.80 m is a value, but 8 is only a count and not a value.

12

Two-dimensional data

Two values

Example: temperature and ice cream sales. Temperature is one value and the number of sales is the other value.

13

Dot plot

  • One value, one-dimensional

  • To find clusters and outliers

14

Scatter plot

  • Two values (both numeric), two-dimensional

  • To investigate relationships between the two values and to spot outliers

15

Histogram

  • One value (numeric), one-dimensional

  • Data is split into bins (bars)

    • To understand the distribution of the data
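
The binning step can be sketched without a plotting library. The heights below are made-up sample data in centimetres, and the bin width of 10 cm is an arbitrary choice for illustration:

```python
from collections import Counter

heights_cm = [162, 168, 171, 174, 175, 180, 180, 183, 191]  # made-up sample
BIN_WIDTH = 10  # assumed bin width in cm

# Map each height to the lower edge of its bin, then count heights per bin.
bin_counts = Counter((h // BIN_WIDTH) * BIN_WIDTH for h in heights_cm)

# Print a simple text histogram, one row per bin.
for start in sorted(bin_counts):
    print(f"{start}-{start + BIN_WIDTH - 1} cm | {'#' * bin_counts[start]}")
```

Each row is one bin and the number of # marks is the count, which matches the idea that the heights are the values and the counts are not.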

16

Bar chart

  • Two values (1 categorical, 1 numeric), two-dimensional

  • To look up and compare values

17

Mean

Average

(1 + 5 + 8 + 2) / 4 = 4
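
The same calculation in Python, as a quick check using the numbers from the card:

```python
from statistics import mean

values = [1, 5, 8, 2]

# Mean = sum of the values divided by how many values there are.
by_hand = sum(values) / len(values)

print(by_hand)       # 4.0
print(mean(values))  # 4
```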

18

Median

The value separating the lower half from the higher half

100, 160, 200, 360: median = 180

10, 20, 30, 40, 50: median = 30
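
The two examples differ because one has an even and the other an odd number of values. A minimal sketch of that rule (the function name median_of is just an illustrative choice; Python's statistics.median does the same job):

```python
def median_of(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median_of([100, 160, 200, 360]))    # 180.0
print(median_of([10, 20, 30, 40, 50]))    # 30
```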

19

Mode

Most frequently occurring value

1, 3, 1, 5, 2, 3, 1: mode = 1
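
Finding the mode comes down to counting how often each value occurs. A short sketch with the numbers from the card:

```python
from collections import Counter

values = [1, 3, 1, 5, 2, 3, 1]

# Count occurrences of each value, then take the most frequent one.
counts = Counter(values)
mode_value, frequency = counts.most_common(1)[0]

print(mode_value, frequency)  # 1 3
```

If several values tie for the highest count, the data has more than one mode; statistics.multimode returns all of them.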

20

Range

maximum - minimum

60, 70, 75, 90, 95

→ 95 - 60 = 35, range = 35
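
In Python the same subtraction looks like this (the variable name data_range is used only to avoid shadowing the built-in range):

```python
values = [60, 70, 75, 90, 95]

# Range = largest value minus smallest value.
data_range = max(values) - min(values)

print(data_range)  # 35
```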

21

Interquartile range

Q3 - Q1, where Q1 is the median of the lower half of the data and Q3 is the median of the upper half
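
A minimal sketch of this calculation, assuming the "median of each half" convention in which the middle value is left out of both halves when the number of values is odd. Other quartile conventions exist and can give slightly different results. The function name iqr is an illustrative choice:

```python
from statistics import median

def iqr(values):
    """Interquartile range using the median-of-halves convention."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # lower half (middle value excluded if n is odd)
    upper = ordered[-half:]   # upper half (middle value excluded if n is odd)
    return median(upper) - median(lower)

# Lower half [60, 70] has median 65, upper half [90, 95] has median 92.5.
print(iqr([60, 70, 75, 90, 95]))  # 27.5
```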

22

Sample variance

The average squared distance of the data values from the mean (the sum of squared deviations divided by n - 1)

→ the higher the statistic, the more spread/variability in the data
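
As a worked sketch, here is the sample variance computed step by step for the values 1, 5, 8, 2 (the same numbers as the mean card). The sample variance divides by n - 1 rather than n:

```python
from statistics import variance

values = [1, 5, 8, 2]
n = len(values)
mean = sum(values) / n  # 4.0

# Squared deviations from the mean: 9, 1, 16, 4.
squared_deviations = [(x - mean) ** 2 for x in values]

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = sum(squared_deviations) / (n - 1)

print(sample_variance)   # 10.0
print(variance(values))  # 10
```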

23

Sample standard deviation

The square root of the sample variance; tells you roughly how far the data values typically are from the mean, in the same units as the data

→ the higher the statistic, the more spread/variability in the data
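
The sample standard deviation is the square root of the sample variance, which brings the result back to the original units. A short sketch using the values 1, 5, 8, 2:

```python
import math
from statistics import stdev, variance

values = [1, 5, 8, 2]

# Square root of the sample variance (which is 10 for these values).
sample_sd = math.sqrt(variance(values))

print(round(sample_sd, 4))      # 3.1623
print(round(stdev(values), 4))  # 3.1623
```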

24

Median absolute deviation

Measures how far data values typically are from the median

→ the higher the statistic, the more spread/variability in the data
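
A minimal sketch of the calculation: take the median, measure each value's absolute distance from it, then take the median of those distances. The second list is a made-up variant with one extreme value, to show that this measure barely reacts to an outlier:

```python
from statistics import median

def median_absolute_deviation(values):
    """Median of the absolute distances between each value and the median."""
    center = median(values)
    return median(abs(x - center) for x in values)

values = [60, 70, 75, 90, 95]
with_outlier = [60, 70, 75, 90, 950]  # made-up: last value replaced by an outlier

# Median is 75; absolute distances are 15, 5, 0, 15, 20; their median is 15.
print(median_absolute_deviation(values))        # 15
print(median_absolute_deviation(with_outlier))  # 15
```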

25

Unimodal distribution (1 peak)

26

Bimodal distribution (2 peaks)

27

Symmetric distribution

28

Left-skewed (negative)

mean < median < mode (typically)

29

Right-skewed (positive)

mode < median < mean (typically)
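
The ordering of mean, median and mode can be checked on small examples. Both data sets below are made up for illustration, and the ordering is a rule of thumb that holds for these samples rather than a guarantee for every skewed distribution:

```python
from statistics import mean, median, mode

right_skewed = [1, 2, 2, 2, 3, 3, 4, 10]  # long tail towards high values
left_skewed = [1, 7, 8, 8, 9, 9, 9, 10]   # long tail towards low values

def summarize(data):
    """Return (mean, median, mode) for a list of numbers."""
    return mean(data), median(data), mode(data)

print(summarize(right_skewed))  # (3.375, 2.5, 2)  -> mode < median < mean
print(summarize(left_skewed))   # (7.625, 8.5, 9)  -> mean < median < mode
```

The extreme value pulls the mean towards the tail, while the median and mode stay near the bulk of the data.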