TOPIC #3: Numerical Summaries

Description and Tags

module one data 1001

16 Terms

1

What are the advantages of a numerical summary?

It reduces the data to a single number (a statistic), which makes communication and comparison easy.

2

What major features can be used to create numerical summaries?

Maximum, minimum, centre (mean and median), and spread (standard deviation, range, IQR).

3

Mean

The sum of the data divided by the number of data points. It is the unique balancing point of the histogram.

4

Median

The middle data point when the data are sorted. It is the halfway point of the histogram, splitting it into a lower 50% and an upper 50%.
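As a small illustration of the two definitions of centre, the sketch below computes the mean as sum divided by size and the median with Python's standard `statistics` module. The data values are made up for the example.

```python
from statistics import median

# Made-up values, purely for illustration.
data = [2, 3, 5, 7, 13]

# Mean: sum of the data divided by the size of the data.
data_mean = sum(data) / len(data)

# Median: the middle value once the data are sorted.
data_median = median(data)

print(data_mean)    # 6.0
print(data_median)  # 5
```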

5

What is the purple line on this image?

The median.

6

What is the blue line on this image?

The mean.

7

Which summary is said to be robust and a good choice for skewed data, and why?

The median, because it is not pulled around by outliers or extreme values the way the mean is.
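A quick way to see this robustness is to replace one value with an outlier and compare the two summaries. This is a minimal sketch with made-up numbers; `average` is a helper defined here, not a library function.

```python
from statistics import median

def average(values):
    """Mean as sum of data / size of data."""
    return sum(values) / len(values)

# Made-up values, purely for illustration.
data = [10, 12, 13, 15, 20]
with_outlier = [10, 12, 13, 15, 200]  # largest value replaced by an outlier

print(average(data), median(data))                  # 14.0 13
print(average(with_outlier), median(with_outlier))  # 50.0 13
```

The mean jumps from 14 to 50, while the median stays at 13.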

8

For left-skewed data, what can we expect from the mean and the median?

We expect the mean to be smaller than the median.

9

For right-skewed data, what can we expect from the mean and the median?

We expect the mean to be larger than the median.
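The two skew rules can be checked on small made-up datasets, one with a long right tail and one with a long left tail. This is only a sketch: the values are invented, and the rule is a typical pattern rather than a guarantee for every dataset.

```python
from statistics import mean, median

# Made-up datasets: a long tail to the right, and a long tail to the left.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]

# Right skew: the tail drags the mean above the median.
print(mean(right_skewed), median(right_skewed))  # 4.75 3.0

# Left skew: the tail drags the mean below the median.
print(mean(left_skewed), median(left_skewed))    # 16.25 18.0
```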

10

How is spread typically measured?

With the standard deviation (SD), which measures the spread of the data around the mean. Population SD = RMS of the gaps from the mean. The root mean square (RMS) measures the average size of a set of numbers regardless of their signs: square the numbers, take the mean of the squares, then take the square root of the result.
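The "square, mean, root" recipe can be written out directly. The sketch below defines a hypothetical `rms` helper, applies it to the gaps from the mean, and compares the result with `statistics.pstdev` (the population SD in Python's standard library). The data are made up.

```python
import math
from statistics import pstdev

def rms(values):
    """Root mean square: square the numbers, mean the squares, root the result."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Made-up values, purely for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_mean = sum(data) / len(data)    # 5.0
gaps = [x - data_mean for x in data]  # gaps from the mean
sd_pop = rms(gaps)                    # population SD = RMS of the gaps

print(sd_pop)        # 2.0
print(pstdev(data))  # library population SD, for comparison
```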

11

What is the z-score?

The z-score (standard units) of a data point is how many standard deviations it lies above or below the mean: standard units = (data point - mean) / SD.
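A short worked example of the formula, assuming a dataset with mean 5 and SD 2 (made-up numbers); `standard_units` is a name chosen here for illustration.

```python
def standard_units(data_point, mean, sd):
    """How many SDs the data point lies above (+) or below (-) the mean."""
    return (data_point - mean) / sd

# Assumed summary values, purely for illustration.
mean, sd = 5, 2

z_high = standard_units(9, mean, sd)  # 2 SDs above the mean
z_low = standard_units(2, mean, sd)   # 1.5 SDs below the mean

print(z_high)  # 2.0
print(z_low)   # -1.5
```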

12

What does the IQR measure?

Spread. The IQR is the range of the middle 50% of the data: IQR = Q3 - Q1.
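As a rough sketch, the IQR can be computed from the quartiles returned by `statistics.quantiles`. Quartile conventions differ between textbooks and software, so other tools may give slightly different values; this example uses Python's default ("exclusive") method on made-up data.

```python
from statistics import quantiles

# Made-up values, purely for illustration.
data = [1, 2, 3, 4, 5, 6, 7]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)

# IQR: the width of the middle 50% of the data.
iqr = q3 - q1

print(q1, q3)  # 2.0 6.0
print(iqr)     # 4.0
```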

13

What is the coefficient of variation?

It combines the mean and standard deviation into one summary: CV = SD / mean.
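A minimal sketch of the CV calculation, using the population SD from the standard library and made-up data:

```python
from statistics import pstdev

# Made-up values, purely for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_mean = sum(data) / len(data)  # 5.0
data_sd = pstdev(data)             # population SD, 2 for this data

# Coefficient of variation: spread relative to the size of the mean.
cv = data_sd / data_mean

print(cv)
```

Here the SD is 40% of the mean, so the CV is about 0.4.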

14

Changes to data: effect on the mean and standard deviation (calculations)

Y = aX + b (Y = new dataset, X = old dataset with mean M and standard deviation S)

New mean = aM + b
New SD = |a|S (the SD is never negative, so a negative a contributes only its size)
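The transformation rules can be verified numerically. The sketch below uses made-up data and a negative value of a, chosen deliberately to show that the SD scales by the size of a (that is, |a|) and is unaffected by the shift b.

```python
from statistics import pstdev

def average(values):
    """Mean as sum of data / size of data."""
    return sum(values) / len(values)

# Made-up dataset X and made-up constants a and b.
X = [2, 4, 4, 4, 5, 5, 7, 9]
a, b = -3, 10

Y = [a * x + b for x in X]    # new dataset Y = aX + b

M, S = average(X), pstdev(X)  # old mean and SD: 5 and 2

new_mean = average(Y)         # matches a*M + b = -5
new_sd = pstdev(Y)            # matches abs(a)*S = 6

print(new_mean, a * M + b)
print(new_sd, abs(a) * S)
```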

15

What does shifting the data change?

Shifting (adding a constant to every data point) changes the mean of the data by that constant; the SD is unchanged.

16

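What does scaling the data change?

Scaling (multiplying every data point by a constant a) changes both the mean and the SD of the data: the mean is multiplied by a and the SD by |a|.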