Bus Analytics Ch 3

29 Terms

1
central tendency

the extent to which the values of a numerical variable group around a typical or central value.

2
variation

the amount of dispersion or scattering away from a central value that the values of a numerical variable show.

3
shape

the pattern of the distribution of values from the lowest value to the highest value.

4
Mean

sum of values divided by the number of values

  • The most common measure of central tendency.

  • Affected by extreme values (outliers).
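
As a minimal sketch of how a single extreme value pulls the mean, the following uses Python's standard `statistics` module. The data values are made up for illustration and are not part of the card set.

```python
from statistics import mean

# Illustrative data (an assumption): ten times measured in minutes.
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]

# Same data with the largest value replaced by an extreme one.
with_outlier = times[:-1] + [102]

mean_plain = mean(times)            # sum of values / number of values
mean_outlier = mean(with_outlier)   # pulled upward by the one outlier

print(mean_plain, mean_outlier)
```

Changing one value out of ten moves the mean by several minutes, which is what "affected by extreme values" means in practice.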

5
median

the “middle” value of the ranked data (50% of the values above, 50% below).

  • Less sensitive to extreme values than the mean.
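
A small sketch of that insensitivity, again with made-up data: replacing the largest value with an extreme one leaves the median unchanged.

```python
from statistics import median

# Illustrative data (an assumption): ten times measured in minutes.
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]
with_outlier = times[:-1] + [102]

# With an even number of values, the median is the average
# of the two middle values of the ranked data.
median_plain = median(times)
median_outlier = median(with_outlier)

print(median_plain, median_outlier)
```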

6
mode

  • Value that occurs most often.

  • Not affected by extreme values.

  • Used for either numerical or categorical data.

  • There may be no mode.

  • There may be several modes.
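
The points above can be sketched with a small helper. The function name `modes` and the convention of returning an empty list when no value repeats are assumptions made for this illustration.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: treated here as "no mode"
    return [value for value, count in counts.items() if count == top]

print(modes([0, 1, 1, 3, 3, 7]))        # numerical data, several modes
print(modes(["red", "blue", "red"]))    # categorical data
print(modes([1, 2, 3]))                 # no mode
```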

7
measures of variation:

give information on the spread (variability, dispersion) of the data values.

8
range

Simplest measure of variation.

Difference between the largest and the smallest values: Range = Xlargest - Xsmallest.
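
A minimal sketch of the calculation, with a hypothetical helper name and made-up data:

```python
def value_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)

# Illustrative data (an assumption): ten times measured in minutes.
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]

print(value_range(times))
```

Because only the two extreme values enter the calculation, the range says nothing about how the values in between are distributed.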

9
sample variance

Approximately the average of the squared deviations of the values from the mean: the sum of the squared deviations divided by n - 1 rather than n.
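
The calculation can be sketched step by step and checked against `statistics.variance`, which also divides by n - 1. The data are made up for illustration.

```python
from statistics import variance

# Illustrative data (an assumption): ten times measured in minutes.
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]

n = len(times)
x_bar = sum(times) / n                                # sample mean
squared_deviations = [(x - x_bar) ** 2 for x in times]
s2_manual = sum(squared_deviations) / (n - 1)         # divide by n - 1, not n

s2_library = variance(times)                          # same definition

print(s2_manual, s2_library)
```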

10
sample standard deviation

  • Most commonly used measure of variation.

  • Shows variation about the mean.

  • Is the square root of the variance.

  • Has the same units as the original data.
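
A short sketch, using made-up data, showing that the sample standard deviation is the square root of the sample variance:

```python
from math import sqrt
from statistics import stdev, variance

# Illustrative data (an assumption): ten times measured in minutes.
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]

s_manual = sqrt(variance(times))   # square root of the sample variance
s_library = stdev(times)           # sample standard deviation (n - 1 divisor)

print(s_manual, s_library)
```

The variance here is in squared minutes, while the standard deviation is back in minutes, the units of the original data.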

11
coefficient of variation

  • Measures relative variation.

  • Always expressed as a percentage (%).

  • Shows variation relative to mean.

  • Can be used to compare the variability of two or more sets of data measured in different units.
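
A minimal sketch, assuming the usual definition CV = (standard deviation / mean) × 100%. The helper name and data are made up; the example records the same measurements in minutes and in seconds to show that the result does not depend on the unit.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean.

    Undefined (raises ZeroDivisionError) when the mean is 0.
    """
    return stdev(values) / mean(values) * 100

# Illustrative data (an assumption): the same times in two different units.
minutes = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]
seconds = [m * 60 for m in minutes]

cv_minutes = coefficient_of_variation(minutes)
cv_seconds = coefficient_of_variation(seconds)

print(round(cv_minutes, 2), round(cv_seconds, 2))
```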

12
shape of a distribution

Describes how data are distributed.

Two useful measures of shape:

  • skewness

  • kurtosis

13
skewness

Measures the extent to which the data values are not symmetrical.
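
One common way to put a number on this is the moment-based skewness, m3 / m2^1.5. The sketch below uses that definition as an assumption; spreadsheet and statistics packages often apply a sample-size adjustment, so their values can differ slightly.

```python
def skewness(values):
    """Moment-based skewness m3 / m2**1.5 (population moments).

    Raises ZeroDivisionError if all values are identical.
    """
    n = len(values)
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n
    m3 = sum((x - center) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Illustrative data (assumptions).
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]           # long tail toward large values
left_skewed = [-x for x in right_skewed]     # mirror image

print(skewness(symmetric), skewness(right_skewed), skewness(left_skewed))
```

A value of zero indicates symmetry, a positive value a longer right tail, and a negative value a longer left tail.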

14
kurtosis

Kurtosis measures the peakedness of the curve of the distribution—that is, how sharply the curve rises approaching the center of the distribution.
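
As a rough sketch, kurtosis can be computed as the excess kurtosis m4 / m2^2 - 3, for which a normal distribution scores 0. That convention is an assumption here; software often reports a sample-adjusted version.

```python
def excess_kurtosis(values):
    """Excess kurtosis m4 / m2**2 - 3 (population moments).

    Raises ZeroDivisionError if all values are identical.
    """
    n = len(values)
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n
    m4 = sum((x - center) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3

# Illustrative data (assumptions).
flat = [1, 2, 3, 4, 5]                       # values spread evenly
peaked = [3, 3, 3, 3, 3, 3, 3, 3, 1, 5]      # most values at the center

print(excess_kurtosis(flat), excess_kurtosis(peaked))
```

The evenly spread data give a negative value and the sharply peaked data a positive one.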

15
quartiles

split the ranked data into 4 segments with an equal number of values per segment.

16
five number summary

The five numbers that help describe the center, spread, and shape of the data:

  • Xsmallest

  • First quartile (Q1)

  • Median (Q2)

  • Third quartile (Q3)

  • Xlargest
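
A minimal sketch of the summary using `statistics.quantiles` from the Python standard library. Quartile conventions differ between textbooks and software; this example assumes the function's default "exclusive" method, which interpolates between ranked values, so hand calculations with a rounding rule may give slightly different Q1 and Q3. The helper name and data are made up.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (Xsmallest, Q1, median, Q3, Xlargest)."""
    q1, q2, q3 = quantiles(values, n=4)   # default method="exclusive"
    return min(values), q1, q2, q3, max(values)

# Illustrative data (an assumption): ten times measured in minutes.
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]

summary = five_number_summary(times)
print(summary)
```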

17
Interquartile range (midspread)

Measures the spread in the middle 50% of the data: Q3 - Q1.

  • A measure of variability that is not influenced by outliers or extreme values.
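
A small sketch of Q3 - Q1, assuming the default quartile method of `statistics.quantiles` (other quartile conventions can give slightly different values). The data are made up; the second list replaces the largest value with an extreme one.

```python
from statistics import quantiles

def interquartile_range(values):
    """Return Q3 - Q1, the spread of the middle 50% of the data."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

# Illustrative data (an assumption): ten times measured in minutes.
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]
with_outlier = times[:-1] + [102]

iqr_plain = interquartile_range(times)
iqr_outlier = interquartile_range(with_outlier)

print(iqr_plain, iqr_outlier)
```

The extreme value changes the range dramatically but leaves the interquartile range where it was.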

18
boxplot

A graphical display of the data based on the five-number summary.

  • If the data are symmetric around the median, then the box and central line are centered between the endpoints.

19
population parameters:

  • population mean

  • variance

  • standard deviation

20
population mean

the sum of the values in the population divided by the population size, N.

21
population variance

Average of squared deviations of values from the mean.

22
population standard deviation

  • Most commonly used measure of variation.

  • Shows variation about the mean.

  • Is the square root of the population variance.

  • Has the same units as the original data.
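
The three population parameters can be sketched together. The key difference from the sample versions is the divisor: N instead of n - 1. The data are made up and are treated here as a complete population.

```python
from math import sqrt
from statistics import pstdev, pvariance

# Illustrative data (an assumption), treated as the whole population.
population = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]

N = len(population)
mu = sum(population) / N                                  # population mean
sigma2 = sum((x - mu) ** 2 for x in population) / N       # population variance
sigma = sqrt(sigma2)                                      # population standard deviation

# The standard library's population versions use the same N divisor.
print(mu, sigma2, pvariance(population))
print(sigma, pstdev(population))
```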

23
empirical rule

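Approximates the variation of data in a symmetric, mound-shaped (bell-shaped) distribution: about 68%, 95%, and 99.7% of the values fall within 1, 2, and 3 standard deviations of the mean, respectively.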
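
The rule is usually stated with approximate coverages of 68%, 95%, and 99.7% for 1, 2, and 3 standard deviations around the mean. The sketch below turns that into intervals; the function name and the example mean and standard deviation are assumptions for illustration.

```python
def empirical_rule_intervals(mu, sigma):
    """Map k = 1, 2, 3 to (lower, upper, approximate % of values inside)."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}
    return {
        k: (mu - k * sigma, mu + k * sigma, percent)
        for k, percent in coverage.items()
    }

# Hypothetical bell-shaped distribution with mean 500 and standard deviation 90.
intervals = empirical_rule_intervals(mu=500, sigma=90)

for k, (low, high, percent) in intervals.items():
    print(f"within {k} sd: {low} to {high} (about {percent}%)")
```

These percentages only apply when the distribution is roughly symmetric and mound-shaped.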

24
Chebyshev’s Rule

Regardless of how the data are distributed, at least (1 - 1/k²) × 100% of the values will fall within k standard deviations of the mean (for k > 1).
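
A minimal sketch of the bound as a function of k. The function name is hypothetical.

```python
def chebyshev_min_percent(k):
    """Minimum % of values within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's rule requires k > 1")
    return (1 - 1 / k ** 2) * 100

for k in (1.5, 2, 3):
    print(k, round(chebyshev_min_percent(k), 2))
```

For k = 2 the bound is 75%, well below the roughly 95% that holds for a bell-shaped distribution, because Chebyshev's rule makes no assumption about shape.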

25
covariance

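Measures how two numerical variables (X and Y) vary together linearly. Its sign shows the direction of the relationship, but its size depends on the units of X and Y, so it cannot be used to judge relative strength.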
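
A minimal sketch of the sample covariance, assuming the usual n - 1 divisor. The helper name and data are made up; the second call rescales X to show how the value depends on the units of measurement.

```python
def sample_covariance(xs, ys):
    """Sum of (x - x_bar)(y - y_bar) over the pairs, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Illustrative paired data (assumptions).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)
cov_rescaled = sample_covariance([v * 100 for v in x], y)  # X in smaller units

print(cov_xy, cov_rescaled)
```

The relationship between the variables is identical in both calls, yet the covariance is 100 times larger in the second.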

26
coefficient of correlation

Measures the relative strength of the linear relationship between two numerical variables.
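
As a sketch, the sample coefficient of correlation can be computed as the covariance divided by the product of the two standard deviations, which gives a unit-free value between -1 and +1. The helper name and data are made up.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample coefficient of correlation r for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    # The n - 1 divisors of the covariance and variances cancel out.
    return sxy / sqrt(sxx * syy)

# Illustrative paired data (assumptions).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
r_rescaled = correlation([v * 100 for v in x], y)  # changing units of X

print(round(r, 4), round(r_rescaled, 4))
```

Rescaling X leaves r unchanged, which is why it can express relative strength.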

27
Data analysis is objective:

Should report the summary measures that best describe and communicate the important aspects of the data set.

28
Data interpretation is subjective:

Should be done in a fair, neutral, and clear manner.

29
Numerical descriptive measures:

  • Should document both good and bad results.

  • Should be presented in a fair, objective and neutral manner.

  • Should not use inappropriate summary measures to distort facts.