Bus Analytics Ch 3


29 Terms

1

central tendency

the extent to which the values of a numerical variable group around a typical or central value.

2

variation

the amount of dispersion or scattering away from a central value that the values of a numerical variable show.

3

shape

the pattern of the distribution of values from the lowest value to the highest value.

4

Mean

sum of values divided by the number of values

  • The most common measure of central tendency.

  • Affected by extreme values (outliers).

5

median

the middle value of the ranked data (50% of the values lie above it, 50% below).

  • less sensitive to extreme values

6

mode

  • Value that occurs most often.

  • Not affected by extreme values.

  • Used for either numerical or categorical data.

  • There may be no mode.

  • There may be several modes.
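The three measures of central tendency above can be compared on one small data set. This is a minimal sketch using Python's standard `statistics` module; the sample values are made up for illustration, with one deliberate outlier.

```python
from statistics import mean, median, multimode

# Hypothetical sample: seven values, the last one an outlier.
values = [10, 12, 12, 13, 15, 16, 90]

print(mean(values))       # sum of values / number of values
print(median(values))     # middle value of the ranked data
print(multimode(values))  # list of every value that occurs most often

# Dropping the outlier moves the mean a lot and the median only slightly.
trimmed = values[:-1]
print(mean(trimmed), median(trimmed))
```

With the outlier the mean is 24 and the median is 13; without it the mean falls to 13 while the median only moves to 12.5. `multimode` returns a list, so it can report several modes at once.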

7

measures of variation:

give information on the spread or variability or dispersion of the data values.

8

range

Simplest measure of variation.

Difference between the largest and the smallest values: Range = Xlargest - Xsmallest.

9

sample variance

Approximately the average of the squared deviations of the values from the sample mean: the sum of squared deviations is divided by n - 1 rather than n.

10

sample standard deviation

  • Most commonly used measure of variation.

  • Shows variation about the mean.

  • Is the square root of the variance.

  • Has the same units as the original data.
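The range, sample variance, and sample standard deviation can be worked through by hand on a small sample. The sketch below assumes the usual n - 1 divisor for the sample variance; the data are hypothetical.

```python
from math import sqrt

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
n = len(values)
x_bar = sum(values) / n            # sample mean

# Range: largest value minus smallest value.
value_range = max(values) - min(values)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_variance = sum((x - x_bar) ** 2 for x in values) / (n - 1)

# Sample standard deviation: square root of the variance,
# expressed in the same units as the original data.
sample_std = sqrt(sample_variance)

print(value_range, sample_variance, sample_std)
```

Here the mean is 5 and the squared deviations sum to 32, so the sample variance is 32 / 7 (about 4.57) and the standard deviation is about 2.14.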

11

coefficient of variation

  • Measures relative variation.

  • Always in percentage (%).

  • Shows variation relative to mean.

  • Can be used to compare the variability of two or more sets of data measured in different units.
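As an illustration, the coefficient of variation can be computed as the sample standard deviation divided by the mean, times 100. The function name and both data sets below are hypothetical; the point is only that the two results are comparable even though the units differ.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

prices_usd = [20, 22, 18, 21, 19]       # hypothetical, in dollars
weights_kg = [200, 260, 140, 230, 170]  # hypothetical, in kilograms

cv_prices = coefficient_of_variation(prices_usd)
cv_weights = coefficient_of_variation(weights_kg)

print(f"prices:  {cv_prices:.1f}%")
print(f"weights: {cv_weights:.1f}%")
```

In this made-up example the weights vary more relative to their mean (about 23.7%) than the prices do (about 7.9%).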

12

shape of a distribution

Describes how data are distributed.

Two useful measures of shape:

  • skewness

  • kurtosis

13

skewness

Measures the extent to which data values are not symmetrical.

14

kurtosis

Kurtosis measures the peakedness of the curve of the distribution—that is, how sharply the curve rises approaching the center of the distribution.

15

quartiles

split the ranked data into 4 segments with an equal number of values per segment.

16

five-number summary

The five numbers that help describe the center, spread, and shape of the data:

  • Xsmallest.

  • First Quartile (Q1)

  • Median (Q2).

  • Third Quartile (Q3).

  • Xlargest.

17

Interquartile range (midspread)

measures the spread in the middle 50% of the data.

  • measure of variability that is not influenced by outliers or extreme values
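The quartiles, five-number summary, and interquartile range can be sketched with `statistics.quantiles`. Note the assumption: its default "exclusive" method interpolates between ranked values, and other quartile conventions (including textbook rounding rules) can give slightly different cut points on other data. The sample is hypothetical.

```python
from statistics import quantiles

values = sorted([11, 12, 13, 16, 16, 17, 18, 21, 22])  # hypothetical sample

# Three cut points that split the ranked data into four segments.
q1, q2, q3 = quantiles(values, n=4)

# Five-number summary: smallest, Q1, median (Q2), Q3, largest.
five_number_summary = (values[0], q1, q2, q3, values[-1])

# Interquartile range: spread of the middle 50% of the data.
iqr = q3 - q1

print(five_number_summary)
print(iqr)
```

For this sample Q1 is 12.5, the median is 16, and Q3 is 19.5, so the interquartile range is 7. Replacing the largest value with a much bigger one would change the range but leave the interquartile range untouched.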

18

boxplot

A graphical display of the data based on the five-number summary.

  • If the data are symmetric around the median, then the box and central line are centered between the endpoints.

19

population parameters:

  • population mean

  • variance

  • standard deviation

20

population mean

the sum of the values in the population divided by the population size, N.

21

population variance

Average of the squared deviations of the values from the population mean (the sum of squared deviations divided by the population size, N).

22

population standard deviation

  • Most commonly used measure of variation.

  • Shows variation about the mean.

  • Is the square root of the population variance.

  • Has the same units as the original data.
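To show how the population measures differ from their sample counterparts, the sketch below treats a small hypothetical data set as an entire population, so the divisor is N instead of n - 1. It uses `pvariance` and `pstdev` from Python's standard `statistics` module.

```python
from statistics import mean, pvariance, pstdev, variance

population = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical: treated as the whole population

mu = mean(population)             # population mean: sum of values / N
sigma_squared = pvariance(population)  # squared deviations averaged over N
sigma = pstdev(population)        # square root of the population variance

print(mu, sigma_squared, sigma)

# The same values treated as a sample use n - 1, giving a larger variance.
print(variance(population))
```

The population variance here is 4 and the population standard deviation is 2, while the sample variance of the same values is 32 / 7 (about 4.57).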

23

empirical rule

approximates the variation of data in a symmetric, mound-shaped (bell-shaped) distribution.
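The empirical rule is usually stated as roughly 68%, 95%, and 99.7% of values falling within 1, 2, and 3 standard deviations of the mean. As a quick check, and assuming the mound-shaped distribution is exactly normal, those shares can be computed with `statistics.NormalDist`. The helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Real data that are only approximately bell-shaped will match these percentages only approximately.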

24

Chebyshev’s Rule

Regardless of how the data are distributed, at least (1 - 1/k²) x 100% of the values will fall within k standard deviations of the mean (for k > 1).
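A short sketch of Chebyshev's rule: the function below evaluates the bound (1 - 1/k²) x 100% and then checks it against a deliberately skewed, made-up data set. The function name and the data are hypothetical, and the data are treated as a population.

```python
from statistics import mean, pstdev

def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's rule applies only for k > 1")
    return (1 - 1 / k**2) * 100

# Hypothetical, strongly right-skewed data.
values = [1, 1, 1, 1, 2, 2, 3, 5, 9, 25]
mu, sigma = mean(values), pstdev(values)

k = 2
inside = [x for x in values if abs(x - mu) <= k * sigma]
observed_pct = 100 * len(inside) / len(values)

print(chebyshev_lower_bound(k))  # guaranteed minimum for k = 2
print(observed_pct)              # actual share in this data set
```

For k = 2 the rule guarantees at least 75%, and for k = 3 about 88.9%. In this skewed example 90% of the values fall within two standard deviations, which is consistent with the bound.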

25

covariance

measures the strength of the linear relationship between two numerical variables (X & Y).

26

coefficient of correlation

Measures the relative strength of the linear relationship between two numerical variables.
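The difference between covariance and the coefficient of correlation shows up when one variable changes units. The sketch below assumes the sample (n - 1) form of the covariance and uses hypothetical function names and made-up data: rescaling X changes the covariance but leaves the correlation unchanged.

```python
from math import sqrt

def sample_covariance(x, y):
    """Sum of cross-products of deviations from the means, divided by n - 1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance divided by the product of the two sample standard deviations."""
    s_x = sqrt(sample_covariance(x, x))
    s_y = sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

hours = [1, 2, 3, 4, 5]        # hypothetical X
score = [52, 55, 61, 64, 68]   # hypothetical Y
minutes = [h * 60 for h in hours]

print(sample_covariance(hours, score), sample_covariance(minutes, score))
print(correlation(hours, score), correlation(minutes, score))
```

The covariance is 10.25 with X in hours and 615 with X in minutes, while the correlation stays at about 0.994 in both cases, which is why the correlation is the measure of relative strength.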

27

Data analysis is objective:

Should report the summary measures that best describe and communicate the important aspects of the data set.

28

Data interpretation is subjective:

Should be done in a fair, neutral, and clear manner.

29

Numerical descriptive measures:

  • Should document both good and bad results.

  • Should be presented in a fair, objective and neutral manner.

  • Should not use inappropriate summary measures to distort facts.
