Displaying and Describing Data (ST 311 (2))

64 Terms

1
Frequency distribution (frequency table)

Often helpful in organizing and summarizing data
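
A minimal sketch of building a frequency table in Python. The `scores` list is made-up illustrative data, not from the course.

```python
from collections import Counter

# Hypothetical quiz scores (illustrative data only).
scores = [7, 8, 8, 9, 7, 8, 10, 9, 8, 7]

# Count how often each distinct value occurs.
frequency = Counter(scores)

# Print the table in ascending order of value.
print("value | frequency")
for value in sorted(frequency):
    print(f"{value:>5} | {frequency[value]}")
```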

2
Measure of center

A value at or near the center or middle of a data set. Measures of center are often interpreted as “typical” values for a group

3
Mean, median, mode

What are the most common measures of center?

4
Σ

Denotes a sum. Pronounced “sigma”

5
𝑥

Denotes an individual data value

6
n

Denotes the number of values in a sample. Also called the sample size

7
N

Denotes the number of values in a population

8
𝑥̅

Denotes the sample mean. Pronounced “x bar”

9
𝜇

Denotes the population mean. Pronounced “mew”. The mean of the entire population. The value is generally unknown

10
mean

The … of a data set is found by adding all values and dividing by the number of values in the set

11
Median

The … of a data set is the middle value when the data are listed in ascending order. It separates the bottom 50% of the data from the top 50%: roughly half of all values are below it and half are above it.

12
Mode

If it exists, it is the value that occurs with the greatest frequency. Some data sets have none.
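
The three measures of center can be checked with Python's standard `statistics` module. This is a small sketch; the `data` list is made up for illustration.

```python
import statistics

# Hypothetical sample (illustrative data only).
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum of values / number of values = 30 / 6
median = statistics.median(data)  # middle of the sorted data; average of 3 and 5
mode = statistics.mode(data)      # most frequent value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

With an even number of values there is no single middle value, so the median is the average of the two middle values.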

13
unimodal

A dataset with one mode is called ….

14
multiple

A data set may have … modes

15
modes

If there are multiple values tied for most frequent, they are all….

16
bimodal

A dataset with two modes is called….

17
multimodal

A data set with more than two modes is called…
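
A sketch of classifying a data set by its number of modes. The helper name `describe_modes` is hypothetical, and treating "no value repeats" as "no mode" is an assumed convention (some texts differ). The input is assumed to be non-empty.

```python
from collections import Counter

def describe_modes(data):
    """Return (modes, label) for a non-empty list of values."""
    counts = Counter(data)
    top = max(counts.values())
    # Assumed convention: if no value repeats, report no mode.
    if top == 1:
        return [], "no mode"
    # Every value tied for the highest frequency is a mode.
    modes = sorted(value for value, count in counts.items() if count == top)
    labels = {1: "unimodal", 2: "bimodal"}
    return modes, labels.get(len(modes), "multimodal")

print(describe_modes([1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 2, 3, 3, 4]))
print(describe_modes([1, 2, 3]))
```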

18
This depends on your data

Which measure of center is appropriate/best?

19
Outlier

A value that does not fit in with the overall pattern of the dataset may be considered an ….

20
Mean

Uses every data value/Highly affected by outliers/Not good for skewed data sets (but is best for symmetric data)

21
Median

Not affected by outliers/Can use with any data set
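
To see how differently the mean and the median react to an outlier, here is a small sketch with made-up numbers. Strictly speaking the median can shift a little when an extreme value is added, but far less than the mean does.

```python
import statistics

# Hypothetical data without and with one extreme value (illustrative only).
without_outlier = [10, 12, 13, 15, 20]
with_outlier = without_outlier + [200]

mean_before = statistics.mean(without_outlier)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(without_outlier)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)      # large jump
print("median:", median_before, "->", median_after)  # barely moves
```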

22
Mode

Not necessarily in the center/Not affected by outliers/Only useful for multimodal or qualitative data

23
histogram

The graph of a frequency distribution is called a … which can make it easier to interpret patterns

24
concepts

Bars of equal width drawn adjacent to each other (unless there are gaps in the data)/A horizontal scale representing classes of quantitative data values/A vertical scale (height of the bars) represents frequency. These are all the … of a histogram.
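
A rough text-only sketch of a histogram, turned sideways so each row is one class. It assumes non-negative integer data and a class width of 5; both the data and the width are made up for illustration.

```python
from collections import Counter

# Hypothetical data and class width (illustrative only).
data = [1, 2, 2, 3, 5, 6, 6, 7, 8, 12]
class_width = 5

# Map each value to the lower limit of its class: 0-4, 5-9, 10-14, ...
counts = Counter((x // class_width) * class_width for x in data)

# One row per class; the bar length is the class frequency.
for lower in range(min(counts), max(counts) + class_width, class_width):
    upper = lower + class_width - 1
    print(f"{lower:>2}-{upper:<2} | " + "#" * counts[lower])
```

Every class has the same width, and an empty class would still get a row (with no bar), which is how a gap in the data shows up.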

25
Dotplot

Shows each value in a dataset as a dot above a number line

26
categorical

Pie charts, bar charts, column charts, stacked charts, and others exist for …. data

27
pie chart

If the data have a natural order, a … is not the best choice. A bar chart whose horizontal axis puts the categories in order is a better choice

28
misleading

Vertical axis can exaggerate differences/Does the y-axis start at zero?/Is the y-axis skewed or stretched?/Using 3D can make certain categories seem larger/smaller/Misrepresenting areas (by mistake or on purpose) is misleading/Selecting the wrong type of graph to represent your data can be very confusing/Improper scaling (especially in pictographs) can exaggerate differences/Not labeling a graph makes readers fill in the blanks/Improper extraction does not show the whole picture. These are all the ways that graphs can be …

29
Frequency or probability

…. distributions can take on different shapes

30
Skewness

Is a measure of the asymmetry of a distribution. Values far from the “peak” skew a distribution in their direction

31
Normal distribution

In a symmetric distribution, the mean = median = mode. These measures occur under the peak

32
Right-skewed (positively skewed)

In a … distribution, the mode < median < mean. Outliers, if any, appear on the right side

33
Left-skewed (negatively skewed)

In a …. distribution, the mean < median < mode. Outliers, if any, appear on the left side
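
The ordering of mode, median, and mean in skewed data can be checked on small made-up data sets. This is only a sketch: the ordering is a rule of thumb that holds for these examples, not a guarantee for every skewed data set.

```python
import statistics

# Hypothetical data with a long right tail (illustrative only).
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image of the same data, giving a long left tail.
left_skewed = [16 - x for x in right_skewed]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(
        name,
        "mode:", statistics.mode(data),
        "median:", statistics.median(data),
        "mean:", statistics.mean(data),
    )
```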

34
consistency

In everyday use, variability can mean a lack of …

35
Variability

The extent to which data points in a statistical distribution or data set diverge (or vary) from the average value

36
Range or interquartile range, Variance, and Standard deviation

What are some common measures of variation (or measures of spread)?

37
Range

The … of a data set is the difference between the maximum and the minimum

38
Range

Maximum data value - minimum data value

39
affected by outliers

Since the range is calculated using only the two most extreme data values, it is highly…
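
A short sketch of the range calculation and its sensitivity to a single extreme value. The helper name `data_range` and the numbers are made up for illustration.

```python
def data_range(values):
    """Range = maximum data value - minimum data value."""
    return max(values) - min(values)

# Hypothetical data without and with one extreme value (illustrative only).
data = [4, 8, 15, 16, 23, 42]
data_with_outlier = data + [400]

print(data_range(data))               # 42 - 4
print(data_range(data_with_outlier))  # 400 - 4
```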

40
Interquartile Range (IQR)

Uses quartiles to provide a range of values that is not as affected by potential outliers as the range is

41
Quartiles

… are values that separate a data set into fourths (or quarters, hence quartiles)

42
Q1

The first quartile

43
Q2

The second quartile

44
Q3

The third quartile

45
1/4

About … (25%) of the data lie between any two consecutive quartiles

46

1. Minimum
2. Q1
3. Median (Q2)
4. Q3
5. Maximum

The 3 quartiles together with the minimum and the maximum values constitute the five-number summary
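
A sketch of computing a five-number summary with Python's `statistics.quantiles`. The helper name `five_number_summary` and the data are made up, and `method="inclusive"` is just one quartile convention; other software or textbooks may give slightly different Q1 and Q3 values.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    # n=4 splits the data into quarters and returns the three cut points.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

# Hypothetical data (illustrative only).
summary = five_number_summary([1, 3, 5, 7, 9, 11, 13])
print(summary)
```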

47
Q3 and Q1

The IQR is the difference between the …

48
slightly

Sometimes the median is excluded, and the IQR will differ …

49
not affect

If there are a large number of data values, this will probably … the IQR much.
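
A sketch of how including or excluding the median changes the IQR. It assumes the "median of each half" way of finding Q1 and Q3; the helper name `iqr` and the data are made up for illustration.

```python
import statistics

def iqr(data, include_median):
    """IQR = Q3 - Q1, with Q1 and Q3 taken as medians of the two halves."""
    values = sorted(data)
    half = len(values) // 2
    if len(values) % 2 == 1 and include_median:
        # Odd count: the middle value is placed in both halves.
        lower, upper = values[:half + 1], values[half:]
    else:
        # Even count, or odd count with the middle value left out.
        lower, upper = values[:half], values[len(values) - half:]
    return statistics.median(upper) - statistics.median(lower)

small = [1, 3, 5, 7, 9, 11, 13]
large = list(range(1, 1002))  # 1001 values

print(iqr(small, include_median=False), iqr(small, include_median=True))
print(iqr(large, include_median=False), iqr(large, include_median=True))
```

For the seven-value data set the two answers differ noticeably relative to the spread; for the 1001-value data set the difference is tiny compared with the IQR itself.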

50
IQR may not

If there are a very small number of data values, then we can look directly at those values and a summary such as the …. be necessary

51
Boxplot

Is a visual representation of the five-number summary and also helps identify outliers. Can be displayed vertically or horizontally

52
distribution

You can get a sense of the shape of a … from its boxplot

53
Variance

The …. is the square of the standard deviation (standard deviation)²

54
standard deviation

The … is the square root of the variance (√variance)
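
A quick check of the square and square-root relationship between variance and standard deviation. For simplicity this sketch treats the made-up data as a whole population and uses `statistics.pvariance` and `statistics.pstdev`.

```python
import math
import statistics

# Hypothetical data, treated as a full population (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # average squared deviation from the mean
std_dev = statistics.pstdev(data)      # square root of the variance

print("variance:", variance)
print("standard deviation:", std_dev)
print("std_dev squared equals variance:", math.isclose(std_dev ** 2, variance))
```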

55
same

Because the units of standard deviation are the … as the units of the data, the interpretation is easier to understand. Therefore, we tend to use the standard deviation. However, there are circumstances when it is easier to use the variance since it does not include a square root

56
standard deviation

The … is defined as a measure of how much data values deviate from the mean

57
negative

The value of the standard deviation is never…. It is zero only when all of the data values are exactly the same

58
larger

In standard deviation, … values indicate greater amounts of variation

59
increase

The standard deviation can … dramatically with one or more outliers
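
A sketch of two properties of the standard deviation, using made-up data: it is zero when every value is identical, and a single outlier can inflate it dramatically.

```python
import statistics

# Hypothetical data sets (illustrative only).
identical = [7, 7, 7, 7]
without_outlier = [10, 12, 13, 15, 20]
with_outlier = without_outlier + [200]

sd_identical = statistics.pstdev(identical)    # no variation at all
sd_before = statistics.stdev(without_outlier)  # sample standard deviation
sd_after = statistics.stdev(with_outlier)      # same data plus one outlier

print("all values equal:", sd_identical)
print("without outlier:", round(sd_before, 2))
print("with outlier:   ", round(sd_after, 2))
```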

60
the same as

The units of the standard deviation (such as minutes, feet, pounds) are …. the units of the original data values

61
population or a sample

For variance and standard deviation, we use different symbols and we use different formulas, depending on whether the data set is from a ….
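
A sketch of the two variance formulas, assuming the usual textbook definitions: the population variance divides the sum of squared deviations by N, while the sample variance divides by n - 1. The function names and data are made up for illustration, and the results are compared against Python's `statistics` module.

```python
import math
import statistics

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

# Hypothetical data (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data), statistics.pvariance(data))
print(sample_variance(data), statistics.variance(data))
```

The sample variance comes out slightly larger than the population variance for the same numbers, because it divides by a smaller quantity.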

62
Population variance

σ² (σ is the Greek lower-case letter “sigma”; we call this “sigma squared”)

63
Standard deviation

𝜎 (𝜎 is the Greek lower-case letter “sigma”)

64
Sample variance

s² (we read this “s squared”)