QA QUIZ 2

72 Terms

1

What do measures of variability describe in descriptive statistics?

The spread or dispersion of the data.

2

Name the three most common measures of variability.

The range, the standard deviation (and the variance), and the coefficient of variation.

3

How is the range calculated?

It is the difference between the highest and lowest value in a set of data (Range = Highest value - Lowest value).

4

What is a main disadvantage of using the range as a measure of variability?

It is extremely sensitive to outliers and is not a resistant measure of variability since it depends only on the maximum and minimum observations.
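A minimal sketch of the range formula and its sensitivity to a single outlier. The data values and the function name `value_range` are made up for illustration.

```python
def value_range(values):
    """Range = highest value - lowest value."""
    return max(values) - min(values)

scores = [12, 15, 15, 18, 20]     # made-up data
with_outlier = scores + [95]      # same data plus one extreme value

print(value_range(scores))        # 8
print(value_range(with_outlier))  # 83
```

Adding one extreme observation changes the range from 8 to 83, even though the other five values are unchanged.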

5

What does standard deviation measure?

It measures the typical distance of data values from the mean, taking every value in the distribution into account.

6

What symbols are used to denote the population standard deviation and the sample standard deviation?

The population standard deviation is denoted by σ (sigma), and the sample standard deviation is denoted by s.

7

How does a data set with a large standard deviation differ from one with a small standard deviation?

A data set with a large standard deviation or variance is usually more spread out about the mean, while one with a small standard deviation or variance is usually more tightly clustered about the mean.

8

What is the variance, and how is it related to the standard deviation?

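The variance is the average of the squared deviations of values from the mean (for a sample, the sum of squared deviations is divided by n − 1 rather than n). It is the square of the standard deviation; equivalently, the standard deviation is the square root of the variance.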
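A short sketch of the variance and standard deviation using Python's standard `statistics` module. The data are made up; the population versions divide by n and the sample versions divide by n − 1.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]          # made-up values, mean = 5

pop_var = statistics.pvariance(data)     # population variance (divide by n)
pop_sd = statistics.pstdev(data)         # population standard deviation, sigma
samp_var = statistics.variance(data)     # sample variance (divide by n - 1)
samp_sd = statistics.stdev(data)         # sample standard deviation, s

print(pop_var, pop_sd)     # 4 and 2.0
print(samp_var, samp_sd)   # about 4.57 and about 2.14
```

In both cases the standard deviation is the square root of the corresponding variance.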

9

List some important properties of standard deviation.

It measures variability about the mean, should be used only with the mean as the center, is always non-negative (s ≥ 0), increases with larger variability, has the same units as the original observations, and is sensitive to outliers.

10

How is standard deviation used to identify "unusual" or "extreme" data values?

A data value that is more than 2 standard deviations above or below the mean is considered unusual or extreme.
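The 2-standard-deviation rule can be sketched as follows. The function name `unusual_values`, the use of the sample standard deviation, and the data are assumptions made for illustration.

```python
import statistics

def unusual_values(values, k=2):
    """Return the values lying more than k sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [x for x in values if abs(x - mean) > k * sd]

times = [10, 12, 11, 13, 12, 11, 10, 12, 11, 40]   # made-up data
print(unusual_values(times))                       # [40]
```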

11

What is the Coefficient of Variation (C.V.), and why is it a useful measure of dispersion?

The Coefficient of Variation is a relative measure of dispersion, considering the size of the standard deviation relative to the mean. It is useful because it has no unit and measures variability as a percentage, allowing comparison of variability between groups with different means or variables with different units.
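A minimal sketch of the coefficient of variation, assuming the common definition C.V. = (standard deviation / mean) × 100 with the sample standard deviation and a nonzero mean. The two data sets are made up and use different units on purpose.

```python
import statistics

def coefficient_of_variation(values):
    """C.V. = (sample standard deviation / mean) * 100, as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100

heights_cm = [150, 160, 170, 180, 190]   # made-up data
weights_kg = [50, 60, 70, 80, 90]        # made-up data

print(coefficient_of_variation(heights_cm))   # about 9.3
print(coefficient_of_variation(weights_kg))   # about 22.6
```

Both data sets have the same standard deviation, but relative to their means the weights vary more than the heights.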

12

What is the main purpose of Descriptive Statistics?

To describe the main characteristics or features of a dataset, such as its mean, median, mode, or standard deviation.

13

What is the difference between a population parameter and a descriptive statistic (sample statistic)?

A population parameter quantitatively describes a characteristic of data from a population (e.g., population mean μ), while a descriptive statistic quantitatively describes a characteristic of data from a sample (e.g., sample mean x̄).

14

What do measures of central tendency describe?

They describe the central value, location, or a typical value of a distribution.

15

How is the arithmetic mean calculated?

The arithmetic mean is calculated by summing all values in a dataset and dividing by the total number of observations (sum of all values / number of observations).

16

How are outliers related to the arithmetic mean?

The arithmetic mean is sensitive to extreme values (outliers), meaning outliers can significantly affect its value.

17

How is the median calculated for a simple distribution?

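To find the median, first place all values in order from smallest to largest. Then locate the value at the median position, (n+1)/2, where n is the number of observations. If n is even, this position falls halfway between two values, and the median is the average of those two values.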
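The median-position rule can be sketched as below. The function name `median_by_position` is hypothetical, and averaging the two middle values when n is even is the usual convention, assumed here.

```python
def median_by_position(values):
    """Find the median by locating position (n + 1) / 2 in the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2                    # 1-based median position
    if position.is_integer():
        return ordered[int(position) - 1]     # odd n: a single middle value
    lower = ordered[int(position) - 1]        # even n: value just below the position
    upper = ordered[int(position)]            # and the value just above it
    return (lower + upper) / 2

print(median_by_position([7, 3, 9, 1, 5]))    # 5   (position 3 of 1, 3, 5, 7, 9)
print(median_by_position([7, 3, 9, 1]))       # 5.0 (position 2.5: average of 3 and 7)
```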

18

How do outliers affect the median?

The median is not significantly affected by extreme values or outliers, making it useful for skewed distributions.
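A small sketch, using made-up values, of how one extreme observation moves the mean a great deal but barely moves the median.

```python
import statistics

incomes = [30, 32, 35, 38, 40]        # made-up values, in thousands
with_outlier = incomes + [500]        # add one extreme value

print(statistics.mean(incomes), statistics.median(incomes))            # 35 35
print(statistics.mean(with_outlier), statistics.median(with_outlier))  # 112.5 36.5
```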

19

What is the mode of a distribution?

The mode is the value that occurs most frequently in a distribution. A distribution can have no mode, one mode, or several modes.
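A sketch of finding the mode(s) of a non-empty data set. The function name `modes` is hypothetical, and treating "every value occurs once" as "no mode" is a convention assumed here.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or an empty list if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                      # no value repeats: treated as "no mode"
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 3]))          # []      (no mode)
print(modes([1, 2, 2, 3]))       # [2]     (one mode)
print(modes([1, 1, 2, 2, 3]))    # [1, 2]  (two modes)
```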

20

When is the trimmed mean used?

The trimmed mean is used when there are outliers in the data; it removes a percentage of the smallest and largest values before calculating the mean to reduce the influence of these extremes.
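A minimal sketch of a trimmed mean, assuming the same proportion (below 0.5) is cut from each end and that the number of values cut is rounded down. The function name `trimmed_mean` and the data are made up.

```python
def trimmed_mean(values, proportion):
    """Drop `proportion` of the values from each end, then average the rest."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)        # number of values cut from each end
    trimmed = ordered[k:len(ordered) - k]
    return sum(trimmed) / len(trimmed)

data = [1, 10, 11, 12, 13, 14, 15, 16, 17, 100]   # made-up data with two extremes

print(sum(data) / len(data))      # 20.9 (ordinary mean)
print(trimmed_mean(data, 0.10))   # 13.5 (10% trimmed from each end)
```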

21

What is the purpose of a weighted average?

A weighted average is used when some numbers in a dataset need to be assigned more importance or 'weight' than others. It's calculated as the sum of (value × weight) divided by the sum of weights.
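The weighted-average formula can be sketched as follows. The grades and weights are made-up values, and `weighted_average` is a hypothetical helper name.

```python
def weighted_average(values, weights):
    """Sum of (value * weight) divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

grades = [80, 90, 70]     # made-up scores
weights = [2, 5, 3]       # made-up weights (e.g. quiz, exam, project)

print(weighted_average(grades, weights))   # 82.0
```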

22

Which measure of central tendency is most appropriate for nominal data?

The mode is generally the most appropriate measure of central tendency for nominal data.

23

Which measure(s) of central tendency are resistant to outliers?

The mode, median, and trimmed mean are resistant to extreme values because they are not affected much by outliers in a data set.

24

How do the mean, median, and mode relate in a left-skewed distribution?

In a left-skewed distribution, the mean is typically less than the median, which is less than the mode (Mean < Median < Mode).

25

How do the mean, median, and mode relate in a right-skewed distribution?

In a right-skewed distribution, the mean is typically greater than the median, which is greater than the mode (Mean > Median > Mode).

26

How do the mean, median, and mode relate in a symmetric distribution?

In a symmetric, single-peaked distribution, the mean, median, and mode are approximately equal (Mean ≈ Median ≈ Mode).
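A small sketch of how the mean, median, and mode typically line up in skewed data. The data sets are made up, and these orderings are tendencies rather than guarantees.

```python
import statistics

right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]    # made-up data, long right tail
left_skewed = [16 - x for x in right_skewed]      # mirror image, long left tail

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, statistics.mean(data), statistics.median(data), statistics.mode(data))
# right: mean 4.6  > median 3.0  > mode 2
# left:  mean 11.4 < median 13.0 < mode 14
```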

27

What are the primary objectives when summarizing and presenting quantitative data?

To summarize the distribution of a quantitative variable with frequency tables (simple, relative, grouped) and to make, describe, and compare histograms of quantitative data distributions.

28

What is a Simple Distribution in the context of quantitative variables?

A list of data values placed in ascending order.

29

What is a Simple Frequency Distribution?

It shows all the values a variable can take and the number of times (frequency, f) each value appears in the data set.

30

What is a Grouped Frequency Distribution?

It shows categories of values that a variable can take and the number of times (frequency, f) a value from the data set appears in a given category.

31

In frequency distributions, what does 'x' always represent?

The variable being measured.

32

In frequency distributions, what does 'n' always represent?

The sample size.

33

What are the steps to construct a Simple Frequency Distribution from raw data?

1. List all the unique values of the variable (x).
2. Count the number of individuals who answered each of the values of x (frequency, f).

34

How do you verify the sample size in a simple frequency distribution?

By summing all the frequencies (f) in the distribution.
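A sketch of building a simple frequency distribution from raw data and checking the sample size by summing the frequencies. The responses are made up for illustration.

```python
from collections import Counter

responses = [2, 3, 1, 2, 2, 4, 3, 2, 1, 3]   # made-up raw data

freq = Counter(responses)          # maps each unique value x to its frequency f
for x in sorted(freq):
    print(x, freq[x])              # 1 2 / 2 4 / 3 3 / 4 1

n = sum(freq.values())             # the frequencies add up to the sample size
print(n)                           # 10
```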

35

What is the primary purpose of organizing data into a grouped frequency distribution?

To make the distribution shorter and more readable by grouping observations (data) into classes.

36

In a grouped frequency distribution, what do percentages (%) communicate?

The frequency of each class in percentage form relative to the total sample size.

37

What is Cumulative Frequency (cf) in a grouped frequency distribution?

The sum of the frequencies of a given class and all classes that came before it.

38

What is Cumulative Percentage (c%) in a grouped frequency distribution?

The sum of the percentages of a given class and all classes that came before it; it works like cumulative frequency, but percentages are accumulated instead of frequencies.

39

What do midpoints (m) represent in a grouped frequency table?

The 'middle value' of each class interval.

40

What is the formula for calculating the midpoint (m) of a class in a grouped frequency distribution?

Midpoint (m) = (Lower limit + Upper limit) / 2
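The columns of a grouped frequency table (%, cf, c%, and m) can be sketched together. The class limits and frequencies below are hypothetical.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower limit, upper limit, frequency)
classes = [(0, 9, 4), (10, 19, 10), (20, 29, 5), (30, 39, 1)]

freqs = [f for _, _, f in classes]
n = sum(freqs)                                         # sample size: 20

percents = [100 * f / n for f in freqs]                # [20.0, 50.0, 25.0, 5.0]
cum_freqs = list(accumulate(freqs))                    # [4, 14, 19, 20]
cum_percents = list(accumulate(percents))              # [20.0, 70.0, 95.0, 100.0]
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]   # [4.5, 14.5, 24.5, 34.5]

print(percents, cum_freqs, cum_percents, midpoints)
```

The last cumulative frequency equals n and the last cumulative percentage equals 100, which is a quick check on the table.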

41

What is a histogram?

A special type of bar graph used to display the distribution of quantitative data.

42

What is a key characteristic of the bars in a histogram?

There are no spaces between the bars of a histogram.

43

What do the heights of the bars in a histogram represent?

The frequencies or relative frequencies of values in each interval.

44

What aspects should be described when interpreting the overall pattern of a histogram?

Its shape, center, variability/spread, and any outliers.

45

What is an outlier in the context of a histogram or data distribution?

An individual value that falls outside the overall pattern of the data.

46

Describe the characteristics of a mound shape/symmetric distribution.

It is single-peaked, with much of the data clustered around one clear center, and observations decrease as one moves away from the center in either direction. Both sides are roughly the same if folded vertically down the middle.

47

What does skewness refer to in a histogram?

Skewness means that one tail of the histogram is stretched out longer than the other; the direction of skewness is the side of the longer tail.

48

What is a Right (Positively) skewed distribution?

While most of the data are clustered around a low value, a number of cases stretch out into the higher (right) values.

49

What is a Left (Negatively) skewed distribution?

While most of the data are clustered around a large value, a number of cases stretch down into the lower (left) values.

50

What characterizes a bimodal distribution?

It has two distinct peaks, meaning two classes with the largest frequencies are separated by at least one class, often indicating the presence of two separate populations within the data.

51

What are line graphs used to display?

Measurements of the same variable recorded at regular intervals over a period of time, useful for showing how data change over time.

52

What is another name for line graphs when they display data over time?

Time-series graphs.

53

In the example of student commuting time, what is the variable of interest?

Average commuting time (in minutes from home to school).

54

In the example of student commuting time, what is the sample size (n)?

35 students.

55

What is the primary purpose of organizing data?

To reveal the distribution of the data, that is, to pinpoint where the data values tend to concentrate.

56

What does the distribution of a variable tell us?

It tells us what values the variable takes and how often it takes these values.

57

What is a frequency distribution table for a categorical variable?

It lists the categories and gives either the count, relative frequency, or percent of individuals who fall in each category.

58

What are cross-tabulations?

Tables that display the distribution of data across two categorical variables.

59

How is relative frequency calculated?

Relative frequency is calculated as the frequency (f) divided by the sum of all frequencies (Σf), which is equal to the sample size (n).

60

What are descriptive statistics?

Numbers that describe certain characteristics of a sample, highlighting salient features of a data distribution.

61

How do you calculate a proportion?

By dividing the portion you are interested in (frequency) by the whole (sample size).

62

How do you convert a proportion to a percentage?

Multiply the proportion by 100.

63

From the social media preference example with n=50, what proportion of respondents prefer Instagram if 9 people preferred it?

0.18 (9 divided by 50).

64

From the social media preference example, what is the ratio of respondents who preferred Snapchat to those who prefer Twitter if 8 preferred Snapchat and 4 preferred Twitter?

2 to 1 (8 divided by 4).
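The proportion, percentage, and ratio from the social media example can be checked with a few lines. The counts come from the cards above; the variable names are chosen for illustration.

```python
n = 50                                   # sample size from the example
instagram, snapchat, twitter = 9, 8, 4   # counts given in the cards

proportion = instagram / n                 # part divided by whole
percentage = round(proportion * 100, 2)    # proportion times 100
ratio = snapchat / twitter                 # Snapchat count per Twitter count

print(proportion)   # 0.18
print(percentage)   # 18.0
print(ratio)        # 2.0, i.e. 2 to 1
```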

65

What is a pie chart used for?

To show the distribution of a categorical variable as a 'pie' whose slices are sized by the percentage for the categories, emphasizing each category's relation to the whole.

66

When should pie charts be avoided?

If there are too many categories, if the percentages do not sum to 100%, or to display distribution across two categorical variables.

67

What is a bar chart used for?

To represent each category of a variable as a bar, where bar heights show category counts or percentages.

68

What is a cluster bar chart?

A bar chart that displays and compares two or more groups along the same variable.

69

What is a segment bar chart?

A chart that displays the distribution of a categorical variable as portions (segments) of a rectangle, with the area of each segment proportional to the percentage of individuals in the corresponding category.

70

When is a bar chart generally preferred over a pie chart?

When comparing the magnitude of differences between categories, when there is a larger number of categories, or when emphasizing the distribution of data.

71

When is a pie chart generally preferred over a bar chart?

When emphasizing the relationship of parts to a whole, with a smaller number of categories, or for a simple comparison of proportions or percentages, provided all categories' percentages sum to 100%.

72

What are the key elements of a good graph?

Title, plot, source, legend, and axis titles.
