stats 10 midterm

51 Terms

1
New cards

Marital status of each member of a randomly selected group of adults is an example of what type of variable?

categorical variable

2
New cards

The average number of hours spent completing statistics homework for a randomly selected group of statistics students is an example of what type of variable?

numerical variable

3
New cards

A recent report showed there were 49 accidents involving pedestrians in City A and 62 accidents involving pedestrians in City B this year. The mayor of City A claims that his city is safer for pedestrians than City B. What information is missing that might contradict this claim?

the total number of pedestrians in both City A and City B

4
New cards

A group of students is divided into two groups. One group listens to classical music while taking a math test and the other group takes the test in silence. The average test scores of the two groups are compared to see whether listening to music during a math test has an effect on scores. What type of study is this?

controlled experiment

5
New cards

Consider the following statement: "Researchers conducted a large observational study and determined that children who participated in school music programs scored higher on math exams in later grades than those who did not." Suppose that upon hearing this a politician states that all children should participate in school music programs. What is wrong with the politician's statement?

The politician confused correlation with causation.

6
New cards

The distribution of marital status of a randomly selected group of adults would best be visualized by which plot?

bar chart

7
New cards

Which of the following is (are) key features of a well-designed experiment? (Select all that apply)

A) The population should be large enough to observe the full range of variability

B) The subjects of the study should be assigned to the groups randomly

C) The study should use a placebo if possible

D) The researchers should know who is in which group

B) The subjects of the study should be assigned to the groups randomly

C) The study should use a placebo if possible

8
New cards

Which of the following can be calculated from a dotplot?

A) Range

B) IQR

C) Mode

D) Mean

E) All of the above

E) All of the above

9
New cards

A large university conducted a survey among their students and received 300 responses. The survey asked the students for the following information: Age, Year in school (Freshman, Sophomore, Junior, Senior, Graduate student), GPA, Gender. What type of graph would you use to describe the variables Gender and Year in school?

A side-by-side bar chart should be used since these are two categorical variables

10
New cards

The mean price of a pound of ground beef in 75 cities in the Midwest is $2.11 and the standard deviation is $0.56. Suppose the histogram of the data shows that the distribution is unimodal and symmetrical. A local grocer is selling a pound of ground beef for $3.50. What is this price in standard units? Assuming the Empirical Rule applies, would this price be unusual or not? Round to the nearest hundredth.

z = 2.48; This is unusually expensive ground beef.
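
The standardization behind this answer can be sketched in a few lines of Python. The numbers come from the card; the variable names and the "more than 2 standard deviations" cutoff for "unusual" are assumptions based on the usual reading of the Empirical Rule.

```python
# Values taken from the card; variable names are illustrative.
mean_price = 2.11
sd_price = 0.56
grocer_price = 3.50

# Standard units: how many standard deviations the price sits from the mean.
z = (grocer_price - mean_price) / sd_price
z_rounded = round(z, 2)

# Assumed cutoff: values more than 2 SDs from the mean count as unusual.
is_unusual = abs(z) > 2

print(z_rounded, is_unusual)
```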

11
New cards

Which one of the following best describes the relationship between the correlation and the slope of the regression line modeling the relationship between X and Y?

the sign of the correlation between X and Y is the same as the sign of the slope of the regression line modeling the relationship between X and Y.

12
New cards

A horticulturist conducted an experiment on 110 thirty-six-inch plant boxes to see if the amount of plant food given to the plant boxes was associated with the number of tomatoes harvested from the plants. The mean amount of plant food given was 27.8 milliliters with a standard deviation of 2.1 milliliters. The mean number of tomatoes harvested was 7.5 with a standard deviation of 1.5. The correlation coefficient was 0.7691. Use the information given to calculate the slope of the linear model that predicts the number of tomatoes harvested from the amount of plant food given. Round to the nearest hundredth.

B) 0.55
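
As a rough check, the slope of a least-squares line can be computed as r times the ratio of the standard deviations (response over explanatory). The sketch below uses the card's numbers; the variable names are illustrative.

```python
# Summary statistics from the card.
r = 0.7691
sd_food = 2.1        # explanatory variable (ml of plant food)
sd_tomatoes = 1.5    # response variable (tomatoes harvested)

# Least-squares slope: r * (SD of y) / (SD of x).
slope = r * sd_tomatoes / sd_food

print(round(slope, 2))
```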

13
New cards

Which of the following statements regarding the correlation coefficient is true?

A correlation coefficient of -1 means that as one variable increases, the other decreases

14
New cards

In the NBA, the correlation between "steals per game" and "blocked shots per game" is found to be 0.8045. Choose the statement that is true about the coefficient of determination.

A) The coefficient of determination, r^2, is equal to approximately 0.6472.

B) The coefficient of determination states that about 64.72% of the variation in the blocked shots per game is explained by steals per game.

C) When given as a percent, the coefficient of determination is always between 0 and 100%.

D) All of the above are true statements.

All of the above are true statements.

15
New cards

If two numerical variables X and Y have a correlation coefficient of 0.90, what percentage of the variation in one variable can be accounted for by the other variable?

81%
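
Both this card and the NBA card before it rest on the same step: square the correlation coefficient. A minimal sketch, with illustrative variable names:

```python
# The coefficient of determination is the square of the correlation coefficient.
r_card_15 = 0.90
r_card_14 = 0.8045   # NBA steals vs. blocked shots

pct_explained = round(r_card_15 ** 2 * 100)   # percent of variation explained
r_squared_nba = round(r_card_14 ** 2, 4)

print(pct_explained, r_squared_nba)
```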

16
New cards

Consider the following statement: "A study found that people who consume more alcohol are more likely to die at an early age." Which of the following is a possible confounding variable in this study?

A) Smoking

B) Gender

C) Socioeconomic status

D) None of these

E) All of these

E) All of these

17
New cards

An experiment was conducted to explore the relationship between the amount of plant food given to a certain plant and the number of fruits harvested from the plant.

The summary statistics were:

Statistic | Amount of plant food given (ml) -- X | Number of fruits harvested -- Y
Mean | 25 | 8
SD | 2 | 1.5
r | 0.8

Suppose the relationship was found to be linear for the amount of plant food given in between 50 ml and 200 ml, fit a least-squares regression model to predict the number of fruits that could be harvested.

Find the equation of the regression line and select the correct answer below.

If 100 ml plant food is given to the plant, the predicted harvest from the plant is about 53 fruits.
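
The prediction can be reproduced with a short sketch. It assumes the summary table reads as mean 25 and SD 2 for X, mean 8 and SD 1.5 for Y, and r = 0.8; the variable names are illustrative.

```python
# Summary statistics as read from the card's table (assumed parse).
mean_x, sd_x = 25, 2      # plant food given (ml)
mean_y, sd_y = 8, 1.5     # fruits harvested
r = 0.8

# Least-squares line: slope = r * sd_y / sd_x, and the line passes
# through the point of means (mean_x, mean_y).
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

predicted = intercept + slope * 100   # prediction at 100 ml
print(round(slope, 2), round(intercept, 2), round(predicted, 1))
```

Under that reading the line is roughly y-hat = -7 + 0.6x, which gives about 53 fruits at 100 ml.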

18
New cards

Suppose you were to conduct a study to get student opinions regarding the difficulty of a stats class (100 students). Which one of the methods below gives you a simple random sample?

A) Select the first 20 names using the course roster

B) Randomly choose a discussion section and select all the students from that section.

C) Randomly ask students who come to the office hour during the week

D) Randomly generate 20 numbers between 1 and 100, then use the course roster to select corresponding students.

D) Randomly generate 20 numbers between 1 and 100, then use the course roster to select corresponding students.
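
One way option D might look in practice, sketched with Python's standard `random` module. The roster numbering from 1 to 100 is taken from the card; the variable names are illustrative.

```python
import random

roster_size = 100   # students on the course roster, numbered 1..100
sample_size = 20

# random.sample draws without replacement, so every set of 20 students
# is equally likely: the defining property of a simple random sample.
chosen_numbers = random.sample(range(1, roster_size + 1), sample_size)

print(sorted(chosen_numbers))
```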

19
New cards

Which of the following statements is correct?

A) A negative residual means that the predicted y-value (yhat) is larger than the observed y-value (y)

B) We find the regression line of best fit using the sum of the residuals

C) Negative residuals indicate a bad fit

D) A residual plot that shows a strong linear pattern indicates a good fit

A) A negative residual means that the predicted y-value (yhat) is larger than the observed y-value (y)

20
New cards

Many people who use coconut oil claim it helps with hair care, skin care, stress relief, weight loss, and a boosted immune system. Can we conclude that the use of coconut oil causes these health benefits?

A) Yes, the claims are true stories, so we do have evidence of the health benefits.

B) No, the claims can be lies, so we do not have evidence of the health benefits.

C) No, the claims are anecdotes and do not give us a comparison group to find health differences.

D) Yes, the claims are from experiments and give us a good comparison group to find health differences.

C) No, the claims are anecdotes and do not give us a comparison group to find health differences.

21
New cards

A study recruited 1000 people to participate in a weight loss program. Participants are allowed to choose whether they want to go on a vegetarian diet or follow a traditional low-calorie diet that includes some meat. Half of the people choose the vegetarian diet, and half choose to be in the control group and continue to eat meat. The study found that there is greater weight loss in the vegetarian group.

Which of the following is correct about the study?

A) We can conclude that vegetarian diet helps people lose more weight because this is an experimental study.

B) The result of the study is not reliable because there is no control group.

C) The result of the study is not reliable because there is no randomization and the confounding variables are not controlled.

D) We cannot conclude that vegetarian diet helps people lose more weight because the sample size is too small.

C) The result of the study is not reliable because there is no randomization and the confounding variables are not controlled.

22
New cards

A senator conducted a poll in her state by calling 100 people whose names were randomly sampled from the phone book (mobile phones and unlisted numbers aren't in phone books). The senator's office called those numbers until they got a response from all 100 people chosen.

What bias may exist in the study?

A) No obvious bias

B) Nonresponse

C) Volunteer

D) Undercoverage

D) Undercoverage

23
New cards

Which of the following regarding the coefficient of determination (r-squared) is true?

A) It has a value between -1 and 1

B) It measures how much variation in the response variable can be explained by the explanatory variable

C) It measures the goodness of fit of a model for any relationship

D) It can tell us the direction of the relationship

B) It measures how much variation in the response variable can be explained by the explanatory variable

24
New cards

What does it mean to have a positively skewed distribution?

The tail of the distribution is longer on the right side

25
New cards

An instructor is interested in finding out how much time (in hours) her students spent on the exam preparation. The data of 25 students is displayed in the following histogram.

Which of the following statements is true about the histogram?

None of the 25 students spent 5 hours on the exam preparation.

26
New cards

The existence of multiple peaks is sometimes a sign that

Very different groups have been combined into a single collection of data.

27
New cards

The administrators of XYZ university conducted a poll to find out whether their undergraduate students at the university prefer in-person or online lectures post pandemic.

What is the population of interest here?

All undergraduate students who attend the university

28
New cards

A student investigated a sample of 12 apartments in Westwood and reported how many bathrooms they contain. The data table is given below:

Apartment | Number of Bathrooms
1 | 1
2 | 1
3 | 2
4 | 2
5 | 3
6 | 1
7 | 2
8 | 1
9 | 1
10 | 2
11 | 3
12 | 2

What is the relative frequency of two-bathroom apartments?

0.42
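
A quick tally confirms this. The list below assumes the data table reads as apartments 1 through 12 with bathroom counts 1, 1, 2, 2, 3, 1, 2, 1, 1, 2, 3, 2; the variable names are illustrative.

```python
# Bathroom counts for apartments 1..12, as read from the card's table.
bathrooms = [1, 1, 2, 2, 3, 1, 2, 1, 1, 2, 3, 2]

# Relative frequency = count of the category / total number of observations.
two_bath_count = bathrooms.count(2)
relative_frequency = two_bath_count / len(bathrooms)

print(two_bath_count, round(relative_frequency, 2))
```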

29
New cards

For the following data set:

60, 68, 72, 73, 76, 79, 80, 81, 82, 82, 86, 87, 90, 91, 93, 95, 96, 96, 98, 100

If this data was graphed in a histogram, how many points would be within a bin of (90, 100]?

7
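
The bin (90, 100] excludes 90 and includes 100. A short sketch of the count, with illustrative variable names:

```python
data = [60, 68, 72, 73, 76, 79, 80, 81, 82, 82,
        86, 87, 90, 91, 93, 95, 96, 96, 98, 100]

# (90, 100] is open on the left and closed on the right.
in_bin = [x for x in data if 90 < x <= 100]

print(len(in_bin), in_bin)
```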

30
New cards

A survey was conducted to understand people's living situation in the U.S. The following table shows a sample of data.

State | Zip code | Family size | Annual income | Anxiety Level
Alabama | 35236 | 5 | 32000 | Mild
Florida | 32716 | 8 | 23000 | Moderate
Alabama | 36374 | 4 | 29900 | Mild
California | 94565 | 1 | 31000 | Moderate
Florida | 32116 | 6 | 13500 | Panic
Florida | 32116 | 3 | 39000 | Mild

Select the variables that are Numerical.

family size, annual income

31
New cards

A survey asked a group of people to report the average number of phone calls they made daily. Below is the table that shows the survey results.

Number of calls made | 1-4 | 5-8 | 9-12 | 13-16 | 17-20
Frequency | 16 | 11 | 5 | 3 | 1

Suppose you were to make a frequency histogram from the table to display the distribution of the number of daily phone calls.

Which of the following statements is true? (Select all that apply)

-The highest bar in the histogram falls in the interval 1-4

-The sum of the heights of all the bars in the frequency histogram equals 36

-The maximum number of phone calls reported in the sample data is 20

-The shape of the histogram will remain the same if we change the bin width to 2

-There will be 5 bars in the histogram

-The highest bar in the histogram falls in the interval 1-4

-The sum of the heights of all the bars in the frequency histogram equals 36

-There will be 5 bars in the histogram

32
New cards

The following data set shows the number of floors in the tallest buildings in some cities.

City | # Floors
A | 80
B | 100
C | 90
D | 90
E | 120

Which of the given observations is furthest from the mean and therefore contributes most to the standard deviation?

city E
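
The comparison can be sketched by computing each city's distance from the mean. The floor counts come from the card; the variable names are illustrative.

```python
floors = {"A": 80, "B": 100, "C": 90, "D": 90, "E": 120}

mean_floors = sum(floors.values()) / len(floors)

# Absolute deviation of each observation from the mean.
deviations = {city: abs(n - mean_floors) for city, n in floors.items()}
furthest_city = max(deviations, key=deviations.get)

print(mean_floors, furthest_city, deviations[furthest_city])
```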

33
New cards

A sample of students at a college was asked the fastest speed (mph) they have driven a car.

The five-number summary for the data is:

Min = 72, Q1 = 100, Median = 110, Q3 = 120, Max = 156

Suppose a boxplot was constructed. Which of the following is correct?

-The maximum value would be marked as a potential outlier

-The lower whisker would stop at the value 70

-The upper whisker of the boxplot would extend to the value 156

-The minimum value would be marked as a potential outlier

The maximum value would be marked as a potential outlier
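
The card does not name the outlier rule, so the sketch below assumes the common 1.5 × IQR fences. The variable names are illustrative.

```python
# Five-number summary from the card.
minimum, q1, q3, maximum = 72, 100, 120, 156

iqr = q3 - q1

# Assumed rule: points beyond 1.5 * IQR from the quartiles are potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

max_is_outlier = maximum > upper_fence
min_is_outlier = minimum < lower_fence

print(lower_fence, upper_fence, max_is_outlier, min_is_outlier)
```

With fences at 70 and 150, the maximum (156) falls outside while the minimum (72) stays inside.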

34
New cards

Which of the following statements is incorrect about the center of a distribution?

-The mean, mode and median are equal in a symmetric distribution

-The mean does not represent the typical value well in a highly skewed distribution

-The mode is mostly used for categorical data, but the least used for numerical data

-The median is easily affected by extreme values in the data

The median is easily affected by extreme values in the data

35
New cards

The test scores of 20 students are listed below.

44, 46, 52, 56, 60, 63, 66, 71, 73, 80, 82, 83, 86, 88, 90, 91, 92, 94, 97, 98

Find the interquartile range for the sample using the method introduced in class.

29
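
The card does not say which quartile method was taught in class. The sketch below assumes the median-of-halves approach (Q1 is the median of the lower half, Q3 the median of the upper half), which reproduces the stated answer. The variable names are illustrative.

```python
from statistics import median

# Scores from the card, already sorted.
scores = [44, 46, 52, 56, 60, 63, 66, 71, 73, 80,
          82, 83, 86, 88, 90, 91, 92, 94, 97, 98]

# Assumed method: split the sorted data into halves and take each half's median.
half = len(scores) // 2
q1 = median(scores[:half])
q3 = median(scores[half:])
iqr = q3 - q1

print(q1, q3, iqr)
```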

36
New cards

Suppose the number of hours of sleep students get per night has a unimodal and symmetric distribution with a mean of 7 hours and a standard deviation of 1.5 hours.

Approximately what percent of students sleep more than 8.5 hours per night?

16%
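
Since 8.5 hours is one standard deviation above the mean, the Empirical Rule gives the answer directly. The sketch below also compares it with a normal model from Python's `statistics.NormalDist`; treating the distribution as exactly normal is an assumption the card does not make.

```python
from statistics import NormalDist

mean_sleep, sd_sleep = 7, 1.5
cutoff = 8.5

# Empirical Rule: about 68% of values lie within 1 SD of the mean,
# so the remaining 32% splits evenly between the two tails.
z = (cutoff - mean_sleep) / sd_sleep
pct_above_rule = (100 - 68) / 2

# Comparison under an assumed normal model.
pct_above_normal = (1 - NormalDist(mean_sleep, sd_sleep).cdf(cutoff)) * 100

print(z, pct_above_rule, round(pct_above_normal, 1))
```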

37
New cards

A data set that ranges from 0 to 10 has a distribution with a longer left tail. A student reported three measures for the center: 4, 4.5, 5. Which one is more likely to be the value of the mean?

-The value of the mean cannot be determined

-The median is likely to be 5 as it is in the middle of 0 and 10

-The mean is likely to be 5 because it is the balancing point in the middle

-The mean is likely to be 4 as it is affected more by small values in the distribution

The mean is likely to be 4 as it is affected more by small values in the distribution

38
New cards

Home prices in a particular city for a recent month are shown in the accompanying histogram.

Which of the following statements is correct?

Median and IQR should be used to describe the distribution

39
New cards

Which of the following is correct about Z-scores?

-A positive Z-score is more unusual than a negative Z-score

-Z-scores should only be used when the distribution is unimodal and symmetric

-Z-score has the same units as the standard deviation

-Z-score measures the distance between an observation and the median.

-Z-scores should only be used when the distribution is unimodal and symmetric

40
New cards

Which of the following information cannot be obtained from a boxplot?

-Mode

-Range

-Potential outliers

-Symmetry/skewness

-Inter-Quartile Range

mode

41
New cards

Answer the questions using the boxplot below

Within which interval would you expect to find the largest number of observations?

All the intervals contain approximately equal number of observations

42
New cards

Which of the following statements is correct?

-The value of the correlation coefficient does not tell us whether the relationship is linear or not

-The correlation coefficient is not affected by the outliers

-Correlation coefficient has the same units as the explanatory variable

-The direction of the relationship is either positive or negative

The value of the correlation coefficient does not tell us whether the relationship is linear or not

43
New cards

A study was done on 5,375 students to answer the question: "Is the smoking of students related to their parents' smoking habits?"

Are students less likely to smoke if their parents do not smoke?

Yes, the percentage of students who do not smoke in the "Parents do not smoke" group (86.1%) is higher than the "Parents smoke" group (79.7%).

44
New cards

A study was conducted in order to determine whether longevity (the length a person lives) is related to a person's handedness (right-handed/left-handed).

Which of the following would be the best for examining this type of relationship? (Select all that apply)

-Two-way table

-Grouped bar chart

-Boxplots

-Five-number summary

-Boxplots

-Five-number summary

45
New cards

A study about high school student SAT scores reported that a student's SAT Math score has a strong positive linear association with his/her SAT Verbal score.

What can you determine about the relationship between SAT Math scores and SAT Verbal scores?

Students with higher SAT Math scores tend to have higher SAT Verbal scores.

46
New cards

If the point in the upper right corner were removed from the scatterplot, what will happen to the value of the correlation coefficient r?

r will become closer to -1

47
New cards

Two variables have a correlation coefficient of -0.9. Which of the following could be the case? (select all that apply)

-The points on the scatterplot closely follow a straight line that goes from upper left to lower right

-There can be a strong nonlinear relationship showing a general decreasing trend

-The points on the scatterplot closely follow a straight line that goes from lower left to upper right

-There are many inconsistent outliers in the data

-The points are randomly scattered with no pattern

-The points on the scatterplot closely follow a straight line that goes from upper left to lower right

-There can be a strong nonlinear relationship showing a general decreasing trend

48
New cards

A store asked 200 customers whether or not they were satisfied with the service. The purpose of this study was to examine the relationship between the customer's satisfaction and gender.

This is an example of which type of relationship?

Case: Categorical -> Categorical

49
New cards

Which graph would you choose to visualize the data below?

Cell phone use | Less than 5 hours | 6-9 hours | 10-12 hours | More than 15 hours
Female | 7 | 9 | 5 | 4
Male | 10 | 5 | 4 | 1

Side-by-side bar chart

50
New cards

The correlation coefficient between height (inches) and weight (lbs) is 0.8. Suppose height is converted from inches to meters and weight from pounds to kilograms. How will this affect the correlation coefficient?

Neither the sign nor magnitude will change
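
This invariance can be illustrated numerically. The heights and weights below are made-up values for demonstration only (they are not from the card and will not give r = 0.8), and `pearson_r` is a hypothetical helper written for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical sample data (illustration only).
heights_in = [60, 64, 68, 70, 74]
weights_lb = [115, 140, 150, 180, 190]

# Unit conversion multiplies each variable by a positive constant.
heights_m = [h * 0.0254 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_imperial = pearson_r(heights_in, weights_lb)
r_metric = pearson_r(heights_m, weights_kg)

print(round(r_imperial, 4), round(r_metric, 4))
```

Rescaling by positive constants cancels out of the formula, so both values agree up to floating-point rounding.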

51
New cards

Which of the following is true about using the scatterplot? (select all that apply)

-The coordinate of a point on the vertical axis is determined by the response variable

-Each individual is represented by a point on the plot

-The coordinate of a point on the vertical axis is determined by the explanatory variable

-The response variable should be plotted on the horizontal axis

-The coordinate of a point on the vertical axis is determined by the response variable

-Each individual is represented by a point on the plot
