Marital status of each member of a randomly selected group of adults is an example of what type of variable?
categorical variable
The average number of hours spent completing statistics homework for a randomly selected group of statistics students is an example of what type of variable?
numerical variable
A recent report showed there were 49 accidents involving pedestrians in City A and 62 accidents involving pedestrians in City B this year. The mayor of City A claims that his city is safer for pedestrians than City B. What information is missing that might contradict this claim?
the total number of pedestrians in both City A and City B
A group of students is divided into two groups. One group listens to classical music while taking a math test and the other group takes the test in silence. The average test scores of the two groups are compared to see whether listening to music during a math test has an effect on scores. What type of study is this?
controlled experiment
Consider the following statement: "Researchers conducted a large observational study and determined that children who participated in school music programs scored higher on math exams in later grades than those who did not." Suppose that upon hearing this a politician states that all children should participate in school music programs. What is wrong with the politician's statement?
The politician confused correlation with causation.
The distribution of marital status of a randomly selected group of adults would best be visualized by which plot?
bar chart
Which of the following is (are) key features of a well-designed experiment? (Select all that apply)
A) The population should be large enough to observe the full range of variability
B) The subjects of the study should be assigned to the groups randomly
C) The study should use a placebo if possible
D) The researchers should know who is in which group
B) The subjects of the study should be assigned to the groups randomly
C) The study should use a placebo if possible
Which of the following can be calculated from a dotplot?
A) Range
B) IQR
C) Mode
D) Mean
E) All of the above
E) All of the above
A large university conducted a survey among their students and received 300 responses. The survey asked the students for the following information: Age, Year in school (Freshman, Sophomore, Junior, Senior, Graduate student), GPA, Gender. What type of graph would you use to describe the variables Gender and Year in school?
A side-by-side bar chart should be used since these are two categorical variables
The mean price of a pound of ground beef in 75 cities in the Midwest is $2.11 and the standard deviation is $0.56. Suppose the histogram of the data shows that the distribution is unimodal and symmetrical. A local grocer is selling a pound of ground beef for $3.50. What is this price in standard units? Assuming the Empirical Rule applies, would this price be unusual or not? Round to the nearest hundredth.
z = 2.48; This is unusually expensive ground beef.
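The conversion to standard units in this card can be checked with a few lines of Python. This is a minimal sketch: the function name `z_score` and the cutoff of 2 standard deviations for "unusual" are illustrative assumptions, the latter following the usual reading of the Empirical Rule.

```python
# Values taken from the question above.
mean_price = 2.11    # mean price per pound, in dollars
sd_price = 0.56      # standard deviation, in dollars
grocer_price = 3.50  # the local grocer's price, in dollars


def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / sd


z = z_score(grocer_price, mean_price, sd_price)

# Assumed rule of thumb: more than 2 SDs from the mean counts as unusual,
# since the Empirical Rule puts about 95% of values within 2 SDs.
is_unusual = abs(z) > 2

print(round(z, 2), is_unusual)
```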
Which one of the following best describes the relationship between the correlation and the slope of the regression line modeling the relationship between X and Y?
the sign of the correlation between X and Y is the same as the sign of the slope of the regression line modeling the relationship between X and Y.
A horticulturist conducted an experiment on 110 thirty-six-inch plant boxes to see if the amount of plant food given to the plant boxes was associated with the number of tomatoes harvested from the plants. The mean amount of plant food given was 27.8 milliliters with a standard deviation of 2.1 milliliters. The mean number of tomatoes harvested was 7.5 with a standard deviation of 1.5. The correlation coefficient was 0.7691. Use the information given to calculate the slope of the linear model that predicts the number of tomatoes harvested from the amount of plant food given. Round to the nearest hundredth.
B) 0.55
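The slope of a least-squares line can be found from summary statistics as the correlation times the ratio of standard deviations (response over explanatory). A short sketch with the numbers from this card; the variable names are illustrative.

```python
# Summary statistics from the question above.
r = 0.7691         # correlation between plant food and tomatoes
sd_food = 2.1      # SD of plant food (explanatory variable), in ml
sd_tomatoes = 1.5  # SD of tomatoes harvested (response variable)

# Least-squares slope: b = r * (s_y / s_x)
slope = r * sd_tomatoes / sd_food

print(round(slope, 2))
```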
Which of the following statements regarding the correlation coefficient is true?
A correlation coefficient of -1 means that as one variable increases, the other decreases
In the NBA, the correlation between "steals per game" and "blocked shots per game" is found to be 0.8045. Choose the statement that is true about the coefficient of determination.
A) The coefficient of determination, r^2, is equal to approximately 0.6472.
B) The coefficient of determination states that about 64.72% of the variation in the blocked shots per game is explained by steals per game.
C) When given as a percent, the coefficient of determination is always between 0 and 100%.
D) All of the above are true statements.
All of the above are true statements.
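Statements A and B can be verified by squaring the correlation. A minimal sketch; the variable names are illustrative.

```python
r = 0.8045  # correlation between steals and blocked shots per game

# The coefficient of determination is the square of the correlation.
r_squared = r ** 2
percent_explained = 100 * r_squared

print(round(r_squared, 4), round(percent_explained, 2))
```

Because a square cannot be negative and |r| is at most 1, the percentage always falls between 0 and 100, which is statement C.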
If two numerical variables X and Y have a correlation coefficient of 0.90, what percentage of the variation in one variable can be accounted for by the other variable?
81%
Consider the following statement: "A study found that people who consume more alcohol are more likely to die at an early age." Which of the following is a possible confounding variable in this study?
A) Smoking
B) Gender
C) Socioeconomic status
D) None of these
E) All of these
E) All of these
An experiment was conducted to explore the relationship between the amount of plant food given to a certain plant and the number of fruits harvested from the plant.
The summary statistics were:
| | Amount of plant food given (ml) -- X | Number of fruits harvested -- Y |
|---|---|---|
| Mean | 25 | 8 |
| SD | 2 | 1.5 |
| r | 0.8 | |
Suppose the relationship was found to be linear for amounts of plant food between 50 ml and 200 ml. Fit a least-squares regression model to predict the number of fruits that could be harvested.
Find the equation of the regression line and select the correct answer below.
If 100 ml plant food is given to the plant, the predicted harvest from the plant is about 53 fruits.
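The prediction in this card can be reproduced from the summary statistics. The sketch below assumes the summary table reads mean X = 25, mean Y = 8, SD of X = 2, SD of Y = 1.5, and r = 0.8, which is the reading consistent with the stated answer; the function name `predict_fruits` is illustrative.

```python
# Summary statistics (assumed reading of the card's table).
mean_x, sd_x = 25, 2   # plant food given, in ml
mean_y, sd_y = 8, 1.5  # fruits harvested
r = 0.8

# Least-squares line: slope = r * s_y / s_x, and the line passes
# through the point of means (mean_x, mean_y).
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x


def predict_fruits(food_ml):
    """Predicted number of fruits for a given amount of plant food."""
    return intercept + slope * food_ml


print(round(slope, 2), round(intercept, 2), round(predict_fruits(100)))
```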
Suppose you were to conduct a study to get student opinions regarding the difficulty of a stats class (100 students). Which one of the methods below gives you a simple random sample?
A) Select the first 20 names using the course roster
B) Randomly choose a discussion section and select all the students from that section.
C) Randomly ask students who come to the office hour during the week
D) Randomly generate 20 numbers between 1 and 100, then use the course roster to select corresponding students.
D) Randomly generate 20 numbers between 1 and 100, then use the course roster to select corresponding students.
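Option D can be sketched in Python. This is only an illustration: it assumes the roster is numbered 1 to 100, draws without replacement so no student is picked twice, and uses an arbitrary fixed seed so the draw is repeatable.

```python
import random

roster_size = 100  # students on the course roster, numbered 1..100
sample_size = 20

rng = random.Random(2024)  # arbitrary seed, only for repeatability

# random.sample draws without replacement, so every set of 20 students
# is equally likely and no student appears twice.
chosen = sorted(rng.sample(range(1, roster_size + 1), k=sample_size))

print(chosen)
```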
Which of the following statements is correct?
A) A negative residual means that the predicted y-value (yhat) is larger than the observation y-value (y)
B) We find the regression line of best fit using the sum of the residuals
C) Negative residuals indicate a bad fit
D) A residual plot that shows a strong linear pattern indicates a good fit
A) A negative residual means that the predicted y-value (yhat) is larger than the observation y-value (y)
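The sign convention behind option A can be shown with a tiny example. The numbers below are hypothetical and the function name `residual` is illustrative.

```python
def residual(observed_y, predicted_y):
    """Residual = observed value minus predicted value."""
    return observed_y - predicted_y


# Hypothetical point: the model predicts 7 but the observed value is 5.
example = residual(observed_y=5, predicted_y=7)

# The residual is negative because the prediction is larger than the observation.
print(example)
```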
Many people who use coconut oil claim it helps with hair care, skin care, stress relief, weight loss, and a boosted immune system. Can we conclude that the use of coconut oil causes these health benefits?
A) Yes, the claims are true stories, so we do have evidence of the health benefits.
B) No, the claims can be lies, so we do not have evidence of the health benefits
C) No, the claims are anecdotes and do not give us a comparison group to find health differences.
D) Yes, the claims are from experiments and give us a good comparison group to find health differences.
C) No, the claims are anecdotes and do not give us a comparison group to find health differences.
A study recruited 1000 people to participate in a weight loss program. Participants are allowed to choose whether they want to go on a vegetarian diet or follow a traditional low-calorie diet that includes some meat. Half of the people choose the vegetarian diet, and half choose to be in the control group and continue to eat meat. The study found that there is greater weight loss in the vegetarian group.
Which of the following is correct about the study?
A) We can conclude that a vegetarian diet helps people lose more weight because this is an experimental study.
B) The result of the study is not reliable because there is no control group.
C) The result of the study is not reliable because there is no randomization and the confounding variables are not controlled.
D) We cannot conclude that a vegetarian diet helps people lose more weight because the sample size is too small.
E) The result of the study is not reliable because there is no randomization and the confounding variables are not controlled.
C) The result of the study is not reliable because there is no randomization and the confounding variables are not controlled.
A senator conducted a poll in her state by calling 100 people whose names were randomly sampled from the phone book (mobile phones and unlisted numbers aren't in phone books). The senator's office called those numbers until they got a response from all 100 people chosen.
What bias may exist in the study?
A) No obvious bias
B) Nonresponse
C) Volunteer
D) Undercoverage
D) Undercoverage
Which of the following regarding the coefficient of determination (r-squared) is true?
A) It has a value between -1 and 1
B) It measures how much variation in the response variable can be explained by the explanatory variable
C) It measures the goodness of fit of a model for any relationship
D) It can tell us the direction of the relationship
B) It measures how much variation in the response variable can be explained by the explanatory variable
What does it mean to have a positively skewed distribution?
The tail of the distribution is longer on the right side
An instructor is interested in finding out how much time (in hours) her students spent on the exam preparation. The data of 25 students is displayed in the following histogram.
Which of the following statements is true about the histogram?
None of the 25 students spent 5 hours on the exam preparation.
The existence of multiple peaks is sometimes a sign that
Very different groups have been combined into a single collection of data.
The administrators of XYZ University conducted a poll to find out whether undergraduate students at the university prefer in-person or online lectures post-pandemic.
What is the population of interest here?
All undergraduate students who attend the university
A student investigated a sample of 12 apartments in Westwood and reported how many bathrooms they contain. The data table is given below:
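| Apartment | Number of Bathrooms |
|---|---|
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 2 |
| 5 | 3 |
| 6 | 1 |
| 7 | 2 |
| 8 | 1 |
| 9 | 1 |
| 10 | 2 |
| 11 | 3 |
| 12 | 2 |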
What is the relative frequency of two-bathroom apartments?
0.42
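The relative frequency is the count for a category divided by the total number of observations. A minimal sketch, assuming the data table reads as apartments 1 through 12 with bathroom counts 1, 1, 2, 2, 3, 1, 2, 1, 1, 2, 3, 2:

```python
from collections import Counter

# Number of bathrooms for apartments 1..12 (assumed reading of the table).
bathrooms = [1, 1, 2, 2, 3, 1, 2, 1, 1, 2, 3, 2]

counts = Counter(bathrooms)
relative_freq = {k: counts[k] / len(bathrooms) for k in sorted(counts)}

print(counts[2], round(relative_freq[2], 2))
```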
For the following data set:
60, 68, 72, 73, 76, 79, 80, 81, 82, 82, 86, 87, 90, 91, 93, 95, 96, 96, 98, 100
If this data was graphed in a histogram, how many points would be within a bin of (90, 100]?
7
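The bin (90, 100] excludes 90 and includes 100, so the count can be checked directly. A short sketch using the data from the card:

```python
data = [60, 68, 72, 73, 76, 79, 80, 81, 82, 82,
        86, 87, 90, 91, 93, 95, 96, 96, 98, 100]

# (90, 100] is open on the left and closed on the right.
in_bin = [x for x in data if 90 < x <= 100]

print(len(in_bin), in_bin)
```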
A survey was conducted to understand people's living situation in the U.S. The following table shows a sample of data.
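| State | Zip code | Family size | Annual income | Anxiety Level |
|---|---|---|---|---|
| Alabama | 35236 | 5 | 32000 | Mild |
| Florida | 32716 | 8 | 23000 | Moderate |
| Alabama | 36374 | 4 | 29900 | Mild |
| California | 94565 | 1 | 31000 | Moderate |
| Florida | 32116 | 6 | 13500 | Panic |
| Florida | 32116 | 3 | 39000 | Mild |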
Select the variables that are Numerical.
family size, annual income
A survey asked a group of people to report the average number of phone calls they made daily. Below is the table that shows the survey results.
| Number of calls made | 1-4 | 5-8 | 9-12 | 13-16 | 17-20 |
|---|---|---|---|---|---|
| Frequency | 16 | 11 | 5 | 3 | 1 |
Suppose you were to make a frequency histogram from the table to display the distribution of the number of daily phone calls.
Which of the following statements is true? (Select all that apply)
-The highest bar in the histogram falls in the interval 1-4
-The sum of the heights of all the bars in the frequency histogram equals 36
-The maximum number of phone calls reported in the sample data is 20
-The shape of the histogram will remain the same if we change the bin width to 2
-There will be 5 bars in the histogram
-The highest bar in the histogram falls in the interval 1-4
-The sum of the heights of all the bars in the frequency histogram equals 36
-There will be 5 bars in the histogram
The following data set shows the number of floors in the tallest buildings in some cities.
| City | # Floors |
|---|---|
| A | 80 |
| B | 100 |
| C | 90 |
| D | 90 |
| E | 120 |
Which of the given observations is furthest from the mean and therefore contributes most to the standard deviation?
city E
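The answer follows from comparing each observation's distance from the mean. A minimal sketch, assuming the table reads A = 80, B = 100, C = 90, D = 90, E = 120:

```python
floors = {"A": 80, "B": 100, "C": 90, "D": 90, "E": 120}

mean_floors = sum(floors.values()) / len(floors)

# Deviation of each city from the mean; the standard deviation is built
# from the squares of these, so the largest |deviation| contributes most.
deviations = {city: n - mean_floors for city, n in floors.items()}
furthest = max(deviations, key=lambda city: abs(deviations[city]))

print(mean_floors, deviations, furthest)
```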
A sample of students at a college was asked the fastest speed (mph) they have driven a car.
The five-number summary for the data is:
Min = 72, Q1 = 100, Median = 110, Q3 = 120, Max = 156
Suppose a boxplot was constructed. Which of the following is correct?
-The maximum value would be marked as a potential outlier
-The lower whisker would stop at the value 70
-The upper whisker of the boxplot would extend to the value 156
-The minimum value would be marked as a potential outlier
The maximum value would be marked as a potential outlier
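The answer can be checked with the common 1.5 × IQR rule for flagging potential outliers. This sketch assumes that rule is the one intended; the variable names are illustrative.

```python
# Five-number summary from the card.
minimum, q1, median, q3, maximum = 72, 100, 110, 120, 156

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr  # values below this are potential outliers
upper_fence = q3 + 1.5 * iqr  # values above this are potential outliers

min_is_outlier = minimum < lower_fence
max_is_outlier = maximum > upper_fence

print(lower_fence, upper_fence, min_is_outlier, max_is_outlier)
```

The lower fence is 70, but a whisker ends at an actual data value, so the lower whisker stops at the minimum of 72 rather than at 70.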
Which of the following statements is incorrect about the center of a distribution?
-The mean, mode and median are equal in a symmetric distribution
-The mean does not represent the typical value well in a highly skewed distribution
-The mode is mostly used for categorical data, but the least used for numerical data
-The median is easily affected by extreme values in the data
The median is easily affected by extreme values in the data
The test scores of 20 students are listed below.
44, 46, 52, 56, 60, 63, 66, 71, 73, 80, 82, 83, 86, 88, 90, 91, 92, 94, 97, 98
Find the interquartile range for the sample using the method introduced in class.
29
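The card does not say which quartile method was introduced in class. The sketch below assumes the common "median of each half" method, which reproduces the stated answer; other quartile conventions (such as interpolation-based ones) can give slightly different values.

```python
from statistics import median

scores = [44, 46, 52, 56, 60, 63, 66, 71, 73, 80,
          82, 83, 86, 88, 90, 91, 92, 94, 97, 98]

scores_sorted = sorted(scores)
half = len(scores_sorted) // 2

# Split into lower and upper halves (the middle value is left out
# of both halves when the sample size is odd).
lower_half = scores_sorted[:half]
upper_half = scores_sorted[-half:]

q1 = median(lower_half)
q3 = median(upper_half)
iqr = q3 - q1

print(q1, q3, iqr)
```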
Suppose the number of hours of sleep students get per night has a unimodal and symmetric distribution with a mean of 7 hours and a standard deviation of 1.5 hours.
Approximately what percent of students sleep more than 8.5 hours per night?
16%
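Since 8.5 hours is one standard deviation above the mean, the Empirical Rule leaves half of the remaining 32% in the upper tail. The sketch below also compares this with an exact normal model, which is an added assumption, as the card only states that the distribution is unimodal and symmetric.

```python
from statistics import NormalDist

mean_sleep, sd_sleep = 7.0, 1.5
cutoff = 8.5

z = (cutoff - mean_sleep) / sd_sleep  # number of SDs above the mean

# Empirical Rule: about 68% of values lie within 1 SD of the mean,
# so the rest is split evenly between the two tails.
within_one_sd = 68.0
empirical_pct = (100.0 - within_one_sd) / 2

# Comparison under an assumed exact normal distribution.
exact_pct = 100.0 * (1.0 - NormalDist(mean_sleep, sd_sleep).cdf(cutoff))

print(z, empirical_pct, round(exact_pct, 2))
```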
A data set that ranges from 0 to 10 has a distribution with a longer left tail. A student reported three measures of center: 4, 4.5, 5. Which one is most likely to be the value of the mean?
-The value of the mean cannot be determined
-The median is likely to be 5 as it is in the middle of 0 and 10
-The mean is likely to be 5 because it is the balancing point in the middle
-The mean is likely to be 4 as it is affected more by small values in the distribution
The mean is likely to be 4 as it is affected more by small values in the distribution
Home prices in a particular city for a recent month are shown in the accompanying histogram.
Which of the following statements is correct?
Median and IQR should be used to describe the distribution
Which of the following is correct about Z-scores?
-A positive Z-score is more unusual than a negative Z-score
-Z-scores should only be used when the distribution is unimodal and symmetric
-Z-score has the same units as the standard deviation
-Z-score measures the distance between an observation and the median.
-Z-scores should only be used when the distribution is unimodal and symmetric
Which of the following information cannot be obtained from a boxplot?
-Mode
-Range
-Potential outliers
-Symmetry/skewness
-Inter-Quartile Range
mode
Answer the questions using the boxplot below
Within which interval would you expect to find the largest number of observations?
All the intervals contain approximately equal number of observations
Which of the following statements is correct?
-The value of the correlation coefficient does not tell us whether the relationship is linear or not
-The correlation coefficient is not affected by the outliers
-Correlation coefficient has the same units as the explanatory variable
-The direction of the relationship is either positive or negative
The value of the correlation coefficient does not tell us whether the relationship is linear or not
A study was done on 5,375 students to answer the question: "Is the smoking of students related to their parents' smoking habits?"
Are students less likely to smoke if their parents do not smoke?
Yes, the percentage of students who do not smoke in the "Parents do not smoke" group (86.1%) is higher than the "Parents smoke" group (79.7%).
A study was conducted in order to determine whether longevity (the length a person lives) is related to a person's handedness (right-handed/left-handed).
Which of the following would be the best for examining this type of relationship? (Select all that apply)
-Two-way table
-Grouped bar chart
-Boxplots
-Five-number summary
-Boxplots
-Five-number summary
A study about high school student SAT scores reported that a student's SAT Math score has a strong positive linear association with his/her SAT Verbal score.
What can you determine about the relationship between SAT Math scores and SAT Verbal scores?
Students with higher SAT Math scores tend to have higher SAT Verbal scores.
If the point in the upper right corner were removed from the scatterplot, what will happen to the value of the correlation coefficient r?
r will become closer to -1
Two variables have a correlation coefficient of -0.9. Which of the following could be the case? (Select all that apply)
-The points on the scatterplot closely follow a straight line that goes from upper left to lower right
-There can be a strong nonlinear relationship showing a general decreasing trend
-The points on the scatterplot closely follow a straight line that goes from lower left to upper right
-There are many inconsistent outliers in the data
-The points are randomly scattered with no pattern
-The points on the scatterplot closely follow a straight line that goes from upper left to lower right
-There can be a strong nonlinear relationship showing a general decreasing trend
A store asked 200 customers whether or not they were satisfied with the service. The purpose of this study was to examine the relationship between the customer's satisfaction and gender.
This is an example of which type of relationship?
Case: Categorical -> Categorical
Which graph would you choose to visualize the data below?
| Cell phone use | less than 5 hours | 6-9 hours | 10-12 hours | More than 15 hours |
|---|---|---|---|---|
| Female | 7 | 9 | 5 | 4 |
| Male | 10 | 5 | 4 | 1 |
Side-by-side Barchart
The correlation coefficient between height (inches) and weight (lbs) is 0.8. Suppose height is converted from inches to meters and weight from pounds to kilograms. How will this affect the correlation coefficient?
Neither the sign nor magnitude will change
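The invariance of the correlation under a change of units can be demonstrated numerically. The data below are made up for illustration and are not meant to reproduce the 0.8 in the card; `pearson_r` is a hand-written helper, not a library function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical heights (inches) and weights (pounds).
heights_in = [60, 64, 66, 69, 72]
weights_lb = [115, 140, 150, 160, 190]

# Convert units by multiplying by positive constants.
heights_m = [h * 0.0254 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_imperial = pearson_r(heights_in, weights_lb)
r_metric = pearson_r(heights_m, weights_kg)

print(r_imperial, r_metric)
```

Multiplying a variable by a positive constant rescales its deviations and its standard deviation by the same factor, so the factors cancel in the correlation.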
Which of the following is true about using the scatterplot? (select all that apply)
-The coordinate of a point on the vertical axis is determined by the response variable
-Each individual is represented by a point on the plot
-The coordinate of a point on the vertical axis is determined by the explanatory variable
-The response variable should be plotted on the horizontal axis
-The coordinate of a point on the vertical axis is determined by the response variable
-Each individual is represented by a point on the plot