Which of the following describes categorical data
Can be either nominal or ordinal
In a perfectly symmetric distribution
Mean = Median = Mode
A researcher wants to test if average exam scores of one class differ from the national average. Which test applies
One-sample t-test
Which of the following best describes structured data
Data organized in rows and columns
Which of the following is a numerical variable
Temperature in Celsius
Temperature in Fahrenheit is measured on which scale
Interval
Which of the following is a floating-point number
A number written with a decimal point, such as 3.0 (a bare 3 is an integer)
In the same list, the value 10 is
Integer
Metadata refers to
Data about data
Which of the following is a difference between metadata and big data
Metadata is small and descriptive, big data is large and complex
The difference between data and information is mainly that
Information is organized and meaningful, while data is raw and unorganized
What is the main goal of data cleaning
To ensure the dataset is accurate and consistent
What is a common approach to handling outliers
Investigate context and decide whether to keep, remove, or transform
What is the best way to resolve inconsistent date formats
Convert all dates into a standard format
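A minimal sketch of date standardization in Python, assuming the raw data uses only the three formats listed in `KNOWN_FORMATS` (a hypothetical list for illustration) and that slash-separated dates are day-first. Every value is converted to ISO 8601 (`YYYY-MM-DD`).

```python
from datetime import datetime

# Hypothetical formats assumed to appear in the raw data.
# "%d/%m/%Y" assumes day-first dates; swap to "%m/%d/%Y" for month-first data.
KNOWN_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y"]

def to_iso_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None

raw_dates = ["2024-03-05", "05/03/2024", "March 5, 2024"]
clean_dates = [to_iso_date(d) for d in raw_dates]
```

Values that match no format come back as `None`, so they can be reviewed by hand instead of being guessed.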
Which sampling method ensures every member of the population has an equal chance of being selected
Simple random sampling
Which measure of spread is most influenced by extreme values
Range
What does standard deviation measure
The typical distance of data points from the mean (the square root of the variance)
Which is NOT a measure of central tendency
Standard deviation
If a distribution has a longer tail on the right, it is
Positively skewed
If the mean is greater than the median, the distribution is likely
Positively skewed
Which function calculates the standard deviation for a sample in Excel
STDEV.S()
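For comparison, a short Python sketch of the same calculation, using made-up values. `statistics.stdev` divides by n − 1 (the sample formula, as in STDEV.S), while `statistics.pstdev` divides by n (the population formula, as in Excel's STDEV.P).

```python
import statistics

# Illustrative data only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = statistics.stdev(data)       # sample formula, divides by n - 1
population_sd = statistics.pstdev(data)  # population formula, divides by n
```

The sample value is slightly larger because dividing by n − 1 corrects for estimating the mean from the same data.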
Which Excel function can be used to measure skewness of a dataset
SKEW()
If SKEW(data) returns a positive value, what does this indicate
Data is positively skewed
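The skewness cards can be checked with a small Python sketch. The function below uses the adjusted sample skewness formula that Excel's SKEW() is documented to use; the data values are invented for illustration.

```python
import statistics

def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)

# Illustrative data with one large value pulling the right tail out.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 15]

skew_value = sample_skewness(right_skewed)        # positive for a right tail
mean_value = statistics.mean(right_skewed)        # pulled toward the tail
median_value = statistics.median(right_skewed)    # stays near the bulk of the data
```

Here the mean exceeds the median and the skewness is positive, which matches the two rules of thumb above.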
The alternative hypothesis (Hₐ) represents
The claim we want to provide evidence for
Which correlation coefficient indicates the strongest relationship
r = −0.75
If X = temperature and Y = electricity use, a positive slope means
Higher temperatures are associated with higher electricity use
Categorical variables represent
Labels or groups without inherent numerical meaning
Nominal data is characterized by
Categories without a natural order
The main difference between interval and ratio data is
Interval data lacks a true zero point
Big data is typically characterized by the four Vs. Which is NOT one of them
Validity
Which of the following correctly represents the steps in the SOAR analytic model
Specify, Obtain, Analyze, Report
A pivot table is best described as
A statistical tool that reorganizes and summarizes data in a spreadsheet or database to create a report
Which of the following best describes data
Raw numbers and facts with little meaning on their own
When raw data are organized in a way that is meaningful to the user, they become
Information
In analytics, why is context important
It determines the setting in which data can be better understood and evaluated
Which is NOT a method to handle missing data
Visualization
Which Excel function is commonly used to remove extra spaces from text
TRIM()
Which of the following best defines a population in statistics
The entire set of individuals or items of interest
What is a sample
A subset of the population used to make inferences
Which of the following is a parameter
Population mean
Which of the following statements is true
Parameters describe populations; statistics describe samples
The mean is defined as
The sum of values divided by the number of observations
The mode is
The most frequently occurring value
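A quick illustration of the three measures of central tendency in Python, using a made-up list of values:

```python
import statistics

# Illustrative data only.
values = [2, 3, 3, 5, 7]

mean_value = statistics.mean(values)      # sum of values / number of observations
median_value = statistics.median(values)  # middle value of the ordered data
mode_value = statistics.mode(values)      # most frequently occurring value
```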
Which dataset has a larger spread
Mean = 50, SD = 15
Which measure of central tendency is most affected by extreme values
Mean
Which of the following is most useful in identifying skewness
Comparison of mean and median
Which descriptive statistic is least reliable for skewed distributions
Mean
Which Excel function calculates the arithmetic mean of a dataset
AVERAGE()
Which function returns the middle value of an ordered dataset
MEDIAN()
Which function would you use to calculate the 25th percentile (Q1)
QUARTILE.EXC(array, 1)
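A rough Python counterpart, assuming Python 3.8 or later: `statistics.quantiles` with `method="exclusive"` places quartiles at position p(n + 1) in the ordered data, the same convention as QUARTILE.EXC. The data values are illustrative.

```python
import statistics

# Illustrative data only.
data = [1, 2, 3, 4, 5, 6, 7]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
```

With seven values, Q1 sits at position 0.25 × 8 = 2, so it equals the second ordered value.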
What is the null hypothesis (H₀)
A statement of no effect or no difference
What does a p-value measure
Probability of observing results at least as extreme as the data, assuming H₀ is true
If p-value < α, the correct conclusion is
Reject H₀
If p-value > α, then
Fail to reject H₀
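The decision rule on the two cards above can be written as a tiny Python function. The name `decide` and the default α = 0.05 are assumptions for illustration; a p-value exactly equal to α is treated here as "fail to reject".

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value decision rule for a hypothesis test."""
    if p_value < alpha:
        return "Reject H0"
    return "Fail to reject H0"

small_p = decide(0.01)   # evidence against H0 at alpha = 0.05
large_p = decide(0.20)   # not enough evidence against H0
```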
Which of the following is a two-tailed test
H₀: μ = 50, Hₐ: μ ≠ 50
Which test compares a sample mean to a known population mean when σ is unknown
One-sample t-test
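A minimal sketch of the one-sample t statistic, t = (x̄ − μ₀) / (s / √n), in plain Python. The scores and the hypothesized mean of 70 are made up; a p-value would additionally require the t distribution with n − 1 degrees of freedom, which is not computed here.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)  # s / sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error

# Illustrative data only.
scores = [72, 75, 78, 80, 85]
t_stat = one_sample_t(scores, mu0=70)
dof = len(scores) - 1  # degrees of freedom
```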
Which test compares means of two independent groups
Two-sample t-test
Which test checks differences in more than two group means
ANOVA
Which test is appropriate for paired data (before vs. after measurements)
Paired t-test
Which test examines whether two categorical variables are independent
Chi-square test of independence
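A sketch of the chi-square statistic for a contingency table, assuming a hypothetical 2×2 table of observed counts. Expected counts are (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed_count - expected) ** 2 / expected
    return stat

# Illustrative counts only.
observed = [[10, 20], [20, 10]]
chi2 = chi_square_statistic(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1) * (cols - 1)
```

The statistic is then compared with the chi-square distribution with `dof` degrees of freedom to obtain a p-value.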
A doctor measures patient weights before and after a new diet plan. Which test applies
Paired t-test
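A paired t-test amounts to a one-sample t-test on the per-subject differences. The sketch below computes the t statistic for invented before and after weights; the variable names are illustrative.

```python
import math
import statistics

# Illustrative weights for the same five patients, before and after.
before = [82, 90, 77, 95, 88]
after = [80, 86, 76, 91, 85]

# One difference per patient keeps the pairing intact.
diffs = [a - b for a, b in zip(after, before)]

mean_diff = statistics.mean(diffs)
standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
t_stat = mean_diff / standard_error  # tests H0: mean difference = 0
```

A negative t statistic here reflects lower weights after the diet than before.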
If r = 0, it means
No linear relationship exists between variables
A correlation coefficient of r = −0.25 suggests
Weak negative relationship
If two variables have r = 0.02, the relationship is
Almost no linear relationship
If all data points lie exactly on a straight line with positive slope, r =
1
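The correlation cards can be verified with a small Pearson r function written from the definition. The data points are made up so that they fall exactly on a line.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2 * v + 1 for v in x])     # points on a rising line
r_down = pearson_r(x, [10 - 3 * v for v in x])  # points on a falling line
```

A rising line gives r = 1 and a falling line gives r = −1; the strength of the relationship is the absolute value of r.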
High correlation between two variables means
A strong linear association exists
Which situation likely violates assumptions for correlation
Strong nonlinear relationship
The main purpose of linear regression is to
Predict the value of a dependent variable using one or more independent variables
In the simple linear regression equation Y = a + bX + ε, the term b represents
Slope
The intercept in a regression line represents
The expected value of Y when X = 0
The slope coefficient tells us
The change in Y for a one-unit increase in X
If the regression equation is Ŷ = 20 + 3X, then when X = 5, predicted Y is
35
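The arithmetic on this card, written as a short Python function (the function name is illustrative):

```python
def predict(x, intercept=20, slope=3):
    """Predicted Y from the fitted line Y-hat = intercept + slope * x."""
    return intercept + slope * x

y_hat = predict(5)  # 20 + 3 * 5
```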
If R² = 0.80, this means
80% of variation in Y is explained by X
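A sketch of how slope, intercept, and R² come out of a least-squares fit, using invented data. R² is computed as 1 − SSE / SST, the share of the variation in Y that the fitted line accounts for.

```python
# Illustrative data only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares estimates for the line Y = a + bX.
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
intercept = mean_y - slope * mean_x

predictions = [intercept + slope * xi for xi in x]
sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))  # unexplained variation
sst = sum((yi - mean_y) ** 2 for yi in y)                    # total variation
r_squared = 1 - sse / sst
```

For this data the line explains 60% of the variation in Y.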
A company predicts sales (Y) based on advertising spending (X). Which method should they use
Linear regression
A researcher finds slope = 0.75 for predicting GPA from study hours. This means
Each additional hour studied is associated with a predicted GPA increase of 0.75 units