What is the primary purpose of data visualizations?
To summarize data and communicate findings
When would you use a scatter plot in data analysis?
To show relationships between two numeric variables
What does a pie chart primarily illustrate?
Part of a whole
Which chart type is best used to show the distribution and outliers of a dataset?
Box Plot
In which scenario would you best use a histogram?
To show frequency distribution of continuous data
Comparing test scores
Evaluates performance across different classes
Analyzing salary ranges
Summarizes income distribution in a job sector
Summarizing survey responses
Provides insights into public opinion or preferences
Identifying outliers
Highlights values that fall far outside typical ranges
IQR
Interquartile range, the difference between Q3 and Q1
Skewness
A measure of the asymmetry of the data distribution
Outliers
Values that are significantly higher or lower than most of the data
Box Plot
A graphical representation of the Five-Number Summary
Summarizes key aspects of data
Provides a quick overview of data distribution
Identifies center
Helps locate the median of the dataset
Useful for skewed distributions
Can describe data that isn’t symmetrically distributed
Visualizing with a Box Plot
Uses the Five-Number summary to illustrate data characteristics
Minimum
The smallest value in a dataset
First Quartile (Q1)
The median of the lower half of the dataset
Median (Q2)
The middle value separating the higher half from the lower half of the dataset
Maximum
The largest value in a dataset
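The five-number summary, IQR, and outlier cards can be tied together in a short Python sketch. It uses the median-of-halves convention from the Q1 card (other quartile conventions exist and can give slightly different values). The 1.5 × IQR fence for flagging outliers is a common rule of thumb assumed here, not something these cards state, and the function names and sample scores are illustrative only.

```python
from statistics import median


def five_number_summary(values):
    """Return min, Q1, median, Q3, max using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }


def iqr_outliers(values, k=1.5):
    """Flag values beyond k * IQR from the quartiles (k=1.5 is an assumed rule of thumb)."""
    summary = five_number_summary(values)
    iqr = summary["q3"] - summary["q1"]
    low_fence = summary["q1"] - k * iqr
    high_fence = summary["q3"] + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]


scores = [2, 4, 4, 5, 7, 8, 9, 10, 30]  # made-up data
summary = five_number_summary(scores)
outliers = iqr_outliers(scores)
```

A box plot draws exactly these five values, with flagged points usually plotted individually beyond the whiskers.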
Mean
Average of data
Median
Middle value of data
Skewness
Asymmetry of data distribution
Mode
Most frequently occurring value
Left Skew
Mean < Median
Right Skew
Mean > Median
Symmetric Distribution
Mean = Median
Extreme Left Skew
Mean << Median (mean much lower than median)
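The mean-versus-median cards above can be checked on small samples. The sketch below is a rough heuristic only: comparing mean and median suggests a skew direction but is not a formal skewness statistic, and exact equality is rare in real data. The function name and the sample values are made up for illustration.

```python
from statistics import mean, median


def skew_direction(values):
    """Guess skew direction by comparing mean and median (heuristic only)."""
    m, md = mean(values), median(values)
    if m > md:
        return "right skew"
    if m < md:
        return "left skew"
    return "symmetric"


incomes = [30, 32, 35, 36, 40, 200]   # one very high value pulls the mean up
exam_scores = [1, 50, 52, 53, 55]     # one very low value pulls the mean down
balanced = [1, 2, 3, 4, 5]

labels = [skew_direction(d) for d in (incomes, exam_scores, balanced)]
```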
Which of the following measures is NOT considered a measure of center?
Range
What is the primary purpose of descriptive statistics?
To summarize and describe data sets
What does a confidence interval represent?
A range of values that likely contains the population parameter
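As one concrete case, a confidence interval for a population proportion can be sketched in Python. This assumes the simple normal-approximation (Wald) interval, a random sample, and a reasonably large sample size; the function name and the survey counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) interval for a proportion; an assumed method."""
    p_hat = successes / n
    # Critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 56 of 100 respondents answer "yes".
low, high = proportion_confidence_interval(56, 100)
```

Raising the confidence level widens the interval, and increasing the sample size narrows it.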
In hypothesis testing, a one-tailed test is used for what purpose?
To test a claim in one specific direction
Which sampling method ensures that every individual in the population has an equal chance of being selected?
Simple Random Sampling
In which scale can all three measures of central tendency (mean, median, mode) be used effectively?
Ratio Scale
Which scale of measurement is used when data can be categorized, ordered, but the intervals between categories are not equal?
Ordinal Scale
What type of data measurement is ‘gender’ classified as?
Nominal Scale
Which of the following is an example of a Ratio Scale measurement?
Height of a person
Why is it important to understand scales of measurement in statistics?
To select appropriate statistical methods and ensure accuracy in research
Which measure of dispersion is most sensitive to outliers?
Range
Why is standard deviation often preferred over variance in data analysis?
It is expressed in the same units as the data, making it easier to interpret
What is the primary purpose of measures of data dispersion?
To describe how spread out data values are
What does the range of a dataset indicate?
The difference between the maximum and minimum values
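The dispersion cards above can be illustrated with a few lines of Python. This sketch uses population formulas (`pvariance`, `pstdev`) on made-up data; the helper name `value_range` is illustrative.

```python
from statistics import pstdev, pvariance


def value_range(values):
    """Range: the maximum minus the minimum."""
    return max(values) - min(values)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5
spread = {
    "range": value_range(data),
    "variance": pvariance(data),   # in squared units
    "std_dev": pstdev(data),       # same units as the data
}

# A single extreme value changes the range dramatically.
with_outlier = data + [100]
range_with_outlier = value_range(with_outlier)
```

The range depends only on the two most extreme values, which is why one outlier can move it so far.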
What is a major advantage of using non-probability sampling methods?
It is convenient and cost-effective
Which of the following is a type of non-probability sampling?
Snowball sampling
Which situation is most appropriate for using convenience sampling?
Surveying students in a classroom setting
What is a key limitation of non-probability sampling?
Results may not generalize to the entire population
What is non-probability sampling?
Selecting participants without random selection, so that members of the population do not have a known chance of being chosen
What strategy is recommended to reduce measurement error in surveys?
Use clear, pretested questions and consistent data-collection procedures (adding respondents reduces sampling error, not measurement error)
What is the primary cause of sampling error?
Natural variability between a sample and the population
Which of the following is NOT a type of non-sampling error?
Sampling error
How can sampling error be minimized?
By increasing the sample size
Which error occurs when respondents fail to participate in a survey?
Nonresponse Error
When using a z-score, a higher z-value indicates that the sample proportion is further from the population proportion, which can impact the probability calculation.
True
To reliably use a normal approximation for sample proportions, random sampling is not necessary as long as the sample size is large enough.
False
The shape of the sampling distribution of sample proportions will always be normal regardless of sample size or population proportion.
False
In the context of estimating probabilities using sampling distributions for proportions, what is the primary purpose of calculating the z-score?
To standardize the sample proportion for use with the standard normal distribution
Why is it important to ensure that the sampling distribution is approximately normal before finding probabilities?
Because the properties of the normal distribution are used to calculate probabilities
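The z-score cards above can be turned into a small worked calculation. The sketch below standardizes a sample proportion and looks up an upper-tail probability. The normality check (`n*p >= 10` and `n*(1-p) >= 10`) is a common textbook rule of thumb assumed here, since the cards give no threshold and some courses use 5. The function names and the numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def normal_approximation_ok(p, n, threshold=10):
    """Rule-of-thumb check (assumed threshold) that the sampling distribution is roughly normal."""
    return n * p >= threshold and n * (1 - p) >= threshold


def proportion_z_score(p_hat, p, n):
    """Standardize a sample proportion: z = (p_hat - p) / sqrt(p * (1 - p) / n)."""
    standard_error = sqrt(p * (1 - p) / n)
    return (p_hat - p) / standard_error


# Hypothetical case: population proportion 0.50, random sample of 100, observed 0.56.
p, n, p_hat = 0.50, 100, 0.56
usable = normal_approximation_ok(p, n)
z = proportion_z_score(p_hat, p, n)
# Probability of a sample proportion at least this large.
upper_tail = 1 - NormalDist().cdf(z)
```

Here z is about 1.2, and the upper-tail probability is roughly 0.115. Both results rely on the sample being random.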
What does the Central Limit Theorem (CLT) state about the sampling distribution of the sample mean for a large enough sample size?
It is approximately normal even if the population distribution is not
Hypothesis Testing
The CLT allows us to use a normal approximation to determine the significance of results
Confidence intervals
Provide a range around the sample estimate that reflects uncertainty due to sampling variability
Approximation of probabilities
Helps in estimating probabilities for sample statistics using the normal distribution
Data visualization
Plays a critical role in creating visuals to represent raw data
CLT
States that the distribution of sample means approaches a normal distribution as sample size increases
Standard Error (SE)
Calculated as the standard deviation divided by the square root of the sample size
Sampling Distribution
The probability distribution of a statistic obtained through a large number of samples drawn from a specific population
Population proportion
The true proportion of a certain characteristic in the entire population
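The CLT, standard error, and sampling distribution cards can be explored with a small simulation. This sketch draws repeated samples from a deliberately skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here purely for illustration) and compares the spread of the sample means with the standard error formula. The seed, sample size, and repetition count are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev


def sample_means(draw, n, repetitions):
    """Build an empirical sampling distribution of the mean."""
    return [mean(draw() for _ in range(n)) for _ in range(repetitions)]


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 40
means = sample_means(lambda: rng.expovariate(1.0), n, repetitions=2000)

population_sd = 1.0                        # known for this exponential population
theoretical_se = population_sd / sqrt(n)   # standard error = sd / sqrt(n)
empirical_se = stdev(means)                # spread of the simulated sample means
center = mean(means)                       # should sit near the population mean of 1
```

A histogram of `means` should look roughly bell-shaped even though the population is strongly right-skewed, and `empirical_se` should land close to `theoretical_se`.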
Which measure of central tendency can be used for categorical data?
Mode
In which scenario would you prefer to use the median over the mean?
When the data is skewed or contains outliers
Which measure of central tendency is calculated by summing all values and dividing by the number of values?
Mean
What is a key benefit of using stratified sampling?
It ensures representation across key subgroups of the population
Which of the following is an example of systematic sampling?
Selecting every 5th participant from a list of volunteers
In which scenario might cluster sampling be most advantageous?
For research involving large, geographically dispersed populations
What is the primary characteristic of probability sampling?
Every member of the population has a known, non-zero chance of being selected
Which method of probability sampling is most effective for subgroup representation?
Stratified sampling
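The probability sampling methods on these cards can be sketched with Python's `random` module. The population, strata names, and sampling fraction below are hypothetical, and the stratified version assumes proportional allocation, which is only one of several ways to allocate a sample across strata.

```python
import random


def simple_random_sample(population, k, rng):
    """Every member has an equal chance of selection."""
    return rng.sample(population, k)


def systematic_sample(population, step, rng):
    """Pick a random start, then take every `step`-th member."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(strata, fraction, rng):
    """Sample the same fraction from each subgroup (proportional allocation, assumed)."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


rng = random.Random(0)  # fixed seed so the run is repeatable
volunteers = list(range(1, 101))  # hypothetical list of 100 volunteer IDs

srs = simple_random_sample(volunteers, 10, rng)
every_fifth = systematic_sample(volunteers, 5, rng)

strata = {"first_year": list(range(60)), "final_year": list(range(60, 100))}
proportional = stratified_sample(strata, 0.1, rng)
```

With a 10% fraction, the stratified sample always contains 6 first-year and 4 final-year members, whereas a simple random sample of 10 could by chance over- or under-represent either group.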