Descriptive Statistics
Summarizes and describes data using measures like mean, median, mode, and standard deviation.
Inferential Statistics
Uses sample data to make predictions or inferences about a population (tools: hypothesis tests, confidence intervals).
Mean
The arithmetic average of a data set: the sum of the values divided by the number of values.
Median
The middle value of a data set when arranged in ascending or descending order (the average of the two middle values when the count is even).
Mode
The value that appears most frequently in a data set.
Standard Deviation
Shows how spread out data is from the mean.
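The four descriptive measures above can be computed with Python's standard `statistics` module. This is a minimal sketch; the data set is hypothetical and chosen only so the results are easy to check by hand.

```python
import statistics

# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(data)      # sum of the values / number of values
median = statistics.median(data)  # middle of the sorted values (average of 4 and 5 here)
mode = statistics.mode(data)      # most frequent value
spread = statistics.pstdev(data)  # standard deviation, treating data as a whole population

print(mean, median, mode, spread)
```

`statistics.pstdev` treats the data as the whole population; `statistics.stdev` is the sample version and divides by n - 1 instead of n.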
Population
The whole group we want to study or learn about.
Sample
A smaller group taken from the population, studied in order to draw conclusions about the whole population.
Hypothesis Testing
Checks if sample data supports a specific claim about a population.
Confidence Interval
A range of values, computed from sample data, that is likely to include the true population value.
How can statistics help an investment firm determine the age and income of cryptocurrency investors?
By using statistical inference, particularly estimation, to analyze a subset of the data.
What is statistical inference?
Drawing conclusions about a population using sample data.
What is the purpose of estimation in statistics?
To approximate the values of population parameters based on sample data.
What is the purpose of using a proportion in estimation?
To derive an estimate based on the sample data.
What does the estimation process involve?
Extracting data from a sample and applying a proportion to estimate broader trends.
What sampling method is used for an interview conducted with school principals of a sample of cities in Florida? (Dividing the population into groups, or clusters, and randomly selecting whole groups to study.)
Cluster sampling
Which sampling method is most appropriate for a poll of voters regarding a national highway development program? (Dividing the population into subgroups, or strata, and randomly sampling from each subgroup.)
Stratified random sampling
What sampling method would likely be used in a survey of customers entering a shopping plaza in Orlando? (Choosing whoever is easiest to reach, not randomly.)
Convenience sampling
Systematic Random Sampling
Picking every nth person from a population, starting at a random point.
Simple Random Sampling
Every individual has an equal chance of being chosen, usually by random selection.
Interval
The consistent gap between selected elements in systematic sampling.
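Simple random sampling and systematic random sampling can be sketched with Python's standard `random` module. The function names, the population of 100 numbered IDs, and the fixed seed are all assumptions made for illustration.

```python
import random

def simple_random_sample(population, k, rng):
    """Every item has an equal chance of being chosen."""
    return rng.sample(population, k)

def systematic_sample(population, interval, rng):
    """Pick every `interval`-th item, starting at a random point in the first interval."""
    start = rng.randrange(interval)
    return population[start::interval]

rng = random.Random(42)            # fixed seed so the run is repeatable
population = list(range(1, 101))   # hypothetical population of 100 IDs

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)

print(srs)
print(sys_sample)
```

In the systematic sample, the gap between consecutive selected IDs is always the interval (10 here); only the starting point is random.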
What distinguishes a statistic from a parameter?
A statistic describes a sample, while a parameter describes a whole population.
If a value is computed from all items in a population, what is it called?
It is called a parameter.
What type of values are always considered statistics?
Values computed from a sample.
What is the difference between stratified random sampling and cluster sampling?
Stratified random sampling divides a population into subgroups (strata) so that each population item belongs to only one subgroup, then samples randomly from every subgroup. Cluster sampling divides a population into groups (clusters) that are each intended to be mini-populations, then randomly selects whole groups to study.
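The contrast can be sketched in Python: the stratified version draws a few items from every group, while the cluster version keeps every item from a few randomly chosen groups. The function names and the four labelled groups are hypothetical.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Randomly sample `per_stratum` items from every stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep every item in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for name in chosen for item in clusters[name]]

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical population: 4 groups of 5 labelled items each.
groups = {name: [f"{name}{i}" for i in range(5)] for name in "ABCD"}

strat = stratified_sample(groups, 2, rng)  # 2 items from each of the 4 groups
clus = cluster_sample(groups, 2, rng)      # all 5 items from 2 of the groups

print(strat)
print(clus)
```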
What can AMA data analysts do if the margin of error is too large?
They can increase the sample size or decrease the level of confidence.
What is one method to reduce the margin of error that is not available to AMA data analysts?
Reducing the standard deviation.
How does increasing the sample size affect the margin of error?
It generally decreases the margin of error.
What effect does decreasing the level of confidence have on the margin of error?
It reduces the margin of error.
t-distribution
Bell-shaped like the normal distribution but with heavier tails; used when the population standard deviation is unknown, especially with small samples.
What happens to the margin of error if a bank desires a higher level of confidence in its interval estimate?
The margin of error will increase.
Why do we remove rows with missing values?
To ensure the remaining dataset is complete and to avoid obscuring results.
What is a potential danger of removing rows with missing data?
The remaining dataset may be too small to analyze and could introduce bias.
What is a common method for imputing missing data?
Mean imputation: replacing each missing value with the mean of the observed values, which is easy to compute and apply.
Why is using the mean to impute missing data not ideal for skewed data?
The mean is sensitive to extreme values, which can distort the results.
What is a better alternative to the mean for imputing missing data in skewed distributions?
Using the median is recommended as it is not affected by extreme values.
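A small sketch of mean versus median imputation on skewed data. The income figures are hypothetical, with one extreme value included to show how far it pulls the mean.

```python
import statistics

# Hypothetical right-skewed incomes (in thousands); None marks a missing value.
incomes = [30, 32, 35, 38, None, 40, 400]

observed = [x for x in incomes if x is not None]
mean_fill = statistics.mean(observed)      # pulled upward by the extreme value 400
median_fill = statistics.median(observed)  # stays near the typical incomes

mean_imputed = [mean_fill if x is None else x for x in incomes]
median_imputed = [median_fill if x is None else x for x in incomes]

print(round(mean_fill, 2), median_fill)
```

Here the mean of the observed values is about 95.83, far above every typical income in the list, while the median is 36.5.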
Left Skewed Distribution
A distribution with a longer tail on the left; the mean is typically less than the median.
Right Skewed Distribution
A distribution with a longer tail on the right; the mean is typically greater than the median.
Normal Distribution
Symmetric; mean = median = mode; no skew.
What does margin of error represent?
How much a sample result might differ from the true population value.
How does sample size affect margin of error?
Larger samples reduce the variability of the sample estimate (its standard error), resulting in a smaller margin of error.
What is the effect of standard deviation on margin of error?
A higher standard deviation indicates more variability in the data, leading to a larger margin of error.
How does the level of confidence influence margin of error?
A higher level of confidence requires a wider interval, and therefore a larger margin of error, to make it more likely that the interval contains the true population parameter. (Higher confidence → bigger margin of error.)
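The three effects on margin of error can be checked numerically. This sketch assumes the z-based formula for a sample mean, margin = z × standard deviation / √n, which fits a known population standard deviation or a large sample; the function name and the input values are illustrative.

```python
import math
from statistics import NormalDist

def margin_of_error(std_dev, n, confidence):
    """z-based margin of error for a sample mean: z * std_dev / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return z * std_dev / math.sqrt(n)

base = margin_of_error(std_dev=10, n=100, confidence=0.95)
bigger_sample = margin_of_error(std_dev=10, n=400, confidence=0.95)   # smaller margin
more_spread = margin_of_error(std_dev=20, n=100, confidence=0.95)     # larger margin
more_confident = margin_of_error(std_dev=10, n=100, confidence=0.99)  # larger margin

print(round(base, 2), round(bigger_sample, 2),
      round(more_spread, 2), round(more_confident, 2))
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.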
What is the parameter of interest when using confidence intervals in hypothesis testing?
The mean of the population.
What does it indicate if confidence intervals overlap when comparing sample means?
The samples do not provide clear evidence of a difference in the population means (overlap does not prove that the means are equal).
What type of data is used to make inferences about population parameters in hypothesis tests?
Sample data.
What is the primary use of confidence intervals in the context of comparing groups?
To assess whether the population means could plausibly be equal.
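Comparing two groups by their confidence intervals can be sketched as follows. The helper names and both samples are hypothetical, and the interval uses a z-based approximation to stay within the standard library; for samples this small a t-based interval would be somewhat wider. Checking for overlap is a rough screen, not a formal hypothesis test.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate z-based confidence interval for the mean of `sample`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    center = statistics.mean(sample)
    margin = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return center - margin, center + margin

def intervals_overlap(a, b):
    """True when the intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical measurements from two groups.
group_a = [50, 52, 51, 49, 53, 50, 52, 51]
group_b = [60, 62, 61, 59, 63, 60, 62, 61]

ci_a = mean_confidence_interval(group_a)
ci_b = mean_confidence_interval(group_b)

print(ci_a, ci_b, intervals_overlap(ci_a, ci_b))
```

With these samples the two intervals do not overlap, which suggests that the population means differ.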