Flashcards focusing on key terminology and concepts related to survey methodology and statistics, enhancing understanding and recall for exam preparation.
Diminishing returns in sampling
Indicates that after a certain sample size, increases provide minimal additional accuracy.
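A minimal sketch of the diminishing-returns idea, assuming a simple random sample, a 95% confidence level (z = 1.96), and the most conservative proportion p = 0.5. The function name `margin_of_error` is illustrative, not part of the deck.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion (simple random sample assumed)."""
    return z * math.sqrt(p * (1 - p) / n)

# Each step quadruples the sample size but only halves the margin of error.
for n in (100, 400, 1600):
    print(n, f"{margin_of_error(n):.4f}")
```

Going from 100 to 400 respondents cuts the margin roughly from 10 to 5 percentage points; the next 1,200 respondents gain only about 2.5 more.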
Stratified Sampling
Dividing a sample population into subgroups and sampling from each to ensure representation across groups.
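One way to sketch stratified sampling with proportional allocation, assuming each member carries a group label. The helper `stratified_sample`, the `region` field, and the 10% sampling fraction are hypothetical choices for illustration.

```python
import random

def stratified_sample(population, key, fraction, seed=0):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(key(member), []).append(member)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # at least one per stratum
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical population: 60 people in the North, 40 in the South.
people = [{"id": i, "region": "North" if i < 60 else "South"} for i in range(100)]
picked = stratified_sample(people, key=lambda p: p["region"], fraction=0.1)
north = sum(1 for p in picked if p["region"] == "North")
print(len(picked), north)
```

Because selection happens inside each stratum, the 60/40 regional split of the population carries over to the sample.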
Confidence level
A 95% confidence level means that about 95% of repeated samples would produce an estimate within the margin of error of the true population value.
Representativeness in convenience sampling
Refers to the limitation of generalizability as convenience samples may not represent the population accurately.
Probability sampling
Gives every population member a known, nonzero chance of being selected; the chances are equal only in designs such as simple random sampling.
Efficiency in sampling
Allows researchers to make inferences about a population without surveying everyone.
Quota sampling
Ensures the sample reflects certain population characteristics without random selection.
Incidence Rate
Represents the proportion of the general population meeting the criteria.
Regional Representation in Stratified sampling
Stratified sampling ensures each region or subgroup is proportionally represented.
Snowball sampling
Effective for hard-to-reach populations by using current participants to recruit others.
Sample size determinants
Survey length does not directly affect the required sample size; confidence level, margin of error, population size, and the variability of responses do.
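The determinants above can be sketched with the standard sample-size formula for a proportion plus a finite population correction. This is an illustration under assumptions (simple random sampling, p = 0.5 as the worst case), and the function name `sample_size` is hypothetical.

```python
import math

def sample_size(margin, z=1.96, p=0.5, population=None):
    """Required sample size for estimating a proportion.

    margin: desired margin of error (0.05 means +/- 5 points)
    z: z-score for the confidence level (1.96 for 95%)
    population: population size, if small enough to matter
    """
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)  # finite population correction
    return math.ceil(n0)

print(sample_size(0.05))                   # large population
print(sample_size(0.05, population=2000))  # small population needs fewer
```

Survey length appears nowhere in the formula, which matches the card: it influences response rates, not the required number of completes.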
Sampling based on referrals
Snowball sampling involves using referrals from current participants to recruit more respondents.
Known selection probability
In probability sampling, every population member has a known chance of selection, which supports representativeness and allows sampling error to be estimated.
Generalizability in Non-probability sampling
Non-probability sampling may be less representative, limiting generalizability of results.
Close to 1 correlation coefficient
Indicates a strong positive relationship; a value close to -1 indicates a strong negative one.
Type I error
Occurs when a true null hypothesis is rejected (a false positive). Failing to detect a real difference is a Type II error.
When to use a paired T-test
Used when comparing the same respondents on two items.
ANOVA
The appropriate test for comparing mean satisfaction scores across more than two groups.
T-test (for two groups)
Used when comparing mean scores between two groups on a continuous variable.
Impact of survey length
Long, complex surveys can reduce response rates, making participants less likely to complete them.
Response Rate in Mail surveys
Typically lower response rates than online surveys.
Increasing Response Rates with incentives
Offering incentives can encourage participants to complete the survey.
Cost-effective survey distribution
Online surveys are typically more cost-effective due to lower distribution and collection costs.
Maintaining confidentiality
Restricting access to personal data to authorized personnel.
Integrated Survey tools
Online survey software offers tools to create, analyze, and distribute surveys efficiently.
Confidentiality in Sensitive surveys
Online surveys are often preferred for sensitive topics because they allow anonymity, which reduces social desirability bias.
Reliability in survey data
High response rates improve data reliability by reducing nonresponse bias, so results better reflect the population’s views.
Reducing survey bias
Randomizing questions and using neutral language helps avoid survey response biases.
Ethical data collection
Ensures participant confidentiality and informed consent.
Engaging survey invitations
Should be clear, concise, and engaging to motivate participants.
Using branching logic
Customizes the survey path based on prior responses, improving relevance and flow.
Self-selection bias
Occurs when only certain respondent types participate, skewing results.
Optimal reminder frequency
Best practices suggest no more than two reminders to increase response rate without overwhelming participants.
Leading questions
Can create implementation bias by suggesting specific responses.
Identifying Most common responses
The mode represents the most frequently selected response in a survey dataset.
Data cleaning
Corrects errors and inconsistencies in datasets before analysis.
Variation in data
Measures of spread like standard deviation show how responses differ from the mean.
Top Box scoring
Uses the percentage of respondents selecting the highest rating option.
Weighting
Adjusts the sample to better represent the population when certain groups are over/underrepresented.
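A small sketch of how weighting can work, assuming a simple post-stratification scheme where each group's weight is its population share divided by its sample share. The group names, counts, and scores are made up for illustration.

```python
def poststrat_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {g: population_shares[g] / (n / total) for g, n in sample_counts.items()}

def weighted_mean(values, groups, weights):
    """Mean of values where each respondent counts by their group's weight."""
    num = sum(v * weights[g] for v, g in zip(values, groups))
    den = sum(weights[g] for g in groups)
    return num / den

# Hypothetical sample: women are overrepresented (70%) versus the population (50%).
weights = poststrat_weights({"women": 70, "men": 30}, {"women": 0.5, "men": 0.5})
scores = [4] * 70 + [3] * 30
groups = ["women"] * 70 + ["men"] * 30
print(sum(scores) / len(scores))               # unweighted mean
print(weighted_mean(scores, groups, weights))  # weighted mean
```

The unweighted mean leans toward the overrepresented group; after weighting, each group contributes in line with its population share.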
Categorizing open-ended responses
Coding organizes open-ended responses into themes, allowing quantitative analysis.
Top 2-Box score
Adds the percentages for the two highest rating categories, showing positive sentiment.
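Top-box and top-2-box scores can be computed the same way, as in this sketch that assumes a 1 to 5 rating scale. The function name `top_box` and the ratings are illustrative.

```python
def top_box(ratings, scale_max=5, boxes=1):
    """Percentage of ratings that fall in the top `boxes` categories of the scale."""
    threshold = scale_max - boxes + 1
    hits = sum(1 for r in ratings if r >= threshold)
    return 100 * hits / len(ratings)

ratings = [5, 4, 4, 3, 5, 2, 4, 5, 1, 3]  # hypothetical 5-point responses
print(top_box(ratings))           # share of 5s
print(top_box(ratings, boxes=2))  # share of 4s and 5s
```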
Using the median with outliers
Median is less affected by outliers than the mean, making it more representative with skewed data.
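A quick illustration with made-up values, using Python's standard `statistics` module: one extreme response pulls the mean far from the typical value while the median barely moves.

```python
from statistics import mean, median

# Hypothetical incomes in thousands; the last value is an outlier.
incomes = [30, 32, 35, 38, 40, 500]

print(mean(incomes))    # pulled upward by the outlier
print(median(incomes))  # stays near the typical respondent
```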
Comparative Analysis by Demographics
Cross-tabulation helps compare survey responses across demographic groups like age or gender.
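A cross-tabulation is essentially a count of responses for every combination of two variables. The sketch below uses only the standard library; the age groups and answers are invented for illustration.

```python
from collections import Counter

def crosstab(pairs):
    """Count (row, column) pairs into a nested dict: table[row][column] -> count."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

# Hypothetical (age group, answer) responses.
responses = [
    ("18-34", "Yes"), ("18-34", "No"), ("18-34", "Yes"),
    ("35+", "No"), ("35+", "No"), ("35+", "Yes"),
]
table = crosstab(responses)
for age_group, row in table.items():
    print(age_group, row)
```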
Shifting around a value
Standard deviation indicates how tightly responses cluster around the mean, describing the spread of the data.
Balancing over- or underrepresentation
Weighting corrects for demographic imbalances to better represent the population.
Summarizing typical values
Measures like mean, median, and mode help summarize the central tendency in a dataset.
Segmenting by Demographics
Cross-tabulation segments responses by demographics, revealing insights specific to different groups.
Describing categorical responses
Percentages summarize categorical data, useful for reporting on proportions.
Quantifying open-ended questions
Coding open-ended responses categorizes them for easier quantitative analysis.
Primary purpose of inferential statistics
Aims to make inferences about a population based on sample data.
Null Hypothesis
Assumes no difference exists between groups in the target population.
Interpreting p-value with significance level
If a p-value is lower than the significance level, the result is considered statistically significant.
Type I Error in Hypothesis Testing
Occurs when a true null hypothesis is incorrectly rejected.
T-test for comparing means between two groups
Used when comparing mean scores between two groups on a continuous variable.
Correlation coefficient close to 1
Indicates a strong positive relationship.
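For reference, a minimal sketch of the Pearson correlation coefficient computed by hand. The function name `pearson_r` and the data are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

print(pearson_r([1, 2, 3], [2, 4, 6]))              # perfectly linear, at or very near 1
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))  # positive but weaker
```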
Meaning of p-value in hypothesis testing
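Indicates the probability of observing a result at least as extreme as the sample result, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true.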
ANOVA (for comparing 3 or more groups)
Used when comparing mean scores across 3 or more groups.
Interpretation of p-value with null and alternative hypothesis
If the p-value is greater than alpha, fail to reject the null hypothesis; the data do not provide significant evidence for the alternative hypothesis.
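The decision rule can be illustrated with a permutation test, which is a swapped-in technique chosen here because it needs only the standard library (it is not the t-test or ANOVA from the other cards). The p-value is estimated as the share of random regroupings whose difference in means is at least as large as the observed one. Scores, group names, and the 5,000-shuffle setting are assumptions.

```python
import random
from statistics import mean

def permutation_p_value(a, b, n_shuffles=5000, seed=0):
    """Two-sided p-value for a difference in means, estimated by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles

# Hypothetical satisfaction scores for two groups.
group_a = [8, 9, 7, 9, 8, 10]
group_b = [5, 6, 5, 7, 6, 4]
alpha = 0.05
p = permutation_p_value(group_a, group_b)
print("reject H0" if p < alpha else "fail to reject H0")
```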
When to use paired T-test
Suitable for comparing related scores from the same respondents.
Type II Error in hypothesis testing
Happens when a false null hypothesis is not rejected, missing a true effect.
Conjoint analysis
Identifies which combinations of features are most preferred by customers.
Linear regression
Quantifies relationships between predictor and outcome variables, useful in forecasting.
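A minimal least-squares sketch with one predictor, using invented ad-spend and sales figures. The helper `fit_line` is hypothetical and written out by hand to keep the arithmetic visible.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

ad_spend = [1, 2, 3, 4, 5]   # hypothetical predictor
sales = [3, 5, 7, 9, 11]     # hypothetical outcome
slope, intercept = fit_line(ad_spend, sales)
forecast = intercept + slope * 6  # predicted sales at a spend of 6
print(slope, intercept, forecast)
```

The slope quantifies how much the outcome changes per unit of the predictor, which is what makes the fitted line usable for forecasting.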
Chi-squared test
Best suited for testing whether two categorical variables are associated.
Multiple regression
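As a sketch, the chi-squared statistic compares the observed counts in a contingency table with the counts expected if the two variables were unrelated. The table values below are invented; the function name `chi_squared` is illustrative.

```python
def chi_squared(table):
    """Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical 2x2 table: rows = two customer segments, columns = Yes / No.
stat = chi_squared([[30, 20], [20, 30]])
critical_value = 3.841  # chi-squared critical value for df = 1 at alpha = 0.05
print(stat, stat > critical_value)
```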
Allows the examination of multiple factors together to predict an outcome, such as sales.
Key in probability sampling
Every member has a known, nonzero chance of selection; the chances need not be equal.
Main purpose of sampling
Reduce cost and time.
What are types of probability sampling?
Types include simple random, stratified, and systematic sampling.
Best method for anonymous employee surveys
Online surveys.
Example of survey implementation bias
Leading questions, double-barreled questions.
What affects online survey response rates negatively?
Survey length and complexity.
Technique to adjust underrepresented survey results
Weighting.
What are percentages commonly used with?
Percentages are commonly used with categorical responses.
How are Top-Box scores calculated?
Percentage of highest ratings.
Technique that can be used to segment responses (e.g., by age group)
Cross-tabulation.