probability sampling
uses random selection so that members of a population have known (often equal) probabilities of being chosen
considered more sound than non-probability methods because probability sampling results can be used to describe the target population as a whole
non-probability sampling
techniques that do not rely on random sampling of the target population but rather upon the researcher’s subjective judgement about how the sample should be structured, or upon access to the sample
therefore the odds of any member of the population as a whole being selected cannot be calculated
confidence level
a measure of how confident researchers can be that, if a study were replicated, the same results would be returned
margin of error
an indicator, expressed as a percent range, that estimates how much a statistic can fluctuate when it is used to describe the true population. aka confidence interval or sampling error
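A rough sketch of how a margin of error might be calculated for a survey percentage. It assumes a simple random sample and uses the common normal-approximation formula z * sqrt(p(1 - p) / n); the formula, the function name, and the z-value of 1.96 (roughly a 95% confidence level) are assumptions for illustration and are not taken from these notes.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p: observed proportion (0 to 1)
    n: sample size
    z: z-score for the chosen confidence level (1.96 is roughly 95%)
    Assumes a simple random sample drawn from a large population.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical example: 50% of 400 respondents chose an option
moe = margin_of_error(0.5, 400)
print(f"+/- {moe * 100:.1f} percentage points")
```

A larger sample size shrinks the margin of error, while a higher confidence level (larger z) widens it.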
sample
a subset of a population; in market research, the population refers to the target consumers or target market
Researchers rely on a sample (subset of population)
result from the sample is called a statistic
census
study that includes EVERY member of a population
result is called a parameter
because a census collects data from every member of a population, it may be assumed that it is a superior technique that researchers should strive to achieve, but this is often not the case
simple random sampling
a completely random method of selecting subjects
random number generator can be used to select those on the list for participation
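A minimal sketch of simple random sampling with a random number generator, using Python's standard `random` module. The sample list, the function name, and the seed are made up for illustration.

```python
import random

def simple_random_sample(sample_list, n, seed=None):
    """Pick n members from the list without replacement, each equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(sample_list, n)

# Hypothetical sample list of 100 potential respondents
population = [f"respondent_{i}" for i in range(1, 101)]
chosen = simple_random_sample(population, 10, seed=42)
print(chosen)
```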
stratified random sampling
sampling technique where subjects are split into mutually exclusive groups (called strata) before sampling is employed
simple random sampling is then employed for those groups
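A sketch of stratified random sampling: split the list into strata, then run a simple random sample inside each one. Drawing the same fraction from every stratum (proportional allocation) is an assumption here, as are the field names and data.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_key, fraction, seed=None):
    """Split records into mutually exclusive strata, then sample each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[rec[stratum_key]].append(rec)

    sample = []
    for members in strata.values():
        # Take the same fraction from every stratum (at least one member)
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 people in the north, 40 in the south
people = ([{"id": i, "region": "north"} for i in range(60)]
          + [{"id": i, "region": "south"} for i in range(60, 100)])
picked = stratified_sample(people, "region", 0.1, seed=1)
print(len(picked))
```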
quota sampling
involves gathering a sample in a proportional relation to predefined groups (cells) in the target population
aims to ensure that there is enough data per demographic (age, gender, etc)
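One way quota cells could be tracked, as a sketch only: turn each cell's share of the target population into a target number of completes, then check whether a cell is still open. The shares, cell labels, and function names are hypothetical.

```python
def quota_targets(population_shares, total_sample):
    """Translate each cell's population share into a target number of completes."""
    return {cell: round(share * total_sample)
            for cell, share in population_shares.items()}

def cell_is_open(cell, completes, targets):
    """True while a cell still needs more completed surveys."""
    return completes.get(cell, 0) < targets[cell]

# Made-up age cells and population shares
shares = {"18-34": 0.30, "35-54": 0.45, "55+": 0.25}
targets = quota_targets(shares, 200)
print(targets)
```

Unlike stratified random sampling, nothing here selects respondents at random; the quotas only cap how many completes are accepted per cell.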
snowball sampling
research participants themselves suggest other members to participate in the study
works well in populations that are difficult to find or identify
screening questions
used to qualify potential respondents to take part in a study
research panels
a collected group of potential respondents who opt in to participate in future research studies
once opted in, panel members provide their contact info along with their demographic and psychographic info
helps researchers target potential respondents who qualify for particular studies
some companies maintain their own panels
appeal is having access to a readily available sample of respondents from which to choose
disadvantage: members are not exactly a random sample of the population
online/mobile survey distribution (survey distribution methods)
self-administered surveys taken via the internet. broken down into 2 approaches
emailed or text invitations
advantages
respondents are affiliated with the topic in some way → helps response rate
surveys typically have to be brief, but since this group of people have a relationship with the survey topic, they can get away with longer surveys
ability to send targeted reminders
unique URL - company has respondents' data, like demographics
surveys accessed directly on websites
social media sites or elsewhere online when direct contact info is not given
surveys via mail - snail mail (survey distribution methods)
best for sensitive topics and when respondents need to gather info and contemplate their responses
useful when targeting a specific geographic location or when it is necessary to reach respondents in places with unreliable internet access
response rates low, costs are high
telephone interviews (survey distribution methods)
calling potential respondents and inviting them to participate in an interviewer-led survey
when contact is too general people do this
benefit is that the interviewer can clarify questions and probe
costs are higher, results lower
response rate
percentage of those who completed a survey out of the potential respondents on sample list
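The definition above as a small worked calculation. The counts are invented for illustration.

```python
def response_rate(completed, invited):
    """Percentage of potential respondents on the sample list who completed the survey."""
    if invited <= 0:
        raise ValueError("sample list must contain at least one potential respondent")
    return completed / invited * 100

# Hypothetical study: 1,200 people on the sample list, 150 completes
rate = response_rate(completed=150, invited=1200)
print(f"{rate:.1f}%")
```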
response bias
results from instances when a respondent answers a question in a way that misrepresents the truth
self-selection bias
occurs when respondents feel strongly about a survey topic and are, therefore, more likely to participate
non-response error bias
occurs when there is a significant difference between results of those who completed a survey and those who did not complete a survey
social desirability bias
occurs when respondents alter responses either to inflate their self-worth or provide responses that seem more socially acceptable than the truth
piping and screening
piping - a survey software system’s automatic insertion of text into a survey based on information from the sample list or based on responses to prior questions in the survey
screening - set of questions, typically delivered verbally or via written questionnaire, used to establish that a prospective participant is a good fit for a given research project
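A toy sketch of both ideas together: a screener decides whether someone qualifies, and piping inserts sample-list data and a prior answer into later question text. The screening rule, field names, and question wording are all hypothetical; real survey software handles this through its own settings.

```python
def passes_screener(answers):
    """Hypothetical screening rule: adults who bought the product recently."""
    return answers["age"] >= 18 and answers["purchased_last_6_months"]

def pipe(template, **values):
    """Insert known values into question text, the way piping does."""
    return template.format(**values)

# Info from the sample list and from an earlier question (made up)
respondent = {"first_name": "Dana", "age": 34, "purchased_last_6_months": True}
prior_answers = {"favorite_brand": "Brand A"}

if passes_screener(respondent):
    question = pipe(
        "Thanks, {first_name}. Why do you prefer {favorite_brand}?",
        first_name=respondent["first_name"],
        favorite_brand=prior_answers["favorite_brand"],
    )
    print(question)
```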
ethics in survey research
code of standards and ethics for market research and data analytics
Insights Association is a membership organization representing companies, data analytics departments, and individuals working in the marketing, opinion, social research, and data analytics professions
developed a code of standards and ethics
GDPR - general data protection regulation
comprehensive data privacy regulation that was introduced by the EU to protect the personal data and privacy of EU citizens
main goal is to give individuals greater control over their personal data and to simplify the regulatory environment for businesses operating within the EU
GDPR includes:
consent
data subject rights - GDPR grants individuals various rights over their personal data (erase, access, rectify, etc)
data portability - individuals have the right to receive their data in a commonly used and machine-readable format and transmit it to another controller
data breach notification - orgs are required to report data breaches
privacy by design and default - orgs are encouraged to implement privacy measures from the beginning of the data processing lifecycle and to ensure that privacy is the default setting
data protection officers - some organizations are required to appoint a data protection officer (DPO) to oversee GDPR compliance
cross-border data transfers - GDPR imposes restrictions on the transfer of personal data outside the EU to ensure that data is adequately protected
noncompliance with GDPR can result in large fines
confidentiality
refers to not sharing individual responses with anyone outside of the research team, including the study sponsor
researchers themselves have access to the individual data (contact info and responses) since raw data from surveys for instance is collected individually before being aggregated
great care is taken to safeguard info so that it is only used by appropriate research staff
anonymity
refers to data in which even the researcher cannot identify respondents.
Respondents are anonymous
can be achieved by sending a generic link to the sample without identifying characteristics embedded
creates some challenges: can’t send targeted reminders, because there is no way to tell who has already taken the survey
instead “blanket” reminders are sent → all potential respondents get a reminder
forgoes the ability to link responses to demographic information in the sample, which could be used for ensuring a representative sample of the population or for segmenting results
measure of central tendency
summary measure that helps describe a set of data in a single value
the value in a measure of central tendency represents the center of distribution
mean
sum of all the values divided by the number of responses
median
middle value in a distribution when the values are arranged in ascending or descending order
mode
most commonly occurring value in a distribution
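The three measures of central tendency (mean, median, mode) computed on one small set of ratings with Python's standard `statistics` module. The ratings are made up.

```python
import statistics

# Hypothetical responses to a 1-5 rating question
ratings = [5, 3, 4, 4, 2, 5, 4]

mean_rating = statistics.mean(ratings)      # sum of values / number of responses
median_rating = statistics.median(ratings)  # middle value once sorted
mode_rating = statistics.mode(ratings)      # most commonly occurring value

print(round(mean_rating, 2), median_rating, mode_rating)
```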
range
difference between smallest value and largest value of set. provides an idea of how widely spread out the most extreme responses are
standard deviation
the standard deviation is a stat that measures the dispersion of a dataset relative to its mean. the further the data points are from the mean, the higher the standard deviation
tells the researcher if the responses are concentrated around the mean or if they are scattered
square root of variance
variance = subtract the mean from each data point, square those differences, add them up, and divide by N-1. the square root of the variance is the standard deviation
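The range and the step-by-step standard deviation calculation (dividing by N-1), written out as a sketch. The scores are invented for illustration.

```python
import math

def sample_std_dev(values):
    """Standard deviation following the steps above, dividing by N-1."""
    n = len(values)
    mean = sum(values) / n
    squared_diffs = [(x - mean) ** 2 for x in values]  # subtract the mean, then square
    variance = sum(squared_diffs) / (n - 1)            # add up, divide by N-1
    return math.sqrt(variance)                         # square root of the variance

# Hypothetical scores
scores = [2, 4, 4, 4, 5, 5, 7, 9]
value_range = max(scores) - min(scores)  # largest value minus smallest value
sd = sample_std_dev(scores)
print(value_range, round(sd, 2))
```

The standard library's `statistics.stdev` performs the same N-1 calculation and can be used as a cross-check.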
raw data
data that has not been aggregated or summarized in any way
data cleaning / clean data
refers to process of identifying and correcting errors, inconsistencies and inaccuracies in datasets
may involve
handling missing data
removing duplicates
standardizing formats
correcting inconsistencies
dealing with outliers
ensuring accuracy
coding categorical variables
transforming variables
excluding data from analysis
clean data = data that has no errors, is consistent, and accurate
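A small sketch of a few of the cleaning steps listed above (removing duplicates, standardizing formats, handling missing data, dealing with outliers) applied to raw responses. The field names, the 1-5 rating scale, and the rules themselves are assumptions; real projects set their own cleaning rules.

```python
def clean_responses(rows):
    """Apply a few cleaning steps to a list of response dicts."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        # Removing duplicates: keep the first record per respondent id
        if row["id"] in seen_ids:
            continue
        seen_ids.add(row["id"])

        # Standardizing formats: trim whitespace and unify letter case
        gender = (row.get("gender") or "").strip().lower()
        # Handling missing data: store None rather than an empty string
        gender = gender or None

        # Dealing with outliers: treat ratings outside 1-5 as invalid
        rating = row.get("rating")
        if rating is not None and not 1 <= rating <= 5:
            rating = None

        cleaned.append({"id": row["id"], "gender": gender, "rating": rating})
    return cleaned

raw = [
    {"id": 1, "gender": " Female ", "rating": 4},
    {"id": 1, "gender": " Female ", "rating": 4},  # duplicate record
    {"id": 2, "gender": "", "rating": 11},         # missing gender, impossible rating
    {"id": 3, "gender": "MALE", "rating": 5},
]
cleaned = clean_responses(raw)
print(cleaned)
```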
top box scores
while mean scores are usually used when reporting results of interval questions (such as those with rating scales), percentages can be used as well to help summarize and compare the responses in a different way
top box scores use the percentage of respondents who answered using the highest value in a rating scale
may be used instead of means to isolate and report the proportion of respondents who are very positive about a feature or a concept
often used as a way to report the level of brand loyalty
top 2-box scores combine the percentages of those who answered with the highest and second-highest values in a rating scale
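Top box and top 2-box scores as a short calculation, assuming a 5-point scale where 5 is the highest value. The ratings and function name are made up.

```python
def top_box(ratings, scale_max=5, boxes=1):
    """Percent of respondents who answered within the top `boxes` scale values."""
    threshold = scale_max - boxes + 1
    hits = sum(1 for r in ratings if r >= threshold)
    return hits * 100 / len(ratings)

# Hypothetical responses on a 1-5 scale
ratings = [5, 4, 3, 5, 2, 4, 5, 1, 4, 5]
top1 = top_box(ratings)           # share answering 5
top2 = top_box(ratings, boxes=2)  # share answering 4 or 5
print(top1, top2)
```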
word clouds
text-based visuals that give greater prominence to words that appear more frequently in a data set
provides a visual summary of qualitative data, validates survey results, and identifies key themes in research findings
weighting data
refers to giving more or less power (or weight) to data in underrepresented or overrepresented segments so that the data more closely matches the mix of the target population
stat software can be used to weight certain cases in the sample to a desired level
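One simple way a weight could be derived, as a sketch: divide each segment's share of the target population by its share of the sample. This particular formula, the segment names, and the numbers are assumptions for illustration; stat software may use more involved methods.

```python
def segment_weights(population_shares, sample_counts):
    """Weight per segment = population share / sample share."""
    total = sum(sample_counts.values())
    return {seg: population_shares[seg] / (sample_counts[seg] / total)
            for seg in sample_counts}

# Hypothetical: population is split 50/50, but the sample came back 60/40
population_shares = {"women": 0.5, "men": 0.5}
sample_counts = {"women": 60, "men": 40}
weights = segment_weights(population_shares, sample_counts)
print(weights)
```

Overrepresented segments get a weight below 1 and underrepresented segments get a weight above 1, so the weighted sample matches the population mix.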
coding open ended responses
coding open ended data = categorizing responses into themes. oftentimes there may be several different themes presented in one comment
there are software programs that can help with the coding process, but more often than not coding has to be a somewhat manual activity
involves developing “codes” to assign comments to themes
to uncover the major themes, percentages are calculated for each code
the denominator is typically the total number of comments gathered for a particular question and the numerator is the raw number of comments that correspond to each respective code
transforms qualitative data → quantitative
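A sketch of the percentage step, assuming the comments have already been coded by hand. The comments, code names, and function name are invented. Because one comment can carry several codes, the percentages can add up to more than 100.

```python
from collections import Counter

# Hypothetical comments, each already assigned one or more codes (themes)
coded_comments = [
    {"comment": "Too expensive, but support was great", "codes": ["price", "service"]},
    {"comment": "Love the customer service", "codes": ["service"]},
    {"comment": "Costs too much", "codes": ["price"]},
    {"comment": "Shipping was slow", "codes": ["delivery"]},
]

def code_percentages(comments):
    """Numerator: comments per code. Denominator: total comments for the question."""
    counts = Counter(code for c in comments for code in set(c["codes"]))
    total = len(comments)
    return {code: count * 100 / total for code, count in counts.items()}

percentages = code_percentages(coded_comments)
print(percentages)
```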