UMD SURV400 Midterm Flashcards


Last updated 4:15 AM on 12/4/25


233 Terms

New cards

What is a survey?

A systematic method for gathering information from entities to construct estimates of their attributes on average.

New cards

What are examples of major surveys?

- Gallup public opinion polls

- Harvard Adult Development Survey

- General Social Survey

New cards

Why are surveys rapidly changing today?

Technology has changed how surveys are conducted and how people interact, causing the field to evolve faster than education and research can keep up.

New cards

What is survey methodology?

A field and profession that studies survey design, collection, processing, and analysis; it is multidisciplinary (statistics, psychology, sociology, etc.).

New cards

Survey

A systematic method for gathering information from entities (often people) to construct estimates of the attributes those entities have on average

New cards

Survey Methodology

A scientific field and profession that seeks to identify and study principles pertaining to survey design, collection, processing, and analysis

New cards

Data Science

Extracting meaning from and interpreting data using tools and methods from statistics and machine learning

New cards

Descriptive Research Question

Seeks to summarize characteristics of a set of data without interpretation, reporting only facts or attributes (e.g., "What is the average number of doctor visits reported by respondents?")

New cards

Exploratory Research Question

Analyzes data for patterns, trends, or relationships between variables. These questions are unplanned and are used for hypothesis generation (e.g., "Is there an age difference in doctor visits?")

New cards

Predictive Research Question

Determines whether one or more phenomena can be used to forecast some future outcome. The focus is on what predicts the outcome, not on why (e.g., "Can we guess whether a single household will be less likely to respond?")

New cards

Causal Research Question

Asks whether changing one factor will change another factor. Establishing cause and effect requires randomized controlled trials or experiments (e.g., "Will this drug intervention reduce illicit drug use?")

New cards

Inferential Research Question

Uses a sample to make conclusions about a larger population

New cards

Mechanistic Research Question

Asks about the exact mechanism or process by which something occurs

New cards

Good Research Question Criteria

Must be: (1) of interest to your audience, (2) not already answered, (3) stemming from a plausible framework, (4) falsifiable, and (5) specific

New cards

Data Generation Process

The method by which data are collected. Surveys often vary widely in size and quality, depend on modality, are often cross-sectional, and usually involve a sample rather than a census

New cards

Data Curation/Storage

Includes editing, de-identification, data entry, coding, error checking, dataset construction, codebook construction, building weights, and imputing missing values

New cards

Data Analysis

Using statistical models (t-tests, ANOVAs, regression) to make inferences about populations from samples, or machine learning models (KNN, decision trees, logistic regression) to make predictions
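
As a rough illustration of the inferential side of this card, the sketch below runs a two-sample t-test and a simple linear regression on simulated data. The variables `group_a`, `group_b`, `x`, and `y` are made up for the example and the seed is arbitrary.

```r
set.seed(400)  # arbitrary seed so the sketch is reproducible

# Two hypothetical groups of survey responses
group_a <- rnorm(50, mean = 3, sd = 1)
group_b <- rnorm(50, mean = 4, sd = 1)

# t-test: is the difference in sample means larger than chance would suggest?
result <- t.test(group_a, group_b)
result$p.value

# Regression: estimate the slope relating a predictor to an outcome
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20)
fit <- lm(y ~ x)
coef(fit)  # intercept and slope estimates
```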

New cards

Data Output/Access

Communicating results through papers, dashboards, videos, blogs (science communication is almost as important as the science itself)

New cards

Total Survey Error (TSE)

A framework for thinking about the various sources of error that may affect survey statistics. Errors reflect uncertainty in an inference, not necessarily mistakes

New cards

Construct

Elements of information (variables) sought by researcher: usually abstract, described by words, often latent and not directly observable (e.g., happiness, quality of life, belief in God)

New cards

Measurement/Operationalization

Linking theoretical constructs to observable variables through the step-by-step protocols implemented to gather data. The construct is the "what" and measurement is the "how"

New cards

Response

The respondent value(s) from your measurement scheme (e.g., answers to questions, blood pressure readings)

New cards

Edited Response

Transforming data for specific use, including coding (text to numbers), acceptable answer sets, consistency rules, and reverse scoring negatively worded items
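
Two of the edits named above, coding and reverse scoring, might look like this in R. This is only a sketch: the 5-point agreement item and the names `raw`, `codes`, `coded`, and `reversed` are hypothetical.

```r
# Hypothetical text answers to a 5-point agreement item
raw <- c("Agree", "Strongly disagree", "Neutral", "Strongly agree")

# Coding: map text labels to numbers with a named lookup vector
codes <- c("Strongly disagree" = 1, "Disagree" = 2, "Neutral" = 3,
           "Agree" = 4, "Strongly agree" = 5)
coded <- unname(codes[raw])

# Reverse scoring for a negatively worded item on a 1-5 scale
reversed <- 6 - coded

coded     # 4 1 3 5
reversed  # 2 5 3 1
```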

New cards

Target Population

The set of units to be studied: often abstractly defined with several ways to operationalize (e.g., adults in the US, users of a social media platform)

New cards

Sampling Frame

A set of units identified in some way that they could be sampled and located: ideally, every unit in the target population appears once and only once

New cards

Sample

A subset of the population from which measurements are drawn: the goal is to make inferences about the population from the sample

New cards

Respondents

Sample units that were successfully measured: the respondent pool may or may not equal the sample size

New cards

Post-survey Adjustments

Changes to survey data to make estimates better reflect the full target population, including selection weights, imputation, nonresponse weights, and poststratification

New cards

True Value

An idealized concept of a quantity to be measured: abstract and never truly known, but serves as a standard for comparison

New cards

Interviewer Variance

Error arising from different interviewers collecting different data despite having the same training, procedures, and workloads

New cards

Interviewer Bias

When personal factors of the interviewer systematically impact data collection

New cards

Sampling Variance

Variation in values of a survey statistic because different subsets of the population fall into samples over replications of the same survey design, measured via confidence intervals and standard errors
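
The idea of "replications of the same survey design" can be made concrete with a small simulation. The sketch below assumes a made-up population and simple random sampling; the population size, sample size, and seed are arbitrary choices for illustration.

```r
set.seed(400)  # arbitrary seed so the sketch is reproducible

# Hypothetical population of 10,000 values
population <- rnorm(10000, mean = 50, sd = 10)

# Replicate the same design: 1,000 simple random samples of n = 100
sample_means <- replicate(1000, mean(sample(population, size = 100)))

# The spread of the estimates across replications is the sampling variance
var(sample_means)

# In practice we have one sample; its standard error estimates that spread
one_sample <- sample(population, size = 100)
se <- sd(one_sample) / sqrt(length(one_sample))
se
```

The standard deviation of `sample_means` and the single-sample `se` should land close to each other, which is why one sample is enough to estimate sampling variance.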

New cards

Sampling Bias

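Consistent failure to estimate a population value correctly because some members of the population have no chance (or a reduced chance) of being selected (e.g., relying on intro psych students when interested in all emerging adults). In expectation, sampling bias is 0 for probability samples in which every frame element has a nonzero chance of selection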

New cards

Accuracy (Quality Dimension)

Total survey error is minimized

New cards

Credibility (Quality Dimension)

Data considered trustworthy by the survey community

New cards

Comparability (Quality Dimension)

Demographic, spatial, and temporal comparisons are valid

New cards

Relevance (Quality Dimension)

Data satisfy users' needs

New cards

Timeliness (Quality Dimension)

Data deliveries adhere to schedule

New cards

Completeness (Quality Dimension)

Data rich enough to satisfy analysis objectives without undue burden on respondents

New cards

R

An open-source statistical programming language

New cards

RStudio

An integrated development environment (IDE) for R that enhances R's usability by allowing you to keep track of objects, plots, run scripts, and more

New cards

Object

A container in R that stores information: you assign information using <- or = (e.g., course <- 400)

New cards

Numeric Data Type

Data consisting of all real numbers including whole numbers and decimals (e.g., num <- 6.5)

New cards

Integer Data Type

Data consisting only of whole numbers, denoted by an L (e.g., int <- 4L): a more efficient way of storing whole numbers

New cards

Character Data Type (String)

Text-based data enclosed in quotes (e.g., char <- "hello"). If the text is not in quotes, R will try to interpret it as an object

New cards

Logical Data Type (Boolean)

Data that take on TRUE or FALSE values (must be capitalized). Logical values result from comparisons like 3 > 2

New cards

Factor Data Type

A data structure for categorical variables that can be ordered or unordered. Technically a structure, not a basic data type

New cards

class() Function

Command to check what data type is contained in an object
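
A quick way to see the data types from the preceding cards is to pass one object of each kind to class(). The object names `lgl` and `fct` are illustrative additions; `num`, `int`, and `char` follow the earlier cards.

```r
num  <- 6.5       # numeric
int  <- 4L        # integer (note the L suffix)
char <- "hello"   # character
lgl  <- 3 > 2     # logical, the result of a comparison
fct  <- factor(c("low", "high", "low"))  # factor

class(num)   # "numeric"
class(int)   # "integer"
class(char)  # "character"
class(lgl)   # "logical"
class(fct)   # "factor"
```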

New cards

Vector

A one-dimensional object that holds a single data type, created using c(), which stands for combine (often described as concatenate)

New cards

Data Coercion Hierarchy

When mixing data types in vectors, R forces them to be the same following: character > numeric > integer > logical
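
The hierarchy can be checked one step at a time by mixing two types and asking for the class of the result. The object name `mixed` is just for illustration.

```r
class(c(TRUE, 2L))    # "integer": logical is promoted to integer
class(c(2L, 3.5))     # "numeric": integer is promoted to numeric
class(c(3.5, "a"))    # "character": numeric is promoted to character

# Mixing all four types ends at the top of the hierarchy
mixed <- c(TRUE, 2L, 3.5, "a")
mixed         # "TRUE" "2" "3.5" "a"
class(mixed)  # "character"
```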

New cards

length() Function

Returns the number of elements in a vector

New cards

Data Frame

R's primary means of data storage, similar to a spreadsheet where you can mix data types between columns but not within columns

New cards

nrow() and ncol() Functions

Return the number of rows and columns in a data frame, respectively. Only work on multi-dimensional objects
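
A small sketch of these functions on a made-up data frame (the names `df`, `name`, and `ratings` are hypothetical). It also shows what happens when nrow() is given a plain vector.

```r
df <- data.frame(
  name    = c("Ana", "Ben", "Cy"),
  ratings = c(8, 6, 9),
  stringsAsFactors = FALSE
)

nrow(df)            # 3
ncol(df)            # 2

# A vector has no dimensions, so nrow() returns NULL; use length() instead
nrow(df$ratings)    # NULL
length(df$ratings)  # 3
```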

New cards

Indexing with []

Accessing specific elements of an object. R uses 1-based indexing (first element is position 1)

New cards

$ Operator

Used to index a named column in a data frame (e.g., df$column_name). Generally the preferred method

New cards

subset() Function

A more intuitive way to subset data frames based on conditions (e.g., subset(df, ratings > 7))
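
The three access methods from the last few cards can be compared side by side. The data frame below is invented for the example; only the `ratings > 7` condition comes from the card.

```r
df <- data.frame(
  title   = c("A", "B", "C", "D"),
  ratings = c(9, 5, 8, 7),
  stringsAsFactors = FALSE
)

ratings <- df$ratings   # $ pulls out a named column as a vector
ratings[1]              # 9, the first element (1-based indexing)
df[2, "ratings"]        # 5, row 2 of the ratings column

# subset() keeps the rows where the condition is TRUE
high <- subset(df, ratings > 7)
high$title              # "A" "C"

# The same rows using [] with a logical condition
df[df$ratings > 7, ]
```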

New cards

for Loop

A programming technique that runs a block of code a pre-specified number of times. Structure: for(i in 1:10){ code }
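
As a minimal example of that structure, the loop below adds the numbers 1 through 10 to a running total. The variable name `total` is arbitrary.

```r
total <- 0
for (i in 1:10) {
  total <- total + i   # runs once for each value of i
}
total  # 55
```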

New cards

install.packages() Function

Installs a package from CRAN. This only needs to be done once and requires quotes around the package name

New cards

library() Function

Loads an installed package for use in the current R session and must be done every time you start a new session

New cards

read.csv() Function

Reads a CSV file into R as a data frame
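
To try read.csv() without needing a file on disk, the sketch below first writes a tiny CSV to a temporary location and then reads it back. The file contents and the object name `survey` are made up.

```r
# Write a small hypothetical CSV to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c("id,age", "1,34", "2,51"), path)

# Read it back in as a data frame
survey <- read.csv(path)

class(survey)  # "data.frame"
nrow(survey)   # 2
survey$age     # 34 51
```

With a real file, `path` would be a file name relative to the working directory or a full path.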

New cards

Working Directory

The default location where R searches for files: check with getwd(), change with setwd()

New cards

Commenting with #

Anything after # in a line will not be interpreted by R. Comments are used for transparency and reproducibility

New cards

summary() Function

Provides a five-number summary (min, 25th percentile, median, 75th percentile, max) and mean for numeric variables

New cards

mean(), sd(), var(), cor(), median()

Functions for calculating descriptive statistics on vectors
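
A short worked example of these functions, together with summary(), on a made-up vector of doctor visits. The numbers were chosen so the results are easy to check by hand.

```r
visits <- c(2, 4, 4, 5, 10)   # hypothetical number of doctor visits

mean(visits)     # 5
median(visits)   # 4
var(visits)      # 9
sd(visits)       # 3

# cor() needs two vectors; a perfect linear relationship gives 1
doubled <- 2 * visits + 1
cor(visits, doubled)

# Five-number summary plus the mean
summary(visits)
```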

New cards

package::function() Syntax

Calling a function while explicitly stating which package it comes from, preferred for clarity and transparency
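
The syntax can be tried with a package that ships with R, so nothing needs to be installed. The sketch uses sd() from the built-in stats package; the vector `x` is arbitrary.

```r
x <- c(1, 2, 3, 4)

# Explicit form: states that sd() comes from the stats package
explicit <- stats::sd(x)

# Implicit form: works because stats is attached by default
implicit <- sd(x)

identical(explicit, implicit)  # TRUE
```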

New cards

Mode of Data Collection

The method by which survey data are collected (e.g., face-to-face, telephone, mail, web)

New cards

CAPI (Computer-Assisted Personal Interviewing)

Computer displays questions on screen, interviewer reads them to respondent and enters answers

New cards

CATI (Computer-Assisted Telephone Interviewing)

Telephone counterpart to CAPI. Interviewer calls respondent, reads questions from computer, enters responses

New cards

CASI (Computer-Assisted Self-Interviewing)

Respondent completes survey on computer themselves. This can include text, audio, or video stimuli

New cards

ACASI (Audio Computer-Assisted Self-Interviewing)

Respondent sees questions on computer, hears recorded audio of questions, and enters their own answers. This increases privacy for sensitive topics

New cards

IVR (Interactive Voice Response)

Telephone counterpart of ACASI: respondent hears recorded questions over the phone and answers by keypad or by speaking

New cards

TDE (Touchtone Data Entry)

Respondents call toll-free number, hear recorded questions, enter data using telephone keypad

New cards

SAQ (Self-Administered Questionnaire)

Paper questionnaire completed by respondent without interviewer present

New cards

CSAQ (Computerized Self-Administered Questionnaire)

Electronic version of SAQ completed on computer

New cards

Coverage Error (Mode)

Error due to the fact that not every unit in the population is represented on the sampling frame. Telephone surveys exclude those without phones, and web surveys exclude those without internet

New cards

Nonresponse Error (Mode)

Error that varies across modes. People may be more likely to complete certain types of surveys than others

New cards

Measurement Error (Mode)

Deviations of answers from true values. Sources include respondent, interviewer, instrument/questionnaire, and mode of data collection

New cards

Interviewer Effects (Positive)

Interviewers achieve higher response rates, motivate respondents, probe inadequate responses, provide feedback, clarify questions

New cards

Interviewer Effects (Negative)

Interviewers can lead to biased responses on sensitive questions and are more expensive than self-administered modes

New cards

Social Desirability Bias (Mode)

Tendency to underreport socially undesirable behaviors (and overreport desirable ones) when an interviewer is present, as in face-to-face surveys. Self-administered modes reduce this

New cards

Fixed vs. Variable Costs

Fixed costs (e.g., postage for a set number of invites) vs. variable costs (e.g., hourly wages for an unknown number of interviews). The balance between them affects mode choice

New cards

Mixed-Mode Surveys

Using a combination of modes to compensate for the weaknesses of individual modes. This is increasingly common

New cards

Random Digit Dialing (RDD)

Sampling method for telephone surveys that randomly generates phone numbers. No equivalent exists for web surveys
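
A very simplified sketch of the idea: fix an area code and exchange, then draw the last four digits at random. The area code, exchange, and sample size below are hypothetical, and real RDD designs are more involved than this.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

area_code <- "301"   # hypothetical area code
exchange  <- "405"   # hypothetical exchange (prefix)

# Draw 5 distinct four-digit suffixes at random
suffix <- sample(0:9999, size = 5)

# Assemble full phone numbers, padding the suffix with leading zeros
numbers <- sprintf("%s-%s-%04d", area_code, exchange, suffix)
numbers
```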

New cards

Probability-Based Online Panels

Panels where members are recruited via probability sampling methods (e.g., LISS, AmeriSpeak, KnowledgePanel)

New cards

Non-Probability Online Panels

Panels where members self-select or are recruited through non-random means. These are common but potentially biased

New cards

Elements

The unit of observation in a study (e.g., customers, households, businesses, schools, tweets)

New cards

Target Population Characteristics

Must be: (1) finite (can theoretically be counted), (2) observable/accessible, and (3) specific to a time frame (implies boundaries of space and time)

New cards

Unambiguous Population Definition

You should be able to clearly place elements in or out of the target population: "young adults in college" is better than "all young adults"

New cards

Sampling Frame

A list of elements in the target population that can be sampled and located. Coverage varies depending on frame quality

New cards

Coverage

The percent of the target population included in the frame (generally theoretical since we rarely know the exact population size)

New cards

Perfect Coverage

Ideal but unrealistic situation where the sampling frame exactly matches the target population

New cards

Undercoverage

When some members of the target population are missing from the sampling frame. This is the primary coverage concern since we can't reach them

New cards

Overcoverage

When the frame contains units not in the target population: includes ineligibles, duplicates, and blanks and can be identified and removed

New cards

Undercoverage Bias

Bias introduced when there is a difference between covered and uncovered units on the statistic(s) of interest
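
One common way to express this is that the bias of the covered mean equals the uncovered share of the population times the difference between the covered and uncovered means. The sketch below checks that identity with invented counts and means; every number is hypothetical.

```r
# Hypothetical population split by frame coverage
n_covered      <- 900
n_uncovered    <- 100
mean_covered   <- 50
mean_uncovered <- 40

n_total <- n_covered + n_uncovered
coverage_rate <- n_covered / n_total   # 0.9

# True population mean combines both groups
mean_population <- (n_covered * mean_covered +
                    n_uncovered * mean_uncovered) / n_total   # 49

# Bias computed directly and through the decomposition
bias_direct  <- mean_covered - mean_population                              # 1
bias_formula <- (n_uncovered / n_total) * (mean_covered - mean_uncovered)   # 1
```

The bias shrinks toward zero either when coverage is nearly complete or when covered and uncovered units are similar on the statistic.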

New cards

Duplication (Overcoverage)

Multiple frame entries link to the same element (e.g., a person with two phone numbers listed)

New cards

Clustering (Overcoverage)

Multiple elements can be reached via the same frame entry (e.g., a landline that reaches an entire household)

New cards

Ineligibles (Overcoverage)

Frame entries that are not part of the target population (e.g., businesses on a frame meant for individuals)

New cards

Blanks (Overcoverage)

Frame entries that don't connect to any unit (e.g., disconnected phone numbers)

New cards

Solutions for Overcoverage

Delete ineligible or blank entries once identified. For clustering, take the whole cluster or select one element and weight up. For duplication, delete duplicates or weight down
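
The duplication fixes can be sketched on a toy phone frame. The frame, its column names, and the person identifiers are all hypothetical; in practice, knowing which entries belong to the same person is the hard part.

```r
# Hypothetical frame: p1 is listed twice and the last number reaches no one
frame <- data.frame(
  phone  = c("555-0101", "555-0102", "555-0103", "555-0104"),
  person = c("p1", "p1", "p2", NA),
  stringsAsFactors = FALSE
)

# Remove blanks (entries that connect to no unit)
frame <- frame[!is.na(frame$person), ]

# Option 1: delete duplicates, keeping one entry per person
deduped <- frame[!duplicated(frame$person), ]

# Option 2: keep every entry but weight down by the number of listings
listings <- table(frame$person)
frame$weight <- 1 / as.numeric(listings[frame$person])
frame$weight   # 0.5 0.5 1.0
```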

New cards

Solutions for Undercoverage

Use multiple frames, combine modes, apply weighting (post-stratification), or change target population definition to match frame
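
The weighting option can be illustrated with a minimal post-stratification sketch. It assumes the population shares of two age groups are known; the respondents, the shares, and the outcome `y` are all invented.

```r
# Hypothetical respondents: the younger group is under-represented
resp <- data.frame(
  age_group = c("18-34", "35+", "35+", "35+"),
  y         = c(10, 20, 20, 20),
  stringsAsFactors = FALSE
)

# Assumed known population shares for each group
pop_share <- c("18-34" = 0.5, "35+" = 0.5)

# Share of each group among respondents
sample_share <- prop.table(table(resp$age_group))

# Weight = population share / sample share for the respondent's group
resp$weight <- as.numeric(pop_share[resp$age_group]) /
               as.numeric(sample_share[resp$age_group])

mean(resp$y)                        # 17.5, unweighted
weighted.mean(resp$y, resp$weight)  # 15, after post-stratification
```

Weighting only helps to the extent that covered and uncovered people within the same group resemble each other on the statistic.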
