Notes on Choosing Cases and Case Selection

Overview and core ideas
  • This chapter, "Choosing Cases," examines the critical decisions researchers make when selecting cases for their studies, a process that profoundly shapes both external validity (the extent to which findings can be generalized beyond the studied cases) and internal validity (the confidence that observed relationships are causal rather than the product of confounding factors). The text uses the metaphor of Ricky Jay's card tricks to illustrate the danger of "stacking the deck": the subtle or overt bias introduced, deliberately or inadvertently, when cases are chosen in a way that preordains or unduly favors a certain conclusion. A carefully designed case selection strategy is therefore essential for drawing credible and robust inferences, whether the goal is to generalize from a limited subset of cases to a broader population or to identify the intricate causal mechanisms at play within a specific case.

  • External validity is a central concern, addressing the generalizability of research findings. Due to inherent practical constraints such as limited time and financial resources, it is rare for researchers to analyze an entire population. Instead, they must rely on samples—a subset of the population. To achieve meaningful generalization, researchers must possess conceptual clarity regarding the precise definition of the "population" under study. This involves meticulously defining the relevant and theoretically significant set of cases to which the findings are intended to apply, which in turn dictates which individual cases are considered members of that population.

  • Internal validity, on the other hand, focuses on whether the observed relationships genuinely reflect underlying causal processes, rather than being spurious or influenced by unmeasured confounding factors. In large-n designs (studies involving a substantial number of cases), researchers often leverage randomization methods to enhance internal validity, as this helps distribute potential confounding factors evenly across comparison groups. Conversely, case-study designs (which involve intense investigation of one or a few cases) typically rely on qualitative methods such as process tracing to meticulously establish causal order and delineate the specific mechanisms through which a cause leads to an effect within the chosen cases.

Populations, samples, and generalization
  • Population vs. sample: The operational definition of the population of cases is fundamentally shaped by how core theoretical concepts are precisely defined. For instance, if a researcher is studying "political parties in communist countries," the scope of the population hinges on several definitional choices: whether to include only present-day communist countries or all historical instances, and how the term "communist country" is precisely delineated (e.g., based on state control, ruling party ideology, or economic system). Sometimes, the population is intentionally framed to be quite narrow (e.g., "all contemporary communist regimes"), whereas at other times, it might be conceived more broadly (e.g., "all countries exhibiting similar characteristics over a significant historical period"). The clarity of this definition is vital for consistent case selection.

  • Sampling for external validity: While an ideally comprehensive research approach would involve analyzing the entire population of interest, practical realities almost always necessitate the use of sampling. To ensure that a sample accurately reflects the larger population and thereby supports external validity, random sampling techniques are preferred. These methods aim to create a sample that mirrors the characteristics of the population, though achieving a perfect mirroring is exceptionally rare. A widely understood benchmark in survey research, for example, is that a random sample of approximately 1,500 American adults typically yields a margin of error (MOE) of about ±3% for many survey estimates, assuming a 95% confidence level.

    • Specifically, a margin of error (MOE) of approximately ±3% is often achieved when the sample size (n) is around 1,500, particularly for large and relatively homogeneous populations. This means that in about 95% of samples drawn, the sample estimate will fall within three percentage points of the true population parameter.

    • It is crucial to note that a smaller sample size yields a commensurately larger margin of error. For example, a sample of roughly n ≈ 400 might result in an MOE of approximately ±5% or more, indicating less precision in the estimates. This inverse relationship between sample size and margin of error (the MOE shrinks roughly with the square root of n) is a fundamental principle in statistical inference.
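    The sample-size figures above can be sketched with the standard formula for the margin of error of a proportion, MOE = z·√(p(1−p)/n). This is a general statistical result, not the text's own derivation; the sketch assumes simple random sampling, a 95% confidence level (z ≈ 1.96), and the worst-case proportion p = 0.5.

    ```python
    import math

    def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
        """Approximate 95% margin of error for a proportion estimated
        from a simple random sample of size n (worst case p = 0.5)."""
        return z * math.sqrt(p * (1 - p) / n)

    # n = 1,500 gives roughly the textbook "±3 points" benchmark
    # (about ±2.5 before rounding and survey design effects);
    # n = 400 gives roughly ±5 points.
    print(f"n=1500: +/-{margin_of_error(1500):.1%}")
    print(f"n= 400: +/-{margin_of_error(400):.1%}")
    ```

    The formula makes the inverse relationship concrete: quadrupling the sample size only halves the margin of error.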

  • Random vs. systematic vs. stratified sampling:

    • Simple random sample: This method ensures that every member of the population (and, indeed, every possible sample of a given size) has an equal chance of being selected. For example, a researcher might use a random number generator to select 370 cases from a total of 1,469 Supreme Court oral argument cases, where each case has an identical probability of inclusion.

    • Systematic random sample: This technique involves selecting every k-th unit from an ordered list of the population, after a random starting point has been chosen. For instance, a researcher might select every tenth name from a comprehensive university campus directory, ensuring a spread across the list.

    • Stratified random sample: This method is employed when the population can be meaningfully divided into distinct subgroups or "strata" (e.g., based on gender, age, socioeconomic status, or type of political office). Researchers then draw random samples independently from each stratum. This ensures that all critical subgroups within the population are adequately represented in the final sample, which can enhance precision and prevent underrepresentation of smaller but important groups.

    • Multistage sampling: This sophisticated method combines multiple stages of sampling. For instance, in national exit polls, the process might involve initially sampling a subset of counties across the country, then randomly selecting specific precincts within those counties, and finally, randomly selecting individual voters within the chosen precincts as they exit polling stations. This approach is highly efficient for large, geographically dispersed populations.
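    The first three techniques above can be illustrated in a few lines of Python using only the standard library. This is an illustrative sketch: the population of 1,469 "cases" echoes the Supreme Court example above, but the strata labels and sizes are made up for demonstration.

    ```python
    import random

    population = list(range(1, 1470))  # 1,469 hypothetical court cases

    # Simple random sample: every unit has an equal chance of selection.
    srs = random.sample(population, 370)

    # Systematic random sample: every k-th unit after a random start.
    k = len(population) // 370          # sampling interval (here k = 3)
    start = random.randrange(k)         # random starting point in [0, k)
    systematic = population[start::k][:370]

    # Stratified random sample: draw independently from each subgroup so
    # smaller strata are not underrepresented (strata here are invented).
    strata = {"criminal": population[:500], "civil": population[500:]}
    stratified = [unit
                  for group in strata.values()
                  for unit in random.sample(group, len(group) // 10)]
    ```

    Multistage sampling would simply chain these steps: sample counties, then sample precincts within the chosen counties, then sample voters within the chosen precincts.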

  • Nonrandom samples and biases:

    • Convenience samples: These are samples of individuals who are easily accessible or readily available to the researcher (e.g., interviewing passersby on a street corner, or students in an introductory psychology class). While quick and easy to obtain, convenience samples carry a high risk of unrepresentativeness, as they do not account for hidden biases in accessibility. This directly threatens external validity, making it difficult to generalize findings to a broader population.

    • Snowball samples: This technique is particularly useful when the target population is difficult to locate, enumerate, or unwilling to openly participate (e.g., members of illicit networks, marginalized communities, or specialized professional groups). Researchers begin with a few initial contacts who fit the study criteria and then ask them to refer other individuals who also meet the criteria, causing the sample to grow in a "snowball" fashion. While practical for hard-to-reach groups, these samples are inherently prone to biases, as individuals are linked through social networks, potentially leading to a homogenous sample that does not represent the full diversity of the population.

    • Selection bias: This occurs when the process of selecting cases systematically omits or underrepresents important segments of the population. Such bias can severely distort conclusions drawn from the study. A classic and instructive example is the 1936 Literary Digest poll, which famously predicted a win for presidential candidate Alf Landon over Franklin D. Roosevelt. The poll's mailing list was disproportionately drawn from telephone directories and automobile registrations, which at the time significantly overrepresented wealthier Americans. This affluent demographic was more likely to vote Republican, leading to a biased sample that inaccurately forecasted the election outcome.

    • Nonresponse bias: Even a meticulously designed random sample can become biased if specific groups within the sample are systematically less likely to respond or participate in the study. For example, in political surveys, certain ideological groups (e.g., conservatives), professional affiliations (e.g., union members), or age demographics (e.g., younger adults) may have lower response rates, leading to their underrepresentation in the final dataset. Similarly, the increasing prevalence of cell-phone-only households has complicated survey research: these individuals are often missing from traditional landline-based sampling frames, introducing coverage bias, a close cousin of nonresponse bias.

    • Modern sampling challenges: Many populations relevant to contemporary political science research are inherently difficult, if not impossible, to fully enumerate. Examples include the precise number of political NGOs operating in a particular African country, the true size of a terrorist cell, or the full range of niche political actors influencing policy debates. While researchers may employ sophisticated estimation techniques to approximate population sizes, achieving truly precise and random sampling in such contexts can be practically impossible, necessitating alternative, often deliberate, case selection strategies.

  • Case studies and external validity: For designs that involve intensely studying a single case or a small number of cases (small-n research), random sampling is typically not only inappropriate but also antithetical to the research goals. Instead, deliberate sampling is the standard approach, where cases are chosen purposefully based on theoretical relevance. External validity in case studies, though often harder to establish than in large-n studies, can be meaningfully enhanced through specific, strategic choices:

    • Selecting a typical set of cases: This strategy involves choosing cases that are believed to be representative of broader patterns or phenomena across a larger population. The aim is to demonstrate that the findings from the case are likely applicable to or illustrative of other "normal" instances within that population (a "typical case strategy").

    • Choosing cases that are intrinsically important or hard tests: Researchers may select cases that are inherently significant for theoretical or policy reasons, or which present a particularly demanding test for a proposed theory. If a theory holds up under such stringent conditions, its broader applicability is strengthened (these are often called "hard cases").

    • Including diverse cases: This involves selecting cases that exhibit a wide range of variation on relevant dimensions. The goal is to avoid focusing only on similar cases, which can lead to erroneous conclusions, and to ensure that no single unusual observation unduly drives the interpretive narrative (a "diverse case" strategy).

    • Ensuring variation across cases in dependent variables or key independent variables: This strategy is central to comparative case studies. By deliberately selecting cases that show contrasting outcomes (differences in the dependent variable) or contrasting causal factors (differences in key independent variables), researchers can better establish covariation and more effectively rule out competing explanations. This approach is closely aligned with John Stuart Mill's methods of causal inference, particularly the Method of Difference.

Case selection strategies to boost external validity
  • Population awareness: The majority of researchers operate within practical limits, meaning they must work with samples rather than entire populations. Consequently, the precise definition of the research population is a critical foundational step that must be undertaken with extreme care, aligning closely with the core concepts of the study. A vague, inconsistent, or theoretically unfounded definition of the population can lead directly to weak generalizations, where conclusions are made about groups that were not actually studied, inevitably resulting in contested research findings and undermining the credibility of the study.

  • Mill’s methods of causal inference (in short):

    • Method of Difference (strategy B/C in the text): This method involves comparing two or more cases that are as similar as possible on all relevant independent variables except one, and which exhibit different outcomes on the dependent variable. The unique independent variable that differs between these otherwise similar cases is then considered the plausible cause of the differing outcome. The logic is to isolate the causal factor by seeing what changes when the outcome changes, while everything else remains constant.

    • Method of Agreement (strategy D): In contrast, this method seeks to find multiple cases that share the same outcome on the dependent variable but are dissimilar across a wide range of potential causal independent variables. By identifying the one (or few) independent variable(s) that are common across all these diverse cases that share the same outcome, researchers can rule out the dissimilar factors as causes and tentatively identify the common factor as the likely cause. The aim is to find what they agree on despite their differences.

    • A refined approach often combines elements of both the Method of Difference and the Method of Agreement (corresponding to strategy C in some classifications). This typically involves selecting cases that are similar in some respects but differ in theoretically important ways, allowing researchers to systematically eliminate a broader array of potential causes while isolating a plausible main cause that consistently co-varies with the dependent variable. In practice, exact "one-cause" puzzles, where only a single factor varies, are exceedingly rare. Therefore, researchers often employ strategies that involve partial variation and sophisticated causal reasoning across a small number of carefully selected cases, aiming to build a cumulative case for a specific causal argument.

  • Choice of cases along three broad axes:

    • Typical case: This strategy is deployed when prior research or existing theory strongly suggests a general pattern or relationship that is widely representative of a broader population. The primary objective is to corroborate this established general pattern by conducting an in-depth investigation of a "normal" or highly representative instance. If the theory holds true in a typical case, it lends weight to the idea that the theory accurately describes the general phenomenon.

    • Deviant/outlier case: These cases are chosen because they do not conform to existing theoretical expectations or established patterns; they are statistical or theoretical outliers. By intensely studying such cases, researchers aim to test the very limits of a theory, often revealing crucial nuances, conditioning factors, or alternative causal mechanisms that operate differently under unusual or exceptional conditions. Analyzing deviant cases can lead to significant theoretical refinement or even the development of entirely new theoretical frameworks.

    • Hard case: A "hard case" is selected specifically because it presents a most-difficult or most-likely-to-fail test for a proposed hypothesis. If a hypothesis or theory manages to hold true and explain the outcome even under these demanding conditions, then its plausibility and applicability to other, less challenging, cases is significantly strengthened. An example often cited is the Cuban Missile Crisis as a hard case for bureaucratic politics theory; if bureaucratic politics can explain such a high-stakes, crisis decision, its explanatory power is considered robust.

    • Easy case: Conversely, an "easy case" is chosen because it represents a scenario where the hypothesis or theory is expected to work with a high degree of certainty, often under ideal or nearly ideal conditions for the theory. The utility of an easy case lies in its potential for strong disconfirmation: if the hypothesis fails to hold even in an easy case, where all conditions are favorable, it strongly suggests that the hypothesis itself is weak, fundamentally flawed, or severely limited in its applicability.

  • Typical case in health policy example: Ellen Immergut’s influential cross-national comparison of health policy outcomes across different European nations provides an excellent illustration of selecting typical cases to enhance generalizability. She critically examined Sweden, characterized by very high government involvement in health care; Switzerland, known for its comparatively low government involvement and more liberal market-based system; and France, representing an intermediate level of government intervention. This deliberate selection, spanning a meaningful range of government involvement, helped Immergut avoid the pitfalls of overgeneralizing from a single country's unique experience. By demonstrating a consistent pattern across these three strategically chosen cases, particularly regarding the role of institutional veto points in shaping policy, her study greatly strengthened confidence in the broader patterns observed across affluent democracies.

  • Stratified purposive sampling in practice: This approach combines elements of stratification with deliberate, theoretical purpose. Researchers may consciously select a set of cases to systematically cover meaningful variation along key theoretical variables, while simultaneously maintaining a degree of comparability on other, less focal variables. For instance, a researcher studying the "resource curse" phenomenon might deliberately select Saudi Arabia as a typical case (exhibiting the expected negative effects of resource dependence) but also include "outliers" like Norway (which has successfully managed its oil wealth, defying the curse) to rigorously test the boundaries and conditions of the theory. This ensures both typicality and the exploration of critical boundary conditions.

  • The role of theory in case selection: Theory-informed sampling is not merely about achieving statistical representativeness, but fundamentally about maximizing theoretical insight and explanatory leverage. In certain research projects, such as a deep dive into the implementation of Medicaid policy in a specific U.S. state, a single, exceptionally informative case can illuminate broader dynamics and generalizable patterns about an entire class of policies or bureaucratic behaviors. In other scenarios, however, theoretical considerations might explicitly mandate a broader comparative sampling strategy to test the generalizability of a hypothesis across diverse contexts or to identify conditions under which a theory is more or less applicable.

  • Trade-offs and cautions: While deliberate sampling strategies are highly effective at improving internal validity by allowing for a deep, nuanced understanding of causal mechanisms within a few selected cases, researchers must exercise extreme caution not to overstate the external validity (generalizability) of their findings. The credibility of a study becomes stronger when cases are chosen strategically to reflect crucial theoretical variation and to illuminate specific causal mechanisms. However, this focused approach often comes at the cost of universal generalizability. Researchers must be transparent about their specific sampling rationale and acknowledge the explicit limitations of their findings’ broader applicability.

Internal validity, covariation, and mechanism-focused evidence
  • Internal validity in observational research fundamentally hinges on two key achievements: first, compellingly demonstrating covariation among the key concepts of interest (i.e., showing that variables change together in a predictable manner); and second, rigorously establishing causal order (i.e., demonstrating that the presumed cause precedes and influences the effect). In all descriptive or causal hypotheses, researchers actively seek to identify and exploit variation, either across different cases or over different points in time within a single case, to firmly establish plausible covariation between variables and build a convincing argument for cause and effect.

  • Large-n designs offer a distinct and considerable advantage for both robustly testing covariation and effectively controlling for a multitude of potentially confounding factors. The sheer number of cases in such designs provides a wealth of combinations of factors, allowing researchers to explore numerous hypotheses and isolate the impact of specific variables. A large, well-executed random sample not only significantly enhances external validity by ensuring representativeness, but also substantially bolsters internal validity by introducing extensive variation across independent and control variables. This extensive variation allows for more powerful statistical controls and a more confident inference about causal relationships.

  • The small-n challenge: With only a few cases available for study, researchers face the significant challenge of effectively limiting the number of plausible rival hypotheses and instead focusing on uncovering robust evidence of causal mechanisms. This is where process tracing emerges as an indispensable tool. Process tracing involves a detailed, within-case analysis where researchers meticulously document how each individual link in a hypothesized causal chain operates. This often requires sifting through and analyzing diverse sources of evidence—ranging from official government documents, party platforms, and extensive news coverage, to policy briefs, biographies, and interview transcripts—to reconstruct the sequence of events and decisions that connect the independent variable to the observed outcome. The goal is to provide a granular, empirical account of the causal steps.

  • Equifinality: Even when supported by sound process tracing and compelling case-level evidence, a critical consideration for researchers is the phenomenon of equifinality. This refers to situations where multiple distinct causal pathways or sequences of events can lead to the very same outcome. Good case studies are characterized by their explicit acknowledgment of equifinality. Researchers rigorously defend the plausibility and empirical support for their identified mechanism, while simultaneously remaining open to and sometimes discussing the existence of alternative routes that could also produce the observed outcome. This enhances the theoretical sophistication and intellectual honesty of the research.

  • Data richness and case selection: Influential scholars like Alexander George and Andrew Bennett, as well as Robert K. Yin and others, strongly advocate for choosing data-rich cases. These are cases where an abundance of high-quality, relevant empirical evidence is available to support detailed and rigorous process tracing. The presence of extensive documentation (e.g., archival records, memoirs, internal reports, diplomatic cables) is crucial for demonstrating the intricate causal connections within a case and significantly helps defend against potential accusations of "stacking the deck" by showing a comprehensive evidence base for the causal claims.

Case study construction and evidence gathering
  • Process tracing as a critical tool: For a case-study design to offer credible and strong internal validity, it is absolutely essential for researchers to empirically demonstrate how their hypothesized independent variable(s) generated the observed outcome. This must be shown through a clear, traceable, and empirically supported chain of events and decisions. This involves meticulously documenting the array of decisions made by key actors, identifying the specific institutional constraints that shaped their choices, and analyzing the strategic choices and interactions that collectively connect the initial cause to the eventual effect. It's about revealing the "how" in a concrete, empirical manner.

  • Methodological cautions about case selection: While random or representative sampling remains the gold standard for large-n, variable-oriented studies aiming for broad generalization, single-case or few-case studies retain immense power and validity when executed meticulously. Their strength lies in the strategic selection of cases designed to maximize explanatory leverage, to reveal granular causal mechanisms, and to systematically cover key dimensions of theoretical variation. Such deliberate choices, rather than random probability, underpin the rigor of small-n qualitative research.

  • Examples highlighted in the text:

    • Immergut’s health policy comparison (Sweden, France, Switzerland) is a prime example of effective comparative case selection. By carefully varying institutional veto points and policy regimes across these three countries, Immergut was able to explain significant cross-national differences in health policy outcomes, demonstrating how specific institutional structures impede or facilitate reform.

    • Patashnik’s cross-domain case selection (examining policy reform in taxation, transportation, and agriculture in the United States) illustrates how variation across different policy domains within a single country can reveal generalizable patterns about the dynamics and challenges of policy reform. This approach allowed him to identify common factors influencing reform efforts despite the sectoral differences.

    • Nobles’ comparison of race constructs in the U.S. versus Brazil’s census categories highlights the power of cross-country comparison in revealing how deeply embedded social categories, specifically those related to 'race', can profoundly shape policy outcomes and societal structures. By contrasting different national approaches to racial classification, she unveiled the constitutive effects of such categorization.

  • Intrinsically important or hard cases are particularly instrumental in sharpening our theoretical understanding, especially when prevailing conventional beliefs or consensus theories are challenged. Examples include:

    • Norway as an outlier in the resource-curse literature: Most resource-rich countries experience negative economic and political outcomes (the "resource curse"). Norway, with its successful management of oil wealth and strong democratic institutions, represents a crucial deviant case that forces researchers to refine the theory by identifying the conditions under which the curse can be avoided.

    • 9/11 as a pivotal case for terrorism studies: The scale and nature of the 9/11 attacks made it an intrinsically important event that profoundly reshaped the field of terrorism studies. Its analysis has led to significant theorizing about new forms of terrorism, state responses, and international security dynamics.

    • The Netherlands (Lijphart) as a counterexample to cross-cutting-cleavage theories: Arend Lijphart's study of the Netherlands as a consociational democracy challenged the prevailing wisdom that societies with deep, reinforcing cleavages were inherently unstable. The Dutch case showed how institutional arrangements could manage such divisions, prompting a re-evaluation of theories of conflict and stability.

  • Consequences of selective choices: The practice of "stacking the deck"—which refers to the deliberate and often concealed selection of cases that all uniformly support a predetermined conclusion—gravely undermines the credibility and scientific integrity of any research. To counteract this, researchers have an ethical and methodological obligation to clearly and transparently demonstrate how their cases were selected, articulating the precise theoretical or empirical reasons why those particular cases are informative. Furthermore, robust research often includes a discussion of how the study's results might have fared under alternative case selections or what insights could be gained from studying different types of cases, thereby preemptively addressing potential critiques of selection bias.

Practical guidance and concluding reflections
  • External validity requires thoughtful case selection: Attaining external validity is not a passive outcome but requires a highly thoughtful and deliberate approach to case selection. For large-n or variable-oriented designs, random sampling methods are generally preferred as they offer the most robust statistical basis for generalizing findings to a broader population. Conversely, for case studies focusing on in-depth understanding, deliberate sampling strategies (such as selecting typical, deviant, or hard/easy cases) are often more appropriate and yield richer insights. It's not uncommon for some ambitious projects to benefit significantly from combining approaches; for instance, conducting a large random sample to identify general attitudes, followed by an intensive, theory-driven case study of a specific deviant or particularly typical case to understand underlying mechanisms.

  • Variation is essential: Regardless of the specific research design (large-n or small-n), the presence of meaningful variation is foundational for strong causal inference. Whether this involves having several distinct cases with different characteristics or observing changes and differences within cases over time, variation enables researchers to clarify which factors truly matter and which do not. It allows for rigorous tests of covariation (do the cause and effect vary together?) and provides empirical leverage to explore causal mechanisms across diverse contexts, thereby strengthening the convincing power of the argument.

  • The trade-off between breadth and depth: Researchers must continuously navigate a fundamental epistemological trade-off: breadth versus depth. Broad samples, characterized by a large number of cases, excel at improving statistical generalizability and external validity, but they often necessitate a shallower data collection per case, limiting the ability to uncover intricate causal pathways. Conversely, deep, theory-driven case studies, while yielding exceptionally rich causal narratives and detailed mechanism insights, frequently sacrifice universal applicability or broad external validity. Researchers should be fully transparent about where they have chosen to position themselves on this continuum, explicitly stating their sampling rationale and acknowledging the inherent limitations imposed by their choices on both the generalizability and the specific scope of their overall findings.

  • Final takeaway: A truly robust and compelling study in political science, irrespective of its particular design, often strategically blends multiple justification strands for its case selection. This might include selecting cases that are typical, deviant, hard, easy, or intrinsically important. Such studies critically employ process tracing and/or other qualitative methods to meticulously connect the causal links within cases. Throughout this rigorous inquiry, the researcher maintains a crystal-clear understanding and transparent self-awareness of both the study's internal validity (are the causal claims credible within the cases?) and its external validity (to what extent can findings apply beyond the studied cases?). The ultimate aim is to construct a convincing and empirically grounded causal story, while scrupulously avoiding the common illusion that findings from a limited set of cases automatically and universally generalize beyond the specific boundaries of the studied set.

Practice prompts and reflective questions
  • Practice: The "inspecting readings" prompts encourage you to engage actively with academic texts by assessing the nature of the cases presented (whether they represent a sample or an entire population), determining whether the sampling method was random or deliberate, and critically evaluating how these case selection decisions might influence both the external and internal validity of the study's conclusions. This section also prompts you to consider potential biases, such as selection bias and nonresponse bias, and to reflect on how such biases could distort the findings and ultimate conclusions of the research.

  • Practice: The "building" exercises are active learning opportunities that invite you to apply the concepts of case selection to practical scenarios. For instance, you might be asked to design a comprehensive sampling plan for a study investigating community school performance, or to reason through the potential effects of employing different sampling strategies on a hypothetical study of voter turnout across 25 countries within a fictional continent. These hands-on exercises are crucial for solidifying your understanding of how strategic sampling choices directly shape the strength and scope of research inferences.

  • Practice: The book repeatedly foregrounds the danger of “the Ricky Jay problem”—a powerful metaphor for researchers inadvertently or deliberately "stacking the deck" by choosing cases that are biased towards supporting favored conclusions. In response to this danger, the text offers a comprehensive array of concrete strategies. These include systematic methods like random and stratified sampling for large-n designs, as well as deliberate sampling techniques (typical, deviant, hard/easy cases) and rigorous process tracing for case studies. These detailed methodological tools are presented as essential safeguards to bolster both the external and internal validity of political science research, ensuring the credibility and robustness of findings.

Key terms and concepts to remember
  • External validity: Refers to the extent to which the findings and conclusions derived from a study can be accurately generalized or applied beyond the specific cases, settings, and participants initially studied to a broader population or different contexts.

  • Internal validity: Pertains to the credibility and confidence that the observed relationships within a study genuinely reflect causal inferences, meaning that changes in the independent variable are indeed responsible for changes in the dependent variable, allowing for the effective ruling out of alternative, confounding factors.

  • Population vs sample: The population is the entire, theoretically defined set of cases or units of analysis that a researcher is interested in studying and to which they wish to generalize their findings. A sample is a carefully selected subset of cases drawn from that larger population, which is then actually studied in the research.

  • Random sampling: A foundational probability sampling technique in which every case or unit in the defined population has a known and equal chance of being selected for inclusion in the sample. This method is designed to maximize the representativeness of the sample.
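
As a toy illustration (not from the text), a simple random sample can be drawn in a few lines of Python; the population of country labels here is invented:

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw a simple random sample: every case has an equal chance of selection."""
    rng = random.Random(seed)  # seeding makes the draw reproducible
    return rng.sample(population, n)  # samples without replacement

# Hypothetical population of 100 labeled cases
population = [f"country_{i}" for i in range(100)]
sample = simple_random_sample(population, 10, seed=42)
```

Because `random.sample` draws without replacement, no case appears twice, and seeding the generator lets another researcher reproduce the exact sample.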

  • Stratified random sampling: A more advanced probability sampling method where the entire population is first divided into distinct, non-overlapping subgroups or "strata" based on one or more relevant characteristics (e.g., age, gender, region). Then, random samples are independently drawn from each of these strata, ensuring proportional or theoretically relevant representation of each subgroup.
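
A minimal sketch of proportional stratified sampling, assuming invented cases tagged with a "region" stratum: group the population by stratum, then draw a simple random sample from each group.

```python
import random
from collections import defaultdict

def stratified_sample(cases, stratum_of, frac, seed=None):
    """Divide cases into strata, then draw a proportional random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[stratum_of(case)].append(case)  # partition by stratum label
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * frac))  # at least one case per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical cases: (id, region), 60 "north" and 40 "south"
cases = [(i, "north" if i < 60 else "south") for i in range(100)]
picked = stratified_sample(cases, stratum_of=lambda c: c[1], frac=0.1, seed=1)
# A 10% draw yields 6 northern and 4 southern cases, mirroring the population.
```

Guaranteeing at least one case per stratum is a design choice here; it trades exact proportionality for the assurance that small subgroups are never omitted entirely.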

  • Convenience sampling: A nonrandom, non-probability sampling technique where cases are selected primarily because they are readily accessible, easy to reach, or convenient for the researcher. This method carries a high risk of producing unrepresentative samples and can severely compromise external validity.

  • Snowball sampling: A non-probability sampling technique particularly useful for studying hard-to-reach or hidden populations. It starts with a small number of initial contacts who fit the study criteria, who then refer other potential participants from their networks, causing the sample to grow through successive waves of referrals.
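
The referral process can be sketched as a breadth-first walk over a referral network; the names and referral links below are invented for illustration:

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Grow a sample from initial seed contacts by following referral links."""
    sampled, queue = [], deque(seeds)
    seen = set(seeds)  # prevents re-recruiting the same person
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sampled

# Hypothetical referral network: who refers whom
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["E", "F"], "D": [], "E": ["A"]}
sample = snowball_sample(["A"], referrals, max_size=5)
# → ["A", "B", "C", "D", "E"]
```

The sketch makes the method's weakness visible: everyone in the sample is reachable from the seeds, so whoever is disconnected from the initial contacts can never be selected.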

  • Selection bias: A systematic distortion or error that occurs when the process of selecting cases for a study is not truly random or representative, leading to the systematic omission or underrepresentation of important segments of the population. This bias can significantly skew research findings.

  • Nonresponse bias: A specific type of selection bias that arises when certain groups within a randomly selected sample are systematically less likely to participate in a survey or experiment compared to other groups, leading to an unrepresentative final dataset.

  • Mill’s methods (Method of Difference, Method of Agreement, Method of Difference with multiple variables): A set of structured logical approaches developed by John Stuart Mill to infer causality from systematic comparisons of cases. These methods involve systematically looking for commonalities or differences in independent variables that correspond to commonalities or differences in dependent variables across cases.
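The comparative logic of the Method of Difference can be sketched as code: given two cases coded on the same variables, with similar backgrounds but different outcomes, the factors that differ are the candidate causes. The country codings below are invented for illustration:

```python
def method_of_difference(case1, case2, outcome_key):
    """Mill's Method of Difference: when two otherwise similar cases have
    different outcomes, the independent variables that differ between them
    are the candidate causes."""
    if case1[outcome_key] == case2[outcome_key]:
        return []  # the method applies only when the outcomes differ
    return [k for k in case1
            if k != outcome_key and case1[k] != case2[k]]

# Hypothetical codings of two democracies on turnout
norway = {"wealthy": True, "pr_system": True,  "high_turnout": True}
usa    = {"wealthy": True, "pr_system": False, "high_turnout": False}
candidates = method_of_difference(norway, usa, "high_turnout")
# → ["pr_system"]: the one factor that varies with the outcome
```

Real cases are never matched this cleanly; the sketch only shows why the method's inferential power depends on how similar the compared cases actually are on everything else.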

  • Process tracing: An intensive, within-case analytical method used primarily in qualitative case studies. It involves meticulously dissecting and documenting the sequence of events, decisions, and causal links that connect a hypothesized cause to an observed outcome, using diverse empirical evidence to build a detailed and empirically supported causal narrative.

  • Equifinality: A concept that describes situations where the same outcome can be reached through multiple different causal pathways or sequences of events. In case studies, researchers must acknowledge that their identified mechanism might be one of several possible routes to the outcome.

  • Typical case: A case purposefully selected because it is believed to be highly representative of broader patterns, trends, or theories within a larger population. It is used to corroborate general patterns and cautiously generalize findings to other "normal" instances.

  • Deviant/outlier case: A case chosen specifically because it deviates significantly from expected patterns or theoretical predictions. Studying such cases helps to test the limits of existing theories, refine explanations, and uncover contingent or alternative causal mechanisms.

  • Hard case: A particularly demanding or "least likely" case for a hypothesis or theory to hold true. If the hypothesis explains the outcome in a hard case, it significantly strengthens the confidence in its broader applicability and robustness.

  • Easy case: A case where a hypothesis or theory is expected to most readily apply and explain the outcome. Its primary utility is in strong disconfirmation: if the hypothesis fails even in an easy case, it suggests the theory is weak or flawed.

  • Intrinsically important case: A case chosen not necessarily for its representativeness or deviance, but because of its inherent historical, theoretical, or policy significance, or its potential to profoundly challenge prevailing wisdom and provoke new theoretical insights.

Notes on the accompanying practice prompts
  • The chapter's accompanying practice sections are meticulously designed to move beyond passive learning and actively help you internalize and apply these complex ideas in a practical manner. These prompts guide you through evaluating whether a studied case functions as a sample or represents an entire population, determining the underlying method of sampling (random versus deliberate), and critically considering how specific case selection choices directly impact both the external (generalizability) and internal (causal credibility) validity of research. Crucially, they also encourage you to engage in critical thought experiments, guiding you to reason through how different methodological designs and sampling strategies would concretely affect the conclusions drawn in various real-world scenarios, such as assessing community attitudes toward local schools or analyzing voter turnout patterns across a multi-country continent.

Overall takeaway
  • The selection of an appropriate case selection strategy is not a one-size-fits-all endeavor; it is inherently contingent upon the specific research design chosen and the precise nature of the research question being addressed. While random sampling provides the most robust foundation for external validity in variable-oriented, large-n designs, a deliberate, theory-driven sampling approach is often far more effective and indeed necessary for enhancing internal validity and facilitating the granular testing of causal mechanisms in case studies. Transparency regarding the rationale behind case selection, explicit and precise reasoning when defining the research population, and the rigorous application of methods like process tracing or various multi-method evidence gathering techniques are all critical elements. These practices collectively serve to mitigate potential biases and significantly fortify the overall credibility of any causal claims made. The ultimate goal is to construct a convincing and empirically robust causal narrative, always coupled with a clear and honest acknowledgment of the scope and limits of the findings' generalizability beyond the directly studied cases.