DS 300

Identity, identification, and information leakage

  • Identity vs. identification: being able to identify a person (satisfying multiple conditions) is not the same as knowing who the person is.
  • What can identify you? Even attributes you share with many other people (birth date, ZIP, gender) reveal a lot about who you are once they are combined; identification is probabilistic, not absolute.
  • Example from the lecture: a ZIP code might contain around 56,000 people; many residents share any single attribute, so whether a combination of attributes makes you unique is a matter of probability rather than a guarantee.
  • The point: even limited data (birth year, birth day, ZIP, gender) can narrow the set of candidates enough to threaten privacy in nontrivial ways.
  • Bits of information and rough counting: ask how many bits are needed to single out one person given the observed data; if you know only the birth year, far more candidates remain than if you also know day-of-year, ZIP, and gender; the more attributes you reveal, the more identifying power you disclose.
  • Quick aside on an estimation mindset:
    • If you know only the birth year, everyone in the ZIP born that year is still a candidate; knowing the day as well would split that group roughly 365 ways, so the year alone is coarse and has little identifying power.
    • With additional attributes (exact birth date, ZIP, and gender), the combination space grows, which decreases the likelihood that two or more people in a given area share the exact same combination and so makes each person more likely to be unique.
  • The tone is probabilistic reasoning about identity; there is no absolute guarantee of uniqueness in large enough populations.
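
The rough counting above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's calculation: the ZIP population is the figure quoted in these notes, while the 80-year spread of birth years, the two gender categories, and the assumption that attributes are uniform and independent are all simplifications introduced here.

```python
import math

# Illustrative assumptions (only the ZIP size comes from the notes).
ZIP_POPULATION = 56_000   # approximate ZIP population quoted above
BIRTH_YEARS = 80          # assumed spread of plausible birth years
DAYS_PER_YEAR = 365
GENDERS = 2               # simplification: two recorded categories


def bits(n_outcomes: float) -> float:
    """Bits needed to pick one outcome out of n equally likely outcomes."""
    return math.log2(n_outcomes)


def expected_others_sharing(population: int, n_combinations: int) -> float:
    """Expected number of other residents with the same attribute combination,
    assuming attributes are uniform and independent."""
    return (population - 1) / n_combinations


def prob_unique(population: int, n_combinations: int) -> float:
    """Probability that no other resident shares your combination."""
    return (1 - 1 / n_combinations) ** (population - 1)


bits_needed = bits(ZIP_POPULATION)
year_only = BIRTH_YEARS
full_date_and_gender = BIRTH_YEARS * DAYS_PER_YEAR * GENDERS

for label, combos in [("birth year only", year_only),
                      ("birth date + gender", full_date_and_gender)]:
    print(f"{label}: {bits(combos):.2f} of {bits_needed:.2f} bits, "
          f"~{expected_others_sharing(ZIP_POPULATION, combos):.2f} others match, "
          f"P(unique) = {prob_unique(ZIP_POPULATION, combos):.3f}")
```

Under these assumptions, birth year alone leaves hundreds of matching residents, while full birth date plus gender carries about as many bits as are needed to single out one person in the ZIP and leaves roughly one other match on average. Real populations are not uniform, so the exact numbers should be read as order-of-magnitude only.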

Geodesic game: locating people vs. locations

  • The speaker introduces a game concept called geodesic, which mirrors “20 questions” style deduction but for location rather than person.
  • Core idea: locate a person or place by asking location-based questions; the process reveals information bits about where something is.
  • Demonstration setup: attempts to locate a place in a region (e.g., Ghana, Montenegro) using clues such as license plates, signage, street names, landmarks.
  • The point: there is a lot of information embedded in geographic hints; even seemingly minor clues (car plates, road signs, a patchwork car, a hotel name) gradually reveal the location.
  • The activity highlights how much information is leaked by appearance, naming, and geography, sometimes revealing a location within a mile or so.
  • Prompt to the audience: think about how many bits of information are contained in a location to within one mile on Earth; this is a thought exercise about information content in geolocation.
  • Meta note: the exercise illustrates how easy it is to narrow down a location with enough clues, reinforcing the privacy/identification theme.
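
The audience prompt above (how many bits pin down a location to within one mile) can be estimated with a short sketch. The assumptions are introduced here, not taken from the talk: "within one mile" is approximated as one-square-mile cells, Earth's total surface is taken as roughly 196.9 million square miles, and about 29% of it is treated as land.

```python
import math

EARTH_SURFACE_SQ_MILES = 196.9e6  # approximate total surface area of Earth
LAND_FRACTION = 0.29              # approximate share of the surface that is land


def location_bits(area_sq_miles: float, cell_sq_miles: float = 1.0) -> float:
    """Bits needed to name one cell when an area is split into equal cells."""
    cells = area_sq_miles / cell_sq_miles
    return math.log2(cells)


whole_earth_bits = location_bits(EARTH_SURFACE_SQ_MILES)
land_only_bits = location_bits(EARTH_SURFACE_SQ_MILES * LAND_FRACTION)

# Each perfectly informative yes/no clue halves the remaining area,
# so the number of such clues needed is the bit count rounded up.
print(f"whole Earth: {whole_earth_bits:.1f} bits "
      f"(~{math.ceil(whole_earth_bits)} ideal yes/no questions)")
print(f"land only:   {land_only_bits:.1f} bits "
      f"(~{math.ceil(land_only_bits)} ideal yes/no questions)")
```

Real clues such as a license plate or a hotel name are rarely exact halvings; some carry many bits at once, which is why a handful of them can be enough.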

Visual and cognitive illusions as metaphors for rationality

  • The speaker shares a personal side project: writing about irrational behavior and cognitive illusions, using visual illusions as a metaphor for rationality.
  • Visual illusion demonstrations show that even our most trusted perceptual systems can be systematically wrong in repeatable ways.
  • Example 1: two tables, one of which seems longer though they are equal; overlaid lines or a ruler reveal the truth, but once the measurement is removed the illusion returns; knowing the answer doesn’t undo the bias.
  • Example 2: a cube with differently colored tiles; the two tiles marked by arrows appear to be different colors, but they are identical; removing the surrounding context makes the illusion disappear.
  • Takeaway from visual illusions: perception is fallible, and learning to distrust first impressions requires measurement or structured tools.
  • Vision is highly developed in humans; cognitive domains like financial decision-making lack specialized brain areas and consistent practice, making biases more likely.
  • The idea: cognitive illusions are harder to demonstrate and resist than visual ones, but similarly systematic and predictable.

Organ donation decisions and the default effect (Johnson & Goldstein)

  • Research question: why do organ donation rates differ across European countries? Is it culture, religion, or something else?
  • Observation: countries with seemingly similar cultures show large differences in willingness to donate organs.
  • Key insight: the wording of the form and its default option (e.g., at the DMV) have powerful effects on decisions.
    • Opt-in form: people must check a box to participate; many don’t check, so they stay non-donors.
    • Opt-out form: people are presumed donors unless they check a box to opt out; many don’t check, so they stay donors by default, leading to higher donation rates.
  • Netherlands case: an opt-in country that reached only a 28% donation rate even after mailing every household a letter asking people to join; the far higher rates in opt-out countries come from the form’s default, not from persuasion.
  • The broader implication: the design of the choice environment shapes behavior more than the content of the choice itself.
  • Philosophical takeaway: many decisions, even those about life after death, are influenced by how choices are framed; people feel like agents, but choice architecture steers outcomes.
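  • Economic interpretation: a standard rationality view would say people leave the box unchecked because the cost of marking it exceeds the benefit, i.e., they don’t care; in reality, people care a great deal, and it is the difficulty and complexity of the decision, plus framing, that leads them to accept whatever default was chosen for them.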

Default effects and expert decisions (Redelmeier & Shafir)

  • An experiment with physicians evaluating a patient requiring hip replacement.
  • Setup A: on reviewing the file, it turns out that ibuprofen was never tried; the physician is asked whether to pull the patient back and try ibuprofen or let the patient proceed to hip replacement.
  • Result A: most physicians pull back and try ibuprofen first, delaying surgery.
  • Setup B: the review shows that two medications were never tried (ibuprofen and piroxicam); physicians must decide whether to pull the patient back and, if so, which medication to try first.
  • Result B: the majority let the patient proceed to hip replacement; pulling back has become a more complex choice because of the extra option.
  • Default effect: the presence or absence of a default (or the framing of the decision) strongly influences physicians’ choices, sometimes away from purely clinical guidelines.
  • Practical implication: defaults and choice architecture affect decisions even among highly trained professionals; this has implications for medical practice and policy design.

The car analogy, coffee, and consumer choice (temptation, context, and preference construction)

  • Thought experiments and ads to illustrate context effects in decision-making:
    • Car commercial: a setting with a “dominant” option; the presence of a middle option that nobody wants can still shape choices by altering perceived value of the other options.
    • A twist: add a variant like “car stolen” to show why seemingly inferior options can shift preferences by changing the reference frame.
    • Coffee example: a package with rum and coffee options; adding or removing a non-preferred option can make preferred options look better by comparison, even if the new option is not attractive on its own.
  • Economist ad example (The Economist): in a three-option subscription offer, most people pick the “combo” when the unattractive middle option is present; removing the middle option, which almost nobody chose, shifts most people to the cheaper option.
  • Key lesson: people’s preferences are imperfect and context-dependent; an option nobody wants can still shape what people choose by giving them an easy comparison that makes a similar option look better.
  • Takeaway: our stated preferences may be unstable and easily swayed by what is presented to us; designers can exploit or mitigate these effects through framing and option sets.

Attraction, dating, and social signaling experiments

  • A dating choice experiment: participants view pictures of Tom and Jerry; in one condition a slightly less attractive version of Jerry is added, in the other a slightly less attractive version of Tom.
  • Result: when the less attractive Jerry is added, the regular Jerry gains dating interest; when the less attractive Tom is added, the regular Tom gains.
  • Practical implications:
    • If you are going bar-hopping, you might bring along someone who looks like a slightly less attractive version of yourself, so that you look better by comparison.
    • When someone invites you to come along, the same logic tells you something about how they see you relative to themselves.
  • Big picture: humor and thought experiments show how social perception and choice are influenced by relative comparisons and contextual cues.

The broader message: cognitive limitations shape the design of the world

  • Behavioral economics challenges the idealized view of rational, fully autonomous agents; humans have cognitive limitations and biases.
  • We tend to design the physical world (stairs, accessibility) with these limits in mind, but we often neglect cognitive limits when designing financial decisions, retirement planning, health care, and markets.
  • The hopeful takeaway: if we understand cognitive limitations as clearly as physical limitations, we can design better systems that guide decisions toward better outcomes without assuming perfect rationality.
  • Final reflection: are we Superman (optimally rational) or Homer Simpson (limited, biased)? The answer is nuanced; we can design around cognitive constraints to improve real-world outcomes.

Key numerical and mathematical references (embedded in the discussion)

  • Population and identity probabilities:
    • A ZIP code with about $N \approx 5.6\times 10^4$ people.
    • Potential combinations of birth date and ZIP and gender create a search space; the probabilistic argument that uniqueness is not guaranteed in large populations.
  • Bits of information and data combinations:
    • The idea that exposing more attributes increases the information content and thus the chance of identification.
  • A simple combinatorics thought: the number of pairs of people who could collide on the same attributes grows roughly with the square of the population, since $N$ people form $N(N-1)/2$ pairs; this was a rough intuitive estimate used in the talk.
  • A brief numerical aside near the demonstration:
    • A quick reference to a familiar bit-count idea: $2^6 = 64$.
  • A numeric aside used in a car-ad context:
    • A huge number mentioned: $196{,}000{,}000$ (and related arithmetic steps like dividing by three to get a rough figure of order tens of millions); the exact calculation isn’t the focal point, but it shows how large numbers appear in analytic-like thought experiments.
  • Percentages used to illustrate ambiguity in preferences:
    • A figure of $26\%$ referenced as a reaction to a probabilistic interpretation in an identity/privacy context.
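
The pair-counting remark in the list above can be written out as a rough sketch. Here $N$ is the ZIP population quoted in these notes, while $K$, the number of equally likely attribute combinations, is a symbol introduced here for illustration and is not a figure from the talk.

```latex
\[
\binom{N}{2} = \frac{N(N-1)}{2} \approx \frac{N^2}{2},
\qquad
N \approx 5.6\times 10^4 \;\Rightarrow\; \binom{N}{2} \approx 1.6\times 10^9,
\qquad
\mathbb{E}[\text{matching pairs}] = \frac{1}{K}\binom{N}{2} \approx \frac{N^2}{2K}.
\]
```

The last expression assumes attributes are uniform and independent; it says that matches become rare only when $K$ is large compared with $N^2/2$, which is why a few coarse attributes leave many collisions and a few more fine-grained ones leave almost none.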

Takeaways for exam-ready understanding

  • Identity is probabilistic and context-dependent; technical identity (satisfying many conditions) is not equivalent to knowing who someone is.
  • Even with seemingly large data gaps (birth year, ZIP, gender), there can be nontrivial chances of re-identification when combined with other data points; privacy is about information leakage, not just singular attributes.
  • Location data contains a surprisingly large amount of information; geolocation clues can narrow down identity or place with relatively few hints.
  • Visual illusions reveal robust cognitive biases in perception; cognitive biases in decision-making are harder to observe but operate with similar predictability.
  • Organ donation decisions are heavily influenced by choice architecture (the design of the form); defaults can drive behavior even in important life-and-death contexts.
  • Even experts (physicians) are subject to default effects and cognitive biases; the framing of choices can push toward different clinical decisions.
  • The presence of dominated or irrelevant options can steer preferences through context effects; our preferences are constructed in the moment of choice.
  • In designing systems, it’s important to account for cognitive limitations and biases to avoid suboptimal outcomes; better design can nudge people toward better decisions without diminishing autonomy.
  • The overarching message: understanding cognitive biases and decision context offers a pathway to building better institutions, technologies, and policies that align with how people actually think and choose.