DS 300

Identity, identification, and information leakage

Identity vs. identification: being able to identify a person (satisfying multiple conditions) is not the same as knowing who the person is.
What can identify you? Even attributes you share with many other people (birth date, ZIP, gender) reveal a lot about who you are once they are combined; identification is probabilistic, not absolute.
Example from the lecture: a ZIP code might contain around 56,000 people. Many of them match you on any single attribute, so whether a given combination makes you unique is a matter of probability rather than a guarantee.
The point: even limited data (birth year, birth day, ZIP, gender) can narrow the set of candidates enough to threaten privacy in nontrivial ways.
Bits of information and rough counting: how many bits are needed to specify an identity given the observed data? If you know only the birth year, far more candidates remain than if you also know day of year, ZIP, and gender; the more attributes you reveal, the more identifying power you disclose.
Quick aside on an estimation mindset:
- If you know the birth year and nothing else, many people within a ZIP still match; the granularity is coarse, so the identifying power is low. Adding the day of birth splits each birth-year group roughly 365 ways.
- With additional attributes (exact birth date, ZIP, and gender), the combination space grows, so fewer people in a given area share any one combination and each person is more likely to be singled out.
The tone is probabilistic reasoning about identity; there is no absolute guarantee of uniqueness in large enough populations.
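The counting argument above can be sketched in a few lines of Python. This is only a rough estimate: the 56,000 figure comes from the notes, while the two gender categories, the 80-year spread of birth years, and the assumption that people are spread evenly over all combinations are simplifications introduced here for illustration.

```python
import math

ZIP_POPULATION = 56_000   # figure quoted in the notes
DAYS_PER_YEAR = 365
GENDERS = 2               # assumption: two recorded categories
BIRTH_YEARS = 80          # assumption: ages spread evenly over ~80 years


def bits_to_identify(population: int) -> float:
    """Bits needed to single out one person among `population` people."""
    return math.log2(population)


def expected_others(population: int, cells: int) -> float:
    """Expected number of OTHER people sharing your attribute combination,
    assuming everyone is spread uniformly over `cells` combinations."""
    return (population - 1) / cells


def prob_unique(population: int, cells: int) -> float:
    """Probability that nobody else lands in your cell (uniform assumption)."""
    return (1 - 1 / cells) ** (population - 1)


attribute_sets = {
    "birth year only": BIRTH_YEARS,
    "birth year + gender": BIRTH_YEARS * GENDERS,
    "full birth date + gender": BIRTH_YEARS * DAYS_PER_YEAR * GENDERS,
}

print(f"bits to pick one person in the ZIP: {bits_to_identify(ZIP_POPULATION):.2f}")
for label, cells in attribute_sets.items():
    print(
        f"{label:26s} reveals {math.log2(cells):5.2f} bits, "
        f"expected others matching: {expected_others(ZIP_POPULATION, cells):8.2f}, "
        f"P(unique): {prob_unique(ZIP_POPULATION, cells):.3f}"
    )
```

Under these assumptions, birth year alone leaves hundreds of matching people, while full birth date plus gender leaves fewer than one other match on average and a chance of being unique of a bit under 40%. Real populations are not uniform, so the numbers are indicative only.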

Geodesic game: locating people vs. locations

The speaker introduces a game concept called geodesic, which mirrors “20 questions” style deduction but for location rather than person.
Core idea: locate a person or place by asking location-based questions; the process reveals information bits about where something is.
Demonstration setup: attempts to locate a place in a region (e.g., Ghana, Montenegro) using clues such as license plates, signage, street names, landmarks.
The point: there is a lot of information embedded in geographic hints; even seemingly minor clues (car plates, road signs, a patchwork car, a hotel name) gradually reveal the location.
The activity highlights how much information is leaked by appearance, naming, and geography, sometimes revealing a location within a mile or so.
Prompt to the audience: think about how many bits of information are contained in a location to within one mile on Earth; this is a thought exercise about information content in geolocation.
Meta note: the exercise illustrates how easy it is to narrow down a location with enough clues, reinforcing the privacy/identification theme.
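One way to approach the "how many bits" prompt is to count one-square-mile cells and take a base-2 logarithm. The sketch below assumes Earth's total surface area is about 196.9 million square miles and, as a rough simplification, that about one third of it is land (the true share is closer to 29%); both constants are assumptions added here, not values stated in the exercise.

```python
import math

EARTH_SURFACE_SQ_MILES = 196.9e6  # approximate total surface area (assumption)
LAND_FRACTION = 1 / 3             # rough simplification; closer to 0.29 in reality


def location_bits(area_sq_miles: float, cell_sq_miles: float = 1.0) -> float:
    """Bits needed to name one cell of size `cell_sq_miles` within the area."""
    return math.log2(area_sq_miles / cell_sq_miles)


def yes_no_questions(bits: float) -> int:
    """Ideal yes/no questions needed, each one halving the candidates."""
    return math.ceil(bits)


whole_earth = location_bits(EARTH_SURFACE_SQ_MILES)
land_only = location_bits(EARTH_SURFACE_SQ_MILES * LAND_FRACTION)

print(f"anywhere on Earth: {whole_earth:.1f} bits, {yes_no_questions(whole_earth)} questions")
print(f"land only:         {land_only:.1f} bits, {yes_no_questions(land_only)} questions")
```

With these inputs the answer comes out between 25 and 28 bits, which is why a "20 questions" style game needs only a few dozen well-chosen clues to pin a place down to about a mile.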

Visual and cognitive illusions as metaphors for rationality

The speaker shares a personal side project: writing about irrational behavior and cognitive illusions, using visual illusions as a metaphor for rationality.
Visual illusion demonstrations show that even our most trusted perceptual systems can be systematically wrong in repeatable ways.
Example 1: two tables, one of which looks longer even though both are the same length; overlaying lines in an animation or measuring with a ruler shows they match, yet as soon as the aid is removed the illusion returns. Knowing the truth does not undo the bias.
Example 2: a cube with differently colored faces; two tiles marked by arrows appear to be different colors, but they are identical; removing the surrounding context makes the illusion disappear.
Takeaway from visual illusions: perception is fallible, and learning to distrust first impressions requires measurement or structured tools.
Vision is highly developed in humans; cognitive domains like financial decision-making lack specialized brain areas and consistent practice, making biases more likely.
The idea: cognitive illusions are harder to demonstrate and resist than visual ones, but similarly systematic and predictable.

Organ donation decisions and the default effect (Johnson & Goldstein)

Research question: why do organ donation rates differ across European countries? Is it culture, religion, or something else?
Observation: countries with seemingly similar cultures show large differences in willingness to donate organs.
Key insight: the wording of the form at the DMV, and in particular its default option, has a powerful effect on decisions.
- Opt-in form: people must check a box to participate; many don't check it, so they do not become donors.
- Opt-out form: people are presumed donors unless they check a box to opt out; many don't check it, so they remain donors by default, leading to higher donation rates.
Netherlands case: the donation rate reached only 28% even after a letter was mailed to every household; under an opt-in form, persuasion moved the rate far less than an opt-out default does elsewhere.
The broader implication: the design of the choice environment shapes behavior more than the content of the choice itself.
Philosophical takeaway: many decisions, even those about life after death, are influenced by how choices are framed; people feel like agents, but choice architecture steers outcomes.
Economic interpretation: a standard rationality view would say people stick with the default because the cost of marking a box exceeds the benefit of the decision; in reality the opposite holds. The decision matters, but it is difficult and complex, so people go with whatever was chosen for them.

Default effects and expert decisions (Redelmeier & Shafir)

An experiment with physicians evaluating a patient requiring hip replacement.
Setup A: A reviewer realizes that ibuprofen was not tried; the physician is asked whether to add ibuprofen or proceed with hip replacement.
Result A: Most physicians pull back and try ibuprofen first, delaying surgery.
Setup B: The reviewer realizes two medications were not tried (ibuprofen and piroxicam); doctors must choose either to pull back or not, and if pulled back, which medication to try first.
Result B: The majority choose to let the patient proceed with hip replacement; pulling back becomes more complex due to multiple options.
Default effect: the presence or absence of a default (or the framing of the decision) strongly influences physicians’ choices, sometimes away from purely clinical guidelines.
Practical implication: defaults and choice architecture affect decisions even among highly trained professionals; this has implications for medical practice and policy design.

The car analogy, coffee, and consumer choice (temptation, context, and preference construction)

Thought experiments and ads to illustrate context effects in decision-making:
- Car commercial: a setting with a “dominant” option; the presence of a middle option that nobody wants can still shape choices by altering perceived value of the other options.
- A twist: add a variant like “car stolen” to show why seemingly inferior options can shift preferences by changing the reference frame.
- Coffee example: a package with rum and coffee options; adding or removing a non-preferred option can make preferred options look better by comparison, even if the new option is not attractive on its own.
Economist ad example (The Economist): in a three-option subscription offer, most people pick the "combo" when the middle option is present; removing the middle option, which almost nobody chooses, flips which option is most popular.
Key lesson: people's preferences are imperfect and context-dependent; an option nobody wants still shapes choices by anchoring the comparison between the others.
Takeaway: our stated preferences may be unstable and easily swayed by what is presented to us; designers can exploit or mitigate these effects through framing and option sets.

Attraction, dating, and social signaling experiments

A dating choice experiment: participants view profiles of Tom and Jerry; in some cases a slightly less attractive version of Jerry is added, in others a slightly less attractive version of Tom.
Result: when the less attractive Jerry is present, the original Jerry gains dating interest; when the less attractive Tom is present, the original Tom gains dating interest.
Practical implications:
- If you are going bar-hopping, you might bring along someone who looks like a slightly less attractive version of yourself, so that you look better by comparison.
- Conversely, when someone invites you along, the invitation may tell you how they see you relative to themselves.
Big picture: humor and thought experiments show how social perception and choice are influenced by relative comparisons and contextual cues.

The broader message: cognitive limitations shape the design of the world

Behavioral economics challenges the idealized view of rational, fully autonomous agents; humans have cognitive limitations and biases.
We tend to design the physical world (stairs, accessibility) with these limits in mind, but we often neglect cognitive limits when designing financial decisions, retirement planning, health care, and markets.
The hopeful takeaway: if we understand cognitive limitations as clearly as physical limitations, we can design better systems that guide decisions toward better outcomes without assuming perfect rationality.
Final reflection: are we Superman (optimally rational) or Homer Simpson (limited, biased)? The answer is nuanced; we can design around cognitive constraints to improve real-world outcomes.

Key numerical and mathematical references (embedded in the discussion)

Population and identity probabilities:
- A ZIP code with about $N \approx 5.6 \times 10^4$ people.
- Potential combinations of birth date and ZIP and gender create a search space; the probabilistic argument that uniqueness is not guaranteed in large populations.
Bits of information and data combinations:
- The idea that exposing more attributes increases the information content and thus the chance of identification.
A simple combinatorics thought: the two-attribute collision space grows with the square of the population; rough intuitive estimate used in the talk.
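The "square of the population" intuition can be checked directly: the number of distinct pairs of people who could collide on the same attribute combination is $N(N-1)/2$. A minimal sketch, using the ZIP population from these notes; the helper name `pair_count` is introduced here for illustration only.

```python
def pair_count(n: int) -> int:
    """Number of distinct unordered pairs among n people: n(n-1)/2."""
    return n * (n - 1) // 2


ZIP_POPULATION = 56_000  # figure quoted in the notes

pairs = pair_count(ZIP_POPULATION)
pairs_doubled = pair_count(2 * ZIP_POPULATION)

print(f"pairs among {ZIP_POPULATION:,} people: {pairs:,}")
print(f"doubling the population multiplies pairs by {pairs_doubled / pairs:.3f}")
```

Doubling the population roughly quadruples the number of pairs, which is the quadratic growth the rough estimate relies on.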
A brief numerical aside near the demonstration:
- A quick reference to a familiar bit-count idea: $2^6 = 64$ .
A numeric aside used in a car-ad context:
- A huge number mentioned: $196{,}000{,}000$ (and related arithmetic steps like dividing by three to get a rough figure of order tens of millions); the exact calculation isn’t the focal point, but it shows how large numbers appear in analytic-like thought experiments.
Percentages used to illustrate ambiguity in preferences:
- A figure of $26\%$ referenced as a reaction to a probabilistic interpretation in an identity/privacy context.

Takeaways for exam-ready understanding

Identity is probabilistic and context-dependent; technical identity (satisfying many conditions) is not equivalent to knowing who someone is.
Even with seemingly limited data (birth year, ZIP, gender), there can be a nontrivial chance of re-identification once it is combined with other data points; privacy is about information leakage across attributes, not about any single attribute.
Location data contains a surprisingly large amount of information; geolocation clues can narrow down identity or place with relatively few hints.
Visual illusions reveal robust cognitive biases in perception; cognitive biases in decision-making are harder to observe but operate with similar predictability.
Organ donation decisions are heavily influenced by choice architecture (the design of the form); defaults can drive behavior even in important life-and-death contexts.
Even experts (physicians) are subject to default effects and cognitive biases; the framing of choices can push toward different clinical decisions.
The presence of dominated or irrelevant options can steer preferences through context effects; our preferences are constructed in the moment of choice.
In designing systems, it's important to account for cognitive limitations and biases to avoid suboptimal outcomes; better design can nudge people toward better decisions without diminishing autonomy.
The overarching message: understanding cognitive biases and decision context offers a pathway to building better institutions, technologies, and policies that align with how people actually think and choose.