Cryptocurrency Exchanges and Searchable Symmetric Encryption Lecture Notes

Cryptocurrency Exchanges and the Hidden Trust Problem

  • Introduction to the Trust Question   - The presentation explores whether blockchain technology truly removes the need for trust or if it merely shifts it.   - The central question is: "When people trade cryptocurrency, what are they really trusting?"   - Most users do not interact directly with the blockchain; they use centralized exchanges (CEXs) such as Binance, Coinbase, or BitSum.

  • The Paradox of Centralized Exchanges   - The Design Goal: Blockchain is built to reduce the need for trust in a central authority.   - The Reality: Centralized exchanges bring trust back by operating as intermediaries to provide speed, lower costs, and convenience.   - Internal Operations: To maintain efficiency, most trading happens inside an "off-chain" system (the exchange's internal database) rather than being recorded immediately on the blockchain.   - Trust Shift: Users are not trusting the blockchain directly; they are trusting the exchange company to maintain accurate records.

  • Case Study: The BitSum Bitcoin Error (February 2026)   - Incident Details: Bisum (a major South Korean exchange) mistakenly credited approximately 620,000620,000 Bitcoin to users.   - Financial Magnitude: This amount was valued at approximately 84,000,000,00084,000,000,000.   - Core Cause: The error was attributed to human error in the internal exchange layer, not a failure or hack of the Bitcoin blockchain itself.   - Significance: This demonstrates that the exchange's internal record-keeping can fail even while the underlying blockchain functions perfectly. Users were credited with more Bitcoin than the exchange actually possessed.

  • Blockchain vs. Centralized Databases   - Blockchain Design: It is expensive by design because information is stored across many nodes.   - Performance Constraints: Transactions require block confirmations and involve transaction fees (known as "gas fees") to reward miners or validators.   - The Value Proposition: While slower and more expensive, blockchain offers decentralized trust, removing the need to rely on a single bank, government, or database administrator.   - Why CEXs Win on Usability: Pure blockchain trading is too complex for beginners (requiring management of private keys and wallets) and too slow for high-frequency trading. CEXs update balances instantly and match trades internally.

  • The Hybrid Structure of Exchanges   - On-Chain: Deposits and withdrawals are recorded on the public blockchain ledger.   - Off-Chain: Everyday trading happens in the exchange's internal database.   - The Hidden Trust Problem: Users must trust that the exchange:     - Keeps internal ledgers correct.     - Actually holds the assets it claims to have.     - Protects private keys and custody systems.     - Will allow withdrawals in the future.

  • Exchanges as "Weaker Banks"   - Traditional banks have a long history of institutional protection, government supervision, audits, and legal responsibilities.   - If a crypto exchange requires the same level of institutional trust but lacks the regulations of a bank, it essentially becomes a weaker version of a bank.   - The Paradox of Safety: To make exchanges safer, we must re-introduce the institutions blockchain tried to avoid: regulation, audits, and legal responsibility.

  • The Threat to Government Control: Stablecoins and Libra/Diem   - Stablecoins are designed to maintain a stable value by pegging to real-world currencies like the US Dollar.   - They are used for international transfers, payments, and savings.   - The Libra (Diem) Case: Facebook attempted to create a global stablecoin. Because Facebook owned Instagram and WhatsApp, this represented a potential private global money system.   - Government Reaction: This was viewed as a threat to sovereign control over money, leading to strong resistance. The debate was about power and who controls rules and money.

  • Decentralized Exchanges (DEXs) and Self-Custody   - DEX Definition: Users trade directly from their own wallets via smart contracts. No central database controls user balances.   - Trade-offs:     - Technical Risks: Slower transactions, high fees, and network congestion.     - The Burden of Responsibility: In self-custody, if a user loses their private key, there is no "password reset" or customer service to recover funds.   - Anecdote of Data Loss: The speaker shared a personal experience of losing access to a GitHub account after accidentally deleting a third-party Google extension used for two-factor authentication. In crypto, this error results in permanent loss of financial assets.   - Economic Incentive Problem: Developing robust infrastructure requires vast resources, but full decentralization means the creators lose control, creating a conflict in economic incentives.

Searchable Symmetric Encryption (SSE)

  • Core Characteristics of Searchable Encryption   - Data Privacy: Users store encrypted data on a server.   - Secure Search: Users can perform searches without the server knowing the content of the data or the search queries.   - Client-Side Operations: Encryption and decryption occur on the client side; the server only sees hashes or trapdoors.   - Correctness: A critical metric. Depending on the model, results can be perfectly correct or include minor "lossy" errors (false positives).

  • The Five Fundamental Algorithms of SSE   - 1. Generation (λ\lambda): Uses a security parameter (λ\lambda) to produce a secret key (kk).   - 2. Encryption: Uses the secret key to transform the database into an encrypted index.   - 3. Trapdoor: Converts a keyword into a trapdoor token using the secret key for searching.   - 4. Search: The server uses the trapdoor and index to find the corresponding document.   - 5. Decryption: The client decrypts the returned document.

  • Levels of Correctness   - Calculation: The probability of a correct query return is 1ϕ(λ)1 - \phi(\lambda).   - Perfectly Correct: ϕ=0\phi = 0 — the system always returns the exact files.   - Correct but Negligible: ϕ\phi is extremely small (e.g., 11 in 21282^{128}).   - Lossy Return: A small percentage (around 1%1\%) of false positives occur. This is often tolerated because it makes the search process faster by allowing hash collisions in the keyword space.

  • SSE-1 Structure (Developed in 2006)   - This is a deterministic, perfectly correct structure.   - Indexing: Words are mapped to files (e.g., Word 1 in File 1, 2, 3). This happens on the client side.   - Lookup Table: Encrypted tokens are assigned to a random starting value and stored in an array at the server.   - Chaining: If a word appears in multiple files, its addresses in the array are chained (e.g., position 44 chains to position 77, then to position 33).   - Padding: Empty strings are used to fill the rest of the array to protect against the server guessing the data distribution.

  • Attacks on Searchable Encryption   - File Injection (Active Attack):     - The attacker (server) plants a file containing specific keywords into the user's database.     - When the user searches, the attacker observes which file is returned. Since they know the contents of the injected file, they can deduce the plaintext of the user's query.     - Logarithmic Complexity: An attacker needs only log2(keyword space)\log_2(\text{keyword space}) files to identify keywords.   - Leakage Problem (Passive Attack):     - A persistent adversary observes the "access pattern" — which trapdoors correspond to which files over a long period.     - They create a co-occurrence matrix to map frequencies. If the database type is known (e.g., an animal database), they can deduce keywords based on their frequency in common language.

  • Solutions and Trade-offs   - Oblivious RAM (ORAM): Shuffles data constantly so the server cannot see access patterns. Disadvantage: Very expensive computationally.   - Padding: Fills arrays with dummies to prevent the server from guessing based on size. Disadvantage: Storage size "explodes."   - Frequency Smoothing: Returns dummy files for rare words so all words appear to have the same frequency. Disadvantage: Expensive operation.

Questions & Discussion

  • Question on Solvency: How can a user know an exchange is solvent without a third-party auditor?   - Answer: One method is for the company to record all transactions on the blockchain. Because blockchain is immutable, users can verify the records themselves. While this process is slow, it provides high trust.

  • Question on Private Key Management: How do users manage self-custody keys in the real world?   - Answer: Users often use third-party applications like browser extensions (e.g., Google extensions) that act as wallets to store keys, or they manually write them down (memos).

  • Question on "Provably Secure" SSE: Why is SSE-1 called provably secure if it is vulnerable to leakage?   - Answer: It is considered secure because the leakage problem only works if the attacker knows the keyword space or database type. In practical applications where the database is unknown, the attacker has no starting ground. Additionally, file injection is a separate trust issue where the user must actively accept a malicious file from the server, which is outside the standard SSE model.