Blockchain Privacy and Searchable Encryption

Zero Knowledge Proofs (ZKSNARKS) in Blockchain Systems

Presenter: Arin Kaushik
General Topic: Discussion on the design, applications, and security analysis of zero-knowledge proofs, specifically ZK-SNARKs, within blockchain systems.

Blockchain Fundamentals and the Privacy Problem

Definition of Blockchain: A decentralized ledger spread over networks to allow transactions across different computers. It is characterized as being immutable, allowing transactions across multiple processes.
The Problem in Traditional Public Blockchains (e.g., Bitcoin): * Public blockchains like Bitcoin utilize a "transparent transaction" model. * All inputs, outputs, and transaction amounts are available for public viewing; they are not hidden or encrypted. * Risks associated with transparency: 1. User Surveillance: Attackers can monitor all transaction information. 2. DDoS Attacks: Open information can facilitate targeted attacks. 3. Regulatory Privacy Concerns: Lack of confidentiality can conflict with privacy regulations. * Trade-off: While Bitcoin offers high flexibility and low cost, the negatives of its transparency outweigh these positives in certain contexts.

Zero Knowledge Proof (ZKP) Mechanisms

Core Components: * The Prover: The party that sends out a proof to verify details while hiding inputs and amounts. * The Verifier: The party that verifies the proof without gaining access to the underlying hidden information.
The Three Essential Components of ZKP: 1. Completeness: If the prover has made a true statement, the honest verifier will accept the statement. 2. Soundness: If the statement is false, no "cheating prover" can prove it correct to the verifier. If the verifier cannot verify the statement, no further statements from that prover are accepted. 3. Zero Knowledge: The verifier learns nothing from the interaction except that the statement is true. They cannot access or be affected by the prover's inputs.
The Alibaba and the Forty Thieves Metaphor: * Characters: Peggy (Prover) and Victor (Verifier). * Scenario: A cave with a branching point (Path A and Path B) and a magic door at the back that requires a secret code to unlock. * Process: Victor stands outside. Peggy enters and takes a path. Victor then calls out a path (A or B) for her to exit from. * Verification: If Peggy knows the code, she can unlock the door to exit through whichever path Victor chooses. If she does not know the code, she has a $50\%$ chance of being on the correct path. Repeating this process (e.g., $20$ times) reduces the chance of cheating to a negligible level. * Outcome: Victor proves Peggy knows the code without ever hearing the code himself.

ZK-SNARKs: Design and Working

Acronym Definition: Zero-Knowledge Succinct Non-Interactive Argument of Knowledge. * Succinct: Requires very little computation and is very fast to verify. * Non-Interactive Argument: Only one proof is sent between prover and verifier, rather than multiple loops of interaction. This is necessary for blockchain because on-chain verification must be cheap and quick; repetitive loops increase execution costs and time.
Technical Workflow of ZK-SNARKs: 1. Algorithm Conversion: A computation is converted into a mathematical algorithm. 2. QAP (Quadratic Arithmetic Program): The algorithm is converted into a QAP, which contains all the required mathematical computations. 3. Private Circuit: The QAP is processed through a private circuit (a combination of multiple QAPs). 4. Trusted Setup: A one-time "Trusted Setup" ceremony generates public parameters known as the CRS (Common Reference String). 5. Proof Generation: The prover combines the QAP, private circuit, and CRS to create a succinct proof. 6. Verification: The verifier checks the proof against the public keys. If they match, the claim is valid; if not, the interaction is closed.

Trusted Setup and Security Risks

Groth 16: The version of SNARKs used in Zcash which requires a trusted setup ceremony.
Multi-Party Contribution: To mitigate the risk of a single coordinator being hacked or compromised, the setup uses multiple contributors and coordinators.
Counterfeit Prevention: Multiple inputs ensure that no single party can generate "counterfeit proofs" or false answers, as this would require the combined input of all parties.
Randomness: The process relies on randomness to ensure security.

Case Study: Zcash vs. Bitcoin

Zcash Implementation: * Uses ZK-SNARKs to hide the sender, receiver, and transaction amounts. * The transaction involves a "Zero-knowledge proof of validity" for the input and a "Zero-knowledge proof" for the output. * Information is encrypted and hidden from both the public and the verifier, preserving strong privacy while allowing full verification.
Comparison Matrix: * Privacy: Bitcoin has none (public data); Zcash has strong shielded/hidden data. * Verification: Bitcoin requires full transaction data; Zcash uses succinct cryptographic proofs. * Setup: Bitcoin requires no setup; Zcash requires a trusted ceremony. * Trade-offs: Bitcoin offers high auditability and flexibility; Zcash offers privacy but incurs higher prover costs.

Security Analysis, Limitations, and Future Outlook

Strengths: High standard cryptographic assumptions and high security.
Limitations: * Dependence on the trusted setup; if the setup fails, the system fails. * Higher computational costs and longer processing time for the blockchain compared to transparent systems. * Fragility to specific bugs/codes in the original SNARKs version.
Future Improvements: * Halo: A new version used in Zcash addressing previous limitations. * ZK-STARKs: A newer version that features a transparent setup (no trusted ceremony required) and is quantum resistant.

Questions & Discussion (Arin)

Question (Audience): Regarding the Zcash Sprout ceremony and the "toxic waste" (secret trapdoor): how can someone verify that the toxic waste has actually been destroyed?
Answer (Arin Kaushik): The verification comes through the multi-party computation process. Because there are multiple coordinators and contributors, the "toxic waste" or counterfeit proofs would only be valid if the coordinators failed to provide equal randomness. If a coordinator moves away or doesn't contribute correctly, the verifier can detect the discrepancy, alert the prover, and potentially delete that segment. The randomness of the multi-party process ensures that if a portion is removed or a new verifier joins, the compromised path is eliminated.

Searchable Encryption

Presenter: Madeline
Definition: Searching performed on data held by an untrusted third party (like a cloud host) without decrypting the data first. It protects data from untrusted hosts or parties sharing hardware.

Indexing Methods

Sequential Scan: Scans the entire encrypted document to compare values until a match is found. Similar to regular search but performed on ciphertext.
Document-based Indexing: Files contain embedded keywords. This is more efficient than sequential scans because the entire ciphertext does not need to be searched.
Keyword-based Indexing: Links specific keywords to file locations. This is the most efficient for searching but adds complexity and expense when updating the index (adding or removing documents).

Symmetric Searchable Encryption (SSE)

Mechanism: Works by creating a cryptographically secure pseudorandom number generator (PRNG) with a key.
Encryption: The output of the PRNG is XORed with the plaintext to create the ciphertext.
Search Process: Because the PRNG is deterministic and encryption is not dependent on previous parts of the ciphertext, a search can be conducted by recreating the PRNG and XORing the search word with the bitstream at every possible position. If the XOR result matches at a specific position, the word exists in the document.

Public Key Encryption with Keyword Search (PEKS/PEX)

The Model: 1. Bob (Sender): Encrypts a keyword $w$ into $x$ using a public key. He sends $x$ and the message to the server. 2. Alice (Receiver): Computes a "trapdoor" using her private key and the keyword she wants to search for. She sends this trapdoor to the server. 3. Server: Runs a "test function" to see if the trapdoor matches any of the encrypted keywords ( $x$ ). If a match is found, the server sends the associated messages to Alice.
Efficiency: Usually, the main message is encrypted with standard encryption for efficiency, while PIX/PEX is used specifically for the keywords/subject lines.

Attacks on Searchable Encryption

Offline Keyword Guessing: * A query recovery attack on PEX. * If a party has the public key and the trapdoor, they can encrypt individual keywords from a dictionary and test them against the trapdoor until they succeed. * Unlike passwords, keywords are usually in a specific language, making dictionary attacks easier.
File Injection: * A query recovery attack often targeting email servers. * The server holds plain text email until the client connects, re-encrypts, and re-uploads it. * A malicious server sends an email to itself. When the client encrypts it, the server knows which piece of encrypted mail corresponds to which prompt. * Zhang's Technique: Using a binary search pattern where each piece of mail contains half the keywords an attacker wants to identify to narrow down the user's query.
Mitigation (Proposed by Zhang): The client could limit the number of words allowed as keywords (e.g., only the first $10$ words of a subject line) or use dedicated tags like "Urgent" to avoid UI overflow and limit exposure.

Questions & Discussion (Madeline)

Question (Audience): Can you explain the PEX diagram and what the "trapdoor" actually is?
Answer (Madeline): The trapdoor is an encrypted form of the keyword Alice wants to search for. It is similar to a hash in that it cannot be easily decrypted. The server uses a test function to compare this trapdoor against encrypted keywords attached to messages. If the test returns "true," the message is deemed relevant to Alice’s query.
Question (Audience): If the server is an untrusted third party, why is it safe to hand over the trapdoor? Does this not enable the keyword guessing attacks mentioned?
Answer (Madeline): It does indeed allow the server to perform those attacks. (Wait for further solutions is implied but not explicitly detailed in the response).

Administrative Announcements

Report Submission Deadline: Approximately June 19 (the end of the semester).
Guidelines: There are specific formatting guidelines that students must follow. A formal announcement with these details will be made within the week, providing more than a month for preparation.
Schedule: Next meeting is scheduled for tomorrow.