Bitcoin Mechanics

Mechanics of Bitcoin

Bitcoin Consensus Recap

  • Bitcoin consensus provides:
    • Append-only ledger
    • Decentralized consensus protocol
    • Miners validating transactions, assuming a currency exists to motivate them.

Bitcoin Transactions

Account-Based Ledger (Non-Bitcoin Example)

  • Illustrates a simplified account-based ledger system (not used by Bitcoin).
  • Transactions:
    • Create 25 coins and credit to Alice (ASSERTED BY MINERS)
    • Transfer 17 coins from Alice to Bob (SIGNED(Alice))
    • Transfer 8 coins from Bob to Carol (SIGNED(Bob))
    • Transfer 5 coins from Carol to Alice (SIGNED(Carol))
    • Transfer 15 coins from Alice to David (SIGNED(Alice))
  • Simplification: One transaction per block time.
  • Validation may require scanning backwards to the genesis block.

Transaction-Based Ledger (Bitcoin)

  • Bitcoin uses a transaction-based ledger.
  • Transactions:
    • Inputs: Ø, Outputs: 25.0 → Alice (No signature required)
    • Inputs: 1[0], Outputs: 17.0 → Bob, 8.0 → Alice (SIGNED(Alice))
    • Inputs: 2[0], Outputs: 8.0 → Carol, 7.0 → Bob (SIGNED(Bob))
    • Inputs: 2[1], Outputs: 6.0 → David, 2.0 → Alice (SIGNED(Alice))
  • Simplification: One transaction per block time.
  • Finite scan to check for validity.
  • Implemented with hash pointers.
  • Use of a change address.

Merging Value

  • Example of merging inputs to create an output.
  • Inputs: …, Outputs: 17.0 → Bob, 8.0 → Alice (SIGNED(Alice))
    • Inputs: 1[1], Outputs: 6.0 → Carol, 2.0 → Bob (SIGNED(Alice))
    • Inputs: 1[0], 2[1], Outputs: 19.0 → Bob (SIGNED(Bob))

Joint Payments

  • Example of joint payments requiring multiple signatures.
  • Inputs: …, Outputs: 17.0 → Bob, 8.0 → Alice (SIGNED(Alice))
    • Inputs: 1[1], Outputs: 6.0 → Carol, 2.0 → Bob (SIGNED(Alice))
    • Inputs: 2[0], 2[1], Outputs: 8.0 → David (SIGNED(Carol), SIGNED(Bob))

Real Bitcoin Transaction

  • Demonstrates the structure of a real Bitcoin transaction in JSON format.
  • Key fields:
    • hash: Transaction hash (unique ID).
    • ver: Version number.
    • vin_sz: Number of inputs.
    • vout_sz: Number of outputs.
    • lock_time: "Not valid before" time (more on this later).
    • size: Transaction size in bytes.
    • in: Array of inputs.
    • out: Array of outputs.

Transaction Metadata

  • lock_time: Represents the earliest time or block height that the transaction can be added to the blockchain.

Transaction Inputs

  • Each input includes:
    • prev_out: Hash of the previous transaction being redeemed and index n of the output in that transaction.
    • scriptSig: Signature.

Transaction Outputs

  • Each output includes:
    • value: Amount of Bitcoin to be transferred.
    • scriptPubKey: Script defining the conditions to redeem the output (recipient address). (more on this soon…)
  • The sum of all output values must be less than or equal to the sum of all input values.
  • If the sum of output values is less than the sum of input values, the difference is the transaction fee, which goes to the miner.

Bitcoin Scripts

Script Overview

  • Each transaction output specifies a script, not just a public key.
  • Most common transaction type: redeeming a previous output by signing with the correct key.
  • Validation combines the new transaction’s input script (scriptSig) and the earlier transaction’s output script (scriptPubKey).
  • An address is a hash of a public key.

Script Details

  • Transaction output must specify that it can be redeemed by a public key that hashes to a specific value (address), along with a signature from the owner of that public key.

Pay-to-PubkeyHash Script

  • Example: OP_DUP OP_HASH160 69e02e18... OP_EQUALVERIFY OP_CHECKSIG.
  • Most common type of output script.

Script Validation

  • Inputs also contain scripts (not just signatures).
  • To validate, concatenate the new transaction’s input script and the earlier transaction’s output script.
  • The resulting script must run successfully for the transaction to be valid.
  • In the simplest case, the output script specifies a public key (or address), and the input script specifies a signature with that public key.

ScriptSig and ScriptPubKey

  • scriptSig: The input script (from the redeeming transaction).
  • scriptPubKey: The output script (from the transaction being redeemed).
  • Concatenated script must execute completely with no errors.

Bitcoin Scripting Language ("Script")

Design Goals

  • Built for Bitcoin (inspired by Forth).
  • Simple, compact.
  • Support for cryptography.
  • Stack-based (linear) – every instruction is executed exactly once.
  • Limits on time/memory - the number of instructions in the script gives us an upper bound on how long it might take to run and how much memory it could use.
  • No looping.
  • Result: Bitcoin script is not Turing Complete (cannot compute arbitrarily powerful functions).
  • Miners have to run these scripts, which are submitted by arbitrary participants in the network.
    • Advantage: No infinite looping problem!

Script Language Details

  • Very small, with 256 instructions (each 1 byte).
  • 75 reserved, 15 disabled.
  • Basic arithmetic, logic, crypto instructions (hash computations, signature verifications), etc.
  • Only two possible outcomes:
    • Executes successfully with no errors → transaction is valid.
    • Error during execution → transaction is invalid.

Common Script Instructions

  • OP_DUP: Duplicates top item on the stack.
  • OP_HASH160: Hashes twice: SHA-256, then RIPEMD-160.
  • OP_EQUALVERIFY: Returns true if inputs are equal, otherwise marks transaction invalid.
  • OP_CHECKSIG: Checks that the input signature is valid using the input public key for the hash of the current transaction.
  • OP_CHECKMULTISIG: Checks that t signatures on the transaction are valid from t (out of n) of the specified public keys. A bug exists where An extra data value popped from the stack and ignored that One has to deal with this bug by putting an extra dummy variable onto the stack.

RIPEMD

  • “RIPEMD” stands for “RIPE Message Digest,” where “RIPE” stands for “RACE Integrity Primitives Evaluation” and where “RACE” stands for “Research and Development in Advanced Communications Technologies in Europe”—a nice example of a recursive abbreviation.

Script Execution

  • Uses a stack for data manipulation.
  • Two types of instructions: data instructions and opcodes.
  • Data instructions: push data onto the stack.
  • Opcodes: perform functions using data on the stack.

Script Example Execution

  • The first two instructions are data instructions:

    • Signature and public key used to verify that the signature specified in the scriptSig component of a transaction input in the redeeming transaction.
  • The rest of the script was specified in scriptPubKey:

    • Specified in the scriptPubKey component of a transaction output in the referenced transaction.
  • There is the hash of the public key, as specified by the sender, and the hash of the public key that was used by the recipient when trying to claim the coins.

  • EQUALVERIFY command, which checks that the two values at the top of the stack are equal. If they aren’t, an error will be thrown, and the script will stop executing.

  • CHECKSIG instruction pops the two values, the public key and signature, off the stack, and verifies that is a valid signature for the entire transaction using that public key.

  • Provided there weren’t any errors, the output of this script will simply be true indicating that the transaction is valid.

Script Usage in Practice (2014 Data)

  • Scripts allow specifying arbitrary conditions to spend coins.
  • Most nodes whitelist known scripts (refuse to accept others).
  • Usage:
    • 99.9% are simple signature checks.
    • ~0.01% are MULTISIG.
    • ~0.01% are Pay-to-Script-Hash.
    • Remainder are errors, proof-of-burn.

Proof-of-Burn

  • A script that can never be redeemed, destroying coins sent to it.
  • Implementation: The OP_RETURN opcode throws an error if reached.
  • Can be used to bootstrap alternative currencies or add arbitrary data to the blockchain.

Use Cases for Proof-of-Burn

  • Destroy coins and transfer them to an alternative currency
  • Add arbitrary data to block chain

Pay-to-Script-Hash (P2SH)

P2SH Overview

  • Problem: Senders must specify the script exactly, which can be complex for multi-sig or other advanced conditions.
  • Solution: Pay-to-script-hash (P2SH).
  • Instead of sending to the hash of a public key, send to the hash of a script.
  • To redeem, reveal the script and provide data to make it evaluate to true.

P2SH Mechanics

  • P2SH script hashes the top value on the stack and checks if it matches the provided hash value.
  • Validates the top data value from the stack, is reinterpreted as a sequence of instructions, and executed a second time as a script, with the rest of the stack as input.
  • Complexity: Added to Bitcoin later, not part of the initial design.

Sender/Receiver Roles

  • The recipient can just specify a hash that the sender sends money to.

P2SH Benefits

  • Alice need not worry that Bob is using multisig; she just sends to Bob’s P2SH address, and it is Bob’s responsibility to specify the fancy script when he wants to redeem the coins.
  • Efficiency: Miners track smaller output scripts (hashes), pushing complexity to input scripts.

P2SH Workflow

  • Bob creates a redeem script and hashes it.
  • Bob sends the redeem script hash to Alice.
  • Alice creates a P2SH-style output containing Bob's redeem script hash.
  • When Bob wants to spend the output provides his signature along with the redeem script in the signature script.
  • P2P network ensures the redeem script hashes to the same value as the script hash Alice put in her output, and then it processes the redeem script exactly as it would if it were the primary pubkey script, letting Bob spend the output if the redeem script does not return false.

Applications of Bitcoin Scripts

Escrow Transactions

  • Problem: Alice wants to pay Bob for goods, but needs assurance before paying.
  • Solution: Introduce a third-party arbitrator (Judy) and use MULTISIG.

Escrow Implementation

  • Alice creates a 2-of-3 MULTISIG transaction with Alice, Bob, and Judy.
  • Bob sends goods to Alice after seeing the transaction in the blockchain.

Normal Case (Honest Parties)

  • Alice and Bob both sign a transaction releasing funds to Bob.
  • Judy doesn’t need to be involved.

Disputed Case

  • If Bob doesn’t send the goods (or sends the wrong ones), Alice refuses to sign.
  • Judy decides whether Alice or Bob deserves the money.
  • Judy signs with either Alice or Bob to release the funds.

Green Addresses

Green Address Problem

  • Bob is offline or doesn’t want to wait for confirmations.

Green Address Solution

  • Introduce a trusted third party (bank).
  • Alice pays the bank; the bank pays Bob from a bank-controlled address (green address).
  • The bank guarantees it won’t double-spend the money.

Green Address Trust

  • Bob trusts the bank not to double-spend, based on its reputation.
  • This is a real-world (not Bitcoin-enforced) guarantee.

Green Address Risks

  • If the bank double-spends, people will stop trusting it.
  • Two prominent services that implemented green addresses (Instawallet and Mt. Gox) collapsed.

Efficient Micro-Payments

Micropayment Problem

  • Paying small amounts frequently (e.g., per minute of phone usage) is inefficient due to transaction fees.

Micropayment Solution

  • Use a MULTISIG transaction to pay the maximum amount Alice would ever need to spend, to an output requiring both Alice and Bob to sign to release the coins.
  • Alice continually signs transactions spending those coins that were sent to the MULTISIG address, sending payment to Bob and returning the rest to Alice.
  • When finished, Alice stops signing transactions; Bob signs the last one and publishes it.

Micropayment Double-Spends

  • Alice will keep sending these transactions to Bob every minute that she uses the service.
  • Technically all of these transactions are double-spends. With this micro- payment protocol, we’re actually generating a huge amount of potential double-spends.
  • If both parties are operating normally, Bob will never sign any transaction but the last one, in which case the block chain won’t actually see any attempt at a double-spend.

Micropayment Problem with Never Signing

  • What if Bob never signs the last transaction?
    Alice will lose the full value that she paid at the beginning.

Micropayment Lock Time

  • Solution: Alice demands a timed refund transaction before starting.
  • The refund is locked until time t.
  • Alice will want to get this refund transaction from Bob and hold on to it before she broadcasts, the first MULTISIG transaction that puts her funds into escrow.
  • If Bob hasn’t signed any of the small transactions that Alice has sent by time t, Alice can publish this transaction which refunds all of the money directly to her.

Lock Time Details

  • If you specify any value other than zero for the lock time, it tells miners not to publish the transaction until the specified lock time.

More Advanced Scripts

  • Multiplayer lotteries.
  • Hash pre-image challenges.
  • Coin-swapping protocols.

Smart Contracts

Smart Contract Overview

  • Contracts with technical enforcement in Bitcoin (using scripts, miners, and transaction validation).
  • Examples: Escrow protocol, micro-payment protocol.
  • Many desired smart contracts aren’t supported by the Bitcoin scripting language today.

Bitcoin Blocks

Reason to Bundle Transactions

  1. Requiring consensus for each transaction separately would reduce transaction acceptance rate.
  2. Hash-chain of blocks is much shorter.
  3. Faster to verify history.

Blocks Overview

  • Single unit of work for miners.
  • Limit length of the hash-chain of blocks.
  • Faster to verify the history.

Block Structure

  • Combination of a hash chain of blocks and a per-block tree of transactions (Merkle tree).
  • Merkle tree allows efficient verification of transaction inclusion.

Merkle Tree Verification

  • To prove that a transaction is included in a specific block, we can provide a path through the tree whose length is logarithmic in the number of transactions in the block.
    To recap, a block consists of header data followed by a list of transactions arranged in a tree structure.

Real Bitcoin Block

  • Header contains mining puzzle information (hash, nonce, timestamp, bits).
  • The header is the only thing that’s hashed during mining.
  • Verification only requires looking at the headers.
  • Header includes the root of the transaction tree (mrkl_root).

Block Header Fields

  • ver: Version number.
  • prev_block: Hash of the previous block.
  • time: Timestamp.
  • bits: Difficulty target.
  • nonce: Value miners change to solve the mining puzzle.
  • mrkl_root: Root hash of the Merkle tree of transactions.

Coinbase Transaction

  • A coinbase transaction mostly looks like a normal transaction but with several differences:
  • (1) it always has a single input and a single output,
  • (2) the input doesn’t redeem a previous output and thus contains a null hash pointer, since it is minting new bitcoins and not spending existing coins,
  • (3) the value of the output is currently a little over 25 Bitcoins.
  • The output value is the miner’s revenue from the block comprised of both a flat mining reward and transaction fees.
  • Reward is reduced every 210,000 blocks (about 4 years).
    • (4) There is a special “coinbase” parameter, which is completely arbitrary; miners can put whatever they want in there.

Coinbase Parameters

  • prev_out: Null hash pointer (all zeros).
  • coinbase: Arbitrary data.
  • First ever coinbase parameter: “The Times 03/Jan/2009 Chancellor on brink of second bailout for banks”.

The Bitcoin Network

P2P Network

  • Ad-hoc protocol (runs on TCP port 8333).
  • Ad-hoc network with random topology.
  • All nodes are equal.
  • New nodes can join at any time.
  • Forget non-responding nodes after 3 hrs.

Joining Network

  • New nodes discover peers via getaddr() messages from existing nodes.

Transaction Propagation

  • Transactions are flooded throughout the network.
  • Nodes relay transactions they haven't seen before.

Transaction Relay Policy

  • Validate with current block chain (default).
  • Run script for each previous output being redeemed and ensure that script returns true!
  • Script matches a whitelist.
  • Avoid unusual scripts.
  • Haven’t seen before.
  • Avoid infinite loops.
  • Doesn’t conflict with others I’ve relayed.
  • Avoid double-spends.
  • Sanity checks only… Well-behaving nodes implement them but some nodes may ignore them!

Conflicting Transactions

  • Transactions or blocks may conflict.
  • Default behavior: accept what you hear first.
  • Network position matters.
  • Miners may implement other logic!

Block Propagation

  • Relay a new block when you hear it if:
    • Block meets the hash target.
    • Block has all valid transactions.
    • Run all scripts, even if you wouldn’t relay.
    • Block builds on current longest chain.
    • Avoid forks.

Block Propagation Times

  • Smaller blocks propagate faster.

Network Size

  • Impossible to measure exactly.
  • Estimates-up to 1M IP addresses/month.
  • Only about 5-10k “full nodes”.
    • Permanently connected.
    • Fully-validate.
    • This number may be dropping!

Fully Validating Nodes:

  • Permanently connected
  • Store entire block chain
  • Hear and forward every node/transaction

Storage Cost

  • Storage costs have increased over time as the blockchain has grown.

Tracking UTXO Set

  • Unspent Transaction Output should be stored in memory - everything else can be stored on disk, why?

UTXO Set Size

  • Currently ~65 M UTXOs out of 300 M transactions.
  • Requires several Gigabytes to store – can it fit in the RAM of a standard computer?

SPV/Thin Clients

  • Simplified Payment Verification (SPV) clients don’t store the entire blockchain.
  • Store block headers, validate relevant transactions, and trust fully-validating nodes.
  • Requires only a few tens of Megabytes (1000x cost savings!)

Software Diversity

  • About 90% of nodes run “Core Bitcoin” (C++).
  • Other implementations include BitcoinJ (Java), Libbitcoin (C++), btcd (Go).

Limitations & Improvements

Hard-Coded Limits

  • 10 min. average creation time per block.
  • 1 M bytes in a block.
  • 20,000 signature operations per block.
  • 100 M satoshis per bitcoin.
  • 23M total bitcoins maximum.
  • 50,25,12.5… bitcoin mining reward. These affect economic balance of power too much to change now.

Throughput Limits

  • 1 M bytes/block (10 min).
  • >250 bytes/transaction.
  • 7 transactions/sec. Compare to:
    • VISA: 2,000-10,000 transactions/sec.
    • PayPal: 50-100 transaction/sec.

Cryptographic Limits

  • Only 1 signature algorithm (ECDSA/P256).
  • Hard-coded hash functions.
  • Crypto primitives might break by 2040…

“Hard-Forking” Changes

  • Old nodes will never catch up, that’s crazy talk!!
  • Problem: Old nodes will never catch up

Soft Forks

  • Add new features that limit the set of valid transactions.
  • Need the majority of nodes to enforce new rules.
  • Old nodes will approve, but old nodes might mine now-invalid blocks.

Soft Fork Example: Pay to Script Hash

  • Old nodes will just approve the hash, not run the embedded script

Soft Fork Possibilities

  • New signature schemes
  • Extra per-block metadata
  • Shove in the coinbase parameter
  • Commit to UTXO tree in each block

Hard Forks

  • New op codes
  • Changes to size limits
  • Changes to mining rate
  • Many small bug fixes. Stay tuned for the lecture on altcoins! Currently seem very unlikely to happen

Summary

  • Human beings aren’t Bitcoin nodes.
  • Currency needs to work for people, not nodes.