T5e
P2P and DHT
Layers
- The stack: Application, Presentation, Session, Transport, Network (IPv4/IPv6), Link (Ethernet, WiFi), Physical (cables, space, bits).
- Data units per layer: Messages (Application), Segments (Transport), Packets (Network), Frames (Link), Bits (Physical).
Client/Server Model
- Fixed Endpoints: Requires knowing well-known, fixed endpoints (with DNS allowing some redirection/load balancing).
- Dedicated Connection: Establishes and maintains a dedicated connection for the duration of the conversation.
- Efficient Data Transfer: Aims to transfer data as efficiently as possible.
- Centralized Services: Depends on well-known, centralized, "one-off" services.
- Fragility and Inefficiency: Suffers from fragility and inefficiency.
Issues with the Client/Server Model
- Centralized: Dependency on a particular service/provider.
- Fragile: The central server is an attractive target for attacks and a single point of failure.
- Inefficient: One server serving many clients leads to bottlenecks.
- Scaling Costs: Better service requires more server nodes, increasing costs.
- Organizational Overheads: Managing more server nodes increases workload.
- Lack of Privacy: Participants are easy to trace.
Desirable Application Characteristics
- Decentralized: Composed of independent, 'equal' nodes working for mutual benefit.
- Robust: No single point of failure or attack.
- Efficient: No bottleneck, optimized for data transfers.
- Scalable: Able to handle a large number of nodes.
- Self-Organizing: Dynamic network formation with plug-and-play capabilities.
- Anonymous: Difficult to trace participants.
Peer-to-Peer (P2P)
- Decentralized file/content sharing for:
- Popular media.
- Sensitive content.
- Operating systems.
- Games and updates.
- etc.
- Offloads servers (dynamic CDN).
- Models:
- Pure vs. Hybrid.
- Structured vs. Unstructured.
- Employs distributed directories, ledgers, and blockchain.
- Centralized (One Server):
- One server, N recipients.
- Server bandwidth divided by N recipients.
- Time is increased.
- P2P (Everybody Has Pieces):
- Effectively up to N servers, N recipients.
- N peers each contributing their bandwidth.
- Time is decreased.
- Asymmetric Bandwidth:
- Common in last mile connections where downloading is easier than uploading.
- Need to leverage benefits of both.
- Information Challenge:
- Client/server has sufficient information.
- P2P requires information to be distributed, which raises a bootstrap problem.
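To make the efficiency point concrete, here is a minimal sketch of the standard lower bounds on distribution time for one server pushing a file to N clients versus N cooperating peers. All bandwidth values are illustrative assumptions, not figures from the notes:

```python
# Sketch: minimum distribution time for a file of F bits to N peers.
# u_s = server upload rate, d = slowest peer download rate,
# u = per-peer upload rate (all in bits/s; values are made up).

def t_client_server(F, N, u_s, d):
    # The server must push N full copies; the slowest peer must receive one.
    return max(N * F / u_s, F / d)

def t_p2p(F, N, u_s, d, u):
    # Aggregate upload capacity grows with N, so time grows far more slowly.
    return max(F / u_s, F / d, N * F / (u_s + N * u))

F = 8e9                       # 1 GB file, in bits
u_s, d, u = 1e8, 2e7, 5e6     # 100 Mb/s server; 20 Mb/s down, 5 Mb/s up per peer

for N in (10, 100, 1000):
    print(N, round(t_client_server(F, N, u_s, d)), round(t_p2p(F, N, u_s, d, u)))
```

As N grows, the client/server time grows linearly while the P2P time stays bounded by the peers' own rates.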
Napster
- The first to go viral (1999).
- Primarily MP3-oriented.
- Central Index: Maps files to machines that store them.
- File Discovery:
- Query the index system.
- Receive a machine that stores the required file.
- Get the file from the machine.
- Simple, easy-to-implement search engines.
- Disrupted record shops, leading to iTunes and Spotify.
- Disadvantage:
- A single point of failure.
- Shut down due to copyright issues.
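Napster's central index can be pictured as nothing more than a map from file names to the machines that registered them. This is a hypothetical sketch, not Napster's actual protocol:

```python
# Sketch: a Napster-style central index (hypothetical, in-memory).
index = {}

def register(host, files):
    # A machine announces the files it is willing to serve.
    for f in files:
        index.setdefault(f, set()).add(host)

def lookup(name):
    # The index only answers "who has it"; the file transfer itself
    # then happens directly, peer to peer, from one of those machines.
    return index.get(name, set())

register("10.0.0.1", ["song.mp3"])
register("10.0.0.2", ["song.mp3", "other.mp3"])
print(sorted(lookup("song.mp3")))  # ['10.0.0.1', '10.0.0.2']
```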
Gnutella
- 1st generation unstructured/decentralized P2P protocol (2000).
- Query Flooding:
- Send request to all neighbors.
- Neighbors recursively multicast the request.
- Advantage:
- Decentralized, highly robust architecture.
- Flexible queries.
- Disadvantage:
- Not scalable; inefficient; slow deep searches.
- Has a TTL on queries.
- The entire network can be swamped with requests.
- Poor protocol design, insecure/fragile.
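Query flooding with a TTL can be sketched on a toy topology (the graph and TTL values here are made up for illustration):

```python
# Sketch: Gnutella-style query flooding with a TTL on a toy graph.
def flood(graph, start, ttl):
    """Return the set of nodes a query reaches within `ttl` hops."""
    reached, frontier = {start}, {start}
    for _ in range(ttl):
        # Each node forwards the query to all its neighbors.
        frontier = {n for f in frontier for n in graph[f]} - reached
        reached |= frontier
    return reached

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
         "D": ["B", "C", "E"], "E": ["D"]}
print(sorted(flood(graph, "A", 1)))  # ['A', 'B', 'C']
print(sorted(flood(graph, "A", 2)))  # ['A', 'B', 'C', 'D']
```

The reachable set (and message count) grows with every extra hop of TTL, which is exactly why the entire network can be swamped.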
BitTorrent: Intentions
- Developed by Bram Cohen in 2001.
- Splits files into many integrity-protected (hash-verified) chunks, making distribution safer and faster.
- Deals with asymmetric (last-mile) bandwidth.
- Can only download as fast as uploader uploads.
- Deals with inefficient downloading (TCP limits).
- Aims for fairness and encourages unthrottled sharing.
- Incorporates game theory concepts like tit-for-tat, Prisoners’ Dilemma, and Pareto/80-20 rule.
BitTorrent (Original)
- Central Directory: Tracker
- Tracks only peer identities (who is in the swarm).
- Returns a subset of peers per request, not the full list.
- Holds none of the file itself.
- Static Metadata (Torrent) File
- Tracker, File info.
- Index of Pieces, hashes.
- Pieces are 256 or 512 KB each, with a SHA-1 hash per piece.
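The piece index can be sketched as follows, assuming a 256 KB piece size; the function name is illustrative and this is not the actual .torrent encoding:

```python
import hashlib

# Sketch: the per-piece SHA-1 hashes stored in a torrent's piece index
# (256 KB pieces assumed; a downloader verifies each piece against these).
PIECE = 256 * 1024

def piece_hashes(data: bytes):
    return [hashlib.sha1(data[i:i + PIECE]).hexdigest()
            for i in range(0, len(data), PIECE)]

hashes = piece_hashes(b"\x00" * (PIECE + 1))  # one full piece + 1 byte
print(len(hashes))  # 2
```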
BitTorrent Components
- Swarm: All peers participating in the file sharing.
- Seeder: Full-file-holder who only uploads.
- Leecher: Downloader who downloads what they don’t have and uploads what they do have.
Network Efficiency in BitTorrent
- Down:up bandwidth ratio is typically 4:1.
- Downloads occur from a chosen, evolving subset, typically ~4 in parallel.
- Uploads occur to those downloading and others.
- Pieces are divided into sub-pieces (16kB).
- Pipeline and fan-out requests for sub-pieces across peers.
- Download sub-pieces in parallel to build a complete piece.
Piece Selection Algorithm
- Strict priority: request the remaining sub-pieces of a started piece before starting a new piece.
- Rarest piece has priority:
- Spreads the seeding (the seeder is the bottleneck).
- Improves robustness against loss of the seeder.
- Increases the spread/speed of distribution.
- Increases your value to other peers.
- Equivalently: most-common pieces last.
- Random startup piece:
- Need to have something to share, quickly
- Endgame:
- Need to get only specific piece(s), ask all peers
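Rarest-first selection reduces to counting how many peers hold each needed piece; a toy sketch (the peer sets are made up):

```python
from collections import Counter

# Sketch: rarest-first piece selection. `peer_pieces` maps each peer to
# the set of piece indices it holds; pick the needed piece held by the
# fewest peers (ties broken arbitrarily here).
def rarest_first(peer_pieces, have, total_pieces):
    counts = Counter(p for pieces in peer_pieces.values() for p in pieces)
    needed = [p for p in range(total_pieces)
              if p not in have and counts[p] > 0]
    return min(needed, key=lambda p: counts[p], default=None)

peers = {"a": {0, 1}, "b": {0, 2}, "c": {0}}
print(rarest_first(peers, have=set(), total_pieces=3))  # 1 (piece 0 is most common)
```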
Choosing Peers
- Resource allocation considerations:
- How much bandwidth does a peer have and offer?
- What’s best for everyone?
- Implementation:
- Connect to "all" known peers.
- Fix the pool of peers to upload to (typically 4).
- Identify the best uploaders:
- Non-trivial: which peers give you the best download rates?
- Allow them to download from you ("unchoke").
- Re-evaluate the average rates every 10 seconds.
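Picking the upload pool then amounts to ranking peers by measured download rate and keeping the top four; a sketch with illustrative rates:

```python
# Sketch: the regular-unchoke choice: keep the 4 peers currently giving
# you the best measured download rates (rates are illustrative, kB/s).
def choose_unchoked(download_rates, slots=4):
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    return set(ranked[:slots])

rates = {"p1": 90, "p2": 10, "p3": 70, "p4": 55, "p5": 80, "p6": 5}
print(sorted(choose_unchoked(rates)))  # ['p1', 'p3', 'p4', 'p5']
```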
Evolving the Peer Pool
- Allow others a chance.
- “Optimistic unchoking”
- Pick one other peer at random every 30 seconds.
- Unchoke it.
- Measure performance.
- Compare to existing pool.
- Keep the best 4, choke others.
- Gives everyone a chance.
- Reduces the impact of free-loaders.
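Optimistic unchoking just adds a random pick from the currently choked peers; a sketch (peer names illustrative):

```python
import random

# Sketch: optimistic unchoking. Every 30 s, one randomly chosen choked
# peer is unchoked alongside the rate-based set, so newcomers get a
# chance to prove themselves and free-loaders lose their grip.
def optimistic_unchoke(all_peers, unchoked, rng=random):
    choked = [p for p in all_peers if p not in unchoked]
    return rng.choice(choked) if choked else None

peers = ["p1", "p2", "p3", "p4", "p5", "p6"]
unchoked = {"p1", "p3", "p4", "p5"}
lucky = optimistic_unchoke(peers, unchoked)
print(lucky)  # 'p2' or 'p6'
```

If the newcomer outperforms one of the current four, it replaces the worst of them at the next re-evaluation.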
Upload-Only Phase
- Occurs after finishing downloading and willing to seed.
- Cannot use download rates.
- Can use upload rates from you to each (chosen) peer.
- Best rates optimize distribution of pieces.
- Best use of your upstream capacity.
- Possibly less-busy downloader.
- Once satisfied, move on to other peers.
Trackerless BT
- Introduced by Cohen in 2005.
- Uses a Distributed Hash Table (DHT).
- Every node is a tracker.
- Every node joins an overlay network (application-level connectivity).
- Like a swarm, ignoring physical connectivity.
- Used for information sharing, not file sharing.
DHT Requirements
- Autonomy, Decentralization, Fault tolerance, Scalability
- Tracker Hash Table is Distributed across all peers
- Any peer can query for a “key” (piece, info, …)
- DHT returns value for the key
- Query resolution uses minimal # messages
- DHT Peers only need to know about some other peers
- Must be dynamic/robust to peers coming and going (aka churn)
Distributing the Tracker DB
- Each row of the distributed tracker DB holds: Responsible Node, its NodeID, the Piece # (= Key = KeyID), and the Information (Value).
- Example:

| Responsible Node | NodeID | KeyID (Piece #) | Value (Peers) | Hash |
|------------------|--------|-----------------|---------------|-------|
| 150.203.56.99 | 10 | 10 | a, b, c, d, … | JJJJJ |
| 150.203.56.99 | 10 | 20 | e, f, g, … | KKKK |
| 150.203.56.99 | 10 | 30 | h, i, … | LLLL |
| 130.48.1.7 | 90 | 67 | h | MMMM |
| 130.48.1.7 | 90 | 78 | k, m | PPPP |
| 130.48.1.7 | 90 | 100 | (none) | QQQQ |
| 45.48.17.6 | 140 | 107 | a, e, g | SSSS |
| 45.48.17.6 | 140 | 115 | b, d | VVVV |
| 45.48.17.6 | 140 | 149 | f, h, k | YYYY |
Allocating K/V Pairs to Nodes
- Every piece/key has an ID.
- Every node (peer) has an ID on the same scale.
- (In BitTorrent: 160-bit IDs, intended to be globally unique (GUIDs).)
- Associate keyID (row) with nearest successor NodeID.
- Each node takes care of all keys “between” it and its predecessor node.
- Example:
- Piece (Key) ID-scale = [1..150]
- NodeIDs = 10, 90, 140
- KeyID 45 goes to NodeID 90
- KeyID 111 goes to NodeID 140
- KeyID 149 goes to NodeID 10
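The successor rule in the example can be sketched directly:

```python
# Sketch: assign a key to its nearest successor node on a circular ID
# space, matching the worked example (ID scale [1..150]; nodes 10, 90, 140).
def successor(key_id, node_ids):
    bigger = [n for n in sorted(node_ids) if n >= key_id]
    return bigger[0] if bigger else min(node_ids)  # wrap around the circle

nodes = [10, 90, 140]
print(successor(45, nodes))   # 90
print(successor(111, nodes))  # 140
print(successor(149, nodes))  # 10 (wraps past the end of the ID scale)
```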
Partitioning the ID Number Space
- Basically, divide the rows amongst DHT nodes.
- Distribute K/V pairs evenly(-ish).
- Ensure every node participates and contributes and is known.
- 1-D: a line that wraps around into a circle.
- 2-D: a sheet that wraps around in both dimensions (topologically a torus).
- Map DHT nodes to the space.
- As nodes get added, re-partition (re-distribute some rows amongst nodes).
- As nodes leave, re-partition (re-distribute their rows).
- PUT K/V pairs on nodes, and GET queries.
Example DHT
- ID space = [1, 150]
- Number of nodes = 16
- Each node knows:
- Its successor
- Its predecessor
Query Resolution
- Node 18 asks: GET key 109; what is its value?
- The query is forwarded successor-by-successor until it reaches the responsible node.
- A scaling issue:
- O(N) steps per lookup.
- Only 2 neighbors per node.
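Successor-only routing can be simulated to see the O(N) cost; the 16-node ring below is an assumed even spacing over [1, 150], not taken from the notes:

```python
# Sketch: lookup when each node knows only its successor: the request
# walks the ring one hop at a time, so a lookup costs O(N) steps.
def linear_lookup(start, key_id, node_ids):
    ring = sorted(node_ids)
    owner = next((n for n in ring if n >= key_id), ring[0])
    node, hops = start, 0
    while node != owner:
        node = ring[(ring.index(node) + 1) % len(ring)]  # forward to successor
        hops += 1
    return owner, hops

nodes = list(range(9, 150, 9))  # 16 evenly spaced nodes: 9, 18, ..., 144
owner, hops = linear_lookup(18, 109, nodes)
print(owner, hops)  # 117 11
```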
DHT Improvements
- Each node knows:
- Its successor
- Its successor+1
- Its predecessor
- A few others
- Shortcut Nodes:
- Often distributed exponentially
- O(logN) steps
- O(logN) neighbors
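With exponentially spaced shortcuts (Chord-style fingers), each hop can cover the largest power-of-two distance that still fits, so a lookup halves the remaining distance and finishes in O(log N) hops; same assumed 16-node ring as before:

```python
# Sketch: finger-table routing. Each node also knows nodes roughly
# 1, 2, 4, 8, ... positions ahead on the ring.
def finger_lookup(start, key_id, node_ids):
    ring = sorted(node_ids)
    n = len(ring)
    owner = next((x for x in ring if x >= key_id), ring[0])
    target = ring.index(owner)
    pos, hops = ring.index(start), 0
    while pos != target:
        dist = (target - pos) % n
        jump = 1
        while jump * 2 <= dist:   # largest power of two not exceeding dist
            jump *= 2
        pos = (pos + jump) % n
        hops += 1
    return owner, hops

nodes = list(range(9, 150, 9))    # 16 nodes: 9, 18, ..., 144
owner, hops = finger_lookup(18, 109, nodes)
print(owner, hops)  # 117 3  (vs. 11 hops with successor-only routing)
```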
Storage and Robustness
- Storage of K/V pair.
- On appropriate node.
- AND its successor.
- AND its predecessor.
Churning
- Nodes leave/join.
- Node leaves:
- Cleanly:
- sends a message to its immediate neighbors
- Crashes out:
- detected by lack of ping responses
- New Node joins:
- Looks up its own ID via any existing node to find its successor.
- Takes over the keys between its predecessor and itself.
- Neighbors update their successor/predecessor pointers.
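Re-partitioning under churn follows directly from the successor rule; a sketch reusing the earlier example IDs, showing how a join moves keys to the new node:

```python
# Sketch: keys live on their successor node; a join/leave just means
# recomputing ownership under the new node set.
def owner(key_id, node_ids):
    ring = sorted(node_ids)
    return next((n for n in ring if n >= key_id), ring[0])  # wrap around

def assign(keys, node_ids):
    table = {n: [] for n in node_ids}
    for k in keys:
        table[owner(k, node_ids)].append(k)
    return table

keys = [12, 45, 111, 149]
print(assign(keys, [10, 90, 140]))      # {10: [149], 90: [12, 45], 140: [111]}
print(assign(keys, [10, 30, 90, 140]))  # node 30 joins and takes over key 12
```

A clean leave is the reverse: the departing node's keys fall to its successor under the same rule.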
DHT Use Cases
- Leverage aggregate bandwidth.
- Great for exact “searches” (lookups).
- No good for vague or wildcard queries.
- Hashing a pattern (regexp) doesn't match the hashes of the stored keys.
- Great for large swarms, significant churn.
- Many implementations with mostly minor parameter and algorithm variations.
- Now used in other applications:
- File sharing/file systems (p2p)
- Databases
- Naming systems (DNS-like)
- CDNs
- Communications/messaging, Event notification
- Application-level multicast
- Pub/Sub frameworks
- Electronic Health Records