Peer-to-Peer (P2P) System
A decentralized system where each participant (peer) shares resources and has equal functionality.
Decentralization
The absence of a central controlling server, reducing single points of failure and improving scalability.
Overlay Network
A virtual network built on top of an existing network to facilitate routing and search in P2P systems.
GUID (Globally Unique Identifier)
A unique identifier used to locate objects in distributed systems, often derived by applying a hash function to the object or node.
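A minimal sketch of deriving a GUID by hashing, assuming SHA-1 and a 128-bit identifier space. The function name `guid_for` and both choices are illustrative and not taken from any particular P2P system.

```python
import hashlib


def guid_for(content: bytes, bits: int = 128) -> int:
    """Derive an integer GUID by hashing content (hypothetical helper)."""
    digest = hashlib.sha1(content).digest()  # SHA-1 yields a 160-bit digest
    # Keep only the top `bits` bits so the GUID fits the chosen ID space.
    return int.from_bytes(digest, "big") >> (160 - bits)


print(hex(guid_for(b"song.mp3")))
```

The same input always yields the same GUID, so any peer can compute where an object should live without asking a central server.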
Routing Overlay (RO)
A distributed algorithm that enables efficient location of objects and nodes in P2P networks.
Distributed Hash Table (DHT)
A structured overlay system that provides efficient lookup services, used in systems like Chord and Pastry.
Swarming
A file-sharing technique where pieces of a file are downloaded from multiple sources simultaneously, as used in BitTorrent.
Supernode
A peer with more bandwidth and processing power that helps manage connections and queries (e.g., in KaZaA).
Free-riding
When a user downloads resources without contributing anything back to the network.
Anonymity
Ensuring that providers and users of resources remain untraceable in a P2P network.
Routing Table
Keeps track of nodes and routes in structured P2P networks.
Index Server
Stores metadata about shared files (e.g., in Napster).
Tracker
Centralized server that helps coordinate file sharing in BitTorrent.
GUID Mapping
Associates a unique identifier with data or a node in the system.
Replication Mechanism
Ensures data is stored redundantly to improve availability.
What was a key flaw in Napster’s architecture?
Single point of failure due to central index server
Which P2P system introduced the concept of supernodes?
KaZaA
BitTorrent ensures that all peers download pieces of a file in sequential order.
False (It uses the rarest piece first strategy to improve distribution.)
Gnutella uses a centralized server to manage peer connections.
False (Gnutella is fully decentralized.)
Pastry and Tapestry use prefix routing to efficiently locate nodes and objects.
True
Why did Napster face legal issues and eventually shut down?
It relied on a centralized index server, which gave copyright holders a single, easily identified target to sue for enabling music piracy.
How does replication improve fault tolerance in P2P systems?
Replication involves storing multiple copies of data across different nodes. This ensures data remains accessible even if some nodes leave the network or fail.
How does Pastry handle node failures and ensure continued operation?
Pastry maintains a routing table with redundant entries and a leaf set of the nodes whose GUIDs are numerically closest to its own. If a node fails, queries are rerouted to the next-closest live node, so routing continues to work.
What are the main challenges in designing a fault-tolerant P2P system?
Handling high churn rates (frequent node joins and leaves), ensuring data availability, preventing malicious attacks, and maintaining efficient routing even as nodes fail.
What is a seeder?
A peer that has the entire file and shares it with others.
What is a leecher?
A peer that is still downloading parts of the file. Once a leecher finishes downloading, it can stay in the swarm as a seeder to help other peers.
Why does BitTorrent use the "Rarest Piece First" strategy, and how does it improve performance?
This strategy ensures that the least available pieces of a file are downloaded first. It prevents a situation where all peers wait for the same missing piece, improving file distribution and availability.
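The selection rule described above can be sketched as follows. This is a simplified illustration: the function name `pick_rarest_piece` and the set-based data layout are assumptions, and real clients add refinements that are omitted here.

```python
from collections import Counter


def pick_rarest_piece(needed, peer_pieces):
    """Return the needed piece held by the fewest peers, or None.

    needed: set of piece indices this peer still lacks.
    peer_pieces: dict mapping peer name -> set of piece indices it holds.
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces & needed)  # count only pieces we still need
    if not counts:
        return None  # nobody we know has anything we need
    # Fewest holders wins; ties are broken by the lowest piece index.
    return min(counts, key=lambda piece: (counts[piece], piece))


swarm = {"A": {0, 1}, "B": {0, 2}, "C": {0, 1}}
print(pick_rarest_piece({0, 1, 2}, swarm))  # piece 2 has a single holder
```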
How does a tracker improve peer coordination in BitTorrent?
When a new peer joins, it contacts the tracker to get a list of peers who have the file, allowing the new peer to begin downloading and sharing pieces.
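A toy in-memory model of the tracker's role, assuming a single `announce` call that both registers the caller and returns the peers already known for that file. The class and method names are hypothetical, and this does not reproduce the real tracker wire protocol.

```python
class Tracker:
    """Toy tracker: remembers which peers are in which swarm."""

    def __init__(self):
        self._swarms = {}  # file identifier -> set of peer addresses

    def announce(self, file_id, peer_addr):
        """Register peer_addr for file_id and return the other known peers."""
        swarm = self._swarms.setdefault(file_id, set())
        others = sorted(swarm - {peer_addr})
        swarm.add(peer_addr)
        return others


tracker = Tracker()
tracker.announce("file-1", "10.0.0.1:6881")
print(tracker.announce("file-1", "10.0.0.2:6881"))
```

Note that the tracker only hands out peer addresses. The file pieces themselves still travel directly between peers.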
How does IP routing work in P2P networks?
IP routing follows a hierarchical structure and is managed by ISPs, with fixed addresses and limited adaptability.
How does overlay routing improve upon IP routing in P2P networks?
Overlay routing creates a virtual network over IP, allowing dynamic routing, better load balancing, and higher fault tolerance.
What is prefix routing in P2P networks?
A method where each node maintains a routing table organized by GUID prefixes.
How does prefix routing optimize query forwarding in P2P networks?
Instead of flooding the entire network, queries are forwarded to nodes with GUIDs that share longer common prefixes with the target, reducing search time (e.g., in Pastry and Tapestry).
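A simplified sketch of the forwarding decision, assuming GUIDs are written as equal-length hexadecimal strings. The helper names are illustrative, and the numeric-closeness fallback that Pastry applies through its leaf set is left out.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two GUID strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current: str, target: str, known_nodes):
    """Pick the known node whose GUID shares the longest prefix with target.

    Returns None when no known node improves on the current node.
    """
    best = max(known_nodes, key=lambda n: shared_prefix_len(n, target), default=None)
    if best is None:
        return None
    if shared_prefix_len(best, target) <= shared_prefix_len(current, target):
        return None  # no progress possible with this simplified rule
    return best


print(next_hop("65a1", "d46a", ["d13d", "d4c2", "9f00"]))
```

Each hop extends the matched prefix by at least one digit, which is why the number of hops grows with the length of the GUID rather than with the size of the network.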
Why are Distributed Hash Tables (DHTs) essential in structured P2P networks?
DHTs provide scalable, efficient lookup by mapping objects to peers using GUIDs (Globally Unique Identifiers). A lookup takes O(log N) hops, compared with the linear cost of flooding in unstructured systems. DHTs also support fault tolerance through replication.
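A minimal sketch of the object-to-peer mapping, assuming a Chord-style rule in which a key belongs to the first node whose ID is greater than or equal to the key, wrapping around the ring. The function names and the 16-bit ID space are assumptions. The sketch uses a global view of all nodes, so it shows only the mapping and not the O(log N) hop-by-hop routing.

```python
import bisect
import hashlib


def key_id(name: str, bits: int = 16) -> int:
    """Hash a name into a small ID space (16 bits chosen for readability)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def responsible_node(node_ids, key: int) -> int:
    """Return the first node ID >= key, wrapping to the smallest ID."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key)
    return ring[i % len(ring)]


nodes = [10, 200, 4000]
print(responsible_node(nodes, key_id("song.mp3")))
```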
What is a structured P2P network, and how does it locate objects efficiently over an unstructured one?
A structured P2P network (e.g., Chord, Pastry) uses a Distributed Hash Table (DHT) to map each object's GUID to a responsible node, so a lookup takes O(log N) hops instead of the flooding an unstructured network needs.
How do unstructured P2P networks operate, and what makes their searches inefficient?
Unstructured P2P networks (e.g., Gnutella, BitTorrent) impose no fixed topology. Searches typically rely on flooding, gossiping, or random walks, which leads to inefficient searches, high network traffic, and no guarantees on data availability.
Why are unstructured P2P networks still widely used despite their challenges?
They require no strict organization, making them resilient to node churn (frequent joining and leaving of nodes).
How did Napster's architecture work?
Napster used a centralized index server where users registered file locations, but file transfers occurred directly between peers.
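A toy model of the central index, assuming a simple mapping from file name to the peers that registered it. The class and method names are hypothetical and do not reflect Napster's actual protocol.

```python
class CentralIndex:
    """Toy central index: stores who has which file, never the files."""

    def __init__(self):
        self._index = {}  # file name -> set of peer addresses

    def register(self, peer, filenames):
        """Record that `peer` offers each file in `filenames`."""
        for name in filenames:
            self._index.setdefault(name, set()).add(peer)

    def search(self, filename):
        """Return the peers offering `filename` via one dictionary lookup."""
        return sorted(self._index.get(filename, set()))


index = CentralIndex()
index.register("alice", ["a.mp3", "b.mp3"])
index.register("bob", ["b.mp3"])
print(index.search("b.mp3"))  # the download then goes peer to peer
```

If this one object disappears, no search can succeed, which is the single point of failure in this design.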
What were the key advantages and disadvantages of Napster's design?
The advantage was fast and efficient searches (O(1) complexity). However, it had a single point of failure, making it vulnerable to legal action, leading to its shutdown in 2001.
How did Gnutella improve upon Napster's design?
Gnutella removed the centralized index, allowing for a fully decentralized P2P system where queries were flooded through the network.
What was the main drawback of Gnutella’s design?
Gnutella had high search costs (O(N) complexity) and excessive network traffic due to flooding.
What is a supernode in P2P networks?
A high-bandwidth, high-performance peer that helps process and route search queries.
How did KaZaA use supernodes to improve efficiency?
In KaZaA, regular peers (leaves) connected to supernodes, reducing query flooding and improving search efficiency compared to Gnutella.
What problem did BitTorrent aim to solve?
BitTorrent addressed free-riding and inefficient downloads in P2P file-sharing networks.
How does BitTorrent's file distribution strategy work?
BitTorrent breaks files into smaller pieces and uses a swarming approach, where users download and share different pieces simultaneously, reducing bottlenecks and improving file availability.
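A small sketch of the piece-splitting step. The per-piece SHA-1 hash is an addition not stated in the card, included here so that a piece fetched from any peer can be checked on arrival. The function name and the tiny piece size are illustrative.

```python
import hashlib


def split_into_pieces(data: bytes, piece_size: int):
    """Split data into fixed-size pieces and hash each one."""
    pieces = [data[i:i + piece_size] for i in range(0, len(data), piece_size)]
    hashes = [hashlib.sha1(piece).hexdigest() for piece in pieces]
    return pieces, hashes


pieces, hashes = split_into_pieces(b"abcdefghij", 4)
print(pieces)  # the last piece may be shorter than piece_size
```

Because every piece is independent, a peer can request piece 0 from one neighbor and piece 2 from another at the same time, then reassemble them in order.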
What is node churn, and how does it impact P2P networks?
Node churn is the frequent joining and leaving of peers in a P2P network. It can reduce stability and search efficiency, because routing tables and data availability mechanisms must be updated constantly.
Flooding
A search method in unstructured P2P networks where each peer forwards a query to all of its connected peers, which forward it in turn, until a result is found or a time-to-live (TTL) limit is reached.
Time-to-Live (TTL) in P2P
A counter that limits the number of hops a query can take in a P2P network before being discarded.
Gossip Protocol
A decentralized method for sharing information in P2P networks where nodes randomly communicate updates to a subset of peers.
Erasure Coding
A technique in file distribution that allows a file to be reconstructed from a subset of encoded pieces even if some pieces are missing, improving fault tolerance and redundancy.
Replication Factor
The number of copies of an object or piece of data stored across different nodes to ensure availability and fault tolerance.
Deniability in P2P Networks
A privacy feature that allows peers to claim plausible deniability regarding hosting or transmitting specific data, reducing legal risks.
Why is flooding an inefficient search method in unstructured P2P networks?
Flooding generates excessive network traffic as queries are broadcast to all connected peers, leading to high bandwidth consumption and scalability issues.
How does Time-to-Live (TTL) help manage query propagation in P2P networks?
TTL limits the number of hops a query can take before being discarded, preventing excessive message traffic and improving efficiency.
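A simplified simulation of TTL-limited flooding on a small graph. The function name and the adjacency-list layout are assumptions, and in this toy version a peer also sends the query back to the neighbor it came from, which is counted as a wasted message.

```python
def flood_search(graph, start, holders, ttl):
    """Flood a query from `start` for at most `ttl` hops.

    graph: dict mapping peer -> list of neighbor peers.
    holders: set of peers that have the requested item.
    Returns (peers that answered, number of query messages sent).
    """
    seen = {start}
    frontier = [start]
    hits = set()
    messages = 0
    while frontier and ttl > 0:
        next_frontier = []
        for peer in frontier:
            for neighbor in graph[peer]:
                messages += 1  # every forwarded copy costs a message
                if neighbor in seen:
                    continue  # duplicate query is dropped
                seen.add(neighbor)
                if neighbor in holders:
                    hits.add(neighbor)
                next_frontier.append(neighbor)
        frontier = next_frontier
        ttl -= 1  # one hop consumed
    return hits, messages


line = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(flood_search(line, "A", {"D"}, ttl=2))  # D is three hops away
print(flood_search(line, "A", {"D"}, ttl=3))
```

A small TTL keeps traffic down but can miss items that exist just beyond the horizon, so the value is a trade-off between cost and coverage.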
How does the Gossip Protocol improve data propagation in P2P networks?
Instead of flooding the entire network, nodes randomly share updates with a subset of peers, reducing traffic while ensuring information spreads efficiently.
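A rough simulation of push-style gossip, assuming every informed node tells `fanout` randomly chosen nodes per round. The function name, the parameters, and the round-based model are illustrative simplifications.

```python
import random


def gossip_rounds(n_nodes: int, fanout: int, seed: int = 0) -> int:
    """Count rounds until an update starting at node 0 reaches every node."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes:
        newly_told = set()
        for _ in informed:
            # Each informed node pushes the update to `fanout` random nodes.
            newly_told.update(rng.sample(range(n_nodes), fanout))
        informed |= newly_told
        rounds += 1
    return rounds


print(gossip_rounds(50, fanout=2))
```

Each node sends only `fanout` messages per round, yet the informed set grows quickly at first because every newly informed node starts spreading the update too.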
What is the advantage of erasure coding over simple replication in P2P networks?
Erasure coding allows a file to be reconstructed from a subset of its encoded pieces, so it tolerates missing pieces with less storage overhead than keeping full replicas.
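The simplest possible illustration is a single XOR parity fragment: two data fragments plus one parity fragment survive the loss of any one of the three. This is a teaching sketch with hypothetical function names. Practical systems generally use stronger codes that tolerate more than one loss.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(a: bytes, b: bytes):
    """Return [a, b, parity]; a and b must have the same length."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return [a, b, xor_bytes(a, b)]


def recover(fragments) -> bytes:
    """Rebuild a + b when at most one fragment has been lost (None)."""
    a, b, parity = fragments
    if a is None:
        a = xor_bytes(b, parity)
    if b is None:
        b = xor_bytes(a, parity)
    return a + b


stored = encode(b"peer", b"2pee")
stored[0] = None  # the node holding the first fragment left the network
print(recover(stored))
```

Storage here is 1.5 times the original data, whereas keeping a second full copy to survive one loss would cost 2 times.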
Why is replication important in P2P networks, and how is the replication factor determined?
Replication ensures data availability even if some nodes fail. The replication factor is chosen based on the desired fault tolerance and network conditions.
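One way to reason about the choice is a back-of-the-envelope model, assuming each replica holder is offline independently with the same probability. Both that assumption and the function names are illustrative, and real node failures are often correlated.

```python
def availability(p_offline: float, replicas: int) -> float:
    """Probability that at least one of `replicas` copies is reachable."""
    return 1 - p_offline ** replicas


def min_replicas(p_offline: float, target: float) -> int:
    """Smallest replication factor whose availability meets `target`."""
    if not (0 <= p_offline < 1 and 0 <= target < 1):
        raise ValueError("p_offline and target must be in [0, 1)")
    r = 1
    while availability(p_offline, r) < target:
        r += 1
    return r


print(availability(0.5, 2))      # two copies, each offline half the time
print(min_replicas(0.5, 0.99))   # copies needed for 99% availability
```

Under this model, higher churn (a larger offline probability) pushes the required replication factor up quickly, which is the storage cost the cards mention.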
Structured P2P networks always guarantee data availability, regardless of node failures.
False (While structured networks improve efficiency, they still require replication to handle failures effectively.)
Flooding is an efficient search mechanism in large-scale P2P networks.
False (Flooding consumes excessive bandwidth and scales poorly.)
BitTorrent uses a central server to store files for distribution.
False (BitTorrent uses a decentralized swarming approach where peers share pieces of a file directly.)
Supernodes in KaZaA were used to optimize search efficiency.
True (Supernodes reduced the number of query messages by handling search requests on behalf of regular peers.)
In BitTorrent, a leecher is a peer that has the entire file and shares it with others.
False (A leecher is a peer that is still downloading the file, while a seeder has the full file and shares it.)
In P2P networks, replication increases fault tolerance but may increase storage overhead.
True (Storing multiple copies of data ensures availability but requires additional storage resources.)
A GUID in a P2P network is a human-readable identifier assigned to a file.
False (GUIDs are unique numerical identifiers used for locating objects in a distributed system.)
How does BitTorrent prevent free-riding among peers?
BitTorrent uses a tit-for-tat system, where peers prioritize uploads to those who contribute back to the network. Peers that do not share enough data receive slower download speeds.
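A minimal sketch of the reciprocity rule, assuming a peer tracks how many bytes each neighbor recently uploaded to it and serves only the top few. The function name, the `slots` parameter, and the tie-breaking are assumptions, and other parts of real choking algorithms are omitted.

```python
def choose_peers_to_serve(uploaded_to_us, slots=3):
    """Return the `slots` peers that uploaded the most to us.

    uploaded_to_us: dict mapping peer name -> bytes recently received from it.
    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(uploaded_to_us, key=lambda peer: (-uploaded_to_us[peer], peer))
    return ranked[:slots]


received = {"a": 10, "b": 0, "c": 50, "d": 5}
print(choose_peers_to_serve(received, slots=2))
```

Peer `b`, which contributed nothing, is never selected here, which matches the slower downloads that free-riders experience.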
What are the main differences between Gnutella and KaZaA?
Gnutella uses query flooding, leading to inefficient searches, whereas KaZaA introduced supernodes to optimize searches and reduce network overhead.
Why is anonymity an important concern in P2P networks, and how is it achieved?
Anonymity protects users from legal risks and censorship. It is achieved through techniques like onion routing, encryption, and decentralized indexing (e.g., Freenet).