Get Complete Project Material File(s) Now! »
Peer-to-Peer Networks
A Peer-to-Peer (P2P) network is a network in which peers have equal responsibility and capability, unlike in a conventional centralized system or a client/server system where a single server indexes data in a large scale system. P2P is an equalizing and decentralizing concept where all peers function as equal and there is no client and server distinction. By recognizing computers as peers in network, P2P enables direct exchange of resources and services without any server, as contents are dispersed among various peers in the network. If a particular data item is searched for, no single point is asked in the network. Instead the query is broadcast to all the peers in the network, and peers capable of answering the query respond [2, 4].
The P2P approach has a number of advantages over centralized storage system, some of which are described below:
Diversity and Equality: Peers have equal access to the network and are able to share any type of content in the network. Content in a P2P network is searched dynamically by asking as many peers possible in the network. Participation of peers in the network is very dynamic, because the peers change status rapidly [5].
Dynamics: In P2P networks, information is searched and downloaded fresh from the source where the information exist as compared to a centralized system which requires updating when the cached information is no longer valid [5].
Redundancy: Data in P2P networks are often redundant. Contents are spread at different peers in the network, with popular content existing at several peers at once. Peers automatically download and store contents of other peers, and there is no single potential point of failure in the network. When a node fails, other peers take charge to balance the load on the network. In contrast, if peers are organized in a centralized manner, taking down the central server disables entire network. In addition this, centralized systems are also hampered by drawbacks related to bandwidth bottlenecks [5].
Structured and Unstructured P2P Networks
P2P networks consist of peers as network nodes. Links exist between nodes in the network; if a participating peer knows the location of another peer, then there exists a directed edge between the two peers that know each other. Based on how nodes in the network are linked to each other, we classify the P2P network as structured or unstructured [6].
Unstructured P2P Networks
In unstructured P2P networks (see Figure 1) the links between nodes are formed arbitrarily. Peers in such networks can join at any time, may contact any node for integration, and copy existing links of other nodes and then form their own over time.
If a peer wants to find some piece of data in an unstructured p2p network, it uses a flooding mechanism in which the query message has to be broadcasted through the entire network to find as many peers as possible that share the data. Messages reach individual peers several times since more than one path exist to each peer. The peers in the network do not know where specific content might be located. Popular content is available at several peers and any peer searching for it will get the same results. Popular unstructured P2P networks include Gnutella[12] and FastTrack[11] [5].
Unstructured P2P networks do however suffer from some serious drawbacks, described below: Scalability: As there is no scheme imposed on the way peers join and leave the network, any peer can join and leave the network at any time, joining peers connecting to any peer already in the network. This makes the network grow in a non-optimal way, and searches cannot be performed efficiently. Information is searched by broadcasting a query over the network. Broadcasting a query also produces overhead traffic, since the query reaches the same peers many times and also reaches peers not capable of providing an answer [1, 5].
Lack of Search Guarantees: Searches for data merely reach a number of random peers, which does not guarantee an accurate result. This is especially true when peers are searching for rare data stored by only a few peers. There is then a greater chance that the search will be unsuccessful, since there is no guarantee that the peer with desired data will be found. As flooding broadcasts queries to all the nodes in the network, it causes a large amount of traffic in the network, which impairs search efficiency [1, 5].
Centralized Server Networks
In a centralized server network (see Figure 2), a peer searching for information contacts a centralized server, which provides links to peers providing the information. Despite the improvements in search performance, this network style is hampered by the vulnerability of a single point failure, bandwidth bottlenecks and the overhead of keeping the directory information up-to-date [1].
Super-Peer Networks
Super-peer networks offer a middle ground between unstructured P2P networks and centralized server networks by introducing hierarchy into the network in the form of super-peers (see Figure 3). Super-peers provide services to the leaf peers and they index contents of leaf peers assigned to them. Queries are broadcasted to super peers who forward them to the leaf peers if relevant. The search performance of super-peer networks is significantly better than P2P networks, and they also reduce the disadvantage of single point failure inherent in centralized server networks. However, super-peer networks put additional work load on super-peers and must be carefully constructed to work well. Peers in the network can become super-peers and take on more responsibilities than others. Still, there are no guarantees when it comes to the search process. The topology could also result in an inefficient network due to uncontrolled evolution [1, 5].
Structured P2P Networks (Deterministic Topologies)
Structured P2P networks give every node a global knowledge of the network, so that any node can route a search to a peer which has desired file, even if the file is extremely rare. Nodes in a deterministic topology have a limited view of the network consisting of a set of neighbors but at the same time know the overall topology. This can be used to reach locally optimized decisions when broadcasting and routing the query message. The most common type of structured P2P network is the distributed hash table (DHT), in which consistent hashing is used to assign ownership of a file to a particular peer (see Figure 4). Well known DHTs include Chord [14], Pastry [15] and CAN [16]. HyperCup [3] is also a structured P2P protocol [5, 7].
In P2P networks, nodes are connected to each in order to share information. In the HyperCircle topology, we state organization of such networks deterministically [1].
Network Model, Aims and Requirements
The HyperCircle topology aims to be symmetric. Every node in the network should have identical power and tasks. There is no central server, which precludes the prominence of some nodes over others. Peers that can send messages directly to each other are called neighbors. A minimum number of messages are to be broadcasted in order to reach all the peers in the network. Every node in the network should be able to be the root of the spanning tree. For load balancing, network traffic should be distributed equally among the peers. The topology should be redundant, with node failure not hampering the search and broadcast processes.
Organizing peers into a HyperCircle graph
Figure 5 depicts an 8-point HyperCircle graph. A complete HyperCircle graph consists of N = 8k nodes, which means that each point can in itself consist of an 8-point hypercircle. The network diameter is Δ = 2 * log8 8k, which gives the shortest path length between the nodes furthest away from each other. As can be inferred from this, the structure is symmetric with no nodes taking a more prominent position than others. This is crucial for load balancing. Any node can be the source of a broadcast, the root of a spanning tree, distributing the load equally.
Edges in the graph are labeled as follows: Node X is neighbor-i of node Z or (X = i(Z)). In Figure 5, node 6 is neighbor-0 of node 7 and vice versa. Edges in the graph are undirected; i.e. node 7 is also the neighbor-0 of node 6. Nodes in the network also have extended neighbors X = N(Z) = {z1, z2, z3 ….}, where N is the neighbor link set, which consists of a sequence of i-neighbors that X have to follow in order to reach node Z (and vice versa). In the Figure 5, the neighbor link set {0, 1, 2} leads from 0 to 3 and back from 3 to 0 using the same link {0, 1, 2}. Edge labels start at i = 0 and maximum number of neighbors is 3. Every peer maintains a small routing table, which consists of the neighboring peers’ Node IDs and IP addresses. A node is recognized by its ID and is reached by its address [2].
Search and Broadcast Algorithm
The following broadcast algorithm is proposed in [1]:
The node invoking the broadcast sends a message to all its neighbors, marking it with the edge label on which the message was sent. The receiving node will forward the message to a) neighbors-(0, 1) if it receives the message from its neighbor-2 or b) neighbors-(0, 2) if it receives it from its neighbor-1. Nodes receiving the message from their neighbor- 0 will not forward. After this second forward, if the circle consists of less than 5 nodes, no forwarding will stop. If the circle contains 5 nodes or more, forwarding will stop after the next step.
To further remove redundancy, an additional rule exists for when a circle contains 5 or 6 nodes. Forwarding will then only be done to the neighbor-0:s in the second step.
As an example, in Figure 6 node 2 initiates a broadcast, sending to its neighboring nodes 3, 4 and 6. Node 4 receives the message from its neighbor-2, forwarding to nodes 5 and 2 (neighbors-(0, 1)). Node 6 receives the message from its neighbor-1, so it forwards to nodes 7 and 1 (neighbors-(0, 1)). Node 3 will not forward since it receives the message from its neighbor-0.
In the spanning tree in Figure 6, all nodes receive the message exactly once. N – 1 messages are needed to reach all the nodes, requiring 2 * log8 8k steps to spread the message to every node [1, 2].
A search in the HyperCircle protocol is a broadcast with a time-to-live, i.e. a broadcast with a limited scope.
Constructing the HyperCircle Topology
The main idea of the HyperCircle topology is to manage nodes in n-dimensional space consisting of 8-point circles where each point can in itself consist of an 8-point circle. The topology is based on the following rules:
• Every circle has a maximum eight nodes.
• Every node has a maximum of three relationship described as neighbor-0, neighbor-1, and neighbor-2 with the other nodes in the same circle.
• Every node has neighbor-0. The neighbor-0 is the 180-degree neighbor, connected to the opposite side of the circle through the circle point. This relationship does not change unless the node or its neighbor-0 leaves the topology.
To achieve symmetry in the topology, any node in the topology can accept and integrate new nodes. When a node leaves the topology, a simulative node jump to cover the position of the departed node, prepared to give the position to a real node when new nodes join. The neighbor-0 of the departed node will take of the departed node’s network responsibilities until a new node takes its place [1].
The following steps are taken when a new circle is created:
Start: Peer 0 is alone in a newly opened circle.
Step a (Figure 7): Peer 1 wants to join the network, contacting peer 0. Peer 0 integrates the new peer as its neighbor-0, its first vacant position. The neighbor-0 vacancy is always filled first.
Table of contents :
1 Introduction
1.1 BACKGROUND
1.2 PURPOSE/OBJECTIVES
1.3 LIMITATIONS
1.4 THESIS OUTLINE
2 Theoretical Background
2.1 PEER-TO-PEER NETWORKS
2.2 STRUCTURED AND UNSTRUCTURED P2P NETWORKS
2.2.1 Unstructured P2P Networks
2.2.2 Centralized Server Networks
2.2.3 Super-Peer Networks
2.3 STRUCTURED P2P NETWORKS (DETERMINISTIC TOPOLOGIES)
3 The HyperCircle P2P Topology
3.1 NETWORK MODEL, AIMS AND REQUIREMENTS
3.2 ORGANIZING PEERS INTO A HYPERCIRCLE GRAPH
3.3 SEARCH AND BROADCAST ALGORITHM
3.4 CONSTRUCTING THE HYPERCIRCLE TOPOLOGY
3.5 TOPOLOGY MAINTENANCE ALGORITHM
3.5.1 Integration Dimension Selection
3.5.2 Integration Champion Node Appointment
3.5.3 Node Integration
3.5.4 Node Departure
3.5.5 Broadcast and Search in an Incomplete Hypercircle
4 Implementation
4.1 SELECTION OF SIMULATION SOFTWARE
4.1.1 P2PSim
4.1.2 Overlay Weaver
4.1.3 OverSim
4.2 THE OVERSIM SIMULATOR
4.2.1 Flexibility
4.2.2 Scalability
4.2.3 Interchangeable Underlying Network Models
4.2.4 Interactive GUI
4.2.5 Base Overlay Class
4.2.6 Reuse of Simulation Code
4.2.7 Statistics
4.3 LAYER STRUCTURE OF THE HYPERCIRCLE IMPLEMENTATION
4.4 THE HYPERCIRCLE CLASS DIAGRAM
4.4.1 Extensions to the HyperCircle Algorithm
4.4.2 Code examples
4.5 SIMULATION
4.5.1 Simulation Parameters
4.5.2 Statistics Gathering
5 Results
5.1 LIMITATIONS OF THE PROTOCOL IMPLEMENTATION
5.2 SIMULATION RUN SETUP
5.3 SIMULATION RESULTS
6 Conclusion and Future Work
7 References