Social Network Systems
A social network system (1) is a platform where people connect with each other by sharing common friends, interests, likes and dislikes, and by exchanging knowledge. Nowadays, Facebook, LinkedIn, and Twitter are some of the best-known social network systems in the world. With the help of such systems, people are able to communicate more rapidly. For example, LinkedIn is one of the top professional social networks, and it helps many people find jobs according to their expertise. Similarly, Facebook and Twitter are mostly used for sharing knowledge and content among friends.
Recommendation System
With such an immense amount of information available, it is a big challenge for users to find the content they want. A recommendation system is a mechanism that filters out unwanted content for the user. Some well-known recommendation systems are MovieLens, Last.fm, and StumbleUpon (2). Well-known e-commerce websites such as Amazon and eBay also use different kinds of recommendation systems to help users find the products they want.
There are two main types of recommendation systems: content-based and collaborative filtering. The following sections go through the different techniques used in recommendation systems.
Content based recommendation systems
Content-based recommendation systems (3) recommend an item to a user based on a description of the item and a profile of the user’s interests. They may be used in a variety of domains, such as recommending web pages, news articles, restaurants, television programs, and items for sale. The main disadvantage of a content-based recommendation system is that the user might miss the opportunity to discover new and interesting things because of a “lack of diversity” (4).
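The matching of item descriptions against a user profile can be sketched as follows. This is only a minimal illustration, not the method of any system cited here: the item names, the keyword sets, and the choice of Jaccard overlap as the score are all assumptions made for the example.

```python
# Hypothetical item descriptions, reduced to keyword sets.
items = {
    "article_a": {"python", "recommender", "ml"},
    "article_b": {"cooking", "pasta"},
    "article_c": {"ml", "privacy"},
}

def score(profile, keywords):
    """Jaccard overlap between the user's interests and an item's keywords."""
    union = profile | keywords
    return len(profile & keywords) / len(union) if union else 0.0

def recommend_items(profile, items, top_n=2):
    """Return the names of the top_n items that best match the profile."""
    ranked = sorted(items, key=lambda name: score(profile, items[name]), reverse=True)
    return ranked[:top_n]

# Hypothetical user profile: a set of declared interests.
profile = {"ml", "python"}
top = recommend_items(profile, items)
```

Because the score depends only on overlap with what the user already likes, items outside the profile (here "article_b") never surface, which is the lack of diversity mentioned above.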
Collaborative Filtering (CF) Recommendation Systems
Collaborative filtering (CF) is the most widely recognized and successful recommendation technique, and it also overcomes the problems of content-based filtering. Collaborative filtering (6) is the process of filtering or evaluating items through the opinions of other people. CF technology brings together the opinions of large interconnected communities on the web, supporting the filtering of substantial quantities of data.
Memory-based CF
This mechanism uses user rating data to compute the similarity between users or items. It is an early mechanism that has been used in many commercial systems (7). It is easy to implement and effective.
The advantages of this approach include: ability to explain the results, which is an important aspect of recommendation systems; it is easy to create and use; new data can be added easily and incrementally; it does not need to consider the content of the items being recommended; and the mechanism scales well with co-rated items.
There are several disadvantages of this approach. Firstly, it depends on human ratings. Secondly, its performance decreases when the data is sparse, which is frequent with web-related items; this limits the scalability of the approach and causes problems with large datasets. Thirdly, it cannot handle new users or new items.
Model-based CF
Models are developed using data mining and machine learning algorithms to find patterns in training data, and are then used to make predictions for real data. There are many model-based CF algorithms, including Bayesian networks (8), clustering models (8), latent semantic models such as singular value decomposition, probabilistic latent semantic analysis, and latent Dirichlet allocation (9), and Markov decision process based models (9).
There are several advantages to this paradigm. It handles sparsity better than memory-based approaches. It scales better to large datasets. It improves prediction performance. Finally, it gives an intuitive rationale for the recommendations.
The disadvantage of this approach is the expensive model building. One needs to find a tradeoff between prediction performance and scalability. A number of models also have difficulty explaining their predictions.
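To make the idea of a learned model concrete, the sketch below fits a small latent-factor model to known ratings by stochastic gradient descent. It is in the spirit of the latent semantic models listed above but is not an exact singular value decomposition. The data, the hyperparameters, and the function names are assumptions chosen for illustration.

```python
import random

def train_mf(ratings, n_factors=2, epochs=500, lr=0.02, reg=0.02, seed=0):
    """Learn user and item factor vectors from (user, item, rating) triples."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularisation on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, user, item):
    """Predicted rating: dot product of the user and item factors."""
    return sum(pu * qi for pu, qi in zip(P[user], Q[item]))

def mse(ratings, P, Q):
    """Mean squared error over the known ratings."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)

# Hypothetical training data: (user, item, rating on a 1-5 scale).
data = [("u1", "i1", 5), ("u1", "i2", 3), ("u2", "i1", 4),
        ("u2", "i3", 1), ("u3", "i2", 2), ("u3", "i3", 5)]
P, Q = train_mf(data)
```

Once trained, `predict(P, Q, "u1", "i3")` gives an estimate for a pair that has no rating in the data. The repeated passes over the data reflect the model-building cost noted above.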
User-based CF
The main aim of user-based CF is to identify like-minded users based on their similarities. When a user rates an item, the system finds other users who have shown interest in the same item to build the user’s neighborhood. The user can then be recommended the items rated highly by these neighbors. User-based CF usually makes predictions based on a user-item matrix.
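A minimal sketch of this neighborhood idea is given below. It assumes cosine similarity over co-rated items and a similarity-weighted average for the prediction, which are common choices but are not prescribed by the text. The users, items, and ratings are hypothetical.

```python
from math import sqrt

# Hypothetical user-item matrix: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"item1": 5, "item2": 3, "item3": 4},
    "bob":   {"item1": 4, "item2": 3, "item3": 5, "item4": 4},
    "carol": {"item1": 1, "item2": 5, "item4": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item, ratings):
    """Similarity-weighted average of the neighbours' ratings for the item."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += sim
    return num / den if den else None

# alice has not rated item4; her neighbours bob and carol have.
estimate = predict("alice", "item4", ratings)
```

Here bob, whose ratings resemble alice's more closely than carol's do, pulls the estimate for "item4" toward his own rating.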
Item-based CF
Item-based collaborative filtering assumes that a user is likely to have the same opinion of similar items. For example, someone who likes Canon digital (or perhaps still) cameras might also like Canon video cameras. The similarity between items is derived from how other users have rated them.
The main advantages of item-based CF compared to user-based CF are:
• It reduces the cold-start problem for new users.
• It improves scalability (similarity between items is more stable than similarity between users, because a user might change his/her interests over time).
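The item-to-item view can be sketched in the same style. The example below compares items by the ratings they received from the same users, again using cosine similarity as an assumed measure. The item names echo the camera example above and the ratings are invented.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "u1": {"canon_still": 5, "canon_video": 4, "tripod": 1},
    "u2": {"canon_still": 4, "canon_video": 5},
    "u3": {"canon_still": 1, "canon_video": 2, "tripod": 5},
}

def item_vector(item, ratings):
    """Ratings an item received, keyed by the user who gave them."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def item_similarity(i, j, ratings):
    """Cosine similarity between two items over the users who rated both."""
    vi, vj = item_vector(i, ratings), item_vector(j, ratings)
    common = set(vi) & set(vj)
    if not common:
        return 0.0
    dot = sum(vi[u] * vj[u] for u in common)
    norm_i = sqrt(sum(vi[u] ** 2 for u in common))
    norm_j = sqrt(sum(vj[u] ** 2 for u in common))
    return dot / (norm_i * norm_j)

def most_similar(item, ratings):
    """Other items sorted by decreasing similarity to the given item."""
    all_items = {i for r in ratings.values() for i in r}
    scored = [(item_similarity(item, other, ratings), other)
              for other in all_items if other != item]
    return sorted(scored, reverse=True)

neighbours = most_similar("canon_still", ratings)
```

Since item-to-item similarities change slowly, a system can precompute them offline, which is one reason for the scalability advantage listed above.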
Hybrid CF
A number of applications combine memory-based and model-based CF algorithms. These hybrids overcome the limitations of native CF approaches and improve prediction performance. Most importantly, they overcome CF problems such as sparsity and loss of information. However, they have increased complexity and are expensive to implement.
Limitation of CF
Amazon uses a CF technique to recommend products that people might be interested in. A recent study (10) examined the quality of this online retailer's recommendations. Amazon usually provides good recommendations, but their quality is affected by several known problems of collaborative filtering systems, as well as by limitations of the underlying algorithms used by Amazon.com to enhance the system’s understanding of each user. In combination with these factors, various weaknesses in the Amazon.com user interface contribute to users providing incorrect information about their preferences. This affects the quality of recommendations, decreasing users’ perception of the system’s usefulness as well as their trust in the recommendations, a critical risk when competitors are only a click away.
From the case study, some of the limitations and risks are listed below:
• One of the big problems is popularity bias: for example, if a user likes a book that a lot of other users also liked, this rating does not help the system learn much about that user (10).
• Noise can be introduced into the data by careless users (natural noise) or by users trying to promote or demote products via ratings and reviews (malicious noise).
• Data sparsity: it is unlikely that any two users have rated many of the same items, which makes it difficult to calculate the degree of similarity between users and limits the range of recommendation partners that can be evaluated (10).
• Two users might share a similar interest in web design, but might not share the same interest in the impact of culture on web design. In that case the system has matched these users based on inadequate data.
• New users are likely to get unsatisfactory recommendations (10) because they have not provided any personal information or ratings, so the system has no data on which to base recommendations and cannot accurately evaluate the user’s closest neighbors.
• Gift-giving: collaborative filtering systems do not have a computational model capable of recognizing two distinct interests in a user’s profile (10).
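The data sparsity limitation can be made concrete with a small calculation. The sketch below measures how empty a hypothetical rating matrix is and which items two users have both rated. The data and function names are assumptions for illustration only.

```python
# Hypothetical ratings for 3 users over a catalogue of 5 items.
ratings = {
    "u1": {"a": 4, "b": 5},
    "u2": {"c": 3},
    "u3": {"b": 2, "d": 5},
}
N_ITEMS = 5

def sparsity(ratings, n_items):
    """Fraction of user-item cells that hold no rating."""
    filled = sum(len(r) for r in ratings.values())
    return 1 - filled / (len(ratings) * n_items)

def co_rated(a, b, ratings):
    """Items rated by both users; similarity can only be computed on these."""
    return set(ratings[a]) & set(ratings[b])

s = sparsity(ratings, N_ITEMS)       # 5 of 15 cells are filled
shared = co_rated("u1", "u2", ratings)
```

With no co-rated items between "u1" and "u2", a memory-based method has nothing on which to base a similarity for that pair, and a new user with an empty rating dictionary is in the same position with every other user.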
Social networks and enterprise recommender systems
Enterprise recommendation solution
The Internet enables individuals to maintain existing social ties and develop new ones with the people who share similar interests (11). The emergence of the social web introduces new opportunities for people to interact and discover those with similar interests. As the users of the social web join online communities and contribute content (as in wikis and blogs) and metadata (such as tags, comments, and ratings), new ways of forming and maintaining relationships are becoming possible.
Studies of social network systems (SNSs) have found that people primarily connect to individuals they already know, and are less likely to approach strangers to initiate and maintain a connection (12). SNSs have also emerged within enterprises. Research indicates that, beyond staying in touch with close colleagues, employees use enterprise SNSs to reach out to employees they do not know and to build stronger bonds with their weak ties. Their motivations include connecting on a personal level with more coworkers, advancing their career within the company, and campaigning for their ideas (13). The same study also recommends that “enterprise social software specifically supports users in discovering new colleagues through exploration and searching around common interests.” Most of the recommendations are based on two of the core elements of social media: people and tags. Relationship information among people, tags, and items is collected and aggregated across different sources within the enterprise. Based on these aggregated relationships, the system recommends items related to people and tags that are related to the user. Each recommended item is accompanied by an explanation that includes the people and tags that led to its recommendation, as well as their relationships with the user and the item.
Comparison between existing enterprise solutions
Ido Guy, Sigalit Ur, Inbal Ronen, Adam Perer, and Michal Jacovi (14) propose an approach for recommending strangers in an enterprise system with whom the user shares similar interests. They aim at bringing new people to the user, in contrast to exploration and search among the user's neighbors. They argue that connecting to strangers within the organization can be valuable for employees in many ways, such as getting help or advice (15), reaching opportunities beyond those available through existing ties (16), discovering new routes for potential career development, and learning about new projects and assets. Like their work, our approach helps end users get recommendations of strangers based on similar attributes and tastes. However, people are often not interested in receiving friend requests from people they do not know at all; they mostly like to connect with friends of friends who share the same interests. Our enterprise solution addresses this issue by looking for similarity in users' organizations. Moreover, our approach also finds similarity where people have acted on common content, such as commenting on or liking the same blog post.
The social aggregation system SAND (17) suggested tag-based recommendations, highlighting the value of tags as concise and accurate content descriptors that take into account human perceptions of the content (18). Their approach did not use any explicit input to the system, such as ratings or likes. Our system, in contrast, depends heavily on an explicit trust value among neighbors, which makes it more reliable for enterprise users. We also deal successfully with the cold-start problem for new users (19). Tagging is normally used as free text in most systems and does not always reflect what users want. Our system also uses the user-tag relationship, but we give a low weight to our tagging adapter. We give high priority to user input to the system, which might reduce the performance of the system but provides more accurate and reliable output to the users.
Trust and privacy in enterprise recommendation systems
Users in a social network system need to express their relationships with other users, and these relationships are stored as information in the system. This information leads us to the social notion of trust, which helps users find trustworthy friends and share their preferences for an item such as a movie or a piece of music. This is also due to the fact that users tend to accept recommendations from partners they trust (20).
Trust plays a crucial role in many research areas such as psychology, philosophy, sociology, and computer science. It is difficult to define the word “trust” clearly, as it is perceived differently by every person. Normally we believe in something or someone based on our knowledge about them, and when that belief reaches a certain level it becomes trust.
Trust has two main components: belief and commitment. The first reflects one's feeling towards something or someone, while the second shows the bond (connectivity) towards it. Collaborative filtering (CF) generally gives recommendations based on similarity between users, but a similarity measurement is not sufficient when user profiles are sparse. How similar two users are is also connected to how much they trust each other. An analysis of data from FilmTrust (21), a site where users read about movies, rate them, and write reviews, shows that there is a correlation between similarity and trust (22). Hence, trust can be considered a measure for expressing the relationship between two users in recommendation systems.
Massa and Avesani presented an architecture for a trust-aware recommender in which the “web of trust” is explicitly expressed by the users (23). They showed that trust can be aggregated over all of the users in a social network, and that the importance of a certain user can be predicted by using a graph-walking algorithm.
In our system, we define trust between two users as a strong bond between them. This bond is computed using different metrics (as explained later) based on either the users' activity (blog writing, liking, or commenting) or their profiles (organization, region, interests, or skills). We compute this trust as a weighted average of the correlations between two users under the different metrics that are defined below (Chapter 3). A very basic definition that explains the trust computation in our system at a high level is as follows:
$$\mathrm{Trust}_{u,v} = \mathrm{Avg}\left( \sum_{i=1}^{N} \mu_i \, \mathrm{Cor}_i(u, v) \right)$$
where u and v are the users between whom we find trust, Cor_i(u, v) represents the correlation between the users based on a specific correlation function, and N represents the number of correlation functions. In the above formula, µ_i represents the weight we give to each correlation function. The correlation functions are explained in Chapter 3.
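The weighted-average computation described above can be sketched as follows. The correlation names, their values, and the weights are hypothetical, and dividing by the sum of the weights is an assumption about how the average is normalized. It is not taken from the source.

```python
def trust(correlations, weights):
    """Weighted average of per-metric correlations between two users.

    correlations: metric name -> correlation value for the pair (u, v)
    weights:      metric name -> weight (mu) given to that metric
    Normalising by the total weight is an assumption made for this sketch.
    """
    total_weight = sum(weights[name] for name in correlations)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(weights[name] * value for name, value in correlations.items())
    return weighted_sum / total_weight

# Hypothetical correlation values for one pair of users, and hypothetical weights.
correlations = {"interests": 0.8, "skills": 0.5, "friends": 0.2}
weights = {"interests": 2.0, "skills": 1.0, "friends": 1.0}

value = trust(correlations, weights)  # (1.6 + 0.5 + 0.2) / 4.0
```

Raising the weight of one metric moves the result toward that metric's correlation, which is how a system of this kind can give more priority to some signals than to others.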
Due to the huge exposure of personal information, a challenge now is to design effective privacy mechanisms that protect users' information against unauthorized access. Nowadays, different social network systems use different trust models that exploit the underlying inter-entity trust information (24). The objective of such a privacy scheme is to ensure that a user’s online information is disclosed only to sufficiently trustworthy parties.
In our paper, we have defined our privacy protocol so that users have full freedom to block themselves from the recommendation system, which lets them keep their presence in the network private. In addition, they have the option to make their profiles visible according to the different levels of trust they define in their profile settings.
Meven: An Enterprise Credibility Aware Recommender
Meven is an enterprise recommendation system based on content (user profiles) with a trust and privacy control policy. The idea is to give social networks the ability to quickly find related information about users who have a similar attitude to the current user. Users are also able to set privacy metrics on their profiles so that they do not get recommendations of those they consider irrelevant; this is achieved by the privacy metrics (defined below). The following section explains our system in detail, along with a description of its components.
System Specification
This section explains the overall system in detail from a conceptual perspective. As mentioned in the introduction, our system recommends users based on different correlation functions and then computes trust as a weighted average of these correlation functions, so this section also explains these correlation functions.
We will first see how “Meven” works and later we will explain each individual component.
Functional flow of Meven
The Meven engine performs the recommendation when a user visits the site and logs in with their account. Because the engine is exposed as a web service, the portal makes a call to the engine, and the engine performs the following tasks.
1. The system first gets the current user's profile, which contains his/her regional and organizational information along with interests, skills, and blog-related information.
2. The system checks whether the user has any friends or not (cold start).
3. When a user has no friends (cold start), the system continues as follows:
a. Find users (neighborhood cluster) based on the following:
i. Region
ii. Occupation
4. When a user has friends, the system continues as follows:
a. Find the friends of the most trusted friends, up to a threshold of K {K=15}.
i. Repeat until the threshold value is reached or there are no friends left
b. If the number of selected users does not meet the threshold, find users (neighborhood cluster) based on the following:
i. Region
ii. Occupation
c. Combine the result of ‘a’ and ‘b’ to prepare neighborhood cluster
5. Filter profiles based on the privacy metrics (remove all profiles whose users have indicated that their profiles should not be used in recommendation).
6. Find the user correlation between the current user and the selected neighborhood cluster based on the following:
i. Memberships in similar Groups/Communities
ii. Similar Interests
iii. Skills
iv. Shared Friends
v. Similar Blogs (Blog Posts, Likes, Comments, Category, Tags)
7. Calculate the trust (implicit) between the current user and the neighborhood cluster users by taking the weighted average of the correlations computed above.
8. Filter the users from the previous step, keeping a user ‘x’ only if the explicit trust value defined by ‘x’ is equal to or greater than the trust computed between ‘x’ and the current user (see Explicit Trust below).
9. Return the chosen users in descending order of computed trust.
The above description explains how Meven works when recommending a list of users to the current user. The following sections explain the different adapters (helpers that fetch and normalize profile information from the database) and correlation functions (helpers that find the closeness or similarity between users based on some features).
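The functional flow above can be sketched in a few functions. This is a simplified illustration under several assumptions, not Meven's actual implementation: the profile fields, the ordering of each friends list from most to least trusted, the exclusion of existing friends from the candidates, and the stand-in trust function (interest overlap scaled to the range 1 to 4) are all hypothetical. The filter in step 8 follows the Explicit Trust rule and the James and Alina example in this document, where a user is kept when their explicit trust is at least the computed trust.

```python
K = 15  # neighbourhood threshold from step 4a

def same_region_or_occupation(me, users):
    """Steps 3a / 4b: users sharing the current user's region or occupation."""
    mine = users[me]
    return [name for name, p in users.items()
            if name != me and (p["region"] == mine["region"]
                               or p["occupation"] == mine["occupation"])]

def friends_of_friends(me, users, k=K):
    """Step 4a: friends of friends, visiting friends in their listed order."""
    found = []
    for friend in users[me]["friends"]:  # assumed ordered, most trusted first
        for fof in users[friend]["friends"]:
            if fof != me and fof not in users[me]["friends"] and fof not in found:
                found.append(fof)
                if len(found) == k:
                    return found
    return found

def recommend(me, users, trust_fn, k=K):
    if not users[me]["friends"]:                                   # step 3
        cluster = same_region_or_occupation(me, users)
    else:                                                          # step 4
        cluster = friends_of_friends(me, users, k)
        if len(cluster) < k:
            for name in same_region_or_occupation(me, users):
                if name not in cluster and name not in users[me]["friends"]:
                    cluster.append(name)
    cluster = [n for n in cluster if not users[n]["hidden"]]       # step 5
    scored = [(trust_fn(users[me], users[n]), n) for n in cluster]  # steps 6-7
    kept = [(t, n) for t, n in scored if users[n]["explicit_trust"] >= t]  # step 8
    return [n for t, n in sorted(kept, reverse=True)]              # step 9

def interest_trust(a, b):
    """Stand-in for the weighted correlations: interest overlap scaled to 1..4."""
    union = a["interests"] | b["interests"]
    overlap = len(a["interests"] & b["interests"]) / len(union) if union else 0.0
    return 1 + 3 * overlap

def profile(region, occupation, interests, friends, hidden=False, explicit_trust=4):
    return {"region": region, "occupation": occupation, "interests": set(interests),
            "friends": friends, "hidden": hidden, "explicit_trust": explicit_trust}

users = {
    "james": profile("EU", "engineer", ["ml", "java"], ["maria"]),
    "maria": profile("US", "designer", ["ux"], ["james", "alina", "omar"]),
    "alina": profile("US", "analyst", ["ml", "java", "sql"], ["maria"], explicit_trust=1),
    "omar":  profile("US", "manager", ["ml"], ["maria"]),
    "zed":   profile("EU", "lawyer", ["ml", "java"], [], hidden=True),
    "lena":  profile("EU", "nurse", ["java", "art", "music"], []),
}

for_james = recommend("james", users, interest_trust)
for_lena = recommend("lena", users, interest_trust)  # cold start: no friends
```

In this data, "zed" is dropped in step 5 because the profile is hidden, and "alina" is dropped in step 8 because her explicit trust level of 1 is lower than the computed trust.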
Implicit Trust
« Implicit Trust » is the trust between user X and user Y that the system computes by measuring the correlation between the users. The value (between 1 and 4) is generated by the system, taking into account the different correlations between the users. See section 2.6.1 for more details on how trust is computed in our system.
Explicit Trust
In our system, we define Explicit Trust as a measure by which a user can explicitly set the criteria under which the recommendation system may use their profile. The idea is that users can specify to whom they should be recommended by choosing a level of trust (which we call Explicit Trust). Our system has the following levels of Explicit Trust:
Level 4 = Users who want to be recommended to everyone
Level 3 = Users who want to be recommended to the majority of users
Level 2 = Users who want to be recommended to limited users
Level 1 = Users who want to be recommended to very limited users
The system recommends a user X to the current user only if the Explicit Trust defined by X is equal to or greater than the trust value computed by the system (Implicit Trust).
For example: « The system is generating recommendations for a user named James, and it computes a trust of 2.156 between James and Alina (Implicit Trust). Alina has defined her Explicit Trust value as 1 (she wants to be recommended to very limited users). Because 1 is less than 2.156, the system excludes Alina from the list of recommendations generated for James. »
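The rule and the example above reduce to a single comparison, sketched below. The function name is hypothetical, and the numbers are the ones from the James and Alina example.

```python
def is_recommendable(explicit_trust, implicit_trust):
    """A user may be recommended when their chosen level covers the computed trust."""
    return explicit_trust >= implicit_trust

# Alina chose level 1; the system computed 2.156 between James and Alina.
alina_to_james = is_recommendable(1, 2.156)

# A user who chose level 4 would be recommended at the same computed trust.
level4_to_james = is_recommendable(4, 2.156)
```

Since the computed value lies between 1 and 4, a level 4 user passes this check for every computed trust, which matches the description of level 4 as being recommended to everyone.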
Privacy Protocol
We have defined our privacy protocol so that users have full freedom to block themselves from the recommendation system, which lets them keep their presence in the network private. In addition, they have the option to make their profiles visible according to the different levels of trust they define in their profile settings.
Adapters
We define the helper components that mine information from the database as adapters. We have used three adapters to fetch information from the database, as follows.
Organization Adapter
During a cold start (when the user does not have any friends or content), or when the user has only a few friends, the system needs to use the Organization Adapter. It finds a neighborhood cluster (a cluster of users whom the system considers close to the current user) based on the workplace (company or employer) of the current user.
Regional Adapter
During a cold start (when the user does not have any friends or content), or when the user has only a few friends, the system also needs to use the Regional Adapter to find another neighborhood cluster (a cluster of users whom the system considers close to the current user) based on the region (address, area, or country) of the current user.
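The two adapters above can be sketched as simple database lookups. The example uses an in-memory SQLite database from the Python standard library. The table name, its columns, and the sample rows are assumptions for illustration and do not reflect Meven's actual schema.

```python
import sqlite3

# Hypothetical profile table; the real schema is not given in this section.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profiles (name TEXT PRIMARY KEY, organization TEXT, region TEXT)"
)
conn.executemany(
    "INSERT INTO profiles VALUES (?, ?, ?)",
    [("james", "Acme", "Paris"), ("alina", "Acme", "Berlin"),
     ("omar", "Globex", "Paris"), ("lena", "Globex", "Oslo")],
)

def organization_adapter(conn, user):
    """Neighborhood cluster: other users with the same workplace as the user."""
    rows = conn.execute(
        "SELECT name FROM profiles "
        "WHERE organization = (SELECT organization FROM profiles WHERE name = ?) "
        "AND name != ? ORDER BY name",
        (user, user),
    )
    return [row[0] for row in rows]

def regional_adapter(conn, user):
    """Neighborhood cluster: other users in the same region as the user."""
    rows = conn.execute(
        "SELECT name FROM profiles "
        "WHERE region = (SELECT region FROM profiles WHERE name = ?) "
        "AND name != ? ORDER BY name",
        (user, user),
    )
    return [row[0] for row in rows]

by_org = organization_adapter(conn, "james")
by_region = regional_adapter(conn, "james")
```

Keeping each adapter behind the same call shape (a connection and a user name in, a list of names out) makes it straightforward to merge their results into one neighborhood cluster.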
Table of contents :
1. Introduction
1.1 Meven
1.2 Motivation
1.3 Contribution
1.4 Outline
2 Related Work
2.1 Social Network Systems
2.2 Recommendation System
2.3 Content based recommendation systems
2.4 Collaborative Filtering (CF) Recommendation Systems
2.4.1 Memory-based CF
2.4.2 Model-based CF
2.4.3 User-based CF
2.4.4 Item-based CF
2.4.5 Hybrid CF
2.4.6 Limitation of CF
2.5 Social networks and enterprise recommender systems
2.5.1 Enterprise recommendation solution
2.5.2 Comparison between existing enterprise solutions
2.6 Trust and privacy in enterprise recommendation systems
3 Meven: An Enterprise Credibility Aware Recommender
3.1 System Specification
3.2 Functional flow of Meven
3.3 Implicit Trust
3.4 Explicit Trust
3.5 Privacy Protocol
3.6 Adapters
3.6.1 Organization Adapter
3.6.2 Regional Adapter
3.6.3 Friends-Of-Friend Adapter
3.7 Correlation functions
3.7.1 Interest Correlation Function (Pattern Matching)
3.7.2 Skills Correlation Function (Tag comparison)
3.7.3 Membership Correlation Function (Relational Pattern Matching)
3.7.4 Friends Correlation Function (Shared friends; Pattern Matching)
3.7.5 Blog Correlation Function
3.8 Explanation Module
4 Design and Implementation
4.1 System Architecture
4.2 Software Design
4.2.1 Implementation Schema (Class diagrams)
4.3 Execution (Flow diagrams)
4.4 Data Base Design and Schema
4.5 System Implementation (tools and applications)
4.6 Screenshot of Meven
5 Evaluation
5.1 Setup (Dataset)
5.2 Accuracy Tests
5.2.1 Cold-Start Accuracy
5.2.2 Normal Flow Accuracy (When Users have friends in the system)
5.3 Caching and Response Time Evaluation
5.4 Influence of Privacy metrics
5.4.1 Test 1 Influence of Hidden Users on recommendations
5.4.2 Test 2 Influence of Explicit Trust on Recommendations
5.5 Trust Related Evaluation
5.5.1 Test # 1 Average Trust based on Correlation Functions
5.5.2 Test # 2 Average Trust (cold-start) Vs Average Trust (normal flow)
5.6 Frequency Distribution of Implicit Trust (Dynamic)
5.7 Frequency Distribution of Recommendation with Explicit Trust (Static)
6 Future Works
7 Conclusion
8 Bibliography
9 Appendix A
9.1 WSDL
9.2 Messages
9.3 Get Recommendation Service Request
9.4 Get Recommendation Service Response
9.5 Clear Meven Cache Request
9.6 Clear Meven Cache Response