Cold-start Problem in Collaborative Recommender Systems: Efficient Methods Based on Ask-to-rate Technique

Journal of Computing and Information Technology - CIT 22, 2014, 2, 105–113
doi:10.2498/cit.1002223

Mohammad-Hossein Nadimi-Shahraki and Mozhde Bahadorpour
Faculty of Computer Engineering, Najafabad branch, Islamic Azad University, Najafabad, Iran

Collaborative filtering is the best-known approach to developing a recommender system; it considers the ratings of users who have similar rating profiles or rating patterns. It can compute the similarity of users only when users have expressed enough ratings. A major challenge of the collaborative filtering approach is therefore how to make recommendations for a new user, which is called the cold-start user problem. To solve this problem, a few efficient methods based on the ask-to-rate technique have been proposed, in which the profile of a new user is built by integrating information gained from a quick interview. This paper reviews these methods and how they use the ask-to-rate technique. The methods are categorized into non-adaptive and adaptive methods; each category is then analyzed and its methods are compared.

Keywords: recommender systems, collaborative filtering, new user, user cold-start

1. Introduction

Personalized search engines, intelligent software agents and recommender systems are of interest to users who ask for help in sorting, classifying, personalizing, filtering and sharing a large amount of information. One of the common recommender techniques is Collaborative Filtering (CF) [1-3], which offers preferred items to a user based on the items previously rated by collaborating users. The essential supposition is that if users X and Y assign similar ratings to n items, or behave similarly, they will rate or act on other items similarly [4]. Therefore, a major challenge of the CF technique is how to make recommendations for a new user who has recently entered the system; this is called the cold-start user problem. In other words, the system must attempt to gather information about the new user before the user can fully use the system.

To solve the cold-start user problem, a few efficient methods have been proposed based on the ask-to-rate technique [5], in which a new user is asked to rate selected items until a sufficient number of items have been rated. The methods can be categorized as non-adaptive or adaptive, depending on whether the same items are presented to "all" new users or not. In this paper, both categories are explained and their efficient methods are reviewed.

The rest of this paper is organized as follows. Recommender systems are introduced in Section 2. The concept of CF recommender systems is described in Section 3. A comprehensive survey of the ask-to-rate technique and some of its efficient methods is given in Section 4. Finally, in Section 5, the related methods are discussed and the conclusion of this work is presented.

2. Recommender Systems

Recommender systems are a subset of information filtering systems. They are used as efficient tools for overcoming information overload, inspecting a large set of information and selecting the information related to each user. Recommendation and rating prediction concern items such as movies, music and books, or social elements such as people or groups, that the user has not seen yet. When a recommender system is able to predict ratings for items that have not been observed yet, those items can be recommended to a target user. A target user is a user for whom the recommendations are made. A movie recommender system, for example, might memorize explicit or implicit user ratings to recommend new movies to the same user, based on the ones that s/he has already seen.

How, then, is a recommendation produced? The taxonomy in [6] distinguishes five techniques: collaborative filtering, content-based, demographic [7, 8], utility-based [9] and knowledge-based. A further category overcomes the limitations of these methods by combining techniques, using the advantages of one technique to fix the disadvantages of others. Several ways of combining them into a hybrid system have been proposed (see [6] for precise descriptions, where seven categories of hybrid system are presented). Only CF systems are described here, since repeating the detailed explanation of the other categories in this paper would be redundant. Interested readers can refer to the original articles [1, 6, 10].

3. Collaborative Filtering Recommender Systems

Collaborative filtering recommender systems are one of the biggest sub-domains of information retrieval. These systems concentrate on finding users whose interests are similar to the target user's and aggregating their opinions to make a recommendation. They therefore calculate similarity between users rather than between the contents of items. Given the existing amount of information, both users and website owners benefit from CF systems: users come across preferred items, and the profit of e-commerce websites potentially goes up because users are persuaded to buy more related products or accessories.

Researchers have classified many algorithms for collaborative recommendation as memory-based or model-based CF [11]. To combine the advantages and alleviate certain drawbacks of the two kinds of algorithm, some studies have also suggested hybrid algorithms [6, 12]. This section focuses on a common memory-based CF algorithm, user-based kNN (k-Nearest Neighbors) [2].

Memory-based algorithms are essentially heuristics. The user-based kNN system, for example, calculates the prediction for a target item using statistical techniques that find users with similar tastes, as follows:

• First, the similarity $sim(u_t, u_i)$ between the target user $u_t$ and every other user $u_i$ who has rated the target item $a_t$ is computed by a measure such as Pearson's correlation (shown in Equation (1)), the cosine measure, or a more recent measure like proximity-impact-popularity [13], which reflects the distance, correlation or weight between two users.

$$sim(u_t, u_i) = \frac{\sum_{m=1}^{h} (r_{u_t,a_m} - \bar{r}_{u_t}) \cdot (r_{u_i,a_m} - \bar{r}_{u_i})}{\sqrt{\sum_{m=1}^{h} (r_{u_t,a_m} - \bar{r}_{u_t})^2} \cdot \sqrt{\sum_{m=1}^{h} (r_{u_i,a_m} - \bar{r}_{u_i})^2}} \tag{1}$$

where $r_{u,a}$ is the rating of item $a$ by user $u$, $\bar{r}_u$ is the mean rating of user $u_t$ or $u_i$ over all the co-rated items, and $h$ is the number of items co-rated by both users. The similarity ranges from −1 (the users least similar to the target user) to 1 (the users most similar to the target user).

• Second, the prediction for a target item by a target user is calculated from at most $k$ nearest neighbors, found in the former step, who have also rated the target item, as in Equation (2).

$$prediction(u_t, a_t) = \frac{\sum_{h=1}^{k} (r_{u_h,a_t} - \bar{r}_{u_h}) \cdot sim(u_t, u_h)}{\sum_{h=1}^{k} |sim(u_t, u_h)|} + \bar{r}_{u_t} \tag{2}$$

where $\bar{r}_{u_t}$ and $\bar{r}_{u_h}$ are the mean ratings of the target user and of user $u_h$ over all other rated items, and $sim(u_t, u_h)$ is the similarity between the target user and user $u_h$.
One of the advantages of memory-based CF algorithms is their intuitive idea, which makes them easy to comprehend and makes their results conveniently explainable. Furthermore, the main strength of pure CF systems is that new data can be added incrementally and without difficulty, since, unlike content-based filtering, they do not require any tagging of the items' content, and recommendations are made using the rating data only. Hence, this approach is suitable for any domain, especially domains in which content is either scarce (like restaurants) or difficult to acquire (like movies or music).

Collaborative systems have their own limitations, such as the cold-start problem [5, 14-16], scalability [17] and sparsity [4]. An important one, the cold-start user problem, occurs when a user who is new to the recommender system enters the system and has given no ratings. User-based CF cannot compute the similarity between the new user and other users [14-16, 18, 19]; hence, it is difficult to make recommendations.

Different techniques have been proposed to solve this problem. The ask-to-rate technique [5, 14, 16, 18, 19] is the most direct way of obtaining some information about the new user and learning the user's preferences. The next section explains the ask-to-rate technique.

4. Asking for Explicit Ratings

The most direct way to cope with the cold-start user problem and quickly build a profile of a new user is to ask for explicit ratings by presenting items to the user. This elicits initial information about the new user through a quick, short interview. After some items have been presented to the new user, this process is complete and, since the new user's row in the user-item matrix is no longer empty, the new user enters the normal phase of the recommender system. The CF system then uses these ratings to compute the similarity between the new user and other users, so that s/he gets precise recommendations, as shown in Figure 1.

[Figure 1. The new user prompting process.]

The system must take care to present informative items that gather useful information before a new user is allowed to use the system normally. If the ratings are obtained by a well-designed selection strategy, rather than by a strategy in which users self-select the items to rate, recommendation accuracy can be improved.

Generally, the techniques should not appear burdensome to new users; they must aim at minimizing user effort and maximizing recommendation accuracy. The evaluation of elicitation methods in [14, 18] on user effort and accuracy metrics is shown in Table 1; the methods themselves are described in the following section. This paper provides an overview of the efficient methods based on the ask-to-rate technique. They are categorized into non-adaptive and adaptive methods, based on how the next items are selected.

[Table 1: columns Recommendation Methods, User Effort, Accuracy; rows IGCN, (Log pop)×Ent, Entropy0, HELF, Popularity, Item-Item, Entropy, Random; the best/worst symbols in the cells are missing.]

Table 1. The evaluation of elicitation strategies in [14, 18] over both online and offline experiments on user effort and accuracy metrics.

4.1. Non-adaptive methods

Non-adaptive techniques present the same items to "all" new users, regardless of how the system's knowledge of the user being interviewed changes. In most of these methods, the computation is based on information theory.
The advantage of these methods is that the order of items needs to be calculated only once, although these techniques provide little information. Some of the techniques are classified as non-personalised methods in [16].

Various strategies have been proposed for non-adaptive methods, such as the Variance and Entropy strategies [19]; the Random, Popularity, Pure entropy and Balanced strategies [14]; the Entropy0 and HELF strategies [18]; and the Greedy strategy and the Other People's Greedy and Variations strategies [16]. Details of each non-adaptive strategy are:

1. Active WebMuseum is the first CF recommender system which uses the ask-to-rate technique [19]. This web-based virtual museum has a dynamic topology in which art paintings are personalized and ordered by museum visitors' taste and preferences. The paper [19] proposes the Entropy and Variance methods to choose the sequence of items to be rated by new users. These methods are statistical analyses of the distribution of item ratings given by other users in the dataset. The variance of the target item $a_t$ is computed according to Equation (3):

$$Variance(a_t) = \frac{\sum_{u \in U_t} (r_{u,a_t} - \bar{r}_{a_t})^2}{|U_t|} \tag{3}$$

where $U_t$ is the set of all users who have rated item $a_t$, $r_{u,a_t}$ is the rating of item $a_t$ by user $u$, and $\bar{r}_{a_t}$ is the mean of $a_t$'s ratings. The experiments use the random strategy (selecting items to present without prior planning) as a baseline and show that these two methods generate more accurate predictions for new users than the random strategy.

2. In 2002, the MovieLens research group extended the aforesaid idea in web personalization [14]. This research proposed strategies that use information theory and aggregated statistics to learn about new users. The strategies focused on which items to present to the new user during an initial interview. Different strategies for selecting movies were tested through offline and online experiments on the MovieLens dataset. The evaluation considered user effort and recommendation accuracy as related to the user experience. All the proposed methods were measured on rating prediction accuracy using the MAE (Mean Absolute Error) evaluation metric. These methods are as follows:

• Random strategy: selects the items randomly. It learns about new user preferences in terms of all available items. Random is a baseline strategy used for comparison. It applies no intelligent analysis to the rating matrix, and the results of the online and offline experiments show that it needs much more user effort and that its prediction accuracy is unfavorable. If the distribution of ratings is not uniform, the user will probably have no opinion about the presented items.

• Popularity strategy: takes an item's popularity into account, i.e. how many users have rated the item. The items are ordered by the number of ratings they have been given by all users, and some of the most popular items are presented to the new user. The popularity of item $a_t$ is computed according to Equation (4), where $r_{a_t}$ denotes the ratings of the item:

$$Popularity(a_t) = |r_{a_t}| \tag{4}$$

This method is easy to implement and inexpensive to compute. It accomplishes the important goal of minimizing user effort. However, the ratings may be uninformative, since most users like popular items. Moreover, the popularity measure suffers from prefix bias: popular items keep receiving more ratings while unpopular items do not. This causes an unequal distribution of ratings in the dataset.

• Pure entropy: another low-complexity method for item selection is entropy, which was proposed in [19] and presented again in [14]. The entropy $H(a_t)$ of a target item is the dispersion of the item's ratings in the rating matrix, computed using the pseudocode in Figure 2. Some not-yet-rated items with the highest scores are then presented. This method provides a lot of information for each rating, but some of that information is not useful to the system, and it sometimes selects unknown items because it does not take frequency into account. In the offline experiment, Entropy, like the Random strategy, needs extreme user effort and performs extremely poorly on accuracy. It was not evaluated in the online experiment. Hence, in terms of accuracy and user effort, the Entropy and Random methods lag behind all the other methods mentioned in [14].

    Function Entropy(a_t)
        entropy(a_t) = 0
        for each rating r of item a_t in the dataset
            value[r] += 1        // rating frequencies; in MovieLens r = 1 ... 5
        end for
        for i as each of the possible rating values with value[i] > 0
            proportion_i = value[i] / total number of users who rated a_t
            entropy(a_t) += proportion_i * Math.log(proportion_i, 2)
        end for
        entropy(a_t) = −entropy(a_t)
    End

Figure 2. Pseudocode of the entropy approach.

• Balanced strategy: the logarithm of the popularity score (the number of users who have rated the item) is multiplied by the entropy, that is, (log popularity) × entropy, and items are presented in descending order of this score. This method combines the advantages of its two components, has the best accuracy among the methods proposed in [14] and needs medium user effort.

3. In 2008, the MovieLens research group's ask-to-rate idea was further extended in [18] to improve the ordering of items and to elicit the opinions of new users more precisely at registration time. This paper won the Yahoo! Research Best Paper Award [20]. The authors proposed an offline simulation framework and an online experiment with real users of the MovieLens live recommender system. Three new information-theoretic strategies were presented: Entropy0, HELF and IGCN. The two non-adaptive strategies are:

• Entropy0 (Entropy Considering Missing Values): in [14], missing ratings (non-ratings) were ignored in the entropy calculation. Entropy0 was proposed to handle items with missing evaluations: all missing evaluations are assigned to a separate category "0", whereas "1-5" is the usual scale. A weighted entropy formulation is used, as in Equation (5):

$$Entropy0(a_t) = -\sum_{i=0}^{5} p_i \, w_i \log(p_i) \tag{5}$$

where $p_i$ is the proportion of users whose rating of $a_t$ falls in category $i$, $w_0 = 0.5$ is the weight given to missing values, and $w_i = 1$ for $i = 1, 2, 3, 4, 5$, since this selection of weights provided the best results in the original experiments. Note that $w_0 = 0$ turns Entropy0 into the pure entropy measure. The Entropy0 method overcomes one of the limitations of entropy: it distinguishes between mostly unknown (infrequently rated) items and frequently rated items. It is slightly more successful than the Popularity method.

• HELF (Harmonic mean of Entropy and Logarithm of Frequency): this strategy is a hybrid of the Popularity and Entropy strategies. It exploits useful properties of the harmonic mean and the logarithmic function. HELF takes the harmonic mean of the Popularity (rating frequency) and Entropy scores of items; the combined measures are not correlated. HELF is defined as in Equation (6):

$$HELF(a_t) = \frac{2 \cdot LF \cdot H'(a_t)}{LF + H'(a_t)} \tag{6}$$

where $LF = \log(|a_t|) / \log(|U|)$ is the normalized logarithm of the rating frequency of the target item, and $H'(a_t) = H(a_t) / \log(5)$ is the normalized entropy of the target item.

4. Other criteria were suggested in [16], which showed the progression of methods for dealing with the new user. The non-adaptive approaches called Greedy and Other People's Greedy and Variations are explained below.

• Greedy strategy: the next item is chosen from those that the user can rate such that the prediction error on the user's test set is minimized. This method is not feasible in practice, because it requires knowing what each person can rate and the actual ratings. It is used as a baseline.

• Other People's Greedy and Variations: the items presented to the new user are selected from the top-n lists of other users' items obtained through the greedy method. It uses other people's opinions and selects items which reduce prediction error.
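To make the scoring formulas of this subsection concrete, the following Python sketch computes Equations (3) to (6), the pure entropy and the Balanced score for a toy dataset, and ranks items by any of them. It is a minimal sketch under assumptions that are not in the source: the `items` dictionary layout, all function names, the use of base-2 logarithms for the entropies (following the base used in Figure 2), and the choice to take the Entropy0 proportions over all users, including those who did not rate the item.

```python
from collections import Counter
from math import log, log2

SCALE = (1, 2, 3, 4, 5)  # MovieLens-style rating scale, as in the text

# Toy data (hypothetical): item -> list of ratings it received; 10 users in total.
items = {"hit": [5, 5, 5, 5, 4, 5, 5, 5], "split": [1, 5, 1, 5], "rare": [3]}
N_USERS = 10

def popularity(item_ratings):
    """Equation (4): number of ratings the item has received."""
    return len(item_ratings)

def variance(item_ratings):
    """Equation (3): variance of the item's ratings."""
    mean = sum(item_ratings) / len(item_ratings)
    return sum((r - mean) ** 2 for r in item_ratings) / len(item_ratings)

def entropy(item_ratings):
    """Pure entropy (Figure 2): dispersion of the observed ratings, base 2."""
    n = len(item_ratings)
    return -sum((c / n) * log2(c / n) for c in Counter(item_ratings).values())

def balanced(item_ratings):
    """Balanced strategy: (log popularity) x entropy."""
    return log2(popularity(item_ratings)) * entropy(item_ratings)

def entropy0(item_ratings, n_users, w0=0.5):
    """Equation (5): weighted entropy with missing ratings as category 0."""
    counts = Counter(item_ratings)
    counts[0] = n_users - len(item_ratings)  # users who did not rate the item
    weights = {0: w0, **{i: 1.0 for i in SCALE}}
    return -sum(weights[i] * (c / n_users) * log2(c / n_users)
                for i, c in counts.items() if c > 0)

def helf(item_ratings, n_users):
    """Equation (6): harmonic mean of normalized log-frequency and entropy."""
    lf = log(len(item_ratings)) / log(n_users)
    h = entropy(item_ratings) / log2(len(SCALE))
    return 2 * lf * h / (lf + h) if lf + h else 0.0

def rank(items, score):
    """Non-adaptive ordering: every new user sees the same ranked list."""
    return sorted(items, key=lambda a: score(items[a]), reverse=True)

print(rank(items, popularity))
print(rank(items, lambda r: helf(r, N_USERS)))
```

Because none of these scores depends on the user being interviewed, the ranking can be computed once and reused for every new user, which is the property that defines the non-adaptive category.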
| Research example | Method | Pluses | Minuses |
|---|---|---|---|
| Kohrs, A., Merialdo, B. (2001) | Variance | The first statistical analysis of the item's ratings distribution toward solving the user cold-start problem. | Not adapted to a user's rating history. |
| Kohrs, A., Merialdo, B. (2001) | Pure Entropy | The first statistical analysis of the item's ratings distribution toward solving the user cold-start problem; uses the potential information of an item's ratings. | Selects unknown items; disregards item popularity and rating frequencies; does not consider missing values; assigns the highest entropy value to items with a uniform rating distribution; not adapted to a user's rating history. |
| Rashid, A.M., Albert, I., Cosley, D., Lam, S.K., McNee, S.M., Konstan, J.A., Riedl, J. (2002) | Random | Used as a baseline; applies to all available items. | Applies no intelligent analysis; not adapted to a user's rating history. |
| Rashid, A.M., et al. (2002) | Popularity | Easy to compute; easy to implement; most users are able to rate the items. | Uninformative ratings; increases prefix bias; causes unequal distribution of ratings in the dataset; not adapted to a user's rating history. |
| Rashid, A.M., et al. (2002) | Balanced | A combination of popularity and entropy's advantages. | Popularity dominates the product (PopbyEnt); not adapted to a user's rating history. |
| Rashid, A.M., Karypis, G., Riedl, J. (2008) | Entropy0 | Considers missing values; distinguishes between mostly unknown (infrequently rated) items and frequently rated items. | Bias toward frequently rated items; not adapted to a user's rating history. |
| Rashid, A.M., Karypis, G., Riedl, J. (2008) | HELF | Uses suitable features of the harmonic mean and logarithmic function; a combination of popularity and entropy's advantages. | Not adapted to a user's rating history. |
| Crane, M. (2011) | Greedy strategy | Used as a baseline; selects items which reduce prediction error in the test set. | Cannot be applied in practice; not adapted to a user's rating history. |
| Crane, M. (2011) | Other People's Greedy and Variations | Uses other people's opinions; selects items which reduce prediction error. | Not adapted to a user's rating history. |

Table 2. Classification of asking to rate, non-adaptive approaches.

4.2. Adaptive methods

We call these approaches adaptive because the selected items are consistent with "each" new user's opinions. To present the items that best fit the user's personal preferences, the system should adapt to the earlier ratings given by the new user. Users thus rate items in personalized orderings, and the interview process is controlled more effectively than in non-adaptive approaches. Adaptive approaches take into account the ratings the user has given so far in the initial interview and consider the system's changing profile of the new user; thereby, the number of items familiar to the new user is maximized. There are a few adaptive ask-to-rate approaches that personalize the ordering of items for the cold-start user problem, such as Item-item personalized [14]; IGCN [18]; naïve Bayes and perturbed Other People's Greedy and Variations [16] (which are classified as personalised methods in [16]); and a clustering method [21]. The details of each strategy are:

• Item-item personalized: items are proposed until the user gives at least one rating; then the similarity between items is computed by a recommender system using some similarity measure, and the items that the user would be most likely to buy (or, in the movie domain, to have seen) are presented. Whenever the user gives more ratings, the list of similar movies is updated. The evaluation of this strategy in both experiments shows that it provides the best user effort, like the Popularity and Entropy0 strategies, and the worst accuracy, like the Entropy and Random strategies, since the approach tends not to identify items that the user will like [14].

• IGCN (Information Gain through Clustered Neighbors): to achieve an adaptive selection strategy, [18] first considered using decision trees. Initially, the users are clustered into groups, and then a decision tree algorithm such as ID3 is used to find the right cluster for the target user and learn the user profile. This approach takes into account the items that the user has rated so far. The target user follows a route through the decision tree from the root node (with the highest information gain) to a leaf node (which indicates the user's true class or neighborhood).

However, the authors do not pursue this ideal decision tree scenario because it may not be practically feasible for most members of a recommender system; instead, they propose a two-phase algorithm named IGCN. Before the first phase starts, user clusters are created using the bisecting k-means approach and the information gain (IG) of items is computed. In the first phase, called the non-personalized step, the user rates items ordered by their information gain scores, building an initial profile until the user has rated at least some threshold number of items. In the second phase, called the personalized step, a richer profile is created: the information gain of the items is computed using only the best neighbors of the target user, and this is repeated until the best neighbors no longer change. IGCN requires assuming a predefined clustering of users. The 20-day online experiment performed on 468 users showed that the IGCN approach offers greater accuracy than all the other information-theoretic measures proposed in [18].

• Naïve Bayes: this method is a personalized variant of the Popularity method. Given that a user is able, or unable, to rate an item, it uses naïve Bayes to estimate the probability that the user is able to rate other items. The items selected for presentation are those the user is most probably able to rate.

• Perturbed Other People's Greedy and Variations: this method combines the naïve Bayes probability and the Other People's Greedy method to generate a list of personalized items. It uses the advantages of both by selecting items according to whether the user is able to rate them and the amount by which the item cuts down prediction error.

• Clustering: in [21], the authors extended the item-item method to create a personalized methodology for dealing with the new user problem using the ask-to-rate technique. This strategy uses item clustering (in [21], the items were news articles) and user clustering information. It uses the W-kmeans clustering algorithm to choose which items to select next for rating by the user. The authors demonstrated that it performs better than all of the common strategies, namely Random, Popularity, Pure Entropy, Balanced and Item-Item personalized, and that it minimizes user effort.

A brief comparison of the adaptive methods that alleviate the user cold-start problem by asking to rate, with their advantages and disadvantages, is given in Table 3.

| Research example | Method | Pluses | Minuses |
|---|---|---|---|
| Rashid, A. M., Albert, I., Cosley, D., Lam, S. K., McNee, S. M., Konstan, J. A., Riedl, J. (2002) | Item-Item personalized | The first adaptive method; adapts to a user's rating history. | Inattention to the user's interest in items; provides uninformative ratings. |
| Rashid, A. M., Karypis, G., Riedl, J. (2008) | IGCN | Adapts to a user's rating history. | Requires assuming a predefined clustering of the users. |
| Crane, M. (2011) | Naïve Bayes | Considers the ability of a user to rate. | Poor performance compared to the simplest approach (Random). |
| Crane, M. (2011) | Perturbed Other People's Greedy and Variations | Reduces prediction error. | Poor performance compared to the Other People's Greedy method. |

Table 3. Classification of asking to rate, adaptive approaches.
5. Conclusion

In summary, the objective of a recommender system is typically to recommend items that best fit users' personal preferences. Collaborative filtering systems generate recommendations based on user-user similarity. A new user poses a serious problem for the collaborative filtering approach: since the system has no data about the new user's preferences, it cannot provide any personalized recommendation for him/her. It has to acquire some data about the new user. In this paper we have reviewed several methods for dealing with the new user problem via the ask-to-rate technique. The methods fall into two categories, non-adaptive and adaptive.

Although a few efficient methods for the cold-start new user problem have been proposed, the problem is not yet settled. During item selection, the newly incoming ratings of other users are not considered. A future direction is therefore to develop a new method that adapts to the ratings given by other users.

References

[1] G. Adomavicius, A. Tuzhilin, Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. Knowledge and Data Engineering, IEEE Transactions, 17 (2005), 734–749.
[2] D. Billsus, M. J. Pazzani, Learning collaborative information filters, (1998), pp. 48.
[3] B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Item-based collaborative filtering recommendation algorithms, pp. 285–295.
[4] X. Su, T. M. Khoshgoftaar, A survey of collaborative filtering techniques. Advances in Artificial Intelligence, 2009 (2009), pp. 4.
[5] T. N. Lillegraven, A. C. Wolden, Design of a Bayesian recommender system for tourists presenting a solution to the cold-start user problem. Master of Science in Computer Science, Department of Computer and Information Science, Norwegian University of Science and Technology, 2010.
[6] R. Burke, Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction, 12 (2002), 331–370.
[7] E. Rich, User modeling via stereotypes. Cognitive Science, 3 (1979), 329–354.
[8] T. Mahmood, F. Ricci, Towards learning user-adaptive state models in a conversational recommender system, (2007), pp. 373–378.
[9] R. H. Guttman, Merchant differentiation through integrative negotiation in agent-mediated electronic commerce, 1998.
[10] F. Ricci, L. Rokach, B. Shapira, Introduction to recommender systems handbook. Recommender Systems Handbook, (2011), 1–35.
[11] J. S. Breese, D. Heckerman, C. Kadie, Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, (1998), pp. 43–52.
[12] J. Wang, A. P. de Vries, M. J. T. Reinders, Unifying user-based and item-based collaborative filtering approaches by similarity fusion, pp. 501–508.
[13] H. J. Ahn, A new similarity measure for collaborative filtering to alleviate the new user cold-starting problem. Information Sciences, 178 (2008), 37–51.
[14] A. M. Rashid, I. Albert, D. Cosley, S. K. Lam, S. M. McNee, J. A. Konstan et al., Getting to know you: learning new user preferences in recommender systems. In Proceedings of the 7th International Conference on Intelligent User Interfaces, (2002) San Francisco, California, USA, pp. 127–134.
[15] A. I. Schein, A. Popescul, L. H. Ungar, D. M. Pennock, Methods and metrics for cold-start recommendations, pp. 253–260.
[16] M. Crane, The new user problem in collaborative filtering. Thesis for the degree of Master of Science, Department of Computer Science, University of Otago, Dunedin, New Zealand, 2011.
[17] B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Application of dimensionality reduction in recommender system-a case study, DTIC Document, 2000.
[18] A. M. Rashid, G. Karypis, J. Riedl, Learning preferences of new users in recommender systems: an information theoretic approach. ACM SIGKDD Explorations Newsletter, 10 (2008), 90–100.
[19] A. Kohrs, B. Merialdo, Improving collaborative filtering for new users by smart object selection. In Proceedings of the International Conference on Media Features (ICMF), (2001).
[20] O. Nasraoui, M. Spiliopoulou, O. R. Zaïane, J. Srivastava, B. Mobasher, WebKDD 2008: 10 years of knowledge discovery on the web post-workshop report. ACM SIGKDD Explorations Newsletter, 10 (2008), 78–83.
[21] C. Bouras, V. Tsogkas, Clustering to Deal with the New User Problem. In Computational Science and Engineering (CSE), 2012 IEEE 15th International Conference on, (2012), pp. 58–65.

Received: September, 2013
Revised: May, 2014
Accepted: May, 2014

Contact addresses:

Mohammad-Hossein Nadimi-Shahraki
Faculty of Computer Engineering
Najafabad branch
Islamic Azad University
Najafabad
Iran
e-mail: [email protected]

Mozhde Bahadorpour
Faculty of Computer Engineering
Najafabad branch
Islamic Azad University
Najafabad
Iran
e-mail: [email protected]

MOHAMMAD-HOSSEIN NADIMI-SHAHRAKI was born in Iran. He received his Ph.D. in computer science from University Putra of Malaysia (UPM) in 2010. Currently, he is a full-time Assistant Professor at the Faculty of Computer Engineering of Islamic Azad University of Najafabad (IAUN). His research interests include data mining, web mining, social network mining and recommender systems.

MOZHDE BAHADORPOUR was born in Iran. She received her BSc degree in computer software engineering from Islamic Azad University of Najafabad (IAUN) in 2008. She completed her MSc thesis research on the cold-start problem in collaborative recommender systems under the supervision of Dr. Mohammad H. Nadimi at IAUN in 2013. She is currently working as a research assistant in similar research projects. Her research interests include web mining, web personalization and recommender systems.

Cold-start Problem in Collaborative Recommender Systems: Efficient Methods Based on Ask-to-rate Technique

Journal of Computing and Information Technology, Jan 1, 2014



Publisher: Unpaywall
ISSN: 1330-1136
DOI: 10.2498/cit.1002223

Abstract

Journal of Computing and Information Technology - CIT 22, 2014, 2, 105–113
doi:10.2498/cit.1002223

Cold-start Problem in Collaborative Recommender Systems: Efficient Methods Based on Ask-to-rate Technique

Mohammad-Hossein Nadimi-Shahraki and Mozhde Bahadorpour
Faculty of Computer Engineering, Najafabad branch, Islamic Azad University, Najafabad, Iran

Collaborative filtering is the best-known approach to developing a recommender system; it considers the ratings of users who have similar rating profiles or rating patterns. Consequently, it can compute the similarity of users only when they have expressed enough ratings. A major challenge of the collaborative filtering approach is therefore how to make recommendations for a new user, which is called the cold-start user problem. To solve this problem, a few efficient methods based on the ask-to-rate technique have been proposed, in which the profile of a new user is built by integrating information gained from a quick interview. This paper reviews these methods and how they use the ask-to-rate technique. The methods are categorized into non-adaptive and adaptive methods. Then, each category is analyzed and its methods are compared.

Keywords: recommender systems, collaborative filtering, new user, user cold-start

1. Introduction

Personalized search engines, intelligent software agents and recommender systems are of interest to users who ask for help in sorting, classifying, personalizing, filtering and sharing a large amount of information. One of the common recommender techniques is Collaborative Filtering (CF) [1-3], which offers preferred items to a user based on items previously rated by collaborating users. The essential supposition is that, if users X and Y assign similar ratings to n items or behave similarly, they will rate or act on other items similarly [4]. Therefore, a major challenge of the CF technique is how to make recommendations for a new user who has recently entered the system; this is called the cold-start user problem. In other words, the system must attempt to gather information about the new user before the user is able to fully use the system.

To solve the cold-start user problem, a few efficient methods have been proposed based on the ask-to-rate technique [5], in which a new user is asked to rate selected items until a sufficient number of items have been rated. The methods can be categorized into non-adaptive and adaptive methods, depending on whether the same items are presented to "all" new users or not. In this paper, both non-adaptive and adaptive methods are explained and their efficient methods are reviewed.

The rest of this paper is organized as follows. Recommender systems are introduced in Section 2. The concept of CF recommender systems is described in Section 3. A comprehensive survey of the ask-to-rate technique and some of its efficient methods is given in Section 4. Finally, in Section 5, the related methods are discussed and the conclusion of this work is presented.

2. Recommender Systems

Recommender systems are a subset of information filtering systems. They are used as efficient tools for overcoming information overload, inspecting a large set of information and selecting the information related to each user. Recommendation and rating prediction concern items such as movies, music and books, or social entities such as people or groups, that have not yet been seen by users. When a recommender system is able to predict ratings for items that have not been observed yet, the item(s) can be recommended to a target user. A target user is a user for whom the recommendations are made. A movie recommender system, for example, might record explicit or implicit user ratings in order to recommend new movies to the same user, based on the ones that s/he has already seen.

How, then, is a recommendation produced? The taxonomy provided by [6] contains five different techniques: collaborative filtering, content-based, demographic [7, 8], utility-based [9] and knowledge-based. A further category overcomes the limitations of these methods by combining techniques, using the advantages of one technique to fix the disadvantages of others. Several ways of combining them into a new hybrid system have been proposed (see [6] for precise descriptions, where seven categories of hybrid system are presented). Only CF systems are described here, since repeating the detailed explanation of the other categories in this paper would be redundant. Interested readers can refer to the original articles [1, 6, 10].

3. Collaborative Filtering Recommender Systems

Collaborative filtering recommender systems are one of the biggest sub-domains of information retrieval. These systems concentrate on finding users with interests similar to those of the target user and aggregating their opinions to make a recommendation. They therefore calculate similarity between users rather than between the contents of items. Given the existing amount of information, both users and website owners benefit from CF systems: users are able to come across preferred items, and the profit of e-commerce websites potentially goes up because users are persuaded to buy more related products or accessories.

Researchers have classified the many algorithms for collaborative recommendation into memory-based and model-based CF [11]. In addition, to take advantage of both kinds of algorithm and alleviate certain drawbacks of each, some studies have suggested hybrid algorithms [6, 12]. This section focuses on a common memory-based CF algorithm, named user-based kNN (k-Nearest Neighbors) [2].

Memory-based algorithms are essentially heuristics. The user-based kNN system, for example, calculates the prediction for a target item using statistical techniques that find users with similar tastes, as follows:

• First, the similarity, $\mathrm{sim}(u_t, u_i)$, between the target user, $u_t$, and every other user, $u_i$, who has rated the target item, $a_t$, is computed. Different measures can be used, such as Pearson's Correlation (shown in Equation (1)), the Cosine measure, or a more recent measure like proximity-impact-popularity [13]; each reflects distance, correlation or weight between two users.

$$\mathrm{sim}(u_t, u_i) = \frac{\sum_{m=1}^{h} (r_{u_t,a_m} - \bar{r}_{u_t}) \cdot (r_{u_i,a_m} - \bar{r}_{u_i})}{\sqrt{\sum_{m=1}^{h} (r_{u_t,a_m} - \bar{r}_{u_t})^2} \cdot \sqrt{\sum_{m=1}^{h} (r_{u_i,a_m} - \bar{r}_{u_i})^2}} \tag{1}$$

where $r_{u,a}$ is the rating of item $a$ by user $u$, $\bar{r}_{u}$ is the mean rating of user $u_t$ or $u_i$ over all the co-rated items, and $h$ is the number of items co-rated by both users. The similarity ranges between −1 (the users least similar to the target user) and 1 (the users most similar to the target user).

• Second, the prediction for a target item by a target user is calculated from at most k nearest neighbors, found in the former step, who have also rated the target item, as in Equation (2).

$$\mathrm{prediction}(u_t, a_t) = \bar{r}_{u_t} + \frac{\sum_{h=1}^{k} (r_{u_h,a_t} - \bar{r}_{u_h}) \cdot \mathrm{sim}(u_t, u_h)}{\sum_{h=1}^{k} |\mathrm{sim}(u_t, u_h)|} \tag{2}$$

where $\bar{r}_{u_t}$ and $\bar{r}_{u_h}$ are the mean ratings of the target user and of user $h$ over all their other rated items, and $\mathrm{sim}(u_t, u_h)$ is the similarity between the target user and user $h$ (in Equation (2), $h$ indexes the neighbors).

One of the advantages of memory-based CF algorithms is that their idea is intuitive, which makes them easy to comprehend and makes their results conveniently explainable.
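The two steps of user-based kNN (Equations (1) and (2)) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the reviewed systems: the dictionary layout, the function names `pearson` and `predict`, the toy ratings, and the fallbacks for degenerate cases (similarity 0 for too little overlap, `None` for a user with no ratings) are all assumptions made here.

```python
from math import sqrt

def pearson(ratings_t, ratings_i):
    """Equation (1): Pearson correlation over the items co-rated by two users."""
    common = set(ratings_t) & set(ratings_i)
    if len(common) < 2:
        return 0.0  # assumption: too little overlap counts as no similarity
    mean_t = sum(ratings_t[a] for a in common) / len(common)
    mean_i = sum(ratings_i[a] for a in common) / len(common)
    num = sum((ratings_t[a] - mean_t) * (ratings_i[a] - mean_i) for a in common)
    den_t = sqrt(sum((ratings_t[a] - mean_t) ** 2 for a in common))
    den_i = sqrt(sum((ratings_i[a] - mean_i) ** 2 for a in common))
    if den_t == 0 or den_i == 0:
        return 0.0  # assumption: a constant rater gets similarity 0
    return num / (den_t * den_i)

def predict(ratings, target_user, target_item, k=2):
    """Equation (2): mean-centred weighted average over at most k neighbours."""
    r_t = ratings[target_user]
    if not r_t:
        return None  # cold-start user: no ratings, so no similarities
    mean_t = sum(r_t.values()) / len(r_t)
    # Only users who rated the target item can act as neighbours.
    candidates = [
        (pearson(r_t, r_h), r_h)
        for user, r_h in ratings.items()
        if user != target_user and target_item in r_h
    ]
    neighbours = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    norm = sum(abs(s) for s, _ in neighbours)
    if norm == 0:
        return mean_t  # assumption: fall back to the user's own mean
    weighted = sum(
        s * (r_h[target_item] - sum(r_h.values()) / len(r_h))
        for s, r_h in neighbours
    )
    return mean_t + weighted / norm

# Hypothetical toy data on a 1-5 scale.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 4},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
    "new_user": {},
}
print(predict(ratings, "alice", "d", k=1))
print(predict(ratings, "new_user", "d"))
```

The last call returns `None`: with an empty rating row there is nothing to correlate, which is exactly the cold-start user problem discussed in this paper.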
A further strength of pure CF systems is that new data can be added incrementally and without difficulty: unlike content-based filtering, they do not require any tagging of the items' content, and recommendations are made using only the rating data. Hence, this approach is suitable for any domain, especially domains in which content descriptions are scarce (like restaurants) or difficult to acquire (like movies or music).

Collaborative systems have their own limitations, such as the cold-start problem [5, 14-16], scalability [17] and sparsity [4]. As an important case, the cold-start user problem occurs when a user who is new to the recommender system enters the system and there are no ratings by that user. User-based CF cannot compute the similarity between the new user and other users [14-16, 18, 19]. Hence, it is difficult to make recommendations.

Different techniques have been proposed to solve this problem. The ask-to-rate technique [5, 14, 16, 18, 19] is the most direct way of obtaining some information about the new user and learning the user's preferences. The next section explains the ask-to-rate technique.

4. Asking for Explicit Ratings

The most direct way to cope with the cold-start user problem and quickly build a profile of a new user is to ask for explicit ratings by presenting items to the user. This elicits initial information about the new user through a quick and short interview. Once some items have been presented and rated, the process is complete; since the new user's row in the user-item matrix is no longer empty, the new user enters the normal phase of the recommender system. The CF system then uses these ratings to compute the similarity between the new user and other users, so that s/he receives precise recommendations, as shown in Figure 1.

[Figure 1. The new user prompting process.]

The system must take care to present informative items, so that useful information is gathered before a new user is allowed to use the system normally. If the ratings are obtained by a well-designed selection strategy, rather than by a strategy in which users self-select the items to rate, the recommendation accuracy can be improved.

Generally, techniques should not be burdensome for new users; they must aim at minimizing user effort and maximizing recommendation accuracy. The evaluation in [14, 18] of elicitation methods on user effort and accuracy metrics is shown in Table 1. The methods themselves are described in the following sections. This paper provides an overview of the efficient methods based on the ask-to-rate technique. They are categorized into non-adaptive and adaptive methods, based on how the next items are selected.

Table 1. Evaluation of the elicitation strategies in [14, 18] over both online and offline experiments on the user effort and recommendation accuracy metrics (each strategy is marked from best to worst on both metrics). Strategies compared: IGCN, (Log pop)×Ent, Entropy0, HELF, Popularity, Item-Item, Entropy, Random.

4.1. Non-adaptive methods

Non-adaptive techniques present the same items to "all" new users, regardless of changes in what is known about the user being interviewed. In most of these methods, the computation is based on information theory. The advantage of these methods is that the item order only needs to be calculated once, although these techniques gather comparatively little information. Some of the techniques are classified as non-personalised methods in [16].

Various strategies have been proposed for non-adaptive methods, such as the Variance and Entropy strategies [19]; the Random, Popularity, Pure entropy and Balanced strategies [14]; the Entropy0 and HELF strategies [18]; and the Greedy strategy and the Other People's Greedy and Variations strategies [16]. Details of each non-adaptive strategy are:

1. Active WebMuseum is the first CF recommender system that uses the ask-to-rate technique [19]. This web-based virtual museum has a dynamic topology in which art paintings are personalized and ordered according to museum visitors' tastes and preferences. The paper proposes the Entropy and Variance methods for choosing the sequence of items to be rated by new users. Both methods are statistical analyses of the distribution of the item ratings given by other users in the dataset. The variance of the target item $a_t$ is computed according to Equation (3).

$$\mathrm{Variance}(a_t) = \frac{\sum_{u \in U_t} (r_{u,a_t} - \bar{r}_{a_t})^2}{|U_t|} \tag{3}$$

where $U_t$ is the set of all users who have rated item $a_t$, $r_{u,a_t}$ is the rating of item $a_t$ by user $u$ and $\bar{r}_{a_t}$ is the mean rating of $a_t$. The experiments use the random strategy (selecting items to present without prior planning) as a baseline and show that these two methods generate more accurate predictions for new users than the random strategy.

2. In 2002, the MovieLens research group extended this idea in web personalization [14]. This research proposed strategies that use information theory and aggregate statistics to learn about new users. The strategies focus on which items to present to the new user during an initial interview. The different strategies were tested for selecting movies through offline and online experiments on the MovieLens dataset. The evaluation considered user effort and recommendation accuracy in relation to the user experience. All the proposed methods were measured by rating prediction accuracy, using the MAE (Mean Absolute Error) evaluation metric. These methods are as follows:

• Random strategy: This strategy selects the items randomly, so it learns about new user preferences across all available items. It is a baseline strategy used for comparison. It applies no intelligent analysis of the rating matrix, and the results of the online and offline experiments show that it needs much more user effort and that the accuracy of its predictions is unfavorable. If the distribution of ratings is not uniform, the user will probably not have any opinion about the presented items.

• Popularity strategy: This strategy takes an item's popularity into account, i.e. how many users have rated the item. The items are ordered by the number of ratings they have been given by all users, and some of the most popular items are presented to the new user. The popularity of item $a_t$ is computed according to Equation (4), where $r_{a_t}$ denotes the ratings given to $a_t$.

$$\mathrm{Popularity}(a_t) = |r_{a_t}| \tag{4}$$

This method is easy to implement and inexpensive to compute. It achieves the important goal of minimizing user effort. However, the ratings may be uninformative, since most users like popular items. Moreover, the popularity measure suffers from prefix bias: popular items are presented first and so receive more and more ratings, while unpopular items do not. This causes an unequal distribution of ratings in the dataset.

• Pure entropy: Another low-complexity method for item selection is entropy, which was proposed in [19] and presented again in [14]. The entropy $H(a_t)$ of a target item is the dispersion of the item's ratings in the rating matrix; it is computed using the pseudocode in Figure 2. Then, some not-yet-rated items with the highest scores are presented. This method obtains a lot of information from each rating, but some of that information is not useful to the system, and the method sometimes selects unknown items because it does not take rating frequency into account. In the offline experiment, Entropy, like the Random strategy, needs extreme user effort and performs extremely poorly on accuracy. It has not been evaluated in the online experiment. Hence, in terms of accuracy and user effort, the Entropy and Random methods lag behind all the other methods mentioned in [14].

    Function Entropy(a_t)
        // value[i] is the frequency of rating value i; in MovieLens i = 1 ... 5
        value[i] = 0 for each possible rating value i
        for each rating r given to item a_t in the dataset
            value[r] += 1
        end for
        n = total number of users who rated a_t
        entropy(a_t) = 0
        for each possible rating value i with value[i] > 0
            proportion_i = value[i] / n
            entropy(a_t) += proportion_i * Math.log(proportion_i, 2)
        end for
        entropy(a_t) = -entropy(a_t)
        return entropy(a_t)
    End

Figure 2. Pseudocode of entropy approach.

• Balanced strategy: The logarithm of the item's popularity score (which reflects the likelihood that the user has rated the item) is multiplied by its entropy, that is, (log popularity) × entropy, and items are presented in descending order of this score. This method combines the advantages of its two components, has the best accuracy among the methods proposed in [14] and needs moderate user effort.

3. In 2008, the MovieLens research group further extended the ask-to-rate idea in [18] to improve the order of items and to elicit the opinions of new users more precisely at registration time. This paper won the Yahoo! Research Best Paper Award [20]. The authors proposed an offline simulation framework and an online experiment with real users of the MovieLens live recommender system. Three new information-theoretic strategies were presented: Entropy0, HELF and IGCN. The two non-adaptive strategies are:

• Entropy0 (Entropy Considering Missing Values): In [14], missing ratings (non-ratings) were ignored in the calculation of entropy. Entropy0 was proposed to handle items with missing evaluations: all missing evaluations are assigned to a separate category "0", whereas "1-5" is the usual scale. A weighted entropy formulation is used, as in Equation (5):

$$\mathrm{Entropy0}(a_t) = -\sum_{i=0}^{5} p_i \, w_i \log(p_i) \tag{5}$$

where $p_i$ is the proportion of rating value $i$ for $a_t$, $w_0 = 0.5$ is the weight of the missing values and $w_i = 1$ for $i = 1, 2, 3, 4, 5$, since this selection of weights provided the best results in the original experiments. Note that setting $w_0 = 0$ turns Entropy0 into the pure entropy measure. The Entropy0 method overcomes one of the limitations of entropy: it distinguishes between mostly unknown items (infrequently rated items) and frequently rated items. It is slightly more successful than the Popularity method.

• HELF (Harmonic mean of Entropy and Logarithm of Frequency): This strategy is a hybrid of the Popularity and Entropy strategies. It exploits suitable properties of the harmonic mean and the logarithmic function. HELF takes the harmonic mean of the Popularity (rating frequency) and Entropy scores of an item; the two combined measures are not correlated. HELF is defined as in Equation (6):

$$\mathrm{HELF}(a_t) = \frac{2 \cdot LF'(a_t) \cdot H'(a_t)}{LF'(a_t) + H'(a_t)} \tag{6}$$

where $LF'(a_t) = \log(|a_t|) / \log(|U|)$ is the normalized logarithm of the rating frequency of the target item, and $H'(a_t) = H(a_t) / \log(5)$ is the normalized entropy of the target item.

4. Other criteria were suggested in [16], which shows a progression of methods for dealing with the new user. The non-adaptive approaches called Greedy and Other People's Greedy and Variations are explained below.

• Greedy strategy: The next item is chosen, from those that the user can rate, such that the prediction error on the user's test set is minimized. This method is not feasible in practice, because it requires knowing what each person can rate and the actual ratings. It is used as a baseline.

• Other People's Greedy and Variations: The items presented to the new user are selected from the top-n lists of other users' items obtained through the greedy method. It uses other people's opinions and selects items that reduce prediction error.

4.2. Adaptive methods

We define these approaches as adaptive because the selected items are consistent with "each" new user's opinions. To present the items that best fit the user's personal preferences, the system should adapt to the earlier ratings given by the new user. Users thus rate items in personalized orderings, and the interview process is controlled more effectively than in non-adaptive approaches. Adaptive approaches take into account the ratings that the user has already given during the initial interview and the system's changing profile of the new user; thereby, the number of items familiar to the new user is maximized.

For dealing with the cold-start user problem by asking to rate, there are a few adaptive approaches that produce a personalized ordering of items: Item-item personalized [14]; IGCN [18]; naïve Bayes and perturbed Other People's Greedy and Variations [16] (which are classified as personalised methods in [16]); and a clustering method [21]. The details of each strategy follow Table 2.

Table 2. Classification of ask-to-rate non-adaptive approaches.

Research | Method | Pluses | Minuses
Kohrs, A., Merialdo, B. (2001) | Variance | The first statistical analysis of the item's ratings distribution toward solving the user cold-start problem. | Not adapted to a user's rating history.
Kohrs, A., Merialdo, B. (2001) | Pure Entropy | The first statistical analysis of the item's ratings distribution toward solving the user cold-start problem. Uses the potential information of an item's ratings. | Selects unknown items. Disregards item popularity and the rating frequencies. Does not consider missing values. Assigns the highest entropy value to items with a uniform rating distribution. Not adapted to a user's rating history.
Rashid, A.M., Albert, I., Cosley, D., Lam, S.K., McNee, S.M., Konstan, J.A., Riedl, J. (2002) | Random | Used as a baseline. Applies to all available items. | Does not apply intelligent analysis. Not adapted to a user's rating history.
Rashid, A.M., et al. (2002) | Popularity | Easy to compute. Easy to implement. Most users are able to rate the items. | Uninformative ratings. Increases prefix bias. Causes unequal distribution of ratings in the dataset. Not adapted to a user's rating history.
Rashid, A.M., et al. (2002) | Balanced | A combination of the advantages of popularity and entropy. | Popularity dominates the product of popularity and entropy. Not adapted to a user's rating history.
Rashid, A.M., Karypis, G., Riedl, J. (2008) | Entropy0 | Considers missing values. Distinguishes between mostly unknown items (infrequently rated items) and frequently rated items. | Bias toward frequently rated items. Not adapted to a user's rating history.
Rashid, A.M., Karypis, G., Riedl, J. (2008) | HELF | Uses suitable properties of the harmonic mean and the logarithmic function. A combination of the advantages of popularity and entropy. | Not adapted to a user's rating history.
Crane, M. (2011) | Greedy strategy | Used as a baseline. Selects items which reduce prediction error in the test set. | Cannot be applied in practice. Not adapted to a user's rating history.
Crane, M. (2011) | Other people's greedy and variations | Uses other people's opinions. Selects items which reduce prediction error. | Not adapted to a user's rating history.
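The non-adaptive scores of Section 4.1 (Equations (3) to (6), the Balanced product and the entropy of Figure 2) can be compared side by side with a short Python sketch. Everything below is illustrative: the function names, the choice of base-2 logarithms for entropy (so that the normalizing constant in HELF becomes log2 of the five rating values), the toy items and the total of 10 users are assumptions, not details taken from the reviewed papers.

```python
from collections import Counter
from math import log, log2

RATING_VALUES = (1, 2, 3, 4, 5)  # assumed MovieLens-style scale

def popularity(item_ratings):
    """Equation (4): number of ratings the item has received."""
    return len(item_ratings)

def variance(item_ratings):
    """Equation (3): variance of the ratings given to the item."""
    mean = sum(item_ratings) / len(item_ratings)
    return sum((r - mean) ** 2 for r in item_ratings) / len(item_ratings)

def entropy(item_ratings):
    """Pure entropy H(a_t) over the users who rated the item (Figure 2)."""
    n = len(item_ratings)
    counts = Counter(item_ratings)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def entropy0(item_ratings, n_users, w_missing=0.5):
    """Equation (5): missing ratings form an extra class 0 with weight w_0."""
    counts = Counter(item_ratings)
    counts[0] = n_users - len(item_ratings)
    total = 0.0
    for value, c in counts.items():
        if c == 0:
            continue  # an empty class contributes nothing
        p = c / n_users
        weight = w_missing if value == 0 else 1.0
        total += weight * p * log2(p)
    return -total

def balanced(item_ratings):
    """Balanced strategy: (log popularity) * entropy."""
    return log2(popularity(item_ratings)) * entropy(item_ratings)

def helf(item_ratings, n_users):
    """Equation (6): harmonic mean of normalized log-frequency and entropy.
    Assumes n_users > 1 so that log(n_users) is non-zero."""
    lf = log(len(item_ratings)) / log(n_users)
    h = entropy(item_ratings) / log2(len(RATING_VALUES))
    return 0.0 if lf + h == 0 else 2 * lf * h / (lf + h)

# Hypothetical items rated by some of 10 users.
items = {
    "blockbuster": [5, 5, 5, 5, 4, 5, 5, 5],
    "divisive": [1, 5, 1, 5, 1, 5],
    "obscure": [1, 5],
}
n_users = 10
for name, rs in items.items():
    print(name, popularity(rs), round(entropy(rs), 3),
          round(entropy0(rs, n_users), 3), round(helf(rs, n_users), 3))
```

On this toy data, pure entropy cannot separate the widely rated "divisive" item from the rarely rated "obscure" one, whereas Entropy0 and HELF both rank "divisive" higher, which mirrors the motivation given above for taking rating frequency into account.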
• Item-item personalized: Items are proposed until the user gives at least one rating. Then, the similarity between items is computed by a recommender system based on some similarity measure, and items that the user would be most likely to buy (or, in the movie domain, to have seen) are presented. Whenever the user gives more ratings, the list of similar movies is updated. The evaluation of this strategy over both experiments shows that it provides the best user effort, like the Popularity and Entropy0 strategies, and the worst accuracy, like the Entropy and Random strategies, since the approach tends not to identify items that the user will like [14].

• IGCN (Information Gain through Clustered Neighbors): Toward an adaptive, well-designed selection strategy, [18] first considered using decision trees. Initially, the users are clustered into groups, and then a decision tree algorithm such as ID3 is used to find the right cluster for the target user and learn the user profile. This approach takes into account the items that the user has rated so far. The target user follows a route through the decision tree from the root node (with the highest information gain) to a leaf node (which indicates the user's true class or neighborhood).

However, the authors do not adopt this ideal decision tree scenario, because it may not be practically feasible for most members of a recommender system; instead, they propose a two-phase algorithm named IGCN. Before the first step, user clusters are created using the bisecting k-means approach and the information gain (IG) of the items is computed. In the first phase, called the non-personalized step, the user rates items ordered by their information gain scores, building an initial profile, until the user has rated at least some threshold number of items. In the second phase, called the personalized step, a richer profile is created: the information gain of the items is computed using only the best neighbors of the target user, and this continues until the best neighbors no longer change. IGCN requires assuming a predefined clustering of users. The 20-day online experiment performed on 468 users showed that the IGCN approach offers greater accuracy than all the other information-theoretic measures proposed in [18].

• Naïve Bayes: This method is a personalized variant of the Popularity method. Given the items that the user has been able, or unable, to rate, it uses naïve Bayes to estimate the probability that the user is able to rate each of the other items. The items selected for presentation are those that the user is most likely to be able to rate.

• Perturbed Other People's Greedy and Variations: This method combines the naïve Bayes probability and the Other People's Greedy method to generate a list of personalized items. It uses the advantages of both by selecting items according to whether the user is able to rate them and the amount by which each item cuts down the prediction error.

In [21], the authors extended the item-item method to create a personalized methodology for dealing with the new user problem using the ask-to-rate technique.

• Clustering: This strategy exploits item and user clustering information (in this paper, the items were news articles). The approach uses the W-kmeans clustering algorithm to choose which items to select next for rating by the user. The authors have demonstrated that it performs better than all of the common strategies, such as Random, Popularity, Pure Entropy, Balanced and Item-Item personalized, and that it minimizes user effort.

A brief comparison of the classified methods for alleviating the user cold-start problem by asking to rate, with their advantages and disadvantages, is given in Tables 2 and 3.

Table 3. Classification of ask-to-rate adaptive approaches.

Research | Method | Pluses | Minuses
Rashid, A. M., Albert, I., Cosley, D., Lam, S. K., McNee, S. M., Konstan, J. A., Riedl, J. (2002) | Item-Item personalized | The first adaptive method. Adapts to a user's rating history. | Inattention to the user's interests in items. Provides uninformative ratings.
Rashid, A. M., Karypis, G., Riedl, J. (2008) | IGCN | Adapts to a user's rating history. | Requires assuming a predefined clustering of the users.
Crane, M. (2011) | Naïve Bayes | Considers the user's ability to rate items. | Poor performance compared to the simplest approach (Random).
Crane, M. (2011) | Perturbed other people's greedy and variations | Reduces prediction error. | Poor performance compared to the other people's greedy method.

5. Conclusion

In summary, the objective of a recommender system typically is to recommend items that best fit users' personal preferences. Collaborative filtering systems generate recommendations based on user-user similarity. A new user poses a serious problem for the collaborative filtering approach: since the system does not have any data about the new user's preferences, it cannot provide any personalized recommendation for him/her. It has to acquire some data about the new user. In this paper we have reviewed several methods for dealing with the new user problem via the ask-to-rate technique. The methods are categorized into non-adaptive and adaptive categories.

Although a few efficient methods for solving the cold-start new user problem have been proposed, the problem is not yet fully solved. During item selection, the new incoming ratings of other users are not considered. Therefore, a future direction can be developing a new method that adapts to the earlier ratings given by other users.

References

[1] G. ADOMAVICIUS, A. TUZHILIN, Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. Knowledge and Data Engineering, IEEE Transactions, 17 (2005), 734–749.
[2] D. BILLSUS, M. J. PAZZANI, Learning collaborative information filters, (1998), pp. 48.
[3] B. SARWAR, G. KARYPIS, J. KONSTAN, J. RIEDL, Item-based collaborative filtering recommendation algorithms, pp. 285–295.
[4] X. SU, T. M. KHOSHGOFTAAR, A survey of collaborative filtering techniques. Advances in Artificial Intelligence, 2009 (2009), pp. 4.
[5] T. N. LILLEGRAVEN, A. C. WOLDEN, Design of a Bayesian recommender system for tourists presenting a solution to the cold-start user problem. Master of Science in Computer Science, Department of Computer and Information Science, Norwegian University of Science and Technology, 2010.
[6] R. BURKE, Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction, 12 (2002), 331–370.
[7] E. RICH, User modeling via stereotypes. Cognitive Science, 3 (1979), 329–354.
[8] T. MAHMOOD, F. RICCI, Towards learning user-adaptive state models in a conversational recommender system, (2007), pp. 373–378.
[9] R. H. GUTTMAN, Merchant differentiation through integrative negotiation in agent-mediated electronic commerce, 1998.
[10] F. RICCI, L. ROKACH, B. SHAPIRA, Introduction to recommender systems handbook. Recommender Systems Handbook, (2011), 1–35.
[11] J. S. BREESE, D. HECKERMAN, C. KADIE, Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, (1998), pp. 43–52.
[12] J. WANG, A. P. DE VRIES, M. J. T. REINDERS, Unifying user-based and item-based collaborative filtering approaches by similarity fusion, pp. 501–508.
[13] H. J. AHN, A new similarity measure for collaborative filtering to alleviate the new user cold-starting problem. Information Sciences, 178 (2008), 37–51.
[14] A. M. RASHID, I. ALBERT, D. COSLEY, S. K. LAM, S. M. MCNEE, J. A. KONSTAN et al., Getting to know you: learning new user preferences in recommender systems. In Proceedings of the 7th International Conference on Intelligent User Interfaces, (2002) San Francisco, California, USA, pp. 127–134.
[15] A. I. SCHEIN, A. POPESCUL, L. H. UNGAR, D. M. PENNOCK, Methods and metrics for cold-start recommendations, pp. 253–260.
[16] M. CRANE, The new user problem in collaborative filtering. Thesis for the degree of Master of Science, Department of Computer Science, University of Otago, Dunedin, New Zealand, 2011.
[17] B. SARWAR, G. KARYPIS, J. KONSTAN, J. RIEDL, Application of dimensionality reduction in recommender system-a case study, DTIC Document, 2000.
[18] A. M. RASHID, G. KARYPIS, J. RIEDL, Learning preferences of new users in recommender systems: an information theoretic approach. ACM SIGKDD Explorations Newsletter, 10 (2008), 90–100.
[19] A. KOHRS, B. MERIALDO, Improving collaborative filtering for new users by smart object selection. In Proceedings of the International Conference on Media Features (ICMF), (2001).
[20] O. NASRAOUI, M. SPILIOPOULOU, O. R. ZADANE, J. SRIVASTAVA, B. MOBASHER, WebKDD 2008: 10 years of knowledge discovery on the web post-workshop report. ACM SIGKDD Explorations Newsletter, 10 (2008), 78–83.
[21] C. BOURAS, V. TSOGKAS, Clustering to Deal with the New User Problem. In Computational Science and Engineering (CSE), 2012 IEEE 15th International Conference on, (2012), pp. 58–65.

Received: September, 2013
Revised: May, 2014
Accepted: May, 2014

Contact addresses:

Mohammad-Hossein Nadimi-Shahraki
Faculty of Computer Engineering
Najafabad branch
Islamic Azad University
Najafabad
Iran
e-mail: [email protected]

Mozhde Bahadorpour
Faculty of Computer Engineering
Najafabad branch
Islamic Azad University
Najafabad
Iran
e-mail: [email protected]

MOHAMMAD-HOSSEIN NADIMI-SHAHRAKI was born in Iran. He received his Ph.D in computer science from University Putra of Malaysia (UPM) in 2010. Currently, he is a full time Assistant Professor at the Faculty of Computer Engineering of Islamic Azad University of Najafabad (IAUN). His research interests include data mining, web mining, social network mining and recommender systems.

MOZHDE BAHADORPOUR was born in Iran. She received her BSc degree in computer software engineering from Islamic Azad University of Najafabad (IAUN) in 2008. She has recently completed her MSc thesis research on the cold-start problem in collaborative recommender systems under the supervision of Dr. Mohammad H. Nadimi from IAUN in 2013. She is currently working as a research assistant in similar research projects. Her research interests include web mining, web personalization and recommender systems.

Journal

Journal of Computing and Information Technology (Unpaywall)

Published: Jan 1, 2014
