A shilling attack detector based on convolutional neural network for collaborative recommender system in social aware network

Abstract

One of the most fundamental tasks in the socially aware network (SAN) paradigm is to explore the attributes and behavior of users, which helps in designing more suitable and efficient protocols. In particular, detecting shilling attackers by mining users' behavior is a frequently discussed topic in many social settings, such as recommender systems based on collaborative filtering. Because the performance of collaborative filtering depends entirely on the ratings provided by users, such systems are vulnerable to shilling attacks, which inject biased profiles into the rating database to manipulate the system. Current shilling attack detection methods detect spam users through artificially designed features, which are neither robust nor efficient enough. This paper presents a novel convolutional neural network-based method named CNN-SAD, which applies a transformed network structure to extract deep-level features from users' rating profiles. Since these deep-level features describe users' ratings more precisely than artificially designed features, CNN-SAD can detect shilling attacks more effectively. According to the experimental results, the proposed method detects the vast majority of obfuscated attacks precisely and outperforms other state-of-the-art algorithms, which contributes to applications and security in SAN.

1. INTRODUCTION

The last few decades have witnessed dramatic developments in wireless communications and networking technologies, which have laid a substantial foundation for the emergence of a modern paradigm: the socially aware network (SAN) [1].
With the increasing number and variety of wireless mobile devices, SAN, a novel paradigm that exploits the social properties of network devices (especially mobile devices), can be applied to many forms of human-centric wireless networks such as pocket switched networks [2], vehicular ad-hoc networks [3] and cyber physical systems [4]. SAN can comprehensively study individuals' social properties, such as personal properties and the human-to-human, human-to-community and human-to-environment relationships. Personal properties include an individual's behavior, attributes, habits and so on. In the SAN paradigm, exploring and analyzing users' attributes and behavior can reveal their preferences and willingness. It is therefore essential to describe the sociability and social behaviors of human beings, and doing so is also of great significance for designing more efficient protocols and applications in the SAN paradigm.

Detection of shilling attackers by mining users' behavior is a frequently discussed topic in recommender systems based on collaborative filtering (CF). The rapid increase of online information has led to the problem of information overload [5]. CF-based approaches, which have achieved great success in multimedia recommendation, have been introduced into computer applications to filter irrelevant information and recommend products to users [6], or even to predict users' future profiles [7], by judging items according to the attitudes of a user's neighbors. However, due to the nature of recommender systems, CF-based approaches are seriously vulnerable to shilling attacks [8–10]. To increase or decrease a target item's rating score, attackers can simply inject numerous automatically generated profiles into a recommender system. Such attacks are referred to as shilling attacks. Prior research has shown that shilling attacks can significantly reduce the effectiveness of recommender systems [11].
Detecting shilling attacks is, therefore, emerging as a major challenge to the stability and effectiveness of recommender systems. Generally speaking, detection methods in this field can be classified into supervised, unsupervised and semi-supervised methods. Typical examples of supervised methods include the k-nearest-neighbor (k-NN)-based classification algorithm, the support vector machine (SVM)-based method and similar supervised methods. There are two major categories of unsupervised shilling attack detection methods: clustering-based methods and principal component analysis (PCA)-based methods. The clustering-based methods follow a straightforward principle, namely clustering the profiles by artificially designed features, whereas the PCA-based methods achieve better identification results.

However, current shilling attack detection methods have some limitations [12]. Traditional supervised methods suffer from obfuscated attacks such as mixed attacks, which combine fake profiles from multiple shilling attack models. Most clustering-based methods are not stable enough, since some normal users resemble spam users [13]. Although PCA-based approaches are effective against most attacks, they still suffer from certain attacks such as the average over popular (AoP) attack. Semi-supervised methods are more stable than the above two kinds of methods, but they are much more complex and require prohibitively long computation time. Another significant drawback of existing methods is that they detect spam users with artificially designed features. These manual features are not highly nonlinear and, moreover, many of them are designed for specific attack models. In other words, traditional features are ineffective against unknown or complicated attacks. Therefore, an algorithm that is accessible, stable and efficient is urgently needed.
In the last decades, deep learning theory has developed rapidly and achieved great success in many fields, such as image classification and speech recognition [14]. Hubel and Wiesel found that cells in the animal visual cortex respond to optical signals [15]; inspired by this finding, Kunihiko Fukushima proposed the Neocognitron, the predecessor of the convolutional neural network, in 1980 [16]. LeCun et al. [17] proposed LeNet-5, a multilayer artificial neural network which established the modern structure of the convolutional neural network. Krizhevsky et al. [18] used a convolutional neural network called AlexNet to classify 1.3 million images and achieved a major breakthrough in image classification. After the success of AlexNet, researchers proposed several more advanced models. Convolutional neural networks have been employed in [19] for handwritten character classification and in [20] for hashtag recommendation.

The main contribution of this paper is a deep convolutional neural network for shilling attack detection, named CNN-SAD, which is more accurate and effective than traditional methods. To the best of our knowledge, the proposed approach is the first method that detects shilling profiles with auto-generated deep-level features, which makes it robust and adaptive against the vast majority of attack methods, even unknown ones. The proposed approach can be summarized as a two-step process. First, a rating matrix is initialized according to a stochastic matrix. Then, the network is trained by means of the back-propagation algorithm and the gradient descent method.

The rest of this paper is organized as follows. Section 2 illustrates the architecture and details of the proposed deep convolutional neural network. Experiments and evaluations that demonstrate the effectiveness and efficiency of CNN-SAD are presented in Section 3. Section 4 summarizes the whole paper.

2. CNN-SAD: A CONVOLUTIONAL NEURAL NETWORK-BASED METHOD

Convolutional neural networks are commonly used in image processing and natural language processing because of their local processing mechanism. This mechanism connects each neuron to only one subset of the input, which reduces the amount of computation. In addition, the network has several layers through which deep-level features are extracted. All these characteristics make the convolutional neural network a suitable candidate for shilling attack detection. The local processing mechanism is naturally suited to clustering-style tasks such as attack detection. Traditional detection methods use manually chosen features, which are not highly nonlinear and are mostly limited to specific attack models, whereas the auto-generated deep-level features adopted by convolutional neural networks are more appropriate. Consequently, a CNN is employed as the algorithm of the proposed detector.

Because the classification task is relatively easy, the architecture of the neural network does not need to be as sophisticated as those used for image recognition, so a relatively simple one has been built. As shown in Fig. 1, the proposed convolutional neural network consists of four layers: one transforming layer, one convolutional feature layer, one pooling layer and one output layer. The architecture is similar to the systems presented in [21].

Figure 1. The architecture of the proposed convolutional neural network (CNN-SAD).

In the following, a brief introduction of the main components of the proposed network is given, including the transformation matrix, the convolutional feature map, the pooling (sampling) layer and the output layer. The process is shown in Algorithm 1.
The following sections also show how deep-level features are extracted from rating profiles and how they are used to detect shilling attacks.

2.1. Transforming

In CNN-SAD, a user's rating profile is treated as an M×1 input vector [r1, r2, …, rM], where M is the number of items and ri represents the score that the current user gave to movie i. As Fig. 1 indicates, the vector is reshaped into a matrix in an order of similarity generated by specific similarity calculation methods, which are elaborated in Section 3.3. The purpose is to cluster similar items, which is inspired by the use of CNNs in natural language processing. There is a similarity between ratings and linguistic sentiment; moreover, the shape and size of the matrix, as well as the similarity calculation method, could affect the detection performance. The relevant experiments are therefore conducted, and the results are shown in Section 3.

Algorithm 1 CNN-SAD
Require: I = [r1, r2, …, rM] is the rating vector of a user in the recommender system.
Ensure: output the result: 1 represents spam user, 0 represents normal user in the recommender system.
1: sortedI = sort(I, sortMethod)  ▹ sort
2: map = reshape(sortedI, shape)  ▹ reshape
3: convResult = sigm(conv(map, convKernel) + convBias)  ▹ convolution
4: poolResult = maxPool(convResult, poolSize)  ▹ pooling
5: output = sigm((poolResult × outWeights) + outBias)  ▹ output

2.2. Convolution

Convolution is the core of CNN-SAD. It is designed to extract the features of a single profile and to find the differences between normal users and shilling attackers for the purpose of detection. As shown in Fig. 1, the input map of the convolutional layer is an m×n rating matrix. Suppose that the kernel size is d×d; then the convolutional result is calculated as

$$c^{f} = \sum_{i}^{d} \sum_{j}^{d} S_{i,j} \times F^{f}_{d-i,\,d-j} \tag{1}$$

where $S$ is a sub-matrix of the rating matrix and $F^{f}$ represents the $f$th filter map. In the proposed method, the stride of every filter map is 1; that is, since the rating matrix belongs to $\mathbb{R}^{m \times n}$, the dimension of the convolutional layer's output matrix is $(m-d+1) \times (n-d+1)$. In the convolutional layer, CNN-SAD uses six feature maps, i.e. six different feature matrices are obtained for the following pooling layer. In practice, a bias parameter is needed for each filter, so a bias vector $b_i \in \mathbb{R}^{f}$ is required for each convolutional layer. Among the many available activation functions, CNN-SAD employs the sigmoid, $S(x) = 1/(1+e^{-x})$. Thus, the calculation of the convolutional layer is

$$x^{l} = f(u^{l}) \quad \text{with} \quad u^{l} = W^{l} \times x^{l-1} + b \tag{2}$$

where $W^{l}$ represents the weight of filter $f$; $b$ denotes the bias parameter of filter $f$; and $x^{l-1}$ denotes the output of layer $l-1$, which is the input to convolutional layer $l$.

2.3. Pooling

In most cases, a pooling layer follows a convolutional layer. The goal of the pooling layer is to aggregate the information and reduce the representation; it can also be seen as a fuzzy filter that performs a secondary feature extraction. There are two major pooling processes: max pooling and mean pooling. CNN-SAD uses max pooling. In addition, in order to reduce the computational complexity and training time, CNN-SAD does not add any weights or biases in the pooling layer.

2.4. Output layer

After the convolutional layer and the pooling layer, as shown in Fig. 1, a series of features of the rating matrix is obtained. CNN-SAD then classifies an input rating matrix according to

$$p(y_j) = \mathrm{sigmoid}\left(y_P^{T} w_j + b_j\right) \tag{3}$$

where $p(y_j)$ represents the probability that the current input rating profile belongs to class $j$; $y_P$ denotes the output of the pooling layer; and $w$ and $b$ are the weight vector and bias vector of the output layer. This yields the probabilities of the two classes, normal and shilling, and the input profile is labeled with the class that has the higher probability.

2.5. To train the network

In CNN-SAD, the back-propagation algorithm is adopted to compute the gradients.
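Before the training equations, the forward pass of Sections 2.1 to 2.4 (the five steps of Algorithm 1) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' TensorFlow implementation: the helper names, the toy sizes (36 items, a 6×6 map, 3×3 kernels, 2×2 pooling), the random weights, the use of 0 for unrated items and the 0.5 decision threshold are all hypothetical choices made for the example.

```python
import math
import random


def sigm(x):
    """Sigmoid activation S(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def reshape(vec, rows, cols, pad=0.0):
    """Lay a flat rating vector out row by row, padding the tail with `pad`."""
    padded = list(vec) + [pad] * (rows * cols - len(vec))
    return [padded[r * cols:(r + 1) * cols] for r in range(rows)]


def conv_valid(mat, kernel, bias):
    """Stride-1 convolution with a flipped d x d kernel, then sigmoid.

    The output has (m - d + 1) x (n - d + 1) entries, as in Section 2.2.
    """
    d = len(kernel)
    m, n = len(mat), len(mat[0])
    out = []
    for r in range(m - d + 1):
        row = []
        for c in range(n - d + 1):
            s = sum(mat[r + i][c + j] * kernel[d - 1 - i][d - 1 - j]
                    for i in range(d) for j in range(d))
            row.append(sigm(s + bias))
        out.append(row)
    return out


def max_pool(mat, p):
    """Non-overlapping p x p max pooling, without weights or biases."""
    rows, cols = len(mat) // p, len(mat[0]) // p
    return [[max(mat[r * p + i][c * p + j]
                 for i in range(p) for j in range(p))
             for c in range(cols)]
            for r in range(rows)]


def forward(ratings, order, shape, kernels, conv_biases, pool_size, out_w, out_b):
    """Return (score, features); score > 0.5 is read as 'spam' (assumed threshold)."""
    sorted_i = [ratings[k] for k in order]          # step 1: sort
    fmap = reshape(sorted_i, *shape)                # step 2: reshape
    feats = []
    for kernel, bias in zip(kernels, conv_biases):
        conv = conv_valid(fmap, kernel, bias)       # step 3: convolution
        pooled = max_pool(conv, pool_size)          # step 4: pooling
        feats.extend(v for row in pooled for v in row)
    score = sigm(sum(f * w for f, w in zip(feats, out_w)) + out_b)  # step 5: output
    return score, feats


# Toy run: 36 items, a 6 x 6 map, six 3 x 3 filters, 2 x 2 pooling.
rng = random.Random(0)
ratings = [rng.choice([0, 1, 2, 3, 4, 5]) for _ in range(36)]  # 0 = unrated (assumption)
order = list(range(36))                                        # identity order for the toy
kernels = [[[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(3)]
           for _ in range(6)]
conv_biases = [0.0] * 6
out_w = [rng.uniform(-0.1, 0.1) for _ in range(6 * 2 * 2)]
score, feats = forward(ratings, order, (6, 6), kernels, conv_biases, 2, out_w, 0.0)
label = 1 if score > 0.5 else 0
```

With these toy sizes each filter produces a 4×4 convolution result and a 2×2 pooled map, so the six filters yield 24 features for the output layer. A single sigmoid output is used here, following step 5 of Algorithm 1, rather than one probability per class as in Equation (3).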
The gradient descent method is utilized to train the proposed model with the loss

$$E(W,b) = \frac{1}{m} \sum_{i=1}^{m} E(W,b,x_i,y_i) \tag{4}$$

$$\phantom{E(W,b)} = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left\| h_{W,b}(x_i) - y_i \right\|^{2} \tag{5}$$

where $E(W,b)$ denotes the loss function used in the proposed approach; $x_i$ and $y_i$ are the input and the expected output, respectively; $m$ represents the size of the training set; and $h_{W,b}(x_i)$ denotes the prediction of the deep convolutional neural network. With the loss function defined, CNN-SAD adopts Equations (6) and (7) to tune the weights and biases of the deep convolutional neural network,

$$W_{ij}^{l} = W_{ij}^{l} - \alpha \frac{\partial}{\partial W_{ij}^{l}} E(W,b) \tag{6}$$

$$b_{i}^{l} = b_{i}^{l} - \alpha \frac{\partial}{\partial b_{i}^{l}} E(W,b) \tag{7}$$

where $W_{ij}^{l}$ is the weight of the $i$th feature map in layer $l$; $b_{i}^{l}$ denotes the bias of feature map $i$ in layer $l$; $E(W,b)$ represents the loss function given in Equation (5); and $\alpha$ denotes the learning rate.

3. EXPERIMENTS AND EVALUATIONS

3.1. Dataset and detection performance metrics

In this paper, the publicly available Netflix dataset and the Movie-Lens 100K dataset are both adopted to test the proposed approach. The Netflix dataset consists of 100 000 000 ratings from 480 000 randomly chosen, anonymous Netflix users over 17 770 movie titles, and the Movie-Lens 100K dataset consists of 100 000 ratings from 943 users on 1682 movies. All ratings are integer values, the largest of which is 5 (like) and the smallest 1 (dislike). Since the full Netflix dataset is too large to be compared directly with the Movie-Lens dataset, a subset of it has been chosen, which consists of about 40 000 ratings from 2000 users on 2000 movies, a scale similar to that of the Movie-Lens dataset. This subset is randomly sampled from the Netflix dataset, so it can be regarded as representative. The subsequent experiments are conducted over these two datasets, whose features are shown in Table 1. The training set and the test set are a 50%/50% split of the whole dataset. For CNN-SAD, the training set contains 500 normal users and 443 spam users, while the test set is formed by 500 normal users and 443×(0.1, 0.15, 0.2) spam users on the Movie-Lens dataset.
On the Netflix dataset, the training set contains 1000 normal users and 1000 spam users, while the test set is formed by 1000 normal users and 1000×(0.1, 0.15, 0.2) spam users.

TABLE 1. Features of datasets.

| Dataset | Users | Items | Ratings | Density (%) |
|---|---|---|---|---|
| Netflix | 2000 | 2000 | 40 000 | 1.09 |
| Movie-Lens | 943 | 1682 | 100 000 | 6.30 |

Three staple metrics, P (precision), R (recall) and F (F-measure), are used to evaluate the performance of the proposed convolutional neural network and the other detection methods against different kinds of shilling attacks. Generally speaking, P and R show the accuracy and completeness of a classifier, respectively, while F gives a global view of the classifier. In this section, only F is shown to present the experimental results. It is worth noting that, for the PCA-based method, the number of spam users is assumed to be known in advance. Since the method then flags exactly as many users as there are spam users, the numbers of false positives and false negatives coincide, so the values of P, R and F are the same. The three metrics are defined as follows,

$$P = \frac{\#\mathrm{TruePositive}}{\#\mathrm{TruePositive} + \#\mathrm{FalsePositive}} \tag{8}$$

$$R = \frac{\#\mathrm{TruePositive}}{\#\mathrm{TruePositive} + \#\mathrm{FalseNegative}} \tag{9}$$

$$F = \frac{2P \times R}{P + R} \tag{10}$$

where #TruePositive represents the number of spam users that are detected successfully by a detector, #FalsePositive denotes the number of normal users that are judged as spam users, and #FalseNegative denotes the number of spam users that are judged as normal users. The TensorFlow framework is employed to conduct the experiments [22].

3.2. Shilling attack models

The objective of shilling attack models is to generate attack profiles and inject them into a PSN. Generally speaking, an attack profile consists of four components: selected items, filler items, non-voted items and a target item.
Selected items are a collection of items chosen by attackers according to some special prior knowledge. Filler items are picked from the system to make attack profiles look like normal profiles. The target item is the one that attackers intend to push or nuke. Figure 2 illustrates the general form of an attack profile.

The variables associated with the experiments are attack size and filler size, where attack size denotes the number of injected spam profiles and filler size denotes the number of filler items in a single injected attack profile. In every experiment, one filler size and one attack size are picked to generate the attack data; one of them is then kept fixed while the other is varied. Each attack model pushes a target movie with attack sizes of 10, 15 and 20%, and the average result of the three is reported. For a specific attack profile, the filler size is set to 1, 3, 5, 7 and 9% on the Movie-Lens dataset, and to 0.5, 0.75, 1, 1.25 and 1.5% on the Netflix dataset, as the two datasets have different densities.

FIGURE 2. The general form of an attack profile.

Six attack models are used in the experiments: random attack, average attack [23], bandwagon attack, segment attack [24], AoP attack [25] and mixed attack. As shown in Table 2, the random attack does not have any selected items; its filler items are randomly chosen from all the items and given ratings that follow N(μ, σ2), where μ and σ2 are the mean and variance of the ratings in the whole system, respectively. A filler item's rating in the average attack is determined by the average rating of that item. However, the average rating of an item can only be acquired through the PSN, so an average attack model requires more information than a random attack model. The bandwagon attack chooses popular items as selected items.
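The random and average attack models just described can be sketched as a small profile generator. This is an illustrative sketch under stated assumptions rather than the generator used in the experiments: the function names, the interpretation of filler size as a fraction of all items, the rounding and clipping of sampled ratings to the 1 to 5 scale, and the values chosen for μ and σ are all hypothetical.

```python
import random

R_MIN, R_MAX = 1, 5  # rating scale of both datasets


def clip_rating(x):
    """Round to an integer rating and clip to [R_MIN, R_MAX] (assumption)."""
    return max(R_MIN, min(R_MAX, round(x)))


def make_attack_profile(model, n_items, target, filler_size,
                        item_means, mu, sigma, rng):
    """Build one push-attack profile as a dict {item index: rating}.

    Non-voted items are simply absent from the dict. `filler_size` is
    taken as a fraction of all items (assumption); `sigma` is the
    standard deviation of the system-wide ratings.
    """
    n_filler = max(1, round(filler_size * n_items))
    candidates = [i for i in range(n_items) if i != target]
    fillers = rng.sample(candidates, n_filler)
    profile = {}
    for i in fillers:
        if model == "random":
            # Filler ratings follow N(mu, sigma^2) over the whole system.
            profile[i] = clip_rating(rng.gauss(mu, sigma))
        elif model == "average":
            # Filler ratings follow the mean rating of each item.
            profile[i] = clip_rating(item_means[i])
        else:
            raise ValueError(f"unsupported attack model: {model}")
    profile[target] = R_MAX  # push attack: the target item gets r_max
    return profile


rng = random.Random(42)
n_items = 200                                   # toy catalogue size
item_means = [rng.uniform(1.0, 5.0) for _ in range(n_items)]
random_profile = make_attack_profile("random", n_items, 17, 0.05,
                                     item_means, 3.5, 1.1, rng)
average_profile = make_attack_profile("average", n_items, 17, 0.05,
                                      item_means, 3.5, 1.1, rng)
```

The sketch makes the difference in required knowledge concrete: the random attack needs only two system-wide statistics, while the average attack needs the mean rating of every filler item.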
Meanwhile, the segment attack selects items that are similar to the target item as selected items. Since information about popular items and the similarities among items can be obtained from other sources, the bandwagon attack and the segment attack require only a small amount of knowledge. The AoP attack is able to greatly reduce the effectiveness of PCA-based methods; it selects filler items stochastically from the top x% most popular items, where x can be varied. The mixed attack model mixes attack profiles generated by the models described above. In this paper, the random attack is mixed with the segment attack, and the average attack is mixed with the bandwagon attack. It is worth noting that the segment attack requires detailed item information and therefore could not be deployed on the Netflix dataset, which lacks this information, so there are seven attack models on the Movie-Lens dataset and five on the Netflix dataset. The parameters of the attack models are listed in Table 2, where IS denotes selected items; IF represents filler items; I∅ denotes non-voted items; and it denotes the target item.

TABLE 2. Attack models.
| Attack type | IS items | IS rating | IF items | IF rating | I∅ | it |
|---|---|---|---|---|---|---|
| Random | Not used | | Randomly chosen | N(μ,σ2) | I−IF | rmax |
| Average | Not used | | Randomly chosen | Item mean | I−IF | rmax |
| Bandwagon | Popular items | rmax | Randomly chosen | N(μ,σ2) | I−(IF∪IS) | rmax |
| Segment | Segmented items | rmax | Randomly chosen | rmin | I−(IF∪IS) | rmax |
| AoP | Not used | | Randomly chosen (top x%) | rmin | I−(IF∪IS) | rmax |
| Mixed | Popular items | rmax | Randomly chosen | rmin or item mean | I−(IF∪IS) | rmax |

3.3. Comparison among various similarity calculation methods

As an optimizing measure, the data map is sorted according to the similarities among the items, which are calculated by various similarity calculation methods, including the rating quantity, the average rating and the optimized cosine similarity. The order of the items is expected to affect the detection results, and the experiment confirms this. The average attack is used in the experiment, with the filler size at 5% and the attack size at 15%.

It turns out that clustering similar items by transforming the input map has a positive impact on detecting shilling attacks with a CNN. As the results in Fig. 3 indicate, the rating quantity and average rating methods accelerate convergence and slightly improve the detection performance compared with the non-sorted input, which only achieves an F value of 92.5% at iteration 20. The rating quantity method produces the best result, 99.34% at iteration 50, and the average rating method takes second place with 98.68%, while the optimized cosine similarity method even has a negative effect. As a reference, the input without sorting reaches 97.4% in the end.

FIGURE 3. Comparison among various similarity calculation methods.

Because it performs best, the rating quantity method is employed in the subsequent experiments, which focus on the convolution kernel in the next layer.

3.4. Effect of the shape of convolution map

The shape of the input map of the convolutional neural network influences the detection process. Instead of the conventional square, a variety of rectangles with different side lengths have been chosen. This experiment is conducted on the Movie-Lens dataset with 1682 items. Traditionally, for every user profile, the vector of length 1682 would be reshaped into a 42×42 matrix with a small amount of padding.
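The sort-then-reshape transformation discussed in Sections 3.3 and 3.4 can be illustrated with a short sketch. It rests on several assumptions that the text does not fix: items are ordered by rating quantity in descending order, unrated items and padding cells are filled with 0, and for a rectangular map the long side is taken as the smallest length that fits all items. All function names are hypothetical.

```python
import math


def rating_quantity_order(profiles, n_items):
    """Item indices sorted by how many users rated them, most-rated first."""
    counts = [0] * n_items
    for profile in profiles:          # each profile maps item index -> rating
        for item in profile:
            counts[item] += 1
    return sorted(range(n_items), key=lambda i: (-counts[i], i))


def ceil_sqrt(n):
    """Smallest integer s with s * s >= n (for n >= 1)."""
    return math.isqrt(n - 1) + 1


def to_map(profile, order, rows, cols, pad=0):
    """Reorder one user's ratings by `order`, pad the tail, cut into rows."""
    if rows * cols < len(order):
        raise ValueError("map is too small for the rating vector")
    flat = [profile.get(item, 0) for item in order]   # 0 = unrated (assumption)
    flat += [pad] * (rows * cols - len(flat))
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


n_items = 1682                            # Movie-Lens 100K item count
side = ceil_sqrt(n_items)                 # 42, the conventional square map
padding = side * side - n_items           # 82 padded cells

# Three toy user profiles; item 7 is the most frequently rated.
profiles = [{0: 5, 7: 3}, {7: 4, 9: 1}, {7: 2}]
order = rating_quantity_order(profiles, n_items)
square = to_map(profiles[0], order, side, side)

short = 22                                # a short side used in Section 3.4
long_side = math.ceil(n_items / short)    # 77 under the smallest-fit assumption
rect = to_map(profiles[0], order, short, long_side)
```

Under these assumptions the 42×42 square needs 82 padded cells, whereas a 22×77 rectangle needs only 12, which shows how the choice of shape changes the layout that the convolution kernels see.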
However, in this experiment, the vector is reshaped into rectangles with short side lengths of 30, 26, 22, 18 and 14, respectively, to study the effect of varying the shape. As Fig. 4 indicates, the performance peaks when the short side is 22, reaching 100% at iteration 14, and becomes slightly worse as the side length moves away from this value. Consequently, it is inferred that the shape of the input map does have an optimum choice, which depends on the situation.

FIGURE 4. Comparison among various convolution map shapes.

3.5. Comparison among CNN-SAD and other detection methods

To fully demonstrate the effectiveness of the proposed method, CNN-SAD is compared with four other state-of-the-art methods, namely a PCA-based method, a PLSA-based method [26], a k-means-based clustering method [27] and a semi-supervised method, Semi-SAD [28], against several different attacks over two different datasets. Figures 5 to 7 show the performance.

On both datasets, as shown in the figures, the F values of CNN-SAD against most attacks are over 99%, except for the mixed attack. That is to say, under most circumstances, the proposed approach detects shilling attack profiles generated by a single attack model with barely any false detections.

Semi-SAD also performs well against all three kinds of obfuscated attacks, with F values mostly higher than 0.9. Meanwhile, the clustering-based, PCA-based and PLSA-based approaches do not perform as well. The F measure of the PCA-based and PLSA-based methods can hardly reach 20%, while the clustering-based method has an extremely unstable outcome.

The PCA-based method performs well against all four kinds of obfuscated attacks, with F values higher than 70%. Meanwhile, the clustering-based approaches are unstable.
Though Semi-SAD is not as effective or efficient as CNN-SAD, it outperforms the other methods.

FIGURE 5. Comparison of detectors over Netflix dataset (%). (a) Random attack, (b) average attack, (c) bandwagon attack and (d) average+bandwagon attack.

FIGURE 6. Comparison of detectors over Movie-Lens dataset (%). (a) Random attack, (b) average attack, (c) bandwagon attack, (d) segment attack, (e) average+bandwagon attack and (f) random+segment attack.

FIGURE 7. Comparison of detectors against AoP attacks (%). (a) Movie-Lens, (b) Netflix, (c) different AoP rates over Movie-Lens and (d) different AoP rates over Netflix.

Figure 7c and d illustrate the variation of F when the AoP rate changes and the filler size is fixed at 5% on the Movie-Lens dataset and at 1% on the Netflix dataset. From the figures, it is clear that CNN-SAD performs best, achieving almost 100% accuracy. In other words, the AoP rate has little influence on the proposed method.

Comparing the results on the two datasets, it is easy to conclude that almost all detection methods are affected by the variation of data density, since the average performance on the low-density Netflix subset is lower than that on the Movie-Lens dataset.
The performance of the proposed CNN-SAD method barely changes, while that of the PCA-based and PLSA-based methods drops drastically. It evidently becomes harder for these methods to extract features when the density declines and the gross size increases, but not for CNN-SAD, which uses auto-generated deep-level features. The detection of CNN-SAD becomes even more accurate thanks to the training process. As mentioned before, some normal users are similar to spam users, so clustering algorithms are likely to judge these normal users as spam users.

4. CONCLUSION

SAN is emerging as a new paradigm that exploits the social properties of network nodes to design efficient and effective networking protocols. Detecting shilling attacks in recommender systems, as one research area within the SAN paradigm, is a pressing problem. However, existing methods can only detect spam users through artificially designed features, which can hardly meet the requirement of being highly nonlinear. In this paper, a convolutional neural network is proposed that aims at automatically extracting representative and discriminative deep-level features. The proposed method is composed of one matrix transformation layer, one convolutional layer, one pooling layer and one output layer. The purpose of the matrix transformation layer is to convert users' rating profiles into matrices that meet the input requirements of convolutional neural networks. The convolutional layer extracts abstract deep-level features from users' rating profiles. Through the pooling layer, the proposed method not only conducts a secondary feature extraction but also reduces the representation. The comparison between the proposed method and four other state-of-the-art approaches against six different attack strategies shows that the proposed approach outperforms the other algorithms under most circumstances. The proposed method can detect spam users with an accuracy of over 98% under most circumstances.
Detecting shilling attacks using automatically extracted high-level features has proven efficient and promising. This work detects shilling attacks in the recommender system more accurately and effectively than other methods, which contributes to the development of SAN applications and security. In the future, we will train the deep convolutional neural network model on larger and more varied datasets using novel training tactics.

FUNDING

National Natural Science Foundation of China (61472024 and U1433203); the National Funding from the FCT (Fundação para a Ciência e a Tecnologia) through the UID/EEA/50008/2013 Project; the Government of Russian Federation (Grant 074-U01); Finep, with resources from Funttel (Grant no. 01.14.0231.00), under the Centro de Referência em Radiocomunicações (CRR) project of the Instituto Nacional de Telecomunicações (Inatel), Brazil.

REFERENCES

1 Xia, F., Liu, L., Li, J., Ma, J. and Vasilakos, A.V. (2017) Socially aware networking: a survey. IEEE Syst. J., 9, 904–921.
2 Pan, H., Chaintreau, A., Gass, R., Scott, J., Crowcroft, J. and Diot, C. (2005) Pocket Switched Networking: Challenges, Feasibility and Implementation Issues. Autonomic Communication. Springer, Berlin Heidelberg.
3 Chen, W., Guha, R.K., Kwon, T.J., Lee, J. and Hsu, Y.Y. (2011) A survey and challenges in routing and data dissemination in vehicular ad-hoc networks. Wireless Commun. Mobile Comput., 11, 787–795.
4 Wu, F.J., Kao, Y.F. and Tseng, Y.C. (2011) From wireless sensor networks towards cyber physical systems. Pervasive Mobile Comput., 7, 397–413.
5 Edmunds, A. and Morris, A. (2000) The problem of information overload in business organisations: a review of the literature. Int. J. Inform. Manag., 20, 17–28.
6 Herlocker, J.L., Konstan, J.A., Terveen, L.G. and Riedl, J.T. (2004) Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst., 22, 5–53.
7 Lu, Z., Pan, S.J., Li, Y., Jiang, J. and Yang, Q. (2016) Collaborative Evolution for User Profiling in Recommender Systems. IJCAI, New York City, USA.
8 O’Mahony, M., Hurley, N. and Kushmerick, N. (2004) Collaborative recommendation: a robustness analysis. ACM Trans. Internet Technol., 4, 344–377.
9 Mobasher, B., Burke, R., Bhaumik, R. and Williams, C. (2007) Toward trustworthy recommender systems: an analysis of attack models and algorithm robustness. ACM Trans. Internet Technol., 7, 23.
10 Mehta, B. and Nejdl, W. (2008) Attack Resistant Collaborative Filtering. Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 75–82, ACM.
11 Gunes, I., Kaleli, C., Bilge, A. and Polat, H. (2014) Shilling attacks against recommender systems: a comprehensive survey. Artif. Intell. Rev., 42, 767–799.
12 Wang, Y., Zhang, L., Tao, H., Wu, Z. and Cao, J. (2015) A Comparative Study of Shilling Attack Detectors for Recommender Systems. 2015 12th International Conference on Service Systems and Service Management, pp. 1–6, IEEE.
13 Patel, K., Thakkar, A., Shah, C. and Makvana, K. (2016) A State of Art Survey on Shilling Attack in Collaborative Filtering Based Recommendation System. Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems, pp. 377–385, Springer.
14 Schmidhuber, J. (2015) Deep learning in neural networks: an overview. Neural Netw., 61, 85–117.
15 Hubel, D.H. and Wiesel, T.N. (1968) Receptive fields and functional architecture of monkey striate cortex. J. Physiol., 195, 215–243.
16 Fukushima, K. (1980) Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern., 36, 193–202.
17 LeCun, Y., Jackel, L.D., Bottou, L., Brunot, A., Cortes, C., Denker, J.S., Drucker, H., Guyon, I., Muller, U.A. and Sackinger, E. (1995) Comparison of learning algorithms for handwritten digit recognition. Int. Conf. Artif. Neural Netw., 60, 53–60.
18 Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012) Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst., 1097–1105.
19 Ciresan, D.C., Meier, U., Gambardella, L.M. and Schmidhuber, J. (2011) Convolutional Neural Network Committees for Handwritten Character Classification. 2011 International Conference on Document Analysis and Recognition, pp. 1135–1139, IEEE.
20 Gong, Y. and Zhang, Q. (2016) Hashtag Recommendation Using Attention-Based Convolutional Neural Network. IJCAI 2016, Proceedings of the 25th International Joint Conference on Artificial Intelligence, New York City, USA.
21 Kalchbrenner, N., Grefenstette, E. and Blunsom, P. (2014) A convolutional neural network for modelling sentences. Eprint Arxiv, 1.
22 Abadi, M., Agarwal, A., Barham, P. et al. TensorFlow: large-scale machine learning on heterogeneous systems. http://tensorflow.org/ (accessed 11 January 2018).
23 Burke, R., Mobasher, B., Williams, C. and Bhaumik, R. (2006) Classification features for attack detection in collaborative recommender systems. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 542–547, ACM.
24 Williams, C.A. and Mobasher, B. (2006) Profile injection attack detection for securing collaborative recommender systems. Thesis, Service Oriented Computing and Applications, 1(3), 157–170.
25 Hurley, N., Cheng, Z. and Zhang, M. (2009) Statistical Attack Detection. ACM Conference on Recommender Systems, Recsys 2009, New York, NY, USA, October, pp. 149–156.
26 Mehta, B. (2007) Unsupervised Shilling Detection for Collaborative Filtering. AAAI Conference on Artificial Intelligence, July 22–26, 2007, Vancouver, British Columbia, Canada, Vol. 1, pp. 1402–1407.
27 Bhaumik, R., Mobasher, B. and Burke, R. (2012) A clustering approach to unsupervised attack detection in collaborative recommender systems. Proceedings of the 7th IEEE International Conference on Data Mining, Las Vegas, NV, USA, pp. 181–187.
28 Cao, J., Wu, Z., Mao, B. and Zhang, Y. (2013) Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system. World Wide Web, 16, 729–748.

Footnotes

1 http://netflixprize.com/index.html
2 http://grouplens.org/datasets/movielens/

Author notes

Handling editor: Zhaolong Ning

© The British Computer Society 2018. All rights reserved. For Permissions, please email: journals.permissions@oup.com

The Computer Journal, Oxford University Press

Publisher: Oxford University Press
Copyright: © The British Computer Society 2018. All rights reserved. For Permissions, please email: journals.permissions@oup.com
ISSN: 0010-4620
eISSN: 1460-2067
DOI: 10.1093/comjnl/bxy008

Abstract

One of the most fundamental tasks in the socially aware network (SAN) paradigm is to explore the attributes and behavior of users, which helps in designing more suitable and efficient protocols. In particular, detecting shilling attackers by mining users’ behavior is a frequently discussed topic in social settings such as recommender systems based on collaborative filtering. Because the performance of collaborative filtering depends entirely on ratings provided by users, such systems are vulnerable to shilling attacks, which inject biased profiles into the rating database to alter the system’s output. Current shilling attack detection methods detect spam users through artificially designed features, which are neither robust nor efficient enough. This paper presents a novel convolutional neural network-based method named CNN-SAD, which applies a transformed network structure to extract deep-level features from users’ rating profiles. Since these deep-level features describe users’ ratings more precisely than artificially designed features, CNN-SAD can detect shilling attacks more effectively. According to the experimental results, the proposed method detects the vast majority of obfuscated attacks precisely and outperforms other state-of-the-art algorithms, which contributes to applications and security in SAN.

1. INTRODUCTION

The last few decades have witnessed dramatic developments in wireless communications and networking technologies, which have laid a substantial foundation for the emergence of a modern paradigm: the socially aware network (SAN) [1]. With the increasing number and variety of wireless mobile devices, SAN, as a novel paradigm that exploits the social properties of network devices, especially mobile devices, can be applied to many forms of human-centric wireless networks such as pocket switched networks [2], vehicular ad-hoc networks [3] and cyber physical systems [4].
SAN can comprehensively study individuals’ social properties, such as personal properties and the human-to-human, human-to-community and human-to-environment relationships. Personal properties include an individual’s behavior, attributes, habits and so on. In the SAN paradigm, exploring and analyzing users’ attributes and behavior reveals their preferences and willingness. It is therefore essential to describe the sociability and social behaviors of human beings, and doing so is of great significance for designing more efficient protocols and applications for the SAN paradigm.

Detection of shilling attackers by mining users’ behavior is a frequently discussed topic in recommender systems based on collaborative filtering (CF). The rapid increase of on-line information has led to the problem of information overload [5]. CF-based approaches, which have achieved great success in multimedia recommendation, have been introduced into computer applications to filter irrelevant information and recommend products to users [6], or even to predict users’ future profiles [7], by judging items according to the attitudes of users’ neighbors. However, due to the nature of recommender systems, CF-based approaches are seriously vulnerable to shilling attacks [8–10]. To increase or decrease a target item’s rating score, attackers can simply inject numerous automatically generated profiles into a recommender system. Such attacks are referred to as shilling attacks. Prior research has shown that shilling attacks can significantly reduce the effectiveness of recommender systems [11]. Detecting shilling attacks is therefore emerging as a major challenge to the stability and effectiveness of recommender systems.

Generally speaking, detection methods in this field can be classified into supervised, unsupervised and semi-supervised methods.
Typical examples of supervised methods include the k-nearest-neighbor (k-nn)-based classification algorithm, the support vector machine (SVM)-based method and similar supervised methods. There are two major categories of unsupervised shilling attack detection methods: clustering-based methods and principal component analysis (PCA)-based methods. The clustering-based methods follow a straightforward principle: they cluster the profiles by artificially designed features. The PCA-based methods, meanwhile, achieve better identification results.

However, current shilling attack detection methods have some limitations [12]. Traditional supervised methods suffer from obfuscated attacks such as mixed attacks, which combine fake profiles from multiple shilling attack methods. Most clustering-based methods are not stable enough, since some normal users resemble spam users [13]. Although PCA-based approaches are effective against most attacks, they still suffer from certain attacks such as the average over popular (AoP) attack. Semi-supervised methods are more stable than the two kinds of methods above, but they are much more complex than the other algorithms and require an unacceptably long computation time.

Another significant drawback of existing methods is that they detect spam users with artificially designed features. These manual features are not highly nonlinear, and moreover, many features for shilling attack detection are designed for specific attack models. In other words, traditional features are ineffective against unknown or complicated attacks. Therefore, an algorithm that is accessible, stable and efficient is urgently needed.

In recent decades, deep learning theory has developed rapidly and achieved great success in many fields such as image classification and speech recognition [14].
Hubel and Wiesel found that cells in an animal’s visual cortex respond to optical signals [15], and inspired by this, Kunihiko Fukushima proposed the Neocognitron, the predecessor of the convolutional neural network, in 1980 [16]. LeCun et al. [17] proposed LeNet-5, a multilayer artificial neural network that established the modern structure of the convolutional neural network. Krizhevsky et al. [18] used a convolutional neural network called AlexNet to classify 1.3 million images and achieved a major breakthrough in image classification. After the success of AlexNet, researchers proposed several more advanced models. Convolutional neural networks are employed in [19] for handwritten character classification and in [20] for hashtag recommendation.

The main contribution of this paper is a deep convolutional neural network for shilling attack detection, named CNN-SAD, which is more accurate and effective than traditional methods. To the best of our knowledge, the proposed approach is the first method that detects shilling profiles with auto-generated deep-level features, which makes it robust and adaptive against the vast majority of attack methods, even unknown ones. The proposed approach can be summarized as a two-step process. First, a rating matrix is initialized according to a stochastic matrix. Then, the network is trained using the back propagation algorithm and the gradient descent method.

The rest of this paper is organized as follows. Section 2 illustrates the architecture and details of the proposed deep convolutional neural network. Experiments and evaluations that demonstrate the effectiveness and efficiency of CNN-SAD are presented in Section 3. Section 4 summarizes the whole paper.

2. CNN-SAD: A CONVOLUTIONAL NEURAL NETWORK-BASED METHOD

Convolutional neural networks are widely used in image processing and natural language processing because of their local processing mechanism.
This mechanism allows each neuron to connect to only a subset of the input, which reduces the amount of computation. In addition, the network has several layers through which deep-level features are extracted. These characteristics make the convolutional neural network a suitable candidate for shilling attack detection, since the local processing mechanism is well suited to tasks such as attack detection. Traditional detection methods use manually chosen features, which are not highly nonlinear and are mostly limited to specific attack models, whereas the auto-generated deep-level features used by convolutional neural networks are more appropriate. Consequently, a CNN is employed as the algorithm of the proposed detector.

Because the classification task is comparatively easy, the architecture of the neural network does not need to be as sophisticated as one for image recognition, so a relatively simple network has been built. As shown in Fig. 1, the proposed convolutional neural network consists of four layers: one transforming layer, one convolutional feature layer, one pooling layer and one output layer. The architecture is similar to the systems presented in [21].

Figure 1. The architecture of the proposed convolutional neural network (CNN-SAD).

In the following, a brief introduction to the main components of the proposed network is given, including the transformation matrix, the convolutional feature map, the pooling (sampling) layer and the output layer. The process is shown in Algorithm 1. How deep-level features are extracted from rating profiles, and hence how shilling attacks are detected, is also demonstrated in the following sections.

2.1. Transforming

In CNN-SAD, the input is a user’s rating profile, treated as an M×1 vector [r_1, r_2, …, r_M], where M is the number of items and r_i is the score that the current user gave to movie i. As Fig. 1 indicates, the vector is reshaped into a matrix in the order of similarity produced by specific similarity calculation methods, which are elaborated in Section 3.3. The purpose is to cluster similar items together, a practice inspired by the use of CNNs in natural language processing, since ratings resemble linguistic sentiment. Moreover, the shape and size of the matrix as well as the similarity calculation method can affect detection performance. The relevant experiments are therefore conducted, and the results are shown in Section 3.

Algorithm 1 CNN-SAD
Require: I = [r_1, r_2, …, r_M] is the rating vector of a user in the recommender system.
Ensure: output the result: 1 represents a spam user, 0 represents a normal user in the recommender system.
1: sortedI = sort(I, sortMethod)  ▹ sort
2: map = reshape(sortedI, shape)  ▹ reshape
3: convResult = sigm(conv(map, convKernel) + convBias)  ▹ convolution
4: poolResult = maxPool(convResult, poolSize)  ▹ pooling
5: output = sigm((poolResult × outWeights) + outBias)  ▹ output

2.2. Convolution

Convolution is the core of CNN-SAD. It is designed to extract the features of a single profile and to find the differences between normal users and shilling attackers for the purpose of detection. As shown in Fig. 1, the input map of the convolutional layer is an m×n rating matrix. Suppose that the kernel size is d×d; then the convolutional result is calculated as

$$c^{f}=\sum_{i}^{d}\sum_{j}^{d} S_{i,j}\times F^{f}_{d-i,\,d-j} \tag{1}$$

where S is a sub-matrix of the rating matrix and F^f represents the fth filter map. In the proposed method, every filter map has a stride of 1; that is, since the rating matrix is in R^{m×n}, the dimension of the convolutional layer’s output matrix is (m−d+1)×(n−d+1). The convolutional layer of CNN-SAD has six feature maps, so six different feature matrices are passed to the following pooling layer. In practice, a bias parameter is needed for each filter, so a bias vector b_i ∈ R^f is required for each convolutional layer. Among the many available activation functions, CNN-SAD employs the sigmoid, S(x) = 1/(1+e^{−x}).
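The first three steps of Algorithm 1 (sort, reshape, convolve) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors’ implementation: the helper names (`sort_by_rating_quantity`, `reshape`, `conv_valid`), the use of 0 for unrated items, the zero padding, and the toy numbers are all hypothetical. The kernel is indexed in flipped order to mirror Equation (1), and the stride is 1, so an m×n input and a d×d kernel give an (m−d+1)×(n−d+1) feature map.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sort_by_rating_quantity(profile, item_counts):
    """Reorder one user's ratings so that the most frequently rated items come first."""
    order = sorted(range(len(profile)), key=lambda i: -item_counts[i])
    return [profile[i] for i in order]

def reshape(vector, rows, cols, pad=0.0):
    """Pad the vector to rows*cols entries (assumption: zero padding) and cut it into rows."""
    if len(vector) > rows * cols:
        raise ValueError("matrix too small for the rating vector")
    padded = list(vector) + [pad] * (rows * cols - len(vector))
    return [padded[r * cols:(r + 1) * cols] for r in range(rows)]

def conv_valid(matrix, kernel, bias=0.0):
    """Stride-1 convolution with a flipped d x d kernel, followed by the sigmoid."""
    m, n, d = len(matrix), len(matrix[0]), len(kernel)
    out = []
    for r in range(m - d + 1):
        row = []
        for c in range(n - d + 1):
            total = sum(matrix[r + i][c + j] * kernel[d - 1 - i][d - 1 - j]
                        for i in range(d) for j in range(d))
            row.append(sigmoid(total + bias))
        out.append(row)
    return out

# Toy example: 10 items, hypothetical ratings and per-item rating counts.
ratings = [5, 0, 3, 0, 4, 1, 0, 2, 5, 0]      # 0 marks an unrated item (assumption)
counts = [10, 3, 8, 1, 7, 2, 4, 6, 9, 5]
grid = reshape(sort_by_rating_quantity(ratings, counts), 3, 4)
kernel = [[0.1, -0.2], [0.3, 0.05]]           # one hypothetical 2 x 2 filter
feature_map = conv_valid(grid, kernel)
print(len(feature_map), len(feature_map[0]))  # (3-2+1) x (4-2+1) -> prints "2 3"
```

With six such kernels, as in the paper, the same call would simply be repeated once per filter to obtain six feature maps.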
With the sigmoid activation, the computation of the convolutional layer can be written as

$$x^{l}=f(u^{l}) \quad \text{with} \quad u^{l}=W^{l}\times x^{l-1}+b \tag{2}$$

where f(·) is the activation function, W^l represents the filter weights of layer l, b denotes the bias parameter of the filter, x^{l−1} is the output of layer l−1 and x^l is the output of layer l.

2.3. Pooling

In most cases, a pooling layer follows a convolutional layer. The goal of the pooling layer is to aggregate information and reduce the representation; it can also be seen as a fuzzy filter that performs a secondary feature extraction. There are two major pooling processes: max pooling and mean pooling. CNN-SAD uses max pooling. In addition, to reduce computational complexity and training time, CNN-SAD does not add any weights or biases in the pooling layer.

2.4. Output layer

After the convolutional layer and the pooling layer, as shown in Fig. 1, a series of features of the rating matrix is obtained. CNN-SAD then classifies an input rating matrix according to

$$p(y_{j})=\mathrm{sigmoid}\left(y_{P}^{T}w_{j}+b_{j}\right) \tag{3}$$

where p(y_j) represents the probability that the current input rating profile belongs to class j, y_P denotes the output of the pooling layer, and w and b are the weight vector and bias vector of the output layer. This yields the probabilities of the two classes, normal and shilling, and the input profile is labeled with the class that has the larger probability.

2.5. To train the network

In CNN-SAD, the back propagation algorithm is adopted to compute the gradients, and the gradient descent method is used to train the proposed model with the loss

$$E(W,b)=\frac{1}{m}\sum_{i=1}^{m}E(W,b,x_{i},y_{i}) \tag{4}$$

$$=\frac{1}{m}\sum_{i=1}^{m}\frac{1}{2}\left\|h_{W,b}(x_{i})-y_{i}\right\|^{2} \tag{5}$$

where E(W,b) denotes the loss function used in the proposed approach, x_i and y_i are the ith input and its label, respectively, m is the size of the training set and h_{W,b}(x_i) denotes the prediction of the deep convolutional neural network.
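The pooling layer, the output layer of Equation (3) and the loss of Equations (4) and (5) can be sketched together as below. This is a simplified illustration and not the authors’ code: it uses a single sigmoid output (probability of "spam") instead of one output per class, non-overlapping pooling windows, and a hand-derived gradient for the output layer only; all function names and numbers are assumptions made for the example. The update moves each parameter against its gradient, which is the standard gradient descent rule.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def max_pool(feature_map, size):
    """Non-overlapping size x size max pooling; ragged edges are dropped (assumption)."""
    rows, cols = len(feature_map) // size, len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)] for r in range(rows)]

def flatten(pooled_maps):
    """Concatenate all pooled feature maps into one feature vector y_P."""
    return [v for fmap in pooled_maps for row in fmap for v in row]

def predict(features, weights, bias):
    """Eq. (3) for a single output unit: sigmoid(y_P^T w + b)."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)

def loss(predictions, labels):
    """Eqs. (4)-(5): mean over the samples of half the squared error."""
    return sum(0.5 * (p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)

def gradient_step(features, label, weights, bias, lr):
    """One gradient descent update of the output layer for one sample."""
    p = predict(features, weights, bias)
    delta = (p - label) * p * (1.0 - p)   # dE/dz for E = 0.5 * (p - y)^2, p = sigmoid(z)
    new_weights = [w - lr * delta * f for w, f in zip(weights, features)]
    return new_weights, bias - lr * delta

# Toy example with one hypothetical 4 x 4 feature map and a spam label of 1.
fmap = [[0.1, 0.2, 0.3, 0.4],
        [0.5, 0.6, 0.7, 0.8],
        [0.2, 0.1, 0.4, 0.3],
        [0.6, 0.5, 0.8, 0.9]]
features = flatten([max_pool(fmap, 2)])
weights, bias = [0.0] * len(features), 0.0
before = loss([predict(features, weights, bias)], [1])
weights, bias = gradient_step(features, 1, weights, bias, lr=0.5)
after = loss([predict(features, weights, bias)], [1])
print(after < before)  # the step should lower the loss on this sample
```

In the full network the same gradient would be propagated back through the pooling and convolutional layers; the sketch stops at the output layer to stay short.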
With the loss function defined, CNN-SAD adopts Equations (6) and (7) to tune the weights and biases of the deep convolutional neural network,

$$W_{ij}^{l}=W_{ij}^{l}-\alpha\frac{\partial}{\partial W_{ij}^{l}}E(W,b) \tag{6}$$

$$b_{i}^{l}=b_{i}^{l}-\alpha\frac{\partial}{\partial b_{i}^{l}}E(W,b) \tag{7}$$

where W_{ij}^l denotes the weight of the ith feature map in layer l, b_i^l denotes the bias of feature map i in layer l, E(W,b) represents the loss function given in Equation (5) and α denotes the learning rate.

3. EXPERIMENTS AND EVALUATIONS

3.1. Dataset and detection performance metrics

In this paper, the publicly available Netflix dataset (footnote 1) and Movie-Lens 100K dataset (footnote 2) are both used to test the proposed approach. The Netflix dataset consists of 100 000 000 ratings from 480 000 randomly chosen, anonymous Netflix users over 17 770 movie titles, and the Movie-Lens 100K dataset consists of 100 000 ratings from 943 users on 1682 movies. All ratings are integer values, from 1 (dislike) to 5 (like). Since the Netflix dataset is too large to be compared directly with the Movie-Lens dataset, a subset of it has been chosen, which consists of about 40 000 ratings from 2000 users on 2000 movies, a scale similar to that of the Movie-Lens dataset. This subset is randomly sampled from the Netflix dataset, so it is representative of the full dataset. The subsequent experiments are conducted over these two datasets, whose features are shown in Table 1.

The training set and test set are a 50%/50% split of the whole dataset. For CNN-SAD on the Movie-Lens dataset, the training set contains 500 normal users and 443 spam users, while the test set is formed by 500 normal users and 443×(0.1, 0.15, 0.2) spam users. On the Netflix dataset, the training set contains 1000 normal users and 1000 spam users, while the test set is formed by 1000 normal users and 1000×(0.1, 0.15, 0.2) spam users.

TABLE 1. Features of datasets.

Dataset | User | Item | Rating | Density (%)
Netflix | 2000 | 2000 | 40 000 | 1.09
Movie-Lens | 943 | 1682 | 100 000 | 6.30

Three standard metrics, P (precision), R (recall) and F (F-measure), are used to evaluate the performance of the proposed convolutional neural network and the other detection methods against different kinds of shilling attacks. Generally speaking, P and R show the accuracy and completeness of a classifier, respectively, and F gives a global view of the classifier. In this section, only F is shown to report experimental results. It is worth noting that, for the PCA-based method, the number of spam users is assumed to be known in advance; therefore, its values of P, R and F are the same. The three metrics are defined as follows,

$$P=\frac{\#\mathrm{TruePositive}}{\#\mathrm{TruePositive}+\#\mathrm{FalsePositive}} \tag{8}$$

$$R=\frac{\#\mathrm{TruePositive}}{\#\mathrm{TruePositive}+\#\mathrm{FalseNegative}} \tag{9}$$

$$F=\frac{2P\times R}{P+R} \tag{10}$$

where #TruePositive is the number of spam users that are detected successfully by a detector, #FalsePositive is the number of normal users that are judged as spam users and #FalseNegative is the number of spam users that the detector misses. The TensorFlow framework is employed to conduct the experiments [22].

3.2. Shilling attack models

The objective of shilling attack models is to generate attack profiles and inject them into the recommender system. Generally speaking, an attack profile consists of four components: selected items, filler items, non-voted items and a target item. Selected items are a collection of items chosen by attackers according to some special prior knowledge. Filler items are picked from the system to make attack profiles look like normal profiles.
The target item is the one that attackers intend to push or nuke. Figure 2 illustrates the general form of an attack profile.

FIGURE 2. The general form of an attack profile.

The variables associated with the experiments are attack size and filler size, where attack size denotes the number of injected spam profiles and filler size denotes the number of filler items in a single injected attack profile. In every experiment, one filler size and one attack size are picked to generate the attack data; one of them is then kept fixed while the other is varied. Each attack model pushes a target movie with attack sizes of 10, 15 and 20%, and the average result of the three is shown. For a specific attack profile, the filler size is set to 1, 3, 5, 7 and 9% on the Movie-Lens dataset, and to 0.5, 0.75, 1, 1.25 and 1.5% on the Netflix dataset, since the two datasets have different densities.

Six attack models are used in the experiments: random attack, average attack [23], bandwagon attack, segment attack [24], AoP attack [25] and mixed attack. As shown in Table 2, the random attack does not have any selected items; its filler items are chosen randomly from all items and given ratings that follow N(μ,σ²), where μ and σ are the mean and standard deviation of the ratings in the whole system, respectively. A filler item’s rating in the average attack is determined by the average rating of that item. However, the average rating of an item can only be acquired through the recommender system, so an average attack model requires more information than a random attack model. The bandwagon attack chooses popular items as selected items, whereas the segment attack selects items that are similar to the target item. Since information about popular items and similarities among items can be obtained from other sources, the bandwagon attack and the segment attack require only a small amount of knowledge.
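To make the attack models above concrete, the following minimal sketch generates push-attack profiles for the random, average and bandwagon models. It is an illustration under assumptions, not the generator used in the experiments: the function name `make_attack_profile`, the use of 0 for non-voted items, the rounding and clipping of ratings to the 1 to 5 scale, and the interpretation of filler size as a fraction of all items are all choices made for this example. Selected items and the target item receive the maximum rating, as listed in Table 2.

```python
import random

R_MIN, R_MAX = 1, 5

def clip_rating(x):
    """Round to an integer rating and keep it inside [R_MIN, R_MAX]."""
    return max(R_MIN, min(R_MAX, int(round(x))))

def make_attack_profile(kind, n_items, target, filler_size, item_means,
                        global_mean, global_std, popular_items=(), rng=random):
    """Build one push-attack profile as a list of ratings (0 = non-voted item)."""
    if kind not in ("random", "average", "bandwagon"):
        raise ValueError("unsupported attack model: " + kind)
    profile = [0] * n_items
    # Selected items: only the bandwagon attack uses them (popular items, rated r_max).
    selected = [i for i in popular_items if i != target] if kind == "bandwagon" else []
    for i in selected:
        profile[i] = R_MAX
    # Filler items: drawn at random from the remaining items.
    candidates = [i for i in range(n_items) if i != target and i not in selected]
    n_fill = min(len(candidates), int(round(filler_size * n_items)))
    for i in rng.sample(candidates, n_fill):
        if kind == "average":
            profile[i] = clip_rating(item_means[i])                     # item mean
        else:
            profile[i] = clip_rating(rng.gauss(global_mean, global_std))  # N(mu, sigma^2)
    profile[target] = R_MAX  # push the target item
    return profile

# Toy usage: 100 items, 5% filler size, hypothetical statistics.
rng = random.Random(0)
means = [3.0] * 100
avg_profile = make_attack_profile("average", 100, 7, 0.05, means, 3.5, 1.1, rng=rng)
bw_profile = make_attack_profile("bandwagon", 100, 7, 0.05, means, 3.5, 1.1,
                                 popular_items=(1, 2), rng=rng)
print(sum(1 for r in avg_profile if r), sum(1 for r in bw_profile if r))  # prints "6 8"
```

A mixed attack in the sense of the paper could then be approximated by combining profiles produced by different `kind` values into one injected set.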
The AoP attack is able to greatly reduce the effectiveness of PCA-based methods. It selects filler items at random from the top x% most popular items, where x can be varied. The mixed attack model mixes attack profiles generated by the models described above. In this paper, the random attack is mixed with the segment attack, and the average attack is mixed with the bandwagon attack. It is worth noting that the segment attack requires detailed item information and therefore could not be deployed on the Netflix dataset, which lacks this information, so there are seven attack models on the Movie-Lens dataset and five on the Netflix dataset. The parameters of the attack models are listed in Table 2, where I_S denotes the selected items, I_F the filler items, I_∅ the non-voted items and i_t the target item.

TABLE 2. Attack models.

Attack type | I_S items | I_S rating | I_F items | I_F rating | I_∅ | i_t
Random | Not used | - | Randomly chosen | N(μ,σ²) | I−I_F | r_max
Average | Not used | - | Randomly chosen | Item mean | I−I_F | r_max
Bandwagon | Popular items | r_max | Randomly chosen | N(μ,σ²) | I−(I_F∪I_S) | r_max
Segment | Segmented items | r_max | Randomly chosen | r_min | I−(I_F∪I_S) | r_max
AoP | Not used | - | Randomly chosen (top x%) | r_min | I−(I_F∪I_S) | r_max
Mixed | Popular items | r_max | Randomly chosen | r_min or item mean | I−(I_F∪I_S) | r_max

3.3. Comparison among various similarity calculation methods

As an optimizing measure, the data map is sorted according to the similarities among the items, which are calculated by various similarity calculation methods, including the rating quantity, the average rating and the optimized cosine similarity. The order of the items is expected to affect the detection results, and the experiment confirms this. The average attack is used in the experiment, with a filler size of 5% and an attack size of 15%. It turns out that clustering similar items by transforming the input map has a positive impact on detecting shilling attacks with a CNN. As the results in Fig. 3 indicate, the rating quantity and average rating methods accelerate convergence and slightly improve detection compared with the non-sorted input, which reaches an F value of only 92.5% at iteration 20.
The rating quantity method gives the best result, reaching 99.34% at iteration 50, and the average rating method takes second place with 98.68%, whereas the optimized cosine similarity method has a negative effect. For reference, the baseline without sorting scores 97.4% in the end.

FIGURE 3. Comparison among various similarity calculation methods.

Because it performs best, the rating quantity method is used in the subsequent experiments, which focus on the convolution kernel in the next layer.

3.4. Effect of the shape of convolution map

The shape of the input map of the convolutional neural network influences the detection process. Instead of the conventional square, a variety of rectangles with different side lengths were tested. This experiment is conducted on the Movie-Lens dataset with 1682 items. Conventionally, each user profile, a vector of length 1682, is reshaped into a 42×42 matrix with a small amount of padding. In this experiment, the vector is instead reshaped into rectangles with short side lengths of 30, 26, 22, 18 and 14 to study the effect of the shape. As Fig. 4 indicates, performance peaks when the short side is 22, reaching 100% at iteration 14, and is slightly worse for the other side lengths. It is therefore inferred that the shape of the input map does have an optimum, which depends on the situation.

FIGURE 4. Comparison among various convolution map shapes.

3.5. Comparison among CNN-SAD and other detection methods

To demonstrate the efficiency of the proposed method, CNN-SAD is compared with four other state-of-the-art methods: a PCA-based method, a PLSA-based method [26], a k-means-based clustering method [27] and the semi-supervised method Semi-SAD [28]. The comparison is carried out against several different attacks and is reported separately for two different datasets. Figures 5 to 7 show the performance. On both datasets, the F values of CNN-SAD are over 99% against most attacks, the exception being mixed attacks. That is to say, under most circumstances the proposed approach detects shilling attack profiles with barely any false detections when a single attack method is used.

Against the three kinds of obfuscated attacks, Semi-SAD also performs well, with F values mostly higher than 0.9, while the clustering-based, PCA-based and PLSA-based approaches perform poorly: the F measure of the PCA-based and PLSA-based methods hardly reaches 20%, and the clustering-based method gives extremely unstable results. Against the four kinds of obfuscated attacks, the PCA-based method performs well, with F values above 70%, while the clustering-based approach remains unstable. Although Semi-SAD is not as effective or efficient as CNN-SAD, it outperforms the other methods.

FIGURE 5. Comparison of detectors over Netflix dataset (%). (a) Random attack, (b) average attack, (c) bandwagon attack and (d) average+bandwagon attack.

FIGURE 6. Comparison of detectors over Movie-Lens dataset (%). (a) Random attack, (b) average attack, (c) bandwagon attack, (d) segment attack, (e) average+bandwagon attack and (f) random+segment attack.

FIGURE 7. Comparison of detectors against AoP attacks (%). (a) Movie-Lens, (b) Netflix, (c) different AoP rates over Movie-Lens and (d) different AoP rates over Netflix.

Figure 7c and d illustrate how F varies as the AoP rate changes, with the filler size fixed at 5% over the Movie-Lens dataset and at 1% over the Netflix dataset. The figures show that CNN-SAD performs best, achieving almost 100% accuracy. In other words, the AoP rate has little influence on the proposed method.

Comparing the results on the two datasets, it is easy to conclude that almost all detection methods are affected by data density, since the average performance on the low-density Netflix subset is lower than that on the Movie-Lens dataset. The performance of the proposed CNN-SAD method barely changes, while that of the PCA-based and PLSA-based methods drops drastically. It becomes harder for these methods to extract features as density declines and the total volume of data increases, but not for CNN-SAD, which relies on automatically generated deep-level features. The detection of CNN-SAD gets even more accurate thanks to the training process. As mentioned before, some normal users are similar to spam users, so clustering algorithms are likely to judge these normal users as spam users.

4. CONCLUSION

SAN is emerging as a new paradigm that exploits the social properties of network nodes to design efficient and effective networking protocols. Detecting shilling attacks in recommender systems, as one research area within the SAN paradigm, is a pressing problem. However, existing methods can only detect spam users through artificially designed features, which struggle to meet the requirements of highly nonlinear problems. In this paper, a convolutional neural network is proposed that automatically extracts representative and discriminative deep-level features. The proposed method is composed of one matrix transformation layer, one convolutional layer, one pooling layer and one output layer. The matrix transformation layer converts users’ rating profiles into matrices that meet the input requirements of convolutional neural networks. The convolutional layer extracts abstract deep-level features of users’ rating profiles. Through the pooling layer, the proposed method not only conducts a secondary feature extraction but also reduces the size of the representation. The comparison between the proposed method and four other state-of-the-art approaches against six different attack strategies shows that the proposed approach outperforms the other algorithms under most circumstances, detecting spam users with an accuracy of over 98%. Detecting shilling attacks using automatically extracted high-level features is shown to be efficient and promising. This work detects shilling attacks in recommender systems more accurately and effectively than other methods, which contributes to the development of SAN applications and security. In the future, we will train the deep convolutional neural network model on larger and more varied datasets using novel training tactics.
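As a rough illustration of the four-layer structure summarized above, the following pure-Python sketch runs a single forward pass: a rating profile is zero-padded and reshaped into a matrix, passed through one convolution kernel, max-pooled, and mapped to a spam score by a logistic output unit. It is not the authors' implementation. The function names, the 3×3 kernel, the 2×2 pooling window, the ReLU and sigmoid activations, the random (untrained) weights and the choice of the long side as the smallest value that fits all ratings are assumptions made for the example, and neither training nor the item-sorting step is included.

```python
import math
import random


def to_matrix(ratings, short_side):
    """Matrix transformation (assumed form): zero-pad and reshape to short_side rows."""
    cols = math.ceil(len(ratings) / short_side)  # assumption: smallest long side that fits
    padded = list(ratings) + [0.0] * (short_side * cols - len(ratings))
    return [padded[r * cols:(r + 1) * cols] for r in range(short_side)]


def conv2d_valid(image, kernel):
    """Single-channel 'valid' cross-correlation followed by ReLU (assumed activation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h, w = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(w)] for i in range(h)]


def predict(ratings, kernel, weights, bias, short_side=22):
    """Forward pass: reshape -> convolution -> pooling -> logistic output in (0, 1)."""
    pooled = max_pool(conv2d_valid(to_matrix(ratings, short_side), kernel))
    flat = [v for row in pooled for v in row]
    z = sum(w * v for w, v in zip(weights, flat)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical sparse profile over 1682 items (0.0 means "not rated").
rng = random.Random(0)
profile = [float(rng.randint(1, 5)) if rng.random() < 0.06 else 0.0
           for _ in range(1682)]
kernel = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(3)]

image = to_matrix(profile, 22)  # 22 rows by 77 columns, 12 padded cells
n_features = ((len(image) - 2) // 2) * ((len(image[0]) - 2) // 2)
weights = [rng.uniform(-0.01, 0.01) for _ in range(n_features)]

score = predict(profile, kernel, weights, bias=0.0)
print(len(image), len(image[0]), n_features)  # prints: 22 77 370
print(score)  # untrained score, meaningful only after the weights are learned
```

With a short side of 22, the 1682 ratings fill a 22×77 map, so the choice of short side directly sets how many neighbouring items each kernel position covers, which is one plausible reason the shape affects detection.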
FUNDING

National Natural Science Foundation of China (61472024 and U1433203); the National Funding from the FCT (Fundação para a Ciência e a Tecnologia) through the UID/EEA/50008/2013 Project; the Government of Russian Federation (Grant 074-U01); Finep, with resources from Funttel (Grant no. 01.14.0231.00), under the Centro de Referência em Radiocomunicações (CRR) project of the Instituto Nacional de Telecomunicações (Inatel), Brazil.

REFERENCES

1 Xia, F., Liu, L., Li, J., Ma, J. and Vasilakos, A.V. (2017) Socially aware networking: a survey. IEEE Syst. J., 9, 904–921.
2 Pan, H., Chaintreau, A., Gass, R., Scott, J., Crowcroft, J. and Diot, C. (2005) Pocket Switched Networking: Challenges, Feasibility and Implementation Issues. Autonomic Communication. Springer, Berlin Heidelberg.
3 Chen, W., Guha, R.K., Kwon, T.J., Lee, J. and Hsu, Y.Y. (2011) A survey and challenges in routing and data dissemination in vehicular ad-hoc networks. Wireless Commun. Mobile Comput., 11, 787–795.
4 Wu, F.J., Kao, Y.F. and Tseng, Y.C. (2011) From wireless sensor networks towards cyber physical systems. Pervasive Mobile Comput., 7, 397–413.
5 Edmunds, A. and Morris, A. (2000) The problem of information overload in business organisations: a review of the literature. Int. J. Inform. Manag., 20, 17–28.
6 Herlocker, J.L., Konstan, J.A., Terveen, L.G. and Riedl, J.T. (2004) Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst., 22, 5–53.
7 Lu, Z., Pan, S.J., Li, Y., Jiang, J. and Yang, Q. (2016) Collaborative Evolution for User Profiling in Recommender Systems. IJCAI, New York City, USA.
8 O’Mahony, M., Hurley, N. and Kushmerick, N. (2004) Collaborative recommendation: a robustness analysis. ACM Trans. Internet Technol., 4, 344–377.
9 Mobasher, B., Burke, R., Bhaumik, R. and Williams, C. (2007) Toward trustworthy recommender systems: an analysis of attack models and algorithm robustness. ACM Trans. Internet Technol., 7, 23.
10 Mehta, B. and Nejdl, W. (2008) Attack Resistant Collaborative Filtering. Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 75–82, ACM.
11 Gunes, I., Kaleli, C., Bilge, A. and Polat, H. (2014) Shilling attacks against recommender systems: a comprehensive survey. Artif. Intell. Rev., 42, 767–799.
12 Wang, Y., Zhang, L., Tao, H., Wu, Z. and Cao, J. (2015) A Comparative Study of Shilling Attack Detectors for Recommender Systems. 2015 12th International Conference on Service Systems and Service Management, pp. 1–6, IEEE.
13 Patel, K., Thakkar, A., Shah, C. and Makvana, K. (2016) A State of Art Survey on Shilling Attack in Collaborative Filtering Based Recommendation System. Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems, pp. 377–385, Springer.
14 Schmidhuber, J. (2015) Deep learning in neural networks: an overview. Neural Netw., 61, 85–117.
15 Hubel, D.H. and Wiesel, T.N. (1968) Receptive fields and functional architecture of monkey striate cortex. J. Physiol., 195, 215–243.
16 Fukushima, K. (1980) Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern., 36, 193–202.
17 LeCun, Y., Jackel, L.D., Bottou, L., Brunot, A., Cortes, C., Denker, J.S., Drucker, H., Guyon, I., Muller, U.A. and Sackinger, E. (1995) Comparison of learning algorithms for handwritten digit recognition. Int. Conf. Artif. Neural Netw., 60, 53–60.
18 Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012) Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst., 1097–1105.
19 Ciresan, D.C., Meier, U., Gambardella, L.M. and Schmidhuber, J. (2011) Convolutional Neural Network Committees for Handwritten Character Classification. 2011 International Conference on Document Analysis and Recognition, pp. 1135–1139, IEEE.
20 Gong, Y. and Zhang, Q. (2016) Hashtag Recommendation Using Attention-Based Convolutional Neural Network. IJCAI 2016, Proceedings of the 25th International Joint Conference on Artificial Intelligence, New York City, USA.
21 Kalchbrenner, N., Grefenstette, E. and Blunsom, P. (2014) A convolutional neural network for modelling sentences. Eprint Arxiv, 1.
22 Martín, A., Ashish, A. and Paul, B. TensorFlow: large-scale machine learning on heterogeneous systems. http://tensorflow.org/ (accessed 11 January 2018).
23 Burke, R., Mobasher, B., Williams, C. and Bhaumik, R. (2006) Classification features for attack detection in collaborative recommender systems. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 542–547, ACM.
24 Williams, C.A., Research Advisor and Mobasher, B. (2006) Profile injection attack detection for securing collaborative recommender systems. Thesis, Service Oriented Computing and Applications, 1(3), 157–170.
25 Hurley, N., Cheng, Z. and Zhang, M. (2009) Statistical Attack Detection. ACM Conference on Recommender Systems, Recsys 2009, New York, NY, USA, October, pp. 149–156, DBLP.
26 Mehta, B. (2007) Unsupervised Shilling Detection for Collaborative Filtering. AAAI Conference on Artificial Intelligence, July 22–26, 2007, Vancouver, British Columbia, Canada, Vol. 1, pp. 1402–1407, DBLP.
27 Bhaumik, R., Mobasher, B. and Burke, R. (2012) A clustering approach to unsupervised attack detection in collaborative recommender systems. Proceedings of the 7th IEEE International Conference on Data Mining, Las Vegas, NV, USA, pp. 181–187.
28 Cao, J., Wu, Z., Mao, B. and Zhang, Y. (2013) Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system. World Wide Web, 16, 729–748.

Footnotes

1 http://netflixprize.com/index.html
2 http://grouplens.org/datasets/movielens/

Author notes

Handling editor: Zhaolong Ning

© The British Computer Society 2018. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Journal

The Computer Journal, Oxford University Press

Published: Feb 2, 2018
