
Poster: Telephony Network Characterization for Spammer Identification

Hossein Kaffash Bokharaei, Yashar Ganjali
Department of Computer Science, University of Toronto
{hossein, yganjali}@cs.toronto.edu

Ram Keralapura, Antonio Nucci
Narus Inc., California, USA
{rkeralapura, anucci}@narus.com

Voice-over-IP (VoIP) has moved beyond being a mere technological object and has become an integral part of many people's cyber lives. Unfortunately, the very same openness and ubiquity that make IP networks such powerful infrastructures also make them a liability. Risks include Denial of Service (DoS), service theft, spam, call routing manipulation, identity theft, and impersonation, among others.

In this work, we study individual and network-wide characteristics of telephony communications in a large phone network. Our objective is to analyze phone call patterns and find statistical properties that are inherent to such a service. We study several metrics in the network graph defined by phone calls, including node degree, call duration, neighborhood connectivity, call repetition, call reciprocity, and call density. We aim to identify metrics that help retain the inner structure of a telephony service and its usage, as well as metrics that act as indicators of abnormal service usage. This can be helpful, for instance, in identifying Spam in Internet Telephony (SPIT).

For this study, we collected data from one of the largest phone providers in North America over a period of two months (Oct-Nov 2009). Our dataset contains the call records of more than 14 million phone subscribers and 450 million calls to or from these subscribers. Each call record includes the call time, the duration, and anonymized caller and callee IDs. We do not have access to the content of calls. To capture the most salient properties of the behavior of phone users, i.e., nodes in the call pattern graph, we focus on metrics such as node degree and neighborhood connectivity.
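As a rough illustration of the call pattern graph described above, the following sketch builds in-degree and out-degree from call records. The `CallRecord` type, its field names, and the choice to count distinct neighbors (rather than individual calls) as the degree are assumptions made for this example; the abstract does not specify them.

```python
from collections import defaultdict
from typing import NamedTuple


class CallRecord(NamedTuple):
    """Hypothetical record layout mirroring the fields listed in the text."""
    time: int      # call start time in seconds (unit is an assumption)
    duration: int  # call duration in seconds
    caller: str    # anonymized caller ID
    callee: str    # anonymized callee ID


def degrees(records):
    """Return (out_degree, in_degree) dicts.

    Assumption: out-degree is the number of distinct callees of a user and
    in-degree is the number of distinct callers of a user.
    """
    callees = defaultdict(set)
    callers = defaultdict(set)
    for r in records:
        callees[r.caller].add(r.callee)
        callers[r.callee].add(r.caller)
    out_deg = {u: len(s) for u, s in callees.items()}
    in_deg = {u: len(s) for u, s in callers.items()}
    return out_deg, in_deg


# Toy data, not taken from the dataset described in the text.
records = [
    CallRecord(0, 120, "a", "b"),
    CallRecord(60, 30, "a", "c"),
    CallRecord(90, 300, "b", "a"),
    CallRecord(200, 45, "a", "b"),
]
out_deg, in_deg = degrees(records)
```

Counting distinct neighbors keeps repeated calls to the same callee from inflating the degree; counting calls instead would be an equally plausible reading and only requires replacing the sets with counters.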
We nd that almost 80% of nodes in the graph have an in-degree larger than their out-degree with a very small percentage (less than 5%) of users with extraordinary large out-degree being indicative of telemarketers, large-medium organizations and/or SPITters. In terms of neighborhood connectivity we nd that the majority of the users exhibit a very small clustering coe ƒcient implying that people who they talked tend not to talk directly to each other unlike online social networks. Interestingly, the existence of a diversity in average talk duration between neighbors of a user with small out-degree have predictable call duration when compared to users with large out-degree. When focusing on call duration of a user who places less than 300 outgoing calls we nd that she tends to talk for about 300 seconds; however user with more than a thousand outgoing calls talk for about 100 seconds. Repetitive and reciprocal calls represent strong social connections between users. In our dataset, we nd that 80% of the calls are repetitive and 50% are reciprocal. In addition, 50% of reciprocal calls account for 15% of all the edges in the network indicating strong call activity between small number of users. In fact, the talk duration among these users is ten times more than others. When it comes to identifying SPITters, one cannot simply rely on the basic statistical properties of the call pattern graph in isolation. In other words, even though each individual metric (such as in-degree, out-degree, call duration, call frequency, reciprocity, etc.) can point to a set of suspicious nodes, this set will almost always include a large number of legitimate users. For instance, even though SPITters need to make a large number of calls, there are always legitimate users (say businesses) who make a large number of calls. 
Similarly, neighborhood connectivity cannot be used to distinguish legitimate users from SPITters: unlike in online social networks, even legitimate users seem to have small clustering coefficients in the phone call network.

On a positive note, we show that slightly more complex metrics can be very effective in identifying SPITters. To this end, we consider two properties of the phone network, namely the strong ties property and the weak ties property. The strong ties property is well known in the context of social networks. Simply stated, it says that people normally spend most of their time communicating with only a small number of their friends. In our dataset, we observe that 90% of phone system subscribers spend more than 80% of their time talking to only 5 people. The weak ties property considers the other end of the spectrum: for a legitimate user, we expect a significant fraction of calls to be long. For instance, in our dataset, for 90% of users at least 30% of calls are longer than a minute. We believe that a combination of the weak ties and strong ties properties can be very effective in identifying SPITters.

We also measure the global ranking of nodes in our network using a variation of the well-known PageRank algorithm called SymRank [1]. Unlike the weak ties and strong ties properties, which are based on local properties of the network, SymRank assigns a ranking score to each node based on the global properties of the network. Interestingly, this seemingly orthogonal ranking matches very nicely with the outliers of the strong ties and weak ties properties.
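One way to turn the strong ties and weak ties observations into per-user features is sketched below. The thresholds (top 5 contacts, 80% of talk time, calls longer than 60 seconds, 30% of calls) come from the figures quoted above; everything else is an assumption for illustration, including the use of outgoing calls only, the function names, and the rule that flags a user only when both properties fail. The abstract does not state how the two properties are combined.

```python
from collections import defaultdict


def tie_features(calls, top_k=5, long_call_s=60):
    """calls: iterable of (caller, callee, duration_seconds).

    Returns {caller: (top_k_share, long_call_fraction)} where
    top_k_share is the share of talk time spent with the top_k contacts and
    long_call_fraction is the fraction of calls longer than long_call_s.
    Assumption: only outgoing calls are considered.
    """
    talk = defaultdict(lambda: defaultdict(int))
    n_calls = defaultdict(int)
    n_long = defaultdict(int)
    for caller, callee, dur in calls:
        talk[caller][callee] += dur
        n_calls[caller] += 1
        if dur > long_call_s:
            n_long[caller] += 1
    features = {}
    for u, per_contact in talk.items():
        total = sum(per_contact.values())
        top = sum(sorted(per_contact.values(), reverse=True)[:top_k])
        share = top / total if total else 0.0
        features[u] = (share, n_long[u] / n_calls[u])
    return features


def is_outlier(share, long_frac, min_share=0.8, min_long_frac=0.3):
    """Hypothetical combination rule: flag users who violate both properties."""
    return share <= min_share and long_frac < min_long_frac


# Toy data: one ordinary user and one user making many short calls.
calls = [("alice", "bob", 600), ("alice", "carol", 300), ("alice", "dave", 30)]
calls += [("spam", f"t{i}", 10) for i in range(20)]
features = tie_features(calls)
flagged = sorted(u for u, f in features.items() if is_outlier(*f))
```

Flagged users are only candidates: as the text notes, any single metric also catches legitimate users, so a result like this would need to be cross-checked against other signals such as a global ranking.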

Journal: ACM SIGMETRICS Performance Evaluation Review
