Drug–target interaction prediction by learning from local information and neighbors

BIOINFORMATICS ORIGINAL PAPER
Vol. 29 no. 2 2013, pages 238–245
doi:10.1093/bioinformatics/bts670
Systems biology
Advance Access publication November 17, 2012

Jian-Ping Mei (1,*), Chee-Keong Kwoh (1), Peng Yang (1), Xiao-Li Li (1,2) and Jie Zheng (1)

(1) Bioinformatics Research Centre, School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore and (2) Institute for Infocomm Research, A*Star, 1 Fusionopolis Way #21-01 Connexis, Singapore 138632, Singapore

*To whom correspondence should be addressed.

Associate Editor: Trey Ideker

ABSTRACT

Motivation: In silico methods provide efficient ways to predict possible interactions between drugs and targets. A supervised learning approach, the bipartite local model (BLM), has recently been shown to be effective in predicting drug–target interactions. However, for drug-candidate compounds or target-candidate proteins that currently have no known interactions, the purely 'local' model cannot be learned, and hence BLM may fail to make correct predictions when such new candidates are involved.

Results: We present a simple procedure called neighbor-based interaction-profile inferring (NII) and integrate it into the existing BLM method to handle the new candidate problem. Specifically, the inferred interaction profile is treated as label information and is used for model learning of new candidates. This functionality is particularly important in practice for finding targets for new drug-candidate compounds and identifying targeting drugs for new target-candidate proteins. Consistently good performance of the new BLM–NII approach has been observed in experiments on the prediction of interactions between drugs and four categories of target proteins. For nuclear receptors in particular, BLM–NII achieves the most significant improvement, as this dataset contains many drugs/targets with no interactions in the cross-validation. This demonstrates the effectiveness of the NII strategy and also shows the great potential of BLM–NII for prediction of compound–protein interactions.

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

Received on July 2, 2012; revised on October 15, 2012; accepted on November 12, 2012

1 INTRODUCTION

Identification of interactions between drugs/compounds and protein targets is an important part of the drug discovery pipeline. The great advances in molecular medicine and the human genome project provide more opportunities to discover unknown associations in the compound–protein interaction network. The newly discovered interactions are helpful for discovering new drugs by screening candidate compounds and may also help in understanding the causes of side effects of existing drugs. Since experimental determination of drug–target interactions is costly and time-consuming, in silico prediction becomes a potential complement that provides useful information in an efficient way.

Generally, the prediction performance is decided by both the data used and the particular analysis method that is applied to them. An intuitive and straightforward way to identify new targets for a drug is to compare the candidate proteins with the existing targets of that drug. Different results may be obtained depending on the perspective from which the comparison is made. Keiser et al. (2009) compare targets based on the chemical structure of ligands that bind to them. As reviewed in Haupt and Schroeder (2011), the structure of binding sites is another important way to compare proteins or to measure the similarity between proteins. Although the binding site is an effective measure for identification of new targets, binding-site structures are only available for the small set of proteins whose 3D structures are known. To be able to consider more proteins, the amino acid sequence may be used, as it is available for most proteins. Similarly, to identify new targeting compounds for a specific target, comparison is made on the compound side or drug side with respect to chemical structures (Laggner et al., 2012; Martin et al., 2002), side effects (Campillos et al., 2008) or other possible measurements of drugs.

More sophisticated statistical and machine learning methods have been developed recently for prediction of genome-wide drug–target interactions. In He et al. (2010) and Perlman et al. (2011), multiple groups of drug-related features and protein-related features are extracted to describe each drug–target pair. After feature selection, a classifier is used to predict whether a given pair is interacting or not. Yamanishi et al. (2008) proposed a supervised bipartite graph learning approach. In this approach, the chemical space and the genomic space are mapped into a unified space so that interacting drugs and targets are close to each other while non-interacting drugs and targets are far away from each other. By mapping the query pair of drug and target to that space with the learned mapping function, the probability of interaction between them is then calculated as their closeness in the mapped space. Another method, called the weighted profile method, was also given in Yamanishi et al. (2008). For a query drug, the weighted profile method assigns a probability of interaction to the query target based on how the neighbors of this drug interact with this target. Basically, weighted profile is a nearest-neighbor approach and it is called drug-based/target-based similarity inference in Cheng et al.
(2012). Other than 238  The Author 2012. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: [email protected] Bipartite model for learning from local information and neighbors inferring interactions from the drug similarity or target profile method, which is a nearest-neighbor approach. Our ex- similarity, network-based inference was also studied in perimental results show that BLM–NII performs much better Cheng et al. (2012), which infers or predicts drug–target inter- than the weighted profile method. actions based on the topology of the known interaction network. Systematic experiments are conducted to simulate the task of Different from the work in Cheng et al. (2012), which makes use drug–target interactions prediction cross four datasets. of the drug similarity, target similarity and network-based simi- Compared with state-of-the-art approaches, our proposed larity separately, Chen et al. (2012) apply random walk on a approach achieves consistent improvement in terms of area heterogeneous network constructed with these three types of under ROC (AUC) curve and area under precision versus similarities. Another promising approach is the bipartite local recall (AUPR) curve. As these four datasets contain different model (BLM) approach. Bleakley and Yamanishi (2009) portions of new drug candidates and target candidates in the showed that the ensemble of independent drug-based prediction simulation, the improvements of BLM–NII compared with and target-based prediction with supervised learning performs BLM are also different for the four datasets. The most significant much better than only using each single type of prediction. The improvement is achieved on the nuclear receptor dataset, which BLM method has been further studied and improved in Xia et al. contains the largest portion of new candidates. This shows that (2010) and Laarhoven et al. (2011). The main differences of these the NII strategy, i.e. 
to infer label information or training three methods include the drug–drug and target–target similari- data from neighbors when there is no training data readily avail- ties, the classifiers and the way used to combine the drug-based able from the query compound/protein itself, is feasible and ef- and target-based interaction probabilities. In Xia et al. (2010), fective for dealing with the new candidate problem of the original semi-supervised approach is used instead of supervised approach BLM. for local model learning; while Laarhoven et al. (2011) found that using only the kernel based on the topology of the known interaction network is able to obtain a very good performance. 2 METHODS In the existing framework of BLM, the model for the query 2.1 Problem formalization drug or target is learned based on local information, i.e. its own Assume that the bipartite interaction network N illustrated in Figure 1 interaction profile. Despite a good performance, BLM has limi- involves m drugs/compounds and m targets, which are referred to as d t tations. It is unable to learn without training data and hence is existing drug candidates and target candidates, respectively. We use not able to provide a reasonable prediction for drug/target can- matrix A to represent this network, i.e. a 2 A ¼1if the i-th compound ij didates that are currently new. Here, a drug-candidate com- d is known to interact with the j-th target t . All other entries of A are 0. i j pound is new if it does not have any known targets, and a The problem under consideration is how to make use of the target-candidate protein is new if it is not targeted by any known interactions together with the compound similarities and protein drugs/compounds. We call this the new candidate problem of similarities to predict new interactions between n drug-candidate com- BLM. Since a large number of compounds and proteins, which pounds and n target-candidate proteins, where n 4m and n4m . 
This t d d t t are possible drug candidates and target candidates, respectively, means there are m ¼ n  m new drug candidates and m ¼ n  m d d d t t t new target candidates, which have no interactions currently known. are new, in this study, we focus on handling the new candidate The whole network involving n compounds and n proteins can be rep- d t problem by proposing an improved version of BLM called BLM resented as with neighbor-based interaction-profile inferring (BLM–NII). "# The NII procedure is developed to incorporate the capacity of ðN Þ ðN Þ A0 1 m m 2 m m d t d t N  n ¼ ¼ , ð1Þ nd t learning from neighbors into the original BLM method. More ðN Þ ðN Þ 00 3 4 m m m m d d d t specifically, when the query involves a new drug/target candi- where known interactions correspond to non-zero entries of A.Now,we date, we first derive the initial weighted interactions for the want to predict possible interactions in N between existing drug candi- new candidate from its neighbors’ interaction profiles, and then dates and target candidates, as well as in other three subnetworks N , N 2 3 use the inferred interactions as label information to train the and N , where the interactions at least involve one type of new model. In general, neighbors refer to compounds/proteins that have large similarities to the query compound/protein. The presented NII idea happen to be similar to the weighted profile method in some sense. However, our BLM–NII method is substantially different from the weighted profile method in the following aspects. In BLM–NII, the derived interaction profile is used as label information to train the local model or the classi- fier, while in the weighted profile method, the derived weighted interaction is directly used as the final predicted interaction prob- ability. 
Moreover, in BLM–NII, the NII procedure is integrated into the BLM framework where a certain classifier plays the main role in model learning, and NII is activated only for new drug/target candidates; while in the weighted profile method, thereisnoother classifier andthe procedureofderiving the Fig. 1. Bipartite interaction network: a network consists of two types of weighted profile acts as a classification process, which is applied nodes, where edges only connect different types of nodes. The drug–target for any drug/target candidates. To sum up, the BLM–NII is an interaction network is a bipartite network, where drug and target are two enhanced BLM method, and it is different from the weighted types of nodes and the interactions between them are the edges 239 J.-P.Mei et al. candidates, i.e. the target candidate is new, the drug candidate is new or similarity which encodes the topology information of the interaction net- both are new. work has been shown to provide good results. With the Gaussian kernel, the network-based drug similarity S and network-based target similarity S are calculated as: 2.2 Bipartite local model 0 0 2 ka  a k To predict p , the probability that a drug d and a target t interact, i j ij i j d S ði; jÞ¼ exp  ; ð11Þ the basic BLM proposed by Bleakley and Yamanishi (2009) is described as follows. A local model for d denoted as Mod (i) is first learned i d 0  2 based on its interaction profile a and the similarities between targets ka  a k i j i t t S ði; jÞ¼ exp  ; ð12Þ S ,i.e. t 0 n 1 2 Mod ðiÞ¼ trainðS ; a Þ: ð2Þ d where the bandwidth  ¼   a , and different bandwidths may i 0 i¼1 ij be used for drug and target, respectively. However, the result with Here, train represents the learning process of a certain classifier, e.g. network-based similarity may not remain good when the information support vector machine or (Kernel) regularized least squares (RLS), the contained in the interaction network is not sufficient enough. 
Rather similarity matrix S is used as the observed data of target candidates, and than considering one type of similarity, a more general way is to combine the interaction profile a ,i.e.the i-th row vector of A,serves aslabel several types of similarities. Here, we use both the network-based simi- information to label each target candidate whether interacting with this larity and chemical similarity for drug similarity S , and the drug. Once the model Mod (i)is learned, itisused to predict p ,the d t ij network-based similarity and sequence similarity for target similarity S probability of interaction between d and the query target candidate t : i j through linear combination: d t p ¼ testðMod ðiÞ; s Þ, ð3Þ ij j d d d S ¼ S þð1  ÞS ; ð13Þ c n where s is the j-th column of S recording the similarities between t and t t t S ¼ S þð1  ÞS ; ð14Þ other targets. The similar model learning and prediction process are s n performed independently from the query-target side to get p ,i.e. ij d t where S is the chemical structure similarity for drug, S is the amino acid c s sequence similarity for protein and  is the combination weight set by Mod ðjÞ¼ trainðS ; a Þ; ð4Þ t j user. Although more sophisticated ways such as Kronecker product t d may be used to combine two types of similarity matrices or kernel p ¼ testðMod ðjÞ; s Þ; ð5Þ ij i matrices, experimental results in (Laarhoven et al. 2011) show that the where a is the j-th column vector of A or the interaction profile of target linear combination gives comparable performance with a much lower d t t . Once both p and p have been calculated, they are combined to get computational complexity. ij ij probability p : ij d t 2.3 Neighbor-based interaction-profile inferring p ¼ gðp ; p Þ; ð6Þ ij ij ij Good performance of supervised learning is largely dependent on the d t where g is a function that combines or integrates p and p . Examples ij ij amount and quality of labeled training data. 
When a drug/target candi- d t d t include p ¼ maxfp ; p g and p ¼ 0:5ðp þ p Þ,where g is the max or ij ij ij ij ij ij date is new, it has no existing interactions that can be used as label average function. information and the model for this candidate thus can not be learned. After p is calculated for each pair of compound i and protein j,the ij As shown in (7), interactions between new drug candidates and new output network of BLM may be represented as target candidates remain unpredicted in BLM. To extend the application "# BLM BLM domain of BLM to new drug/target candidates, we propose to derive N N 1 2 N ¼ ; ð7Þ BLM BLM training data from their neighbors. Based on the assumption that N 0 drugs/compounds which are similar to each other interact with the same targets, interaction profile for new drug-candidate compounds with could be possibly inferred from their neighbors’ interactions. BLM N ¼ N þ P ðMod ; Mod Þ; ð8Þ 1 1 d t Compounds with large similarities to the new drug-candidate compound are said to be its neighbors. Since new drug-candidate compounds have BLM N ¼ P ðMod Þ; ð9Þ 2 d 2 no interactions, or all the elements of its current interaction profile vector are 0, it is not suitable to consider network-based similarity here, so only BLM N ¼ P ðMod Þ: ð10Þ 3 t chemical structure similarity is used to define the neighbors of a drug-candidate compound. Formally, for a compound d which is a where P gives the predicted interactions between existing drug candi- new drug-candidate, we infer the j-th dimension of its interaction profile dates and existing target candidates, P are predicted interactions between 2 d l (i)with existing drug candidates and new target candidates and P gives predicted interactions between new drug candidates and existing target X l ðiÞ¼ s a ; ð15Þ ih hj candidates. 
h¼1 For any classifier that is used, the known targets of d corresponding to 0 t non-zero elements of a and the pairwise target similarity S are critical to where s is the chemical similarity between two compounds d and d .The i ih i h the final prediction of p . The model learned for d describes how this above formula shows that the interaction weight of this drug with respect ij drug selects targets. Once the model is learned, the similarities between to the j-th target is the collection of its neighbors’ interactions to this the query target and those known targets of d largely decide p .Similarly, target. For a given new drug-candidate compound, the simple formula ij known targeting drugs of t or non-zero elements of t ’s interaction profile given in Equation (15) defines that the inferred weight of interaction j j a and the pairwise drug similarity S are critical to the final prediction of between this compound and a target is high if many of its neighbors p . Under the same BLM framework, different results are produced due interact with this target, and also it is decided more by neighbors with ij d t to the differences in S , S , the classifier and the combination function g. large similarities than those with small similarities. Since new According to the study of Laarhoven et al. (2011), network-based target-candidate proteins have no interactions with any compound, the 240 Bipartite model for learning from local information and neighbors inferred interactions for d are only with existing target candidates. To be BLMNII 0 N ¼ P ðMod ; Mod Þ; ð20Þ 2 d 2 t more specific, l ðiÞ > 0if the j-th target candidate is an existing one, i.e. a 40 for at least one h,and l ðiÞ¼ 0if the j-th target candidate is BLMNII 0 hj N ¼ P ðMod ; Mod Þ; ð21Þ 3 t 3 d new, i.e. a ¼0for all h. To ensure the value of each l ðiÞ is in the hj range of [0, 1], linear scale is performed subsequently, i.e. 
BLMNII 0 0 N ¼ P ðMod ; Mod Þ: ð22Þ d d d d d 4 d t l ðiÞ¼ðl ðiÞ min l ðiÞÞ=ðmax l ðiÞ min l ðiÞÞ. After we obtained h h h j j h h h the inferred interaction profile, we can use it as label information to Comparing N and N , it is observed that the interactions be- BLMNII BLM learn the model of d : i tween existing drug candiates and target candiates are the same for the two approaches, while the interactions in the other three cases in BLM– 0 t d Mod ðiÞ¼ trainðS ; l ðiÞÞ: ð16Þ NII are different from those in BLM. First, BLM–NII is able to predict In the same way, this procedure is applied to a new target-candidate P , the interactions between drug candidates and target candidates that protein t to obtain its inferred interaction profile l (j), where its neighbors are both new. Second, P and P in BLM–NII are predicted from both j 2 3 are defined based on sequence similarity. The model of t can then be the drug side and the target side, while in BLM are predicted only from learned with l (j): one side. Learning from neighbors allows drug/target candidates to obtain 0 d t Mod ðjÞ¼ trainðS ; l ðjÞÞ: ð17Þ labeled data when themselves do not have or have insufficient labeled data for training. This procedure actually introduces some degree of glo- This interaction profile inferring technique is particularly useful for balization into the original local model to provide more chances of learn- those new drug/target candidates, for which existing supervised methods ing from known knowledge. However, too much globalization is not (e.g. BLM) fail to produce reasonable predictions. It can also be useful to desired as it could eliminate the local characteristics and make the enhance the classification models for any compounds/proteins without models of individual candidates less discriminative. Moreover, the low enough training data or label information. 
quality of neighbors due to imprecise similarity measure may cause nega- tive impact when the learning process replies on too much neighbors’ 2.4 BLM with NII information. In other words, the inferred interaction profile, although is helpful, may introduce a certain amount of noise. Therefore, in this By integrating the above presented NII strategy into the BLM frame- study, we only activate the neighbor-based learning for totally new can- work, we have the BLM with NII (BLM–NII). The detailed steps of didates. For other cases, we still train the model locally with its own BLM–NII to predict the probability p between any compound i and ij known interactions. any protein j is described in Algorithms 1 and 2. Algorithm 1: BLM–NII 3 MATERIALS d t input : A, S , S c s To facilitate comparison with published approaches, we used the output: p ij d d t same groups of four datasets which are first analyzed by get p ¼ NII-integrated Learning and Prediction ( A, S , S )from d ; ij c s i t t d get p ¼ NII-integrated Learning and Prediction ( A, S , S )from t ; Yamanishi et al. (2008) and then later by Bleakley and ij s c j d t d t Combine p and p to get the final result p ¼ gðp ; p Þ ij Yamanishi (2009), Xia et al. (2010), Laarhoven et al. (2011) ij ij ij ij and Cheng et al. (2012). These four datasets correspond to drug–target interactions of four important categories of protein targets, namely enzyme, ion channel, G-protein-coupled receptor Algorithm 2: NII-integrated learning and prediction d t (GPCR) and nuclear receptor, respectively. The datasets were input : A, S , S c s downloaded from http://web.kuicr.kyoto-u.ac.jp/supp/yoshi/ output: p ij drugtarget/. if d is new then |obtain l (i) with Eq. 
(15) with S Table 1 gives some statistics of each dataset including the total else number of drugs (n ), the total number of targets (n ), the total d t | l (i)isthe i-th row of A number of interactions (E), the average number of targets for end each drug (D ), the average number of targeting drugs for each t t Compute S with Eq. (12) and S with Eq. (14); target (D ), the percentage of drugs that have only one target t d Learn a local model for d,i.e., Mod (i) ¼ train(S , l (i)) ; i d (D ¼ 1) and the percentage of targets that have one targeting if t is new then t drug (D ¼ 1). It is shown from this table that among the four d t | predict p with Mod (i)and S ij s drug–target interaction networks, on average, each drug and else target in ion channel and enzyme have more interactions than | predict p with Mod (i)and S ij end those in GPCR and nuclear receptor. It is also worthy noting that in the leave-one-out cross–validation (LOOCV), drugs and targets with one interaction are ‘new candidates’ as the only one interaction is covered over to leave no recorded interaction, e.g. The output network of BLM–NII is expressed as 72% drugs in the nuclear receptor are ‘new candidates’ in the "# simulation. BLMNII BLMNII N N 1 2 Each dataset is described by three types of information in the N ¼ ; ð18Þ BLMNII BLMNII BLMNII N N 3 4 form of three matrices: (i) the drug–target interaction matrix; (ii) the drug–drug similarity matrix and (iii) the target–target simi- with larity matrix. The interaction networks were retrieved from the BLMNII BLM N ¼ N ; ð19Þ 1 1 KEGG BRITE (Kanehisa et al., 2006), BRENDA (Schomburg 241 J.-P.Mei et al. Table 1. Some statistics of the four datasets Table 2. Comparison with existing approaches for the four datasets Dataset Enzyme Ion channel GPCR Nuclear Dataset Method AUC AUPR receptor Enzyme Weighted profile 86.4 6.30 n 445 210 223 54 BY(2009) 97.6 83.3 n 664 204 95 26 Laarhoven et al. 
(2011) 97.8 91.5 E 2926 1476 635 90 BLM–NII 98.8 92.9 D 6.58 7.03 2.85 1.67 Ion channel Weighted profile 81.9 17.2 D 4.41 7.24 6.68 3.46 BY(2009) 97.3 78.1 D ¼ 1(%) 39.78 38.57 47.53 72.22 Laarhoven et al. (2011) 98.4 94.3 D ¼ 1(%) 43.37 11.27 35.79 30.77 BLM–NII 99.0 95.0 GPCR Weighted profile 76.5 10.9 BY(2009) 95.5 66.7 Laarhoven et al. (2011) 95.4 79.0 et al., 2004), SuperTarget (Gnther et al., 2008) and DrugBank BLM–NII 98.4 86.5 (Wishart et al., 2008). The drug–drug similarity is measured Nuclear receptor Weighted profile 74.9 17.1 based on chemical structures from the DRUG and BY(2009) 88.1 61.2 Laarhoven et al. (2011) 92.2 68.4 COMPOUND sections in the KEGG LIGAND database BLM–NII 98.1 86.6 (Kanehisa et al., 2006). The chemical structure similarities be- tween drugs are computed with SIMCOMP (Hattori et al., 2003), which uses a graph alignment algorithm to get a global three for all the datasets. Since the results of weighted profile are similarity score based on the size of the common substructures much worse than those of the three BLM-based methods namely between two compounds. The target–target similarity is mea- BY (2009), Laarhoven et al. (2011) and BLM–NII, we now focus sured based on the amino acid sequences retrieved from the on the comparison of these three approaches. As been discussed KEGG GENES database (Kanehisa et al., 2006). The sequence in Laarhoven et al. (2011), by incorporating the network-based similarities between proteins are computed with a normalized similarity, the performance of BLM can be improved, i.e. the version of Smith–Waterman score. More details on how the results of Laarhoven et al. (2011) in terms of AUPR are much data have been collected and calculated are given in Yamanishi better than those of BY (2009). It is also shown that the per- et al. (2008). formance of BLM can further be improved by integrating the NII procedure, i.e. the results of BLM–NII is consistently better than those of Laarhoven et al. (2011). 
4EVALUATION It is interesting to observe that different levels of improve- Systematic experiments are performed to evaluate the perform- ments have been achieved for different datasets. Comparing ance of the presented approach with datasets summarized in Laarhoven et al. (2011) and BY (2009), the improvement is the Table 1. As in Laarhoven et al. (2011), LOOCV is performed. most significant on ion channel and the least significant on nu- Since the real interaction to be predicted is left out, compounds clear receptor. Differently, comparing BLM–NII and Laarhoven and proteins with one interaction (i.e. D ¼1or D ¼ 1) turn out d t et al. (2011), the improvement is the largest for nuclear receptor to have no training data and thus they are treated as ‘new can- and the least for ion channel. Such kind of differences are ex- didates’ in the cross-validation. To test the robustness of the pected due to the differences in the structure of the datasets. presented approach, we also performed 10-fold cross-validation. From Table 1, it is shown that among the four datasets, the The results of 10 trials 10-fold cross-validation can be found in average numbers of interactions of each drug and target are Tables S5–S8 of the Supplementary Material. the largest for ion channel and the smallest for nuclear receptor. This means that the interaction network of ion channel contains 4.1 Compare with state-of-the-art approaches more information than nuclear receptor and thus the First, we compare the performance of BLM–NII (g ¼ max, network-based similarity of ion channel is more robust and in- ¼ 0.5) with the weighted profile method (Yamanishi et al., formative than that of nuclear receptor. Therefore, incorporating 2008) and two other state-of-the-art approaches (Bleakley and the network-based similarity results in larger improvement for Yamanishi, 2009) and (Laarhoven et al., 2011) denoted as BY ion channel. 
Since drugs or targets with one interaction are ‘new (2009) and Laarhoven et al. (2011), respectively. The same RLS candidates’ in the simulation, it is also shown from Table 1 that classifier is used for BLM–NII as Laarhoven et al. (2011). We the nuclear receptor contains the largest portion of ‘new candi- measure the quality of the predicted interactions in terms of dates’ while the ion channel contains the least. Thus, by applying AUC curve (or true-positive rate versus false-positive rate the NII procedure, BLM–NII has more chances to improve the curve) and AUPR curve. results for nuclear receptor than for Ion Channel. Table 2 gives the AUC and AUPR scores of the four approaches for the four datasets. The results of BY (2009) and 4.2 Comparison between BLM and BLM–NII Laarhoven et al. (2011) are the best ones reported in Bleakley and Yamanishi (2009) and Laarhoven et al. (2011), respectively. To directly show the improvements attributed to the NII strat- From this table, it is clear that BLM–NII outperforms the other egy, we now compare BLM–NII and BLM, i.e. the results of 242 Bipartite model for learning from local information and neighbors (a)(b)(c) Fig. 2. AUPR of BLM and BLM–NII for nuclear receptor with different types of similarities: (a)  ¼ 1, (b)  ¼ 0 and (c)  ¼ 0.5 (a)(b)(c) Fig. 3. Precision–recall curve of BLM and BLM–NII for nuclear receptor with different similarities: (a)  ¼ 1, (b)  ¼ 0 and (c)  ¼ 0.5 Table 3. Compare 1% and 3% top ranked pairs of BLM and BLM–NII for nuclear receptor Top 1% Top 3% Method Sensitivity PPV MCC Sensitivity PPV MCC 1 BLM 13.3 85.7 32.5 35.6 76.2 50.0 BLM–NII 15.6 100.0 38.3 44.4 95.2 63.7 0 BLM 16.7 93.8 38.3 32.2 67.4 44.3 BLM–NII 18.9 100.0 42.3 45.6 97.6 65.4 0.5 BLM 15.6 100.0 38.3 40.0 85.7 56.9 BLM–NII 15.6 100.0 38.3 45.6 97.6 65.4 BLM–NII where new candidates are treated as existing ones. We the precision–recall curve of BLM and BLM–NII. 
Table 3 shows applied both BLM and BLM–NII with three different groups of the sensitivity (or recall), PPV (positive predictive value or pre- inputs by setting  in Equations (13) and (14) to 1, 0 and 0.5. cision) and MCC (Matthews correlation coefficient). The two We obtained the AUC and AUPR scores of both methods groups of results in Table 3 are calculated by considering the with g ¼ max. The results of both with g ¼ average or mean 1% and 3% pairs with the highest p values as positive, respect- ij have also been produced, which can be found in Tables S1–S4 ively. It is clearly shown from these results that with NII being of the Supplementary Material. Since the same conclusion can be integrated, the performance of BLM has been improved. drawn with respect to either of the two metrics, we put the AUC scores in the Supplementary Material and plot the AUPR scores of BLM and BLM–NII for the four datasets with three different 4.3 Detailed analysis of the effectiveness of NII types of similarities in Figure 2. It is shown that for any type of To take a close look at the difference in the results attributed to similarities, BLM–NII performs better than BLM for all the the NII strategy, we now compare those top ranked interactions datasets. Again, the improvements made by BLM–NII are more significant for nuclear receptor and GPCR than for the of the nuclear receptor dataset produced by BLM–NII and other two datasets. BLM. Since this dataset has 90 known interactions, we inspect Now using nuclear receptor, we make further comparison of the 90 interactions with the highest probabilities predicted by the performance between BLM and BLM–NII. Figure 3 plots each algorithm. 243 J.-P.Mei et al. 
As summarized in Table 4 (more detailed results are in Table S11 of the Supplementary Material), among the top 90 predicted interactions, BLM correctly detected only 58 known interactions while BLM–NII detected 71, and 57 known interactions are ranked within the top 90 by both. Although one interaction detected by BLM is missed by BLM–NII, it ranks 104 in BLM–NII, which indicates that this pair is still recognized by BLM–NII as interacting with high probability. In contrast, 14 interactions detected by BLM–NII are missed by BLM. The average rank of these 14 interactions produced by BLM is 388, as some of them rank very low.

Table 4. Performance of BLM and BLM–NII on nuclear receptor

Total known interactions: 90
Interactions detected by BLM                      58
Interactions detected by BLM–NII                  71
Interactions detected by both BLM and BLM–NII     57

Among these 14 drug–target pairs, three pairs, namely D00163 (Chenodeoxycholic acid) – hsa9971 (nuclear receptor subfamily 1, group H, member 4), D00506 (Phenobarbital) – hsa9970 (nuclear receptor subfamily 1, group I, member 3) and D05341 [Palmitic acid (NF)] – hsa3174 (hepatocyte nuclear factor 4, gamma), which are assigned extremely low ranks by BLM, are successfully detected by BLM–NII as shown in Figure 4. After checking, we find that the query drug D00163 of the first pair has only one target, which happens to be the query target hsa9971, and the query target is known to interact only with the query drug. The other two pairs are in the same situation. As we left out the true interaction in our simulation, the test for these three pairs becomes predicting an interaction between a new drug-candidate compound and a new target-candidate protein. Since training data are absent for both the query drug and the query target, BLM fails to detect interactions for those three pairs. Although such cases are difficult, BLM–NII successfully detected these three pairs as interacting. This shows the effectiveness of NII for predicting interactions involving new candidates.

Fig. 4. Drug–target interaction matrix of nuclear receptor. (a) Predicted by BLM, (b) predicted by BLM–NII, (c) real interaction matrix. Each entry of the interaction matrix is plotted as a pixel. The brightness of a pixel represents the interaction possibility of the corresponding pair, i.e. the brighter the pixel, the more likely the pair interacts. Three pairs are circled in (b) (figure labels: pair1, pair2, pair3). These three pairs, each consisting of a drug candidate and a protein candidate that are both 'new' in the leave-one-out validation, are detected by BLM–NII

Now using D00163 and hsa9971 as an example, we give intermediate results to illustrate how NII helps detect the interactions between new drug-candidate compounds and new target-candidate proteins. Figure 5 shows the local model learned for D00163 with the help of inferred training data. Specifically, Figure 5a shows the inferred interaction profile of D00163, i.e. the weighted interactions between D00163 and 25 non-query targets calculated with Equation (15). It shows that the associations between D00163 and several targets such as hsa2099 are large. This is because many of D00163's neighbors or similar drugs, such as D00066, interact with this target, as seen from Figure 5b. Using this inferred interaction profile as label information, Figure 5c shows the learned local model of D00163, i.e. the weight of each of the targets learned with the classifier with respect to D00163. With this learned model, BLM–NII successfully detected the interaction between D00163 and hsa9971 based on the similarities between the query target hsa9971 and the other targets, especially those with large weights in the model of D00163. In the same manner, the local model of the query target, which is a 'new' candidate, can be learned with NII. This example illustrates the feasibility and effectiveness of the presented approach to infer training data or label information from the interaction profiles of neighbors.

Fig. 5. Local model learning for D00163 with BLM–NII. (a) Inferred interaction profile l of D00163, (b) weighted interaction of D00163's neighbors to hsa2099 calculated with s(D00163, i) × a(i, hsa2099) for each drug i, (c) learned model of D00163 by the RLS classifier, (d) similarities between hsa9971 and other proteins, i.e. s_hsa9971

5 CONCLUSION AND DISCUSSION

We proposed an intuitive solution to the new candidate problem of BLM by integrating an NII procedure, i.e. inferring training data from neighbors' interaction profiles. Through systematic experiments with benchmark datasets, we demonstrated the effectiveness of BLM–NII for predicting interactions between new drug-candidate compounds and new target-candidate proteins.

In the presented approach, we allow all the neighbors to participate in inferring the training data. To allow only neighbors with large similarities to contribute, a threshold may be used to reduce the impact of non-important neighbors to 0. Alternatively, a Gaussian function may be introduced to gradually decrease the influence of neighbors based on their distances to the new drug/target candidate in query.

In the current work, we only apply the NII procedure to completely new candidates that have no existing training data at all, and we find that the results are already good enough to show the usefulness of NII. Since it is quite common that drugs only activate or inhibit a small number of targets and targets are only activated or inhibited by a very limited number of drugs, the NII procedure may also be applied to drugs and targets that do not have sufficient training data. We expect that more accurate prediction models may be built by using neighbors' information to enhance the limited training examples. However, too much emphasis on neighbors tends to eliminate the local characteristics of each drug and target and could cause deterioration in the prediction performance. It would therefore be interesting future work to explore the balance between local information and global information in model learning.

Funding: This research was supported by Singapore MOE AcRF (MOE2008-T2-1-074) and Startup (M4080108.020) from Nanyang Technological University, Singapore.

Conflict of Interest: none declared.

REFERENCES

Bleakley,K. and Yamanishi,Y. (2009) Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics, 25, 2397–2403.
Campillos,M. et al. (2008) Drug target identification using side-effect similarity. Science, 321, 263–266.
Chen,X. et al. (2012) Drug–target interaction prediction by random walk on the heterogeneous network. Mol. BioSyst., 8, 1970–1978.
Cheng,F. et al. (2012) Prediction of drug–target interactions and drug repositioning via network-based inference. PLoS Comput. Biol., 8, e1002503.
Gunther,S. et al. (2008) Supertarget and matador: resources for exploring drug–target relationships. Nucleic Acids Res., 36, D919–D922.
Hattori,M. et al. (2003) Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. J. Am. Chem. Soc., 125, 11853–11865.
Haupt,V.J. and Schroeder,M. (2011) Old friends in new guise: repositioning of known drugs with structural bioinformatics. Brief. Bioinform., 12, 312–326.
He,Z. et al. (2010) Predicting drug–target interaction networks based on functional groups and biological features. PLoS One, 5, e9603.
Kanehisa,M. et al. (2006) From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res., 34, D354–D357.
Keiser,M.J. et al. (2009) Predicting new molecular targets for known drugs. Nature, 462, 175–181.
Laarhoven,T.V. et al. (2011) Gaussian interaction profile kernels for predicting drug–target interaction. Bioinformatics, 27, 3036–3043.
Laggner,C. et al. (2012) Chemical informatics and target identification in a zebrafish phenotypic screen. Nat. Chem. Biol., 8, 144–146.
Martin,Y.C. et al. (2002) Do structurally similar molecules have similar biological activity? J. Med. Chem., 45, 4350–4358.
Perlman,L. et al. (2011) Combining drug and gene similarity measures for drug–target elucidation. J. Comput. Biol., 18, 133–145.
Schomburg,I. et al. (2004) Brenda, the enzyme database: updates and major new developments. Nucleic Acids Res., 32, D431–D433.
Wishart,D.S. et al. (2008) Drugbank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res., 36, D901–D906.
Xia,Z. et al. (2010) Semi-supervised drug–protein interaction prediction from heterogeneous biological spaces. BMC Syst. Biol., 4 (Suppl. 2), S6.
Yamanishi,Y. et al. (2008) Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics, 24, i232–i240.

Drug–target interaction prediction by learning from local information and neighbors

Bioinformatics, Volume 29 (2): 8 – Nov 17, 2012



Publisher
Oxford University Press
Copyright
© The Author 2012. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: [email protected]
ISSN
1367-4803
eISSN
1460-2059
DOI
10.1093/bioinformatics/bts670
pmid
23162055

4EVALUATION It is interesting to observe that different levels of improve- Systematic experiments are performed to evaluate the perform- ments have been achieved for different datasets. Comparing ance of the presented approach with datasets summarized in Laarhoven et al. (2011) and BY (2009), the improvement is the Table 1. As in Laarhoven et al. (2011), LOOCV is performed. most significant on ion channel and the least significant on nu- Since the real interaction to be predicted is left out, compounds clear receptor. Differently, comparing BLM–NII and Laarhoven and proteins with one interaction (i.e. D ¼1or D ¼ 1) turn out d t et al. (2011), the improvement is the largest for nuclear receptor to have no training data and thus they are treated as ‘new can- and the least for ion channel. Such kind of differences are ex- didates’ in the cross-validation. To test the robustness of the pected due to the differences in the structure of the datasets. presented approach, we also performed 10-fold cross-validation. From Table 1, it is shown that among the four datasets, the The results of 10 trials 10-fold cross-validation can be found in average numbers of interactions of each drug and target are Tables S5–S8 of the Supplementary Material. the largest for ion channel and the smallest for nuclear receptor. This means that the interaction network of ion channel contains 4.1 Compare with state-of-the-art approaches more information than nuclear receptor and thus the First, we compare the performance of BLM–NII (g ¼ max, network-based similarity of ion channel is more robust and in- ¼ 0.5) with the weighted profile method (Yamanishi et al., formative than that of nuclear receptor. Therefore, incorporating 2008) and two other state-of-the-art approaches (Bleakley and the network-based similarity results in larger improvement for Yamanishi, 2009) and (Laarhoven et al., 2011) denoted as BY ion channel. 
Since drugs or targets with one interaction are ‘new (2009) and Laarhoven et al. (2011), respectively. The same RLS candidates’ in the simulation, it is also shown from Table 1 that classifier is used for BLM–NII as Laarhoven et al. (2011). We the nuclear receptor contains the largest portion of ‘new candi- measure the quality of the predicted interactions in terms of dates’ while the ion channel contains the least. Thus, by applying AUC curve (or true-positive rate versus false-positive rate the NII procedure, BLM–NII has more chances to improve the curve) and AUPR curve. results for nuclear receptor than for Ion Channel. Table 2 gives the AUC and AUPR scores of the four approaches for the four datasets. The results of BY (2009) and 4.2 Comparison between BLM and BLM–NII Laarhoven et al. (2011) are the best ones reported in Bleakley and Yamanishi (2009) and Laarhoven et al. (2011), respectively. To directly show the improvements attributed to the NII strat- From this table, it is clear that BLM–NII outperforms the other egy, we now compare BLM–NII and BLM, i.e. the results of 242 Bipartite model for learning from local information and neighbors (a)(b)(c) Fig. 2. AUPR of BLM and BLM–NII for nuclear receptor with different types of similarities: (a)  ¼ 1, (b)  ¼ 0 and (c)  ¼ 0.5 (a)(b)(c) Fig. 3. Precision–recall curve of BLM and BLM–NII for nuclear receptor with different similarities: (a)  ¼ 1, (b)  ¼ 0 and (c)  ¼ 0.5 Table 3. Compare 1% and 3% top ranked pairs of BLM and BLM–NII for nuclear receptor Top 1% Top 3% Method Sensitivity PPV MCC Sensitivity PPV MCC 1 BLM 13.3 85.7 32.5 35.6 76.2 50.0 BLM–NII 15.6 100.0 38.3 44.4 95.2 63.7 0 BLM 16.7 93.8 38.3 32.2 67.4 44.3 BLM–NII 18.9 100.0 42.3 45.6 97.6 65.4 0.5 BLM 15.6 100.0 38.3 40.0 85.7 56.9 BLM–NII 15.6 100.0 38.3 45.6 97.6 65.4 BLM–NII where new candidates are treated as existing ones. We the precision–recall curve of BLM and BLM–NII. 
Table 3 shows applied both BLM and BLM–NII with three different groups of the sensitivity (or recall), PPV (positive predictive value or pre- inputs by setting  in Equations (13) and (14) to 1, 0 and 0.5. cision) and MCC (Matthews correlation coefficient). The two We obtained the AUC and AUPR scores of both methods groups of results in Table 3 are calculated by considering the with g ¼ max. The results of both with g ¼ average or mean 1% and 3% pairs with the highest p values as positive, respect- ij have also been produced, which can be found in Tables S1–S4 ively. It is clearly shown from these results that with NII being of the Supplementary Material. Since the same conclusion can be integrated, the performance of BLM has been improved. drawn with respect to either of the two metrics, we put the AUC scores in the Supplementary Material and plot the AUPR scores of BLM and BLM–NII for the four datasets with three different 4.3 Detailed analysis of the effectiveness of NII types of similarities in Figure 2. It is shown that for any type of To take a close look at the difference in the results attributed to similarities, BLM–NII performs better than BLM for all the the NII strategy, we now compare those top ranked interactions datasets. Again, the improvements made by BLM–NII are more significant for nuclear receptor and GPCR than for the of the nuclear receptor dataset produced by BLM–NII and other two datasets. BLM. Since this dataset has 90 known interactions, we inspect Now using nuclear receptor, we make further comparison of the 90 interactions with the highest probabilities predicted by the performance between BLM and BLM–NII. Figure 3 plots each algorithm. 243 J.-P.Mei et al. 
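The top-ranked evaluation behind Table 3 can be illustrated with a minimal sketch. The function names `confusion_metrics` and `top_fraction_metrics` are hypothetical, and taking the floor of the pair count and breaking ties by list order are assumptions rather than details given in the paper. The synthetic ranking below uses 54 × 26 = 1404 pairs with 90 known interactions (Table 1); the count of 12 true interactions among the top 14 pairs is back-calculated here from the percentages in the first row of Table 3 and is not stated in the source.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Return (sensitivity, PPV, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, ppv, mcc


def top_fraction_metrics(scores, labels, fraction):
    """Call the top `fraction` of pairs (by score) positive, then score them."""
    n = len(scores)
    k = int(n * fraction)  # assumption: floor of the pair count
    order = sorted(range(n), key=lambda idx: scores[idx], reverse=True)
    tp = sum(labels[idx] for idx in order[:k])
    fp = k - tp
    fn = sum(labels) - tp
    tn = n - tp - fp - fn
    return confusion_metrics(tp, fp, fn, tn)


# Synthetic ranking (not the authors' data): 1404 pairs, 90 positives,
# arranged so that 12 of the 14 highest-scored pairs are true interactions.
labels = [1] * 12 + [0] * 2 + [1] * 78 + [0] * 1312
scores = [len(labels) - idx for idx in range(len(labels))]  # strictly decreasing

sens, ppv, mcc = top_fraction_metrics(scores, labels, 0.01)
print(round(100 * sens, 1), round(100 * ppv, 1), round(100 * mcc, 1))
```

Under these assumed counts, the printed percentages agree with the BLM row for $\alpha = 1$ at the top 1% level in Table 3.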
As summarized in Table 4 (more detailed results are in Table S11 of the Supplementary Material), among the top 90 predicted interactions, BLM correctly detected only 58 known interactions while BLM–NII detected 71, and 57 known interactions are ranked within the top 90 by both. Although one interaction detected by BLM is missed by BLM–NII, it ranks 104 in BLM–NII, which indicates that BLM–NII still recognizes this pair as interacting with high probability. In contrast, 14 interactions detected by BLM–NII are missed by BLM. The average rank that BLM assigns to these 14 interactions is 388, as some of them rank very low.

Table 4. Performance of BLM and BLM–NII on nuclear receptor

| Total known interactions | 90 |
| Interactions detected by BLM | 58 |
| Interactions detected by BLM–NII | 71 |
| Interactions detected by both BLM and BLM–NII | 57 |

Among these 14 drug–target pairs, three pairs, namely D00163 (Chenodeoxycholic acid) – hsa9971 (nuclear receptor subfamily 1, group H, member 4), D00506 (Phenobarbital) – hsa9970 (nuclear receptor subfamily 1, group I, member 3) and D05341 [Palmitic acid (NF)] – hsa3174 (hepatocyte nuclear factor 4, gamma), which are assigned extremely low ranks by BLM, are successfully detected by BLM–NII, as shown in Figure 4. On inspection, we find that the query drug D00163 of the first pair has only one target, which happens to be the query target hsa9971, and the query target is known to interact only with the query drug. The other two pairs are in the same situation. As we left out the true interaction in our simulation, testing these three pairs amounts to predicting the interaction between a new drug-candidate compound and a new target-candidate protein. Since training data are absent for both the query drug and the query target, BLM fails to detect interactions for those three pairs. Despite the difficulty of such cases, BLM–NII successfully detected these three pairs as interacting. This shows the effectiveness of NII for predicting interactions involving new candidates.

Fig. 4. Drug–target interaction matrix of nuclear receptor. (a) Predicted by BLM, (b) predicted by BLM–NII, (c) real interaction matrix. Each entry of the interaction matrix is plotted as a pixel. The brightness of a pixel represents the interaction possibility of the corresponding pair, i.e. the brighter the pixel, the more likely the pair interacts. Three pairs (pair1, pair2 and pair3) are circled in (b). These three pairs, which consist of a drug candidate and a protein candidate that are both 'new' in LOOCV, are detected by BLM–NII

Now, using D00163 and hsa9971 as an example, we give intermediate results to illustrate how NII helps detect the interactions between new drug-candidate compounds and new target-candidate proteins. Figure 5 shows the local model learned for D00163 with the help of inferred training data. Specifically, Figure 5a shows the inferred interaction profile of D00163, i.e. the weighted interactions between D00163 and the 25 non-query targets calculated with Equation (15). It shows that the associations between D00163 and several targets, such as hsa2099, are large. This is because many of D00163's neighbors or similar drugs, such as D00066, interact with this target, as seen from Figure 5b. Using this inferred interaction profile as label information, Figure 5c shows the learned local model of D00163, i.e. the weight of each target learned by the classifier with respect to D00163. With this learned model, BLM–NII successfully detected the interaction between D00163 and hsa9971 based on the similarities between the query target hsa9971 and the other targets, especially those with large weights in the model of D00163. In the same manner, the local model of the query target, which is also a 'new' candidate, can be learned with NII. This example illustrates the feasibility and effectiveness of the presented approach for inferring training data or label information from the interaction profiles of neighbors.

Fig. 5. Local model learning for D00163 with BLM–NII. (a) Inferred interaction profile $l$ of D00163, (b) weighted interaction of D00163's neighbors to hsa2099 calculated with $s(\mathrm{D00163}, i) \cdot a(i, \mathrm{hsa2099})$ for each drug $i$, (c) learned model of D00163 by the RLS classifier, (d) similarities between hsa9971 and other proteins, i.e. $s_{\mathrm{hsa9971}}$

5 CONCLUSION AND DISCUSSION

We proposed an intuitive solution to the new candidate problem of BLM by integrating an NII procedure, i.e. inferring training data from neighbors' interaction profiles. Through systematic experiments with benchmark datasets, we demonstrated the effectiveness of BLM–NII for predicting interactions between new drug-candidate compounds and new target-candidate proteins.

In the presented approach, we allow all the neighbors to participate in inferring the training data. To allow only neighbors with large similarities to contribute, a threshold may be used to reduce the impact of unimportant neighbors to 0. Alternatively, a Gaussian function may be introduced to gradually decrease the influence of neighbors based on their distances to the new drug/target candidate in query.

In the current work, we apply the NII procedure only to completely new candidates that have no existing training data at all, and we find that the results are already good enough to show the usefulness of NII. Since it is quite common that drugs activate or inhibit only a small number of targets and that targets are activated or inhibited by only a few drugs, the NII procedure may also be applied to drugs and targets that do not have sufficient training data. We expect that more accurate prediction models may be built by using neighbors' information to enhance the limited training examples. However, too much emphasis on neighbors tends to eliminate the local characteristics of each drug and target and could degrade the prediction performance. It would therefore be interesting future work to explore the balance between local information and global information in model learning.

Funding: This research was supported by Singapore MOE AcRF (MOE2008-T2-1-074) and Startup (M4080108.020) from Nanyang Technological University, Singapore.

Conflict of Interest: none declared.

REFERENCES

Bleakley,K. and Yamanishi,Y. (2009) Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics, 25, 2397–2403.
Campillos,M. et al. (2008) Drug target identification using side-effect similarity. Science, 321, 263–266.
Chen,X. et al. (2012) Drug–target interaction prediction by random walk on the heterogeneous network. Mol. BioSyst., 8, 1970–1978.
Cheng,F. et al. (2012) Prediction of drug–target interactions and drug repositioning via network-based inference. PLoS Comput. Biol., 8, e1002503.
Günther,S. et al. (2008) Supertarget and matador: resources for exploring drug–target relationships. Nucleic Acids Res., 36, D919–D922.
Hattori,M. et al. (2003) Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. J. Am. Chem. Soc., 125, 11853–11865.
Haupt,V.J. and Schroeder,M. (2011) Old friends in new guise: repositioning of known drugs with structural bioinformatics. Brief. Bioinform., 12, 312–326.
He,Z. et al. (2010) Predicting drug–target interaction networks based on functional groups and biological features. PLoS One, 5, e9603.
Kanehisa,M. et al. (2006) From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res., 34, D354–D357.
Keiser,M.J. et al. (2009) Predicting new molecular targets for known drugs. Nature, 462, 175–181.
Laarhoven,T.V. et al. (2011) Gaussian interaction profile kernels for predicting drug–target interaction. Bioinformatics, 27, 3036–3043.
Laggner,C. et al. (2012) Chemical informatics and target identification in a zebrafish phenotypic screen. Nat. Chem. Biol., 8, 144–146.
Martin,Y.C. et al. (2002) Do structurally similar molecules have similar biological activity? J. Med. Chem., 45, 4350–4358.
Perlman,L. et al. (2011) Combining drug and gene similarity measures for drug–target elucidation. J. Comput. Biol., 18, 133–145.
Schomburg,I. et al. (2004) Brenda, the enzyme database: updates and major new developments. Nucleic Acids Res., 32, D431–D433.
Wishart,D.S. et al. (2008) Drugbank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res., 36, D901–D906.
Xia,Z. et al. (2010) Semi-supervised drug–protein interaction prediction from heterogeneous biological spaces. BMC Syst. Biol., 4 (Suppl. 2), S6.
Yamanishi,Y. et al. (2008) Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics, 24, i232–i240.

Journal

Bioinformatics, Oxford University Press

Published: Nov 17, 2012
