A review of databases predicting the effects of SNPs in miRNA genes or miRNA-binding sites

Abstract

Modern precision medicine builds on the knowledge and understanding of individual differences in the genomic sequence of patients to provide tailor-made treatments. Typically, such variants are considered in coding regions only, and their effects are predicted based on their impact on the amino acid sequence of expressed proteins. However, assessing the effects of variants in noncoding elements, in particular microRNAs (miRNAs) and their binding sites, is important as well, as a single miRNA can influence the expression patterns of many genes at the same time. To analyze the effects of variants in miRNAs and their target sites, several databases storing variant impact predictions have been published. In this review, we compare the core functionalities and features of these databases and discuss the importance of up-to-date data resources in the context of web applications. Finally, we outline some recommendations for future developments in the field.

Keywords: miRNAs, SNPs, databases, target sites

Introduction

With the advent of next-generation sequencing, the amount of available biological data sets is continuously increasing [1, 2]. These high-throughput technologies have greatly facilitated the discovery of single-nucleotide polymorphisms (SNPs) or single-nucleotide variants (SNVs). It is therefore not surprising that during the past decade, the number of known variants has increased exponentially. The largest resource storing human genetic variation as of today is NCBI’s dbSNP [3], which in its current version (build 150) encompasses over 100 million validated variants, corresponding to roughly one variant every 30 bases. Importantly, SNPs have been used as markers for a large panel of diseases, such as cystic fibrosis [4], various cancers [5–7] and neurodegenerative diseases [8, 9].
Indeed, variants in coding regions might directly affect protein formation and expression and are therefore still the main focus of current variant analysis applications. The effects of variants located in noncoding regions, however, are more difficult to elucidate. In recent years, increasing attention has been paid to the noncoding regions of the human genome, which make up over 98% of the genome [10]. Many regulatory RNA classes have been discovered in these regions so far, such as long noncoding RNAs and microRNAs (miRNAs). The latter are endogenous small noncoding RNA molecules that play a central role in posttranscriptional gene regulation [11]. They are evolutionarily conserved and expected to regulate a large part of the human protein-coding genes and a majority of pathways [12]. Blood-borne miRNAs in particular have therefore been investigated as noninvasive biomarkers for the early detection of multiple diseases [13–15], highlighting their potential for precision medicine.

Regarding their general mechanism of action, miRNAs bind to the 3ʹ untranslated region (UTR) of mRNAs, which leads either to mRNA cleavage or to translational inhibition [16]. Recent studies revealed that miRNAs can sometimes bind to the 5ʹ UTR as well, leading to increased mRNA expression [17–20]. Multiple biogenesis pathways have been reported for miRNAs in mammals [21]; however, most miRNAs seem to follow a single pathway. Their maturation process starts with the transcription of a longer primary miRNA (pri-miRNA) molecule containing a local stem-loop structure. This molecule is processed by the Microprocessor complex composed of Drosha and DGCR8, which cleaves the stem-loop to release a small hairpin, the precursor miRNA (pre-miRNA), with a 2 nucleotide (nt) long 3ʹ overhang [22]. The hairpin is then exported into the cytosol, where its loop is cleaved by Dicer, resulting in a small RNA duplex [23].
The latter is then loaded onto an Argonaute protein to form the RNA-induced silencing complex (RISC) [24], and subsequently one of the two strands is degraded. The remaining strand then guides the RISC to its mRNA target. The complete binding mechanism of miRNAs is not yet fully understood, although complementarity between the 3ʹ UTR and the seed region, which consists of six to eight nts starting at the second position from the 5ʹ end of the miRNA, plays a crucial role in mRNA target selection [25]. As perfect or nearly perfect complementarity to such binding sites, also called miRNA response elements (MREs), is required, it is evident that SNVs in these sites or inside the miRNA seed sequence can have a substantial impact on the overall regulation network. SNVs in MREs can lead to a loss of binding ability for certain miRNAs, but at the same time increase the binding ability of other miRNAs. Further, SNVs outside of MREs might result in the creation of new MREs. In the same vein, SNVs in miRNAs might lead to the loss of regulation of one target gene, but also to a gain of regulation of another target gene. Furthermore, SNVs in pri- or pre-miRNAs could have a large regulatory effect as well, as they could, in rare cases, lead to changes in the secondary structure of the pri-miRNA and thus to reduced cleavage efficiency by Drosha and Dicer [26].

Even though the seed regions of miRNAs are evolutionarily conserved [27] and the occurrence of SNVs in these regions is rare, a multitude of diseases has been found to be associated with such variants [28–30]. Prominent examples are mental disorders like schizophrenia and autism [31], numerous cancers and nonsyndromic progressive hearing loss [32]. Similarly, SNVs in 3ʹ UTRs have been correlated with multiple cancers [33–36] and neurodegenerative diseases [37].
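The gain and loss of MREs through a single base change, as described above, can be illustrated with a small Python sketch. It assumes, purely for illustration, that an MRE is a perfect match to the 7-nt seed (miRNA positions 2 to 8); the function names are hypothetical, and real predictors such as TargetScan or miRanda use considerably richer models.

```python
from __future__ import annotations

# Watson-Crick complements for RNA (G:U wobble pairs are ignored in this toy model).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str, start: int = 2, length: int = 7) -> str:
    """Return the mRNA motif (5'->3') that pairs perfectly with the miRNA seed."""
    seed = mirna[start - 1 : start - 1 + length]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_sites(utr: str, mirna: str) -> set[int]:
    """Return 0-based start positions of perfect seed matches in the UTR."""
    site = seed_site(mirna)
    return {
        i for i in range(len(utr) - len(site) + 1) if utr[i : i + len(site)] == site
    }


def apply_snv(utr: str, pos: int, alt: str) -> str:
    """Return the UTR with the base at 0-based position `pos` replaced by `alt`."""
    return utr[:pos] + alt + utr[pos + 1 :]


def classify_snv(utr: str, mirna: str, pos: int, alt: str) -> dict[str, list[int]]:
    """Compare seed matches on the reference and the alternative allele."""
    ref_sites = find_sites(utr, mirna)
    alt_sites = find_sites(apply_snv(utr, pos, alt), mirna)
    return {
        "gained": sorted(alt_sites - ref_sites),
        "lost": sorted(ref_sites - alt_sites),
    }


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"  # hsa-let-7a-5p, used only as an example
    toy_utr = "AAACUACCUCAAA"         # invented UTR fragment with one seed match
    print(classify_snv(toy_utr, let7a, pos=5, alt="G"))
```

In this toy model, the same variant position can destroy a site for one miRNA while creating a site for another, depending only on which seed complements the reference and alternative alleles happen to match.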
These disease associations highlight the need for a better understanding of the presence and effects of SNVs in miRNAs and UTRs, as they can lead to completely different phenotypes. For analyzing the effects of SNVs on miRNA–target relations on a system-wide basis, several miRNA-target SNP databases such as miRdSNP [38], MirSNP [39], PolymiRTS database [40] and miRNASNP [41] have been developed. However, as target prediction is computationally expensive and the number of known SNVs has been increasing substantially over the past years, most of these resources are outdated. In this review, we compare several state-of-the-art databases that are available and evaluate apparent information gaps relative to the content of up-to-date resources.

Overview of SNP effect prediction web servers

In this section, we present four databases that help to predict and assess the effects of SNVs in human miRNAs and their respective target genes. Table 1 presents a compact overview of these databases and the features they provide.

Table 1. Overview of the features provided by the evaluated databases

| Feature | miRdSNP | MirSNP | PolymiRTS Database 3.0 | miRNASNP v2.0 |
|---|---|---|---|---|
| Year of publication | 2012 | 2012 | 2013 | 2015 |
| Search miRNAs/genes/SNPs | Yes | Yes | Yes | Yes |
| Batch search | No | Yes | Yes | No |
| Search-linked SNPs | No | Yes | No | No |
| Browse | Yes | No | Yes | Yes |
| Binding site locus visualization | Yes | No | No | No |
| Binding site miRNA/mRNA visualization | Partial (a) | Yes | No | Yes |
| MRE gain/loss | Partial (b) | Yes | Yes | Yes |
| Binding affinity increase/decrease | No | Yes | No | No |
| SNP distance | Yes, in 3ʹ UTR | No | No | Yes, in miRNA flanks |
| Filter by MAF | No | Yes | No | Yes, in pre-miRNAs |
| Conservation information | No | Yes | Yes | No |
| Filter by experimental support | Yes | No | Yes | Yes |
| Filter by disease/traits/GO | Yes | No | Yes | No |
| Contains INDELs | No | No | Yes | No |
| Contains miRNA/gene expression | No | No | No | Yes |
| SNPs in miRNAs | No | No | Yes | Yes |
| SNPs in 3ʹ UTR | Yes | Yes | Yes | Yes |

(a) Depends on the target prediction algorithm used.
(b) Only gains were predicted for dSNPs.

miRdSNP

Published in 2012, miRdSNP [38] provides information about the distance between MREs and SNPs from dbSNP (build 130) [3].
It incorporates experimentally validated MREs from TarBase [42], miRTarBase [43], miRecords [44] and miR2disease [45], as well as the MRE predictions yielded by PicTar [46] and TargetScan 5.1 [47] on the wild-type 3ʹ UTR sequences. Further, a manually curated set of disease-causing SNPs (dSNPs) is available, which allows the provided predictions to be post-filtered accordingly. In addition, new MREs induced by dSNPs that were predicted using miRanda [48] are listed.

The website provides three tabs for browsing through the collected interactions. The first tab shows a searchable table containing all MREs and dSNPs. Users can search for SNPs by their dbSNP ID, miRNAs by their miRBase name and genes by their symbols. In addition, the table can be filtered according to the distance of a SNP to a binding site and according to associated diseases. For each entry in the table, the corresponding binding site of the miRNA can be displayed, as well as the location of the associated SNP. In addition, linkage disequilibrium (LD) frequency information is provided for each SNP. On the second tab, density plots visualizing MREs, SNPs and dSNPs are shown in an interactive plot, which allows regions to be inspected further in the UCSC genome browser. The third tab displays a table showing the entire gene set and its associated MREs and SNPs. This table, however, cannot be filtered. For each gene, all binding sites and SNPs can be queried.

In conclusion, miRdSNP focuses on the spatial relationship of SNPs, and thus its major strength is the distance information provided for all SNPs relative to known MREs. Another major strength is the collection of manually curated dSNPs and the ability to filter interactions accordingly. Its major weaknesses are, on the one hand, the missing information on MRE losses and, on the other, that SNPs in miRNAs are not considered at all. The database is available at http://mirdsnp.ccr.buffalo.edu/.
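The kind of SNP-to-MRE distance filter described for miRdSNP can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the database's implementation: coordinates are taken to be 1-based, inclusive positions on the same transcript, and the class, function and rs identifiers are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A binding site given by inclusive start/end coordinates (assumed 1-based)."""
    start: int
    end: int


def distance_to_site(snp_pos: int, site: Site) -> int:
    """Return 0 if the SNP lies inside the site, else the distance to the nearest edge."""
    if snp_pos < site.start:
        return site.start - snp_pos
    if snp_pos > site.end:
        return snp_pos - site.end
    return 0


def snps_within(snps: dict[str, int], site: Site, max_dist: int) -> dict[str, int]:
    """Keep only SNPs whose distance to the site is at most `max_dist` nucleotides."""
    result = {}
    for snp_id, pos in snps.items():
        dist = distance_to_site(pos, site)
        if dist <= max_dist:
            result[snp_id] = dist
    return result


if __name__ == "__main__":
    mre = Site(start=120, end=141)
    # Placeholder identifiers and positions, not real dbSNP entries.
    snps = {"rs_example_1": 95, "rs_example_2": 130, "rs_example_3": 400}
    print(snps_within(snps, mre, max_dist=30))
```

A distance of zero marks a SNP inside the site itself, which is the case the other databases in this review concentrate on.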
MirSNP

Published in 2012, MirSNP [39] provides a collection of human SNPs in potential MREs located in 3ʹ UTRs, as predicted by miRanda. SNPs were collected from dbSNP build 135. For each reported interaction, the effect induced by the corresponding SNP is reported, i.e. the creation/deletion of an MRE or the increase/decrease of binding affinity. In addition, information on the minor allele frequency (MAF) and LD is provided, if available. Finally, MirSNP implements the ability to filter predictions with lists of SNPs from, e.g., GWAS and eQTL studies, optionally including linked SNPs from multiple populations as provided by HapMap [49].

The website provides three search forms for single searches, batch searches or searches with a list of disease- or trait-associated SNPs. Via these forms, the user can query the database for SNPs by their dbSNP ID, genes by their symbol or RefSeq mRNA ID, or miRNAs by their miRBase name. Individual searches can be restricted by MAF, and linked SNPs can be requested as well. The search functionality was unavailable during our review process; therefore, we can only describe the search results according to the help page of the database. After submitting a request, the user is redirected to a separate page containing a table with predicted effects on MREs, i.e. gains and losses or changes in binding affinity. This table displays information for both the reference and alternative alleles. In addition to the basic effects, multiple measures are listed: mirSVR score, MAF, miRanda score, binding energy, conservation information from the phastCons 46-way vertebrate track of UCSC and a visualization of the miRNA/mRNA binding site.

In summary, MirSNP focuses on MREs predicted by miRanda in 3ʹ UTRs with known SNPs. Its major strength is the integration of MAF annotations and conservation scores of miRNA seed motifs.
In contrast, its major weaknesses are the missing information on experimentally validated MREs and the missing support for SNPs in miRNAs. The database is available at http://bioinfo.bjmu.edu.cn/mirsnp/search/.

PolymiRTS database 3.0

The PolymiRTS Database was originally released in 2007 [50] and has now reached its third version [40], published in 2013. It is the most comprehensive database available to date. In addition to SNPs, it also supports small insertions or deletions (INDELs) in the genomic regions of miRNAs and their target sites. In its third version, SNPs and INDELs were collected from dbSNP build 137. The predictions of creations or deletions of MREs were performed via TargetScan 6.2 [51]. Subsequently, their likelihood is assessed using the difference of their TargetScan context+ score to that of the reference target site. In addition, experimental support information was incorporated from miRecords, TarBase, miRTarBase and multiple studies, and added to the predicted MREs, if available. Furthermore, target sites identified by CLASH (cross linking, ligation and sequencing of hybrids) experiments [52], which allow the direct identification of pairs comprising a target site and its binding miRNA, are also provided. Likewise, the database links polymorphisms in MREs with possibly impacted gene pathways from the KEGG database [53], and with various human diseases and traits based on data in the NHGRI GWAS catalog [54], dbGaP [55] and eQTLs from the GTEx eQTL browser [56].

The website provides four tabs for browsing and searching interactions with either one or multiple terms and according to the chromosomal location. Users can search for SNPs by their dbSNP ID, miRNAs by their miRBase name and genes by their symbol, description or RefSeq mRNA ID. Further, users can start a query by providing traits, such as ‘Metabolic syndrome’, or gene ontology [57] terms.
All search results can be filtered to show only gains or losses of MREs and to show only effects with a particular kind of experimental support. PolymiRTS also allows search results to be filtered by the minimum occurrence of miRNA sites in other vertebrate genomes. After submission, a list of genes fulfilling the requested criteria is shown to the user. By selecting one of these genes, the user is redirected to a new tab, where all interactions related to this gene are presented. These are split into multiple tables, separating CLASH data, SNPs in miRNA target sites and MRE gains/losses caused by SNPs in miRNA seeds. If available, information on associations with human diseases and pathways is provided.

In conclusion, the PolymiRTS Database 3.0 provides a wide range of features, from simple MRE gains and losses up to pathways and diseases. The major strengths of this database are the variety of features and, importantly, the multitude of experimental evidence integrated, in particular from CLASH experiments. Weaknesses of PolymiRTS are the missing visualization of binding sites in the context of the entire 3ʹ UTR and the alignment of the miRNA-mRNA duplex, which is only partly shown. Furthermore, the display of relevant information is suboptimal, as it is mixed with a lot of other content and is therefore confusing. The database is available at http://compbio.uthsc.edu/miRSNP/.

miRNASNP v2.0

miRNASNP was initially published in 2012 [58] and updated in 2015 [41]. It provides information on the gain or loss of potential MREs caused by SNPs in 3ʹ UTRs or miRNA seed regions. MREs were predicted by miRanda and TargetScan 6.2, and SNPs were collected from dbSNP build 137. Experimentally validated targets were retrieved from TarBase, starBase [59], miRecords, miRTarBase and miR2disease, which makes it possible to filter the predictions according to these annotations.
In addition to its base functionality, the database allows users to evaluate the effects of SNPs in pre-miRNAs on their folding energy, and supplies lists of SNPs in flanking regions. To estimate the effect on expression levels, correlations between miRNA and target gene expression data from The Cancer Genome Atlas (TCGA) [60] were integrated. Similar to PolymiRTS, miRNASNP incorporates information about SNPs in GWAS-identified trait-associated regions from the NHGRI GWAS Catalog and about LD blocks for multiple populations. Finally, it also offers the possibility to predict the effects of novel SNVs on miRNA target binding, i.e. gain or loss of MREs, as well as their impact on the structure of the pre-miRNA.

The website allows users to browse different subsections of the database via links on the homepage. The search functionality is incorporated directly in the browse interface and can also be used from a separate tab. A search tab allows users to search for MRE gains or losses caused by SNPs in miRNA seed regions or in 3ʹ UTRs. Search requests can be performed via dbSNP ID, miRBase miRNA name or gene symbol. After selection, the user is redirected to a distinct browse interface, where post-filtering according to miRNA or gene expression is possible, as well as according to SNPs in LD regions. Lost MREs can be filtered based on the available experimental validation and on negative correlation of miRNA expression with gene expression. A resulting table shows information on the expression of miRNAs and genes in different tissues covered by the integrated TCGA samples. Moreover, the miRNA- or mRNA-binding site is displayed and the binding energy changes of the miRNA/mRNA duplexes are shown. Other search functions are provided as well, so that users can search for specific miRNA precursors and for the effects of SNPs in flanking regions, seed regions or pre-miRNA regions outside the seed.
As a result of these queries, a list of SNPs and their distances to the pre-miRNA and a list of SNPs with their predicted effect on the mature miRNA expression, as well as the introduced energy changes, are displayed. In addition, for each pre-miRNA, a detailed overview helps to retrieve the location and expression of mature miRNAs, the list of SNPs found in it and their effect on the secondary structure. In particular, for these SNPs, LD regions are shown in different populations and linked diseases or traits are reported. miRNASNP v2.0 also offers a tab where users can input custom mutated UTR sequences, miRNA sequences or pre-miRNA sequences to predict the resulting MRE gains/losses or the effect on the secondary structure.

In summary, miRNASNP v2.0 is based on MRE predictions of miRanda and TargetScan and provides many additional features. Its major strengths are the incorporation of expression data, the ability to assess the impact of SNPs on the secondary structure of pre-miRNAs, including the potential effects on mature miRNA expression, and the possibility to evaluate novel data. A major weakness is the missing batch search for known and novel data sets. The database is available at http://bioinfo.life.hust.edu.cn/miRNASNP2/index.php.

Comparison of SNP effect prediction web servers

In our study, we compared the databases described above based on four criteria: their data sources, target prediction methods, database functionality and integration of experimental information. After the comparison, we assessed their individual performance in a benchmark on 16 reported and validated SNP effects.

Data sources

The release versions of the data sets used by each of the databases were retrieved from their respective publications. In cases where release versions for data sources were not specified, they were inferred from the year of publication of the respective database.
As shown in Table 2, the most recently published databases, miRNASNP v2.0 and PolymiRTS, are the most up to date among the published applications. Among these databases, only miRdSNP uses the precomputed wild-type MRE predictions from TargetScan and PicTar, which are available on the respective tool websites. Although this approach cuts down the processing time for predictions, it may lead to missed wild-type interactions, as PicTar used an old assembly of the human genome (hg17) as reference for the MRE predictions, which thus needed to be lifted over to the subsequent version (hg18) used by all other resources of miRdSNP.

Table 2. Overview of the integrated resources and build versions of the databases in this review

| Resource | miRdSNP | MirSNP | PolymiRTS Database 3.0 | miRNASNP v2.0 | Latest |
|---|---|---|---|---|---|
| Genome assembly | hg18 | hg19 | hg19 | hg19 | hg38 |
| dbSNP | 130 | 135 | 137 | 137 | 150 |
| HapMap | 27 | 28 (a) | – | – | 28 |
| TargetScan | 5.1 | – | – | – | 7.1 |
| PicTar | N.A. | – | – | – | N.A. |
| miRBase | 17 (a) | 18 | 20 | 19 | 21 |
| miRTarBase | 2.5 (a) | – | 4.3+ | 4.5 (a) | 6.1 |
| miRecords | N.A. | – | N.A. | N.A. | N.A. |
| miR2disease | N.A. | – | – | N.A. | N.A. |
| TarBase | 6.0+ | – | 6.0+ | 7.0 (a) | 7.0 |
| starBase | – | – | – | N.A. | N.A. |
| CLASH experiments | – | – | N.A. | – | N.A. |
| TCGA | – | – | – | N.A. | N.A. |

N.A. Version not available.
– Data source not used.
(a) Data source version derived from date of publication.

With the exception of PolymiRTS and miRNASNP, none of the databases has been updated after its initial publication. Even PolymiRTS and miRNASNP have not been updated for 3 and 2 years, respectively. The current version of dbSNP (build 150) contains 98 million more validated SNPs, i.e. 3.6-fold more SNPs, than the most recent dbSNP version (build 137) used by the databases described here. Furthermore, all of them use outdated miRBase versions. In Table 3, we report the number of considered miRNAs, UTRs and SNPs and the reported interactions of each database. In addition, we computed the latest numbers based on dbSNP build 149, miRBase version 21 and the common predictions of miRanda and the seed matching step of TargetScan 7.1. As can be seen in Table 3, the number of 3ʹ UTRs has not changed much over the years.

Table 3. Overview of the number of SNPs, miRNAs and interactions supported by the databases

| Quantity | miRdSNP | MirSNP | PolymiRTS Database 3.0 | miRNASNP v2.0 | Latest (a) |
|---|---|---|---|---|---|
| Number of miRNAs (human) | N.A. | 1921 | 2578 | 2042 | 2588 |
| Number of UTRs | 19 834 | 29 273 | 18 678 (b) | 37 348 | 19 107 |
| Number of SNPs in miRNA target sites (human) | 175 351 | 414 510 | 358 874 | 566 176 | 1 810 468 |
| Experimentally validated miRNA targets (human) | N.A. | N.A. | 2070 | 393 936 | 322 160 |
| Number of SNPs in miRNA seed regions (human) | N.A. | N.A. | 271 | 227 | 2525 |
| Number of total interactions | 174 (c) | 1 562 149 (c) | 2 089 646 (c, d) | 1 271 259 (c) | 15 739 025 |
| Gains of MREs because of polymorphism in miRNA | N.A. | N.A. | N.A. | 162 441 (c) | 5 559 979 |
| Losses of MREs because of polymorphism in miRNA | N.A. | N.A. | N.A. | 153 290 (c) | 5 540 025 |
| Gains of MREs because of polymorphism in UTR | 174 (c) | 497 895 (c) | 906 703 (c) | 509 791 (c) | 1 940 014 |
| Losses of MREs because of polymorphism in UTR | N.A. | 480 046 (c) | 923 082 (c) | 445 737 (c) | 2 699 007 |

N.A. Not available.
(a) According to dbSNP 149 and miRBase 21 and common predictions of miRanda and the seed matching step of TargetScan 7.1.
(b) Data from PolymiRTS 2.0.
(c) Effects for the same gene only counted once.
(d) Without MRE effects because of polymorphisms in miRNAs.

Interestingly, miRNASNP v2.0 considers all reported 3ʹ UTRs, in contrast to the other databases, which only consider the longest one per gene; this explains the observed difference in the number of UTRs.
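As a quick arithmetic aid for reading Table 3, the ratio between the "Latest" column and the two most recent databases can be computed directly. The sketch below simply copies three rows of Table 3; the dictionary layout and function name are illustrative choices.

```python
# Counts copied from Table 3 for the two most recent databases and the
# "Latest" column (dbSNP 149, miRBase 21).
TABLE3 = {
    "snps_in_target_sites": {
        "PolymiRTS 3.0": 358_874,
        "miRNASNP v2.0": 566_176,
        "Latest": 1_810_468,
    },
    "snps_in_seed_regions": {
        "PolymiRTS 3.0": 271,
        "miRNASNP v2.0": 227,
        "Latest": 2_525,
    },
    "total_interactions": {
        "PolymiRTS 3.0": 2_089_646,
        "miRNASNP v2.0": 1_271_259,
        "Latest": 15_739_025,
    },
}


def fold_changes(row):
    """Return latest / database count for every database in one table row."""
    latest = row["Latest"]
    return {db: latest / count for db, count in row.items() if db != "Latest"}


if __name__ == "__main__":
    for quantity, row in TABLE3.items():
        ratios = {db: round(r, 1) for db, r in fold_changes(row).items()}
        print(quantity, ratios)
```

Because the databases differ in prediction algorithm and filtering, these ratios reflect both the growth of dbSNP and the choice of prediction strategy, not the growth of the underlying data alone.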
When comparing the latest statistics with those of PolymiRTS 3.0 and miRNASNP v2.0, which use the most recent dbSNP build among the four databases, we observe that the number of SNPs in miRNA target sites has increased 3–5-fold, and the number of SNPs in miRNA seed regions has even grown 9–11-fold. These numbers are also reflected in the total number of reported interactions, which has grown 8–12-fold. Of course, these numbers depend not only on the underlying data but also on the prediction algorithms. To compute the latest numbers, we took a similar approach to miRNASNP v2.0, except that we did not restrict the sites predicted by TargetScan to the conserved ones. Therefore, if applied to current data, our method reports fewer interactions than PolymiRTS (which reports all TargetScan hits) or MirSNP (which reports all miRanda hits), but more than miRNASNP v2.0.

Figure 1 shows all predicted interactions caused by polymorphisms in the 3ʹ UTR for all considered databases. To this end, we collected all interactions from the database files available on the respective websites and converted all miRNA names to their latest names in miRBase. To avoid counting identical effects found in different UTRs of the same gene multiple times, we compared the reported miRNA–gene symbol–SNP triplets of each database instead of their UTR RefSeq identifiers. In addition, we excluded the predicted increases or decreases in binding affinity of MirSNP. We can see that 76% (133) of the MRE gains covered by miRdSNP are also found in all other databases. In addition, 573 442 interactions are shared by the three remaining databases, representing between 27 and 60% of the total number of interactions reported by each of them. PolymiRTS in particular differs from the others, which is not surprising given its unfiltered use of TargetScan and its reporting of interactions for both alleles.
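The triplet-based comparison described above can be sketched with plain Python sets. The record layout (miRNA, gene symbol, SNP ID, UTR identifier), the database names and the identifiers below are assumptions made for illustration; the original analysis may have been carried out differently.

```python
from __future__ import annotations


def to_triplets(records) -> set[tuple[str, str, str]]:
    """Collapse records to (miRNA, gene symbol, SNP) triplets, dropping UTR identifiers."""
    return {(mirna, gene, snp) for mirna, gene, snp, *_ in records}


def overlap_summary(databases: dict[str, set]) -> dict[str, dict[str, float]]:
    """For each database, count triplets shared by all databases and unique to it."""
    shared = set.intersection(*databases.values())
    summary = {}
    for name, triplets in databases.items():
        others = set().union(*(t for n, t in databases.items() if n != name))
        summary[name] = {
            "total": len(triplets),
            "shared_by_all": len(shared),
            "shared_fraction": len(shared) / len(triplets) if triplets else 0.0,
            "unique": len(triplets - others),
        }
    return summary


if __name__ == "__main__":
    # Invented records; the fourth field stands for a UTR RefSeq identifier.
    db_a = to_triplets([
        ("miR-X", "GENE1", "rs_example_1", "UTR_1"),
        ("miR-X", "GENE1", "rs_example_1", "UTR_2"),  # same effect, other UTR
        ("miR-Y", "GENE2", "rs_example_2", "UTR_3"),
    ])
    db_b = to_triplets([
        ("miR-X", "GENE1", "rs_example_1", "UTR_1"),
        ("miR-Z", "GENE3", "rs_example_3", "UTR_4"),
    ])
    print(overlap_summary({"A": db_a, "B": db_b}))
```

Collapsing to triplets before intersecting is what prevents an effect that recurs in several transcripts of one gene from being counted more than once.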
For MirSNP and miRNASNP v2.0, about 20% of their interactions (192 240 and 191 021, respectively) are not reported by any other database, which highlights the heterogeneity of the resources.

Figure 1. Total number of interactions shared by all evaluated databases.

Prediction methods

Table 4 summarizes the prediction software used by the different tools. In general, all databases use either TargetScan or miRanda. A recent review by Riffo-Campos et al. [61] compares these algorithms in their latest versions. Interestingly, miRdSNP uses miRanda for the prediction of new MREs induced by SNPs, but uses data sets pre-predicted by TargetScan 5.1 and PicTar for wild-type interactions. This might be surprising for users, as one would expect the same prediction software to be used throughout the entire project. miRNASNP v2.0 uses both miRanda and TargetScan for the prediction of gains and losses of MREs.

Table 4. Versions of target prediction algorithms used in the considered databases

| Algorithm | miRdSNP | MirSNP | PolymiRTS Database 3.0 | miRNASNP v2.0 |
|---|---|---|---|---|
| TargetScan | – | – | 6.2 | 6.2 |
| miRanda | 3.3a | 3.3a | – | 3.3a |

For the majority of the databases in this review, the UTR sequences carrying the SNP allele were preprocessed to exclude target sites that are not affected by SNPs and thus to reduce the computational burden. MiRdSNP, miRNASNP and MirSNP consider only 25, 25 and 20 nts upstream and downstream of the SNP, respectively.
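The preprocessing step of restricting the prediction input to a window around the SNP can be sketched as follows. This is a minimal illustration with hypothetical function names and 0-based positions; how each database actually cuts its sequences is not specified beyond the flank lengths given above.

```python
from __future__ import annotations


def flank_window(utr: str, snp_pos: int, flank: int) -> tuple[str, int]:
    """Return the subsequence within `flank` nt of the SNP and the SNP's index in it.

    `snp_pos` is 0-based; the window is clipped at the ends of the UTR.
    """
    if not 0 <= snp_pos < len(utr):
        raise ValueError("SNP position lies outside the UTR")
    start = max(0, snp_pos - flank)
    end = min(len(utr), snp_pos + flank + 1)
    return utr[start:end], snp_pos - start


def allele_windows(utr: str, snp_pos: int, alt: str, flank: int) -> tuple[str, str]:
    """Return the reference and alternative-allele windows for one SNP."""
    ref_window, offset = flank_window(utr, snp_pos, flank)
    alt_window = ref_window[:offset] + alt + ref_window[offset + 1 :]
    return ref_window, alt_window


if __name__ == "__main__":
    toy_utr = "ACGU" * 20  # invented 80-nt sequence
    ref, alt = allele_windows(toy_utr, snp_pos=40, alt="G", flank=25)
    print(len(ref), ref[25], alt[25])
```

Both windows would then be passed to the target predictor in place of the full UTR, so the flank length directly limits which alignments the predictor can still find.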
However, this might exclude target sites with gaps or bulges, as miRNAs do not require perfect complementarity to the UTR sequence to form a miRNA–mRNA complex. We compared the prediction results of miRanda 3.3a for shortened UTR sequences (30 nts) with those for the complete UTR sequence and noticed that some target sites were not predicted when the shorter sequences were used. On repeated testing, we found that a length of at least 80 nts (upstream and downstream) was required for miRanda to yield the same results as with the whole UTR as input. Therefore, we chose this threshold for the miRanda predictions presented in Table 3. For PolymiRTS, we found no information regarding the number of nucleotides considered upstream and downstream. To assess the impact of shortening the UTR sequences to 25 nts, we computed the number of gains and losses predicted by miRanda with the shortened length and with the full UTR length on our collected data. Using the full UTR length, miRanda predicted 11 433 628 gains and 11 953 935 losses, whereas with the shortened length, more gains and losses were found (11 553 901 and 12 074 380, respectively). The impact is thus small (about 1%) but present, and potentially correct predictions could be missed.

Database functionality

All databases offer the possibility to search for miRNA names, gene symbols or dbSNP identifiers to identify MRE gains or losses. Furthermore, all databases except miRdSNP and MirSNP also consider SNPs in miRNAs and their effects. MirSNP and PolymiRTS additionally allow users to query multiple entries at the same time by providing a corresponding list of queries. In addition, all databases except MirSNP include experimental evidence collected from multiple miRNA target catalogs. Besides these features, only miRNASNP v2.0 allows users to evaluate the effects of novel SNVs.
However, miRNASNP accepts queries on the effects of variant UTR or miRNA sequences only one by one; the possibility of uploading a VCF file would be more convenient, as it would allow users to easily study the effects of SNVs detected by variant-calling pipelines. Querying for SNPs linked to diseases is only implemented in miRdSNP. Even though PolymiRTS provides disease-related information, it allows querying for traits only. Although the majority of databases consider only single point mutations, it is important to note that PolymiRTS 3.0 also considers small INDELs, which form a large part of genomic variation. In addition, PolymiRTS is the only tool that integrates information from the KEGG pathway database, making it easier for the user to study the effects of genomic variations on biological pathways. Another important feature provided by PolymiRTS and MirSNP is the annotation of evolutionary conservation of target sites.

Integration of experimental information

Another important factor is the inclusion of experimental information for the predicted miRNA–target pairs. As shown in Table 2, all databases except MirSNP integrate information from miRNA–target catalogs, which allows filtering for experimentally validated miRNA–target relationships only. It should be noted, however, that even if a miRNA–target relationship is validated, the exact location of the binding site is often not known. Therefore, there is a higher risk that MRE losses of validated interactions are false positives, because the miRNA may never have bound at the specified location. miRdSNP is the only database containing manually curated disease-associated SNPs. In comparison, PolymiRTS associates genes with polymorphisms in 3ʹ UTRs with diseases and traits by considering the results from genome-wide association studies (GWASs) and expression quantitative trait loci (eQTL) studies.
miRNASNP v2.0 considers GWAS results as well and reports LD regions. The traits associated with the LD regions, however, seem to be accessible only when querying for a particular pre-miRNA. As miRNAs usually upregulate or downregulate their target genes, it makes sense to consider their expression. miRNASNP takes this into account by including miRNA and mRNA expression data (retrieved from TCGA) and reporting their specific correlation. The PolymiRTS database lacks this feature; however, it provides annotations from CLASH experiments, which provide explicit evidence for miRNA–mRNA target relationships in the context of SNPs or INDELs. This is an advantage over the other databases, where miRNA–mRNA target relationships are only reported based on their original sequences.

Benchmark on 16 reported and validated SNP effects

We evaluated the ability of the databases to recover 16 experimentally validated effects of 13 variants in the binding sites of miRNAs in 3ʹ UTRs, which were reported by multiple studies [37, 62–64] and summarized in a recent review [65]. We excluded MirSNP from the evaluation because its search functionality was not available at the time of running the benchmark. The results for the remaining three databases are presented in Table 5.

Table 5. Results of querying the databases with 16 reported and validated SNP effects

| Interaction | miRdSNP | PolymiRTS Database 3.0 | miRNASNP v2.0 |
|---|---|---|---|
| Gains | | | |
| hsa-miR-191-5p, hsa-miR-887-3p with MDM4 because of rs4245739 | Not a dSNP, so no gain information | Reported | Reported, gain/loss inverted |
| hsa-miR-140-5p with TP63 because of rs35592567 | Not a dSNP, so no gain information | Not reported | Reported, gain/loss inverted |
| hsa-miR-214-5p, hsa-miR-550a-5p with HNF1B because of rs2229295 | Not a dSNP, so no gain information | Reported, gain/loss inverted | Reported |
| hsa-miR-4271 with APOC3 because of rs4225 | SNP not found | Reported | Reported |
| hsa-miR-522-3p with PLIN4 because of rs8887 | SNP not found | Reported, without gain/loss information | Not reported |
| hsa-miR-124-3p with FXN because of rs11145043 | SNP not found | Reported, gain/loss inverted | Reported, gain/loss inverted |
| hsa-miR-629-5p with NBS1 because of rs2735383 | SNP not found | Not reported | Not reported |
| Losses | | | |
| hsa-miR-34b-3p with SNCA because of rs10024743 | – | Reported | Reported |
| hsa-miR-96-5p and hsa-miR-182-5p with PALLD because of rs1071738 | – | Reported | Reported, but only 182-5p |
| hsa-miR-137 with EFNB2 because of rs550067317 | – | SNP not found | SNP not found |
| hsa-miR-510-5p with HTR3E because of rs56109847 | – | Reported | Reported |
| hsa-miR-433-3p with FGF20 because of rs12720208 | – | Reported | Reported |
| hsa-miR-155-5p with AGTR1 because of rs5186 | – | Reported | Reported |
| Reported interactions | 0 | 13 | 12 |

First of all, we noticed a usability problem when searching for SNPs in 3ʹ UTRs using miRNASNP. The results for the same query differed between the entry search page located at http://bioinfo.life.hust.edu.cn/miRNASNP2/search.php and the results search page that is available after querying the database via the entry search page. An example is rs4245739, for which six target gains were reported when using the entry search page, whereas no target gains were reported when searching via the results search page.
The reason for this might lie within the results search page, as it shows only miRNAs with annotated expression data. Additionally, the database provides the possibility to show results for miRNAs with a minimum average expression. However, the default option, which is meant to show interactions for all miRNAs, does not show all of them but filters out every miRNA for which no expression data are available, resulting in the difference observed above.

As shown in Table 5, none of the queried effects were found in miRdSNP. This is because MRE gains are only reported for its specifically determined subset of disease-associated SNPs (dSNPs) and MRE losses are not reported at all. Of the 16 evaluated effects, 13 were found in PolymiRTS and 12 in miRNASNP. We found that MRE gains or losses were often inverted in comparison with the reported effects. The reason is that the PolymiRTS database takes the ancestral allele as the reference nucleotide, whereas miRNASNP takes the nucleotide in the reference genome. An example of an MRE loss not found in any database is the one induced by rs550067317. It is not reported because the SNP has been present in dbSNP only since build 142, which is newer than the builds used by the evaluated databases. A further example, an MRE gain reported by no database, is caused by rs2735383. This can be explained by the unusual binding of hsa-miR-629-5p, which has no complementary base pairing at the first position of the seed and is therefore discarded by TargetScan. Another SNP, rs35592567, for which PolymiRTS did not report any interaction, forms a new binding site found by TargetScan because the first base of the seed can then bind, resulting in a perfect 8mer match. We could not determine why PolymiRTS does not report this interaction, as it uses TargetScan. miRNASNP reports this interaction because both TargetScan and miRanda predict it.
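The inversion of gains and losses noted above follows directly from the choice of reference allele. The short Python sketch below makes this explicit; the function `classify`, the allele letters and the miRNA name are hypothetical and serve only as an illustration.

```python
from typing import Dict, Set


def classify(binders: Dict[str, Set[str]], reference: str,
             variant: str) -> Dict[str, Set[str]]:
    """Label miRNAs as MRE gains or losses relative to the chosen reference.

    binders maps each allele to the set of miRNAs predicted to bind the
    site when that allele is present.
    """
    return {
        "gain": binders[variant] - binders[reference],
        "loss": binders[reference] - binders[variant],
    }


# Hypothetical prediction: one miRNA binds only when allele C is present.
binders = {"A": set(), "C": {"miR-x"}}

# Case 1: the genome reference allele (assumed here to be A) is the baseline.
genome_view = classify(binders, reference="A", variant="C")
# Case 2: the ancestral allele (assumed here to be C) is the baseline.
ancestral_view = classify(binders, reference="C", variant="A")

print(genome_view)     # miR-x appears as a gain
print(ancestral_view)  # the same effect appears as a loss
```

When results from several databases are merged, it is therefore worth normalizing all records to a single allele convention before gains and losses are compared.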
The MRE gain of hsa-miR-522-3p with PLIN4 caused by rs8887 is reported by PolymiRTS but not by miRNASNP. This is because TargetScan predicts the binding site with the A allele owing to an 8mer seed match. miRanda predicts the binding site as well, but only with a binding energy of −8.96 kcal/mol, which is filtered out by miRNASNP. Therefore, as both tools need to predict the MRE gain for it to appear in miRNASNP, nothing is reported. The last prediction on which PolymiRTS and miRNASNP differ is the MRE loss of hsa-miR-96-5p with PALLD caused by rs1071738, which is only reported by PolymiRTS. This can again be explained by the different predictions of TargetScan and miRanda. In the original sequence, TargetScan predicts a binding site with a 6mer followed by an A, which is disrupted by the SNP at the third base with a change of C to G. This MRE loss is not detected when using miRanda because it predicts different target sites in this gene, which are not affected by the SNP.

When evaluating the binding sites of the predicted interactions, we noticed that the MRE gain in the 3ʹ UTR of HNF1B because of rs2229295 for hsa-miR-214-5p and for hsa-miR-550a-5p would no longer be detected in hg38 by conventional prediction algorithms, as the nucleotide adjacent to the annotated SNP changed from G to A between hg19 and hg38, prohibiting any binding, as illustrated in Figure 2. This example also highlights the importance of the reference genomes used by these databases.

Figure 2. Difference in the sequence of the human genome between builds hg19 and hg38, leading to undetected binding sites for hsa-miR-214-5p and hsa-miR-550a-5p.
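Several of the discrepancies discussed above hinge on seed match types such as the 8mer or the 6mer followed by an A. The following Python sketch classifies canonical seed sites according to the commonly used definitions (6mer: pairing to miRNA nucleotides 2–7; 7mer-m8: additional pairing to nucleotide 8; 7mer-A1: an A opposite nucleotide 1; 8mer: both) and reports sites gained or lost between two alleles. It is a simplified illustration and not the TargetScan or miRanda algorithm; the function names and the sequences are made up, and only substitutions of equal length are handled.

```python
from typing import Dict, Tuple

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(_COMPLEMENT)[::-1]


def seed_sites(mirna: str, utr: str) -> Dict[int, str]:
    """Map the UTR index where the match to miRNA nucleotides 2-7 starts
    to the canonical site type found at that position."""
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    core = reverse_complement(mirna[1:7])  # pairs with miRNA nucleotides 2-7
    m8 = reverse_complement(mirna[7])      # pairs with miRNA nucleotide 8
    sites: Dict[int, str] = {}
    pos = utr.find(core)
    while pos != -1:
        # On the UTR, the base pairing with nucleotide 8 lies 5' of the core,
        # and the base opposite nucleotide 1 lies 3' of the core.
        has_m8 = pos > 0 and utr[pos - 1] == m8
        has_a1 = pos + 6 < len(utr) and utr[pos + 6] == "A"
        if has_m8 and has_a1:
            sites[pos] = "8mer"
        elif has_m8:
            sites[pos] = "7mer-m8"
        elif has_a1:
            sites[pos] = "7mer-A1"
        else:
            sites[pos] = "6mer"
        pos = utr.find(core, pos + 1)
    return sites


def site_changes(mirna: str, ref_utr: str,
                 alt_utr: str) -> Tuple[Dict[int, str], Dict[int, str]]:
    """Return (gained sites, lost sites) when going from ref_utr to alt_utr."""
    ref, alt = seed_sites(mirna, ref_utr), seed_sites(mirna, alt_utr)
    gained = {p: t for p, t in alt.items() if p not in ref}
    lost = {p: t for p, t in ref.items() if p not in alt}
    return gained, lost


# Made-up sequences for illustration only (not a real miRNA or UTR).
toy_mirna = "UGCAUGGACUAAGCUUAGCA"
ref_utr = "GGCUCCAUGGAGGC"  # reference allele G at index 9
alt_utr = "GGCUCCAUGCAGGC"  # alternate allele C at index 9
print(site_changes(toy_mirna, ref_utr, alt_utr))
```

In this toy case the alternate allele completes the seed match, so the site is reported as gained; swapping the two sequences reports the same site as lost.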
In summary, PolymiRTS and miRNASNP cover our set of experimentally validated MRE gains and losses well, whereas miRdSNP did not find any MRE because of its restriction to dSNPs. PolymiRTS performs marginally better than miRNASNP, as it covers one more effect, which is why we see it as a close winner. Of course, this benchmark is small compared with the complete set of predictions, and a much larger benchmark comprising thousands of interactions would be required to evaluate the performance fairly. However, as the number of experimentally validated effects of SNPs in MREs is small, we preferred to focus the benchmark on a high-confidence set instead of including potential false positives.

Which database to choose?

As shown in the previous sections, all databases have some exclusive features. However, the search functionality of MirSNP was not available at the time of writing this review, which is why we cannot currently recommend it. The PolymiRTS Database 3.0 covers nearly all features of the other databases and should therefore be the default choice. If expression correlations of miRNAs and their mRNA targets or the effects of SNPs on pre-miRNAs are important, miRNASNP v2.0 is the database of choice. Furthermore, users studying novel SNVs should consider miRNASNP v2.0 as well, after reducing their SNVs to a small subset. We provide the up-to-date data we collected for this review in a database called miRSNPdb, which is available at www.ccb.uni-saarland.de/mirsnp. Users relying on more recent SNPs or having their own novel SNVs can use it to retrieve MRE gains and losses.

Future challenges

The substantial increase in annotated miRNAs and SNPs has made it extremely computationally intensive to predict novel target sites. We found that the first step performed by TargetScan took ∼15 h for 2588 miRNAs from miRBase v21 and all 19 107 3ʹ UTRs from Ensembl 85 [66].
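To put the reported runtime into perspective, the following Python sketch computes the number of miRNA–UTR pairs and the resulting average throughput. The miRNA count, UTR count and runtime are the figures stated above; the function names and the scaling factor of 100 sequences per UTR are assumptions made here purely for illustration, and no runtime is projected from them.

```python
def pair_count(n_mirnas: int, n_sequences: int) -> int:
    """Number of miRNA-sequence pairs an all-against-all scan must evaluate."""
    return n_mirnas * n_sequences


def pairs_per_second(n_pairs: int, hours: float) -> float:
    """Average throughput for a run of the given duration."""
    return n_pairs / (hours * 3600)


N_MIRNAS = 2588   # miRBase v21, as stated in the text
N_UTRS = 19_107   # 3' UTRs from Ensembl 85, as stated in the text
HOURS = 15        # approximate runtime of the first TargetScan step

utr_pairs = pair_count(N_MIRNAS, N_UTRS)
print(f"{utr_pairs:,} miRNA-UTR pairs, "
      f"about {pairs_per_second(utr_pairs, HOURS):,.0f} pairs per second")

# Illustrative assumption: one sequence window per SNP and roughly
# 100 times as many windows as UTRs.
window_pairs = pair_count(N_MIRNAS, N_UTRS * 100)
print(f"{window_pairs:,} miRNA-window pairs")
```

The windows are much shorter than full UTRs, so the second figure should be read as a count of pairs only, not as a runtime estimate.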
As the number of SNPs in miRNA target sites is nearly 100-fold higher than the number of 3ʹ UTRs, the number of miRNA–mRNA target pairs rises substantially, even when the predictions are reduced to the considered regions. Therefore, the runtime of target prediction programs and/or of the algorithms assessing the impact of SNPs needs to be improved to keep up with the increasing amount of available data.

All presented databases focus exclusively on the impact of SNPs in miRNAs and 3ʹ UTRs. However, it has been shown that miRNAs can also bind to 5ʹ UTRs or even to coding regions. The extent of these interactions is still unclear; therefore, including them in relevant databases could promote their investigation. More specifically, focusing on already validated interactions should be considered as the first step, as the effects in 5ʹ UTRs are expected to be less frequent, and including all predictions would therefore also include a large set of false positives.

Until now, all available databases focus on single SNPs or INDELs in either the 3ʹ UTR sequences or in miRNAs. However, it is not unlikely that multiple variants occur at the same time and induce other effects. The inherent exponential increase in variant combinations is a major challenge in this regard.

With the recent advances in high-throughput miRNA–mRNA mapping via CLASH experiments, new miRNA target prediction tools, such as TarPmiR [67], have been developed. Progress in the target prediction field will make it possible to improve the predictions of gains and losses induced by variants. Combined with the continuous increase in SNV data, this makes it important to keep databases up-to-date with the latest software and reference data. We believe that, because of the ever-growing number of annotated SNPs and the resulting substantial growth of predicted MRE gains and losses, a larger focus should be put on the curation of high-confidence sets.
These could be narrowed down at first by considering the common predictions of several target prediction tools. The resulting predictions could then be further refined by experimental evidence, stemming, for example, from CLASH experiments. In addition, annotations for explicitly experimentally validated MRE gains or losses, such as the ones collected by Moszynska et al. [65], would greatly improve the quality of such sets and eventually form a reliable gold standard. Overall, we think that such high-confidence data would increase the usefulness of such databases for precision medicine to a reasonable extent.

Key Points

A substantial increase has been observed in the number of reported SNPs and the thereby induced MREs over the past years.
All currently available databases are based on outdated resources.
PolymiRTS is the most complete available database, followed by miRNASNP v2.0.
Users studying novel SNVs should consider miRNASNP v2.0.

Tobias Fehlmann is a PhD student at the Chair for Clinical Bioinformatics, Saarland University, Germany. He has been working in the field of miRNAs in Bioinformatics since 2014.
Shashwat Sahay is a Master student at the Chair for Clinical Bioinformatics, Saarland University, Germany. He has been working in the field of miRNAs in Bioinformatics since 2016.
Andreas Keller is a Professor and head of the Chair for Clinical Bioinformatics at Saarland University. He has been working in the field of miRNAs in Bioinformatics since 2008.
Christina Backes is a Postdoc at the Chair for Clinical Bioinformatics at Saarland University. She has been working in the field of miRNAs in Bioinformatics since 2009.

References

1 Cook CE, Bergman MT, Finn RD. The European Bioinformatics Institute in 2016: data growth and integration. Nucleic Acids Res 2016;44:D20–6.
2 Kodama Y, Shumway M, Leinonen R. The sequence read archive: explosive growth of sequencing data. Nucleic Acids Res 2012;40:D54–6.
3 Sherry ST, Ward MH, Kholodov M, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 2001;29(1):308–11. http://dx.doi.org/10.1093/nar/29.1.308
4 Bartoszewski RA, Jablonsky M, Bartoszewska S, et al. A synonymous single nucleotide polymorphism in DeltaF508 CFTR alters the secondary structure of the mRNA and the expression of the mutant protein. J Biol Chem 2010;285(37):28741–8.
5 Stracquadanio G, Wang X, Wallace MD, et al. The importance of p53 pathway genetics in inherited and somatic cancer genomes. Nat Rev Cancer 2016;16(4):251–65. http://dx.doi.org/10.1038/nrc.2016.15
6 Zhang L, Long X. Association of three SNPs in TOX3 and breast cancer risk: evidence from 97275 cases and 128686 controls. Sci Rep 2015;5:12773. http://dx.doi.org/10.1038/srep12773
7 Huang CY, Huang SP, Lin VC, et al. Genetic variants of the autophagy pathway as prognostic indicators for prostate cancer. Sci Rep 2015;5(1):14045. http://dx.doi.org/10.1038/srep14045
8 Lambert JC, Ibrahim-Verbaas CA, Harold D, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer's disease. Nat Genet 2013;45:1452–8. http://dx.doi.org/10.1038/ng.2802
9 De Marchi F, Carecchio M, Cantello R, et al. Predicting cognitive decline in Parkinson's disease: can we ask the genes? Front Neurol 2014;5:224.
10 Mattick JS. Non-coding RNAs: the architects of eukaryotic complexity. EMBO Rep 2001;2(11):986–91. http://dx.doi.org/10.1093/embo-reports/kve230
11 Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 2004;116(2):281–97. http://dx.doi.org/10.1016/S0092-8674(04)00045-5
12 Friedman RC, Farh KK, Burge CB, et al. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res 2009;19:92–105.
13 Leidinger P, Backes C, Deutscher S, et al. A blood based 12-miRNA signature of Alzheimer disease patients. Genome Biol 2013;14(7):R78.
14 Mitchell PS, Parkin RK, Kroh EM, et al. Circulating microRNAs as stable blood-based markers for cancer detection. Proc Natl Acad Sci USA 2008;105(30):10513–18. http://dx.doi.org/10.1073/pnas.0804549105
15 Roth P, Keller A, Hoheisel JD, et al. Differentially regulated miRNAs as prognostic biomarkers in the blood of primary CNS lymphoma patients. Eur J Cancer 2015;51(3):382–90. http://dx.doi.org/10.1016/j.ejca.2014.10.028
16 Pillai RS. MicroRNA function: multiple mechanisms for a tiny RNA? RNA 2005;11(12):1753–61.
17 Zhou H, Rigoutsos I. MiR-103a-3p targets the 5' UTR of GPRC5A in pancreatic cells. RNA 2014;20(9):1431–9. http://dx.doi.org/10.1261/rna.045757.114
18 Henke JI, Goergen D, Zheng J, et al. microRNA-122 stimulates translation of hepatitis C virus RNA. EMBO J 2008;27(24):3300–10. http://dx.doi.org/10.1038/emboj.2008.244
19 Orom UA, Nielsen FC, Lund AH. MicroRNA-10a binds the 5'UTR of ribosomal protein mRNAs and enhances their translation. Mol Cell 2008;30:460–71. http://dx.doi.org/10.1016/j.molcel.2008.05.001
20 Sacco L, Masotti A. Recent insights and novel bioinformatics tools to understand the role of microRNAs binding to 5' untranslated region. Int J Mol Sci 2012;14(1):480–95. http://dx.doi.org/10.3390/ijms14010480
21 Ha M, Kim VN. Regulation of microRNA biogenesis. Nat Rev Mol Cell Biol 2014;15(8):509–24. http://dx.doi.org/10.1038/nrm3838
22 Lee Y, Ahn C, Han J, et al. The nuclear RNase III Drosha initiates microRNA processing. Nature 2003;425(6956):415–19. http://dx.doi.org/10.1038/nature01957
23 Hutvagner G, McLachlan J, Pasquinelli AE, et al. A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA. Science 2001;293(5531):834–8. http://dx.doi.org/10.1126/science.1062961
24 Hammond SM, Boettcher S, Caudy AA, et al. Argonaute2, a link between genetic and biochemical analyses of RNAi. Science 2001;293(5532):1146–50. http://dx.doi.org/10.1126/science.1064023
25 Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell 2009;136(2):215–33. http://dx.doi.org/10.1016/j.cell.2009.01.002
26 Duan R, Pak C, Jin P. Single nucleotide polymorphism associated with mature miR-125a alters the processing of pri-miRNA. Hum Mol Genet 2007;16(9):1124–31. http://dx.doi.org/10.1093/hmg/ddm062
27 Lewis BP, Shih IH, Jones-Rhoades MW, et al. Prediction of mammalian microRNA targets. Cell 2003;115(7):787–98. http://dx.doi.org/10.1016/S0092-8674(03)01018-3
28 Jazdzewski K, Murray EL, Franssila K, et al. Common SNP in pre-miR-146a decreases mature miR expression and predisposes to papillary thyroid carcinoma. Proc Natl Acad Sci USA 2008;105(20):7269–74. http://dx.doi.org/10.1073/pnas.0802682105
29 Shen J, Ambrosone CB, DiCioccio RA, et al. A functional polymorphism in the miR-146a gene and age of familial breast/ovarian cancer diagnosis. Carcinogenesis 2008;29(10):1963–6. http://dx.doi.org/10.1093/carcin/bgn172
30 Xu T, Zhu Y, Wei QK, et al. A functional polymorphism in the miR-146a gene is associated with the risk for hepatocellular carcinoma. Carcinogenesis 2008;29(11):2126–31. http://dx.doi.org/10.1093/carcin/bgn195
31 Sun G, Yan J, Noltner K, et al. SNPs in human miRNA genes affect biogenesis and function. RNA 2009;15(9):1640–51. http://dx.doi.org/10.1261/rna.1560209
32 Mencia A, Modamio-Hoybjor S, Redshaw N, et al. Mutations in the seed region of human miR-96 are responsible for nonsyndromic progressive hearing loss. Nat Genet 2009;41:609–13. http://dx.doi.org/10.1038/ng.355
33 Zhou L, Zhang X, Li Z, et al. Association of a genetic variation in a miR-191 binding site in MDM4 with risk of esophageal squamous cell carcinoma. PLoS One 2013;8(5):e64331.
34 Gao F, Xiong X, Pan W, et al. A regulatory MDM4 genetic variant locating in the binding sequence of multiple MicroRNAs contributes to susceptibility of small cell lung cancer. PLoS One 2015;10(8):e0135647.
35 Stegeman S, Moya L, Selth LA, et al. A genetic variant of MDM4 influences regulation by multiple microRNAs in prostate cancer. Endocr Relat Cancer 2015;22(2):265–76. http://dx.doi.org/10.1530/ERC-15-0013
36 Wang M, Du M, Ma L, et al. A functional variant in TP63 at 3q28 associated with bladder cancer risk by creating an miR-140-5p binding site. Int J Cancer 2016;139(1):65–74. http://dx.doi.org/10.1002/ijc.29978
37 Wang G, van der Walt JM, Mayhew G, et al. Variation in the miRNA-433 binding site of FGF20 confers risk for Parkinson disease by overexpression of alpha-synuclein. Am J Hum Genet 2008;82(2):283–9. http://dx.doi.org/10.1016/j.ajhg.2007.09.021
38 Bruno AE, Li L, Kalabus JL, et al. miRdSNP: a database of disease-associated SNPs and microRNA target sites on 3'UTRs of human genes. BMC Genomics 2012;13(1):44. http://dx.doi.org/10.1186/1471-2164-13-44
39 Liu C, Zhang F, Li T, et al. MirSNP, a database of polymorphisms altering miRNA target sites, identifies miRNA-related SNPs in GWAS SNPs and eQTLs. BMC Genomics 2012;13(1):661. http://dx.doi.org/10.1186/1471-2164-13-661
40 Bhattacharya A, Ziebarth JD, Cui Y. PolymiRTS database 3.0: linking polymorphisms in microRNAs and their target sites with human diseases and biological pathways. Nucleic Acids Res 2014;42:D86–91.
41 Gong J, Liu C, Liu W, et al. An update of miRNASNP database for better SNP selection by GWAS data, miRNA expression and online tools. Database 2015;2015:bav029.
42 Sethupathy P, Corda B, Hatzigeorgiou AG. TarBase: a comprehensive database of experimentally supported animal microRNA targets. RNA 2006;12(2):192–7.
43 Hsu SD, Lin FM, Wu WY, et al. miRTarBase: a database curates experimentally validated microRNA-target interactions. Nucleic Acids Res 2011;39:D163–9.
44 Xiao F, Zuo Z, Cai G, et al. miRecords: an integrated resource for microRNA-target interactions. Nucleic Acids Res 2009;37:D105–10.
45 Jiang Q, Wang Y, Hao Y, et al. miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic Acids Res 2009;37:D98–104.
46 Krek A, Grun D, Poy MN, et al. Combinatorial microRNA target predictions. Nat Genet 2005;37(5):495–500. http://dx.doi.org/10.1038/ng1536
47 Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 2005;120(1):15–20. http://dx.doi.org/10.1016/j.cell.2004.12.035
48 Enright AJ, John B, Gaul U, et al. MicroRNA targets in Drosophila. Genome Biol 2003;5(1):R1.
49 International HapMap Consortium; Frazer KA, Ballinger DG, et al. A second generation human haplotype map of over 3.1 million SNPs. Nature 2007;449:851–61.
50 Bao L, Zhou M, Wu L, et al. PolymiRTS database: linking polymorphisms in microRNA target sites with complex traits. Nucleic Acids Res 2007;35:D51–4.
51 Garcia DM, Baek D, Shin C, et al. Weak seed-pairing stability and high target-site abundance decrease the proficiency of lsy-6 and other microRNAs. Nat Struct Mol Biol 2011;18(10):1139–46. http://dx.doi.org/10.1038/nsmb.2115
52 Helwak A, Kudla G, Dudnakova T, et al. Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding. Cell 2013;153(3):654–65. http://dx.doi.org/10.1016/j.cell.2013.03.043
53 Kanehisa M, Goto S, Sato Y, et al. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res 2012;40:D109–14.
54 Hindorff LA, Sethupathy P, Junkins HA, et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci USA 2009;106(23):9362–7. http://dx.doi.org/10.1073/pnas.0903103106
55 Mailman MD, Feolo M, Jin Y, et al. The NCBI dbGaP database of genotypes and phenotypes. Nat Genet 2007;39(10):1181–6.
56 GTEx Consortium. The Genotype-Tissue Expression (GTEx) Project. Nat Genet 2013;45:580–5. http://dx.doi.org/10.1038/ng.2653
57 Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000;25(1):25–9.
58 Gong J, Tong Y, Zhang HM, et al. Genome-wide identification of SNPs in microRNA genes and the SNP effects on microRNA target binding and biogenesis. Hum Mutat 2012;33(1):254–63. http://dx.doi.org/10.1002/humu.21641
59 Li JH, Liu S, Zhou H, et al. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res 2014;42:D92–7.
60 Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 2008;455:1061–8. http://dx.doi.org/10.1038/nature07385
61 Riffo-Campos AL, Riquelme I, Brebi-Mieville P. Tools for sequence-based miRNA target prediction: what to choose? Int J Mol Sci 2016;17(12):1987.
62 Yang L, Li Y, Cheng M, et al. A functional polymorphism at microRNA-629-binding site in the 3'-untranslated region of NBS1 gene confers an increased risk of lung cancer in Southern and Eastern Chinese population. Carcinogenesis 2012;33(2):338–47. http://dx.doi.org/10.1093/carcin/bgr272
63 Kapeller J, Houghton LA, Monnikes H, et al. First evidence for an association of a functional variant in the microRNA-510 target site of the serotonin receptor-type 3E gene with diarrhea predominant irritable bowel syndrome. Hum Mol Genet 2008;17(19):2967–77. http://dx.doi.org/10.1093/hmg/ddn195
64 Sethupathy P, Borel C, Gagnebin M, et al. Human microRNA-155 on chromosome 21 differentially interacts with its polymorphic target in the AGTR1 3' untranslated region: a mechanism for functional single-nucleotide polymorphisms related to phenotypes. Am J Hum Genet 2007;81(2):405–13. http://dx.doi.org/10.1086/519979
65 Moszynska A, Gebert M, Collawn JF, et al. SNPs in microRNA target sites and their potential role in human disease. Open Biol 2017;7:170019. http://dx.doi.org/10.1098/rsob.170019
66 Yates A, Akanni W, Amode MR, et al. Ensembl 2016. Nucleic Acids Res 2016;44:D710–16.
67 Ding J, Li X, Hu H. TarPmiR: a new approach for microRNA target site prediction. Bioinformatics 2016;32(18):2768–75. http://dx.doi.org/10.1093/bioinformatics/btw318

© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Briefings in Bioinformatics, Oxford University Press

A review of databases predicting the effects of SNPs in miRNA genes or miRNA-binding sites

Publisher
Oxford University Press
Copyright
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
ISSN
1467-5463
eISSN
1477-4054
D.O.I.
10.1093/bib/bbx155

Abstract

Modern precision medicine comprises the knowledge and understanding of individual differences in the genomic sequence of patients to provide tailor-made treatments. Typically, such variants are considered in coding regions only, and their effects are predicted based on their impact on the amino acid sequence of expressed proteins. However, assessing the effects of variants in noncoding elements, in particular microRNAs (miRNAs) and their binding sites, is important as well, as a single miRNA can influence the expression patterns of many genes at the same time. To analyze the effects of variants in miRNAs and their target sites, several databases storing variant impact predictions have been published. In this review, we will compare the core functionalities and features of these databases and discuss the importance of up-to-date data resources in the context of web applications. Finally, we will outline some recommendations for future developments in the field.

Keywords: miRNAs, SNPs, databases, target sites

Introduction

With the advent of next-generation sequencing, the amount of available biological data sets is continuously increasing [1, 2]. These high-throughput technologies have greatly facilitated the discovery of single-nucleotide polymorphisms (SNPs) or single-nucleotide variants (SNVs). It is therefore not surprising that during the past decade, the number of known variants has increased exponentially. The largest resource as of today storing human genetic variations is NCBI’s dbSNP [3], which in its current version (build 150) encompasses over 100 million validated variants, corresponding to roughly one variant every 30 bases. Importantly, SNPs have been used as markers for a large panel of diseases, such as cystic fibrosis [4], various cancers [5–7] and neurodegenerative diseases [8, 9].
Indeed, variants in coding regions might directly affect protein formation and expression and are therefore still the main focus of current variant analysis applications. The effects of variants located in noncoding regions, however, are more difficult to elucidate. In recent years, increasing attention has been paid to the noncoding regions of the human genome. In fact, noncoding regions make up over 98% of the genome [10]. Many regulatory RNA classes have been discovered in these regions so far, such as long noncoding RNAs or microRNAs (miRNAs). The latter are endogenous small noncoding RNA molecules that play a central role in posttranscriptional gene regulation [11]. They are evolutionarily conserved and expected to regulate a large part of the human protein-coding genes and a majority of pathways [12]. Therefore, blood-borne miRNAs in particular have been investigated as noninvasive biomarkers for the early detection of multiple diseases [13–15], highlighting their potential for precision medicine.

Regarding their general mechanism of action, miRNAs bind to the 3ʹ untranslated region (UTR) of mRNAs, which leads either to mRNA cleavage or translational inhibition [16]. Recent studies revealed that miRNAs can sometimes bind to the 5ʹ UTR as well, leading to an increased mRNA expression [17–20].

Multiple biogenesis pathways have been reported for miRNAs in mammals [21]; however, most miRNAs seem to follow a single pathway. Their maturation process starts with the transcription of a longer primary miRNA (pri-miRNA) molecule containing a local stem-loop structure. This molecule is processed by the Microprocessor complex composed of Drosha and DGCR8, which cleaves the stem-loop to release a small hairpin, the precursor miRNA (pre-miRNA), with a 2-nucleotide (nt) 3ʹ overhang [22]. The hairpin is then exported into the cytosol, where its loop is cleaved by Dicer, resulting in a small RNA duplex [23].
The latter is then loaded onto an Argonaute protein to form the RNA-induced silencing complex (RISC) [24], and subsequently one of the two strands is degraded. The remaining strand then guides the RISC to its mRNA target. The complete binding mechanism of miRNAs is not yet fully understood, though the complementarity of the seed region, consisting of six to eight nts starting at the second position from the 5ʹ end of the miRNA, to the 3ʹ UTR plays a crucial role in mRNA target selection [25].

As perfect or nearly perfect complementarity to such binding sites, also called miRNA response elements (MREs), is required, it is evident that SNVs in these sites or inside the miRNA seed sequence can have a substantial impact on the overall regulation network. SNVs in MREs can thus lead to a loss of binding ability of certain miRNAs, but at the same time increase the binding ability of other miRNAs. Further, SNVs outside of MREs might result in the creation of new MREs. In the same vein, SNVs in miRNAs might lead to the loss of regulation ability of a target gene, but also to a gain of regulation of another target gene. Furthermore, SNVs in pri- or pre-miRNAs could have a large regulatory effect as well, as they could, in rare cases, lead to changes in the secondary structure of the pri-miRNA and thus to reduced cleaving efficiency by Drosha and Dicer [26].

Even though the seed regions of miRNAs are evolutionarily conserved [27] and SNVs in these regions are rare, a multitude of diseases has been found to be associated with such variants [28–30]. Among the prominent examples are mental disorders like schizophrenia and autism [31], numerous cancers and nonsyndromic progressive hearing loss [32]. Similarly, SNVs in 3ʹ UTRs have been correlated with multiple cancers [33–36] and neurodegenerative diseases [37].
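The seed-based reasoning described above can be made concrete with a minimal Python sketch. It assumes strict Watson-Crick pairing of a 7-nt seed (positions 2 to 8 of the miRNA) and classifies a substitution in a UTR as an MRE gain, loss or no change. The function names `seed_site` and `snv_effect`, the toy miRNA and the invented UTR fragment are illustrative assumptions; none of the reviewed databases is claimed to work this way, and real predictors such as TargetScan or miRanda use considerably richer models.

```python
# Watson-Crick complement for RNA bases (G:U wobble pairs are ignored here).
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def seed_site(mirna: str, seed_len: int = 7) -> str:
    """Return the mRNA motif (5'->3') complementary to the miRNA seed.

    The seed is taken as `seed_len` nts starting at position 2 of the miRNA.
    """
    seed = mirna[1:1 + seed_len]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def snv_effect(utr: str, pos: int, alt: str, mirna: str, seed_len: int = 7) -> str:
    """Classify a substitution at 0-based `pos` as 'gain', 'loss' or 'unchanged'.

    Simplification: only the presence of at least one perfect seed match
    anywhere in the sequence is compared between reference and alternative.
    """
    site = seed_site(mirna, seed_len)
    mutated = utr[:pos] + alt + utr[pos + 1:]
    before, after = site in utr, site in mutated
    if before and not after:
        return "loss"
    if after and not before:
        return "gain"
    return "unchanged"


# Toy inputs, invented for illustration only.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAAACUACCUCAAAA"
print(seed_site(toy_mirna))
print(snv_effect(toy_utr, 6, "G", toy_mirna))
```

The sketch deliberately ignores site context, conservation and binding energy, which is why the same SNV may be scored differently by the resources compared in this review.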
Taken together, these disease associations highlight the need for a better understanding of the presence and effects of SNVs in miRNAs and UTRs, as they can lead to completely different phenotypes. For analyzing the effects of SNVs on miRNA–target relations on a system-wide basis, several miRNA-target SNP databases such as miRdSNP [38], MirSNP [39], PolymiRTS database [40] and miRNASNP [41] have been developed. However, as the procedure of target prediction is computationally expensive and the number of known SNVs has been increasing substantially over the past years, most of these resources are outdated. In this review, we will compare several state-of-the-art databases that are available and evaluate the apparent gaps between their content and that of up-to-date resources.

Overview of SNP effect prediction web servers

In this section, we present four databases helping to predict and assess the effects of SNVs in human miRNAs and their respective target genes. Table 1 presents a compact overview of these databases and their provided features.

Table 1. Overview of the features provided by the evaluated databases

| Feature | miRdSNP | MirSNP | PolymiRTS Database 3.0 | miRNASNP v2.0 |
|---|---|---|---|---|
| Year of publication | 2012 | 2012 | 2013 | 2015 |
| Search miRNAs/genes/SNPs | Yes | Yes | Yes | Yes |
| Batch search | No | Yes | Yes | No |
| Search-linked SNPs | No | Yes | No | No |
| Browse | Yes | No | Yes | Yes |
| Binding site locus visualization | Yes | No | No | No |
| Binding site miRNA/mRNA visualization | Partial (a) | Yes | No | Yes |
| MRE gain/loss | Partial (b) | Yes | Yes | Yes |
| Binding affinity increase/decrease | No | Yes | No | No |
| SNP distance | Yes, in 3ʹ UTR | No | No | Yes, in miRNA flanks |
| Filter by MAF | No | Yes | No | Yes, in pre-miRNAs |
| Conservation information | No | Yes | Yes | No |
| Filter by experimental support | Yes | No | Yes | Yes |
| Filter by disease/traits/GO | Yes | No | Yes | No |
| Contains INDELs | No | No | Yes | No |
| Contains miRNA/gene expression | No | No | No | Yes |
| SNPs in miRNAs | No | No | Yes | Yes |
| SNPs in 3ʹ UTR | Yes | Yes | Yes | Yes |

(a) Depends on the target prediction algorithm used. (b) Only gains were predicted for dSNPs.

miRdSNP

Published in 2012, miRdSNP [38] provides information about the distance between MREs and SNPs from dbSNP (build 130) [3].
It incorporates experimentally validated MREs from TarBase [42], miRTarBase [43], miRecords [44] and miR2disease [45], as well as the MRE predictions yielded by PicTar [46] and TargetScan 5.1 [47] on the wild-type 3ʹ UTR sequences. Further, a manually curated set of disease-causing SNPs (dSNPs) is available, which allows the provided predictions to be post-filtered accordingly. In addition, new MREs induced by dSNPs that were predicted using miRanda [48] are listed.

The website provides three tabs for browsing through the collected interactions. On the first tab, a searchable table containing all MREs and dSNPs is shown. Users can search for SNPs by their dbSNP ID, miRNAs by their miRBase name and genes by their symbols. In addition, the table can be filtered according to the SNP distance to a binding site and associated diseases. For each entry in the table, the corresponding binding site of the miRNA can be displayed, as well as the location of the associated SNP. In addition, for each SNP, linkage disequilibrium (LD) frequency information is provided. On the second tab, density plots visualizing MREs, SNPs and dSNPs are shown through an interactive plot, which allows regions to be inspected further in the UCSC genome browser. On the third tab, a table showing the entire gene set and its associated MREs and SNPs is displayed. This table, however, is not filterable. For each gene, all binding sites and SNPs can be queried.

In conclusion, miRdSNP focuses on the spatial relationship of SNPs, and thus its major strength is the distance information provided for all SNPs relative to known MREs. Another major strength is the collection of manually curated dSNPs and the ability to filter interactions accordingly. Its major weaknesses are, on the one hand, the missing information on MRE losses and, on the other, that SNPs in miRNAs are not considered at all. The database is available at http://mirdsnp.ccr.buffalo.edu/.
MirSNP

Published in 2012, MirSNP [39] provides a collection of human SNPs in potential MREs located in 3ʹ UTRs, as predicted by miRanda. SNPs were collected from dbSNP build 135. For each reported interaction, the effect induced by the corresponding SNP is reported, i.e. the creation/deletion of an MRE or the increase/decrease of binding affinity. In addition, information on the minor allele frequency (MAF) and LD is provided, if available. Finally, it implements the ability to filter predictions with lists of SNPs from, e.g., GWAS and eQTL studies, optionally including linked SNPs from multiple populations as provided by HapMap [49].

The website provides three search forms for performing single searches, batch searches or searches with a list of disease- or trait-associated SNPs. Via these forms, the user can query the database using SNPs by their dbSNP ID, genes by their symbols or RefSeq mRNA ID, or miRNAs by their miRBase name. Individual searches can be restricted by MAF, and linked SNPs can be requested as well. The search functionality was unavailable during our review process; therefore, we can only describe the search results according to the help page of the database. After submitting a request, the user is redirected to a separate page containing a table with predicted effects on MREs, i.e. gains and losses or changes to the binding affinity. This table displays information for both the reference and alternative alleles. In addition to the basic effects, multiple measures are listed: mirSVR score, MAF, miRanda score, binding energy, conservation information of phastCons 46way vertebrates from UCSC and a visualization of the miRNA/mRNA binding site.

In summary, MirSNP focuses on MREs predicted by miRanda on 3ʹ UTRs having known SNPs. Its major strength lies in the integration of MAF annotations and conservation scores of miRNA seed motifs.
In contrast, major weaknesses are the missing information on experimentally validated MREs and the missing support for SNPs in miRNAs. The database is available at http://bioinfo.bjmu.edu.cn/mirsnp/search/.

PolymiRTS database 3.0

The PolymiRTS Database was originally released in 2007 [50] and has now reached its third version [40], published in 2013. It is the most comprehensive database available to date. In addition to SNPs, it also supports small insertions or deletions (INDELs) in the genomic regions of miRNAs and their target sites. In its third version, SNPs and INDELs were collected from dbSNP build 137. The predictions of creations or deletions of MREs were performed via TargetScan 6.2 [51]. Subsequently, their likelihood is assessed using the difference between their TargetScan context+ score and that of the reference target site. In addition, experimental support information was incorporated from miRecords, TarBase, miRTarBase and multiple studies, and added to the predicted MREs, if available. Furthermore, target sites identified by CLASH (cross linking, ligation and sequencing of hybrids) experiments [52], which allow the direct identification of pairs comprising a target site and its binding miRNA, are also provided. Likewise, the database links polymorphisms in MREs with possibly impacted gene pathways from the KEGG database [53], and with various human diseases and traits based on data in the NHGRI GWAS catalog [54], dbGaP [55] and eQTLs from the GTEx eQTL browser [56].

The website provides four tabs for browsing and searching interactions with either one or multiple terms and according to the chromosomal location. Users can search for SNPs by their dbSNP ID, miRNAs by their miRBase name and genes by their symbol, description or RefSeq mRNA ID. Further, users can start a query by providing traits, such as ‘Metabolic syndrome’, or gene ontology [57] terms.
All search results can be filtered to show only gains or losses of MREs and to show only effects having a particular experimental support. PolymiRTS also allows search results to be filtered by the minimum occurrence of miRNA sites in other vertebrate genomes. After submitting, a list of genes fulfilling the requested criteria is shown to the user. By selecting one of these genes, the user is then redirected to a new tab, where all interactions related to this gene are presented. These are split into multiple tables, separating CLASH data, SNPs in miRNA target sites and MRE gains/losses caused by SNPs in miRNA seeds. If available, information on associations with human diseases and pathways is provided.

In conclusion, the PolymiRTS Database 3.0 provides a wide range of features, from simple MRE gains and losses up to pathways and diseases. The major strengths of this database are, on the one hand, the variety of features and, importantly, on the other, the multitude of experimental evidence integrated, in particular CLASH experiments. Weaknesses of PolymiRTS are the missing visualizations of binding sites in the context of the entire 3ʹ UTR, and the alignment of a miRNA-mRNA duplex, which is only partly shown. Furthermore, the display of all relevant information is suboptimal, as it is mixed with a lot of other content and is therefore confusing. The database is available at http://compbio.uthsc.edu/miRSNP/.

miRNASNP v2.0

miRNASNP was initially published in 2012 [58] and updated in 2015 [41]. It provides information on the gain or loss of potential MREs caused by SNPs in 3ʹ UTRs or miRNA seed regions. MREs were predicted by miRanda and TargetScan 6.2, and SNPs were collected from dbSNP build 137. Experimentally validated targets were retrieved from TarBase, starBase [59], miRecords, miRTarBase and miR2disease, providing the ability to filter the predictions according to these annotations.
In addition to its base functionality, the database allows users to evaluate the effects of SNPs in pre-miRNAs on their folding energy, and supplies lists of SNPs in flanking regions. To estimate the effect on expression levels, the correlation between miRNA and target gene expression in data from The Cancer Genome Atlas (TCGA) [60] was integrated. Similar to PolymiRTS, miRNASNP incorporates information about SNPs in GWAS-identified trait-associated regions from the NHGRI GWAS Catalog and about LD blocks for multiple populations. Finally, it also provides the possibility to predict the effects of novel SNVs on miRNA target binding, i.e. gain or loss of MREs, as well as the impact on the structure of the pre-miRNA.

The website allows users to browse different subsections of the database via links on the homepage. The search functionality is incorporated directly in the browse interface and can also be used from another tab. A search tab allows users to search for MRE gains or losses caused by SNPs in miRNA seed regions or in 3ʹ UTRs. Search requests can be performed via dbSNP ID, miRNA miRBase name or gene symbol. Once a result is selected, the user is redirected to a distinct browse interface, where post-filtering according to miRNA or gene expression is possible, as well as to SNPs in LD regions. Lost MREs can be filtered based on the available experimental validation and negative correlation of miRNA expression with gene expression. The resulting table shows information on the expression of miRNAs and genes in different tissues covered by the integrated TCGA samples. Moreover, the miRNA- or mRNA-binding site is displayed and the binding energy changes of the miRNA/mRNA duplexes are shown. Other search functions are provided as well, so that users can search for specific miRNA precursors, the effects of SNPs in flanking regions, seed regions or pre-miRNA except seed regions.
As a result of these queries, a list of SNPs and their distances to the pre-miRNA and a list of SNPs with their predicted effect on the mature miRNA expression, as well as the introduced energy changes, are displayed. In addition, for each pre-miRNA, a detailed overview helps to retrieve the location and expression of mature miRNAs, the list of SNPs found in it and their effect on the secondary structure. In particular, for these SNPs, LD regions are shown in different populations and linked diseases or traits are reported. miRNASNP v2.0 offers a tab where users can input custom-mutated UTR sequences or miRNA sequences in addition to mutated pre-miRNA sequences to predict the resulting MRE gains/losses or the effect on the secondary structure.

In summary, miRNASNP v2.0 is based on MRE predictions of miRanda and TargetScan and provides many additional features. Its major strengths are the incorporation of expression data, the ability to assess the impact of SNPs on the secondary structure of pre-miRNAs including the potential effects on the mature miRNA expression and the possibility to evaluate novel data. A major weakness is the missing batch search for known and novel data sets. The database is available at http://bioinfo.life.hust.edu.cn/miRNASNP2/index.php.

Comparison of SNP effect prediction web servers

In our study, we compared the above-described databases based on four criteria: their data sources, target prediction methods, database functionality and integration of experimental information. After the comparison, we assessed their individual performance based on a benchmark of 16 reported and validated SNP effects.

Data sources

The release versions of the data sets used by each of the databases were retrieved from their respective publications. In cases where release versions for data sources were not specified, they were determined using the year of publication of the respective database.
As shown in Table 2, the most recently published databases, miRNASNP v2.0 and PolymiRTS, are the most up to date among the published applications. Among these databases, only miRdSNP uses the precomputed wild-type MRE predictions from TargetScan and PicTar, which are available on the websites of these tools. Although this approach cuts down the processing time for predictions, it may lead to missed wild-type interactions, as PicTar used an old assembly of the human genome (hg17) as reference genome for the MRE predictions, which thus needed to be lifted over to the subsequent version (hg18) used by all other resources of miRdSNP.

Table 2. Overview of the integrated resources and build versions of the databases in this review

| Resource | miRdSNP | MirSNP | PolymiRTS Database 3.0 | miRNASNP v2.0 | Latest |
|---|---|---|---|---|---|
| Genome assembly | hg18 | hg19 | hg19 | hg19 | hg38 |
| dbSNP | 130 | 135 | 137 | 137 | 150 |
| HapMap | 27 | 28 (a) | – | – | 28 |
| TargetScan | 5.1 | – | – | – | 7.1 |
| PicTar | N.A. | – | – | – | N.A. |
| miRBase | 17 (a) | 18 | 20 | 19 | 21 |
| miRTarBase | 2.5 (a) | – | 4.3+ | 4.5 (a) | 6.1 |
| miRecords | N.A. | – | N.A. | N.A. | N.A. |
| miR2disease | N.A. | – | – | N.A. | N.A. |
| TarBase | 6.0+ | – | 6.0+ | 7.0 (a) | 7.0 |
| starBase | – | – | – | N.A. | N.A. |
| CLASH experiments | – | – | N.A. | – | N.A. |
| TCGA | – | – | – | N.A. | N.A. |

N.A. Version not available. – Data source not used. (a) Data source derived from date of publication.

With the exception of PolymiRTS and miRNASNP, none of the databases has been updated after its initial publication. Even PolymiRTS and miRNASNP have not been updated for 3 and 2 years, respectively. The current version of dbSNP (build 150) contains 98 million more validated SNPs, i.e. 3.6-fold more SNPs, than the most recent dbSNP version (build 137) used by the databases described here. Furthermore, all of them use outdated miRBase versions. In Table 3, we report the number of considered miRNAs, UTRs and SNPs and the reported interactions of each database. In addition, we computed the latest numbers based on dbSNP build 149, miRBase version 21 and common predictions of miRanda and the seed matching step of TargetScan 7.1. As can be seen in Table 3, the number of 3ʹ UTRs has not changed much over the years.

Table 3. Overview of the number of SNPs, miRNAs and interactions supported by the databases

| Quantity | miRdSNP | MirSNP | PolymiRTS Database 3.0 | miRNASNP v2.0 | Latest (a) |
|---|---|---|---|---|---|
| Number of miRNAs (human) | N.A. | 1921 | 2578 | 2042 | 2588 |
| Number of UTRs | 19 834 | 29 273 | 18 678 (b) | 37 348 | 19 107 |
| Number of SNPs in miRNA target sites (human) | 175 351 | 414 510 | 358 874 | 566 176 | 1 810 468 |
| Experimentally validated miRNA targets (human) | N.A. | N.A. | 2070 | 393 936 | 322 160 |
| Number of SNPs in miRNA seed regions (human) | N.A. | N.A. | 271 | 227 | 2525 |
| Number of total interactions | 174 (c) | 1 562 149 (c) | 2 089 646 (c, d) | 1 271 259 (c) | 15 739 025 |
| Gains of MREs because of polymorphism in miRNA | N.A. | N.A. | N.A. | 162 441 (c) | 5 559 979 |
| Losses of MREs because of polymorphism in miRNA | N.A. | N.A. | N.A. | 153 290 (c) | 5 540 025 |
| Gains of MREs because of polymorphism in UTR | 174 (c) | 497 895 (c) | 906 703 (c) | 509 791 (c) | 1 940 014 |
| Losses of MREs because of polymorphism in UTR | N.A. | 480 046 (c) | 923 082 (c) | 445 737 (c) | 2 699 007 |

N.A. Not available. (a) According to dbSNP 149 and miRBase 21 and common predictions of miRanda and the seed matching step of TargetScan 7.1. (b) Data from PolymiRTS 2.0. (c) Effects for same gene only counted once. (d) Without MRE effects because of polymorphisms in miRNAs.

Interestingly, miRNASNP v2.0 considers all reported 3ʹ UTRs, in contrast to the other databases, which only consider the longest one per gene; this explains the observed difference.
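As a worked check of how far the two most recent databases lag behind current data, the following Python sketch computes fold changes directly from the Table 3 figures. The dictionary layout and the function name `fold_changes` are illustrative choices, not part of any reviewed resource. Note that the PolymiRTS interaction total excludes effects of polymorphisms in miRNAs (footnote d of Table 3), so its ratio is only a rough indication.

```python
# Figures copied from Table 3 ("Latest" column and the two most recent databases).
LATEST = {
    "snps_in_target_sites": 1_810_468,
    "snps_in_seed_regions": 2_525,
    "total_interactions": 15_739_025,
}

REPORTED = {
    "PolymiRTS 3.0": {
        "snps_in_target_sites": 358_874,
        "snps_in_seed_regions": 271,
        "total_interactions": 2_089_646,
    },
    "miRNASNP v2.0": {
        "snps_in_target_sites": 566_176,
        "snps_in_seed_regions": 227,
        "total_interactions": 1_271_259,
    },
}


def fold_changes(latest: dict, reported: dict) -> dict:
    """Return latest/reported ratios for every database and quantity."""
    return {
        db: {key: latest[key] / value for key, value in counts.items()}
        for db, counts in reported.items()
    }


for db, ratios in fold_changes(LATEST, REPORTED).items():
    summary = ", ".join(f"{key}: {ratio:.1f}x" for key, ratio in ratios.items())
    print(f"{db}: {summary}")
```

Keeping the raw counts in one place makes it easy to recompute the ratios whenever a database or dbSNP build is updated.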
When comparing the latest statistics with those of PolymiRTS 3.0 and miRNASNP v2.0, which use the most recent dbSNP builds among the four databases, we observe that the number of SNPs in miRNA target sites has increased 3–5-fold, and the number of SNPs in miRNA seed regions has grown 9–11-fold. These increases are also reflected in the total number of reported interactions, which has grown 8–12-fold. Of course, these numbers depend not only on the tools at hand but also on the prediction algorithms. To compute the latest numbers, we took an approach similar to that of miRNASNP v2.0, except that we did not restrict the sites predicted by TargetScan to the conserved ones. Therefore, if applied to current data, our method reports fewer interactions than PolymiRTS (which reports all TargetScan hits) or MirSNP (which reports all miRanda hits), but more than miRNASNP v2.0.
Figure 1 shows all predicted interactions caused by polymorphisms in the 3ʹ UTR for all considered databases. To this end, we collected all interactions from the database files available on the respective websites and converted all miRNA names to their latest names in miRBase. To avoid counting identical effects found in different UTRs of the same gene multiple times, we compared the miRNA–gene symbol–SNP triplets reported by each database instead of their UTR RefSeq identifiers. In addition, we excluded the increases or decreases in binding affinity predicted by MirSNP. We can see that 76% (133) of the MRE gains covered by miRdSNP are also found in all other databases. Furthermore, 573 442 interactions are shared by all the other databases, representing between 27 and 60% of the total number of interactions that each of them reports. PolymiRTS in particular differs from the others, which is not surprising given its unfiltered use of TargetScan and its reporting of interactions for both alleles.
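The triplet comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used for Figure 1: the record layout, the function names `to_triplets` and `shared_and_unique`, and all identifiers in the toy data are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Record = Tuple[str, str, str, str]   # (miRNA, UTR transcript ID, gene symbol, rsID)
Triplet = Tuple[str, str, str]       # (miRNA, gene symbol, rsID)


def to_triplets(records: Iterable[Record]) -> Set[Triplet]:
    """Drop the transcript ID so the same effect in several UTRs of one gene counts once."""
    return {(mirna, gene, rsid) for mirna, _utr, gene, rsid in records}


def shared_and_unique(dbs: Dict[str, Set[Triplet]]):
    """Return the triplets present in every database and, per database, those found nowhere else."""
    shared = set.intersection(*dbs.values())
    unique = {}
    for name, triplets in dbs.items():
        others = set().union(*(t for n, t in dbs.items() if n != name))
        unique[name] = triplets - others
    return shared, unique


# Toy records; all identifiers are made up for illustration.
toy = {
    "db_a": to_triplets([("miR-x", "NM_0001", "GENE1", "rs1"),
                         ("miR-x", "NM_0002", "GENE1", "rs1"),
                         ("miR-y", "NM_0003", "GENE2", "rs2")]),
    "db_b": to_triplets([("miR-x", "NM_0001", "GENE1", "rs1"),
                         ("miR-z", "NM_0004", "GENE3", "rs3")]),
    "db_c": to_triplets([("miR-x", "NM_0002", "GENE1", "rs1"),
                         ("miR-y", "NM_0003", "GENE2", "rs2")]),
}
shared, unique = shared_and_unique(toy)
for name, triplets in toy.items():
    share = 100 * len(shared) / len(triplets)
    print(f"{name}: {len(triplets)} triplets, {share:.0f}% shared by all, "
          f"{len(unique[name])} unique")
```

Keying on the gene symbol rather than the transcript identifier is what prevents one effect from being counted once per UTR isoform; it presupposes that miRNA names have already been mapped to a single miRBase release.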
Regarding MirSNP and miRNASNP v2.0, 20% (192 240 and 191 021, respectively) of their interactions are not reported by any other database, which highlights the heterogeneity of the predictions.
Figure 1. Total number of interactions shared by all evaluated databases.
Prediction methods
Table 4 summarizes the prediction software used by the different databases. All databases use TargetScan, miRanda or both. A recent review by Riffo-Campos et al. [61] compares these algorithms in their latest versions. Interestingly, miRdSNP uses miRanda to predict new MREs induced by SNPs, but relies on data sets pre-computed with TargetScan 5.2 and PicTar for regular interactions. This might surprise users, who would expect the same prediction software to be used throughout the entire project. MiRNASNP v2.0 uses both miRanda and TargetScan to predict gains and losses of MREs.
Table 4. Versions of target prediction algorithms used in the considered databases
Algorithm | miRdSNP | MirSNP | PolymiRTS Database 3.0 | MiRNASNP v2.0
TargetScan | – | – | 6.2 | 6.2
miRanda | 3.3a | 3.3a | – | 3.3a
In the majority of the databases in this review, the UTR sequences carrying the SNP allele were preprocessed to exclude target sites that are not affected by SNPs and to reduce the computational burden. MiRdSNP, miRNASNP and MirSNP consider only 25, 25 and 20 nts upstream and downstream of the SNP, respectively.
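The windowing step mentioned above can be illustrated as follows. This is a minimal sketch under our own assumptions (0-based SNP coordinates, clipping at the UTR ends); the databases do not document their preprocessing at this level of detail, and the names `flank_window` and `apply_snp` as well as the toy sequence are hypothetical.

```python
from typing import Tuple


def flank_window(utr: str, snp_pos: int, flank: int) -> Tuple[str, int]:
    """Cut out `flank` nt on each side of a 0-based SNP position.

    Returns the window and the SNP's offset inside it; the window is
    clipped at the UTR ends.
    """
    if not 0 <= snp_pos < len(utr):
        raise ValueError("SNP position lies outside the UTR")
    start = max(0, snp_pos - flank)
    end = min(len(utr), snp_pos + flank + 1)
    return utr[start:end], snp_pos - start


def apply_snp(seq: str, offset: int, ref: str, alt: str) -> str:
    """Return `seq` with the single base at `offset` changed from `ref` to `alt`."""
    if seq[offset] != ref:
        raise ValueError("reference allele does not match the sequence")
    return seq[:offset] + alt + seq[offset + 1:]


# Toy UTR (made up): 100 nt, the SNP base, 100 nt.
utr = "AC" * 50 + "G" + "UG" * 50
for flank in (20, 25, 80):
    window, offset = flank_window(utr, 100, flank)
    variant = apply_snp(window, offset, "G", "C")
    print(flank, len(window), variant[offset])
```

Both the original and the variant window would then be passed to the target prediction tool; the flank size decides how much sequence context the tool sees around the SNP.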
However, this might exclude target sites with gaps or bulges, as miRNAs do not require perfect complementarity to the UTR sequence to form a miRNA-mRNA complex. We compared the predictions of miRanda 3.3a for shortened UTR sequences (30 nts) with those for the complete UTR sequence and noticed that some target sites were not predicted when the shorter sequences were used. On repeated testing, we found that at least 80 nts (upstream and downstream) were required for miRanda to yield the same results as with the whole UTR region as input. Therefore, we chose this threshold for the miRanda predictions presented in Table 3. For PolymiRTS, we found no information on the number of nucleotides considered upstream and downstream. To assess the impact of shortening the UTR sequences to 25 nts, we computed the number of gains and losses predicted by miRanda with the shortened and with the full UTR length on our collected data. Using the full UTR length, miRanda predicted 11 433 628 gains and 11 953 935 losses, whereas with the shortened UTR length, more gains and losses were found (11 553 901 and 12 074 380, respectively). The impact is thus small (about 1%) but real, and potentially correct predictions could be missed.
Database functionality
All databases allow searching by miRNA name, gene symbol or dbSNP identifier to identify MRE gains or losses. All except miRdSNP and MirSNP also consider SNPs in miRNAs and their effects. MirSNP and PolymiRTS additionally allow querying multiple entries at once by providing a list of queries. All databases except MirSNP include experimental evidence collected from multiple miRNA target catalogs. Beyond these features, only miRNASNP v2.0 allows users to evaluate the effects of novel SNVs.
However, miRNASNP accepts queries on the effects of variant UTR or miRNA sequences only one at a time; the possibility of uploading a VCF file would be more convenient, as it would make it easy to study the effects of SNVs detected by variant-calling pipelines. Querying for SNPs linked to diseases is implemented only in miRdSNP. Even though PolymiRTS provides disease-related information, it allows querying for traits only. Although the majority of databases consider only single point mutations, it is important to note that PolymiRTS 3.0 also considers small INDELs, which form a large part of genomic variation. In addition, PolymiRTS is the only tool that integrates information from the KEGG pathway database, making it easier for users to study the effects of genomic variations on biological pathways. Another important feature provided by PolymiRTS and MirSNP is the annotation of the evolutionary conservation of target sites.
Integration of experimental information
Another important factor is the experimental information that the individual databases provide for the predicted miRNA–target pairs. As shown in Table 2, all databases except MirSNP integrate information from miRNA–target catalogs, which allows filtering for experimentally validated miRNA–target relationships only. It should be noted, however, that even if a miRNA–target relationship is validated, the exact location of the binding site is often unknown. Therefore, there is a higher risk that predicted MRE losses of validated interactions are false positives, because the miRNA may never have bound at the specified location. MiRdSNP is the only database containing manually curated disease-associated SNPs. In comparison, PolymiRTS associates genes carrying polymorphisms in their 3ʹ UTRs with diseases and traits by considering the results of genome-wide association studies (GWASs) and expression quantitative trait loci (eQTL) studies.
MiRNASNP v2.0 considers GWAS results as well and reports LD regions. However, the traits associated with the LD regions seem to be accessible only when querying for a particular pre-miRNA. As miRNAs usually upregulate or downregulate their target genes, it makes sense to consider their expression. MiRNASNP takes this into account by including miRNA and mRNA expression data (retrieved from TCGA) and reporting their correlation. The PolymiRTS database lacks this feature; however, it provides annotations from CLASH experiments, which offer explicit evidence for miRNA-mRNA target relationships in the context of SNPs or INDELs. This is an advantage over the other databases, where miRNA-mRNA target relationships are reported only on the basis of their original sequences.
Benchmark on 16 reported and validated SNP effects
We evaluated the ability of the databases to recover 16 experimentally validated effects of 13 variants in miRNA-binding sites in 3ʹ UTRs, which were reported by multiple studies [37, 62–64] and summarized in a recent review [65]. We excluded MirSNP from the evaluation because its search functionality was not available at the time of running the benchmark. The results for the remaining three databases are presented in Table 5.
Table 5. Results of querying the databases with 16 reported and validated SNP effects
Interaction | miRdSNP | PolymiRTS Database 3.0 | miRNASNP v2.0
Gains
hsa-miR-191-5p, hsa-miR-887-3p with MDM4 because of rs4245739 | Not a dSNP, so no gain information | Reported | Reported, gain/loss inverted
hsa-miR-140-5p with TP63 because of rs35592567 | Not a dSNP, so no gain information | Not reported | Reported, gain/loss inverted
hsa-miR-214-5p, hsa-miR-550a-5p with HNF1B because of rs2229295 | Not a dSNP, so no gain information | Reported, gain/loss inverted | Reported
hsa-miR-4271 with APOC3 because of rs4225 | SNP not found | Reported | Reported
hsa-miR-522-3p with PLIN4 because of rs8887 | SNP not found | Reported, without gain/loss information | Not reported
hsa-miR-124-3p with FXN because of rs11145043 | SNP not found | Reported, gain/loss inverted | Reported, gain/loss inverted
hsa-miR-629-5p with NBS1 because of rs2735383 | SNP not found | Not reported | Not reported
Losses
hsa-miR-34b-3p with SNCA because of rs10024743 | – | Reported | Reported
hsa-miR-96-5p and hsa-miR-182-5p with PALLD because of rs1071738 | – | Reported | Reported, but only 182-5p
hsa-miR-137 with EFNB2 because of rs550067317 | – | SNP not found | SNP not found
hsa-miR-510-5p with HTR3E because of rs56109847 | – | Reported | Reported
hsa-miR-433-3p with FGF20 because of rs12720208 | – | Reported | Reported
hsa-miR-155-5p with AGTR1 because of rs5186 | – | Reported | Reported
Reported interactions | 0 | 13 | 12
First of all, we noticed a usability problem when searching for SNPs in 3ʹ UTRs using miRNASNP. The results for the same query differed depending on whether we used the entry search page located at http://bioinfo.life.hust.edu.cn/miRNASNP2/search.php or the results search page that becomes available after querying the database via the entry search page. An example is rs4245739, for which six target gains were reported when using the entry search page, whereas no target gains were reported when searching via the results search page.
The reason for this might lie in the results search page, which shows only miRNAs with annotated expression data. Additionally, the database offers the possibility to show results only for miRNAs with a minimum average expression. However, the default option, which is to show interactions for all miRNAs, does not in fact show all of them but filters out every miRNA for which no expression data are available, resulting in the difference observed above.
As shown in Table 5, none of the queried effects were found in miRdSNP. This is because miRdSNP reports MRE gains only for its specifically determined subset of disease-associated SNPs and does not report MRE losses. Of the 16 evaluated effects, 13 were found in PolymiRTS and 12 in miRNASNP. We found that MRE gains or losses were often inverted in comparison with the reported effects. The reason is that PolymiRTS takes the ancestral allele as the reference nucleotide, whereas miRNASNP takes the nucleotide in the reference genome. An example of an MRE loss not found in any database is the one induced by rs550067317. It is not reported because the SNP has been present in dbSNP only since build 142, which is newer than the builds used by any of the databases. A further example, an MRE gain reported by no database, is caused by rs2735383. This can be explained by the unusual binding of hsa-miR-629-5p, which has no complementary base pairing at the first position of the seed and is therefore discarded by TargetScan. Another SNP, rs35592567, for which PolymiRTS did not report any interaction, forms a new binding site that is found by TargetScan because the first base of the seed can then bind, resulting in a perfect 8mer match. We could not determine why PolymiRTS does not report this interaction, although it uses TargetScan. MiRNASNP reports this interaction because not only TargetScan but also miRanda predicts it.
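To make the seed-matching argument and the gain/loss inversion more concrete, the following sketch classifies canonical seed sites (8mer, 7mer-m8, 7mer-A1, 6mer, as these site types are commonly defined) and labels the effect of a variant relative to a chosen baseline allele. It is a simplified stand-in and not the TargetScan implementation: it ignores conservation, context scores and site position, and the miRNA and UTR sequences are arbitrary toy strings.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def revcomp(rna: str) -> str:
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return rna.translate(_COMPLEMENT)[::-1]


def best_site(mirna: str, utr: str) -> Optional[str]:
    """Return the strongest canonical seed-site type found in `utr`, or None."""
    m8 = revcomp(mirna[1:8])  # pairs with miRNA positions 2-8
    m7 = revcomp(mirna[1:7])  # pairs with miRNA positions 2-7
    candidates = (
        ("8mer", m8 + "A"),     # positions 2-8 plus an A opposite position 1
        ("7mer-m8", m8),
        ("7mer-A1", m7 + "A"),  # positions 2-7 plus an A opposite position 1
        ("6mer", m7),
    )
    for name, motif in candidates:
        if motif in utr:
            return name
    return None


def mre_effect(mirna: str, baseline_utr: str, other_utr: str) -> str:
    """Label the change in site presence when going from the baseline allele to the other."""
    before = best_site(mirna, baseline_utr) is not None
    after = best_site(mirna, other_utr) is not None
    if after and not before:
        return "gain"
    if before and not after:
        return "loss"
    return "none"


# Toy example (made-up sequences): the two UTRs differ at a single base.
mirna = "UGACCGAUUCGGAUCCGAUAGC"
utr_allele_1 = "CCCCAUCGGUCACCCC"  # carries a perfect 8mer site
utr_allele_2 = "CCCCAUCGCUCACCCC"  # the site is disrupted

# The same variant is labeled differently depending on the baseline allele.
print(mre_effect(mirna, utr_allele_2, utr_allele_1))  # baseline: allele 2
print(mre_effect(mirna, utr_allele_1, utr_allele_2))  # baseline: allele 1
```

If one database takes the ancestral allele as baseline and another the reference genome allele, and the two differ for a given SNP, the calls correspond to the two `mre_effect` invocations above and yield opposite labels for the same site.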
The MRE gain of hsa-miR-522-3p with PLIN4 caused by rs8887 is reported by PolymiRTS but not by miRNASNP. This is because TargetScan predicts the binding site with the A allele owing to an 8mer seed match. MiRanda predicts the binding site as well, but only with a binding energy of −8.96 kcal/mol, which is filtered out by miRNASNP. Therefore, as miRNASNP requires both tools to predict an MRE gain, nothing is reported. The last difference between the predictions of PolymiRTS and miRNASNP concerns the MRE loss of hsa-miR-96-5p with PALLD caused by rs1071738, which is reported only by PolymiRTS. This can again be explained by the different predictions of TargetScan and miRanda. In the original sequence, TargetScan predicts a binding site with a 6mer followed by an A, which is disrupted by the SNP at the third base with a change of C to G. MiRanda does not detect this MRE loss because it predicts different target sites in this gene, which are not affected by the SNP.
When evaluating the binding sites of the predicted interactions, we noticed that the MRE gain in the 3ʹ UTR of HNF1B because of rs2229295 for hsa-miR-214-5p and hsa-miR-500a-5p would no longer be detected in hg38 by conventional prediction algorithms, as the nucleotide adjacent to the annotated SNP changed from G to A between hg19 and hg38, preventing any binding, as illustrated in Figure 2. This example also highlights the importance of the reference genome used by these databases.
Figure 2. Difference in the sequence of the human genome between builds hg19 and hg38, leading to undetected binding sites for hsa-miR-214-5p and hsa-miR-500a-5p.
In summary, PolymiRTS and miRNASNP cover our set of experimentally validated MRE gains and losses well, whereas miRdSNP did not recover any of them because of its restriction to dSNPs. PolymiRTS performs marginally better than miRNASNP, as it covers one more effect, which is why we see it as a close winner. Of course, this benchmark is small compared with the complete set of predictions; a much larger benchmark comprising thousands of interactions would be required to evaluate performance fairly. However, as the number of experimentally validated effects of SNPs in MREs is small, we preferred to focus the benchmark on a high-confidence set instead of including potential false positives.
Which database to choose?
As shown in the previous sections, all databases have some exclusive features. However, the search functionality of MirSNP was not available at the time of writing this review, which is why we cannot currently recommend it. The PolymiRTS Database 3.0 covers nearly all features of the other databases and should therefore be the default database of choice. If expression correlations between miRNAs and their mRNA targets or the effects of SNPs on pre-miRNAs are important, miRNASNP v2.0 is the database of choice. Furthermore, users studying novel SNVs should consider miRNASNP v2.0 as well, after reducing their SNVs to a small subset. We provide the up-to-date data we collected for this review in a database called miRSNPdb, which is available at www.ccb.uni-saarland.de/mirsnp. Users who rely on more recent SNPs or have their own novel SNVs can use it to retrieve MRE gains and losses.
Future challenges
The substantial increase in annotated miRNAs and SNPs has made it extremely computationally intensive to predict novel target sites. We found that the first step performed by TargetScan took ∼15 h for 2588 miRNAs from miRBase v21 and all 19 107 3ʹ UTRs from Ensembl 85 [66].
As the number of SNPs in miRNA target sites is nearly 100-fold higher than the number of 3ʹ UTRs, the number of miRNA-mRNA target pairs to evaluate rises substantially, even when the predictions are restricted to the considered regions. Therefore, the runtime of target prediction programs and/or of the algorithms assessing the impact of SNPs needs to be improved to keep up with the increasing amount of available data.
All presented databases focus exclusively on the impact of SNPs in miRNAs and 3ʹ UTRs. However, it has been shown that miRNAs can also bind to 5ʹ UTRs or even to coding regions. The extent of these interactions is still unclear; including them in relevant databases could therefore promote their investigation. More specifically, focusing on already validated interactions should be considered as a first step, as effects in 5ʹ UTRs are expected to be less frequent, and including all predictions would therefore also include a large set of false positives.
Until now, all available databases have focused on single SNPs or INDELs in either the 3ʹ UTR sequences or the miRNAs. However, it is not unlikely that multiple variants occur at the same time and induce other effects. The inherent exponential increase in variant combinations is a major challenge in this regard.
With the recent advances in high-throughput miRNA-mRNA mapping via CLASH experiments, new miRNA target prediction tools, such as TarPmiR [67], have been developed. Progress in the target prediction field will make it possible to improve the predictions of gains and losses induced by variants. Combined with the continuous increase in SNV data, this makes it important to keep databases up to date with the latest software and reference data. We believe that, because of the ever-growing number of annotated SNPs and the resulting substantial growth in predicted MRE gains and losses, a larger focus should be put on the curation of high-confidence sets.
These could be narrowed down at first by considering the common predictions of several target prediction tools. The resulting predictions could then be refined further by experimental evidence stemming, for example, from CLASH experiments. In addition, annotations for explicitly experimentally validated MRE gains or losses, such as the ones collected by Moszynska et al. [65], would greatly improve the quality of such sets and could eventually form a reliable gold standard. Overall, we think that such high-confidence data would increase the usefulness of such databases for precision medicine to a reasonable extent.
Key Points
A substantial increase has been observed in the number of reported SNPs and the thereby induced MREs over the past years.
All currently available databases are based on outdated resources.
PolymiRTS is the most complete available database, followed by miRNASNP v2.0.
Users studying novel SNVs should consider miRNASNP v2.0.
Tobias Fehlmann is a PhD student at the Chair for Clinical Bioinformatics, Saarland University, Germany. He has been working in the field of miRNAs in Bioinformatics since 2014.
Shashwat Sahay is a Master student at the Chair for Clinical Bioinformatics, Saarland University, Germany. He has been working in the field of miRNAs in Bioinformatics since 2016.
Andreas Keller is a Professor and head of the Chair for Clinical Bioinformatics at Saarland University. He has been working in the field of miRNAs in Bioinformatics since 2008.
Christina Backes is a Postdoc at the Chair for Clinical Bioinformatics at Saarland University. She has been working in the field of miRNAs in Bioinformatics since 2009.
References
1 Cook CE, Bergman MT, Finn RD. The European Bioinformatics Institute in 2016: data growth and integration. Nucleic Acids Res 2016;44:D20–6.
2 Kodama Y, Shumway M, Leinonen R. The sequence read archive: explosive growth of sequencing data. Nucleic Acids Res 2012;40:D54–6.
3 Sherry ST, Ward MH, Kholodov M, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 2001;29(1):308–11. http://dx.doi.org/10.1093/nar/29.1.308
4 Bartoszewski RA, Jablonsky M, Bartoszewska S, et al. A synonymous single nucleotide polymorphism in DeltaF508 CFTR alters the secondary structure of the mRNA and the expression of the mutant protein. J Biol Chem 2010;285(37):28741–8.
5 Stracquadanio G, Wang X, Wallace MD, et al. The importance of p53 pathway genetics in inherited and somatic cancer genomes. Nat Rev Cancer 2016;16(4):251–65. http://dx.doi.org/10.1038/nrc.2016.15
6 Zhang L, Long X. Association of three SNPs in TOX3 and breast cancer risk: evidence from 97275 cases and 128686 controls. Sci Rep 2015;5:12773. http://dx.doi.org/10.1038/srep12773
7 Huang CY, Huang SP, Lin VC, et al. Genetic variants of the autophagy pathway as prognostic indicators for prostate cancer. Sci Rep 2015;5(1):14045. http://dx.doi.org/10.1038/srep14045
8 Lambert JC, Ibrahim-Verbaas CA, Harold D, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer's disease. Nat Genet 2013;45:1452–8. http://dx.doi.org/10.1038/ng.2802
9 De Marchi F, Carecchio M, Cantello R, et al. Predicting cognitive decline in Parkinson's disease: can we ask the genes? Front Neurol 2014;5:224.
10 Mattick JS. Non-coding RNAs: the architects of eukaryotic complexity. EMBO Rep 2001;2(11):986–91. http://dx.doi.org/10.1093/embo-reports/kve230
11 Bartel DP.
MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 2004;116(2):281–97. http://dx.doi.org/10.1016/S0092-8674(04)00045-5
12 Friedman RC, Farh KK, Burge CB, et al. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res 2009;19:92–105.
13 Leidinger P, Backes C, Deutscher S, et al. A blood based 12-miRNA signature of Alzheimer disease patients. Genome Biol 2013;14(7):R78.
14 Mitchell PS, Parkin RK, Kroh EM, et al. Circulating microRNAs as stable blood-based markers for cancer detection. Proc Natl Acad Sci USA 2008;105(30):10513–18. http://dx.doi.org/10.1073/pnas.0804549105
15 Roth P, Keller A, Hoheisel JD, et al. Differentially regulated miRNAs as prognostic biomarkers in the blood of primary CNS lymphoma patients. Eur J Cancer 2015;51(3):382–90. http://dx.doi.org/10.1016/j.ejca.2014.10.028
16 Pillai RS. MicroRNA function: multiple mechanisms for a tiny RNA? RNA 2005;11(12):1753–61.
17 Zhou H, Rigoutsos I. MiR-103a-3p targets the 5' UTR of GPRC5A in pancreatic cells. RNA 2014;20(9):1431–9. http://dx.doi.org/10.1261/rna.045757.114
18 Henke JI, Goergen D, Zheng J, et al. microRNA-122 stimulates translation of hepatitis C virus RNA. EMBO J 2008;27(24):3300–10. http://dx.doi.org/10.1038/emboj.2008.244
19 Orom UA, Nielsen FC, Lund AH. MicroRNA-10a binds the 5'UTR of ribosomal protein mRNAs and enhances their translation. Mol Cell 2008;30:460–71. http://dx.doi.org/10.1016/j.molcel.2008.05.001
20 Sacco L, Masotti A.
Recent insights and novel bioinformatics tools to understand the role of microRNAs binding to 5' untranslated region. Int J Mol Sci 2012;14(1):480–95. http://dx.doi.org/10.3390/ijms14010480
21 Ha M, Kim VN. Regulation of microRNA biogenesis. Nat Rev Mol Cell Biol 2014;15(8):509–24. http://dx.doi.org/10.1038/nrm3838
22 Lee Y, Ahn C, Han J, et al. The nuclear RNase III Drosha initiates microRNA processing. Nature 2003;425(6956):415–19. http://dx.doi.org/10.1038/nature01957
23 Hutvagner G, McLachlan J, Pasquinelli AE, et al. A cellular function for the RNA-interference enzyme Dicer in the maturation of the let-7 small temporal RNA. Science 2001;293(5531):834–8. http://dx.doi.org/10.1126/science.1062961
24 Hammond SM, Boettcher S, Caudy AA, et al. Argonaute2, a link between genetic and biochemical analyses of RNAi. Science 2001;293(5532):1146–50. http://dx.doi.org/10.1126/science.1064023
25 Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell 2009;136(2):215–33. http://dx.doi.org/10.1016/j.cell.2009.01.002
26 Duan R, Pak C, Jin P. Single nucleotide polymorphism associated with mature miR-125a alters the processing of pri-miRNA. Hum Mol Genet 2007;16(9):1124–31. http://dx.doi.org/10.1093/hmg/ddm062
27 Lewis BP, Shih IH, Jones-Rhoades MW, et al. Prediction of mammalian microRNA targets. Cell 2003;115(7):787–98. http://dx.doi.org/10.1016/S0092-8674(03)01018-3
28 Jazdzewski K, Murray EL, Franssila K, et al. Common SNP in pre-miR-146a decreases mature miR expression and predisposes to papillary thyroid carcinoma.
Proc Natl Acad Sci USA 2008;105(20):7269–74. http://dx.doi.org/10.1073/pnas.0802682105
29 Shen J, Ambrosone CB, DiCioccio RA, et al. A functional polymorphism in the miR-146a gene and age of familial breast/ovarian cancer diagnosis. Carcinogenesis 2008;29(10):1963–6. http://dx.doi.org/10.1093/carcin/bgn172
30 Xu T, Zhu Y, Wei QK, et al. A functional polymorphism in the miR-146a gene is associated with the risk for hepatocellular carcinoma. Carcinogenesis 2008;29(11):2126–31. http://dx.doi.org/10.1093/carcin/bgn195
31 Sun G, Yan J, Noltner K, et al. SNPs in human miRNA genes affect biogenesis and function. RNA 2009;15(9):1640–51. http://dx.doi.org/10.1261/rna.1560209
32 Mencia A, Modamio-Hoybjor S, Redshaw N, et al. Mutations in the seed region of human miR-96 are responsible for nonsyndromic progressive hearing loss. Nat Genet 2009;41:609–13. http://dx.doi.org/10.1038/ng.355
33 Zhou L, Zhang X, Li Z, et al. Association of a genetic variation in a miR-191 binding site in MDM4 with risk of esophageal squamous cell carcinoma. PLoS One 2013;8(5):e64331.
34 Gao F, Xiong X, Pan W, et al. A regulatory MDM4 genetic variant locating in the binding sequence of multiple MicroRNAs contributes to susceptibility of small cell lung cancer. PLoS One 2015;10(8):e0135647.
35 Stegeman S, Moya L, Selth LA, et al. A genetic variant of MDM4 influences regulation by multiple microRNAs in prostate cancer. Endocr Relat Cancer 2015;22(2):265–76. http://dx.doi.org/10.1530/ERC-15-0013
36 Wang M, Du M, Ma L, et al.
A functional variant in TP63 at 3q28 associated with bladder cancer risk by creating an miR-140-5p binding site . Int J Cancer 2016 ; 139 ( 1 ): 65 – 74 . http://dx.doi.org/10.1002/ijc.29978 Google Scholar CrossRef Search ADS PubMed 37 Wang G , van der Walt JM , Mayhew G , et al. Variation in the miRNA-433 binding site of FGF20 confers risk for Parkinson disease by overexpression of alpha-synuclein . Am J Hum Genet 2008 ; 82 ( 2 ): 283 – 9 . http://dx.doi.org/10.1016/j.ajhg.2007.09.021 Google Scholar CrossRef Search ADS PubMed 38 Bruno AE , Li L , Kalabus JL , et al. miRdSNP: a database of disease-associated SNPs and microRNA target sites on 3'UTRs of human genes . BMC Genomics 2012 ; 13 ( 1 ): 44 . http://dx.doi.org/10.1186/1471-2164-13-44 Google Scholar CrossRef Search ADS PubMed 39 Liu C , Zhang F , Li T , et al. MirSNP, a database of polymorphisms altering miRNA target sites, identifies miRNA-related SNPs in GWAS SNPs and eQTLs . BMC Genomics 2012 ; 13 ( 1 ): 661 . http://dx.doi.org/10.1186/1471-2164-13-661 Google Scholar CrossRef Search ADS PubMed 40 Bhattacharya A , Ziebarth JD , Cui Y. PolymiRTS database 3.0: linking polymorphisms in microRNAs and their target sites with human diseases and biological pathways . Nucleic Acids Res 2014 ; 42 : D86 – 91 . Google Scholar CrossRef Search ADS PubMed 41 Gong J , Liu C , Liu W , et al. An update of miRNASNP database for better SNP selection by GWAS data, miRNA expression and online tools . Database 2015 ; 2015 : bav029 . Google Scholar CrossRef Search ADS PubMed 42 Sethupathy P , Corda B , Hatzigeorgiou AG. TarBase: a comprehensive database of experimentally supported animal microRNA targets . RNA 2006 ; 12 ( 2 ): 192 – 7 . Google Scholar CrossRef Search ADS PubMed 43 Hsu SD , Lin FM , Wu WY , et al. miRTarBase: a database curates experimentally validated microRNA-target interactions . Nucleic Acids Res 2011 ; 39 : D163 – 9 . Google Scholar CrossRef Search ADS PubMed 44 Xiao F , Zuo Z , Cai G , et al. 
miRecords: an integrated resource for microRNA-target interactions . Nucleic Acids Res 2009 ; 37 : D105 – 10 . Google Scholar CrossRef Search ADS PubMed 45 Jiang Q , Wang Y , Hao Y , et al. miR2Disease: a manually curated database for microRNA deregulation in human disease . Nucleic Acids Res 2009 ; 37 : D98 – 104 . Google Scholar CrossRef Search ADS PubMed 46 Krek A , Grun D , Poy MN , et al. Combinatorial microRNA target predictions . Nat Genet 2005 ; 37 ( 5 ): 495 – 500 . http://dx.doi.org/10.1038/ng1536 Google Scholar CrossRef Search ADS PubMed 47 Lewis BP , Burge CB , Bartel DP. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets . Cell 2005 ; 120 ( 1 ): 15 – 20 . http://dx.doi.org/10.1016/j.cell.2004.12.035 Google Scholar CrossRef Search ADS PubMed 48 Enright AJ , John B , Gaul U , et al. MicroRNA targets in Drosophila . Genome Biol 2003 ; 5 ( 1 ): R1 . Google Scholar CrossRef Search ADS PubMed 49 International HapMap Consortium ; Frazer KA , Ballinger DG , et al. A second generation human haplotype map of over 3.1 million SNPs . Nature 2007 ; 449 : 851 – 61 . Google Scholar CrossRef Search ADS PubMed 50 Bao L , Zhou M , Wu L , et al. PolymiRTS database: linking polymorphisms in microRNA target sites with complex traits . Nucleic Acids Res 2007 ; 35 : D51 – 4 . Google Scholar CrossRef Search ADS PubMed 51 Garcia DM , Baek D , Shin C , et al. Weak seed-pairing stability and high target-site abundance decrease the proficiency of lsy-6 and other microRNAs . Nat Struct Mol Biol 2011 ; 18 ( 10 ): 1139 – 46 . http://dx.doi.org/10.1038/nsmb.2115 Google Scholar CrossRef Search ADS PubMed 52 Helwak A , Kudla G , Dudnakova T , et al. Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding . Cell 2013 ; 153 ( 3 ): 654 – 65 . http://dx.doi.org/10.1016/j.cell.2013.03.043 Google Scholar CrossRef Search ADS PubMed 53 Kanehisa M , Goto S , Sato Y , et al. 
KEGG for integration and interpretation of large-scale molecular data sets . Nucleic Acids Res 2012 ; 40 : D109 – 14 . Google Scholar CrossRef Search ADS PubMed 54 Hindorff LA , Sethupathy P , Junkins HA , et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits . Proc Natl Acad Sci USA 2009 ; 106 ( 23 ): 9362 – 7 . http://dx.doi.org/10.1073/pnas.0903103106 Google Scholar CrossRef Search ADS PubMed 55 Mailman MD , Feolo M , Jin Y , et al. The NCBI dbGaP database of genotypes and phenotypes . Nat Genet 2007 ; 39 ( 10 ): 1181 – 6 . Google Scholar CrossRef Search ADS PubMed 56 GTEx Consortium . The Genotype-Tissue Expression (GTEx) Project . Nat Genet 2013 ; 45 : 580 – 5 . http://dx.doi.org/10.1038/ng.2653 CrossRef Search ADS PubMed 57 Ashburner M , Ball CA , Blake JA , et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium . Nat Genet 2000 ; 25 ( 1 ): 25 – 9 . Google Scholar CrossRef Search ADS PubMed 58 Gong J , Tong Y , Zhang HM , et al. Genome-wide identification of SNPs in microRNA genes and the SNP effects on microRNA target binding and biogenesis . Hum Mutat 2012 ; 33 ( 1 ): 254 – 63 . http://dx.doi.org/10.1002/humu.21641 Google Scholar CrossRef Search ADS PubMed 59 Li JH , Liu S , Zhou H , et al. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data . Nucleic Acids Res 2014 ; 42 : D92 – 7 . Google Scholar CrossRef Search ADS PubMed 60 Cancer Genome Atlas Research Network . Comprehensive genomic characterization defines human glioblastoma genes and core pathways . Nature 2008 ; 455 : 1061 – 8 . http://dx.doi.org/10.1038/nature07385 CrossRef Search ADS PubMed 61 Riffo-Campos AL , Riquelme I , Brebi-Mieville P. Tools for sequence-based miRNA target prediction: what to choose? Int J Mol Sci 2016 ; 17 ( 12 ): 1987 . Google Scholar CrossRef Search ADS 62 Yang L , Li Y , Cheng M , et al. 
A functional polymorphism at microRNA-629-binding site in the 3'-untranslated region of NBS1 gene confers an increased risk of lung cancer in Southern and Eastern Chinese population . Carcinogenesis 2012 ; 33 ( 2 ): 338 – 47 . http://dx.doi.org/10.1093/carcin/bgr272 Google Scholar CrossRef Search ADS PubMed 63 Kapeller J , Houghton LA , Monnikes H , et al. First evidence for an association of a functional variant in the microRNA-510 target site of the serotonin receptor-type 3E gene with diarrhea predominant irritable bowel syndrome . Hum Mol Genet 2008 ; 17 ( 19 ): 2967 – 77 . http://dx.doi.org/10.1093/hmg/ddn195 Google Scholar CrossRef Search ADS PubMed 64 Sethupathy P , Borel C , Gagnebin M , et al. Human microRNA-155 on chromosome 21 differentially interacts with its polymorphic target in the AGTR1 3' untranslated region: a mechanism for functional single-nucleotide polymorphisms related to phenotypes . Am J Hum Genet 2007 ; 81 ( 2 ): 405 – 13 . http://dx.doi.org/10.1086/519979 Google Scholar CrossRef Search ADS PubMed 65 Moszynska A , Gebert M , Collawn JF , et al. SNPs in microRNA target sites and their potential role in human disease . Open Biol 2017 ; 7 : 170019 . http://dx.doi.org/10.1098/rsob.170019 Google Scholar CrossRef Search ADS PubMed 66 Yates A , Akanni W , Amode MR , et al. Ensembl 2016 . Nucleic Acids Res 2016 ; 44 : D710 – 16 . Google Scholar CrossRef Search ADS PubMed 67 Ding J , Li X , Hu H. TarPmiR: a new approach for microRNA target site prediction . Bioinformatics 2016 ; 32 ( 18 ): 2768 – 75 . http://dx.doi.org/10.1093/bioinformatics/btw318 Google Scholar CrossRef Search ADS PubMed © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Journal

Briefings in Bioinformatics, Oxford University Press

Published: Nov 27, 2017
