Transform-MinER: transforming molecules in enzyme reactions

J.D. Tyzack et al.

Motivation: One goal of synthetic biology is to make new enzymes to generate new products, but identifying the starting enzymes for further investigation is often elusive and relies on expert knowledge, intensive literature searching and trial and error.

Results: We present Transform Molecules in Enzyme Reactions, an online computational tool that transforms query substrate molecules into products using enzyme reactions. The most similar native enzyme reactions for each transformation are found, highlighting those that may be of most interest for enzyme design and directed evolution approaches.

Availability and implementation: https://www.ebi.ac.uk/thornton-srv/transform-miner

Contact: tyzack@ebi.ac.uk

© The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

One goal of synthetic biology is to design or evolve new enzymes to perform new reactions or generate new products (Renata et al., 2015). However, identifying the most similar enzyme reactions with potential for promiscuous enzyme-substrate interactions (Khersonsky and Tawfik, 2010; Nobeli et al., 2009) to act as start points for further investigation is often challenging, relying on expert knowledge and intensive literature searching.

We present Transform-MinER (Transform Molecules in Enzyme Reactions), an online computational tool that transforms query substrate molecules into products by applying known enzyme reactions at potential reaction centres (RCs) and retrieves the most similar native enzyme reactions for each. It can be used in two modes: (i) Molecule Search identifies potential enzyme transformations acting on a submitted query substrate; and (ii) Path Search attempts to link submitted source and target molecules with enzyme transformations.

2 Materials and methods

Transform-MinER is based on three main tasks: Section 2.1 identifies potential RCs in query substrates; Section 2.2 calculates the RC molecular environment (MolEnv) similarity in query and native substrates in order to generate rank ordered lists; and Section 2.3 applies transformations to produce products. The methodology and user interface are described below, with further explanation provided in the 'About Transform-MinER' section of the website.

2.1 Reaction centres

The data behind Transform-MinER was obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (containing c. 11k enzyme reactions) (Kanehisa et al., 2016) and balanced Reaction (RXN) files were generated for all reactions where molecular structures were available. RCs were defined as atoms that change connectivity, neighbours or stereochemistry and were identified from mapped RXN files after performing atom–atom mapping (Rahman et al., 2014). Canonical SMiles ARbitrary Target Specification (SMARTS) patterns were generated (Landrum, 2018) to represent these RCs, enabling query molecules to be searched for matching fragments.

2.2 Similarity of RC MolEnvs

The MolEnv of each RC was defined as all atoms in the substrate outside the RC to a bond depth of 10 bonds and represented using MolPrint2D circular fingerprints (Bender et al., 2004). This enabled Tanimoto similarity scores between query and native RC MolEnvs to be calculated, allowing rank ordered lists of similar enzyme reactions to be generated.

2.3 Reaction SMARTS

RDKit Reaction SMARTS patterns (Landrum, 2018) were generated from the mapped RXN files (2.1), describing how each substrate is transformed into the product at each RC. For example, KEGG reaction R02377 involving the oxidation of propan-1-ol to propanal can be expressed as:

    [C&X4&H2:1]-[O&X2&H1:2]>>[C&X3&H1:1]=[O&X1&H0:2]

where &X represents the total number of neighbours, &H represents the number of bound hydrogens and atom identifiers are given after the colon. The Reaction SMARTS can then be applied to potential RCs in a query substrate to generate the products from the transformation.

3 User interface

The potential RCs identified in the query substrate, RC MolEnv similarity scores (with hyperlinks to the KEGG database) and transformation products are returned to the user in an interactive web application (Django Software Foundation, 2018). This allows a query substrate to be submitted (Molecule Search) from the MarvinSketch plugin (ChemAxon, 2016) along with a target molecule if known (Path Search).

Results are presented to the user using two main interactive views: (i) Path View, with nodes representing molecules and edges representing transformations; and (ii) Molecule View, with interactive shapes representing RCs. A similarity slider allows the user to vary the similarity threshold between the submitted minimum similarity threshold (MinSimThresh) and 1.0 to control the number of transformations presented to the user. When selecting a transformation (an edge in Path View or a simplified Reaction SMARTS embedded in an RC shape in Molecule View) a Similar Enzyme Reactions Data Table is populated showing matching native enzyme reactions and substrates in descending RC MolEnv similarity. Hyperlinks take the user to the associated KEGG reaction and KEGG compound, with the native KEGG compound structure in a hover-over box.

3.1 Molecule search

The algorithm uses SMARTS pattern matching to identify fragments in the query substrate that match native RCs, and applies Reaction SMARTS patterns to generate products where query and native RC MolEnv similarity is above MinSimThresh.

Path View shows all transformations radiating from the submitted query substrate (source), which can be selected to populate the Similar Enzyme Reactions Data Table. The edges are labelled with the best RC MolEnv similarity score for that transformation and a hover-over box identifying the most similar KEGG reaction and compound. Molecule View has hover-over functionality embedded in the RC shapes that reveals simplified Reaction SMARTS patterns describing potential transformations and RC MolEnv similarity scores. Selecting a simplified Reaction SMARTS pattern produces the product by applying the selected transformation at the selected RC, and populates the Similar Enzyme Reactions Data Table. A detailed list of transformations is provided beneath the Interactive Molecule View in an expandable Accordion View.

3.2 Path search

If a target molecule has been submitted, Transform-MinER applies transformations iteratively, retaining only those paths that move closer to the target molecule and discarding others. The algorithm prioritizes exploration of paths by taking the product of the transformation-product/target-molecule similarity (calculated using Morgan fingerprints in RDKit) and the highest RC MolEnv similarity, exploring paths with higher combined scores first. The Molecule View is the same as described previously, showing RCs and transformations from the source molecule.

3.3 Example

A screenshot of part of the Accordion View obtained when submitting propan-1-ol in a Molecule Search with similarity threshold of 0.6 is shown in Figure 1.

Fig. 1. Part of the Accordion View for propan-1-ol with similarity threshold 0.6. For example, 2.1 shows the transformation of propan-1-ol to propanal, with RC MolEnv similarity of 1.0 to KEGG compound C05979 in KEGG reaction R02377 being the best match in the data table. The RC is highlighted in green in the source molecule; the grey highlighting in the product molecule maps to the source molecule.

4 Conclusions

Transform-MinER provides an interactive way of applying enzyme transformations to query substrates, finding the most similar native enzyme reactions for each, and is complementary to other computational tools for predicting enzyme transformations (Delépine et al., 2018; Moriya et al., 2010). It is anticipated that this tool will help to identify substrates that may show promiscuous activity with enzymes, acting as start points for further development using synthetic biology methods.

Funding

This work was supported by the European Molecular Biology Laboratory (EMBL).

Conflict of Interest: none declared.

References

Bender,A. et al. (2004) Molecular similarity searching using atom environments, information-based feature selection, and a Naïve Bayesian classifier. J. Chem. Inf. Comput. Sci., 44, 170–178.
ChemAxon. (2016) MarvinSketch, Version 16.6.13.0. https://chemaxon.com/products/marvin (21 May 2018, date last accessed).
Delépine,B. et al. (2018) RetroPath2.0: a retrosynthesis workflow for metabolic engineers. Metab. Eng., 45, 158–170.
Django Software Foundation. (2018) https://www.djangoproject.com (21 May 2018, date last accessed).
Kanehisa,M. et al. (2016) KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res., 44, D457–D462.
Khersonsky,O. and Tawfik,D.S. (2010) Enzyme promiscuity: a mechanistic and evolutionary perspective. Annu. Rev. Biochem., 79, 471–505.
Landrum,G. et al. (2018) RDKit. Open-Source Cheminformatics, Version 2018.03.1, http://www.rdkit.org (21 May 2018, date last accessed).
Moriya,Y. et al. (2010) PathPred: an enzyme-catalyzed metabolic pathway prediction server. Nucleic Acids Res., 38, W138–W143.
Nobeli,I. et al. (2009) Protein promiscuity and its implications for biotechnology. Nat. Biotechnol., 27, 157–167.
Rahman,S.A. et al. (2014) EC-BLAST: a tool to automatically search and compare enzyme reactions. Nat. Methods, 11, 171–174.
Renata,H. et al. (2015) Expanding the enzyme universe: accessing non-natural reactions by mechanism-guided directed evolution. Angew. Chemie Int. Ed., 54, 3351–3367.
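As an illustration of Sections 2.1 and 2.3, the sketch below matches the reaction centre of KEGG reaction R02377 in a query substrate and then applies the Reaction SMARTS quoted in Section 2.3 using RDKit. It is not the Transform-MinER implementation: the function names, the standalone RC pattern (the reactant side of the Reaction SMARTS with atom maps removed) and the choice to report products as canonical SMILES are assumptions made for this example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Reaction SMARTS for KEGG reaction R02377, as quoted in Section 2.3.
RXN_SMARTS = "[C&X4&H2:1]-[O&X2&H1:2]>>[C&X3&H1:1]=[O&X1&H0:2]"
# Assumption: the RC pattern is the reactant side with atom maps removed.
RC_SMARTS = "[C&X4&H2]-[O&X2&H1]"


def _parse(smiles):
    """Parse a SMILES string, raising on failure instead of returning None."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return mol


def find_reaction_centres(substrate_smiles, rc_smarts=RC_SMARTS):
    """Return atom-index tuples of fragments that match the RC pattern."""
    pattern = Chem.MolFromSmarts(rc_smarts)
    return _parse(substrate_smiles).GetSubstructMatches(pattern)


def apply_transformation(substrate_smiles, rxn_smarts=RXN_SMARTS):
    """Apply the Reaction SMARTS at every matching RC; return unique products."""
    rxn = AllChem.ReactionFromSmarts(rxn_smarts)
    products = set()
    for product_set in rxn.RunReactants((_parse(substrate_smiles),)):
        for product in product_set:
            Chem.SanitizeMol(product)  # recompute valences and implicit Hs
            products.add(Chem.MolToSmiles(product))
    return sorted(products)


print(find_reaction_centres("CCCO"))  # propan-1-ol: C2 and O3 form the RC
print(apply_transformation("CCCO"))   # propanal, written as CCC=O
```

A secondary alcohol such as propan-2-ol (`CC(C)O`) carries only one hydrogen on the carbinol carbon, so the `&H2` constraint fails and no product is generated.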

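The ranking step of Section 2.2 and the MinSimThresh filter of Section 3.1 can be sketched with plain Python sets. The Tanimoto score is the number of shared fingerprint features divided by the number of features in either environment. Everything concrete here is an assumption for illustration: the feature strings stand in for MolPrint2D atom environments, the `native_*` identifiers are hypothetical, and treating a score equal to the threshold as passing is a choice made for the example.

```python
def tanimoto(features_a, features_b):
    """Tanimoto similarity between two sets of fingerprint features."""
    a, b = set(features_a), set(features_b)
    union = len(a | b)
    if union == 0:
        return 1.0  # assumed convention: two empty environments are identical
    return len(a & b) / union


def rank_native_reactions(query_env, native_envs, min_sim_thresh=0.6):
    """Return (score, reaction_id) pairs at or above the threshold, best first."""
    scored = [(tanimoto(query_env, env), rid) for rid, env in native_envs.items()]
    kept = [pair for pair in scored if pair[0] >= min_sim_thresh]
    return sorted(kept, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical feature sets standing in for MolPrint2D RC MolEnv fingerprints.
QUERY_ENV = {"a", "b", "c"}
NATIVE_ENVS = {
    "native_1": {"a", "b", "c"},       # identical environment, score 1.0
    "native_2": {"a", "b", "c", "d"},  # 3 shared of 4 features, score 0.75
    "native_3": {"a", "x", "y"},       # 1 shared of 5 features, score 0.2
}

print(rank_native_reactions(QUERY_ENV, NATIVE_ENVS, min_sim_thresh=0.6))
# [(1.0, 'native_1'), (0.75, 'native_2')]
```

Raising `min_sim_thresh` towards 1.0 shortens the list, which mirrors the role of the similarity slider described in Section 3.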

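The Path Search strategy of Section 3.2 reads as a best-first search, and the sketch below shows one way that could look. It is a guess at the structure, not the published algorithm. A step is kept only if the product is more similar to the target than the molecule it came from, and kept steps are explored in descending order of (product/target similarity) x (best RC MolEnv similarity). The `expand` and `target_similarity` callables, the toy molecule labels and all scores are hypothetical; in the tool these would come from Reaction SMARTS application and Morgan fingerprint similarity.

```python
import heapq
from itertools import count


def path_search(source, target, expand, target_similarity, max_steps=1000):
    """Best-first search from source to target; returns a path or None."""
    tiebreak = count()  # keeps heap ordering stable for equal priorities
    heap = [(-1.0, next(tiebreak), [source])]
    visited = {source}
    for _ in range(max_steps):
        if not heap:
            return None
        _, _, path = heapq.heappop(heap)  # highest combined score first
        current = path[-1]
        if current == target:
            return path
        current_sim = target_similarity(current, target)
        for product, rc_sim in expand(current):
            product_sim = target_similarity(product, target)
            # Discard steps that do not move closer to the target.
            if product in visited or product_sim <= current_sim:
                continue
            visited.add(product)
            priority = product_sim * rc_sim
            heapq.heappush(heap, (-priority, next(tiebreak), path + [product]))
    return None


# Toy data: similarity of each molecule to target "T", and transformations
# as (product, best RC MolEnv similarity) pairs. All values are made up.
SIM_TO_TARGET = {"A": 0.2, "B": 0.5, "C": 0.4, "D": 0.1, "T": 1.0}
TRANSFORMS = {
    "A": [("B", 0.9), ("C", 0.7), ("D", 1.0)],
    "B": [("T", 0.8)],
    "C": [("T", 0.9)],
}


def expand(molecule):
    return TRANSFORMS.get(molecule, [])


def target_similarity(molecule, target):
    return SIM_TO_TARGET[molecule]


print(path_search("A", "T", expand, target_similarity))  # ['A', 'B', 'T']
```

In the toy run the step A to D is discarded because D is less similar to the target than A, and B is explored before C because 0.5 x 0.9 exceeds 0.4 x 0.7.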

Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press.
ISSN
1367-4803
eISSN
1460-2059
DOI
10.1093/bioinformatics/bty394


Journal

Bioinformatics, Oxford University Press

Published: May 14, 2018
