Valdés, Julio J.; Tchagang, Alain B.
doi: 10.1002/jcc.27295 pmid: 38329198
This paper (i) explores the internal structure of two quantum mechanics datasets (QM7b, QM9), composed of several thousand organic molecules and described in terms of electronic properties, and (ii) further explores an inverse design approach to molecular design consisting of using machine learning methods to approximate the atomic composition of molecules, using QM9 data. Understanding the structure and characteristics of this kind of data is important when predicting the atomic composition from physical‐chemical properties in inverse molecular designs. Intrinsic dimension analysis, clustering, and outlier detection methods were used in the study. They revealed that for both datasets the intrinsic dimensionality is several times smaller than the descriptive dimensions. The QM7b data is composed of well‐defined clusters related to atomic composition. The QM9 data consists of an outer region predominantly composed of outliers, and an inner, core region that concentrates clustered inlier objects. A significant relationship exists between the number of atoms in the molecule and its outlier/inlier nature. The spatial structure exhibits a relationship with molecular weight. Despite the structural differences between the two datasets, the predictability of variables of interest for inverse molecular design is high. This is exemplified by models estimating the number of atoms of the molecule from both the original properties and from lower dimensional embedding spaces. In the generative approach the input is given by a set of desired properties of the molecule and the output is an approximation of the atomic composition in terms of its constituent chemical elements. This could serve as the starting region for further search in the huge space determined by the set of possible chemical compounds. The quantum mechanics dataset QM9 is used in the study, composed of 133,885 small organic molecules and 19 electronic properties.
Different multi‐target regression approaches were considered for predicting the atomic composition from the properties, including feature engineering techniques in an auto‐machine learning framework. High‐quality models were found that predict the atomic composition of the molecules from their electronic properties, as well as from a subset of those properties only 52.6% of the full size. Feature selection worked better than feature generation. The results validate the generative approach to inverse molecular design.
Grazioli, Laura; Schleicher, Luca T.; Stopkowicz, Stella; Gauss, Jürgen
doi: 10.1002/jcc.27305 pmid: 38334014
Following chemical intuition, one would expect that all closed‐shell molecules are diamagnetic. However, it is known that this is not the case for some second‐row hydrides with low‐lying unoccupied π orbitals due to an unquenching of the total angular momentum in the presence of an external magnetic field. In this article, the transition‐metal hydrides ScH and YH are investigated, assuming a similar unquenching effect involving low‐lying unoccupied π and δ orbitals formed from the metal d orbitals rather than the p orbitals. We compare results obtained with various quantum‐chemical methods (HF, CCSD, CCSD(T), CCSDT) and basis sets. The positive values obtained for the magnetizabilities clearly indicate paramagnetic behavior. Vibrational effects on the magnetizability tensor are also considered, but these effects are small and do not change the overall conclusion that both ScH and YH are further examples of closed‐shell paramagnetism.
Vervust, Wouter; Zhang, Daniel T.; Ghysels, An; Roet, Sander; Erp, Titus S.; Riccardi, Enrico
doi: 10.1002/jcc.27319 pmid: 38345082
We present and discuss the advancements made in PyRETIS 3, the third instalment of our Python library for efficient and user‐friendly rare event simulation, focused on molecular simulations with replica exchange transition interface sampling (RETIS) and its variations. Apart from a general rewiring of the internal code towards a more modular structure, several recently developed sampling strategies have been implemented. These include Monte Carlo moves to increase path decorrelation and convergence rate, and new ensemble definitions to handle the challenges of long‐lived metastable states and transitions with unbounded reactant and product states. Additionally, the post‐analysis software PyVisa is now embedded in the main code, allowing fast use of machine‐learning algorithms for clustering and visualising collective variables in the simulation data.
Manchev, Yulian T.; Popelier, Paul L. A.
doi: 10.1002/jcc.27323 pmid: 38345165
Machine learning (ML) force fields are revolutionizing molecular dynamics (MD) simulations as they bypass the computational cost associated with ab initio methods but do not sacrifice accuracy in the process. In this work, the GPyTorch library is used to create Gaussian process regression (GPR) models that are interfaced with the next‐generation ML force field FFLUX. These models predict atomic properties of different molecular configurations that appear in a progressing MD simulation. An improved kernel function is utilized to correctly capture the periodicity of the input descriptors. The first FFLUX molecular simulations of ammonia, methanol, and malondialdehyde with the updated kernel are performed. Geometry optimizations with the GPR models result in highly accurate final structures with a maximum root‐mean‐squared deviation of 0.064 Å and sub‐kJ mol−1 total energy predictions. Additionally, the models are tested in 298 K MD simulations with FFLUX to benchmark for robustness. The resulting energy and force predictions throughout the simulation are in excellent agreement with ab initio data for ammonia and methanol but decrease in quality for malondialdehyde due to the increased system complexity. GPR model improvements are discussed, which will ensure the future scalability to larger systems.
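As a rough illustration of why a periodicity-aware kernel matters for angular descriptors, the sketch below compares the standard exp-sine-squared (periodic) covariance with an ordinary squared-exponential kernel in plain Python. It is not the kernel used in FFLUX or GPyTorch; the function names `periodic_kernel` and `rbf_kernel`, and the `lengthscale` and `period` parameters, are assumptions chosen for illustration.

```python
import math

def periodic_kernel(x1, x2, lengthscale=1.0, period=2.0 * math.pi):
    """Exp-sine-squared covariance between two scalar (angular) inputs."""
    # sin^2(pi * d / period) is unchanged when d shifts by a whole period,
    # so inputs that differ by one period are treated as identical.
    s = math.sin(math.pi * (x1 - x2) / period)
    return math.exp(-2.0 * s * s / lengthscale ** 2)

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Ordinary squared-exponential kernel, shown only for contrast."""
    return math.exp(-0.5 * (x1 - x2) ** 2 / lengthscale ** 2)

# Two angles on either side of the -pi / +pi wrap-around (illustrative values).
near_wrap_a = -math.pi + 0.01
near_wrap_b = math.pi - 0.01

print(periodic_kernel(near_wrap_a, near_wrap_b))  # close to 1: nearly the same angle
print(rbf_kernel(near_wrap_a, near_wrap_b))       # close to 0: treated as far apart
```

The two angles describe almost the same geometry, yet a non-periodic kernel sees them as nearly uncorrelated; this is the kind of mismatch a periodic kernel is meant to remove.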
Heßelmann, Andreas; Giner, Emmanuel; Reinhardt, Peter; Knowles, Peter J.; Werner, Hans‐Joachim; Toulouse, Julien
doi: 10.1002/jcc.27325 pmid: 38348951
This work reports an efficient density‐fitting implementation of the density‐based basis‐set correction (DBBSC) method in the MOLPRO software. This method consists in correcting the energy calculated by a wave‐function method with a given basis set by an adapted basis‐set correction density functional incorporating the short‐range electron correlation effects missing in the basis set, resulting in an accelerated convergence to the complete‐basis‐set limit. Different basis‐set correction density‐functional approximations are explored and the complementary‐auxiliary‐basis‐set single‐excitation correction is added. The method is tested on a benchmark set of reaction energies at the second‐order Møller–Plesset (MP2) level and a comparison with the explicitly correlated MP2‐F12 method is provided. The results show that the DBBSC method greatly accelerates the basis convergence of MP2 reaction energies, without reaching the accuracy of the MP2‐F12 method but with a lower computational cost.
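The correction described above can be written schematically as follows. This is a sketch based only on the wording of the abstract, not the paper's own working equations; the symbols are assumptions: $\mathcal{B}$ is the basis set, $E_{\mathrm{WF}}^{\mathcal{B}}$ the wave-function energy (MP2 here), $\bar{E}^{\mathcal{B}}[n^{\mathcal{B}}]$ the basis-set correction density functional evaluated on a density computed in the same basis, and $\Delta E_{\mathrm{CABS}}^{\mathcal{B}}$ the complementary-auxiliary-basis-set single-excitation correction.

```latex
E \;\approx\; E_{\mathrm{WF}}^{\mathcal{B}}
  \;+\; \bar{E}^{\mathcal{B}}\!\left[n^{\mathcal{B}}\right]
  \;+\; \Delta E_{\mathrm{CABS}}^{\mathcal{B}}
```

Because the functional term accounts for the short-range correlation missing in $\mathcal{B}$, it should vanish as the basis approaches completeness, leaving the wave-function energy itself.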
Ritacca, Alessandra Gilda; Prejanò, Mario; Alberto, Marta Erminia; Marino, Tiziana; Toscano, Marirosa; Russo, Nino
doi: 10.1002/jcc.27326 pmid: 38351736
A DFT and TDDFT study has been carried out on the monomeric anthraquinones Emodin and Dermocybin (Em, Derm), recently proposed as natural antibacterial photosensitizers able to act also against Gram‐negative microbes. The computational study has been performed considering the relative amount of neutral and ionic forms of each compound in water, with the variation of pH. The occurrence of both Type I and Type II photoreactions has been explored by computing the absorption properties of each species, the spin‐orbit coupling constants (SOC), the vertical ionization potentials and the vertical electron affinities. The most plausible deactivation channels leading to the population of excited triplet states have been proposed. Our data indicate Emodin as more active than Dermocybin in antimicrobial photodynamic therapy through the Type II mechanism. Our data support a dual Type I/II activity of the monomeric anthraquinones Emodin and Dermocybin in water, in all the considered protonation states.
Aarabi, Mahdi; Pandey, Ankit; Poirier, Bill
doi: 10.1002/jcc.27324 pmid: 38635333
In this work, the Crystal code, developed previously by the authors to find “holes” as well as legitimate transition states in existing potential energy surface (PES) functions [JPC Lett. 11, 6468 (2020)], is retooled to perform on‐the‐fly “direct dynamics”‐type PES explorations, as well as automatic construction of new PES functions. In all of these contexts, the chief advantage of Crystal over other methods is its ability to globally map the PES, thereby determining the most relevant regions of configuration space quickly and reliably, even when the dimensionality is rather large. Here, Crystal is used to generate a uniformly spaced grid of density functional theory (DFT) or ab initio points, truncated over the relevant regions, which can then be used to either (a) home in precisely on PES features such as minima and transition states, or (b) create a new PES function automatically, via interpolation. Proof of concept is demonstrated via application to three molecular systems: water (H2O), (reduced‐dimensional) methane (CH4), and methylene imine (CH2NH).
Maya, Josué; Malloum, Alhadji; Fifen, Jean Jules; Dhaouadi, Zoubeida; Fouda, Henri Paul Ekobena; Conradie, Jeanet
doi: 10.1002/jcc.27327 pmid: 38353541
In this paper, the authors propose using the quantum cluster equilibrium (QCE) theory to reinvestigate ammonia clusters in the liquid phase. Ammonia clusters ranging in size from monomer to hexadecamer were considered to simulate liquid ammonia in this approach. The cluster set used to model liquid ammonia is an ensemble of different structures of ammonia clusters. After a thorough search for representative configurations of ammonia clusters using the cluster search program ABCluster, the configurations were optimized at the MN15/6‐31++G(d,p) level of theory. These optimizations yield geometries and frequencies that serve as inputs for the Peacemaker code. The QCE study of this molecular system gives the liquid phase populations in a temperature range of 190–260 K, covering the temperatures from the melting point to the boiling point. The results show that the population of liquid ammonia comprises mainly the ammonia hexadecamer followed by pentadecamer, tetradecamer, and tridecamer. We noted that the small‐sized ammonia clusters do not contribute to the population of liquid ammonia. In addition, the thermodynamic properties obtained by the QCE theory, such as heat of vaporization, heat capacity, entropy, enthalpy, and free energies, have been compared to experiment, showing relatively good agreement in the gas phase but considerable discrepancies in the liquid phase, except for the density. Finally, based on the predicted population, we calculated the infrared spectrum of liquid ammonia at 215 K. The calculated infrared spectrum agrees qualitatively with experiment.
Lourenço, Maicon Pierre; Hostaš, Jiří; Bellinger, Colin; Tchagang, Alain; Salahub, Dennis R.
doi: 10.1002/jcc.27322 pmid: 38357973
Reinforcement learning (RL) methods have helped to define the state of the art in the field of modern artificial intelligence, particularly since the breakthrough involving AlphaGo and the discovery of novel algorithms. In this work, we present an RL method, based on Q‐learning, for the structural determination of adsorbate@substrate models in silico, where the minimization of the energy landscape resulting from adsorbate interactions with a substrate is made by actions on states (translations and rotations) chosen from an agent's policy. The proposed RL method is implemented in an early version of the reinforcement learning software for materials design and discovery (RLMaterial), developed in Python 3.x. RLMaterial interfaces with the deMon2k, DFTB+, ORCA, and Quantum Espresso codes to compute the adsorbate@substrate energies. The RL method was applied for the structural determination of (i) the amino acid glycine and (ii) 2‐amino‐acetaldehyde, both interacting with a boron nitride (BN) monolayer, (iii) host‐guest interactions between phenylboronic acid and β‐cyclodextrin and (iv) ammonia on naphthalene. Density functional tight binding calculations were used to build the complex search surfaces with a reasonably low computational cost for systems (i)–(iii) and DFT for system (iv). Artificial neural network and gradient boosting regression techniques were employed to approximate the Q‐matrix or Q‐table for better decision making (policy) on next actions. Finally, we have developed a transfer‐learning protocol within the RL framework that allows learning from one chemical system and transferring the experience to another, as well as from different DFT or DFTB levels.
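To make the Q‐learning idea concrete, here is a minimal tabular sketch on a toy one‐dimensional energy landscape. It is not RLMaterial code: the energies in `ENERGIES`, the two translation actions, the reward definition (energy decrease per move), and all function names and hyperparameters are assumptions made for illustration only. The real method also uses rotations, electronic-structure energies, and regression models in place of a plain table.

```python
import random

# Hypothetical energies at six discrete adsorbate positions (arbitrary units).
ENERGIES = [3.0, 2.0, 1.5, 0.2, 1.0, 2.5]
ACTIONS = (-1, +1)  # translate one grid step left or right

def step(state, action):
    """Apply a translation (clamped to the grid); reward is the energy decrease."""
    nxt = min(max(state + action, 0), len(ENERGIES) - 1)
    return nxt, ENERGIES[state] - ENERGIES[nxt]

def q_update(q, state, a_idx, reward, nxt, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward + gamma * max(q[nxt])
    q[state][a_idx] += alpha * (target - q[state][a_idx])

def train(episodes=200, steps=20, epsilon=0.2, seed=0):
    """Learn a Q-table with an epsilon-greedy policy from random start positions."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in ENERGIES]
    for _ in range(episodes):
        state = rng.randrange(len(ENERGIES))
        for _ in range(steps):
            if rng.random() < epsilon:
                a_idx = rng.randrange(len(ACTIONS))  # explore
            else:
                a_idx = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt, reward = step(state, ACTIONS[a_idx])
            q_update(q, state, a_idx, reward, nxt)
            state = nxt
    return q

q_table = train()
# Greedy translation chosen at each position after training.
policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q_table]
print(policy)
```

Replacing the table `q` with a fitted regressor over state and action features is the step the abstract refers to when it mentions neural network and gradient boosting approximations of the Q‐table.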