EDITORIAL

Freshly Printed

Symbolic Regression for Knowledge Discovery: Bloat, Overfitting, and Variable Interaction Networks

Dipl.-Ing. Dr. Gabriel Kronberger

This work describes an approach to data analysis, based on symbolic regression and genetic programming, that produces an overall view of the dependencies among all variables of a system. The identified dependencies are represented in the form of a variable interaction network.

In the first part of this work, this approach is described in detail. Important issues are the prevention of bloat and overfitting, the simplification of models, and the identification of relevant input variables. In this context, different methods for bloat control are presented and compared. In addition, a novel way to detect and reduce overfitting is presented and analyzed.

The second part of this work demonstrates how comprehensive symbolic regression can be applied to the analysis of real-world systems. Variable interaction networks for a blast furnace process and an industrial chemical process are presented and discussed. Additionally, the same approach is applied to an economic data set to identify macro-economic dependencies.

Gabriel Kronberger: Symbolic Regression for Knowledge Discovery: Bloat, Overfitting, and Variable Interaction Networks - 1st Edition 2011, 214 pages, A5, paperback, ISBN 978-3-85499-875-4

SIGEVOlution Volume 5, Issue 4