Communication: Understanding molecular representations in machine learning: The role of uniqueness and target similarity

Huang, Bing and von Lilienfeld, O. Anatole. (2016) Communication: Understanding molecular representations in machine learning: The role of uniqueness and target similarity. Journal of Chemical Physics, 145 (16). p. 161102.

Official URL: http://edoc.unibas.ch/53277/

Abstract

The predictive accuracy of Machine Learning (ML) models of molecular properties depends on the choice of the molecular representation. Inspired by the postulates of quantum mechanics, we introduce a hierarchy of representations which meet uniqueness and target similarity criteria. To systematically control target similarity, we simply rely on interatomic many-body expansions, as implemented in universal force-fields, including Bonding, Angular (BA), and higher order terms. Addition of higher order contributions systematically increases similarity to the true potential energy and predictive accuracy of the resulting ML models. We report numerical evidence for the performance of BAML models trained on molecular properties pre-calculated at electron-correlated and density functional levels of theory for thousands of small organic molecules. Properties studied include enthalpies and free energies of atomization, heat capacity, zero-point vibrational energies, dipole moment, polarizability, HOMO/LUMO energies and gap, ionization potential, electron affinity, and electronic excitations. After training, BAML predicts energies or electronic properties of out-of-sample molecules with unprecedented accuracy and speed. Published by AIP Publishing.
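
The idea of building a representation from bonding terms first and angular terms second can be illustrated with a small sketch. This is not the BAML representation from the paper: the function names, the use of raw bond lengths and angles instead of force-field energy terms, and the sorting step are all assumptions made here for illustration only.

```python
import math
from itertools import combinations


def angle(a, center, b):
    """Return the angle a-center-b in radians."""
    v1 = [x - c for x, c in zip(a, center)]
    v2 = [x - c for x, c in zip(b, center)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_t)))


def bond_angle_features(coords, bonds):
    """Hypothetical two-level feature set: bond lengths, then bond angles.

    coords: list of (x, y, z) tuples, one per atom.
    bonds:  list of (i, j) index pairs for bonded atoms.
    Returns (bond_terms, angle_terms), each sorted so the result does
    not depend on the order in which atoms or bonds are listed.
    """
    # Lowest-order (two-body) terms: one length per bond.
    bond_terms = sorted(math.dist(coords[i], coords[j]) for i, j in bonds)

    # Build a neighbor list so angles can be enumerated around each atom.
    neighbors = {}
    for i, j in bonds:
        neighbors.setdefault(i, []).append(j)
        neighbors.setdefault(j, []).append(i)

    # Next-order (three-body) terms: one angle per pair of bonds
    # that share a central atom.
    angle_terms = sorted(
        angle(coords[a], coords[c], coords[b])
        for c, nbrs in neighbors.items()
        for a, b in combinations(sorted(nbrs), 2)
    )
    return bond_terms, angle_terms


# Toy three-atom geometry (arbitrary units): central atom 0, bonded to 1 and 2.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
bond_terms, angle_terms = bond_angle_features(coords, [(0, 1), (0, 2)])
```

In this sketch, truncating after `bond_terms` corresponds to stopping the expansion at the bonding level, and appending `angle_terms` corresponds to adding the next order. A faithful implementation would replace the raw lengths and angles with the corresponding force-field contributions described in the paper.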
Faculties and Departments: 05 Faculty of Science > Departement Chemie > Former Organization Units Chemistry > Physikalische Chemie (Lilienfeld)
UniBasel Contributors: von Lilienfeld, Anatole
Item Type: Article, refereed
Article Subtype: Research Article
Publisher: AIP Publishing
ISSN: 0021-9606
e-ISSN: 1089-7690
Note: Publication type according to Uni Basel Research Database: Journal article
Language: English
Identification Number:
edoc DOI:
Last Modified: 24 Apr 2017 09:32
Deposited On: 25 Jan 2017 14:34
