BCCMS  /  2019  /  CECAM OUTBOX  /  Description

Further details

The field of machine learning (ML) is already having a rapid and substantial impact at the interfaces of the traditional disciplines of chemistry, physics, biology and materials science. Its ability to use existing examples to rapidly make meaningful predictions for new cases offers a new way to screen wide ranges of structures and to estimate the results of highly accurate methods at much reduced cost. However, several issues require careful thought when deploying these tools. Firstly, reproducibility in the training of models is a topic of active debate, and within the last year calls for more physically based approaches have begun to appear. Secondly, the explainability and explicability of predictions also matter, particularly for some of the more powerful ML methods. Finally, there are problems with extending models: learning new cases tends to overwrite existing expertise, and predicting properties and responses outside the scope of the original model is not usually possible.

A counterpoint to these methods is the experience of the past 20 years with approximate quantum mechanical methods [1], which are now an essential part of the computational toolkit for a solid atomistic understanding of a broad range of physical, chemical and biological problems, for both large and challenging systems. These methods are parameterized, but can provide a clear physical understanding of complex structures and processes. Additionally, they can readily be extended to calculate properties and systems outside their original parameterization and fitting sets. However, this commonly comes at the cost of substantial human effort to parameterize and test these models, which offers substantial opportunities for ML.

Semi-empirical molecular orbital methods come in many forms (MNDO, AM1, PM3, OMx, …). Although they are (usually) less accurate on average than DFT and ab initio methods, their main advantage is greater computational efficiency: they can be 2-3 orders of magnitude faster than DFT (and Hartree-Fock) with medium-sized basis sets. This enables access to larger system sizes and/or sufficiently long sampling times for more meaningful molecular dynamics. The self-consistent-charge density-functional based tight-binding method (DFTB) provides a special bridge between full density functional theory (DFT) and faster approximate methods. It is derived by careful approximation and parameterization of DFT interactions [2] and, with care, can approach DFT results to near chemical accuracy [3, 4]. DFTB can also readily adopt methods developed for DFT, including spin polarization for collinear and non-collinear magnetism [5, 6], time-dependent methods for excited states [7, 8], both as linear response and in the time domain, dispersion corrections for weak interactions [9], LDA+U correlation treatment [10], various QM/MM coupling methods [3], implicit solvents [11] and non-equilibrium Green's function DFTB for charge transport [12]. This makes the method attractive for a broad range of applications in diverse fields of science. At the same time, further systematic corrections yield considerable improvements in accuracy for the description of ground-state and excited-state properties [13] and in the construction of electronic DFTB integrals throughout the periodic table [14]. Over the last decade there have also been a number of attempts to automate the DFTB parameterization process [15], including recent machine learning approaches [16, 17, 18] that generate and improve parameters or apply them in multiscale/multimethod models.

 

State of the Art

DFTB

The DFTB approach provides modular components within other academic and/or commercial software products, including DFTB+ [19], ADF [20], ATK [21], deMon [22], Gaussian [23] and Materials Studio [24], and within several MM force-field tools, e.g. CHARMM [25]. This considerably widens the reach of the method to potential users, both in academic settings and in industrial R&D. Overviews of the range of DFTB developments and extensions can be found in the special issues of the Journal of Physical Chemistry A 111, Number 26 (2007) and Physica Status Solidi b 249, Issue 2 (2012).

The most recent DFTB developers' meeting was held in November 2016 to report on and discuss the present status of DFTB developments in the different software products, and to join forces for further improvements in accuracy, parameterization of new systems and extensions of functionality.

Trends in Machine Learning

The Journal of Chemical Physics has recently published a special issue on "Data-enabled theoretical chemistry", which provides a comprehensive contemporary view of the field with over 40 contributions from leading scientists actively working on the integration of modern machine learning techniques into quantum chemistry [26]. The issue was motivated by preceding successes in the field, such as the systematic fitting of potential energies for molecular dynamics simulations or vibrational spectroscopy [27, 28]. As also reviewed recently [29], laws of physics have been rediscovered with ML [30], atomization energies and other electronic ground-state properties of organic molecules can now be predicted with hybrid DFT accuracy [31], and clusters can be identified [32] and compounds mapped [33]. ML can also be used to discover new molecules [34] or crystals [35], and even new reactions [36]. Various properties and systems have been studied with ML, including electrons [37], chemical potentials [38], ionic forces [39] and NMR shifts [40]. By now, neural networks and Gaussian processes have demonstrably surpassed DFT accuracy in the prediction of electronic ground-state properties of organic materials [41]. Efforts to further improve and assess ML models for application throughout compositional space are ongoing [42]. When it comes to the improvement of well-established QM methods, however, ML-based investigations such as those in Ref. [43] are sparse.

 

References

[1] H. M. Senn and W. Thiel, Top. Curr. Chem. 268 (2007) 173.

[2] M. Elstner, D. Porezag, G. Jungnickel, J. Elsner, Phys. Rev. B, 58 (1998) 7260.

[3] Q. Cui, M. Elstner, T. Frauenheim, M. Karplus et al., J. Phys. Chem. B 105 (2001) 569.

[4] A. Dominguez, B. Aradi, T. Frauenheim, V. Lutsker, T. A. Niehaus, J. Chem. Theory Comput. 9 (2013) 4901.

[5] C. Koehler, G. Seifert, T. Frauenheim, Chem. Phys. 309 (2005) 23.

[6] C. Köhler, Th. Frauenheim, B. Hourahine et al., J. Phys. Chem. A 111 (2007) 5622.

[7] T. A. Niehaus, S. Suhai, F. Della Sala, P. Lugli et al., Phys. Rev. B, 63 (2001) 085108.

[8] T. A. Niehaus, J. Mol. Str. THEOCHEM, 914 (2009) 38.

[9] M. Elstner, P. Hobza, T. Frauenheim et al., J. Chem. Phys., 114 (2001) 5149.

[10] B. Hourahine, S. Sanna, B. Aradi, C. Koehler, T.A. Niehaus, T. Frauenheim, J. Phys. Chem. A 111 (2007) 5671.

[11] G. Hou, X. Zhu and Q. Cui, J. Chem. Theory Comput. 6 (2010) 2303.

[12] J. Reimers, G. Solomon, A. Gagliardi et al., J. Phys. Chem. A 111 (2007) 5692.

[13] A. Dominguez, B. Aradi, T. Frauenheim, V. Lutsker, T. A. Niehaus, J. Chem. Theory Comput. 9 (2013) 4901.

[14] M. Wahiduzzaman, A. F. Oliveira, P. Philipsen, L. Zhechkov, E. van Lenthe, H. A. Witek, T. Heine, J. Chem. Theory Comput. 9 (2013) 4006.

[15] J. M. Knaup, B. Hourahine and Th. Frauenheim, J. Phys. Chem. A 111 (2007) 5637; M. Gaus, C.-P. Chou, H. Witek, M. Elstner, J. Phys. Chem. A 113 (2009) 11866; Z. Bodrog, B. Aradi and T. Frauenheim, J. Chem. Theory Comput. 7 (2011) 2654; M. Doemer, E. Liberatore, J. M. Knaup, I. Tavernelli, U. Rothlisberger, Molecular Physics 111 (2013) 3595; M. P. Lourenço, M. C. da Silva, A. F. Oliveira, M. C. Quintão, H. A. Duarte, Theoretical Chem. Accounts 135 (2016) 11; C.-P. Chou, Y. Nishimura, C.-C. Fan, G. Mazur, S. Irle, H. A. Witek, J. Chem. Theory Comput. 12 (2016) 53.

[16] J. J. Kranz, M. Kubillus, R. Ramakrishnan, O. A. von Lilienfeld, and M. Elstner, J. Chem. Theory Comput. 14 (2018) 2341.

[17] A. W. Huran, C. Steigemann, T. Frauenheim, B. Aradi, and M. A. L. Marques, J. Chem. Theory Comput. 14 (2018) 2947.

[18] L. Shen and W. Yang, J. Chem. Theory Comput. 14 (2018) 1442.

[19] https://www.dftb.org

[20] https://www.scm.com/product/adf/

[21] http://www.quantumwise.com/documents/tutorials/ATK-11.8/DFTB/index.html/

[22] http://demon-nano.ups-tlse.fr/

[23] http://www.gaussian.com/g_tech/g_ur/k_dftb.htm

[24] http://accelrys.com/products/materials-studio/quantum-and-catalysis-software.html

[25] http://www.charmm.org/documentation/c37b1/sccdftb.html

[26] J. Chem. Phys., volume 148, issue 24 (2018).

[27] J. Behler and M. Parrinello, Phys. Rev. Lett. 98 (2007) 146401.

[28] A. P. Bartok, M. C. Payne, R. Kondor, and G. Csanyi, Phys. Rev. Lett. 104 (2010) 136403.

[29] O. A. von Lilienfeld, Angew. Chem. Int. Ed. 57 (2018) 4164.

[30] M. Schmidt and H. Lipson, Science 324 (2009) 81.

[31] M. Rupp, A. Tkatchenko, K.-R. Mueller, and O. A. von Lilienfeld, Phys. Rev. Lett. 108 (2012) 058301; G. Montavon, M. Rupp, V. Gobre, A. Vazquez-Mayagoitia, K. Hansen, A. Tkatchenko, K-R. Mueller, O. A. von Lilienfeld, New J. Phys. 15 (2013) 095003.

[32] A. Rodriguez and A. Laio, Science 344 (2014) 1492.

[33] S. De, F. Musil, T. Ingram, C. Baldauf, and M. Ceriotti, J. Cheminf. 9 (2017) 6.

[34] E. O. Pyzer-Knapp, K. Li, and A. Aspuru-Guzik, Adv. Fun. Mat. 25 (2015) 6495.

[35] F. A. Faber, A. Lindmaa, O. A. von Lilienfeld, and R. Armiento, Phys. Rev. Lett. 117 (2016) 135502.

[36] P. Raccuglia, K. C. Elbert, P. D. F. Adler, C. Falk, M. B. Wenny, A. Mollo, M. Zeller, S. A. Friedler, J. Schrier, and A. J. Norquist, Nature 533 (2016) 73.

[37] G. Carleo and M. Troyer, Science 355 (2017) 602.

[38] K. T. Schutt, F. Arbabzadah, S. Chmiela, K. R. Muller, and A. Tkatchenko, Nat. Commun. 8 (2017) 13890.

[39] S. Chmiela, A. Tkatchenko, H. E. Sauceda, I. Poltavsky, K. T. Schutt, and K.-R. Muller, Sci. Adv. 3 (2017) e1603015.

[40] M. Rupp, R. Ramakrishnan, O. A. von Lilienfeld, J. Phys. Chem. Lett. 6 (2015) 3309; F. M. Paruzzo, A. Hofstetter, F. Musil, S. De, M. Ceriotti, L. Emsley, https://arxiv.org/abs/1805.11541 (2018).

[41] F. A. Faber, L. Hutchison, B. Huang, J. Gilmer, S. S. Schoenholz, G. E. Dahl, O. Vinyals, S. Kearnes, P. F. Riley, and O. A. von Lilienfeld, J. Chem. Theory Comput. 13 (2017) 5255.

[42] B. Huang and O. A. von Lilienfeld, arXiv preprint arXiv:1707.04146 (2017); F. Faber, A. Christensen, B. Huang, O. A. von Lilienfeld, J. Chem. Phys. 148 (2018) 241717; K. T. Schuett, H. E. Sauceda, P.-J. Kindermans, A. Tkatchenko, and K.-R. Mueller, J. Chem. Phys. 148 (2018) 241722.

[43] R. Ramakrishnan, P. O. Dral, M. Rupp, O. A. von Lilienfeld, J. Chem. Theory Comput. 11 (2015) 2087; P. Dral, O. A. von Lilienfeld, W. Thiel, J. Chem. Theory Comput. 11 (2015) 2120; T. Bereau, R. A. DiStasio, A. Tkatchenko, O. A. von Lilienfeld, J. Chem. Phys. 148 (2018) 241706.