Big Data of Materials Science: Critical Role of the Descriptor

Nov 26, 2014
5 pages
Published in:
  • Phys.Rev.Lett. 114 (2015) 10, 105503
  • Published: Mar 10, 2015
e-Print:

Abstract: (APS)
Statistical learning of materials properties or functions so far starts with a largely silent, nonchallenged step: the choice of the set of descriptive parameters (termed descriptor). However, when the scientific connection between the descriptor and the actuating mechanisms is unclear, the causality of the learned descriptor-property relation is uncertain. Thus, a trustful prediction of new promising materials, identification of anomalies, and scientific advancement are doubtful. We analyze this issue and define requirements for a suitable descriptor. For a classic example, the energy difference of zinc blende or wurtzite and rocksalt semiconductors, we demonstrate how a meaningful descriptor can be found systematically.
  • 61.50.-f
  • 02.60.Ed
  • 71.15.Mb
  • 89.20.Ff
  • [1]
    • J.A. van Vechten
      • Phys.Rev. 182 (1969) 891
  • [3]
    • A. Zunger
      • Phys.Rev.B 22 (1980) 5839
  • [4]
    • D.G. Pettifor
      • Solid State Commun. 51 (1984) 31
  • [5]
    • Y. Saad, D. Gao, T. Ngo, S. Bobbitt, J.R. Chelikowsky et al.
      • Phys.Rev.B 85 (2012) 104104
  • [6]
    • C.C. Fischer, K.J. Tibbetts, D. Morgan, G. Ceder
      • Nature Materials 5 (2006) 641
  • [7]
    • K. Rajan
      • Annu. Rev. Mater. Res. 38, 299
  • [8]
    • G. Pilania, C. Wang, X. Jiang, S. Rajasekaran, R. Ramprasad
      • Scientific Reports 3, 2810
  • [9]
    • L.J. Nelson, G.L.W. Hart, F. Zhou, V. Ozoliņš
      • Phys.Rev.B 87 (2013) 035125
  • [10]
    • T. Mueller, E. Johlin, J.C. Grossman
      • Phys.Rev.B 89 (2014) 115202
  • [11]
    • M. Rupp, A. Tkatchenko, K.-R. Müller, O.A. von Lilienfeld
      • Phys.Rev.Lett. 108 (2012) 058301
  • [12]
    • T. Hastie, R. Tibshirani, J. Friedman
      • The Elements of Statistical Learning (New York)
  • [13]
    • R. Tibshirani
      • J. Royal Statist. Soc. B 58, 267
  • [14]
    • V. Blum, R. Gehrke, F. Hanke, P. Havu, V. Havu et al.
      • Comput.Phys.Commun. 180 (2009) 2175
  • [15]
    The NoMaD (Novel Materials Discovery) repository contains full input and output files of calculations in materials science:
  • [16]
    • S. Arora, B. Barak
      • Computational Complexity: A Modern Approach (Cambridge University Press, Cambridge)
  • [17]
    • I. Guyon, A. Elisseeff
      • J. Mach. Learn. Res. 3, 1157
  • [18]
    • L.M. Ghiringhelli, J. Vybiral, S. Levchenko, C. Draxl, M. Scheffler
      • to be published
  • [19]
    Two columns of D are correlated if the absolute value of their Pearson’s correlation index is (about) 1
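
The correlation criterion in note [19] can be illustrated with a short sketch. It is not taken from the paper: the function name `correlated_column_pairs`, the tolerance `tol` (one possible reading of "(about) 1"), and the toy matrix `D` are all assumptions made for illustration.

```python
import numpy as np

def correlated_column_pairs(D, tol=0.01):
    """Return index pairs (i, j), i < j, whose columns satisfy |r| >= 1 - tol.

    r is Pearson's correlation index between columns i and j of D.
    `tol` is a hypothetical threshold standing in for "(about) 1".
    """
    # rowvar=False treats each column of D as one variable.
    r = np.corrcoef(D, rowvar=False)
    n = r.shape[0]
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if abs(r[i, j]) >= 1.0 - tol
    ]

# Toy matrix (made-up numbers): column 1 is -2 times column 0,
# so the two are perfectly anti-correlated; column 2 is unrelated.
D = np.array([
    [1.0, -2.0, 0.5],
    [2.0, -4.0, 0.1],
    [3.0, -6.0, 0.9],
    [4.0, -8.0, 0.2],
])

pairs = correlated_column_pairs(D)
print(pairs)
```

Taking the absolute value means that anti-correlated columns (index close to -1) are flagged as well as positively correlated ones, which matches the wording of the note.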