Adaptive Latent Space Tuning for Non-Stationary Distributions

May 7, 2021
Abstract (arXiv):
Powerful deep learning tools, such as convolutional neural networks (CNNs), are able to learn the input-output relationships of large, complicated systems directly from data. Encoder-decoder deep CNNs can extract features directly from images, mix them with scalar inputs within a general low-dimensional latent space, and then generate new complex 2D outputs that represent complex physical phenomena. One important challenge faced by deep learning methods is large non-stationary systems whose characteristics change so quickly with time that re-training is not feasible. In this paper we present a method for adaptive tuning of the low-dimensional latent space of deep encoder-decoder style CNNs based on real-time feedback, to quickly compensate for unknown and fast distribution shifts. We demonstrate our approach by predicting the properties of a time-varying charged particle beam in a particle accelerator whose components (accelerating electric fields and focusing magnetic fields) are themselves quickly changing with time.
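
The abstract describes the architecture only at a high level. As a concrete illustration, below is a minimal PyTorch sketch of an encoder-decoder CNN whose latent vector is mixed with scalar inputs and can be shifted additively, together with a model-independent, extremum-seeking-style feedback loop that tunes that shift from a measured error signal. All layer sizes, the `measure_error` diagnostic callback, and the dither-based update law are assumptions made for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Hypothetical encoder-decoder CNN with a tunable low-dim latent space.

    Encodes a 1x64x64 input image to a small latent vector, concatenates
    scalar system inputs (e.g. accelerator settings) into the latent space,
    and decodes a 2D output image. Layer sizes are illustrative assumptions.
    """

    def __init__(self, latent_dim=8, n_scalars=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + n_scalars, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),     # 32 -> 64
        )

    def forward(self, image, scalars, latent_shift=None):
        z = self.encoder(image)
        if latent_shift is not None:  # adaptive correction applied here
            z = z + latent_shift
        return self.decoder(torch.cat([z, scalars], dim=1))


def adapt_latent_shift(model, image, scalars, measure_error,
                       latent_dim=8, steps=200, dt=0.1, k=1.0, alpha=0.1):
    """Feedback tuning of an additive latent offset, weights frozen.

    `measure_error(prediction)` is a hypothetical callback returning a
    scalar cost from a real-time diagnostic (e.g. mismatch between the
    predicted and measured beam image). The dithered update below is an
    extremum-seeking-style law, used here purely as an illustration of
    real-time feedback; the paper's actual update rule may differ.
    """
    z_shift = torch.zeros(1, latent_dim)
    # Distinct dither frequencies so each latent component is identifiable.
    w = torch.tensor([10.0 * (1.0 + 0.05 * i) for i in range(latent_dim)])
    with torch.no_grad():
        for n in range(steps):
            t = n * dt
            pred = model(image, scalars, latent_shift=z_shift)
            cost = float(measure_error(pred))  # scalar, lower is better
            # The cost-dependent phase k*cost steers the averaged motion
            # of each component downhill in the measured cost.
            z_shift += dt * torch.sqrt(alpha * w) * torch.cos(w * t + k * cost)
    return z_shift

# Usage (hypothetical): shift = adapt_latent_shift(model, img, s, err_fn)
```

Because the feedback adjusts only a handful of latent components rather than the network weights, each update is cheap enough for real-time use; this is the motivation for tuning in the latent space instead of re-training the full model.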