Adaptive Latent Space Tuning for Non-Stationary Distributions

May 7, 2021
Abstract (arXiv):
Powerful deep learning tools, such as convolutional neural networks (CNNs), are able to learn the input-output relationships of large, complicated systems directly from data. Encoder-decoder deep CNNs can extract features directly from images, mix them with scalar inputs within a general low-dimensional latent space, and then generate new complex 2D outputs that represent complex physical phenomena. One important challenge faced by deep learning methods is large non-stationary systems whose characteristics change so quickly with time that re-training is not feasible. In this paper we present a method for adaptive tuning of the low-dimensional latent space of deep encoder-decoder style CNNs based on real-time feedback, to quickly compensate for unknown and fast distribution shifts. We demonstrate our approach by predicting the properties of a time-varying charged particle beam in a particle accelerator whose components (accelerating electric fields and focusing magnetic fields) are themselves quickly changing with time.
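
The abstract describes the architecture only at a high level. As a concrete illustration, below is a minimal PyTorch sketch of an encoder-decoder CNN whose latent vector is mixed with scalar inputs and can be shifted additively, together with a model-independent, extremum-seeking-style feedback loop that tunes that shift from a measured error signal. All layer sizes, the `measure_error` diagnostic callback, and the dither-based update law are assumptions made for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Hypothetical encoder-decoder CNN with a tunable low-dim latent space.

    Encodes a 1x64x64 input image to a small latent vector, concatenates
    scalar system inputs (e.g. accelerator settings) into the latent space,
    and decodes a 2D output image. Layer sizes are illustrative assumptions.
    """

    def __init__(self, latent_dim=8, n_scalars=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + n_scalars, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),     # 32 -> 64
        )

    def forward(self, image, scalars, latent_shift=None):
        z = self.encoder(image)
        if latent_shift is not None:  # adaptive correction applied here
            z = z + latent_shift
        return self.decoder(torch.cat([z, scalars], dim=1))


def adapt_latent_shift(model, image, scalars, measure_error,
                       latent_dim=8, steps=200, dt=0.1, k=1.0, alpha=0.1):
    """Feedback tuning of an additive latent offset, weights frozen.

    `measure_error(prediction)` is a hypothetical callback returning a
    scalar cost from a real-time diagnostic (e.g. mismatch between the
    predicted and measured beam image). The dithered update below is an
    extremum-seeking-style law, used here purely as an illustration of
    real-time feedback; the paper's actual update rule may differ.
    """
    z_shift = torch.zeros(1, latent_dim)
    # Distinct dither frequencies so each latent component is identifiable.
    w = torch.tensor([10.0 * (1.0 + 0.05 * i) for i in range(latent_dim)])
    with torch.no_grad():
        for n in range(steps):
            t = n * dt
            pred = model(image, scalars, latent_shift=z_shift)
            cost = float(measure_error(pred))  # scalar, lower is better
            # The cost-dependent phase k*cost steers the averaged motion
            # of each component downhill in the measured cost.
            z_shift += dt * torch.sqrt(alpha * w) * torch.cos(w * t + k * cost)
    return z_shift

# Usage (hypothetical): shift = adapt_latent_shift(model, img, s, err_fn)
```

Because the feedback adjusts only a handful of latent components rather than the network weights, each update is cheap enough for real-time use; this is the motivation for tuning in the latent space instead of re-training the full model.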