While deep learning has been revolutionary for machine learning, most modern deep learning models cannot represent their uncertainty, nor take advantage of the well-studied tools of probability theory. This has started to change following recent developments of tools and techniques combining Bayesian approaches with deep learning. The intersection of the two fields has received great interest from the community over the past few years, with the introduction of new deep learning models that take advantage of Bayesian techniques, as well as Bayesian models that incorporate deep learning elements [1-11]. In fact, the use of Bayesian techniques in deep learning can be traced back to the 1990s, in seminal works by Radford Neal [12], David MacKay [13], and Dayan et al. [14]. These gave us tools to reason about deep models' confidence, and achieved state-of-the-art performance on many tasks. However, earlier tools did not adapt when new needs arose (such as scalability to big data), and were consequently forgotten. Such ideas are now being revisited in light of new advances in the field, yielding many exciting new results.

Building on the success of last year's workshop, this workshop will again examine the advantages and disadvantages of such ideas, and will serve as a platform to host the recent flourishing of ideas using Bayesian approaches in deep learning and deep learning tools in Bayesian modelling. The program includes a mix of invited talks, contributed talks, and contributed posters. It is organised around five main themes: deep generative models, variational inference using neural network recognition models, practical approximate inference techniques in Bayesian neural networks, applications of Bayesian neural networks, and information theory in deep learning. Future directions for the field will be debated in a panel discussion.

Previous workshops:

Our 2016 workshop page is available here; videos from the 2016 workshop are available online as well.


Schedule


8.00 - 8.05 Opening remarks Yarin Gal (Oxford)
8.05 - 8.30 Invited talk Dustin Tran (Columbia) Why Aren't You Using Probabilistic Programming?
8.30 - 8.45 Contributed talk Tom Rainforth, Tuan Anh Le, Maximilian Igl, Chris J. Maddison and Frank Wood Tighter ELBOs are Not Necessarily Better
8.45 - 9.10 Invited talk Finale Doshi (Harvard University) Automatic Model Selection in BNNs with Horseshoe Priors
9.10 - 9.40 Special talk Max Welling (Amsterdam / Qualcomm) Deep Bayes for Distributed Learning, Uncertainty Quantification and Compression
9.40 - 10.00 Poster spotlights
10.00 - 10.55 Discussion over coffee and poster session
10.55 - 11.20 Invited talk Matt Hoffman (Google) Stochastic Gradient Descent as Approximate Bayesian Inference
11.20 - 11.35 Contributed talk Gintare Karolina Dziugaite and Daniel Roy Entropy-SG(L)D Optimizes the Prior of a (Valid) PAC-Bayes Bound
11.35 - 12.00 Invited talk Nal Kalchbrenner (Google DeepMind) Recent Advances in Autoregressive Generative Models
12.00 - 13.35 Lunch
13.35 - 14.00 Invited talk Russ Salakhutdinov (CMU/Apple) Deep Kernel Learning
14.00 - 14.15 Contributed talk Alexander Amini, Ava Soleimany, Sertac Karaman and Daniela Rus Spatial Uncertainty Sampling for End to End Control
14.15 - 14.40 Invited talk Meire Fortunato (Google DeepMind) Bayes by Backprop
14.40 - 15.35 Discussion over coffee and poster session
15.35 - 16.00 Invited talk Naftali (Tali) Tishby (The Hebrew University) How do the Deep Learning layers converge to the Information Bottleneck limit by Stochastic Gradient Descent?
16.00 - 17.00 Panel Session Panellists:
Finale Doshi-Velez
Zoubin Ghahramani
Yann LeCun
Max Welling
Yee Whye Teh
Ole Winther
Moderator: Neil Lawrence
17.00 - 19.00 Poster session

Accepted Abstracts

Authors Title
Mahdi Azarafrooz Doubly Stochastic Adversarial Autoencoder [paper]
Ashwin D'Cruz, Sebastian Nowozin and Bill Byrne Tradeoffs in Neural Variational Inference [paper]
Samuel L. Smith and Quoc V. Le A Bayesian Perspective on Generalization and Stochastic Gradient Descent [paper]
Chi Zhang, Jiasheng Tang, Hao Li, Cheng Yang, Shenghuo Zhu and Rong Jin An Asynchronous Variance Reduced Framework for Efficient Bayesian Deep Learning [paper]
Hengyuan Hu and Ruslan Salakhutdinov Learning Deep Generative Models With Discrete Latent Variables [paper]
Kira Kempinska and John Shawe-Taylor Adversarial Sequential Monte Carlo [paper]
Yunhao Tang and Alp Kucukelbir Variational Deep Q Network [paper]
Hanna Tseran and Tatsuya Harada Memory Augmented Neural Network with Gaussian Embeddings for One-Shot Learning [paper]
Joseph Marino, Yisong Yue and Stephan Mandt Iterative Inference Models [paper]
Matthias Bauer, Mateo Rojas-Carulla, Jakub Swiatkowski, Bernhard Schoelkopf and Richard E. Turner Discriminative k-shot learning using probabilistic models [paper]
Arya Pourzanjani, Richard Jiang and Linda Petzold Improving the Identifiability of Neural Networks for Bayesian Inference [paper]
Marco Federici, Karen Ullrich and Max Welling Improved Bayesian Compression [paper]
Maja Rudolph, Francisco Ruiz and David Blei Word2Net: Deep Representations of Language [paper]
Raanan Yehezkel Rohekar, Guy Koren, Shami Nisimov and Gal Novik Unsupervised Deep Structure Learning by Recursive Independence Testing [paper]
Eric Zhan, Stephan Zheng and Yisong Yue MAGnet: Generating Long-Term Multi-Agent Trajectories [paper]
Dimity Miller, Lachlan Nicholson, Feras Dayoub and Niko Sünderhauf Dropout Variational Inference Improves Object Detection in Open-Set Conditions [paper]
Patrick McClure and Nikolaus Kriegeskorte Robustly representing uncertainty through sampling in deep neural networks [paper]
Ian Dewancker, Jakob Bauer and Michael McCourt Sequential Preference-Based Optimization [paper]
Ryan Turner and Brady Neal How well does your sampler really work? [paper]
Kimin Lee, Honglak Lee, Kibok Lee and Jinwoo Shin Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples [paper]
Alexander Amini, Ava Soleimany, Sertac Karaman and Daniela Rus Spatial Uncertainty Sampling for End to End Control [paper]
Bin Liu, Lirong He, Shandian Zhe, Yingming Li and Zenglin Xu DeepCP: Nonlinear Tensor Decomposition as a Deep Generative Model [paper]
Cuong Nguyen, Yingzhen Li, Thang Bui and Richard Turner Variational Continual Learning in Deep Models [paper]
Stefan Webb, Adam Golinski, Robert Zinkov and Frank Wood Principled Inference Networks in Deep Generative Models [paper]
Subhadip Mukherjee, Debabrata Mahapatra and Chandra Sekhar Seelamantula DNNs for sparse coding and dictionary learning [paper]
Onur Ozdemir, Benjamin Woodward and Andrew Berlin Propagating Uncertainty in Multi-Stage Bayesian Convolutional Neural Networks with Application to Pulmonary Nodule Detection [paper]
Jonathan Gordon and Jose Miguel Hernandez-Lobato Bayesian Semisupervised Learning with Deep Generative Models [paper]
David Krueger, Chin-Wei Huang, Riashat Islam, Ryan Turner, Alexandre Lacoste and Aaron Courville Bayesian Hypernetworks [paper]
Guodong Zhang, Shengyang Sun and Roger Grosse Natural Gradient as Stochastic Variational Inference [paper]
Nick Pawlowski, Martin Rajchl and Ben Glocker Implicit Weight Uncertainty in Neural Networks [paper]
Ambrish Rawat, Martin Wistuba and Maria-Irina Nicolae Harnessing Model Uncertainty for Detecting Adversarial Examples [paper]
Xun Zheng, Manzil Zaheer, Amr Ahmed, Yuan Wang, Eric Xing and Alex Smola Particle MCMC for Latent LSTM Allocation [paper]
Yingzhen Li Approximate Gradient Descent for Training Implicit Generative Models [paper]
Aleksander Wieczorek, Mario Wieser, Damian Murezzan and Volker Roth Deep Copula Information Bottleneck [paper]
Chin-Wei Huang and Aaron Courville Sequentialized Sampling Importance Resampling and Scalable IWAE [paper]
Soumya Ghosh and Finale Doshi-Velez Model Selection in Bayesian Neural Networks via Horseshoe Priors [paper]
Sergey Tulyakov, Andrew Fitzgibbon and Sebastian Nowozin Hybrid VAE: Improving Deep Generative Models using Partial Observations [paper]
Hippolyt Ritter, Aleksandar Botev and David Barber A Scalable Laplace Approximation for Neural Networks [paper]
Jiri Hron, Alexander Matthews and Zoubin Ghahramani Two Problems with Variational Gaussian Dropout [paper]
Aritra Bhowmik, Aniruddha Adiga, Chandra Sekhar Seelamantula, Fabian Hauser, Jaroslaw Jacak and Bettina Heise Bayesian Deep Deconvolutional Neural Networks [paper]
Christopher Tegho, Pawel Budzianowski and Milica Gasic Uncertainty Estimates for Efficient Neural Network-based Dialogue Policy Optimisation [paper]
Abdul-Saboor Sheikh, Kashif Rasul, Andreas Merentitis and Urs Bergmann Stochastic Maximum Likelihood Optimization via Hypernetworks [paper]
Hugh Salimbeni and Marc Deisenroth Deeply Non-Stationary Gaussian Processes [paper]
Mohammad Emtiyaz Khan, Zuozhu Liu, Voot Tangkaratt and Yarin Gal Vprop: Variational Inference using RMSprop [paper]
Leonard Hasenclever, Jakub Tomczak, Rianne van den Berg and Max Welling Variational Inference with Orthogonal Normalizing Flow [paper]
Alex Lewandowski Batch Normalized Deep Kernel Learning for Weight Uncertainty [paper]
Gintare Karolina Dziugaite and Daniel Roy Entropy-SG(L)D Optimizes the Prior of a (Valid) PAC-Bayes Bound [paper]
Casper Kaae Sønderby, Ben Poole and Andriy Mnih Continuous Relaxation Training of Discrete Latent Variable Image Models [paper]
Tom Rainforth, Tuan Anh Le, Maximilian Igl, Chris J. Maddison and Frank Wood Tighter ELBOs are Not Necessarily Better [paper]
Jhosimar Arias Figueroa and Adín Ramírez Rivera Is Simple Better?: Revisiting Simple Generative Models for Unsupervised Clustering [paper]
Carlos Riquelme, George Tucker and Jasper Snoek Deep Bayesian Bandits Showdown [paper]
Dustin Tran, Yura Burda and Ilya Sutskever Generative Models for Alignment and Data Efficiency in Language [paper]
Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington and Jascha Sohl-Dickstein Deep Neural Networks as Gaussian Processes [paper]
Shengjia Zhao, Jiaming Song and Stefano Ermon A Lagrangian Perspective on Latent Variable Generative Modeling [paper]
Alexandre Lacoste, Thomas Boquet, Negar Rostamzadeh, Boris Oreshkin, Wonchang Chung and David Krueger Deep Prior [paper]
John Lambert, Ozan Sener and Silvio Savarese Deep Learning under Privileged Information [paper]
Rui Shu, Hung H. Bui and Stefano Ermon AC-GAN Learns a Biased Distribution [paper]
Daniel Flam-Shepherd, James Requeima and David Duvenaud Mapping Gaussian Process Priors to Bayesian Neural Networks [paper]
Matthew Hoffman, Carlos Riquelme and Matthew Johnson The Beta-VAE's Implicit Prior [paper]
Juan Camilo Gamboa Higuera, David Meger and Gregory Dudek Synthesizing Neural Network Controllers with Probabilistic Model-Based Reinforcement Learning [paper]
Tim G. J. Rudner and Dino Sejdinovic Inter-domain Deep Gaussian Processes [paper]
Brian Trippe and Richard Turner Conditional Density Estimation with Bayesian Normalizing Flows [paper]
Stanislav Fort Gaussian Prototypical Networks for Few-Shot Learning on Omniglot [paper]
Aditya Grover, Aaron Zweig and Stefano Ermon Graphite: Iterative Generative Modeling of Graphs [paper]
Peter Henderson, Thang Doan, Riashat Islam and David Meger Bayesian Policy Gradients via Alpha Divergence Dropout Inference [paper]
Maxime Voisin and Daniel Ritchie An Improved Training Procedure for Neural Autoregressive Data Completion [paper]
Chin-Wei Huang, David Krueger and Aaron Courville Facilitating Multimodality in Normalizing Flows [paper]
Luigi Malagò, Alexandra Peste and Septimia Sarbu An Explanatory Analysis of the Geometry of Latent Variables Learned by Variational Auto-Encoders [paper]

Call for papers

We invite researchers to submit work in any of the following areas:

  • deep generative models,
  • variational inference using neural network recognition models,
  • practical approximate inference techniques in Bayesian neural networks,
  • applications of Bayesian neural networks,
  • information theory in deep learning,
  • or any of the topics below.

A submission should take the form of an extended abstract (3 pages long this year) in PDF format, using the NIPS style. Author names do not need to be anonymised, and references may extend as far as needed beyond the 3-page upper limit. If the research has previously appeared in a journal, workshop, or conference (including the NIPS 2017 conference), the workshop submission should extend that previous work. Parallel submissions (such as to ICLR) are permitted.

Submissions will be accepted as contributed talks or poster presentations. Extended abstracts should be submitted by Friday 3 November 2017; the submission page is here. Final versions will be posted on the workshop website (and are archival, but do not constitute a proceedings).

Please note that you can still submit to the workshop even if you did not register for NIPS in time. NIPS has reserved 1200 workshop registrations for accepted workshop submissions. If your submission is accepted but you are not registered for the workshops, please contact us promptly.

Key Dates:

  • Extended abstract submission deadline: Friday 3 November 2017 (submission page is here)
  • Acceptance notification: 17 November 2017
  • Complimentary workshop registration award notification: 17 November 2017
  • Final paper submission: 1 December 2017
  • Workshop: 9 December 2017


Topics:

  • Probabilistic deep models for classification and regression (such as extensions and applications of Bayesian neural networks),
  • Generative deep models (such as variational autoencoders),
  • Incorporating explicit prior knowledge in deep learning (such as posterior regularization with logic rules),
  • Approximate inference for Bayesian deep learning (such as variational Bayes / expectation propagation / etc. in Bayesian neural networks),
  • Scalable MCMC inference in Bayesian deep models,
  • Deep recognition models for variational inference (amortized inference),
  • Model uncertainty in deep learning,
  • Bayesian deep reinforcement learning,
  • Deep learning with small data,
  • Deep learning in Bayesian modelling,
  • Probabilistic semi-supervised learning techniques,
  • Active learning and Bayesian optimization for experimental design,
  • Information theory in deep learning,
  • Kernel methods in Bayesian deep learning,
  • Implicit inference,
  • Applying non-parametric methods, one-shot learning, and Bayesian deep learning in general.


Complimentary workshop registration

Several NIPS 2017 complimentary workshop registrations will be awarded to authors of accepted workshop submissions. These will be announced by 17 November 2017. Award recipients will be reimbursed by NIPS for their workshop registration.

Sponsorship Travel Awards

We have secured four sponsorships, each donating two travel awards for junior researchers (8 in total). Each travel award is 700 USD and will be given to selected submissions based on reviewer recommendation. These will also be announced by 17 November 2017. We are deeply grateful to our sponsors: Google, Microsoft Ventures, Uber, and Qualcomm.


References

  1. Kingma, DP and Welling, M, "Auto-encoding variational Bayes", 2013.
  2. Rezende, D, Mohamed, S, and Wierstra, D, "Stochastic backpropagation and approximate inference in deep generative models", 2014.
  3. Blundell, C, Cornebise, J, Kavukcuoglu, K, and Wierstra, D, "Weight uncertainty in neural networks", 2015.
  4. Hernandez-Lobato, JM and Adams, R, "Probabilistic backpropagation for scalable learning of Bayesian neural networks", 2015.
  5. Gal, Y and Ghahramani, Z, "Dropout as a Bayesian approximation: Representing model uncertainty in deep learning", 2015.
  6. Gal, Y and Ghahramani, Z, "Bayesian convolutional neural networks with Bernoulli approximate variational inference", 2015.
  7. Kingma, D, Salimans, T, and Welling, M, "Variational dropout and the local reparameterization trick", 2015.
  8. Balan, AK, Rathod, V, Murphy, KP, and Welling, M, "Bayesian dark knowledge", 2015.
  9. Louizos, C and Welling, M, "Structured and efficient variational deep learning with matrix Gaussian posteriors", 2016.
  10. Lawrence, ND and Quinonero-Candela, J, "Local distance preservation in the GP-LVM through back constraints", 2006.
  11. Tran, D, Ranganath, R, and Blei, DM, "The variational Gaussian process", 2015.
  12. Neal, R, "Bayesian Learning for Neural Networks", 1996.
  13. MacKay, D, "A practical Bayesian framework for backpropagation networks", 1992.
  14. Dayan, P, Hinton, G, Neal, R, and Zemel, R, "The Helmholtz machine", 1995.
  15. Wilson, AG, Hu, Z, Salakhutdinov, R, and Xing, EP, "Deep kernel learning", 2016.
  16. Saatchi, Y and Wilson, AG, "Bayesian GAN", 2017.
  17. MacKay, DJC, "Bayesian Methods for Adaptive Models", PhD thesis, 1992.