@marcoct commented on May 19, 2021

Pending merge of probcomp/Gen.jl#417 into Gen.jl

(Currently the rws_mnist/ project depends on the branch for probcomp/Gen.jl#417, which adds support for multi-threaded gradient estimation and removes some unnecessary parameter allocations. Before this PR is merged, the branch of Gen used in rws_mnist/ should be changed to master.)

Some conclusions:

  • Gen can successfully train the 10-200-200 generative model (with stochastic hidden layers) and the associated inference network from https://arxiv.org/abs/1406.2751 on the binarized MNIST data in roughly a day or two, without a GPU and without vectorizing the model, using multi-threaded gradient estimation. The Gen implementation is considerably higher-level and easier to follow than this implementation in Theano.

  • In preliminary experiments, multi-threaded gradient estimation gives a significant speedup (>4x) for minibatches of size 16-32 on a c4.8xlarge EC2 instance. More thorough benchmarking, including on bare metal instances, would be helpful.

  • Profiling this benchmark, and optimizing Gen for it on large multi-core cloud instances, would be helpful, since the performance gains are likely to carry over to other relevant use cases, such as learning generative models and inference networks (perhaps with comparable or somewhat smaller neural networks) where the trace has stochastic structure. (Non-vectorized CPU-based gradient estimation is most compelling when the structure is highly stochastic, e.g. in Bayesian program synthesis, where vectorization is more difficult and the throughput advantage of a GPU is reduced.)
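
For readers unfamiliar with the training objective, the gradient estimates referred to above can be summarized as follows. This is a sketch of the reweighted wake-sleep estimators as described in https://arxiv.org/abs/1406.2751, not a transcription of the code in this PR. The notation ($p_\theta$ for the generative model, $q_\phi$ for the inference network, $K$ for the number of samples per datapoint) is an assumption introduced here for illustration.

```latex
% For one datapoint x, draw K samples from the inference network:
%   h^{(k)} \sim q_\phi(h \mid x), \quad k = 1, \dots, K
\begin{align}
  w_k &= \frac{p_\theta(x, h^{(k)})}{q_\phi(h^{(k)} \mid x)},
  \qquad
  \tilde{w}_k = \frac{w_k}{\sum_{j=1}^{K} w_j} \\
  % Importance-sampling estimate of the marginal likelihood
  p_\theta(x) &\approx \frac{1}{K} \sum_{k=1}^{K} w_k \\
  % Generative model update
  \nabla_\theta \log p_\theta(x) &\approx
    \sum_{k=1}^{K} \tilde{w}_k \, \nabla_\theta \log p_\theta(x, h^{(k)}) \\
  % Wake-phase inference network update
  \Delta\phi_{\text{wake}} &\propto
    \sum_{k=1}^{K} \tilde{w}_k \, \nabla_\phi \log q_\phi(h^{(k)} \mid x) \\
  % Sleep-phase inference network update, with (x, h) \sim p_\theta(x, h)
  \Delta\phi_{\text{sleep}} &\propto \nabla_\phi \log q_\phi(h \mid x)
\end{align}
```

Given fixed parameters, the estimate for each datapoint in a minibatch depends only on that datapoint's own samples, so the per-datapoint terms can be computed independently and summed. That independence is what makes it natural to spread a minibatch across threads.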

@marcoct changed the title from "Add reweighted wake sleep deep generative model example" to "Reweighted wake sleep deep generative model example" on May 19, 2021