tensorflow==2.1.0
numpy==1.16.4
absl_py==0.7.0
matplotlib==2.2.3
pandas==0.23.4
Pillow==6.1.0
- To install all the dependencies, run
 
pip install -r requirements.txt
- To download the CUB 200 dataset, run the data_download.py file

python data_download.py
- Download the Char-RNN-CNN embeddings (download link) and unzip them in place.
 
unzip birds.zip
- The model.py file contains the bare minimum code to run the Stage 1 and Stage 2 architectures. It automatically saves the weights once the specified (or default) number of epochs has completed. Note that the weights are stored at the same directory level as model.py.

python model.py
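
On systems without an unzip command, the embeddings archive from the step above can be extracted from Python instead. The following is a minimal sketch using only the standard library; the helper name unzip_in_place is hypothetical and not part of this repository, and birds.zip is assumed to be in the current directory.

```python
import zipfile
from pathlib import Path


def unzip_in_place(archive_path):
    """Extract a zip archive into the directory that contains it.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    archive = Path(archive_path)
    with zipfile.ZipFile(archive) as zf:
        # Extract next to the archive, mirroring `unzip birds.zip`.
        zf.extractall(archive.parent)
    return archive.parent


# Example call (assumes birds.zip has already been downloaded):
# unzip_in_place("birds.zip")
```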
- Stage 1
  - Text Encoder Network
    - Text description to a 1024 dimensional text embedding
    - Learning Deep Representations of Fine-Grained Visual Descriptions Arxiv Link
  - Conditioning Augmentation Network
    - Adds randomness to the network
    - Produces more image-text pairs
  - Generator Network
  - Discriminator Network
  - Embedding Compressor Network
  - Outputs a 64x64 image
 
- Stage 2
  - Text Encoder Network
  - Conditioning Augmentation Network
  - Generator Network
  - Discriminator Network
  - Embedding Compressor Network
  - Outputs a 256x256 image
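
The Conditioning Augmentation Network used in both stages "adds randomness" by sampling the conditioning vector from a Gaussian whose mean and standard deviation are computed from the text embedding, as described in the StackGAN paper. The sketch below shows that sampling step and the matching KL regularizer in plain Python so it runs without TensorFlow. It is not taken from model.py: the function names are hypothetical, and parameterizing the standard deviation through log_sigma is an assumption made here for numerical convenience.

```python
import math
import random


def conditioning_augmentation(mu, log_sigma, rng=random):
    """Sample c = mu + sigma * eps, with eps ~ N(0, 1) drawn per dimension.

    mu and log_sigma are assumed to come from a dense layer applied to the
    text embedding; here they are passed in as plain lists of floats.
    """
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu, log_sigma)]


def kl_to_standard_normal(mu, log_sigma):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over dimensions."""
    return sum(
        0.5 * (math.exp(2.0 * s) + m * m - 1.0) - s
        for m, s in zip(mu, log_sigma)
    )


# The same (mu, log_sigma) gives a different conditioning vector on each
# call, which is how one caption can be paired with many sampled conditions.
# rng = random.Random(0)
# c = conditioning_augmentation([0.5, -1.0], [0.0, 0.0], rng)
```

Because every call draws fresh noise, each text embedding yields a slightly different conditioning vector, which is what effectively produces more image-text pairs during training.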
 
 
- StackGAN: Text to photo-realistic image synthesis [Arxiv Link]
- Improved Techniques for Training GANs [Arxiv Link]
- Generative Adversarial Text to Image Synthesis [Arxiv Link]
- Learning Deep Representations of Fine-Grained Visual Descriptions [Arxiv Link]
 
This is the code I submitted to TensorFlow for Google Summer of Code. Hence the attribution and the license are for "TensorFlow Authors" and not "Vishal V". This code is under the MIT License.
