Image Captioning is the task of automatically generating a textual description for an image. It combines a CNN (to extract visual features) with Natural Language Processing (to generate the caption).
The entire code is in the Jupyter notebook, which should make it easier to follow.
Dependencies
1. Keras 2.3.1
2. Tensorflow-gpu 2.2.0
3. tqdm
4. numpy
5. pandas
6. matplotlib
7. pickle
8. PIL (Pillow)
9. glob
Important: this code is implemented using Tensorflow-gpu, so you need an NVIDIA GPU with the corresponding drivers installed.
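A quick way to confirm that TensorFlow can actually see your GPU before training:

```python
import tensorflow as tf

# Lists the GPUs visible to TensorFlow; an empty list means the
# driver/CUDA setup is not working and training will fall back to CPU.
print(tf.config.list_physical_devices('GPU'))
```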
I have used the Flickr8k dataset (about 1 GB). MS-COCO and Flickr30k are other datasets that you can use. Flickr8k contains:
1. training images: 6,000
2. validation images: 1,000
3. test images: 1,000
Each image has 5 captions describing it (see the parsing sketch below).
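As a reference point, here is a minimal sketch of how the caption file can be parsed. The file name `Flickr8k.token.txt` and its tab-separated `image.jpg#n<TAB>caption` layout follow the standard Flickr8k distribution; adjust if your copy differs:

```python
from collections import defaultdict

def load_captions(token_file):
    """Map each image file name to its list of 5 captions."""
    captions = defaultdict(list)
    with open(token_file, 'r') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            image_id, caption = line.split('\t')
            image_id = image_id.split('#')[0]  # drop the '#0'..'#4' suffix
            captions[image_id].append(caption)
    return captions

# captions = load_captions('Flickr8k.token.txt')  # ~8,000 images x 5 captions
```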
In Image Captioning, a CNN is used to extract the features from an image, which are then fed, along with the captions, into an RNN. To extract the features, we use a model pre-trained on ImageNet. I tried out VGG-16, ResNet-50, and InceptionV3. VGG-16 has about 138 million parameters and its top-5 error on ImageNet is 7.3%. InceptionV3 has about 24 million parameters and its top-5 error on ImageNet is 3.46%. For reference, human top-5 error on ImageNet is 5.1%.
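As an illustration, below is a minimal feature-extraction sketch using the headless InceptionV3 from Keras; `pooling='avg'` collapses the final feature map into a single 2048-d vector per image. The exact extraction code in the notebook may differ:

```python
import numpy as np
from keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.preprocessing.image import load_img, img_to_array

# Headless InceptionV3: global average pooling turns the last
# convolutional feature map into one 2048-d vector per image.
feature_extractor = InceptionV3(weights='imagenet', include_top=False,
                                pooling='avg')

def extract_features(image_path):
    img = load_img(image_path, target_size=(299, 299))  # InceptionV3 input size
    x = img_to_array(img)
    x = np.expand_dims(x, axis=0)        # add the batch dimension
    x = preprocess_input(x)              # scale pixels to [-1, 1]
    return feature_extractor.predict(x)  # shape: (1, 2048)
```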
To create the model, the captions first have to be passed through an embedding layer; I set the embedding size to 300. The image below shows the model that I used.
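For readers without the image handy, here is a minimal sketch of the common "merge" encoder-decoder that this kind of model typically follows; `vocab_size` and `max_length` are placeholder values, not taken from the notebook:

```python
from keras.models import Model
from keras.layers import Input, Dense, Embedding, LSTM, Dropout, add

vocab_size = 8000   # hypothetical vocabulary size
max_length = 34     # hypothetical maximum caption length
embedding_dim = 300

# Image branch: project the 2048-d CNN feature into the embedding space.
image_input = Input(shape=(2048,))
img_dense = Dense(embedding_dim, activation='relu')(Dropout(0.5)(image_input))

# Caption branch: embed the partial caption and run it through an LSTM.
caption_input = Input(shape=(max_length,))
cap_embed = Embedding(vocab_size, embedding_dim, mask_zero=True)(caption_input)
cap_lstm = LSTM(embedding_dim)(Dropout(0.5)(cap_embed))

# Merge the two branches and predict the next word in the caption.
merged = add([img_dense, cap_lstm])
hidden = Dense(embedding_dim, activation='relu')(merged)
output = Dense(vocab_size, activation='softmax')(hidden)

model = Model(inputs=[image_input, caption_input], outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
```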
After training the model for 50 epochs with a batch size of 512, the accuracy reached 75% and the loss dropped to 0.911.
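Assuming the training arrays have already been prepared (image features `X1`, padded partial captions `X2`, and one-hot next-word targets `y` are placeholder names), the training call would look roughly like:

```python
# Train for 50 epochs with a batch size of 512, as described above;
# the validation split is an assumption, not taken from the notebook.
model.fit([X1, X2], y, epochs=50, batch_size=512, validation_split=0.1)
```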
Finally, here are some results that I got. The code for the results is in the Jupyter notebook, and you can generate your own captions by writing some code at the end; a greedy-decoding sketch follows the example captions below.
1. True caption: Three child soccer players go for the ball .
2. True caption: A dog wading in the water with a ball in his mouth .
3. True caption: A large white bird flies over water .
4. True caption: small dog running in the grass with a toy in its mouth .
5. True caption: The girls is jumping into the air on the beach .
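As a starting point for generating your own captions, here is a sketch of greedy decoding: it repeatedly predicts the most likely next word until the end token appears. It assumes captions were wrapped in `startseq`/`endseq` tokens during training and that a fitted Keras `Tokenizer` is available, both common conventions rather than details confirmed by the notebook:

```python
import numpy as np
from keras.preprocessing.sequence import pad_sequences

def generate_caption(model, tokenizer, photo_feature, max_length):
    # 'startseq'/'endseq' are assumed to be the start/end tokens
    # that wrapped every caption during training.
    text = 'startseq'
    for _ in range(max_length):
        seq = tokenizer.texts_to_sequences([text])[0]
        seq = pad_sequences([seq], maxlen=max_length)
        yhat = model.predict([photo_feature, seq])
        word = tokenizer.index_word.get(int(np.argmax(yhat)))
        if word is None or word == 'endseq':  # stop at the end token
            break
        text += ' ' + word
    return text.replace('startseq', '').strip()
```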