Paper Reading - Convolutional Image Captioning (CVPR 2018)
2024-09-04 13:04:03
Link of the Paper: https://arxiv.org/abs/1711.09151
Motivation:
- LSTM units are complex and inherently sequential across time.
- Convolutional networks have shown advantages on machine translation and conditional image generation.
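The contrast in the two points above can be illustrated with a minimal sketch of a masked (causal) 1-D convolution. The function name `causal_conv1d`, the single-channel input, and the hand-picked kernel are assumptions made for illustration and are not taken from the paper. Each output position depends only on a fixed window of current and past inputs, so all positions can be computed independently of one another, whereas an LSTM has to step through time.

```python
def causal_conv1d(xs, kernel):
    """Masked 1-D convolution: output[t] uses only xs[t-k+1 .. t]."""
    k = len(kernel)
    # Left-pad with zeros so that no output position can see future inputs.
    padded = [0.0] * (k - 1) + list(xs)
    # Every position t is computed from its own window, with no dependence
    # on previously computed outputs (unlike a recurrent hidden state).
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(xs))
    ]


if __name__ == "__main__":
    original = causal_conv1d([1, 2, 3, 4], [1, 1, 1])
    # Changing the last (future) input leaves all earlier outputs unchanged.
    modified = causal_conv1d([1, 2, 3, 100], [1, 1, 1])
    print(original[:3] == modified[:3])
```

Because the list comprehension has no loop-carried state, the per-position computations could be run in parallel, which is the property the motivation refers to.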
Innovation:
- The authors develop a convolutional (CNN-based) image captioning method that performs comparably to an LSTM-based method on standard metrics.
- The authors analyze the characteristics of CNN and LSTM networks and report several insights: CNNs produce output distributions with more entropy (useful for diverse predictions), achieve better classification accuracy, and do not suffer from vanishing gradients.
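To make the entropy claim concrete, here is a small sketch of how the entropy of a word distribution could be measured. The logits below are made-up values over a hypothetical four-word vocabulary, not numbers from the paper; the point is only that a flatter softmax output has higher entropy than a peaked one, which leaves more room for diverse predictions.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical logits over a four-word vocabulary (illustrative only).
flat = softmax([1.0, 1.0, 1.0, 1.0])    # uniform: maximum entropy
peaked = softmax([8.0, 1.0, 1.0, 1.0])  # one dominant word: low entropy

if __name__ == "__main__":
    print(entropy(flat) > entropy(peaked))
```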
Improvement:
- Performance improves further when the CNN model uses an attention mechanism to leverage spatial image features.
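As a rough illustration of attention over spatial image features, the sketch below uses generic dot-product attention. This is an assumption chosen for simplicity and may differ from the exact formulation in the paper; the function name `attend`, the query vector, and the feature values are all hypothetical. Each spatial location gets a weight, and the context vector is the weighted sum of the location features.

```python
import math


def attend(query, features):
    """Dot-product attention over a list of spatial feature vectors.

    query:    list of floats (e.g. a decoder state), length d
    features: list of feature vectors, one per spatial location, each length d
    Returns (weights, context).
    """
    # Similarity score between the query and each spatial location.
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    # Softmax over locations (max subtracted for numerical stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the location features.
    dim = len(features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Three hypothetical spatial locations with 2-D features.
    w, c = attend([1.0, 0.0], [[0.1, 0.9], [2.0, 0.0], [0.5, 0.5]])
    print(w.index(max(w)))  # index of the location best aligned with the query
```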
General Points:
- Image captioning is applicable to virtual assistants, editing tools, image indexing, and assistive support for people with disabilities.
- Image Captioning is a basic ingredient for more complex operations such as storytelling and visual summarization.
- An illustration of a classical RNN architecture for image captioning is provided below.