Xiaohan Jin (Sharon)

Improved video captioning by extending the Show-and-Tell model with:
- image tags
- audio features (MFCC)
- spatial attention
- GAN (novel)
Entered the TREC Video Retrieval Evaluation (TRECVID) 2017 ‘Video to Text’ (VTT) competition held by NIST.
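
To illustrate the spatial attention item listed above, here is a minimal sketch of soft attention over the spatial regions of a frame. It is not the project's actual code: the function name `spatial_attention`, the plain dot-product scoring, and the assumption that the query (for example, a projected LSTM hidden state) has the same dimension as each region feature are all illustrative choices.

```python
import math


def spatial_attention(regions, query):
    """Soft attention over spatial regions (illustrative sketch only).

    regions: list of N feature vectors, one per spatial location (each length D).
    query:   length-D vector, assumed here to be a projected decoder state.
    Returns (weights, context): N attention weights summing to 1, and the
    weighted average of the region features.
    """
    # Score each region by its dot product with the query (an assumed scoring rule).
    scores = [sum(r_d * q_d for r_d, q_d in zip(r, query)) for r in regions]

    # Numerically stable softmax over the N scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of region features.
    dim = len(query)
    context = [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]
    return weights, context


if __name__ == "__main__":
    # Two toy regions with 2-dimensional features; the query favors the first.
    weights, context = spatial_attention([[1.0, 0.0], [0.0, 1.0]], [10.0, 0.0])
    print(weights, context)
```

In a captioning decoder, a context vector like this would typically be recomputed at every word step and fed to the LSTM alongside the previous word, so the model can look at different regions for different words.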

Exploring CNN-LSTM Architectures for Video Captioning

Project @ CMU 11775 Large-Scale Multimedia Analysis

Mar 2017 - May 2017
