Our paper has been accepted to ACCV 2016.
- Video Summarization using Deep Semantic Features
Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, and Naokazu Yokoya
Thanks to all the authors!
This work proposes a new video summarization approach that uses deep features of a video to capture its semantics. By training a deep neural network with sentences, our deep features encode "sentence-level" semantics of the video, which boosts the performance of a standard, clustering-based video summarization approach.
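The clustering-based step can be sketched roughly as follows: cluster per-segment deep features and keep the segment nearest each cluster center as the summary. This is a minimal, generic sketch (plain k-means over hypothetical feature vectors), not the paper's exact pipeline:

```python
import numpy as np

def summarize(features, n_clusters, n_iter=50, seed=0):
    """Pick one representative video segment per feature cluster.

    features: (n_segments, dim) array of deep semantic features,
              one row per video segment (shapes here are assumptions).
    Returns the time-ordered indices of the selected segments.
    """
    rng = np.random.default_rng(seed)
    # initialize centroids from randomly chosen segments
    centroids = features[rng.choice(len(features), n_clusters, replace=False)]
    for _ in range(n_iter):
        # assign each segment to its nearest centroid
        dists = np.linalg.norm(features[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # recompute centroids (skip clusters that went empty)
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = features[labels == k].mean(axis=0)
    # the summary is the segment closest to each centroid
    dists = np.linalg.norm(features[:, None] - centroids[None], axis=2)
    return sorted(set(dists.argmin(axis=0)))
```

With semantically meaningful features, segments describing the same event land in the same cluster, so the selected representatives cover the video's distinct content.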
See you at ACCV 2016 in Taipei!
Our paper has been accepted to the 4th Workshop on Web-scale Vision and Social Media (VSM), in conjunction with ECCV 2016!
This paper is on video/text retrieval with text/video queries. Our approach uses an LSTM to encode text, as do many existing approaches, but our observation is that an LSTM tends to forget details in the text (it mixes up "typing on the keyboard" and "playing the keyboard"). The main contribution of this paper is to fuse into the text representation web images retrieved using the text as a query, which disambiguates the text.
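One simple way to realize such a fusion is to pool the features of the retrieved web images and blend them with the sentence embedding. The sketch below uses a weighted average followed by unit normalization; the weights, shapes, and pooling choice are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def fuse(text_emb, image_feats, alpha=0.5):
    """Blend a sentence embedding with features of web images
    retrieved for that sentence (all names/shapes are assumptions).

    text_emb:    (dim,) embedding of the query sentence (e.g. from an LSTM)
    image_feats: (n_images, dim) features of the retrieved web images
    alpha:       weight on the text embedding vs. the visual evidence
    """
    visual = image_feats.mean(axis=0)        # pool the retrieved images
    fused = alpha * text_emb + (1 - alpha) * visual
    return fused / np.linalg.norm(fused)     # unit-normalize for retrieval
```

The intuition: images retrieved for "playing the keyboard" show musical instruments while those for "typing on the keyboard" show computers, so the pooled visual features pull the two otherwise-similar text embeddings apart.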
Looking forward to seeing you at the venue!
An arXiv preprint is now available. Please find it at: arXiv:1608.02367