Page Manager: Webmaster
Last update: 9/11/2012 3:13 PM

Knowing When to Look For What and Where: Evaluating Generation of Spatial Descriptions with Adaptive Attention

Conference paper
Authors: Mehdi Ghanimifard, Simon Dobnik
Published in: Computer Vision – ECCV 2018 Workshops
ISBN: 978-3-030-11017-8
Publisher: Springer, Cham
Publication year: 2018
Published at: Department of Philosophy, Linguistics and Theory of Science
Language: English
Keywords: image descriptions, grounded neural language model, attention model, spatial descriptions
Subject categories: Human Computer Interaction, Image analysis, Computational linguistics


We examine and evaluate the adaptive attention mechanism of Lu et al. (2017), which balances the focus on visual features against the focus on textual features, in generating image captions with end-to-end neural networks. In particular, we ask how informative adaptive attention is for generating spatial relations. We show that the model generates spatial relations more on the basis of textual than visual features, which confirms previous observations that the learned visual features lack information about geometric relations between objects.
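
The balance between visual and textual features described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function and variable names are hypothetical, and plain dot-product scoring stands in for the learned scoring layer of Lu et al. (2017). The sketch assumes that attention is computed jointly over the visual region vectors and one extra textual "sentinel" vector, so that the weight assigned to the sentinel (here called beta) indicates how much a generated word relies on textual rather than visual information.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def adaptive_attention(regions, sentinel, hidden):
    """Return (context, beta) for one decoding step.

    regions:  list of visual feature vectors, one per image region
    sentinel: textual "sentinel" vector from the language model
    hidden:   current decoder state
    Assumption: dot-product scores replace the learned scoring layer.
    """
    # Score every visual region and the sentinel against the decoder state.
    scores = [dot(v, hidden) for v in regions] + [dot(sentinel, hidden)]
    weights = softmax(scores)
    beta = weights[-1]  # share of attention given to the textual sentinel
    # Mix the attended visual regions with the sentinel using the same weights.
    context = [
        sum(w * v[i] for w, v in zip(weights[:-1], regions)) + beta * sentinel[i]
        for i in range(len(hidden))
    ]
    return context, beta


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0]]
    context, beta = adaptive_attention(regions, [5.0, 0.0], [1.0, 0.0])
    print("beta =", round(beta, 3))
```

Under these assumptions, averaging beta over the steps at which a spatial word such as "under" or "left" is generated gives a rough measure of how textual that word category is; a value close to 1 means the word is predicted mostly from the language model rather than from the image.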
