
NLP Papers

These papers share a common theme: grounding language in vision, spanning a range of tasks, datasets, environments, and downstream applications that motivate this line of work.

Here’s the list. Happy reading!


Learning Visually Grounded Sentence Representations
Douwe Kiela, Alexis Conneau, Allan Jabri, Maximilian Nickel


Learning Robust Visual-Semantic Embeddings
Yao-Hung Hubert Tsai, Liang-Kang Huang, Ruslan Salakhutdinov


Multimodal Learning with Deep Boltzmann Machines
Nitish Srivastava, Ruslan Salakhutdinov


Learning Representations by Maximizing Mutual Information Across Views
Philip Bachman, R Devon Hjelm, William Buchwalter


Do Neural Network Cross-Modal Mappings Really Bridge Modalities?
Guillem Collell, Marie-Francine Moens


Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering
Ramakrishna Vedantam, Karan Desai, Stefan Lee, Marcus Rohrbach, Dhruv Batra, Devi Parikh


Learning to Count Objects in Natural Images for Visual Question Answering
Yan Zhang, Jonathon Hare, Adam Prügel-Bennett


Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
Sainandan Ramakrishnan, Aishwarya Agrawal, Stefan Lee


Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson, Xiaodong He, Chris Buehler, Damien Teney, Mark Johnson, Stephen Gould, Lei Zhang


CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick


VQA: Visual Question Answering
Aishwarya Agrawal, Jiasen Lu, Stanislaw Antol, Margaret Mitchell, C. Lawrence Zitnick, Dhruv Batra, Devi Parikh