Principal Investigator: Boris Katz
Project Website: http://cbmm.mit.edu.ezproxy.canberra.edu.au/research/projects-thrust/visual-intelligence/grounded-quest…
We have developed techniques for describing videos with natural-language sentences. Building on this work, we are going beyond description to answering questions about videos, such as "What is the person on the left doing with the blue object?" The system takes a natural-language question as input and produces a natural-language answer.
Our goal is a single approach that lets a system understand and answer a variety of questions, rather than building a separate system for each question type (for example, "Who is there?", "What are they doing?", or "Where are they?").