Designing efficient and reliable visual question answering (VQA) systems remains challenging, especially for disaster management and response. In this work, we revisit fundamental feature-combination methods, namely concatenation, addition, and element-wise multiplication, paired with modern image and text feature abstraction models. We design a simple and efficient system that outperforms existing methods on the FloodNet dataset and achieves state-of-the-art performance. This simplified system requires significantly less training and inference time than modern VQA architectures. We also study the performance of various backbones and report their consolidated results. Code is available at https://github.com/sahilkhose/floodnet_vqa.
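The three combination methods named in the abstract can be illustrated with a minimal sketch. It is not taken from the linked repository: the function name `fuse`, the method labels, and the toy feature vectors are hypothetical, and plain Python lists stand in for the outputs of the image and text backbones.

```python
from typing import List, Sequence


def fuse(image_feat: Sequence[float], text_feat: Sequence[float], method: str) -> List[float]:
    """Combine an image feature vector and a text feature vector.

    Hypothetical helper for illustration only; the method names
    "concat", "add", and "mul" are assumptions, not the authors' API.
    """
    if method == "concat":
        # Concatenation keeps both vectors intact; output length is the sum.
        return list(image_feat) + list(text_feat)

    # Addition and multiplication are element-wise, so lengths must match.
    if len(image_feat) != len(text_feat):
        raise ValueError("element-wise fusion needs vectors of equal length")

    if method == "add":
        return [i + t for i, t in zip(image_feat, text_feat)]
    if method == "mul":
        return [i * t for i, t in zip(image_feat, text_feat)]

    raise ValueError(f"unknown fusion method: {method!r}")


# Toy vectors standing in for backbone outputs (assumed values).
image_feat = [1.0, 2.0, 3.0]
text_feat = [0.5, -1.0, 2.0]

concat_out = fuse(image_feat, text_feat, "concat")
add_out = fuse(image_feat, text_feat, "add")
mul_out = fuse(image_feat, text_feat, "mul")
```

Concatenation doubles the fused dimension and accepts vectors of different sizes, whereas addition and element-wise multiplication preserve the dimension but require the two feature vectors to be projected to a common length first.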