BESTMVQA: A Benchmark Evaluation System for Medical Visual Question Answering
- Dataset overview:
Conference: ECML PKDD 2024 (Applied Data Science Track, LNCS 14949)
Pages: 435–451;
DOI: 10.1007/978-3-031-70378-2_27;
Authors: Xiaojie Hong, Zixin Song, Liangzhi Li, Xiaoli Wang, Feiyan Liu.
Abstract:
Medical Visual Question Answering (Med-VQA) is a task that answers a natural language question with a medical image. Existing VQA techniques can be directly applied to solving the task. However, they often suffer from (i) the data insufficient problem, which makes it difficult to train the state of the arts (SOTAs) for domain-specific tasks, and (ii) the reproducibility problem, that existing models have not been thoroughly evaluated in a unified experimental setup. To address the issues, we develop a Benchmark Evaluation SysTem for Medical Visual Question Answering, denoted by BESTMVQA. Given clinical data, our system provides a useful tool for users to automatically build Med-VQA datasets. Users can conveniently select a wide spectrum of models from our library to perform a comprehensive evaluation study. With simple configurations, our system can automatically train and evaluate the selected models over a benchmark dataset, and reports the comprehensive results for users to develop new techniques or perform medical practice. Limitations of existing work are overcome (i) by the data generation tool, which automatically constructs new datasets from unstructured clinical data, and (ii) by evaluating SOTAs on benchmark datasets in a unified experimental setup. The demonstration video of our system can be found at https://youtu.be/QkEeFlu1x4A, and the source code is shared on https://github.com/emmali808/BESTMVQA.

Figure 1: BESTMVQA architecture
