厦门大学信息学院超智医疗创新研究中心

Integrating retrieval-augmented generation for enhanced personalized physician recommendations in web-based medical services: model development study

Published: 2025-11-01
Overview:

Journal: Frontiers in Public Health

Article number: Article 1407754

DOI: 10.3389/fpubh.2025.1501408

Authors: Yibo Xie, Kaifan Wang, Jiawei Zheng, Feiyan Liu, Xiaoli Wang, Guofeng Huang

Abstract:

Background: 

    Web-based medical services have significantly improved access to healthcare by enabling remote consultations, streamlining scheduling, and improving access to medical information. However, providing personalized physician recommendations remains a challenge, often relying on manual triage by schedulers, which can be limited by scalability and availability.

Objective: 

    This study aimed to develop and validate a Retrieval-Augmented Generation-Based Physician Recommendation (RAGPR) model for better triage performance.

Methods: 

    This study used a dataset of 646,383 consultation records from the Internet Hospital of the First Affiliated Hospital of Xiamen University. Several embedding models, including FastText, SBERT, and OpenAI, were evaluated for clustering and classifying medical condition labels. Large language models (LLMs) were assessed by comparing Mistral, GPT-4o-mini, and GPT-4o. In addition, three triage staff members evaluated the efficiency of the RAGPR model through questionnaires.
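
    As an illustration of the retrieval step only, the following is a minimal sketch and not the authors' implementation. It assumes a toy bag-of-words embedding in place of FastText, SBERT, or OpenAI embeddings; the condition labels, departments, and function names (`embed`, `cosine`, `retrieve_label`) are all hypothetical. It shows how a patient query might be matched to the nearest medical condition label by cosine similarity, which is the kind of retrieved context a retrieval-augmented pipeline could then pass to an LLM.

```python
import math
from collections import Counter

# Hypothetical condition labels mapped to departments
# (illustrative only, not taken from the study's dataset).
LABELS = {
    "chest pain and shortness of breath": "Cardiology",
    "skin rash and itching": "Dermatology",
    "stomach pain and nausea": "Gastroenterology",
}


def embed(text):
    """Toy bag-of-words embedding; a real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_label(query):
    """Return (label, department, score) for the label most similar to the query."""
    q = embed(query)
    best = max(LABELS, key=lambda label: cosine(q, embed(label)))
    return best, LABELS[best], cosine(q, embed(best))


label, department, score = retrieve_label("I have an itching rash on my skin")
print(label, "->", department)
```

    In a fuller pipeline, the retrieved label and department would be inserted into the LLM prompt together with the patient's description; that generation step is omitted here.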

Results: 

    The embedding models differed markedly in performance. FastText achieved an F1-score of 46%, while SBERT and OpenAI significantly outperformed it, with F1-scores of 95% and 96%, respectively. Among the LLMs, GPT-4o achieved the highest F1-score (95%), followed by Mistral (94%) and GPT-4o-mini (92%). The performance ratings were 4.56 for Mistral, 4.45 for GPT-4o-mini, and 4.67 for GPT-4o. SBERT and Mistral were identified as the optimal choices because of their balanced performance, cost-effectiveness, and ease of implementation.
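
    For reference, the F1-score is the harmonic mean of precision and recall. The abstract does not state how scores were averaged across classes, so the formula below is only the standard per-class definition, with TP, FP, and FN denoting true positives, false positives, and false negatives.

```latex
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R} = \frac{2\,TP}{2\,TP + FP + FN}
\]
```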

Figure 1. RAGPR model architecture