TCM-FTP: Fine-Tuning Large Language Models for Herbal Prescription Prediction

Research on Applications of Artificial Intelligence in Traditional Chinese Medicine


2024-09-17

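Recently, a paper co-authored by team faculty members Zhang Lianwen and Xu Yulong, "TCM-FTP: A Large Model for TCM Diagnosis", was accepted by the IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2024). The work is a collaboration among the Hong Kong University of Science and Technology, Beijing Jiaotong University, the China Academy of Chinese Medical Sciences, and Henan University of Chinese Medicine. IEEE BIBM is a well-known conference in bioinformatics. It is ranked as a class B conference in the interdisciplinary/comprehensive/emerging category of the CCF list, its acceptance rate over the past three years has been about 19%, and it has considerable international influence.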

Abstract: Traditional Chinese medicine (TCM) relies on specific combinations of herbs in prescriptions to treat symptoms and signs, a practice that spans thousands of years. Predicting TCM prescriptions presents a fascinating technical challenge with practical implications. However, this task faces limitations due to the scarcity of high-quality clinical datasets and the intricate relationship between symptoms and herbs. To address these issues, we introduce DigestDS, a new dataset containing practical medical records from experienced experts in digestive system diseases. We also propose a method, TCM-FTP (TCM Fine-Tuning Pre-trained), to leverage pre-trained large language models (LLMs) through supervised fine-tuning on DigestDS. Additionally, we enhance computational efficiency using a low-rank adaptation technique. TCM-FTP also incorporates data augmentation by permuting herbs within prescriptions, capitalizing on their order-agnostic properties. Impressively, TCM-FTP achieves an F1-score of 0.8031, surpassing previous methods significantly. Furthermore, it demonstrates remarkable accuracy in dosage prediction, achieving a normalized mean square error of 0.0604. In contrast, LLMs without fine-tuning perform poorly. Although LLMs have shown capabilities on a wide range of tasks, this work illustrates the importance of fine-tuning for TCM prescription prediction, and we propose an effective way to do so.
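
As a rough illustration of two ideas in the abstract, the sketch below shuffles the herbs of one prescription to create order-permuted training copies and scores a predicted herb list against a reference with a set-based F1. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `permute_prescription` and `herb_set_f1`, the placeholder herb names and doses, and the exact F1 definition are all hypothetical, since the abstract gives neither the data format nor the metric formula.

```python
import random


def permute_prescription(herbs, n_copies, seed=0):
    """Return n_copies shuffled orderings of one prescription.

    `herbs` is a list of (name, dose) pairs. Only the order changes,
    which is the property the abstract calls order-agnostic.
    """
    rng = random.Random(seed)  # fixed seed keeps the augmentation reproducible
    variants = []
    for _ in range(n_copies):
        shuffled = list(herbs)  # copy so the input list is not modified
        rng.shuffle(shuffled)
        variants.append(shuffled)
    return variants


def herb_set_f1(predicted, reference):
    """F1 between predicted and reference herb names, compared as sets.

    Assumed definition: precision and recall over unique herb names.
    """
    pred, ref = set(predicted), set(reference)
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Placeholder prescription: names and doses are made up for illustration.
prescription = [("herb_a", 10), ("herb_b", 6), ("herb_c", 3)]
augmented = permute_prescription(prescription, n_copies=4)

# Two of the three reference herbs predicted, no wrong herbs.
score = herb_set_f1(["herb_a", "herb_b"], ["herb_b", "herb_c", "herb_a"])
```

Because the comparison is set-based, every permuted copy of a prescription gets the same score, which matches the intuition behind training on shuffled herb orders.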

Attachments

Paper 3: TCM-FTP preprint, tcmftp_arxiv_v2.pdf
