PERFORMANCE EVALUATION OF A LARGE LANGUAGE MODEL-BASED TOOL FOR NUTRITIONAL RECOMMENDATIONS IN CHRONIC KIDNEY DISEASE
Abstract
Introduction: Nutritional management is an essential component of the treatment of chronic kidney disease (CKD), but its implementation is hampered by the complexity of dietary recommendations and a shortage of specialized professionals. Large language models (LLMs) could complement professional consultations through virtual assistance tools, but their performance in renal nutrition has not yet been adequately evaluated.
Objective: To evaluate the performance of NutriRenal, a virtual assistant based on a large language model adjusted with an expert-designed prompt, as rated by nutritionists specializing in CKD, in responding to nutrition-related queries from patients with CKD.
Methods: A descriptive, cross-sectional study was conducted in which three specialized nutritionists evaluated 211 responses generated by NutriRenal to questions formulated by nephrologists. Each response was rated on three dimensions (comprehensibility, completeness, and consistency with scientific evidence) using a scale of 1 to 3. Ratings were compared before and after the prompt adjustment, as well as by CKD stage, presence of diabetes, and evaluator.
Results: After the prompt adjustment, NutriRenal demonstrated high performance: 99% of responses were rated as adequate in comprehensibility, 86.7% in completeness, and 95.2% in consistency with scientific evidence. These improvements were statistically significant compared to the original prompt. Performance was consistent across the subgroups evaluated, with the highest scores in the subgroup with diabetes.
Conclusions: NutriRenal demonstrated robust performance after the prompt adjustment, generating high-quality responses according to the professional criteria evaluated. Its implementation could be a valuable complement to traditional nutritional consultations in patients with CKD. However, further studies in real-world clinical settings are needed to validate its impact on daily clinical practice.
Copyright (c) 2025 Revista de Nefrología, Diálisis y Trasplante

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
