Comparing LLMs for Heart Failure Diagnosis and Therapy

(AI in Healthcare – Research Project)

Author: Máté Lukács

Date: April 2025

This repository presents a research project analyzing the application of large language models (LLMs) in the diagnosis and management of heart failure (HF) – a major global health challenge affecting more than 64 million people worldwide.

The study critically evaluates three state-of-the-art LLMs — GPT-4x (OpenAI), Gemini (Google DeepMind), and DeepSeek — focusing on their clinical reasoning, technical and medical limitations, and ethical considerations in the context of cardiovascular care.

📑 Research Scope

The project addresses three primary dimensions:

- Reasoning capabilities of LLMs compared to expert cardiologists.
- Technical and medical limitations, including accuracy, guideline adherence, and adaptability to clinical nuance.
- Ethical implications of deploying LLMs in healthcare, covering issues such as bias, transparency, accountability, and patient trust.

🔬 Methodology

- Conducted a structured, literature- and guideline-informed evaluation of LLMs in heart failure care.
- Designed case-based prompts to simulate diagnostic and therapeutic decision-making.
- Compared model responses with ESC and ACC/AHA guidelines and with expert clinical reasoning.
- Assessed both the strengths and the limitations of each model.
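
To make the case-based prompting step concrete, here is a minimal Python sketch of how one fixed prompt template could be rendered for every model so that responses stay comparable. It is an illustration only: the `ClinicalCase` class, the `build_prompt` function, the template wording, and the example vignette are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ClinicalCase:
    """Hypothetical container for one case vignette (not from the paper)."""
    case_id: str
    vignette: str                      # free-text patient description
    task: str                          # e.g. "diagnosis" or "therapy"
    comorbidities: list[str] = field(default_factory=list)


def build_prompt(case: ClinicalCase) -> str:
    """Render the same fixed template for every model under comparison."""
    comorbid = ", ".join(case.comorbidities) if case.comorbidities else "none reported"
    return (
        "You are assisting a cardiologist.\n"
        f"Case {case.case_id}: {case.vignette}\n"
        f"Known comorbidities: {comorbid}\n"
        f"Task: give a {case.task} recommendation and name the guideline "
        "(ESC or ACC/AHA) that supports each step."
    )


# Made-up example case, for illustration only
example = ClinicalCase(
    case_id="HF-001",
    vignette="68-year-old with exertional dyspnoea, leg oedema, LVEF 30%.",
    task="therapy",
    comorbidities=["type 2 diabetes", "chronic kidney disease"],
)
print(build_prompt(example))
```

Keeping the template fixed means that differences between responses can be attributed to the models rather than to prompt wording.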

📊 Key Findings

GPT-4x (OpenAI)
- Strong in guideline-based reasoning and pharmacological pathways.
- Limited by static knowledge base, hallucinations, and lack of patient-specific adaptability.
Gemini (Google DeepMind)
- Excels in contextual integration (socio-demographics, comorbidities).
- Opacity of training data and reliance on structured prompts reduce trustworthiness.
DeepSeek
- Highly structured and consistent in guideline summaries.
- Outputs often rigid, formulaic, and less clinically adaptive.

⚖️ Comparative Summary

| Model | Strengths | Limitations | Best Use Case |
| --- | --- | --- | --- |
| GPT-4x | Accurate, guideline-based, versatile | Hallucinations, no patient context | Medical education, decision support |
| Gemini | Context-aware, integrates broader factors | Closed-source, prompt-dependent | Policy, medical planning, guideline synthesis |
| DeepSeek | Structured, factually consistent | Rigid, lacks nuance | Standardized workflows, information retrieval |

📁 Repository Structure

.
├── llm-heart-failure-analysis.pdf # Full research paper
├── README.md # Project overview
└── LICENSE # MIT License

🧭 Roadmap

- v1.0.0: Initial research release (comparative study).
- v1.1.0: Extend with quantitative evaluation of LLM outputs using clinical benchmarks.
- v2.0.0: Expand to other cardiovascular conditions (e.g., atrial fibrillation, coronary artery disease).
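
As a rough idea of what the quantitative evaluation planned for v1.1.0 might look like, the Python sketch below scores a response by the fraction of guideline checklist items it mentions. Everything here is an assumption made for illustration: the `adherence_score` function, the checklist contents, and the sample text are hypothetical, and the sample is hand-written placeholder text rather than output from any of the evaluated models.

```python
def adherence_score(response: str, checklist: dict[str, list[str]]) -> float:
    """Fraction of checklist items for which at least one synonym appears."""
    if not checklist:
        raise ValueError("checklist must contain at least one item")
    text = response.lower()
    hits = sum(
        any(term.lower() in text for term in synonyms)
        for synonyms in checklist.values()
    )
    return hits / len(checklist)


# Illustrative checklist: four drug classes commonly recommended for HFrEF.
# The synonym lists are deliberately short and incomplete.
HFREF_CHECKLIST = {
    "RAAS inhibitor": ["ace inhibitor", "arni", "sacubitril"],
    "beta-blocker": ["beta-blocker", "beta blocker", "bisoprolol"],
    "MRA": ["mineralocorticoid", "spironolactone", "eplerenone"],
    "SGLT2 inhibitor": ["sglt2", "dapagliflozin", "empagliflozin"],
}

# Hand-written placeholder text, NOT an actual model output
sample = "Start bisoprolol and an ACE inhibitor; consider spironolactone."
print(adherence_score(sample, HFREF_CHECKLIST))  # 3 of 4 items matched -> 0.75
```

Plain substring matching is crude: it ignores negation and dosing, and short terms can match inside unrelated words. A real benchmark would likely need clinician-reviewed rubrics rather than keyword lists.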

📄 License

This project is licensed under the MIT License – see the LICENSE file for details.

👥 Credits

This research project was officially submitted as part of a team assignment.

Research, analysis, and methodology were conducted entirely by Máté Lukács.
For assignment submission purposes, teammates' names were included:
- Ádám Földvári
- Héctor Carlos Flores Reynoso
