How We Built a Disease Prediction Chatbot
Building an AI-powered healthcare chatbot that can predict diseases is one of the most impactful projects we've worked on. HealthGPT combines modern NLP techniques with medical domain knowledge to provide accurate health recommendations and disease predictions. In this post, we'll walk you through the entire architecture and implementation process.
We combined Machine Learning, RAG (Retrieval-Augmented Generation), and a Large Language Model (Llama 3.1 – 8B) to deliver accurate and context-aware predictions. Our goal was to create a fast, reliable pre-diagnosis tool that improves healthcare accessibility, especially in areas with limited medical resources.
Our dataset was built from trusted medical sources including WHO, CDC, NHS, NDHS, and MOHP. We focused on 10 diseases highly relevant to Nepal, such as Typhoid, Cholera, UTI, Anaemia, Tuberculosis, Prediabetes, and more. We cleaned and structured the dataset, removed irrelevant content, and split long documents into smaller chunks using RegEx for efficient retrieval.
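The chunking step above can be sketched with a small helper. This is a minimal illustration, not our production code: the exact regex and chunk size (`max_chars`) are assumptions, but the idea is the same — split on sentence boundaries with a regex, then greedily pack sentences into retrieval-sized chunks.

```python
import re

def chunk_document(text: str, max_chars: int = 500) -> list[str]:
    """Split a cleaned medical document into retrieval-sized chunks.

    Sentences are separated on end-of-sentence punctuation with a
    regex, then greedily packed into chunks of at most max_chars.
    """
    # Split on '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

doc = (
    "Typhoid fever is caused by Salmonella Typhi. "
    "Common symptoms include prolonged fever, headache and abdominal pain. "
    "It spreads through contaminated food and water."
)
print(chunk_document(doc, max_chars=80))
```

Keeping chunks small matters because each chunk becomes one embedding: a chunk that mixes several topics retrieves poorly for all of them.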
We used MedEmbed, a medical-specific embedding model, to convert each medical text chunk into 1080-dimensional embeddings. These embeddings were stored in a vector database (ChromaDB), enabling intelligent semantic search. Whenever a user inputs symptoms, the system queries this database to retrieve the most relevant medical facts.
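To show the retrieval mechanics without pulling in the real models, here is a toy stand-in: bag-of-words vectors in place of MedEmbed embeddings, and an in-memory cosine-similarity search in place of ChromaDB. The embedding function is deliberately simplistic; only the ranking logic mirrors what the vector database does.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for MedEmbed: a sparse bag-of-words vector.

    The production system stores dense MedEmbed vectors in ChromaDB;
    the nearest-neighbour ranking below is the same idea.
    """
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values()))
    norm *= math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

chunks = [
    "typhoid causes prolonged fever headache and abdominal pain",
    "cholera causes watery diarrhoea and severe dehydration",
    "anaemia causes fatigue pale skin and shortness of breath",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k chunks most similar to the symptom query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

print(retrieve("fever and headache"))
```

With real dense embeddings, "tiredness" would also match the anaemia chunk even though the word "fatigue" differs — that semantic matching is exactly what MedEmbed adds over keyword search.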
The RAG pipeline plays a crucial role in accuracy: instead of letting the LLM guess, it supplements the model with verified, real-world medical knowledge. This reduces hallucinations and ensures predictions are grounded in actual medical data. Llama 3.1 (8B) then interprets the retrieved information and produces a clear, human-like explanation for the user.
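Grounding the LLM comes down to how the prompt is assembled: retrieved facts go in as context, and the model is instructed to answer only from them. The exact wording below is an illustrative assumption, not our production prompt.

```python
def build_prompt(symptoms: str, retrieved_chunks: list[str]) -> str:
    """Assemble a RAG prompt that grounds the LLM in retrieved facts."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "You are a medical pre-diagnosis assistant. Answer ONLY from the "
        "context below; if the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Patient symptoms: {symptoms}\n"
        "Give the most likely condition and a short explanation."
    )

prompt = build_prompt(
    "prolonged fever and headache",
    ["Typhoid commonly presents with prolonged fever, headache and abdominal pain."],
)
print(prompt)
```

The resulting string is what gets sent to Llama 3.1 (8B); because the model is told to stay inside the supplied context, it has far less room to hallucinate.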
We also trained a machine learning classifier that maps symptoms to diseases. This model achieved an accuracy of 0.88 and a macro F1 score of 0.86. It performed exceptionally well on diseases like Cholera, Typhoid, and Prediabetes, where it achieved perfect F1 scores. We also identified room for improvement: recall was lower for Anaemia and Strep Throat.
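A symptom-to-disease classifier of this kind can be sketched in a few lines of scikit-learn. The toy data and the choice of a bag-of-words model with naive Bayes are assumptions for illustration; our actual model and training set are larger.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data: free-text symptom descriptions -> disease labels.
X = [
    "prolonged fever headache abdominal pain weakness",
    "watery diarrhoea vomiting dehydration leg cramps",
    "burning urination frequent urge cloudy urine pelvic pain",
    "fatigue pale skin shortness of breath dizziness",
]
y = ["Typhoid", "Cholera", "UTI", "Anaemia"]

# Bag-of-words features fed into a multinomial naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(X, y)

print(model.predict(["fever and abdominal pain with headache"]))
```

In practice the classifier's output is combined with the RAG pipeline: the predicted label narrows the search, and the LLM explains the result using the retrieved evidence.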
The full system was built using Next.js for the frontend, FastAPI for the backend, ChromaDB + SQLite for storage, and Docker for deployment. The flow is simple: the user enters symptoms → system authenticates → embeddings are generated → vector search retrieves medical data → LLM analyzes → prediction is returned.
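The request flow can be expressed as a small orchestration function. Every helper below is a hypothetical stub standing in for a real service (the SQLite-backed auth check, the ChromaDB search, the Llama call); only the composition mirrors the actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    disease: str
    explanation: str

# Hypothetical stubs for the real backend services.
def authenticate(token: str) -> bool:
    return token == "demo-token"  # real system checks the SQLite user store

def retrieve_chunks(symptoms: str) -> list[str]:
    # Real system: embed symptoms with MedEmbed, query ChromaDB.
    return ["typhoid causes prolonged fever headache and abdominal pain"]

def llm_explain(symptoms: str, chunks: list[str]) -> Prediction:
    # Real system: prompt Llama 3.1 (8B) with the retrieved context.
    return Prediction("Typhoid", f"Symptoms match: {chunks[0]}")

def predict(token: str, symptoms: str) -> Prediction:
    """End-to-end flow: authenticate -> retrieve -> LLM -> prediction."""
    if not authenticate(token):
        raise PermissionError("invalid token")
    chunks = retrieve_chunks(symptoms)
    return llm_explain(symptoms, chunks)

result = predict("demo-token", "fever and headache")
print(result.disease)
```

In the deployed system this function body is essentially what the FastAPI endpoint does, with each stub replaced by a call into the corresponding service.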
This project shows how AI can be used responsibly to improve healthcare accessibility.