Nutrigenetic-GraphRAG is a local-first retrieval-augmented generation (RAG) system tailored for biomedical knowledge discovery and personalized nutrition counseling. Built on top of the excellent nano-graphrag, this project adapts entity extraction and graph-based RAG to the linguistic and conceptual challenges of nutrigenomics.
The system lets consumers, nutritionists, and researchers explore complex nutrigenetic interactions, query biomedical knowledge graphs in natural language, and receive context-rich, explainable answers.
All components run locally, using Ollama for LLMs and sentence-transformers for embeddings, which keeps your data on your own machine and avoids per-request API costs.
Credit: This work is deeply inspired by nano-graphrag. We adapted and extended its architecture for the challenges of nutrigenetic counseling and biomedical entity modeling.
- Biomedical entity extraction: Customized extraction for genes, nutrients, phenotypes, and variants.
- Graph-based RAG: Graph-building and retrieval that retains biomedical relationships for nuanced responses.
- Ablation study: Easily swap LLMs (via Ollama) and embedders (via sentence-transformers) for benchmarking.
- Multi-user design: Suitable for consumers, domain expert nutritionists, and researchers.
- Privacy-first: 100% local processing — no cloud LLMs or embeddings.
- Incremental updates: Efficiently add or update documents and knowledge bases.
- Asynchronous API: Fast, concurrent ingestion and query capabilities.
- Python 3.9+
- Ollama installed and running
- At least one supported local sentence-transformer model
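To confirm the "Ollama installed and running" prerequisite before going further, a small check like the one below can help. It is a minimal sketch that assumes Ollama is listening on its default address (http://localhost:11434) and uses Ollama's `/api/tags` endpoint to list locally available models. The helper name `list_ollama_models` is hypothetical and not part of this repository.

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url="http://localhost:11434", timeout=2.0):
    """Return the model names a local Ollama server reports, or None if unreachable.

    Assumption: the server is reachable at base_url (Ollama's default address).
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply: treat as "not running".
        return None
    return [model.get("name", "") for model in payload.get("models", [])]


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama is not reachable; start it with `ollama serve`.")
    else:
        print("Ollama is running with models:", models)
```

If the list comes back empty, the server is running but no model has been pulled yet.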
# clone this repo
git clone https://github.com/YOUR_USERNAME/nutrigenetic-graphrag.git
cd nutrigenetic-graphrag
# Install requirements
pip install -r requirements.txt
For development, install in editable mode:
pip install -e .
- Prepare your biomedical corpus: Place plain text, PDF, or .csv files in the data/ folder.
- Run Ollama: e.g., ollama serve
- Start the embedding model download: refer to the sentence-transformers documentation or see the script in scripts/.
- Build your graph RAG:
from nutrig_graphrag.nano_graphrag import GraphRAG, QueryParam
from nutrig_graphrag.biomedical.llm_utils import NutrigGraphRAG
# Initialize
ngrag = NutrigGraphRAG(
    GraphRAG,
    working_dir="test_cache",
    llm_model="gemma2-9b-it",
    embedding_model="all-MiniLM-L6-v2",
)
# Ingest documents
for doc in ["data/pubmed_1.txt", "data/pubmed_2.txt"]:
    with open(doc, encoding="utf-8") as f:
        ngrag.insert(f.read())
# Query knowledge graph
print(ngrag.query(
    "How does the MTHFR C677T variant affect folate metabolism?",
    param=QueryParam(mode="global"),
))
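The quick start lists two files by hand. For a larger corpus, one option is to walk the data/ folder instead. The sketch below covers plain-text files only (PDF and .csv inputs would need their own readers). The helper name `read_text_corpus` is hypothetical and is not provided by this repository.

```python
from pathlib import Path


def read_text_corpus(folder="data", pattern="*.txt", encoding="utf-8"):
    """Yield (path, text) for each non-empty file matching pattern, in sorted order."""
    for path in sorted(Path(folder).glob(pattern)):
        # errors="replace" keeps ingestion going if a file has stray bytes.
        text = path.read_text(encoding=encoding, errors="replace").strip()
        if text:
            yield path, text


# Usage with the ngrag object from the quick start (assumed to exist):
# for path, text in read_text_corpus("data"):
#     ngrag.insert(text)
```

Sorting the paths keeps the ingestion order reproducible between runs, which makes benchmarking easier to compare.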
Nutrigenetic-GraphRAG extends nano-graphrag, modifying:
- Entity extraction prompts for biomedical NER.
- Chunking to preserve biomedical sentence/paragraph semantics.
- Custom embeddings for domain adaptation.
- Ablation pipeline for benchmarking various LLMs/embedders.
See docs/ARCHITECTURE.md.
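To illustrate the chunking point above, the sketch below groups whole sentences into chunks without crossing paragraph boundaries. It is not the project's actual chunker (see docs/ARCHITECTURE.md for that). The function name `chunk_by_sentence`, the `max_chars` limit, and the naive sentence-boundary rule are all assumptions made for illustration.

```python
import re

# Naive boundary rule (an assumption): split after ., ! or ? when the next
# token starts with an uppercase letter. Abbreviations such as "et al. Smith"
# will be split incorrectly; a biomedical sentence splitter would do better.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def chunk_by_sentence(text, max_chars=1200):
    """Group whole sentences into chunks, never splitting a sentence.

    Chunks stay within max_chars unless a single sentence is longer than
    that, and a chunk never spans two paragraphs.
    """
    chunks = []
    for paragraph in re.split(r"\n\s*\n", text):
        sentences = [s.strip() for s in _SENTENCE_END.split(paragraph) if s.strip()]
        current = ""
        for sentence in sentences:
            candidate = f"{current} {sentence}".strip()
            if current and len(candidate) > max_chars:
                chunks.append(current)  # close the chunk before it overflows
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

Keeping a statement such as "The C677T variant reduces its activity." inside one chunk means the gene, variant, and effect are seen together at extraction time.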
We provide scripts for ablation experiments to compare:
- Different LLMs accessible via Ollama (e.g., DeepSeek-V2, Gemma 2, Qwen2.5)
- Multiple sentence-transformer models
Models evaluated:
- Embedders: all-mpnet-base-v2, dmis-lab/biobert-v1.1
- LLMs: gemma2:9b, gemma2:27b, llama3.1:8b, qwen2:7b, qwen2.5:14b, deepseek-v2:16b
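One way to enumerate the model combinations listed above is a simple grid of run configurations. This is a sketch rather than the repository's ablation script. The dictionary keys mirror the constructor arguments shown in the quick start, while the `ablation_grid` name and the `ablation_cache/` directory layout are assumptions.

```python
from itertools import product

EMBEDDERS = ["all-mpnet-base-v2", "dmis-lab/biobert-v1.1"]
LLMS = [
    "gemma2:9b", "gemma2:27b", "llama3.1:8b",
    "qwen2:7b", "qwen2.5:14b", "deepseek-v2:16b",
]


def ablation_grid(llms=LLMS, embedders=EMBEDDERS):
    """Return one run configuration per (LLM, embedder) pair."""
    runs = []
    for llm, embedder in product(llms, embedders):
        # A separate working directory per pair (hypothetical layout) keeps
        # graphs built with different models from sharing a cache.
        slug = f"{llm}__{embedder}".replace(":", "-").replace("/", "-")
        runs.append({
            "llm_model": llm,
            "embedding_model": embedder,
            "working_dir": f"ablation_cache/{slug}",
        })
    return runs
```

With six LLMs and two embedders this yields twelve configurations, each of which can be passed to a separate build-and-query run.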
Results (see docs/RESULTS.md):
- Best LLM: [update this with your result]
- Best Embedder: [update this with your result]
- Personalized nutrition counseling for consumers
- Decision support for clinical nutritionists
- Biomedical discovery for researchers (variant-disease-nutrient links)
- Integrate gene-variant linking to dietary guidelines
- Add graph visualization
- Expand biomedical entity database
- More LLM and embedder plug-ins
See ROADMAP.md.
This project would not exist without nano-graphrag by gusye1234.
Also inspired by:
- Ollama
- sentence-transformers
- GRPM dataset (De Filippis et al. 2025) (https://doi.org/10.1016/j.jbi.2025.104845)
If you use this software, please cite both nano-graphrag and this repo.
@misc{nutrigenetic-graphrag,
author = {johndef64},
title = {Nutrigenetic-GraphRAG: Biomedical GraphRAG for Nutrigenetics},
year = {2024},
howpublished = {\url{https://github.com/johndef64/nutrigenetic-graphrag}},
note = {Adapted from nano-graphrag https://github.com/gusye1234/nano-graphrag}
}
MIT License. See LICENSE.