DocChat 📝🤖

🚀 AI-powered Multi-Agent RAG system for intelligent document querying with fact verification

📌 Overview

DocChat is a multi-agent Retrieval-Augmented Generation (RAG) system designed to help users query long, complex documents with accurate, fact-verified answers. Unlike traditional chatbots like ChatGPT or DeepSeek, which hallucinate responses and struggle with structured data, DocChat retrieves, verifies, and corrects answers before delivering them.

💡 Key Features:
✅ Multi-Agent System – A Research Agent generates answers, while a Verification Agent fact-checks responses.
✅ Hybrid Retrieval – Uses BM25 and vector search to find the most relevant content.
✅ Handles Multiple Documents – Selects the most relevant document even when multiple files are uploaded.
✅ Scope Detection – Prevents hallucinations by rejecting irrelevant queries.
✅ Fact Verification – Ensures responses are accurate before presenting them to the user.
✅ Web Interface with Gradio – Allowing seamless document upload and question-answering.

🎥 Demo Video

📹 Click here to watch the DocChat demo
(Opens in a new tab)

🛠️ How DocChat Works

1️⃣ Query Processing & Scope Analysis

Users upload documents and ask a question.
DocChat analyzes query relevance and determines if the question is within scope.
If the query is irrelevant, DocChat rejects it instead of generating hallucinated responses.

2️⃣ Multi-Agent Research & Retrieval

Docling parses documents into a structured format (Markdown, JSON).
LangChain & ChromaDB handle hybrid retrieval (BM25 + vector embeddings).
Even when multiple documents are uploaded, DocChat finds the most relevant sections dynamically.

3️⃣ Answer Generation & Verification

Research Agent generates an answer using retrieved content.
Verification Agent cross-checks the response against the source document.
If verification fails, a self-correction loop re-runs retrieval and research.

4️⃣ Response Finalization

If the answer passes verification, it is displayed to the user.
If the question is out of scope, DocChat informs the user instead of hallucinating.

🎯 Why Use DocChat Instead of ChatGPT or DeepSeek?

Feature	ChatGPT/DeepSeek ❌	DocChat ✅
Retrieves from uploaded documents	❌ No	✅ Yes
Handles multiple documents	❌ No	✅ Yes
Extracts structured data from PDFs	❌ No	✅ Yes
Prevents hallucinations	❌ No	✅ Yes
Fact-checks answers	❌ No	✅ Yes
Detects out-of-scope queries	❌ No	✅ Yes

🚀 DocChat is built for enterprise-grade document intelligence, research, and compliance workflows.

📦 Installation

1️⃣ Clone the Repository

git clone https://github.com/HaileyTQuach/docchat-docling.git docchat
cd docchat

2️⃣ Set Up Virtual Environment

python3.11 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Set Up API Keys

DocChat requires an OpenAI API key for processing. Add it to a .env file:

OPENAI_API_KEY=your-api-key-here

5️⃣ Run the Application

python app.py

DocChat will be accessible at http://0.0.0.0:7860.

🖥️ Usage Guide

1️⃣ Upload one or more documents (PDF, DOCX, TXT, Markdown).

2️⃣ Enter a question related to the document.

3️⃣ Click "Submit" – DocChat retrieves, analyzes, and verifies the response.

4️⃣ Review the answer & verification report for confidence.

5️⃣ If the question is out of scope, DocChat will inform you instead of fabricating an answer.

🤝 Contributing

Want to improve DocChat? Feel free to:

Fork the repo
Create a new branch (feature-xyz)
Commit your changes
Submit a PR (Pull Request)

We welcome contributions from AI/NLP enthusiasts, researchers, and developers! 🚀

📜 License

This project is licensed under a Customed Non-Commercial License – check LICENSE for more details.

💬 Contact & Support

📧 Email: [[email protected]]

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
agents		agents
config		config
document_cache		document_cache
document_processor		document_processor
examples		examples
retriever		retriever
test		test
utils		utils
.gitignore		.gitignore
LICENSE.md		LICENSE.md
README.md		README.md
app.py		app.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

DocChat 📝🤖

📌 Overview

🎥 Demo Video

🛠️ How DocChat Works

1️⃣ Query Processing & Scope Analysis

2️⃣ Multi-Agent Research & Retrieval

3️⃣ Answer Generation & Verification

4️⃣ Response Finalization

🎯 Why Use DocChat Instead of ChatGPT or DeepSeek?

📦 Installation

1️⃣ Clone the Repository

2️⃣ Set Up Virtual Environment

3️⃣ Install Dependencies

4️⃣ Set Up API Keys

5️⃣ Run the Application

🖥️ Usage Guide

🤝 Contributing

📜 License

💬 Contact & Support

About

Uh oh!

Releases

Packages

Contributors 2

Languages

License

ibm-developer-skills-network/zzpwx-docchat

Folders and files

Latest commit

History

Repository files navigation

DocChat 📝🤖

📌 Overview

🎥 Demo Video

🛠️ How DocChat Works

1️⃣ Query Processing & Scope Analysis

2️⃣ Multi-Agent Research & Retrieval

3️⃣ Answer Generation & Verification

4️⃣ Response Finalization

🎯 Why Use DocChat Instead of ChatGPT or DeepSeek?

📦 Installation

1️⃣ Clone the Repository

2️⃣ Set Up Virtual Environment

3️⃣ Install Dependencies

4️⃣ Set Up API Keys

5️⃣ Run the Application

🖥️ Usage Guide

🤝 Contributing

📜 License

💬 Contact & Support

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages