
Multi-Source Enhanced Retrieval-Augmented Generation Framework (MS-RAG)

Introduction

Large Language Models (LLMs) are powerful, but they can only answer based on the data they were trained on. When users need up-to-date or domain-specific information — such as internal documents, proprietary databases, or the latest reports — LLMs alone fall short.

Retrieval-Augmented Generation (RAG) bridges this gap by retrieving relevant information from external knowledge sources and feeding it as context to the LLM before generating a response. This ensures answers are grounded in real data rather than memorized patterns.

DB-GPT implements a Multi-Source RAG (MS-RAG) framework that goes beyond basic document Q&A. It supports multiple knowledge sources (documents, URLs, databases, knowledge graphs), multiple retrieval strategies (vector, keyword, graph, hybrid), and integrates deeply with the DB-GPT agent and workflow ecosystem.

Architecture

Overall Pipeline

The MS-RAG pipeline consists of four stages:

Knowledge Source → Chunking → Indexing → Retrieval → LLM Generation
  1. Knowledge Loading — KnowledgeFactory automatically routes data sources (files, URLs, text) to the appropriate Knowledge implementation based on type and file extension.
  2. Chunking — ChunkManager splits loaded documents into manageable chunks using configurable strategies (by size, page, paragraph, separator, or markdown headers).
  3. Indexing — Assembler classes (Embedding, BM25, Summary, DBSchema) persist chunks into the appropriate index store (vector database, full-text engine, or knowledge graph).
  4. Retrieval & Generation — At query time, Retriever fetches relevant chunks, optional QueryRewrite expands the query, and Ranker re-ranks results before the LLM generates the final answer.
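
The four stages above can be traced end to end with a self-contained toy. Everything here (the sample text, the bag-of-words "embedding", the cosine ranking) is an illustrative stand-in, not DB-GPT code; the real classes appear in the Programmatic Usage section below.

```python
import math
from collections import Counter

# Stage 1 -- Knowledge Loading: raw text stands in for a loaded document.
document = (
    "DB-GPT supports multiple knowledge sources. "
    "Retrieval-Augmented Generation grounds answers in real data. "
    "Chunks are indexed into a vector store for similarity search."
)

# Stage 2 -- Chunking: naive sentence-level split.
chunks = [s.strip().rstrip(".") + "." for s in document.split(". ") if s.strip()]

# Stage 3 -- Indexing: a bag-of-words Counter stands in for an embedding.
def embed(text):
    words = [w.strip(".,?!").lower() for w in text.split()]
    return Counter(w for w in words if w)

index = [(chunk, embed(chunk)) for chunk in chunks]

# Stage 4 -- Retrieval: rank chunks by cosine similarity to the query.
def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, top_k=2):
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

# The retrieved chunks would be packed into the LLM prompt as context.
top = retrieve("which knowledge sources are supported?")
```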

Assembler Pipeline

The BaseAssembler defines a unified pipeline that connects all stages:

Knowledge.load() → ChunkManager.split() → Assembler.persist() → Assembler.as_retriever()

DB-GPT provides four specialized assemblers:

| Assembler | Purpose | Index Backend |
| --- | --- | --- |
| EmbeddingAssembler | Vector similarity RAG (most common) | Vector Store (Chroma, Milvus, etc.) |
| BM25Assembler | Keyword-based full-text retrieval | Elasticsearch |
| SummaryAssembler | Summary-based RAG for long documents | Vector Store |
| DBSchemaAssembler | Database schema retrieval for Text2SQL | Vector Store |

Knowledge Sources

DB-GPT supports loading knowledge from multiple source types. In the Web UI, you can select a datasource type when uploading:

Datasource Types

| Type | Description | Example |
| --- | --- | --- |
| Document | Upload files in various formats | PDF, Word, Excel, CSV, Markdown, PowerPoint, TXT, HTML, JSON, ZIP |
| URL | Fetch and index web page content | Any accessible HTTP/HTTPS URL |
| Text | Directly input raw text | Paste text content in the UI |
| Yuque | Import from Yuque documentation platform | Yuque document links |

Supported Document Formats

| Format | Extension | Knowledge Class |
| --- | --- | --- |
| PDF | .pdf | PDFKnowledge |
| CSV | .csv | CSVKnowledge |
| Markdown | .md | MarkdownKnowledge |
| Word (docx) | .docx | DocxKnowledge |
| Word (legacy) | .doc | Word97DocKnowledge |
| Excel | .xlsx | ExcelKnowledge |
| PowerPoint | .pptx | PPTXKnowledge |
| Plain Text | .txt | TXTKnowledge |
| HTML | .html | HTMLKnowledge |
| JSON | .json | JSONKnowledge |

Storage Types

When creating a knowledge base, you can choose from three storage types:

| Storage Type | Description | Best For |
| --- | --- | --- |
| Vector Store | Stores document embeddings for semantic similarity search | General-purpose document Q&A |
| Knowledge Graph | Stores entities and relationships as a graph structure | Domain knowledge with complex entity relationships |
| Full Text | Full-text index for keyword-based retrieval | Exact term matching and keyword search |

Vector Store Backends

| Backend | Description | Install Extra |
| --- | --- | --- |
| ChromaDB | Default embedded vector database, zero setup | storage_chromadb |
| Milvus | Distributed vector database for production scale | storage_milvus |
| PGVector | PostgreSQL extension for vector operations | storage_pgvector |
| Weaviate | Cloud-native vector search engine | storage_weaviate |
| Elasticsearch | Full-text + vector hybrid search | storage_elasticsearch |
| OceanBase | Cloud-native distributed database | storage_oceanbase |

Knowledge Graph Backends

| Backend | Description |
| --- | --- |
| TuGraph | High-performance graph database by Ant Group |
| Neo4j | Popular open-source graph database |
| MemGraph | In-memory graph database for low-latency queries |

Full-Text Backends

| Backend | Description |
| --- | --- |
| Elasticsearch | Industry-standard full-text search engine |
| OpenSearch | Open-source search and analytics suite (Elasticsearch fork) |

Retrieval Strategies

DB-GPT offers multiple retrieval modes. You can configure the retrieve mode in the knowledge base settings:

| Strategy | Description | Backend Required |
| --- | --- | --- |
| Semantic | Vector similarity search using embeddings | Vector Store |
| Keyword | BM25-based keyword matching | Elasticsearch |
| Hybrid | Combines vector + keyword search with Reciprocal Rank Fusion (RRF) | Vector Store + Elasticsearch |
| Tree | Tree-structured retrieval for hierarchical documents | Vector Store |

Query Enhancement

Beyond basic retrieval, DB-GPT provides advanced query processing:

  • Query Rewrite — Uses an LLM to expand and rephrase the original query into multiple search queries for better recall.
  • Reranking — After initial retrieval, a reranker model re-scores and re-orders the results for higher precision.
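
Query rewrite pays off because each paraphrase can surface documents that the original wording misses; merging the per-query results then improves recall. A minimal sketch with the LLM call stubbed out (the expansion table, fake_search corpus, and document ids are invented for illustration):

```python
def rewrite_query(query):
    """Stand-in for the LLM rewrite step. In DB-GPT an LLM generates the
    paraphrases; here they are hard-coded to keep the sketch runnable."""
    expansions = {
        "llm limits": [
            "limitations of large language models",
            "what can LLMs not do",
        ],
    }
    return [query] + expansions.get(query, [])

def retrieve_many(queries, search_fn, top_k=3):
    """Run every rewritten query, keep each document's best score."""
    best = {}
    for q in queries:
        for doc, score in search_fn(q):
            best[doc] = max(best.get(doc, 0.0), score)
    return sorted(best, key=best.get, reverse=True)[:top_k]

def fake_search(q):
    """Toy retriever: each phrasing surfaces different documents."""
    corpus = {
        "llm limits": [("doc_stale_data", 0.9)],
        "limitations of large language models": [
            ("doc_stale_data", 0.7),
            ("doc_hallucination", 0.8),
        ],
        "what can LLMs not do": [("doc_private_data", 0.6)],
    }
    return corpus.get(q, [])

# Two of the three documents are found only through the rewrites.
docs = retrieve_many(rewrite_query("llm limits"), fake_search)
```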

Supported Rerankers

| Reranker | Type | Description |
| --- | --- | --- |
| CrossEncoderRanker | Local | Uses sentence-transformers CrossEncoder models |
| QwenRerankEmbeddings | Local | Qwen3-Reranker via transformers |
| OpenAPIRerankEmbeddings | API | Compatible with OpenAI-style rerank APIs |
| RRFRanker | Algorithm | Reciprocal Rank Fusion for merging multi-source results |
| DefaultRanker | Algorithm | Simple score-based sorting |
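
Reciprocal Rank Fusion, used by RRFRanker and the Hybrid retrieve mode, merges ranked lists without needing comparable scores: each document earns 1 / (k + rank) from every list it appears in. A minimal sketch; k = 60 is the constant from the original RRF paper, not necessarily DB-GPT's default:

```python
def rrf_merge(result_lists, k=60):
    """Merge ranked result lists with Reciprocal Rank Fusion.

    result_lists: ranked lists of document ids, best first.
    Each appearance of a document at 1-based position `rank`
    contributes 1 / (k + rank) to its fused score.
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# doc_b and doc_c appear in both lists, so they outrank doc_a,
# the vector-search winner that BM25 never returned.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_c", "doc_d"]
fused = rrf_merge([vector_hits, bm25_hits])
```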

Chunking Strategies

Document chunking is a critical factor in RAG quality. DB-GPT supports multiple chunking strategies:

| Strategy | Splitter | Description |
| --- | --- | --- |
| Chunk by Size | RecursiveCharacterTextSplitter | Split by character count with configurable size and overlap (default: 512 / 50) |
| Chunk by Page | PageTextSplitter | Split at page boundaries (useful for PDFs) |
| Chunk by Paragraph | ParagraphTextSplitter | Split at paragraph boundaries |
| Chunk by Separator | SeparatorTextSplitter | Split at custom separator strings |
| Chunk by Markdown Header | MarkdownHeaderTextSplitter | Split at markdown heading levels |
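
How chunk_size and chunk_overlap interact is easiest to see in a stripped-down character splitter. This is a toy stand-in for RecursiveCharacterTextSplitter that ignores its separator-aware recursion:

```python
def split_by_size(text, chunk_size=512, chunk_overlap=50):
    """Fixed-size character chunks. Each chunk starts chunk_size - chunk_overlap
    characters after the previous one, so the last chunk_overlap characters of
    a chunk are repeated at the start of the next; a sentence cut at one
    boundary still appears whole in at least one chunk."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# 1000 characters with the defaults: chunks start at offsets 0, 462, and 924.
text = "".join(str(i % 10) for i in range(1000))
chunks = split_by_size(text)
```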

Chunking Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| chunk_size | Maximum characters per chunk | 512 |
| chunk_overlap | Overlapping characters between adjacent chunks | 50 |
| topk | Number of chunks to retrieve per query | 5 |
| recall_score | Minimum relevance score threshold | 0 |
| recall_type | Recall strategy | TopK |
| model | Embedding model to use | Depends on configuration |

Embedding Models

DB-GPT supports a wide range of embedding models for converting text into vector representations:

Local Models

| Model | Class | Description |
| --- | --- | --- |
| HuggingFace | HuggingFaceEmbeddings | General-purpose HuggingFace models |
| BGE Series | HuggingFaceBgeEmbeddings | BAAI BGE models with instruction support (Chinese/English) |
| Instructor | HuggingFaceInstructEmbeddings | Instruction-following embedding models |

Remote API Models

| Provider | Class | Description |
| --- | --- | --- |
| OpenAI-compatible | OpenAPIEmbeddings | Any OpenAI-compatible embedding API |
| Jina | JinaEmbeddings | Jina AI embedding service |
| Ollama | OllamaEmbeddings | Local Ollama embedding server |
| Tongyi (Aliyun) | TongyiEmbeddings | Alibaba Cloud DashScope |
| Qianfan (Baidu) | QianfanEmbeddings | Baidu Wenxin platform |
| SiliconFlow | SiliconFlowEmbeddings | SiliconFlow embedding service |

Knowledge Graph RAG

Beyond traditional vector-based RAG, DB-GPT supports Knowledge Graph RAG for structured knowledge retrieval.

How It Works

  1. Triplet Extraction — An LLM extracts entities and relationships from documents as (subject, predicate, object) triplets.
  2. Graph Storage — Triplets are stored in a graph database (TuGraph, Neo4j, or MemGraph).
  3. Graph Retrieval — At query time, the GraphRetriever combines four sub-strategies:
    • Keyword-based — Match graph nodes by extracted keywords
    • Vector-based — Semantic similarity search on graph node embeddings
    • Text-based — Convert natural language to graph query language (Text2GQL) via LLM
    • Document-based — Retrieve through document-graph associations
  4. Community Summarization — Summarize graph communities for high-level understanding.
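
Steps 1 to 3 can be illustrated with a toy in-memory triplet store and the keyword-based sub-strategy. The triplets, node matching, and query below are simplified assumptions; real deployments use LLM extraction and a graph database such as TuGraph or Neo4j:

```python
from collections import defaultdict

# Step 1 -- triplets as an LLM might extract them from documents.
triplets = [
    ("DB-GPT", "implements", "MS-RAG"),
    ("MS-RAG", "supports", "Knowledge Graph RAG"),
    ("TuGraph", "is_a", "graph database"),
    ("Knowledge Graph RAG", "stores_triplets_in", "TuGraph"),
]

# Step 2 -- "storage": index each triplet under the nodes it touches.
node_index = defaultdict(list)
for subj, pred, obj in triplets:
    node_index[subj.lower()].append((subj, pred, obj))
    node_index[obj.lower()].append((subj, pred, obj))

# Step 3 -- keyword-based sub-strategy: match graph nodes by query keywords.
def graph_retrieve(query):
    keywords = query.lower().split()
    hits = []
    for node, facts in node_index.items():
        if any(kw in node for kw in keywords):
            hits.extend(facts)
    return list(dict.fromkeys(hits))  # de-duplicate, keep order

facts = graph_retrieve("what does TuGraph store?")
```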

Usage

Creating a Knowledge Base (Web UI)

Step 1 — Open Knowledge Management

Navigate to the Knowledge section in the sidebar.

Step 2 — Create and Configure

  1. Click Create to start a new knowledge base.
  2. Select the Storage Type (Vector Store, Knowledge Graph, or Full Text).
  3. Choose the Embedding Model and configure chunk parameters.

Step 3 — Upload Data

Select a datasource type and upload your content. Supported types include Document (PDF, Word, Excel, CSV, etc.), URL, Text, and Yuque.

Step 4 — Configure Chunking

Choose a chunking strategy and set its parameters (see Chunking Strategies above).

Step 5 — Configure Retrieval Strategy (Optional)

You can configure the retrieval strategy for your knowledge base. DB-GPT supports multiple retrieve modes — Semantic, Keyword, Hybrid, and Tree — to suit different query scenarios. Select the mode that best fits your use case in the knowledge base settings.

Step 6 — Chat with Your Knowledge

Go to Chat, click the knowledge base icon in the chat input toolbar, select your knowledge base from the dropdown, and start asking questions.

Programmatic Usage (Python API)

```python
from dbgpt_ext.rag.assembler import EmbeddingAssembler
from dbgpt_ext.rag.knowledge import KnowledgeFactory

# Assumes an async context and pre-configured index store / embedding model.

# Load knowledge from a file
knowledge = KnowledgeFactory.create(file_path="your_document.pdf")

# Build the embedding index
assembler = await EmbeddingAssembler.aload_from_knowledge(
    knowledge=knowledge,
    index_store=your_vector_store,
    embedding_model=your_embedding_model,
)
assembler.persist()

# Retrieve relevant chunks
retriever = assembler.as_retriever(top_k=5)
chunks = await retriever.aretrieve("What is the main topic?")
```

Next Steps

| Topic | Link |
| --- | --- |
| Knowledge Base Web UI Guide | Knowledge Base |
| RAG Concepts | RAG |
| Graph RAG Setup | Graph RAG |
| AWEL RAG Operators | AWEL |
| Source Code | GitHub |