Index any document into a navigable tree structure, then retrieve relevant sections using any LLM. No vector databases, no embeddings — just structured tree retrieval.
Available for both Python and Node.js — same API, same index format, fully cross-compatible.
How It Works
Load — Extract pages from any supported format
Index — LLM analyzes page groups and extracts hierarchical structure
Build — Flat sections become a tree with page ranges and embedded text
Query — LLM selects relevant tree nodes for your question
Return — Get context text, source pages, and reasoning
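The five steps above can be sketched as a toy version in Python. Everything here (the `Section` shape, the `select` callback standing in for the LLM call) is an illustration of the idea, not TreeDex's real internals:

```python
from dataclasses import dataclass, field

@dataclass
class Section:
    title: str
    page_start: int
    page_end: int
    text: str
    children: list = field(default_factory=list)

def flatten(node, out=None):
    """Depth-first list of all sections, so each one gets a stable id."""
    if out is None:
        out = []
    out.append(node)
    for child in node.children:
        flatten(child, out)
    return out

def retrieve(root, question, select):
    """`select` stands in for the LLM: it maps (outline, question) -> ids."""
    sections = flatten(root)
    outline = [f"[{i}] {s.title}" for i, s in enumerate(sections)]
    chosen = [sections[i] for i in select(outline, question)]
    context = "\n\n".join(s.text for s in chosen)
    pages = ", ".join(f"{s.page_start}-{s.page_end}" for s in chosen)
    return context, pages

# Toy document tree
doc = Section("Report", 1, 10, "", [
    Section("Methods", 2, 4, "We survey 200 users."),
    Section("Results", 5, 8, "Latency dropped 40%."),
])

# A keyword matcher standing in for the real LLM selection step
ctx, pages = retrieve(doc, "results",
                      lambda outline, q: [i for i, line in enumerate(outline)
                                          if q.lower() in line.lower()])
print(ctx)    # "Latency dropped 40%."
print(pages)  # "5-8"
```

The key property is that retrieval never embeds anything: the model only sees a compact outline of section titles and picks node ids, and the text plus page ranges come straight from the tree.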
Why TreeDex instead of a Vector DB?
Supported LLM Providers
TreeDex works with every major AI provider out of the box. Pick what works for you:
```python
from treedex import TreeDex, GeminiLLM

llm = GeminiLLM(api_key="YOUR_KEY")
index = TreeDex.from_file("doc.pdf", llm=llm)

result = index.query("What is the main argument?")
print(result.context)
print(result.pages_str)  # "pages 5-8, 12-15"
```
```js
import { TreeDex, GeminiLLM } from "treedex";

const llm = new GeminiLLM("YOUR_KEY");
const index = await TreeDex.fromFile("doc.pdf", llm);

const result = await index.query("What is the main argument?");
console.log(result.context);
console.log(result.pagesStr); // "pages 5-8, 12-15"
```
All providers work the same way
Python
Node.js / TypeScript
```python
from treedex import *

# Google Gemini
llm = GeminiLLM(api_key="YOUR_KEY")

# OpenAI
llm = OpenAILLM(api_key="sk-...")

# Claude
llm = ClaudeLLM(api_key="sk-ant-...")

# Groq (fast inference)
llm = GroqLLM(api_key="gsk_...")

# Together AI
llm = TogetherLLM(api_key="...")

# DeepSeek
llm = DeepSeekLLM(api_key="...")

# OpenRouter (access any model)
llm = OpenRouterLLM(api_key="...")

# Local Ollama
llm = OllamaLLM(model="llama3")

# Any OpenAI-compatible endpoint
llm = OpenAICompatibleLLM(
    base_url="https://your-api.com/v1",
    api_key="...",
    model="model-name",
)
```
```js
import { /* any backend */ } from "treedex";

// Google Gemini
const llm = new GeminiLLM("YOUR_KEY");

// OpenAI
const llm = new OpenAILLM("sk-...");

// Claude
const llm = new ClaudeLLM("sk-ant-...");

// Groq (fast inference)
const llm = new GroqLLM("gsk_...");

// Together AI
const llm = new TogetherLLM("...");

// DeepSeek
const llm = new DeepSeekLLM("...");

// OpenRouter (access any model)
const llm = new OpenRouterLLM("...");

// Local Ollama
const llm = new OllamaLLM("llama3");

// Any OpenAI-compatible endpoint
const llm = new OpenAICompatibleLLM({
  baseUrl: "https://your-api.com/v1",
  apiKey: "...",
  model: "model-name",
});
```
Standard mode returns raw context. Agentic mode goes one step further — it retrieves the relevant sections, then generates a direct answer.
Python
Node.js / TypeScript
```python
# Standard: returns context + page ranges
result = index.query("What is X?")
print(result.context)

# Agentic: returns a direct answer
result = index.query("What is X?", agentic=True)
print(result.answer)     # LLM-generated answer
print(result.pages_str)  # source pages
```
```js
// Standard: returns context + page ranges
const result = await index.query("What is X?");
console.log(result.context);

// Agentic: returns a direct answer
const agenticResult = await index.query("What is X?", { agentic: true });
console.log(agenticResult.answer);   // LLM-generated answer
console.log(agenticResult.pagesStr); // source pages
```
Swap LLM at query time
```python
# Build the index with one LLM
index = TreeDex.from_file("doc.pdf", llm=gemini_llm)

# Query with a different one — same index, different brain
result = index.query("...", llm=groq_llm)
```
Save and load indexes
Indexes are saved as plain JSON, and the format is identical across runtimes: build an index with Python and query it from Node.js, or vice versa.
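Because the on-disk format is plain JSON, cross-runtime portability falls out of the serializer itself. A minimal sketch of the idea, where the field names (`title`, `pages`, `text`, `children`) are illustrative rather than TreeDex's actual schema:

```python
import json

# Hypothetical index node; TreeDex's real schema may differ.
node = {
    "title": "Results",
    "pages": [5, 8],
    "text": "Latency dropped 40%.",
    "children": [],
}

# Python writes the index...
blob = json.dumps({"root": node}, ensure_ascii=False)

# ...and any JSON parser (Python's json, Node's JSON.parse) reads it back
# without loss, since only standard JSON types are used.
restored = json.loads(blob)
print(restored["root"]["pages"])  # [5, 8]
```

Sticking to standard JSON types (strings, numbers, arrays, objects) is what keeps the two implementations interchangeable: neither side needs pickling, binary blobs, or a vector store.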
Benchmarks
TreeDex vs Vector DB vs Naive Chunking
A real benchmark on the same document (NCERT Electromagnetic Waves, 14 pages, 10 queries). All three methods retrieve from the same content — only the indexing and retrieval approach differs. Results are auto-generated by CI on every push.
```sh
git clone https://github.com/mithun50/TreeDex.git
cd TreeDex

# Python development
pip install -e ".[dev]"
pytest

# Node.js development
npm install
npm run build
npm test
```
License
MIT License — Mithun Gowda B
About
Tree-based, vectorless document RAG framework. Connect any LLM via URL/API key.