What is the performance limit? #56
Manish-PM25 started this conversation in General
Replies: 2 comments
-
Yeah, good question. Traditional RAG struggles at scale mainly because of arbitrary chunking and semantic drift in large embedding spaces. PageIndex should handle this better thanks to its hierarchical structure: it maintains page-level context and has more explainable retrieval paths. Instead of searching through thousands of random chunks, it navigates the document structure. That said, I haven't seen specific benchmarks at 10K+ docs yet. It would be interesting to test:
- Retrieval accuracy vs. document count
- Query latency at scale
- Memory footprint

The architecture suggests it should scale better than traditional RAG, but only real-world testing would confirm that. Has anyone tried it with large document sets?
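For anyone who wants to try this, here is a rough sketch of a harness for the three measurements mentioned above (retrieval accuracy vs. document count, query latency, memory footprint). It does not call PageIndex or any real RAG library: the `build`/`search` interface, the synthetic corpus, and the toy keyword retriever are all hypothetical stand-ins, so you would need to swap in the system you actually want to test. Note also that `tracemalloc` only sees Python-level allocations, so it will undercount indexes held in native code.

```python
import statistics
import time
import tracemalloc
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: build(docs) returns search(query, k) -> list of doc ids.
SearchFn = Callable[[str, int], List[str]]
BuildFn = Callable[[Dict[str, str]], SearchFn]


def build_keyword_index(docs: Dict[str, str]) -> SearchFn:
    """Toy stand-in retriever (word overlap); replace with the system under test."""
    tokens = {doc_id: set(text.lower().split()) for doc_id, text in docs.items()}

    def search(query: str, k: int) -> List[str]:
        q = set(query.lower().split())
        ranked = sorted(tokens, key=lambda d: len(q & tokens[d]), reverse=True)
        return ranked[:k]

    return search


def make_corpus(n: int) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Synthetic corpus: each doc has one unique token plus shared filler words."""
    docs = {f"doc{i}": f"report topic{i} shared filler text" for i in range(n)}
    step = max(1, n // 20)  # sample at most ~20 queries per corpus size
    queries = [(f"topic{i} report", f"doc{i}") for i in range(0, n, step)]
    return docs, queries


def benchmark(build: BuildFn, sizes: List[int], k: int = 5) -> List[dict]:
    """Measure hit rate@k, median query latency, and index build memory per size."""
    results = []
    for n in sizes:
        docs, queries = make_corpus(n)

        # Peak Python-level memory allocated while building the index.
        tracemalloc.start()
        search = build(docs)
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()

        hits, latencies = 0, []
        for query, expected in queries:
            t0 = time.perf_counter()
            found = search(query, k)
            latencies.append(time.perf_counter() - t0)
            hits += expected in found

        results.append({
            "docs": n,
            "hit_rate_at_k": hits / len(queries),
            "median_latency_ms": statistics.median(latencies) * 1000,
            "index_peak_mib": peak / 2**20,
        })
    return results


if __name__ == "__main__":
    for row in benchmark(build_keyword_index, [100, 1_000, 10_000]):
        print(row)
```

The synthetic corpus is deliberately easy, so the toy retriever will not show any accuracy drop. The interesting numbers only appear once you plug in real documents, real queries with known answers, and the actual retrievers you want to compare.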
-
UNSUBSCRIBE
-
Traditional RAG systems see a drop in accuracy beyond 10,000 documents. Do we have any such benchmark limits for PageIndex?