# rag-core

Minimal, elegant RAG framework for TypeScript.

Zero dependencies · Type-safe · Production-grade · Shield included
## Why rag-core?

Most RAG frameworks are heavyweight, opinionated, and leave security as an afterthought. rag-core is different:
| Principle | What it means |
|---|---|
| 🪶 Zero Dependencies | Pure TypeScript. Uses native `fetch`. No bloated dependency tree. |
| 🛡️ Shield Built-in | Prompt injection detection out of the box. Production-grade from day one. |
| 🧩 Modular & Swappable | Every component (embedder, store, ranker) implements a clean interface. Swap OpenAI for Cohere in one line. |
| 🎯 Deep Type Safety | Generic metadata flows through the entire pipeline. Your IDE knows the shape of your data everywhere. |
| ⚡ 10-Line Quickstart | From install to working RAG pipeline in under 10 lines of code. |
## Quickstart

```bash
npm install rag-core
```

```typescript
import { RagCore, OpenAIEmbedder, MemoryStore } from 'rag-core';

const rag = new RagCore({
  embedder: new OpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorStore: new MemoryStore(),
  shield: { detectInjection: true },
});

// Ingest a document: chainable, readable, typed
await rag
  .ingest({ content: 'TypeScript is a typed superset of JavaScript...' })
  .split({ chunkSize: 500 })
  .store();

// Query with automatic shield + vector search
const results = await rag.query('What is TypeScript?', { topK: 5 });
```

That's it: a working RAG pipeline with prompt injection protection in under ten lines.
## Architecture

```
Document → [ Ingestor ] → Chunks → [ Embedder ] → Vectors → [ Store ]

Query → [ Shield ] → [ Embedder ] → [ Store.search ] → [ Ranker ] → Results
```

| Module | Class | Purpose |
|---|---|---|
| Ingestor | RecursiveCharacterSplitter | Smart text chunking with recursive separator hierarchy |
| Embedder | OpenAIEmbedder, CohereEmbedder | Map text to vectors via any embedding API |
| Store | MemoryStore | Vector storage with cosine similarity search |
| Ranker | CohereRanker | Re-rank results with cross-encoder models |
| Shield | InjectionDetector | Detect prompt injection attacks before they reach your LLM |
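To make the Ingestor's recursive separator hierarchy concrete, here is a simplified sketch of the idea (an illustration only, not rag-core's actual implementation): try the coarsest separator first, and recurse with finer separators on any piece that is still too long.

```typescript
// Simplified recursive character splitting: greedily pack pieces split by
// the coarsest separator into chunks of at most chunkSize characters,
// recursing with finer separators when a single piece is still too long.
function recursiveSplit(
  text: string,
  chunkSize: number,
  separators: string[] = ['\n\n', '\n', '. ', ' '],
): string[] {
  if (text.length <= chunkSize) return text ? [text] : [];

  const [sep, ...finer] = separators;
  if (sep === undefined) {
    // No separators left: hard-cut every chunkSize characters.
    const cuts: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) cuts.push(text.slice(i, i + chunkSize));
    return cuts;
  }

  const chunks: string[] = [];
  let current = '';
  for (const piece of text.split(sep)) {
    const joined = current ? current + sep + piece : piece;
    if (joined.length <= chunkSize) {
      current = joined; // piece still fits in the current chunk
    } else {
      if (current) chunks.push(current);
      if (piece.length > chunkSize) {
        // Piece alone exceeds the limit: recurse with finer separators.
        chunks.push(...recursiveSplit(piece, chunkSize, finer));
        current = '';
      } else {
        current = piece;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Chunk overlap and the configurable separator list, as exposed by `RecursiveCharacterSplitter`, are omitted here for brevity.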
## API

### `RagCore`

The main orchestrator. All operations flow through this class.

```typescript
const rag = new RagCore<MyMetadata>({
  embedder: new OpenAIEmbedder({ apiKey: '...' }),
  vectorStore: new MemoryStore<MyMetadata>(),
  ranker: new CohereRanker({ apiKey: '...' }),       // optional
  shield: { detectInjection: true, threshold: 0.7 }, // optional
});
```

Start a chainable ingestion pipeline:
```typescript
await rag
  .ingest({ id: 'doc-1', content: '...', metadata: { source: 'web' } })
  .split({ chunkSize: 500, chunkOverlap: 50 })
  .store();
```

Batch-ingest multiple documents:
```typescript
await rag.ingestMany([doc1, doc2, doc3], { chunkSize: 500 });
```

Query the pipeline with the automatic shield → embed → search → rerank flow:
```typescript
const results = await rag.query('How does X work?', {
  topK: 5,      // number of results (default: 5)
  rerank: true, // enable re-ranking (default: false)
  shield: true, // enable injection check (default: true)
});
```

Manually check any text for prompt injection:
```typescript
const check = rag.shield(userInput);
if (!check.safe) {
  console.warn(`Blocked! Threats: ${check.threats.join(', ')}`);
}
```

### Embedders

```typescript
new OpenAIEmbedder({
  apiKey: 'sk-...',
  model: 'text-embedding-3-small',      // default
  baseUrl: 'https://api.openai.com/v1', // default
});

new CohereEmbedder({
  apiKey: '...',
  model: 'embed-english-v3.0', // default
});
```

Implement the `Embedder` interface to use any provider:
```typescript
import type { Embedder } from 'rag-core';

class MyEmbedder implements Embedder {
  async embed(texts: string[]): Promise<number[][]> { /* ... */ }
  async embedQuery(text: string): Promise<number[]> { /* ... */ }
}
```

### `MemoryStore`

In-memory store with pure cosine similarity. Great for prototyping and small datasets.
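For intuition, cosine similarity scores two vectors by the angle between them, and a top-K search is just a scored sort. A minimal sketch of that idea (illustrative only, not `MemoryStore`'s internals):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Ranges from -1 to 1;
// higher means the vectors point in more similar directions.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-K search: score every stored vector against the query,
// sort descending, and return the indices of the k best matches.
function topK(query: number[], vectors: number[][], k: number): number[] {
  return vectors
    .map((v, index) => ({ index, score: cosineSimilarity(query, v) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((r) => r.index);
}
```

A brute-force scan like this is exactly why an in-memory store suits small datasets; dedicated vector databases replace it with approximate nearest-neighbor indexes.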
```typescript
const store = new MemoryStore<MyMeta>();
store.size;    // number of stored chunks
store.clear(); // remove all
```

Implement the `VectorStore` interface for Pinecone, Weaviate, Qdrant, etc.:
```typescript
import type { VectorStore, EmbeddedChunk, SearchResult } from 'rag-core';

class PineconeStore<TMeta> implements VectorStore<TMeta> {
  async upsert(chunks: EmbeddedChunk<TMeta>[]): Promise<void> { /* ... */ }
  async search(query: number[], topK: number): Promise<SearchResult<TMeta>[]> { /* ... */ }
}
```

### Shield

The unique selling point of rag-core: most frameworks ignore prompt security entirely.
```typescript
import { InjectionDetector, sanitize } from 'rag-core';

const detector = new InjectionDetector(0.7); // threshold
const result = detector.analyze('Ignore all previous instructions...');
// { safe: false, score: 0.9, threats: ['role-override:ignore-previous'] }

const clean = sanitize(rawInput); // strips control chars, normalizes unicode
```

Detected threat categories:
- Role overrides (`ignore previous`, `you are now`, `pretend to be`)
- Delimiter injection (`<|im_start|>`, `[INST]`, `<<SYS>>`)
- Data exfiltration (`show me your prompt`, `repeat the context`)
- Jailbreaks (`DAN mode`, `developer mode`, `god mode`)
- Obfuscation (`base64`, `eval()`, encoding tricks)
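As a rough illustration of how pattern-based detection over categories like these can work (hypothetical patterns and labels; `InjectionDetector`'s real heuristics are more involved than bare regexes):

```typescript
// Toy category-to-pattern table. Real detectors combine many signals
// (normalized text, scoring, thresholds) rather than single regexes.
const THREAT_PATTERNS: Record<string, RegExp> = {
  'role-override': /ignore (all )?previous|you are now|pretend to be/i,
  'delimiter-injection': /<\|im_start\|>|\[INST\]|<<SYS>>/i,
  'data-exfiltration': /show me your prompt|repeat the context/i,
  'jailbreak': /\b(DAN|developer|god) mode\b/i,
  'obfuscation': /base64|eval\(/i,
};

// Return the list of categories whose pattern matches the input.
function detectThreats(input: string): string[] {
  return Object.entries(THREAT_PATTERNS)
    .filter(([, pattern]) => pattern.test(input))
    .map(([category]) => category);
}
```

An empty result means no known pattern fired, which is necessary but not sufficient for safety; that is why the real detector also scores against a threshold.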
### `RecursiveCharacterSplitter`

```typescript
import { RecursiveCharacterSplitter } from 'rag-core';

const splitter = new RecursiveCharacterSplitter({
  chunkSize: 500,   // max chars per chunk (default: 500)
  chunkOverlap: 50, // overlap between chunks (default: 50)
  separators: ['\n\n', '\n', '. ', ' ', ''], // custom hierarchy
});

const chunks = splitter.split({ id: 'doc-1', content: longText });
```

## Type Safety

rag-core uses TypeScript generics so your metadata type flows through the entire pipeline:
```typescript
interface MyMeta {
  source: string;
  page: number;
  confidential: boolean;
}

const rag = new RagCore<MyMeta>({
  embedder: new OpenAIEmbedder({ apiKey: '...' }),
  vectorStore: new MemoryStore<MyMeta>(),
});

// Metadata is typed everywhere
await rag.ingest({
  content: '...',
  metadata: { source: 'report.pdf', page: 42, confidential: true },
}).split().store();

const results = await rag.query('...');
results[0].chunk.metadata?.source; // ✅ TypeScript knows this is `string`
results[0].chunk.metadata?.page;   // ✅ TypeScript knows this is `number`
```

## License

MIT © rag-core contributors