⚡ rag-core

Minimal, elegant RAG framework for TypeScript

Zero dependencies · Type-safe · Production-grade · Shield included

Why rag-core?

Most RAG frameworks are heavyweight, opinionated, and leave security as an afterthought. rag-core is different:

Principle	What it means
🪶 Zero Dependencies	Pure TypeScript. Uses native `fetch`. No bloated dependency tree.
🔒 Shield Built-in	Prompt injection detection out of the box. Production-grade from day one.
🧩 Modular & Swappable	Every component — embedder, store, ranker — implements a clean interface. Swap OpenAI for Cohere in one line.
🎯 Deep Type Safety	Generic metadata flows through the entire pipeline. Your IDE knows the shape of your data everywhere.
⚡ 10-Line Quickstart	From install to working RAG pipeline in under 10 lines of code.

Quick Start

npm install rag-core

    import { RagCore, OpenAIEmbedder, MemoryStore } from 'rag-core';

    const rag = new RagCore({
      embedder: new OpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY! }),
      vectorStore: new MemoryStore(),
      shield: { detectInjection: true },
    });

    // Ingest a document: chainable, readable, typed
    await rag
      .ingest({ content: 'TypeScript is a typed superset of JavaScript...' })
      .split({ chunkSize: 500 })
      .store();

    // Query with automatic shield + vector search
    const results = await rag.query('What is TypeScript?', { topK: 5 });

That's it: four statements give you a working RAG pipeline with prompt injection protection.

Architecture

    Document → [ Ingestor ] → Chunks → [ Embedder ] → Vectors → [ Store ]
                                                                    ↓
    Query → [ Shield ] → [ Embedder ] → [ Store.search ] → [ Ranker ] → Results

The 5 Pillars

Module	Class	Purpose
Ingestor	`RecursiveCharacterSplitter`	Smart text chunking with recursive separator hierarchy
Embedder	`OpenAIEmbedder`, `CohereEmbedder`	Map text to vectors via any embedding API
Store	`MemoryStore`	Vector storage with cosine similarity search
Ranker	`CohereRanker`	Re-rank results with cross-encoder models
Shield	`InjectionDetector`	Detect prompt injection attacks before they reach your LLM

API Reference

`RagCore<TMeta>`

The main orchestrator. All operations flow through this class.

    const rag = new RagCore<MyMetadata>({
      embedder: new OpenAIEmbedder({ apiKey: '...' }),
      vectorStore: new MemoryStore<MyMetadata>(),
      ranker: new CohereRanker({ apiKey: '...' }),        // optional
      shield: { detectInjection: true, threshold: 0.7 },  // optional
    });

`.ingest(document)` → `Pipeline`

Start a chainable ingestion pipeline:

    await rag
      .ingest({ id: 'doc-1', content: '...', metadata: { source: 'web' } })
      .split({ chunkSize: 500, chunkOverlap: 50 })
      .store();

`.ingestMany(documents, options?)` → `Promise<EmbeddedChunk[]>`

Batch-ingest multiple documents:

await rag.ingestMany([doc1, doc2, doc3], { chunkSize: 500 });

`.query(question, options?)` → `Promise<SearchResult[]>`

Query the pipeline with automatic shield → embed → search → rerank:

    const results = await rag.query('How does X work?', {
      topK: 5,       // number of results (default: 5)
      rerank: true,  // enable re-ranking (default: false)
      shield: true,  // enable injection check (default: true)
    });
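To make the shield → embed → search → rerank order concrete, here is a minimal sketch of how such a query step could be composed. It is not rag-core's source: `QueryDeps`, `Hit`, and `runQuery` are hypothetical names, and throwing when the shield flags the input is an assumption, since the behaviour on a blocked query is not documented here. Only the option defaults (`topK: 5`, `rerank: false`, `shield: true`) come from the description above.

```typescript
// Hypothetical stand-ins for the pipeline components; not rag-core's real types.
interface Hit {
  text: string;
  score: number;
}

interface QueryDeps {
  check: (text: string) => { safe: boolean; threats: string[] };
  embed: (text: string) => Promise<number[]>;
  search: (vector: number[], topK: number) => Promise<Hit[]>;
  rerank?: (question: string, hits: Hit[]) => Promise<Hit[]>;
}

interface QueryOptions {
  topK?: number;
  rerank?: boolean;
  shield?: boolean;
}

async function runQuery(
  deps: QueryDeps,
  question: string,
  options: QueryOptions = {},
): Promise<Hit[]> {
  // Defaults mirror the documented ones: topK 5, rerank off, shield on.
  const { topK = 5, rerank = false, shield = true } = options;

  // 1. Shield: check the raw question before anything else runs.
  if (shield) {
    const verdict = deps.check(question);
    if (!verdict.safe) {
      // Assumption: a flagged query is rejected with an error.
      throw new Error(`Blocked by shield: ${verdict.threats.join(', ')}`);
    }
  }

  // 2. Embed the question, 3. search the store.
  const vector = await deps.embed(question);
  const hits = await deps.search(vector, topK);

  // 4. Re-rank only when requested and a ranker is configured.
  return rerank && deps.rerank ? deps.rerank(question, hits) : hits;
}
```

Running the shield first means a flagged question never reaches the embedding API, so a blocked query costs no network call.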

`.shield(input)` → `ShieldResult`

Manually check any text for prompt injection:

    const check = rag.shield(userInput);
    if (!check.safe) {
      console.warn(`Blocked! Threats: ${check.threats.join(', ')}`);
    }

Embedders

`OpenAIEmbedder`

    new OpenAIEmbedder({
      apiKey: 'sk-...',
      model: 'text-embedding-3-small',       // default
      baseUrl: 'https://api.openai.com/v1',  // default
    });

`CohereEmbedder`

    new CohereEmbedder({
      apiKey: '...',
      model: 'embed-english-v3.0',  // default
    });

Custom Embedder

Implement the Embedder interface to use any provider:

    import type { Embedder } from 'rag-core';

    class MyEmbedder implements Embedder {
      async embed(texts: string[]): Promise<number[][]> { /* ... */ }
      async embedQuery(text: string): Promise<number[]> { /* ... */ }
    }
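As a concrete illustration, the sketch below fills in the two methods with a toy, offline embedder that hashes word tokens into a fixed number of buckets. It can be handy in tests where calling a real embedding API is undesirable. The `Embedder` interface is redeclared locally so the snippet compiles on its own; `HashingEmbedder`, `hashVector`, and the 64-dimension default are hypothetical names and choices, and the vectors carry no semantic meaning.

```typescript
// Local copy of the two-method interface shown above, so the sketch is self-contained.
interface Embedder {
  embed(texts: string[]): Promise<number[][]>;
  embedQuery(text: string): Promise<number[]>;
}

// FNV-1a style 32-bit string hash, returned as an unsigned integer.
function fnv1a(token: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < token.length; i++) {
    h ^= token.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Count lowercase alphanumeric tokens per hash bucket, then L2-normalize.
function hashVector(text: string, dimensions: number): number[] {
  const vec = new Array<number>(dimensions).fill(0);
  const tokens = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const token of tokens) {
    vec[fnv1a(token) % dimensions] += 1;
  }
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? vec : vec.map((x) => x / norm);
}

// Hypothetical toy embedder: deterministic, offline, not semantically meaningful.
class HashingEmbedder implements Embedder {
  private readonly dimensions: number;

  constructor(dimensions = 64) {
    this.dimensions = dimensions;
  }

  async embed(texts: string[]): Promise<number[][]> {
    return texts.map((text) => hashVector(text, this.dimensions));
  }

  async embedQuery(text: string): Promise<number[]> {
    return hashVector(text, this.dimensions);
  }
}
```

Because documents and queries go through the same `hashVector` function, a query that repeats a document's words lands close to it under cosine similarity, which is enough to exercise a pipeline end to end.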

Vector Stores

`MemoryStore<TMeta>`

In-memory store with pure cosine similarity. Great for prototyping and small datasets.

    const store = new MemoryStore<MyMeta>();
    store.size;     // number of stored chunks
    store.clear();  // remove all
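For readers unfamiliar with cosine similarity search, the following sketch shows the general technique: score every stored vector against the query and keep the best `k`. It is an illustration rather than `MemoryStore`'s actual implementation; `cosineSimilarity` and `topK` are hypothetical helper names.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force search: score every vector, sort descending, keep the first k.
function topK(
  query: number[],
  vectors: number[][],
  k: number,
): { index: number; score: number }[] {
  return vectors
    .map((vector, index) => ({ index, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this costs time proportional to the number of stored vectors per query, which is consistent with the advice to use an in-memory store for prototyping and small datasets.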

Custom Store

Implement the VectorStore interface for Pinecone, Weaviate, Qdrant, etc.:

    import type { VectorStore, EmbeddedChunk, SearchResult } from 'rag-core';

    class PineconeStore<TMeta> implements VectorStore<TMeta> {
      async upsert(chunks: EmbeddedChunk<TMeta>[]): Promise<void> { /* ... */ }
      async search(query: number[], topK: number): Promise<SearchResult<TMeta>[]> { /* ... */ }
    }

Shield Layer

The shield layer is rag-core's main differentiator: it checks input for prompt injection before it reaches your LLM, a step most RAG frameworks leave to the application.

    import { InjectionDetector, sanitize } from 'rag-core';

    const detector = new InjectionDetector(0.7); // threshold
    const result = detector.analyze('Ignore all previous instructions...');
    // { safe: false, score: 0.9, threats: ['role-override:ignore-previous'] }

    const clean = sanitize(rawInput); // strips control chars, normalizes unicode

Detected threat categories:

🛡️ Role overrides (ignore previous, you are now, pretend to be)
🔓 Delimiter injection (<|im_start|>, [INST], <<SYS>>)
📤 Data exfiltration (show me your prompt, repeat the context)
🎭 Jailbreaks (DAN mode, developer mode, god mode)
🔀 Obfuscation (base64, eval(), encoding tricks)
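The categories above suggest a rule-based detector. The sketch below shows one way such a check could work: a table of regular expressions with weights, where the score is the strongest match and the input is unsafe once the score reaches the threshold. The rule table, the weights, the "strongest match" scoring, and the use of NFKC normalization in `sanitizeText` are all assumptions made for illustration; rag-core's real rules and scoring are not documented here.

```typescript
// Result shape modelled on the example output shown earlier in this section.
interface ShieldResult {
  safe: boolean;
  score: number;
  threats: string[];
}

// Hypothetical rule table: labels, patterns, and weights are illustrative only.
const RULES: { label: string; pattern: RegExp; weight: number }[] = [
  { label: 'role-override:ignore-previous', pattern: /ignore\s+(all\s+)?previous\s+instructions/i, weight: 0.9 },
  { label: 'role-override:you-are-now', pattern: /\byou\s+are\s+now\b/i, weight: 0.7 },
  { label: 'delimiter:chat-template', pattern: /<\|im_start\|>|\[INST\]|<<SYS>>/i, weight: 0.8 },
  { label: 'exfiltration:show-prompt', pattern: /show\s+me\s+your\s+(system\s+)?prompt/i, weight: 0.8 },
  { label: 'jailbreak:dan-mode', pattern: /\bDAN\s+mode\b/i, weight: 0.9 },
];

// Strip ASCII control characters (keeping tab, LF, CR) and normalize Unicode.
// Assumption: NFKC is used so that look-alike forms such as fullwidth letters fold to ASCII.
function sanitizeText(input: string): string {
  return input
    .normalize('NFKC')
    .replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, '');
}

function analyze(input: string, threshold = 0.7): ShieldResult {
  const text = sanitizeText(input);
  const hits = RULES.filter((rule) => rule.pattern.test(text));
  // Assumption: the overall score is the weight of the strongest matching rule.
  const score = hits.reduce((max, rule) => Math.max(max, rule.weight), 0);
  return { safe: score < threshold, score, threats: hits.map((rule) => rule.label) };
}
```

Sanitizing before matching matters here: without it, an attacker could slip past the patterns by inserting control characters or using look-alike Unicode forms. Pattern matching of this kind only catches known phrasings, so it is best treated as one layer of defence rather than a guarantee.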

Splitter

    import { RecursiveCharacterSplitter } from 'rag-core';

    const splitter = new RecursiveCharacterSplitter({
      chunkSize: 500,                             // max chars per chunk (default: 500)
      chunkOverlap: 50,                           // overlap between chunks (default: 50)
      separators: ['\n\n', '\n', '. ', ' ', ''],  // custom hierarchy
    });

    const chunks = splitter.split({ id: 'doc-1', content: longText });
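The idea behind a recursive separator hierarchy is to split on the coarsest separator first and fall back to finer ones only for pieces that are still too large. The sketch below illustrates that idea under simplifying assumptions: it is not `RecursiveCharacterSplitter`'s source, it ignores `chunkOverlap`, and `splitRecursive` and `splitKeep` are hypothetical names.

```typescript
// Split on `sep`, keeping the separator attached to the preceding piece
// so that joining the pieces reproduces the original text.
function splitKeep(text: string, sep: string): string[] {
  const parts = text.split(sep);
  return parts
    .map((part, i) => (i < parts.length - 1 ? part + sep : part))
    .filter((part) => part.length > 0);
}

function splitRecursive(text: string, chunkSize: number, separators: string[]): string[] {
  if (text.length <= chunkSize) return text.length > 0 ? [text] : [];

  const [sep, ...rest] = separators;
  // No separator left (or the empty separator): fall back to fixed-size cuts.
  if (sep === undefined || sep === '') {
    const out: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      out.push(text.slice(i, i + chunkSize));
    }
    return out;
  }

  const chunks: string[] = [];
  let current = '';
  for (const piece of splitKeep(text, sep)) {
    if (piece.length > chunkSize) {
      // Piece is still too large: flush the buffer and recurse with finer separators.
      if (current) {
        chunks.push(current);
        current = '';
      }
      chunks.push(...splitRecursive(piece, chunkSize, rest));
    } else if (current.length + piece.length > chunkSize) {
      // Adding the piece would overflow the chunk: start a new one.
      chunks.push(current);
      current = piece;
    } else {
      // Greedily pack small pieces into the current chunk.
      current += piece;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

This version is lossless: concatenating the chunks gives back the input, and no chunk exceeds `chunkSize`. A fuller splitter would typically also trim whitespace-only chunks and repeat the tail of each chunk at the start of the next to implement overlap.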

Type Safety

rag-core uses TypeScript generics so your metadata type flows through the entire pipeline:

    interface MyMeta {
      source: string;
      page: number;
      confidential: boolean;
    }

    const rag = new RagCore<MyMeta>({
      embedder: new OpenAIEmbedder({ apiKey: '...' }),
      vectorStore: new MemoryStore<MyMeta>(),
    });

    // Metadata is typed everywhere
    await rag.ingest({
      content: '...',
      metadata: { source: 'report.pdf', page: 42, confidential: true },
    }).split().store();

    const results = await rag.query('...');
    results[0].chunk.metadata?.source;  // ✅ TypeScript knows this is `string`
    results[0].chunk.metadata?.page;    // ✅ TypeScript knows this is `number`
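For readers curious how a metadata type can travel through a pipeline, the sketch below shows the underlying TypeScript mechanism in isolation: one generic parameter threaded from the input type to the result type. `Chunk`, `SearchResult`, `rankChunks`, and `PageMeta` are hypothetical shapes written for this illustration, and rag-core's real type definitions may differ.

```typescript
// Hypothetical shapes for illustration only.
interface Chunk<TMeta> {
  id: string;
  content: string;
  metadata?: TMeta;
}

interface SearchResult<TMeta> {
  chunk: Chunk<TMeta>;
  score: number;
}

// TMeta is inferred from the chunks passed in and carried through to the results.
function rankChunks<TMeta>(
  chunks: Chunk<TMeta>[],
  scoreOf: (chunk: Chunk<TMeta>) => number,
  topK: number,
): SearchResult<TMeta>[] {
  return chunks
    .map((chunk) => ({ chunk, score: scoreOf(chunk) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

interface PageMeta {
  source: string;
  page: number;
}

const pages: Chunk<PageMeta>[] = [
  { id: 'a', content: 'intro', metadata: { source: 'report.pdf', page: 1 } },
  { id: 'b', content: 'typescript generics', metadata: { source: 'report.pdf', page: 42 } },
];

// Toy scorer: longer content scores higher.
const ranked = rankChunks(pages, (chunk) => chunk.content.length, 1);

// Type-checks without a cast because the result is SearchResult<PageMeta>.
const page: number | undefined = ranked[0].chunk.metadata?.page;
```

Because `metadata` is optional in this sketch, access goes through `?.`, matching the `metadata?.source` access pattern in the example above.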
