feat: plugin manager by midweste · Pull Request #12 · giancarloerra/SocratiCode

midweste · 2026-03-17T20:37:25Z

Summary

Adds a PluginManager class that enables SocratiCode to be extended via self-contained plugins without modifying core code. Plugins are auto-discovered from src/plugins/*/index.ts at startup and receive lifecycle hooks. All plugin errors are non-fatal — a failing plugin never affects the indexer.

SocratiCode gives AI agents context about what code does and how it's structured. Plugins extend that context with knowledge that can't be extracted from source files alone — things like why code was written a certain way, which parts of the codebase are most volatile, or what implicit dependencies exist between components. This additional context is stored alongside the existing index and surfaced automatically during search, giving AI agents a deeper understanding of the project without any changes to the core.

Changes

[NEW] src/services/plugin.ts — PluginManager class with auto-discovery, registration, lifecycle dispatch, and non-fatal error isolation
[NEW] src/plugins/README.md — Plugin convention docs: folder structure, interface, and how to create a plugin
[NEW] tests/unit/plugin.test.ts — 12 tests covering registration, hook dispatch, error isolation, and shutdown
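
For readers who want a picture of the pattern described above, here is a minimal sketch of registration plus non-fatal hook dispatch. It is not the code in src/services/plugin.ts: the hook names follow the ones mentioned later in this thread, while `ProgressFn`'s shape, the `dispatch` and `shutdown` method names, and the argument-less `onShutdown` are assumptions made for illustration.

```typescript
// Hypothetical sketch of the pattern, not the actual PluginManager from this PR.
type ProgressFn = (message: string) => void; // assumed shape

interface Plugin {
  name: string;
  onProjectIndexed?(projectPath: string, onProgress?: ProgressFn): Promise<void>;
  onProjectUpdated?(projectPath: string, onProgress?: ProgressFn): Promise<void>;
  onProjectRemoved?(projectPath: string, onProgress?: ProgressFn): Promise<void>;
  onShutdown?(): Promise<void>; // assumed to take no arguments
}

type ProjectHook = "onProjectIndexed" | "onProjectUpdated" | "onProjectRemoved";

class PluginManager {
  private readonly plugins: Plugin[] = [];

  register(plugin: Plugin): void {
    this.plugins.push(plugin);
  }

  // Calls one hook on every plugin in registration order.
  // Errors are caught per plugin and reported by name, never rethrown.
  async dispatch(hook: ProjectHook, projectPath: string, onProgress?: ProgressFn): Promise<string[]> {
    const failed: string[] = [];
    for (const plugin of this.plugins) {
      const fn = plugin[hook];
      if (!fn) continue; // hooks are optional
      try {
        await fn.call(plugin, projectPath, onProgress);
      } catch {
        failed.push(plugin.name);
      }
    }
    return failed;
  }

  // Gives every plugin a chance to clean up, even if an earlier one throws.
  async shutdown(): Promise<string[]> {
    const failed: string[] = [];
    for (const plugin of this.plugins) {
      try {
        await plugin.onShutdown?.();
      } catch {
        failed.push(plugin.name);
      }
    }
    return failed;
  }
}
```

Returning the names of failed plugins, rather than throwing, is one way to keep the indexer unaffected while still letting the caller log what went wrong.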

Type of change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactoring (no functional changes)
Test coverage improvement

Testing

Unit tests pass (npm run test:unit)
Integration tests pass (npm run test:integration) — if applicable
TypeScript compiles cleanly (npx tsc --noEmit)
New tests added for new/changed functionality

12 tests covering: plugin registration, hook dispatch order, non-fatal error isolation, onProgress forwarding, shutdown resilience, and test reset. No plugins directory = gracefully skipped.

Checklist

My code follows the existing code style and conventions
I have added/updated JSDoc comments where appropriate
I have updated documentation (README.md / DEVELOPER.md) if needed
I have read the Contributing Guide
I agree to the Contributor License Agreement

Related issues

None

giancarloerra · 2026-03-18T10:45:55Z

Thank you for this and for the clean implementation, midweste. It's well structured, with good error isolation and test coverage.

Adding a plugin/extension system is a significant architectural decision that I'd like to think through carefully before committing to it. The project is still young and changing rapidly, and it feels a bit early. Some considerations:

The plugin interface becomes a public API surface that's hard to change once adopted
Exposing getClient() from qdrant.ts gives plugins direct Qdrant access, which has security implications: one badly coded or malicious plugin could cause havoc
There are currently no concrete plugins with strong use cases to validate the design or the need for it

SocratiCode already has context artifacts for extending project knowledge without code changes, which may partially overlap with this.

I'm not ruling this out for the future, I do like the idea of external plugins.

But I'd prefer to think about it alongside a few concrete plugins, so that there are use cases for it and the API is validated by real usage. I definitely prefer plugins to bloated code that goes beyond its core design, but I'd like the plugin surface to be very thin, posing no potential security risk or other concern for the core functionality and indexes.

I'll keep this open for now for further comments, including from other contributors. In the meantime, I'd like to first work more on the core product, smooth out existing bugs and implement core features.

midweste · 2026-03-18T12:13:27Z

Thanks for the thoughtful review and for keeping the door open. Totally understand wanting to be careful with architectural decisions this early.

I want to share the context behind why I built the plugin system — there's a concrete feature driving it.

Git Memory Plugin

SocratiCode answers "what does this code do?" through semantic search and context artifacts. Git Memory fills a gap: "why was it written this way?"

When a project is indexed, it reads unprocessed git commits — diffs, messages, and git-trailers — batches them, and sends them to a configurable LLM (OpenRouter, OpenAI, Google, Ollama) to extract structured memories: architectural decisions, bug fixes, refactors, patterns. These get embedded and stored in the same context_{id} collection with type: "git-memory", so existing search returns them alongside context artifacts with zero changes needed.

For example, a search for "authentication middleware" wouldn't just find the code — it would also surface "Switched from JWT to sessions due to XSS vulnerability (commit abc123)." An AI assistant would know not to reintroduce a bug because it can find "This validation was missing — caused production outage, fixed in def456."

It's fully opt-in (GIT_MEMORY_ENABLED=true), runs in the background, never blocks indexing, and is incremental. When disabled, zero code executes. I have it working with full test coverage.

I originally built it just as a local project MCP in Go, but I see that what I've built can complement a fully indexed and searchable codebase.

On the Plugin Architecture

Git-memory is a first use case for plugin architecture. The interface is intentionally minimal — 4 optional lifecycle hooks — and I think it's actually the right pattern for SocratiCode going forward. Features like this should be isolated modules that plug into the lifecycle, not code wired throughout the core. That keeps the core lean and each feature self-contained.

That said, your concerns about getClient() are valid. One approach: the PluginManager itself could expose a scoped API of approved Qdrant operations — scroll, count, setPayload — so plugins interact through the manager instead of importing core services directly. That way the PluginManager acts as a sandbox, and you control exactly what plugins can do. Context artifacts and git memory complement each other — artifacts are static docs you write, git memory is dynamic knowledge extracted automatically from commits.
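
To make the scoped-API idea concrete, here is a minimal sketch of a facade that exposes only scroll, count, and setPayload. The `VectorClient` interface is a hypothetical stand-in rather than the real Qdrant client typing, and the extra per-collection check is an added assumption that goes slightly beyond what is proposed above.

```typescript
// Hypothetical client shape for illustration; NOT the real Qdrant client API.
interface VectorClient {
  scroll(collection: string, limit: number): Promise<unknown[]>;
  count(collection: string): Promise<number>;
  setPayload(collection: string, id: string, payload: Record<string, unknown>): Promise<void>;
  deleteCollection(collection: string): Promise<void>;
}

// Plugins only ever see these three operations.
type ScopedStore = Pick<VectorClient, "scroll" | "count" | "setPayload">;

// Builds the facade. `allowed` is an assumed extra guard that restricts which
// collections a plugin may touch at all.
function createScopedStore(
  client: VectorClient,
  allowed: (collection: string) => boolean,
): ScopedStore {
  const guard = (collection: string): void => {
    if (!allowed(collection)) {
      throw new Error(`Plugin access denied for collection: ${collection}`);
    }
  };
  return {
    scroll: async (collection, limit) => {
      guard(collection);
      return client.scroll(collection, limit);
    },
    count: async (collection) => {
      guard(collection);
      return client.count(collection);
    },
    setPayload: async (collection, id, payload) => {
      guard(collection);
      return client.setPayload(collection, id, payload);
    },
  };
}
```

Because the returned object is built field by field, destructive operations such as `deleteCollection` are simply absent at runtime, not just hidden by the type.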

I'd love to get your take on the git-memory feature itself — if it makes sense, we can work out the right integration approach together.

Happy to discuss.

giancarloerra · 2026-03-18T14:44:49Z

The Git Memory idea is interesting, and thanks for addressing the security concern. A few thoughts:

Feature vs plugin: Git Memory feels more like a core feature behind a flag (like INCLUDE_DOT_FILES or context artifacts) than something that validates a plugin system (more below).

LLM dependency: SocratiCode today uses just embeddings. Simple, local, no API keys (by default). Git Memory would, I think, need a generative LLM to read diffs and extract structured memories, which means configuring providers, API keys, picking models. That's a different level of complexity that I'm trying to avoid. How would it work without that?

Existing artifacts: Could raw commit messages + diffs be embedded directly without LLM structuring? How could they be kept up to date? If possible, implementing a simple script to update a folder with all of that would mean artifacts could already cover it. I keep thinking that git memory is something for the existing artifacts more than anything.

Existing tools: most coding AI agents already have access to git — they can run git log, git blame, search commit history dynamically. So is it really a concern for SocratiCode?

On plugins generally: I really do like the idea of opening SocratiCode to community extensions — but I think a plugin should be a lighter touch. Something that enriches the index (or the use of it) but without introducing heavy dependencies or high complexity.

I'd love to see the Git Memory implementation to understand the architecture better — even if we end up shipping it as a core feature (maybe as part of artifacts) rather than a plugin. Happy to keep discussing :-)

midweste · 2026-03-18T18:56:22Z

> The Git Memory idea is interesting, and thanks for addressing the security concern. A few thoughts:

> Feature vs plugin: Git Memory feels more like a core feature behind a flag (like INCLUDE_DOT_FILES or context artifacts) than something that validates a plugin system (more below).

I understand that perspective too. I never want to be too presumptuous with other people's projects :)

> LLM dependency: SocratiCode today uses just embeddings. Simple, local, no API keys (by default). Git Memory would, I think, need a generative LLM to read diffs and extract structured memories, which means configuring providers, API keys, picking models. That's a different level of complexity that I'm trying to avoid. How would it work without that?

Currently I've built it to use OpenRouter. It first evaluates the free models that meet the criteria for extraction, triage, and synthesis, and automatically picks a free model if none is set explicitly. Almost any model can do extraction; triage and synthesis need some reasoning ability.

The three steps it does via LLM are:

Extraction:

Pulls raw memories out and types them ("decision", "pattern", "convention", "context", "debt", "bug_fix", "refactor", "feature", "architecture")
Uses git trailers for additional context provided by the user (or by the AI that makes the commit, since it knows the conversation context and will add things like "chose to use OpenRouter over OpenAI because X")

Triage:

Uses reasoning LLM to dedupe redundant memories

Synthesis (where some of the magic happens; as currently set up, it needs a good-sized context window):

Relationship links between memories ("supersedes", "contradicts", "supports", "extends", "related_to", "depends_on", "caused_by", "alternative_to")
Heuristic scoring: computes importance and confidence
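
As a rough illustration of the data model implied by the three steps above, the memory types and relationship kinds can be written as closed TypeScript unions. The string values come from the lists above; the `MemoryLink` shape, the reading of a link as "`from` relation `to`", and both helper functions are assumptions made for this sketch.

```typescript
// String values are taken from the lists above; everything else is hypothetical.
const GIT_MEMORY_TYPES = [
  "decision", "pattern", "convention", "context", "debt",
  "bug_fix", "refactor", "feature", "architecture",
] as const;
type GitMemoryType = (typeof GIT_MEMORY_TYPES)[number];

const RELATION_TYPES = [
  "supersedes", "contradicts", "supports", "extends",
  "related_to", "depends_on", "caused_by", "alternative_to",
] as const;
type RelationType = (typeof RELATION_TYPES)[number];

// Assumed link shape, read as: `from` <relation> `to`.
interface MemoryLink {
  from: string;
  to: string;
  relation: RelationType;
}

// Narrows an arbitrary string (for example, raw LLM output) to a known memory type.
function isGitMemoryType(value: string): value is GitMemoryType {
  return (GIT_MEMORY_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Collects the ids of memories that some other memory supersedes, so a consumer
// can down-rank or hide them.
function supersededIds(links: MemoryLink[]): Set<string> {
  return new Set(
    links.filter((link) => link.relation === "supersedes").map((link) => link.to),
  );
}
```

Validating model output against a closed list like this is one way to keep unexpected labels out of the index.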

> Existing artifacts: Could raw commit messages + diffs be embedded directly without LLM structuring? How could they be kept up to date? If possible, implementing a simple script to update a folder with all of that would mean artifacts could already cover it. I keep thinking that git memory is something for the existing artifacts more than anything.

I expect so, but I do think the "why" is where some of this becomes more valuable. Without any LLM, though, "git-commit" could be a context artifact of its own and have its own links to search results. "Git memory lite"?

> Existing tools: most coding AI agents already have access to git: they can run git log, git blame, search commit history dynamically. So is it really a concern for SocratiCode?

But do they use it? That's the real question. In my experience, I have to keep telling it over and over to scan files, and tell it what to scan myself. MCPs seem to function as first-class tools that it will actually use. I talk to Opus a lot about why it doesn't use certain things, and it tells me that the more friction something involves, the more likely it is to skip it and just do it the "old-fashioned" way with grep etc. Maybe this is a cost-cutting methodology in how they train the models, or in the injected prompts.

> On plugins generally: I really do like the idea of opening SocratiCode to community extensions, but I think a plugin should be a lighter touch. Something that enriches the index (or the use of it) but without introducing heavy dependencies or high complexity.

I don't really have a preference one way or another for this. The reason I'm here is that I liked what you put together and thought my beta project was a natural fit. I did some dry-run AI testing with both systems as MCPs before I even considered porting it, and Opus reported a lot of complementary results. I had it mentally dry-run a feature addition to a codebase; it used both MCPs and told me what information it would use from each system and how that would influence the way it went about building the new feature.

I do like modularity (even your existing indexer could, in theory, be a plugin), but if it works as part of the existing codebase that's fine too, since the plugin manager I submitted contains really the only touch points needed.

> I'd love to see the Git Memory implementation to understand the architecture better, even if we end up shipping it as a core feature (maybe as part of artifacts) rather than a plugin. Happy to keep discussing :-)

What's the best way for me to do this? I can push to my fork after I test a couple runs. My gap analysis of the port is nearly covered now.

My main motivation for the git memory MCP was that I kinda realized that git is probably one of the best and most available memory systems that a project has at its disposal. Yes, memories can be superseded by later commits, but let's be honest: once it ends up in the repo, it's a meaningful thing to remember.

midweste · 2026-03-19T00:53:57Z

OK, I got the basic flow working for memory additions. Here's a document I asked Opus to make that walks through a theoretical feature. It chose a "Test coverage plugin", which I'm guessing is where test coverage information is added to the index. I'm not even sure this makes sense, but it does show what it thinks about what it's finding:

Dry Run: "Add a Test Coverage Plugin"

A walkthrough of how an AI agent researches a new feature using SocratiCode. Each step shows the combined results the agent receives, tagged by source.

Phase 1: How Do I Create a Plugin?

The agent runs two searches in parallel and receives these combined results:

| What the agent learned | Source |
| --- | --- |
| File convention is `src/plugins/*/index.ts`: auto-discovered at load time, no registration config needed | 🧠 git memory |
| All hooks are optional and non-fatal: plugin crashes don't take down the indexer | 🧠 git memory |
| Old API `registerPlugin()` was renamed to `pluginManager.register()`; stale examples exist in docs | 🧠 git memory |
| Exact interface: `name: string` + 4 optional async hooks (`onProjectIndexed`, `onProjectUpdated`, `onProjectRemoved`, `onShutdown`) | 🔍 code search |
| Hook signatures: `(projectPath: string, onProgress?: ProgressFn) => Promise` | 🔍 code search |
| Working registration template: pre-flight checks → create resources → define plugin object → register | 🔍 code search |

Agent is ready to: scaffold src/plugins/test-coverage/index.ts with the correct interface and registration pattern, knowing errors are safely isolated.

Phase 2: How Do I Store Data?

| What the agent learned | Source |
| --- | --- |
| ⚠️ Must use relative paths: absolute paths caused a shipped bug (v1.1.3 patch) in worktree indexing. Using absolute paths would silently break shared indexes. | 🧠 git memory |
| Data goes in the existing `context_` collection (same collection as other artifacts); don't create a new one | 🧠 git memory |
| Storage API: `upsertPreEmbeddedChunks(collection, points[])` where each point has `{id, vector, bm25Text, payload}` | 🔍 code search |
| Batching: upserts happen in batches of 100 with automatic per-point fallback on failure | 🔍 code search |
| Need `ensurePayloadIndex(collection, fieldName)` for any fields used in filtering | 🔍 code search |

Agent is ready to: write the storage layer using the correct API, with relative paths, into the shared collection.

The relative-path constraint is the most valuable piece here. This information exists only in git history — it is not documented or visible in current code. An agent without this context could re-introduce the exact bug that was already debugged and patched.
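
The batching behaviour listed in the Phase 2 table (batches of 100, with a per-point fallback when a batch fails) can be sketched generically as follows. The function name and the `upsert` callback are hypothetical; this is an illustration of the technique, not SocratiCode's actual `upsertPreEmbeddedChunks` implementation.

```typescript
// Generic sketch of "batch upsert with per-point fallback".
// `upsert` stands in for whatever storage call is used (a hypothetical parameter).
async function upsertInBatches<T>(
  points: T[],
  upsert: (batch: T[]) => Promise<void>,
  batchSize = 100,
): Promise<T[]> {
  const failed: T[] = [];
  for (let i = 0; i < points.length; i += batchSize) {
    const batch = points.slice(i, i + batchSize);
    try {
      await upsert(batch);
    } catch {
      // The batch failed: retry each point alone so that one bad point
      // cannot prevent the rest of the batch from being stored.
      for (const point of batch) {
        try {
          await upsert([point]);
        } catch {
          failed.push(point);
        }
      }
    }
  }
  return failed; // points that could not be stored even individually
}
```

The fallback costs extra round trips only for batches that actually fail, so the common path stays at one call per 100 points.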

Phase 3: Collection & Project Identity

| What the agent learned | Source |
| --- | --- |
| `projectIdFromPath()` generates a stable ID: SHA-256 hash or explicit `SOCRATICODE_PROJECT_ID` env var | 🔍 code search |
| `contextCollectionName(projectId)` derives the collection name; pattern: `context_{id}` | 🔍 code search |
| Three collection families exist: `codebase_`, `codegraph_`, `context_`; test coverage belongs in `context_` | 🔍 code search |
| Worktrees share the same collection via `SOCRATICODE_PROJECT_ID`; test coverage data must be worktree-safe | 🧠 git memory |

Agent is ready to: use the canonical naming functions instead of constructing collection names manually.
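
For illustration, the identity scheme described in the Phase 3 table could look roughly like the sketch below. It reuses the function names from the table, but the bodies are a guess: in particular, hashing the raw path string and truncating the digest to 16 hex characters are assumptions, and the real functions may normalise paths or size the ID differently.

```typescript
import { createHash } from "node:crypto";

// Hypothetical re-implementation for illustration only.
// An explicit SOCRATICODE_PROJECT_ID wins; otherwise the ID is derived from the path.
function projectIdFromPath(
  projectPath: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const explicit = env.SOCRATICODE_PROJECT_ID;
  if (explicit) return explicit;
  // Assumption: SHA-256 of the raw path, truncated to 16 hex characters.
  return createHash("sha256").update(projectPath).digest("hex").slice(0, 16);
}

// Derives the context collection name using the `context_{id}` pattern.
function contextCollectionName(projectId: string): string {
  return `context_${projectId}`;
}
```

With an explicit ID set, two worktrees at different paths resolve to the same collection name, which is the sharing behaviour the table describes.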

Full Knowledge Map

Every piece of knowledge the agent gathered, by source:

| Knowledge | 🔍 Code | 🧠 Memory |
| --- | --- | --- |
| Plugin file convention (`src/plugins/*/index.ts`) | | ✓ |
| Hooks are non-fatal/optional | | ✓ |
| API rename (`registerPlugin` → `pluginManager.register`) | | ✓ |
| Interface contract (4 hooks + signatures) | ✓ | |
| Registration boilerplate template | ✓ | |
| ⚠️ Use relative paths (absolute paths = past bug) | | ✓ |
| Store in existing `context_` collection | | ✓ |
| `upsertPreEmbeddedChunks()` API + batch pattern | ✓ | |
| `ensurePayloadIndex()` for filterable fields | ✓ | |
| `projectIdFromPath()` / `contextCollectionName()` | ✓ | |
| Worktree sharing via `SOCRATICODE_PROJECT_ID` | ✓ | ✓ |

Code search provided 7 pieces: implementation contracts, API signatures, working templates.
Git memory provided 5 pieces: conventions, constraints, past bugs, architectural intent.
1 piece surfaced from both, reinforcing each other.

Together: 12 distinct pieces of knowledge from 4 parallel query pairs, zero files opened.

giancarloerra · 2026-03-19T12:36:08Z

I've been thinking more about this, and I think there's an approach that could work well for both the plugin system and Git Memory. I like it more because it's a good compromise between SocratiCode's KISS philosophy and an expandable plugin system that doesn't affect or interact with any of the core features.

Plugins as artifact generators

What if the plugin contract was simply: "generate files in the context artifacts directory"? SocratiCode's existing pipeline handles embedding, indexing, and search. Plugins just produce the knowledge.

The interface could be minimal:

interface ArtifactPlugin {
  name: string;
  generateArtifacts(projectPath: string, artifactsDir: string): Promise<void>;
  cleanArtifacts(artifactsDir: string): Promise<void>;
}

A plugin gets the project path and a directory to write to. It does its thing. SocratiCode indexes whatever it finds there. No Qdrant access, no lifecycle hooks into the indexing pipeline, no core API surface to maintain.

This solves the concerns I have:

Security: plugins never see Qdrant or any core internals
Stability: a failing plugin can't affect the indexer — it just writes files (or doesn't)
API surface: "write files to a directory" is about as stable a contract as you can get
Simplicity: dead simple to implement, dead simple to write plugins for
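
A minimal sketch of how a host might drive plugins under this contract is shown below. The `ArtifactPlugin` interface is the one proposed above; `runArtifactPlugins`, the one-subdirectory-per-plugin layout, and the choice to leave directory creation to each plugin are assumptions made for illustration.

```typescript
import * as path from "node:path";

// Interface as proposed in this comment.
interface ArtifactPlugin {
  name: string;
  generateArtifacts(projectPath: string, artifactsDir: string): Promise<void>;
  cleanArtifacts(artifactsDir: string): Promise<void>;
}

// Hypothetical runner: gives each plugin its own subdirectory under
// `artifactsRoot` and isolates failures. The plugin is assumed to create the
// directory itself before writing into it.
async function runArtifactPlugins(
  plugins: ArtifactPlugin[],
  projectPath: string,
  artifactsRoot: string,
): Promise<string[]> {
  const failed: string[] = [];
  for (const plugin of plugins) {
    const dir = path.join(artifactsRoot, plugin.name);
    try {
      await plugin.generateArtifacts(projectPath, dir);
    } catch {
      // A failing plugin only means its files are missing or stale.
      failed.push(plugin.name);
    }
  }
  return failed;
}
```

Since the runner hands over nothing but two paths, a plugin has no handle on the vector store or the indexing pipeline, which is the sandboxing property listed above.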

How Git Memory fits

Git Memory becomes a perfect first plugin with two modes:

Lite (no LLM): runs git log, extracts commits, writes structured markdown artifacts: commit messages, authors, affected files, diffs. No AI processing, just organized git history made searchable.

Full (with LLM): same extraction step, then sends batches to a configured LLM for structuring: types, relationships, importance scores. Writes richer artifacts.

Both modes just produce markdown files. SocratiCode's hybrid search (embeddings + BM25) handles them naturally: "switched from JWT to sessions due to XSS" will surface when someone searches "authentication security" regardless of format. Structured markdown with good headings actually chunks well for embeddings.
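
As a sketch of what the Lite mode's output step might look like, the function below renders one commit as a small markdown artifact. The `CommitRecord` shape and the heading layout are assumptions; how the records are obtained from `git log` is deliberately left out.

```typescript
// Hypothetical record shape; fields mirror what the Lite mode is described as
// extracting (message, author, affected files). Diffs are omitted for brevity.
interface CommitRecord {
  hash: string;
  author: string;
  date: string; // ISO date string
  subject: string;
  body: string;
  files: string[];
}

// Renders a commit as markdown with headings, so it chunks cleanly for embedding.
function renderCommitArtifact(commit: CommitRecord): string {
  const lines: string[] = [
    `# ${commit.subject}`,
    "",
    `- Commit: ${commit.hash}`,
    `- Author: ${commit.author}`,
    `- Date: ${commit.date}`,
    "",
  ];
  const body = commit.body.trim();
  if (body) {
    lines.push(body, "");
  }
  if (commit.files.length > 0) {
    lines.push("## Affected files", "", ...commit.files.map((file) => `- ${file}`), "");
  }
  return lines.join("\n");
}
```

Keeping the commit subject as the top-level heading puts the most searchable text first, which should suit both BM25 and embedding-based retrieval.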

The LLM provider configuration stays entirely within the plugin, while SocratiCode core remains embeddings-only. Users who want the full mode configure the plugin separately.

What this enables

Because the contract is just "generate useful artifacts," other plugins become natural too: API docs from OpenAPI specs, dependency analysis, architecture decision records, CI/CD context. All just file generators, all sandboxed by design.

I know this trades power for safety compared to your original plugin system with lifecycle hooks and Qdrant access. But I think that limitation is actually a feature right now, fitting SocratiCode's original philosophy: doing one thing, well.

It's enough to validate the concept, and covers the git memory use case fully.

What do you think? If this direction works for you, maybe you could open another dedicated PR with this approach to the plugin system plus Git Memory Lite and Full? That way we'd have the plugin system and its first use cases: a simple one, and a more complex one needing more configuration (the LLM part should support OpenRouter and major providers such as OpenAI-compatible APIs, Ollama, etc.).

midweste · 2026-03-19T12:47:18Z

When I get some additional time, I'll ingest this a bit more fully and see what changes need to be made. Honestly, I don't think it's that much; however, the file format would likely benefit from a structure like JSON. The current system generates links between memories during the final step and adds "superseded by"-type tags to inform the consumer of a memory's "relevance".

Couple quick questions:

1. Isn't there already the .socraticodecontextartifacts.json mechanism? Doesn't this fit into that already? Or are memories set to become more than context, with this built out further?
2. Would these files be added to something like a .socraticode folder and generally ignored by git? We don't want memories of memories, of course.

Maybe the plugin implements a schema that can plug in? JSON Schema would be useful, as the core could validate input before it adds anything (or maybe that's the domain of the plugin?).

The upsert ends up looking like this:

{
  id, // UUID derived from SHA-256 of "git-memory:{contentHash}"
  vector, // embedding of prepareDocumentText(`[${memoryType}] ${summary}`, `git-memory:${filePath}`)
  bm25Text, // same text as above (for hybrid search)
  payload: {
    // ── Context artifact fields (shared with all SocratiCode context) ──
    artifactName: "git-memory", // constant; groups all git memories
    artifactDescription: "[decision] Git memory (importance: 85) from commits abc123, def456",
    filePath: "src/services/auth.ts", // primary file path
    relativePath: "git-memory", // constant
    content: "Decided to use JWT over sessions because...", // the memory summary
    startLine: 0, // not applicable
    endLine: 0, // not applicable
    language: "git-memory:decision", // "git-memory:{memoryType}"
    type: "git-memory", // constant; Qdrant filter key

    // ── Git-memory-specific fields ──
    contentHash: "a1b2c3d4e5f67890", // 16-char hex for dedup
    sourceCommits: ["abc123...", "def456..."], // full commit hashes
    filePaths: ["src/services/auth.ts", "src/config.ts"], // all related files
    tags: ["authentication", "jwt", "architecture"],
    importance: 85, // 0-100
    confidence: 70, // 0-100
    memoryType: "decision", // one of GIT_MEMORY_TYPES
    createdAt: "2025-06-15T10:30:00Z", // ISO date of earliest source commit
  },
}
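
The `id` and `contentHash` comments above suggest a deterministic scheme, which might be sketched as follows. Both function names are hypothetical, hashing the summary text for `contentHash` is an assumption, and the UUID here is only UUID-shaped: it formats the first 32 hex characters of the digest without setting RFC 4122 version or variant bits.

```typescript
import { createHash } from "node:crypto";

// 16-char hex digest used for dedup.
// Assumption: the hashed input is the memory's summary text.
function contentHashOf(summary: string): string {
  return createHash("sha256").update(summary).digest("hex").slice(0, 16);
}

// Deterministic, UUID-shaped point id from SHA-256("git-memory:{contentHash}").
// Assumption: plain 8-4-4-4-12 formatting of the first 32 hex characters.
function memoryPointId(contentHash: string): string {
  const hex = createHash("sha256").update(`git-memory:${contentHash}`).digest("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}
```

Because the id depends only on the content hash, re-processing the same memory yields the same point id, so an upsert overwrites rather than duplicates.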

Will respond more later. I haven't had my second cup of joe yet, so I may be completely off base here :P

giancarloerra · 2026-03-25T00:04:02Z

Sorry I forgot to answer your questions:

1. Yes, exactly: that's the whole point of my "plugins as artifact generators" suggestion. The plugin wouldn't interact with Qdrant or the indexing pipeline at all. It would just generate files (markdown, JSON, whatever format works best) into a directory, and the existing .socraticodecontextartifacts.json mechanism would handle the rest: embedding, indexing, hybrid search. The plugin system is essentially an automated way to produce and maintain context artifacts that would be tedious or impractical to create by hand. Git Memory Lite could generate structured markdown files from git history, and those files become artifacts: indexed and searchable like any other. So it fits into the existing artifacts system, just with automated generation instead of manual curation.
2. Yes, that's the right approach. Plugin-generated artifacts would go into a local directory (something like .socraticode/artifacts/git-memory/) that gets added to .gitignore automatically or by convention. They're local to each machine, derived from git history that's already in the repo, so there's no reason to track them. And as you said, we definitely don't want to index artifacts about artifacts. The .socraticodeignore could also exclude the plugin output directory from the main codebase index, so that only the context artifacts pipeline processes them.
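
If the .gitignore entry were added automatically, the update would need to be idempotent. Here is a minimal sketch, assuming a hypothetical helper that works on the file's text (reading and writing the file is left to the caller):

```typescript
// Returns .gitignore content that contains `entry` on its own line.
// The input is returned unchanged if the entry is already present.
// `ensureIgnored` is a hypothetical helper name, not an existing SocratiCode function.
function ensureIgnored(gitignore: string, entry: string): string {
  const lines = gitignore.split(/\r?\n/);
  if (lines.some((line) => line.trim() === entry)) {
    return gitignore;
  }
  // Make sure the new entry starts on a fresh line.
  const needsNewline = gitignore.length > 0 && !gitignore.endsWith("\n");
  return `${gitignore}${needsNewline ? "\n" : ""}${entry}\n`;
}
```

Calling it repeatedly with the same entry leaves the content unchanged after the first call, so it would be safe to run on every indexing pass.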

midwestE added 2 commits March 17, 2026 15:11

feat: plugin manager for future support for plugins

88a6a37

docs: update plugin readme

dddc6cf

midweste changed the title ~~Midweste pluginmanager~~ Plugin Manager Mar 17, 2026

midweste changed the title ~~Plugin Manager~~ feat: plugin manager Mar 17, 2026

giancarloerra self-assigned this Mar 18, 2026

midweste wants to merge 2 commits into giancarloerra:main from midweste:midweste-pluginmanager
