Conversation
| Thank you for this and the clean implementation midweste, it's well-structured with good error isolation and test coverage. Adding a plugin/extension system is a significant architectural decision that I'd like to think through carefully before committing to it. The project is still young and changing rapidly, and it feels a bit early. Some considerations:
SocratiCode already has context artifacts for extending project knowledge without code changes, which may partially overlap with this. I'm not ruling this out for the future; I do like the idea of external plugins. But I'd prefer to think about it alongside a few concrete plugins, so there are use cases for it and the API is validated by real usage. I definitely prefer plugins to bloated code that goes beyond the core design, but I'd like the plugin system to be very surface-level and to pose no security risk or other concern for the core functionality and indexes. I'll keep this open for now for further comments, including from other contributors. In the meantime, I'd like to first work more on the core product, smooth out existing bugs and implement core features. |
| Thanks for the thoughtful review and for keeping the door open. Totally understand wanting to be careful with architectural decisions this early. I want to share the context behind why I built the plugin system: there's a concrete feature driving it.
Git Memory Plugin
SocratiCode answers "what does this code do?" through semantic search and context artifacts. Git Memory fills a gap: "why was it written this way?" When a project is indexed, it reads unprocessed git commits (diffs, messages, and git-trailers), batches them, and sends them to a configurable LLM (OpenRouter, OpenAI, Google, Ollama) to extract structured memories: architectural decisions, bug fixes, refactors, patterns. These get embedded and stored in the same
For example, a search for "authentication middleware" wouldn't just find the code; it would also surface "Switched from JWT to sessions due to XSS vulnerability (commit abc123)." An AI assistant would know not to reintroduce a bug because it can find "This validation was missing, caused production outage, fixed in def456." It's fully opt-in (
I originally built it just as a local project MCP in Go, but I see that what I've built can complement a fully indexed and searchable codebase
On the Plugin Architecture
Git-memory is a first use case for plugin architecture. The interface is intentionally minimal (4 optional lifecycle hooks), and I think it's actually the right pattern for SocratiCode going forward. Features like this should be isolated modules that plug into the lifecycle, not code wired throughout the core. That keeps the core lean and each feature self-contained. That said, your concerns about
I'd love to get your take on the git-memory feature itself: if it makes sense, we can work out the right integration approach together. Happy to discuss. |
| The Git Memory idea is interesting, and thanks for addressing the security concern. A few thoughts:
Feature vs plugin: Git Memory feels more like a core feature behind a flag (like INCLUDE_DOT_FILES or context artifacts) than something that validates a plugin system (more below).
LLM dependency: SocratiCode today uses just embeddings. Simple, local, no API keys (by default). Git Memory would, I think, need a generative LLM to read diffs and extract structured memories, which means configuring providers, API keys, picking models. That's a different level of complexity, one that I'm trying to avoid. How would it work without that?
Existing artifacts: Could raw commit messages + diffs be embedded directly without LLM structuring? How could they be kept up to date? If that's possible, a simple script that updates a folder with all of that would mean artifacts could already cover it. I keep thinking that git memory belongs in the existing artifacts more than anywhere else.
Existing tools: most coding AI agents already have access to git: they can run git log, git blame, search commit history dynamically. So is it really a concern for SocratiCode?
On plugins generally: I really do like the idea of opening SocratiCode to community extensions, but I think a plugin should be a lighter touch. Something that enriches the index (or the use of it) but without introducing heavy dependencies or high complexity. I'd love to see the Git Memory implementation to understand the architecture better, even if we end up shipping it as a core feature (maybe as part of artifacts) rather than a plugin. Happy to keep discussing :-) |
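To make the "raw commits as artifacts, no LLM" question above a little more concrete, here is a minimal sketch of how one commit might be turned into a markdown artifact. The `CommitRecord` shape and the `renderCommitArtifact` name are assumptions made up for illustration; nothing here is taken from SocratiCode or from the Git Memory implementation, and reading the commits out of git is left out entirely.

```typescript
// Hypothetical shape for one commit; the field names are illustrative only.
interface CommitRecord {
  hash: string;
  author: string;
  date: string; // ISO 8601 timestamp
  subject: string;
  body: string;
  files: string[];
}

// Render one commit as a small markdown document with headings,
// so that a heading-aware chunker can split it on section boundaries.
function renderCommitArtifact(c: CommitRecord): string {
  const lines: string[] = [
    `# ${c.subject}`,
    "",
    `- Commit: ${c.hash}`,
    `- Author: ${c.author}`,
    `- Date: ${c.date}`,
    "",
  ];
  // Skip the message section when the commit has no body text.
  if (c.body.trim().length > 0) {
    lines.push("## Message", "", c.body.trim(), "");
  }
  if (c.files.length > 0) {
    lines.push("## Files", "", ...c.files.map((f) => `- ${f}`), "");
  }
  return lines.join("\n");
}
```

A script along these lines would still need some way to track which commits were already written, which is the "kept up to date" part of the question and is not addressed by this sketch.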
I understand that perspective too, I never want to be too presumptuous with other people's projects :)
So currently, I've built it to use OpenRouter, and it evaluates first:
The three steps it does via LLM are: Extraction:
Triage:
Synthesis (where some magic happens) and as currently setup needs a good size context window:
I expect so, but I do think the why is where some of this becomes more valuable, however without any LLM, "git-commit" could be a context artifact of its own and have its own links to search results. "Git memory lite"?
But whether they actually use it is the real question. In my experience, I have to keep telling the agent over and over to scan files, and tell it myself what to scan. MCPs seem to function as first-class tools that it will use. I talk to Opus a lot about why it doesn't use certain things, and it tells me that the more friction a task involves, the more likely it is to skip it and just do it the "old-fashioned" way with grep etc. Maybe this is a cost-cutting measure in how they train the models, or in the injected prompts.
I don't really have a preference one way or another for this. The reason I'm here is because I liked what you put together and thought my beta project was a natural fit. I did some dry-run AI testing with both systems as MCPs before I even considered porting it, and Opus reported a lot of complementary results. I had it mentally dry-run a feature addition to a codebase; it used both MCPs and told me what information it would use from each system and how that would influence the way it built the new feature. I do like modularity (even your existing indexer could, in theory, be a plugin), but if it works as part of the existing codebase that's fine too, as the hooks in the plugin system I submitted are really the only touch points needed.
What's the best way for me to do this? I can push to my fork after I test a couple of runs. My gap analysis of the port is nearly covered now. My main motivation for the git memory MCP was that I kinda realized that git is probably one of the best and most available memory systems that a project has at its disposal. Yes, memories can be superseded by later commits, but let's be honest: once it ends up in the repo, it's a meaningful thing to remember. |
| OK, got the basic flow working for memory additions. Here's a document I asked Opus to make that would show a theoretical feature. It chose "Test coverage plugin", which I'm guessing is one where test coverage information is added to the index. Not even sure if this makes any sense, but it does show what it thinks about what it's finding:
Dry Run: "Add a Test Coverage Plugin"
A walkthrough of how an AI agent researches a new feature using SocratiCode. Each step shows the combined results the agent receives, tagged by source.
Phase 1: How Do I Create a Plugin?
The agent runs two searches in parallel and receives these combined results:
Agent is ready to: scaffold
Phase 2: How Do I Store Data?
Agent is ready to: write the storage layer using the correct API, with relative paths, into the shared collection.
Phase 3: Collection & Project Identity
Agent is ready to: use the canonical naming functions instead of constructing collection names manually.
Full Knowledge Map
Every piece of knowledge the agent gathered, by source:
Code search provided 7 pieces: implementation contracts, API signatures, working templates. Together: 12 distinct pieces of knowledge from 4 parallel query pairs, zero files opened. |
| I've been thinking more about this, and I think there's an approach that could work well for both the plugin system and Git Memory. I like it more because it's a good compromise between SocratiCode's KISS philosophy and an expandable plugin system that doesn't affect or interact with any of the core features.
Plugins as artifact generators
What if the plugin contract was simply: "generate files in the context artifacts directory"? SocratiCode's existing pipeline handles embedding, indexing, and search. Plugins just produce the knowledge. The interface could be minimal:
A plugin gets the project path and a directory to write to. It does its thing. SocratiCode indexes whatever it finds there. No Qdrant access, no lifecycle hooks into the indexing pipeline, no core API surface to maintain. This solves the concerns I have:
Security: plugins never see Qdrant or any core internals
How Git Memory fits
Git Memory becomes a perfect first plugin with two modes:
Lite (no LLM): runs git log, extracts commits, writes structured markdown artifacts: commit messages, authors, affected files, diffs. No AI processing, just organized git history made searchable.
Full (with LLM): same extraction step, then sends batches to a configured LLM for structuring: types, relationships, importance scores. Writes richer artifacts.
Both modes just produce markdown files. SocratiCode's hybrid search (embeddings + BM25) handles them naturally: "switched from JWT to sessions due to XSS" will surface when someone searches "authentication security" regardless of format. Structured markdown with good headings actually chunks well for embeddings. The LLM provider configuration stays entirely within the plugin, while SocratiCode core remains embeddings-only. Users who want the full mode configure the plugin separately.
What this enables
Because the contract is just "generate useful artifacts," other plugins become natural too: API docs from OpenAPI specs, dependency analysis, architecture decision records, CI/CD context. All just file generators, all sandboxed by design. I know this trades power for safety compared to your original plugin system with lifecycle hooks and Qdrant access. But I think that limitation is actually a feature right now, fitting the original SocratiCode philosophy: doing one thing, well. It's enough to validate the concept, and it covers the git memory use case fully. What do you think? If this direction works for you, could you share this approach, maybe in another dedicated PR, covering the plugin system plus Git Memory Lite and Full?
That way there's the plugin system plus its first use cases: a simple one, and a more complex one needing more configuration (the LLM part should support OpenRouter and major providers such as OpenAI-compatible APIs, Ollama, etc.). |
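The "plugins as artifact generators" contract proposed above could look roughly like the sketch below. This is only an illustration under stated assumptions: `ArtifactPlugin`, `ArtifactPluginContext`, `runArtifactPlugins`, and the one-subdirectory-per-plugin layout are hypothetical names and choices, not SocratiCode's API. The only parts taken from the proposal are that a plugin receives a project path and a directory to write to, and that it produces files.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical contract: a plugin only sees a project path and an output directory.
interface ArtifactPluginContext {
  projectPath: string;
  outputDir: string;
}

interface ArtifactPlugin {
  name: string;
  generate(ctx: ArtifactPluginContext): void;
}

// Give each plugin its own subdirectory under the artifacts directory
// (an assumption of this sketch) and return the directories that were used.
function runArtifactPlugins(
  plugins: ArtifactPlugin[],
  projectPath: string,
  artifactsDir: string,
): string[] {
  const outputDirs: string[] = [];
  for (const plugin of plugins) {
    const outputDir = path.join(artifactsDir, plugin.name);
    fs.mkdirSync(outputDir, { recursive: true });
    plugin.generate({ projectPath, outputDir });
    outputDirs.push(outputDir);
  }
  return outputDirs;
}

// A trivial example plugin that writes a single markdown file.
const notesPlugin: ArtifactPlugin = {
  name: "example-notes",
  generate({ projectPath, outputDir }) {
    const text = `# Notes\n\nGenerated for ${path.basename(projectPath)}.\n`;
    fs.writeFileSync(path.join(outputDir, "notes.md"), text, "utf8");
  },
};

// Usage: write into a throwaway directory instead of a real artifacts folder.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "artifacts-"));
const demoOutputs = runArtifactPlugins([notesPlugin], "/repo/demo", demoDir);
```

In this shape the plugin never receives a handle to the index or the vector store, which is the sandboxing property the proposal is after; anything a Full-mode plugin needs, such as LLM provider settings, would live in the plugin's own configuration.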
| When I get some additional time, I'll ingest this a bit more fully and see what changes need to be made. Honestly, I don't think it's that much; however, the file format would likely benefit from a structure like JSON. The current system generates links between memories during the final step and adds "superseded by"-type tags to inform the consumer of a memory's "relevance". Couple quick questions:
Maybe the plugin implements a schema that can plug in? JSON Schema would be useful, as the main core could validate input before it adds anything (or maybe that's the domain of the plugin?). The upsert ends up looking like this:
    {
      id, // UUID derived from SHA-256 of "git-memory:{contentHash}"
      vector, // embedding of prepareDocumentText(`[${memoryType}] ${summary}`, `git-memory:${filePath}`)
      bm25Text, // same text as above (for hybrid search)
      payload: {
        // ── Context artifact fields (shared with all SocratiCode context) ──
        artifactName: "git-memory", // constant: groups all git memories
        artifactDescription: "[decision] Git memory (importance: 85) from commits abc123, def456",
        filePath: "src/services/auth.ts", // primary file path
        relativePath: "git-memory", // constant
        content: "Decided to use JWT over sessions because...", // the memory summary
        startLine: 0, // not applicable
        endLine: 0, // not applicable
        language: "git-memory:decision", // "git-memory:{memoryType}"
        type: "git-memory", // constant: Qdrant filter key
        // ── Git-memory-specific fields ──
        contentHash: "a1b2c3d4e5f67890", // 16-char hex for dedup
        sourceCommits: ["abc123...", "def456..."], // full commit hashes
        filePaths: ["src/services/auth.ts", "src/config.ts"], // all related files
        tags: ["authentication", "jwt", "architecture"],
        importance: 85, // 0-100
        confidence: 70, // 0-100
        memoryType: "decision", // one of GIT_MEMORY_TYPES
        createdAt: "2025-06-15T10:30:00Z", // ISO date of earliest source commit
      },
    }
Will respond more later, haven't had my second cup of joe yet, so I may be completely off base here :P |
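The `id` and `contentHash` comments in the payload above describe a deterministic ID scheme, which is what makes repeated upserts of the same memory idempotent. A minimal sketch of one way to do that follows. It is an assumption, not the actual Git Memory code: the input to the content hash (here, the summary text) and the exact UUID formatting are guesses, and the function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Assumption: the 16-char content hash is the start of a SHA-256 over the summary text.
// The real implementation may hash different fields.
function contentHashOf(summary: string): string {
  return createHash("sha256").update(summary, "utf8").digest("hex").slice(0, 16);
}

// Format the first 32 hex chars of SHA-256("git-memory:{contentHash}") as 8-4-4-4-12.
// Assumption: no UUID version/variant bits are set here; the real code may set them.
function memoryPointId(contentHash: string): string {
  const hex = createHash("sha256")
    .update(`git-memory:${contentHash}`, "utf8")
    .digest("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}
```

Because the ID depends only on the content hash, re-processing the same commits yields the same point ID, so an upsert overwrites the existing entry rather than creating a duplicate.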
| Sorry I forgot to answer your questions:
|
Summary
Adds a PluginManager class that enables SocratiCode to be extended via self-contained plugins without modifying core code. Plugins are auto-discovered from src/plugins/*/index.ts at startup and receive lifecycle hooks. All plugin errors are non-fatal — a failing plugin never affects the indexer.
SocratiCode gives AI agents context about what code does and how it's structured. Plugins extend that context with knowledge that can't be extracted from source files alone — things like why code was written a certain way, which parts of the codebase are most volatile, or what implicit dependencies exist between components. This additional context is stored alongside the existing index and surfaced automatically during search, giving AI agents a deeper understanding of the project without any changes to the core.
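As an illustration of the "all plugin errors are non-fatal" behavior described in this summary, the sketch below shows one way a manager can dispatch a hook in registration order while isolating failures. It is not the PR's `PluginManager`: the class name, the `onIndexed` and `onShutdown` hook names, and the `errors` list are hypothetical, and dispatch is synchronous here purely for brevity.

```typescript
// Hypothetical hook names; the PR only states that hooks are optional lifecycle callbacks.
interface Plugin {
  name: string;
  onIndexed?(projectPath: string): void;
  onShutdown?(): void;
}

class SketchPluginManager {
  private plugins: Plugin[] = [];
  // Collected error messages, kept so failures are visible without being fatal.
  readonly errors: string[] = [];

  register(plugin: Plugin): void {
    this.plugins.push(plugin);
  }

  // Call the hook on every plugin in registration order.
  // A throwing plugin is recorded and skipped; later plugins still run.
  notifyIndexed(projectPath: string): void {
    for (const p of this.plugins) {
      try {
        p.onIndexed?.(projectPath);
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        this.errors.push(`${p.name}: ${message}`);
      }
    }
  }
}
```

The try/catch sits inside the loop rather than around it, so one failing plugin cannot prevent the remaining plugins, or the caller, from continuing.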
Changes
Type of change
Testing
12 tests covering: plugin registration, hook dispatch order, non-fatal error isolation, onProgress forwarding, shutdown resilience, and test reset. No plugins directory = gracefully skipped.
Checklist
Related issues
None