Skip to content

100% Rust GraphRAG implementation with a MCP server for indexing large codebases with blazingly fast speed and querying the graph with natural language

Notifications You must be signed in to change notification settings

Jakedismo/codegraph-rust

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CodeGraph

CodeGraph turns your local codebase into a searchable knowledge graph so any Model Context Protocol (MCP) client can answer questions with project-aware context. The CLI indexes source code, pushes semantic embeddings into FAISS, and exposes a fast MCP server that tools such as LM Studio or Claude Desktop can call.

This README focuses on the streamlined developer workflow we ship today. For a deeper dive into the agent architecture or legacy experiments, see the material moved under docs/legacy/.


Quick Start

  1. Install prerequisites

    # macOS / Linux prerequisites curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh brew install faiss # or your distro's libfaiss-dev package
  2. Install CodeGraph

    cd codegraph-rust cargo install --path crates/codegraph-mcp --features "embeddings,faiss"
  3. Prepare LM Studio (recommended runtime)

    • Download LM Studio from lmstudio.ai
    • Download the models:
      • Embeddings: jinaai/jina-code-embeddings-1.5b
      • LLM (optional): lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M
    • Start the LM Studio local server on http://localhost:1234
  4. Configure CodeGraph Create .env in your repository (or export the variables):

    CODEGRAPH_EMBEDDING_PROVIDER=lmstudio CODEGRAPH_EMBEDDING_MODEL=jinaai/jina-code-embeddings-1.5b CODEGRAPH_LMSTUDIO_URL=http://localhost:1234 CODEGRAPH_EMBEDDING_DIMENSION=1536 RUST_LOG=warn
  5. Index and serve

    codegraph index . # builds .codegraph/ with embeddings + graph codegraph start stdio # exposes MCP interface for LM Studio / Claude

That is the entire end-to-end workflow. LM Studio now has a project-specific vector store it can query through MCP.


Everyday CLI

Command Description
codegraph index <path> Incrementally index or reindex a project.
codegraph start stdio Launch the MCP server on STDIO (works with LM Studio, Claude Desktop, etc.).
codegraph start http Optional HTTP transport for custom tooling.
codegraph tools list Inspect available MCP tools exposed by the server.

Tips:

  • Run codegraph index from each repository root you want to serve.
  • Keep RUST_LOG=warn in your environment for a clean TUI while indexing.
  • To script updates, rerun codegraph index after significant code changes; the cache handles incremental updates.

Speed Metrics (Apple Silicon, LM Studio Backend)

Scenario Embedding Throughput Notes
LM Studio + Jina 1.5B embeddings 120 embeddings/sec Uses MLX + Flash Attention 2.
Ollama + nomic-embed-code ~60 embeddings/sec Roughly half the speed of LM Studio.
FAISS vector search 2–5 ms latency With index caching enabled.

Optimization Cheatsheet

Optimization Typical Speedup Extra Memory Enabled by Default
FAISS index cache 10–50× faster warm searches 300–600 MB
Embedding generator cache 10–100× faster first query ~90 MB
Query result cache 100× on repeated prompts ~10 MB
Parallel shard search 2–3× negligible

These numbers come from the same benchmarking harness described in docs/performance/. They are preserved here so you can quickly sanity-check your own runs.


LM Studio Reference

The full LM Studio checklist (model recommendations, port configuration, batch-size tuning, and troubleshooting) lives in LMSTUDIO_SETUP.md. It expands on:

  • Switching between embedding and completion models within LM Studio
  • Running multiple projects side by side
  • Recommended .env / .codegraph.toml settings

For Ollama or ONNX-only setups, see docs/INSTALLATION.md and the legacy guides under docs/legacy/.


Advanced Features

  • Graph analytics: The indexer produces ownership graphs, dependency edges, and metadata that RAG tools consume automatically.
  • MCP tools: vector.search, graph.traverse, code.read, and more are exposed through the MCP server.
  • FAISS tuning: Set CODEGRAPH_FAISS_TRAINING_THRESHOLD or provide a pre-built index in .codegraph/faiss.index for massive monorepos.

Explore the docs/ tree for deeper architecture notes, RAG pipeline details, deployment manifests, and operator playbooks.


Contributing

We welcome pull requests! Running the full test suite:

cargo fmt --all cargo clippy --workspace --all-targets cargo test --workspace

Please open an issue if you want to discuss large changes or integrations.


License

CodeGraph is dual-licensed under MIT and Apache 2.0. See LICENSE-MIT and LICENSE-APACHE for details.

About

100% Rust GraphRAG implementation with a MCP server for indexing large codebases with blazingly fast speed and querying the graph with natural language

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published