BookSmart is a local-first book analysis pipeline for long books that do not fit into a single model context window. It uses hierarchical summarization with LangChain map-reduce chains, a local llama.cpp-compatible chat endpoint, a local embedding endpoint, Chroma for retrieval, and Chainlit for the UI.
BookSmart provides:

- Whole-book summaries.
- Chapter summaries.
- Question answering over a processed book.
- Question answering scoped by chapter.
- Nearby chunk summaries for a query.
- A larger local section view assembled around the most relevant chunk.
The processing pipeline:

- Split a plain-text book into chapters when chapter headings are present.
- Split large chapters into context-safe chunks.
- Run LangChain map-reduce summarization over the chunks.
- Persist chunk summaries, chapter summaries, and a whole-book summary.
- Build a local Chroma index from the source chunks.
- Answer questions using global summary + chapter summaries + retrieved chunks + a larger local section.
- Re-ingesting or uploading a book rebuilds its artifacts so stale outputs are not reused.
- Continuing with an existing processed book reuses the persisted artifacts already on disk.
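The first two pipeline steps can be sketched roughly as follows. The heading pattern and the character-window chunker here are illustrative assumptions, not BookSmart's actual implementation:

```python
import re

# Assumed heading pattern; the real pipeline's chapter detection may differ.
CHAPTER_RE = re.compile(r"^(chapter|part)\s+\S+", re.IGNORECASE | re.MULTILINE)

def split_chapters(text: str) -> list[str]:
    """Split on chapter headings; fall back to the whole book if none are found."""
    starts = [m.start() for m in CHAPTER_RE.finditer(text)]
    if not starts:
        return [text]
    bounds = starts + [len(text)]
    return [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]

def chunk(text: str, size: int, overlap: int) -> list[str]:
    """Cut a chapter into context-safe, overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk then becomes one "map" document for the map-reduce summarization chain, so `size` should track the chat model's context window.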
Copy `booksmart.example.toml` to `booksmart.toml`, or set environment variables directly.
Default endpoints:
- Chat LLM: http://localhost:8001/v1
- Embeddings: http://localhost:8002/v1
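Both endpoints speak the OpenAI wire format, so the requests BookSmart issues can be sketched as plain JSON payloads. The builder functions below are illustrative, not BookSmart code; the model name and temperature are placeholder assumptions:

```python
import json

CHAT_BASE = "http://localhost:8001/v1"   # default llama.cpp chat endpoint
EMBED_BASE = "http://localhost:8002/v1"  # default llama.cpp embedding endpoint

def chat_request(model: str, prompt: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style chat completion call."""
    url = f"{CHAT_BASE}/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # illustrative value
    }).encode()
    return url, body

def embed_request(model: str, texts: list[str]) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style embeddings call."""
    url = f"{EMBED_BASE}/embeddings"
    body = json.dumps({"model": model, "input": texts}).encode()
    return url, body
```

Any HTTP client (or the `openai` SDK pointed at these base URLs) can send the resulting requests.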
Config values:
`data_dir`, `llm_base_url`, `embedding_base_url`, `api_key`, `model`, `embedding_model`, `temperature`, `chunk_size_chars`, `chunk_overlap_chars`, `embedding_chunk_size_chars`, `embedding_chunk_overlap_chars`, `chat_llm_base_url`, `chat_model`, `retrieval_k`, `section_char_budget`, `reduce_max_tokens`
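A `booksmart.toml` covering these keys might look like the following. Apart from the two documented default endpoint URLs, every value here is an illustrative example, not a project default:

```toml
# Illustrative values only; tune for your models and hardware.
data_dir = "data"
llm_base_url = "http://localhost:8001/v1"
embedding_base_url = "http://localhost:8002/v1"
api_key = "not-needed-for-local"
model = "local-chat-model"
embedding_model = "local-embedding-model"
temperature = 0.2
chunk_size_chars = 8000
chunk_overlap_chars = 400
embedding_chunk_size_chars = 1500
embedding_chunk_overlap_chars = 150
chat_llm_base_url = "http://localhost:8001/v1"
chat_model = "local-chat-model"
retrieval_k = 6
section_char_budget = 12000
reduce_max_tokens = 1024
```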
Install dependencies: `uv sync`

Ingest a book from a text file: `uv run booksmart ingest /path/to/book.txt`

List processed books: `uv run booksmart list`

Show a whole-book summary: `uv run booksmart summary <book-slug>`

Ask a question: `uv run booksmart ask <book-slug> "What drives the protagonist's decision near the end?"`

Run the UI locally: `uv run chainlit run chainlit_app.py`

Useful commands inside the chat:
- `/books`
- `/use <book-slug>`
- `/ingest <path-to-book.txt>`
- `/upload`
- `/summary`
- `/chapter <number-or-title>`
- `/nearby <question>`
- `/section <question>`
Each processed book is stored under data/books/<slug>/ with:
- `source.txt`
- `manifest.json`
- `chapters.json`
- `chunks.json`
- `map_summaries.json`
- `chapter_summaries.json`
- `global_summary.md`
- `chroma/`
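The layout can be navigated with a small helper like the one below. The slug rule shown is an assumption for illustration; BookSmart's actual slugging may differ:

```python
import re
from pathlib import Path

DATA_DIR = Path("data")

def slugify(title: str) -> str:
    """Lowercase, replace non-alphanumeric runs with hyphens (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def artifact_paths(title: str) -> dict[str, Path]:
    """Map each documented artifact to its expected path under data/books/<slug>/."""
    root = DATA_DIR / "books" / slugify(title)
    names = ["source.txt", "manifest.json", "chapters.json", "chunks.json",
             "map_summaries.json", "chapter_summaries.json",
             "global_summary.md", "chroma"]
    return {n: root / n for n in names}
```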
- Input support is intentionally focused on plain text for v1.
- The implementation assumes OpenAI-compatible llama.cpp endpoints.
- The section context budget is character-based, which is a pragmatic approximation for local models.
- Summarization and embeddings now use separate chunk budgets. Keep `chunk_size_chars` large enough for efficient map-reduce summarization, and keep `embedding_chunk_size_chars` small enough for your embedding endpoint's token limit.
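Because the budgets are character-based while model limits are token-based, a rough chars-per-token heuristic helps pick safe values. The ~4 chars/token figure below is a common rule of thumb for English text, not a measured constant:

```python
CHARS_PER_TOKEN = 4  # rough heuristic for English prose; an assumption

def approx_tokens(n_chars: int) -> int:
    """Estimate the token count of a chunk from its character count."""
    return -(-n_chars // CHARS_PER_TOKEN)  # ceiling division

def max_safe_chunk_chars(token_limit: int, margin: float = 0.8) -> int:
    """Largest embedding_chunk_size_chars likely to fit a token limit,
    keeping 20% headroom by default for tokenizer variance."""
    return int(token_limit * margin * CHARS_PER_TOKEN)
```

For example, an embedding model with a 512-token limit suggests an `embedding_chunk_size_chars` of roughly 1600 or less.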