Fluid Server Documentation

This directory contains comprehensive documentation for the Fluid Server project, including compilation guides, feature documentation, and troubleshooting information.

Documents

Compilation Guide

Complete guide for building PyInstaller executables, including:

Critical import fixes for PyInstaller compatibility
Build performance optimization
Common compilation issues and solutions
Build environment setup requirements

GGUF Model Support

Complete guide for using any GGUF model from the Hugging Face Hub:

Flexible model format support (repo, repo/file, legacy names)
Popular model recommendations and quantization guidance
Automatic download and caching system
Performance optimization for different hardware configurations
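
To make the three model formats concrete, here is a minimal sketch of how a `--llm-model` value could be classified. The `ModelSpec` type and `parse_model_spec` function are hypothetical names introduced for illustration, and the splitting rule (first two path segments form the repository ID, the rest is the file path) is an assumption; the server's actual parsing may differ.

```python
from typing import NamedTuple, Optional


class ModelSpec(NamedTuple):
    """Hypothetical container for a parsed --llm-model value."""
    kind: str                          # "repo", "repo/file", or "legacy"
    repo_id: Optional[str] = None      # e.g. "unsloth/gemma-3-4b-it-GGUF"
    filename: Optional[str] = None     # e.g. "gemma-3-4b-it-Q4_K_M.gguf"
    legacy_name: Optional[str] = None  # bare name without any slash


def parse_model_spec(spec: str) -> ModelSpec:
    """Classify a model string as repo, repo/file, or legacy name.

    Assumption: a repository ID is always "owner/name", so anything
    after the second slash is treated as a file path inside the repo.
    """
    parts = spec.strip().split("/")
    if any(part == "" for part in parts):
        raise ValueError(f"malformed model spec: {spec!r}")
    if len(parts) == 1:
        return ModelSpec("legacy", legacy_name=parts[0])
    repo_id = "/".join(parts[:2])
    if len(parts) == 2:
        return ModelSpec("repo", repo_id=repo_id)
    return ModelSpec("repo/file", repo_id=repo_id, filename="/".join(parts[2:]))
```

With this rule, the value used in the Quick Reference splits into the repository `unsloth/gemma-3-4b-it-GGUF` and the file `gemma-3-4b-it-Q4_K_M.gguf`.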

Quick Reference

Build the Executable

.\scripts\build.ps1

Use Any GGUF Model

./fluid-server.exe --llm-model "unsloth/gemma-3-4b-it-GGUF/gemma-3-4b-it-Q4_K_M.gguf"

Test Streaming

curl -X POST http://localhost:3847/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "current", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'
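
The same streaming request can be made from Python with only the standard library. This is a sketch under one assumption: that the server streams OpenAI-style server-sent events, meaning `data: {...}` lines whose chunks carry text in `choices[].delta.content`, terminated by `data: [DONE]`. The helper names `iter_stream_text` and `stream_chat` are introduced here for illustration.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_chat(prompt: str, base_url: str = "http://localhost:3847") -> Iterator[str]:
    """POST a streaming chat request and yield text as it arrives."""
    body = json.dumps({
        "model": "current",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The HTTP response object is iterable line by line as bytes.
    with urllib.request.urlopen(request) as response:
        yield from iter_stream_text(response)
```

With the server running, `for piece in stream_chat("Hello"): print(piece, end="", flush=True)` prints the reply incrementally. Keeping the parsing in `iter_stream_text` separate from the network call lets it be checked against recorded lines without a live server.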

Architecture Overview

Fluid Server
├── FastAPI Application (app.py)
├── Runtime Manager (managers/)
│   ├── Model loading/unloading
│   └── Memory management
├── Model Runtimes (runtimes/)
│   ├── LlamaCpp (GGUF models)
│   └── OpenVINO (optimized models)
├── API Endpoints (api/)
│   ├── Chat completions (/v1/chat/completions)
│   ├── Model management (/v1/models)
│   └── Health checks (/health)
└── Utilities (utils/)
    ├── Model discovery
    ├── Model downloading
    └── Platform detection
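
As a small illustration of the API layer, the sketch below polls the `/health` endpoint until the server responds, which can be useful in scripts that launch the executable and then send requests. It assumes that an HTTP 200 status means the server is ready; the response body is not inspected because its shape is not documented here. The function names `health_ok` and `wait_until_healthy` are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def health_ok(base_url: str = "http://localhost:3847", timeout: float = 2.0) -> bool:
    """Return True if GET /health answers with HTTP 200 (assumed to mean ready)."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as not ready.
        return False


def wait_until_healthy(
    probe: Callable[[], bool] = health_ok,
    attempts: int = 30,
    delay: float = 1.0,
) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

The probe is passed in as a callable so the retry loop can be exercised with a stub instead of a live server.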
