The server logic is organized into modular services found in app/services/. This design pattern separates business logic from API routes, making the code testable and reusable.
Key Services
Chat Service
File: app/services/chat_service.py
This is the heart of the conversational capability. It handles the following responsibilities, sketched in code after the list:
- Session Management: Creating and retrieving chat sessions.
- Prompt Engineering: Constructing the system prompt with context.
- Token Management: Ensuring prompts stay within model limits.
- Provider Resolution: Determining which LLM provider and model to use for a specific workspace.
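A minimal sketch of how these responsibilities might fit together. The class shape, method names, and the four-characters-per-token heuristic are illustrative assumptions, not the actual contents of chat_service.py:

```python
from dataclasses import dataclass, field


@dataclass
class ChatSession:
    session_id: str
    messages: list[dict] = field(default_factory=list)


class ChatService:
    def __init__(self, max_prompt_tokens: int = 4096):
        self.sessions: dict[str, ChatSession] = {}   # session management
        self.max_prompt_tokens = max_prompt_tokens   # token budget

    def get_or_create_session(self, session_id: str) -> ChatSession:
        return self.sessions.setdefault(session_id, ChatSession(session_id))

    def build_prompt(self, session: ChatSession, context: str, user_msg: str) -> list[dict]:
        # Prompt engineering: the system prompt carries the retrieved context.
        prompt = [{"role": "system", "content": f"Answer using this context:\n{context}"}]
        prompt += session.messages + [{"role": "user", "content": user_msg}]
        # Token management: drop the oldest history turns until under budget.
        while self._estimate_tokens(prompt) > self.max_prompt_tokens and len(prompt) > 2:
            prompt.pop(1)
        return prompt

    @staticmethod
    def _estimate_tokens(prompt: list[dict]) -> int:
        # Crude heuristic: roughly four characters per token.
        return sum(len(m["content"]) for m in prompt) // 4
```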
Embedding Service
File: app/services/embedding_service.py
Responsible for everything related to RAG (Retrieval-Augmented Generation); each stage is sketched in code after the list:
- Chunking: Splitting large documents into manageable text chunks.
- Embedding: Calling embedding models (e.g., OpenAI text-embedding-3-small) to vectorize text.
- Indexing: Storing vectors in the configured Vector DB.
- Retrieval: Performing cosine similarity searches to find relevant context.
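Each stage reduces to a small amount of code. The sketch below uses a plain in-memory list as the "index" and assumes the embedding vectors are produced elsewhere; none of these function names come from embedding_service.py:

```python
import math


def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    # Chunking: fixed-size windows with overlap, so sentences that straddle
    # a boundary still appear intact in at least one chunk.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / denom if denom else 0.0


def retrieve(query_vec: list[float],
             index: list[tuple[str, list[float]]],
             k: int = 3) -> list[tuple[str, list[float]]]:
    # Retrieval: rank stored (chunk, vector) pairs by cosine similarity.
    return sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)[:k]
```

In practice the ranking is delegated to the configured vector DB rather than computed in Python, but the similarity logic is the same.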
Agent Service
File: app/services/agent_service.py
Manages autonomous behaviors and tool use. It allows the LLM to “act” rather than just “speak” by executing defined tools (like web search or file operations).
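At its core, tool use is a dispatch table plus a loop that feeds results back to the model. A stripped-down sketch, with hypothetical tool names:

```python
from pathlib import Path
from typing import Callable

# Whitelist of callable tools the model is allowed to invoke.
TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": lambda query: f"results for {query!r}",   # stand-in stub
    "read_file": lambda path: Path(path).read_text(),
}


def execute_tool_call(name: str, argument: str) -> str:
    # The LLM emits a tool name plus an argument; the agent validates the
    # name against the whitelist, runs the tool, and returns the output so
    # it can be appended to the conversation as an observation.
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name](argument)
```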
Pluggable Layers
The server is designed to be agnostic to specific vendors for key components.
LLM Providers
Location: app/services/llm/
The server uses an adapter pattern to support multiple LLM providers; a condensed sketch of the pattern follows the list.
- Factory: app/services/llm/factory.py instantiates the correct provider based on configuration.
- Base Class: All providers inherit from a common base class, ensuring a consistent interface.
- Supported: OpenAI, Anthropic, Ollama, Google Gemini, Groq, Azure.
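The pattern condenses to an abstract base class plus a registry keyed by provider name. The classes below are illustrative stubs, not the actual contents of app/services/llm/:

```python
from abc import ABC, abstractmethod


class BaseLLMProvider(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Every provider exposes the same completion interface."""


class OpenAIProvider(BaseLLMProvider):
    def complete(self, prompt: str) -> str:
        return "..."  # would call the OpenAI API here


class OllamaProvider(BaseLLMProvider):
    def complete(self, prompt: str) -> str:
        return "..."  # would call a local Ollama server here


_PROVIDERS: dict[str, type[BaseLLMProvider]] = {
    "openai": OpenAIProvider,
    "ollama": OllamaProvider,
}


def create_provider(name: str) -> BaseLLMProvider:
    # Factory: map the configured provider name to a concrete adapter.
    try:
        return _PROVIDERS[name]()
    except KeyError:
        raise ValueError(f"unsupported LLM provider: {name}") from None
```

Because every call site depends only on the base class, adding a new vendor means adding one adapter class and one registry entry.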
Vector Databases
Location: app/services/vector_db/
As with LLM providers, vector database support is modular (a short factory sketch follows the list).
- Factory: app/services/vector_db/factory.py.
- Supported:
  - LanceDB: Embedded, serverless vector DB (default).
  - ChromaDB: Open-source embedding database.
  - Pinecone: Managed cloud vector database.
  - Qdrant: High-performance vector search engine.
  - Weaviate: AI-native vector database.
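The vector DB factory presumably follows the same registry idiom as the LLM factory; the stub below is an assumption modeled on that pattern, not the real module:

```python
class LanceDBStore:
    """Would wrap an embedded LanceDB table (the default backend)."""


class ChromaStore:
    """Would wrap a ChromaDB collection."""


_BACKENDS = {"lancedb": LanceDBStore, "chroma": ChromaStore}


def create_vector_db(name: str = "lancedb"):
    # Resolve the configured backend, defaulting to embedded LanceDB.
    return _BACKENDS[name]()
```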
Authentication System
Location: app/core/security.py
Authentication is flexible and controlled by the MULTI_USER_MODE setting; a sketch of the multi-user path follows the list.
- Single User: Validates a simple static AUTH_TOKEN. Ideal for personal use.
- Multi User: Full JWT implementation.
  - Login: /api/v1/auth/login returns an access token.
  - Protection: Routes are protected by the get_current_user dependency.
  - Hashing: Passwords are hashed using bcrypt.
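A sketch of the multi-user path as a FastAPI dependency, assuming python-jose for JWTs and passlib for bcrypt; the project's actual secret handling, claim names, and libraries may differ:

```python
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt
from passlib.context import CryptContext

SECRET_KEY = "change-me"   # assumed: loaded from settings in practice
ALGORITHM = "HS256"

oauth2_scheme = OAuth2PasswordBearer(tokenUrl="/api/v1/auth/login")
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")


def verify_password(plain: str, hashed: str) -> bool:
    # Hashing: bcrypt comparison via passlib.
    return pwd_context.verify(plain, hashed)


def get_current_user(token: str = Depends(oauth2_scheme)) -> str:
    # Protection: decode the bearer token issued by /api/v1/auth/login and
    # reject anything that fails signature or expiry checks.
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        return payload["sub"]
    except (JWTError, KeyError):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid authentication token",
        ) from None
```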