Overview

The Olis API Server is a FastAPI-based backend service that provides the core intelligence behind Olis. It includes a sophisticated RAG (Retrieval-Augmented Generation) pipeline for document understanding and intelligent responses.

Technology Stack

FastAPI

Modern, fast Python web framework

Uvicorn

Lightning-fast ASGI server

RAG Pipeline

Document retrieval and generation system

Vector Database

Semantic search capabilities

Project Structure

apps/api-server/
├── src/
│   └── backend/
│       ├── routers/              # API route handlers
│       ├── utils/
│       │   └── retrievers/       # RAG components
│       │       └── document_db/
│       │           ├── reranker.py
│       │           └── retriever.py
│       ├── models/               # Pydantic models
│       └── middleware/           # Custom middleware
├── tests/
│   └── integration_tests/        # Integration tests
├── scripts/
│   ├── preflight_rag.sh          # Unix RAG preflight
│   └── preflight_rag.ps1         # Windows RAG preflight
├── requirements.txt              # Python dependencies
└── main.py                       # Application entry point

Development Setup

Prerequisites

  • Python 3.9+
  • pip or poetry
  • Redis (optional, for caching)
  • Vector database (optional, for RAG)

Installation

1. Navigate to the API server directory

cd apps/api-server

2. Create a virtual environment (recommended)

# Create virtual environment
python -m venv venv

# Activate it
# Windows
.\venv\Scripts\activate

# macOS/Linux
source venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Set up environment variables

Create a .env file:

# API Configuration
API_HOST=0.0.0.0
API_PORT=8000
DEBUG=True

# Database
DATABASE_URL=postgresql://user:pass@localhost/olis

# Redis Cache
REDIS_URL=redis://localhost:6379

# RAG Configuration
VECTOR_DB_URL=http://localhost:6333
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

5. Run the server

uvicorn main:app --reload --host 0.0.0.0 --port 8000
The API will be available at http://localhost:8000.

API Endpoints

Core Endpoints

GET /health

Health check endpoint. Response:
{
  "status": "healthy",
  "timestamp": "2024-01-01T00:00:00Z",
  "version": "0.1.0"
}
POST /api/chat

Send a chat message and receive an AI response. Request:
{
  "message": "What is Olis?",
  "context": [],
  "sessionId": "uuid-string"
}
Response:
{
  "response": "Olis is an AI assistant...",
  "sources": ["doc1.pdf", "doc2.md"],
  "confidence": 0.95
}
Search documents semantically. Request:
{
  "query": "machine learning basics",
  "limit": 10,
  "filters": {
    "type": "pdf"
  }
}
Response:
{
  "results": [
    {
      "id": "doc1",
      "content": "...",
      "score": 0.92,
      "metadata": {...}
    }
  ],
  "total": 42
}
Ingest documents into the RAG system. Request: multipart form data with file uploads. Response:
{
  "success": true,
  "documentsProcessed": 5,
  "jobId": "job-uuid"
}
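The request and response payloads above map naturally onto Pydantic models. The sketch below assumes the field names shown in the JSON; the model names (ChatRequest, ChatResponse) are illustrative, not necessarily the project's actual classes.

```python
# Illustrative Pydantic models mirroring the chat payloads shown above.
from pydantic import BaseModel, Field

class ChatRequest(BaseModel):
    message: str
    context: list[str] = Field(default_factory=list)
    sessionId: str

class ChatResponse(BaseModel):
    response: str
    sources: list[str]
    confidence: float
```

When used as endpoint parameters and return types, FastAPI validates incoming JSON against these models automatically and documents them in the generated OpenAPI schema.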

Interactive API Documentation

FastAPI automatically generates interactive API documentation:
  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

RAG Pipeline

Architecture

Components

Process:
  1. Document upload via API
  2. Text extraction (PDF, DOCX, etc.)
  3. Chunking into manageable pieces
  4. Embedding generation
  5. Storage in vector database
Supported Formats:
  • PDF
  • DOCX
  • TXT
  • MD (Markdown)
  • JSON
  • CSV
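Step 3 of the process above (chunking) can be sketched as a sliding character window driven by the CHUNK_SIZE and CHUNK_OVERLAP settings listed under Configuration. This is a simplified stand-in; the real pipeline may split on token or sentence boundaries instead.

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows (illustrative sketch).

    Each chunk starts `chunk_size - overlap` characters after the
    previous one, so adjacent chunks share `overlap` characters of
    context.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```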

RAG Preflight Testing

Before deploying, run the RAG preflight check to catch issues early:
cd apps/api-server
./scripts/preflight_rag.sh

Environment Overrides

# Set custom timeout (default: 300 seconds)
PREFLIGHT_TIMEOUT_SECONDS=600 ./scripts/preflight_rag.sh

# Keep containers running for debugging
KEEP_PREFLIGHT_RUNNING=1 ./scripts/preflight_rag.sh

What It Tests

  • All Python modules import successfully
  • No missing dependencies
  • Correct Python version
  • API server starts without errors
  • Database connections work
  • Redis cache is accessible
  • Vector database is reachable
  • Document ingestion pipeline
  • Query retrieval
  • Reranking functionality
  • End-to-end RAG flow
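The import checks in that list can be approximated with a few lines of Python. The preflight scripts themselves do considerably more (containers, end-to-end flow); the module names here are placeholders.

```python
# Minimal sketch of the "modules import successfully" check.
import importlib

def check_imports(module_names: list[str]) -> list[str]:
    """Return the subset of module_names that fail to import."""
    failures = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            failures.append(name)
    return failures
```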

Docker Deployment

Local Development

Use Docker Compose for local development:
# docker-compose.local.yml
version: '3.8'

services:
  api:
    build: ./apps/api-server
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
      - vector-db

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  vector-db:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
Start services:
docker-compose -f docker-compose.local.yml up

Production Build

# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

# Run with uvicorn
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Build and run:
docker build -t olis-api:latest .
docker run -p 8000:8000 olis-api:latest

Testing

Running Tests

cd apps/api-server
pytest

Test Structure

# tests/integration_tests/test_fastapi.py
import pytest
from fastapi.testclient import TestClient
from main import app

client = TestClient(app)

def test_health_endpoint():
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"

def test_chat_endpoint():
    response = client.post(
        "/api/chat",
        json={"message": "Hello", "sessionId": "test"}
    )
    assert response.status_code == 200
    assert "response" in response.json()

Configuration

Environment Variables

API_HOST=0.0.0.0
API_PORT=8000
DEBUG=False
LOG_LEVEL=INFO
CORS_ORIGINS=["http://localhost:3000"]
DATABASE_URL=postgresql://user:pass@localhost/olis
DATABASE_POOL_SIZE=20
DATABASE_MAX_OVERFLOW=10
REDIS_URL=redis://localhost:6379
REDIS_CACHE_TTL=3600
REDIS_MAX_CONNECTIONS=50
VECTOR_DB_URL=http://localhost:6333
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
CHUNK_SIZE=512
CHUNK_OVERLAP=50
TOP_K_DOCUMENTS=10
RERANK_ENABLED=True
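One way to load these variables with types and defaults is a small settings object, sketched below with only the standard library and a subset of the variables above. The project may instead use pydantic's BaseSettings, which adds .env-file support and validation.

```python
# Illustrative settings loader; names and defaults mirror the list above.
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    api_host: str
    api_port: int
    debug: bool
    chunk_size: int
    chunk_overlap: int
    top_k_documents: int

    @classmethod
    def from_env(cls) -> "Settings":
        """Read each variable from the environment, with typed defaults."""
        return cls(
            api_host=os.getenv("API_HOST", "0.0.0.0"),
            api_port=int(os.getenv("API_PORT", "8000")),
            debug=os.getenv("DEBUG", "False").lower() == "true",
            chunk_size=int(os.getenv("CHUNK_SIZE", "512")),
            chunk_overlap=int(os.getenv("CHUNK_OVERLAP", "50")),
            top_k_documents=int(os.getenv("TOP_K_DOCUMENTS", "10")),
        )
```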

Performance Optimization

Caching

  • Redis for query results
  • Embedding cache
  • Response caching
  • Connection pooling
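The query-result cache boils down to a TTL-keyed lookup. The in-memory decorator below shows the idea; in production the store would be Redis, with REDIS_CACHE_TTL as the expiry, rather than a local dict.

```python
# In-memory stand-in for the Redis-backed query cache (illustrative).
import time
from functools import wraps

def ttl_cache(ttl_seconds: float = 3600):
    """Cache results keyed by positional args, expiring after ttl_seconds.

    Only hashable positional arguments are supported in this sketch.
    """
    def decorator(fn):
        store: dict = {}

        @wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]  # fresh cached value
            value = fn(*args)
            store[args] = (now, value)
            return value
        return wrapper
    return decorator
```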

Async Operations

  • Async/await throughout
  • Non-blocking I/O
  • Background tasks
  • Parallel processing
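Non-blocking I/O and parallel processing typically come down to asyncio.gather: independent lookups are issued concurrently instead of one after another. A sketch, with placeholder function names rather than the project's actual retriever API:

```python
# Illustrative concurrent retrieval with asyncio.
import asyncio

async def fetch_documents(query: str) -> list[str]:
    """Stand-in for an async vector-database lookup."""
    await asyncio.sleep(0.01)  # simulated non-blocking I/O
    return [f"doc for {query}"]

async def fetch_all(queries: list[str]) -> list[list[str]]:
    """Issue all lookups concurrently; total latency is roughly the
    slowest single lookup, not the sum of all of them."""
    return list(await asyncio.gather(*(fetch_documents(q) for q in queries)))

results = asyncio.run(fetch_all(["rag", "fastapi"]))
```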

Database Optimization

  • Connection pooling
  • Query optimization
  • Index management
  • Batch operations

Monitoring

  • Prometheus metrics
  • Request logging
  • Error tracking
  • Performance profiling
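Request timing, the simplest of these hooks, reduces to logic like the context manager below. In the app this would run per request, for example inside an @app.middleware("http") handler with the request path as the label; the names here are illustrative.

```python
# Sketch of per-request timing for request logging / profiling.
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("olis.api")

@contextmanager
def timed(label: str):
    """Log how long the wrapped block took, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s took %.1f ms", label, elapsed_ms)
```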

Troubleshooting

Problem: uvicorn fails to start. Solution:
# Check port availability
netstat -an | grep 8000

# Try different port
uvicorn main:app --port 8001

# Check Python version
python --version  # Should be 3.9+
Problem: Module import failures. Solution:
# Verify virtual environment is activated
which python

# Reinstall dependencies
pip install --force-reinstall -r requirements.txt

# Check for missing packages
pip list
Problem: Cannot connect to database. Solution:
# Verify DATABASE_URL format
# postgresql://user:password@host:port/database

# Test connection directly
psql $DATABASE_URL

# Check firewall/network settings
telnet host port
Problem: Document ingestion or retrieval fails. Solution:
  1. Run preflight check: ./scripts/preflight_rag.sh
  2. Check vector database is running
  3. Verify embedding model is downloaded
  4. Check logs for specific errors
  5. Test with simple document first

Next Steps