Architecture

How Edward is structured under the hood.

System Overview

Frontend (Next.js :3000)  →  Backend (FastAPI :8000)  →  PostgreSQL (:5432)
                                      ↓
                              LangGraph Agent
                                      ↓
                              Claude API (Anthropic)

The frontend is a Next.js app that communicates with the FastAPI backend via REST and SSE (Server-Sent Events) for streaming. The backend manages all state in PostgreSQL, including conversation checkpoints, memories, documents, and configuration.

LangGraph Flow

Every message goes through a four-node LangGraph pipeline:

preprocess → retrieve_memory → respond → extract_memory → END
  1. preprocess — normalizes the input and sets up conversation metadata
  2. retrieve_memory — searches pgvector for relevant memories using hybrid 70% vector + 30% BM25 scoring
  3. respond — calls Claude with the full context and tool bindings, enters a tool loop (up to 5 iterations)
  4. extract_memory — uses Claude Haiku to identify any memorable information from the conversation and stores it
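The four steps above can be sketched as plain functions over a shared state dict. This is an illustrative stand-in, not the actual LangGraph wiring: the real nodes live in the graph service, and the bodies here are stubs that only show how state flows from node to node.

```python
def preprocess(state: dict) -> dict:
    # Normalize the input and attach conversation metadata.
    state["input"] = state["input"].strip()
    state.setdefault("metadata", {"turn": state.get("turn", 0) + 1})
    return state

def retrieve_memory(state: dict) -> dict:
    # Stand-in for the hybrid pgvector + BM25 memory search.
    state["memories"] = []  # would hold retrieved memory records
    return state

def respond(state: dict) -> dict:
    # Stand-in for the Claude call plus tool loop.
    state["response"] = f"echo: {state['input']}"
    return state

def extract_memory(state: dict) -> dict:
    # Stand-in for Haiku-based memory extraction.
    state["extracted"] = []
    return state

def run_pipeline(state: dict) -> dict:
    # Nodes run in the fixed order shown in the diagram above.
    for node in (preprocess, retrieve_memory, respond, extract_memory):
        state = node(state)
    return state
```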

Tool Loop

Inside the respond node, Edward can call tools iteratively:

LLM response
    ↓
tool_calls? ──yes──> execute tools
    │                     ↓
    no              add ToolMessage
    ↓                     ↓
 stream response    loop (max 5x)

This allows Edward to chain actions — for example, searching the web, reading the page content, then summarizing it — all within a single conversation turn.
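The loop in the diagram can be sketched as follows. `call_llm` and `execute_tool` are hypothetical callables standing in for the Claude client and the tool registry; the message shapes are illustrative.

```python
MAX_TOOL_ITERATIONS = 5  # matches the "max 5x" cap in the diagram

def run_tool_loop(call_llm, execute_tool, messages):
    """Keep calling the LLM while it requests tools, appending each tool
    result to the message history, up to the iteration cap."""
    for _ in range(MAX_TOOL_ITERATIONS):
        reply = call_llm(messages)
        tool_calls = reply.get("tool_calls")
        if not tool_calls:
            return reply  # no tools requested: stream this response
        for call in tool_calls:
            result = execute_tool(call)
            messages.append({"role": "tool", "content": result})
    # Cap reached: ask for a final answer without executing more tools.
    return call_llm(messages)
```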

Memory System

  • Embedding model: sentence-transformers (all-MiniLM-L6-v2, 384 dimensions)
  • Search: Hybrid 70% vector similarity + 30% BM25 keyword matching
  • Memory types: fact, preference, context, instruction
  • Extraction: Claude Haiku identifies memorable info after each turn
  • Deep Retrieval: For complex conversations, runs 4 parallel queries (original + 3 Haiku-rewritten) for richer context
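The 70/30 hybrid blend can be sketched as below. This assumes the vector similarity is already in [0, 1] (e.g. cosine similarity) and that BM25 scores are normalized against the best keyword score in the candidate set; the actual normalization scheme may differ.

```python
def hybrid_score(vector_sim: float, bm25: float, max_bm25: float) -> float:
    """Blend vector similarity and keyword relevance at the 70/30 split
    described above. bm25 is normalized against the best BM25 score in
    the candidate set so both terms share a [0, 1] scale."""
    bm25_norm = bm25 / max_bm25 if max_bm25 > 0 else 0.0
    return 0.7 * vector_sim + 0.3 * bm25_norm
```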

Read the full memory system deep dive →

Background Systems

Edward runs several background loops that operate independently of user conversations:

Heartbeat

Monitors iMessage, Apple Calendar, and Apple Mail for incoming items. Multi-layer triage classifies urgency: Layer 1 (zero-cost rules) → Layer 2 (Haiku classification) → Layer 3 (execute action). Can ignore, remember, respond, or push-notify based on classification.
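The three-layer funnel can be sketched as a single function: Layer 1 applies zero-cost rules so only surviving messages pay for the Haiku call in Layer 2, and Layer 3 maps the classification to an action. The rule details and labels here are illustrative, not Edward's actual rules.

```python
def triage(message: dict, classify_with_haiku) -> str:
    """Classify an incoming item through the layered funnel.
    classify_with_haiku is a stand-in for the cheap LLM classifier."""
    # Layer 1: zero-cost rules — drop obvious noise without an LLM call.
    if message.get("sender") in {"no-reply", "newsletter"}:
        return "ignore"
    # Layer 2: cheap LLM classification for everything that survives.
    urgency = classify_with_haiku(message["text"])  # e.g. "low" / "high"
    # Layer 3: execute the action implied by the classification.
    if urgency == "high":
        return "push_notify"
    return "remember"
```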

Read the full heartbeat deep dive →

Reflection

Post-turn enrichment that generates 3-5 Haiku queries to find related memories. Results are stored and loaded on the next turn for deeper context. Runs asynchronously — no latency impact.
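The fan-out can be sketched with asyncio.gather, which is what keeps it off the response path. `rewrite_query` and `search_memories` are hypothetical stand-ins for the Haiku call and the memory search.

```python
import asyncio

async def reflect(turn_text, rewrite_query, search_memories, n_queries=4):
    """Generate rewritten queries and search them concurrently, then
    dedupe the combined results for loading on the next turn."""
    queries = await asyncio.gather(
        *(rewrite_query(turn_text) for _ in range(n_queries))
    )
    results = await asyncio.gather(*(search_memories(q) for q in queries))
    # Flatten and dedupe while preserving order.
    return list(dict.fromkeys(m for batch in results for m in batch))
```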

Consolidation

Hourly background loop that clusters related memories via Haiku. Creates connections between related memories and flags quality/staleness issues. Disabled by default.

Scheduler

In-process asyncio loop that polls every 30 seconds for due events. Executes scheduled events via the same chat_with_memory() function used in conversation — meaning Edward has full tool access when processing scheduled actions.
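The polling loop can be sketched as below. `fetch_due_events` and `chat_with_memory` are stand-ins for the real DB query and chat entry point; routing events through the conversational entry point is what gives Edward full tool access here.

```python
import asyncio

POLL_INTERVAL = 30  # seconds, matching the loop described above

async def scheduler_loop(fetch_due_events, chat_with_memory, *, once=False):
    """Poll for due events and run each through the same entry point
    used for normal chat."""
    while True:
        for event in fetch_due_events():
            # Reusing the chat entry point means scheduled actions get
            # the agent's full tool set.
            await chat_with_memory(event["prompt"])
        if once:  # test hook: run a single poll cycle, then exit
            return
        await asyncio.sleep(POLL_INTERVAL)
```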

Orchestrator

Spawns lightweight worker agents (mini-Edwards) that run within Edward's process. Workers have full tool access, memory retrieval, and state persistence. Worker conversations appear in the sidebar with a distinct source tag.

Read the full orchestrator deep dive →

Key Directories

meet-edward/
├── frontend/              # Next.js app
│   ├── app/               # Pages and routes
│   ├── components/        # React components
│   └── lib/               # API client, context, utilities
├── backend/
│   ├── main.py            # FastAPI app + lifespan
│   ├── routers/           # API route handlers
│   ├── services/
│   │   ├── graph/         # LangGraph agent (nodes, state, tools)
│   │   ├── execution/     # Code execution sandboxes
│   │   ├── heartbeat/     # Message monitoring + triage
│   │   ├── memory_service.py
│   │   ├── document_service.py
│   │   ├── tool_registry.py
│   │   ├── skills_service.py
│   │   └── ...
│   └── start.sh           # Backend startup script
├── site/                  # Marketing site (meet-edward.com)
├── setup.sh               # First-time installation
└── restart.sh             # Service management

Startup Order

The backend initializes components in a specific order (defined in main.py lifespan). The tool registry must come after all tool sources are initialized:

  1. Database + LangGraph — tables and checkpoint store
  2. Skills — load enabled states
  3. MCP clients — WhatsApp, Apple Services subprocesses
  4. Custom MCP servers — user-added servers from DB
  5. Tool registry — must be after all tool sources
  6. Scheduler — polls for due events
  7. Heartbeat — iMessage listener + triage loop
  8. Consolidation — hourly memory clustering
  9. Evolution — check for pending deploys
  10. Orchestrator — recover crashed worker tasks

Every component has a matching shutdown hook, and shutdown runs in reverse order.
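The start-in-order, stop-in-reverse pattern can be sketched as a lifespan context manager. The component objects and `app.state.components` attribute here are illustrative; the real sequence lives in main.py.

```python
from contextlib import asynccontextmanager

@asynccontextmanager
async def lifespan(app):
    """Start components in their declared order; on shutdown, stop only
    the ones that actually started, in reverse."""
    started = []
    for component in app.state.components:  # assumed ordered as listed above
        await component.start()
        started.append(component)
    try:
        yield
    finally:
        for component in reversed(started):
            await component.stop()
```

Stopping in reverse matters because later components (e.g. the tool registry) depend on earlier ones (e.g. MCP clients) still being alive while they shut down.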

SSE Event Protocol

The chat endpoint (POST /api/chat) streams structured events for real-time UI updates:

Event               Description
thinking            LLM is processing (between tool calls)
tool_start          Tool execution beginning
code                Code/query/command to execute
execution_output    stdout/stderr from code execution
execution_result    Code execution completed
tool_end            Tool execution finished
content             Text content from LLM response
done                Stream complete
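A minimal client-side parser for this stream might look like the sketch below. It assumes each event arrives as an SSE `data:` line carrying a JSON object with a `type` field matching the table; the actual wire format may differ in details (e.g. separate `event:` lines).

```python
import json

def parse_sse(stream_lines):
    """Collect JSON event payloads from raw SSE lines, stopping at the
    terminal "done" event."""
    events = []
    for line in stream_lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines, comments, etc.
        payload = json.loads(line[len("data:"):].strip())
        events.append(payload)
        if payload.get("type") == "done":
            break  # stream complete
    return events
```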