A protégé that apprentices under a project until it can speak for it.
Project-scoped memory. Distributable to the edge. Model-agnostic by design.
A brain isn't a giant memory dump. It's a protégé assigned to a project's perspective — aquaponics, hockey, a legal practice — and it learns under that perspective. In this implementation, scoping is a domain field on every capture, so cross-domain search still works when you want it. One brain, many specialists.
A Pi running a tiny model in a grow tent can hold the local view of its project, post captures back to the main brain, and only escalate to a cloud LLM as a last resort. The architecture follows the project, not the other way around. Same API surface — capture, search, recent — wherever the consumer lives.
A brain that's just storing notes is a database. A brain that's mature enough to mediate communication on the project's behalf — drafting replies, briefing collaborators, holding context across weeks — is the actual deliverable. Storage is the seed. Mediation is the harvest.
CAPTURE SURFACES                        CONSUMERS
────────────────                        ─────────
• CLI: brain capture "..."              • /today digest + resolve
• Web form at /today                    • Claude (via MCP)
• MCP from Claude / ChatGPT             • ChatGPT (CORS-allowed)
• Edge nodes (Pi, glasses)              • Edge nodes (read-back)
             │                               ▲
             ▼                               │
┌─────────────────────────────────────────────────┐
│               FASTAPI · port 8001               │
│   bearer auth · per-token sensitivity ceiling   │
│      /capture · /search · /recent · /stats      │
│   /flashcards/* · PATCH /capture/{id}/resolve   │
└────────────────────────┬────────────────────────┘
                         │
                 sensitivity check
                         │
           ┌─────────────┴─────────────┐
           │                           │
      public /                    privileged
      internal /                  (zero vector,
      highly_sensitive            stored only,
           │                      never embedded)
           ▼
    Voyage embeddings
           │
           ▼
┌─────────────────────────────────────────────────┐
│        QDRANT · single collection: notes        │
│ payload: type · domain · sensitivity · content  │
│    + extracted_people · tags · action_items     │
└─────────────────────────────────────────────────┘
Project-scoping via the domain field, privacy via the sensitivity field. Boring on purpose.

This isn't a sketch. It's a working system at brain.killercatfish.com, opinionated in places that matter and boring everywhere else. The opinions are the asset. They're what make the brain mature instead of just full.
Types and domains are configurable (config.json — add yours without code changes). Sensitivity tiers: public, internal, highly_sensitive, privileged.
Each auth token carries a max_sensitivity ceiling that clamps both reads and writes. One Qdrant collection, notes, vector size 1024.
Project-scoping via a domain payload field, not separate collections. Cross-domain search stays trivial. One source of truth. CORS allows chatgpt.com, so ChatGPT can hit the API directly.
Endpoints: POST /capture (rate-limited 30/min, auto-detects FLASHCARD format), GET /search, GET /recent, GET /stats, GET /notes/{id}, PATCH /capture/{id}/resolve, GET /flashcards/random, POST /flashcards/review. Plus a thin proxy on port 8002 that validates types/domains and stamps the date for the web form.

The /today page lists recent captures grouped by domain. A Done button on each one fires PATCH /capture/{id}/resolve — flips resolved=true, drops it from search and the digest.
FLASHCARD | domain | front | back — turns the brain into a study tool when you want one.
Flashcard payloads track review_count, last_reviewed, last_result. SRS scheduling next.

Capture surfaces: CLI via brain capture "TYPE | domain | content". Web form at /today. MCP server exposing capture/search/recent to Claude and ChatGPT. Edge nodes post directly to /capture.
This isn't a tutorial. It's an agent.
Two phases. First it interviews you about your project — what you have, what you want, what shape would actually fit. Then it offers the system above as a default, adapts it to what you said, and only then starts building. Your subscription, your keys, your tools. No autonomous account creation.
You are helping me set up "Open Brain" — a project-scoped, protégé-style memory system. Forked from a working pattern by Josh (killercatfish, brain.killercatfish.com).
You will run in two phases. Do not skip ahead. Do not write code in Phase 1.
═══════════════════════════════════════════════
PHASE 1 — DIAGNOSE
═══════════════════════════════════════════════
Ask me, conversationally, one or two questions at a time. Wait for answers.
1. What project (or projects) is this brain serving? It can be one big thing, several distinct things, or "my whole life." The answer shapes the domain list.
2. What's already in my system? Tell me how to check if I'm not sure. Look for: existing vector DBs (Qdrant, Weaviate, Pinecone, Chroma), embedding services (Voyage, OpenAI, Cohere), MCP servers I've configured, prior memory exports (Claude data export, ChatGPT export, Obsidian, Notion).
3. What do I want the brain to do for me FIRST?
a. Capture and recall — notes, decisions, context I keep losing
b. Mediate communication — drafts, replies, briefs on the project's behalf
c. Coordinate edge nodes — small models on devices that ping the brain
d. All three eventually, but start with one
4. What's my hardware reality? Cloud only? Homelab? Pi or Jetson at the edge?
5. What does a "good month from now" look like for this brain? Be concrete. ("It writes my weekly client update without me redoing context every time" beats "it helps me work better.")
6. Do I have any privacy constraints — material I want stored but NEVER embedded externally? (Legal work, medical, anything under privilege.)
After I answer, summarize back what you heard. Name the brain shape. Propose three concrete next moves. Do NOT scaffold anything yet.
═══════════════════════════════════════════════
PHASE 2 — OFFER THE PATTERN (still no code yet)
═══════════════════════════════════════════════
Present this as the default, adapting to my Phase 1 answers:
STORAGE
───────
• Qdrant, self-hosted via Docker. Persistent volume.
• ONE collection (call it `notes`). Vector size 1024 (or whatever your embedding model dictates).
• Project-scoping via a `domain` PAYLOAD FIELD — NOT separate collections. This keeps cross-domain search trivial.
CAPTURE GRAMMAR (the opinionated part)
──────────────────────────────────────
Format: TYPE | YYYY-MM-DD | domain | content
Types (suggested defaults, edit to taste):
DONE, TASK, IDEA, OPEN, CONTEXT, DECISION, DEFER, FLASHCARD
Domains: configurable in a config.json on the server. Adding a domain = editing the file. No code change.
FLASHCARD subgrammar: FLASHCARD | domain | front | back
Why this matters: the structure forces a beat of clarity at capture time. Searching for IDEA + aquaponics over the last 30 days is a single filtered query, not a re-read.
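As a sketch, the parser is a split-on-pipe with one branch for the FLASHCARD subgrammar. Type names follow the suggested defaults above; the exact dict keys are illustrative, not a fixed schema:

```python
TYPES = {"DONE", "TASK", "IDEA", "OPEN", "CONTEXT", "DECISION", "DEFER", "FLASHCARD"}

def parse_capture(raw: str) -> dict:
    """Parse 'TYPE | YYYY-MM-DD | domain | content' or the FLASHCARD subgrammar."""
    parts = [p.strip() for p in raw.split("|")]
    note_type = parts[0].upper()
    if note_type not in TYPES:
        raise ValueError(f"unknown type: {note_type}")
    if note_type == "FLASHCARD":
        # FLASHCARD | domain | front | back — no date field in the subgrammar
        _, domain, front, back = parts
        return {"type": "FLASHCARD", "domain": domain, "front": front, "back": back}
    _, day, domain, content = parts
    return {"type": note_type, "date": day, "domain": domain, "content": content}
```

Rejecting unknown types at parse time is what forces the beat of clarity: a capture that doesn't fit the grammar fails loudly instead of landing as mush.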
SENSITIVITY TIERS
─────────────────
Four tiers: public, internal, highly_sensitive, privileged.
• `privileged` notes are STORED but never embedded externally — zero vectors. They appear in /recent, never in semantic search. Privacy by design, not by config.
• Each auth token carries a `max_sensitivity` ceiling that clamps both reads and writes. A token with a ceiling of `internal` gets a 403 trying to write `privileged`, and its reads are silently clamped to its ceiling.
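A minimal sketch of the ceiling logic, assuming the four tiers are simply ordered lowest to highest:

```python
# Tier order, lowest to highest — a token's ceiling is the
# highest tier it may touch.
TIERS = ["public", "internal", "highly_sensitive", "privileged"]

def can_write(token_ceiling: str, note_sensitivity: str) -> bool:
    """Writes above the ceiling are rejected outright (the API returns 403)."""
    return TIERS.index(note_sensitivity) <= TIERS.index(token_ceiling)

def clamp_read(token_ceiling: str, notes: list[dict]) -> list[dict]:
    """Reads are silently filtered down to the ceiling, never errored."""
    limit = TIERS.index(token_ceiling)
    return [n for n in notes if TIERS.index(n["sensitivity"]) <= limit]
```

The asymmetry is deliberate: a rejected write tells the caller to use a stronger token; a clamped read leaks nothing, not even the fact that something was withheld.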
EMBEDDINGS
──────────
Voyage AI (voyage-3-large) by default. Skipped for `privileged`. Drop-in swap for OpenAI/Cohere if I prefer.
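The privileged skip can be as small as this — `embed` here is a placeholder for whichever client gets wired in (Voyage, OpenAI, Cohere), not a real SDK call:

```python
VECTOR_SIZE = 1024  # matches the collection's vector size

def vector_for(note: dict, embed) -> list[float]:
    """privileged notes get a zero vector: stored, visible in /recent,
    invisible to semantic search. Everything else goes through `embed`."""
    if note["sensitivity"] == "privileged":
        return [0.0] * VECTOR_SIZE
    return embed(note["content"])
```

Because the zero vector is produced before anything leaves the process, privileged content structurally cannot reach the embedding provider.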
API (FastAPI, bearer-token auth)
────────────────────────────────
POST /capture — embed + upsert (auto-detects FLASHCARD)
GET /search?q=&top_k=&domain=... — semantic search w/ filters
GET /recent?days=&limit=... — time-window scroll
GET /stats?days=... — aggregate counts
GET /notes/{id} — single fetch
PATCH /capture/{id}/resolve — flip resolved=true (drops from search/digest)
GET /flashcards/random?n=... — pull cards
POST /flashcards/review — log review
Plus a /capture-proxy that validates types/domains against config.json and stamps today's date for the web form.
CAPTURE SURFACES (all hit /capture)
───────────────────────────────────
• CLI shortcut: a one-liner like `brain capture "TYPE | domain | content"`
• Web form: a /today page with type/domain dropdowns hydrated from config
• MCP server exposing capture/search/recent to Claude and ChatGPT
• [Optional] Edge nodes — a Pi or similar runs a tiny model with narrow project scope, posts captures to /capture, escalates to cloud only when stuck
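The CLI shortcut is just an authenticated POST. A stdlib-only sketch — the `BRAIN_URL`/`BRAIN_TOKEN` env var names and the `text` payload field are assumptions, not part of the spec:

```python
import json
import os
import sys
import urllib.request

# Assumed env var names — point these at your own deployment.
BRAIN_URL = os.environ.get("BRAIN_URL", "http://localhost:8001")
BRAIN_TOKEN = os.environ.get("BRAIN_TOKEN", "")

def capture_request(raw: str) -> urllib.request.Request:
    """Build the POST /capture call behind `brain capture "..."`.
    The payload field name ("text") is an assumption — match your server."""
    return urllib.request.Request(
        f"{BRAIN_URL}/capture",
        data=json.dumps({"text": raw}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {BRAIN_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    with urllib.request.urlopen(capture_request(" ".join(sys.argv[1:]))) as resp:
        print(resp.status)
```

Wrap it in a two-line shell alias named `brain` and every surface, from terminal to Pi, is speaking the same grammar at the same endpoint.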
DAILY REVIEW
────────────
/today page: recent captures grouped by domain, each with a Done button that fires PATCH /capture/{id}/resolve.
Density chart of captures-per-day. Optional cron at 9pm that nudges on low-count days. Review is the membrane between capture and mediation.
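The digest and the density chart reduce to two small passes over recent notes — a sketch, assuming each note carries `domain`, `date`, and `resolved` payload fields:

```python
from collections import Counter, defaultdict

def today_digest(notes: list[dict]) -> dict[str, list[dict]]:
    """The /today view: unresolved captures bucketed by domain."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for note in notes:
        if not note.get("resolved"):
            grouped[note["domain"]].append(note)
    return dict(grouped)

def capture_density(notes: list[dict]) -> Counter:
    """Captures per day — feeds the density chart and the 9pm low-count nudge."""
    return Counter(note["date"] for note in notes)
```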
After presenting this, draw a small ASCII diagram showing how it maps onto MY project specifically, based on Phase 1. Then ask: "Want me to build this version, or do you want to swap something?"
═══════════════════════════════════════════════
PHASE 3 — BUILD (only after I confirm direction)
═══════════════════════════════════════════════
Request access tools as you need them, not all at once:
• Bash / shell execution
• Docker (for Qdrant)
• File system R/W in the working directory
• Git (commit as you go — do not push)
Build the minimum viable shape we agreed on:
• Repo scaffold: src/, scripts/, tests/, README.md, docker-compose.yml, config.json, .env.example, .gitignore
• Qdrant via Docker Compose with a named persistent volume
• FastAPI server with bearer auth, sensitivity ceilings, and the endpoints above
• config.json driving types + domains; reload on startup, exposed via /api/capture/config
• CLI shortcut script (POSTs to /capture from the shell)
• Minimal /today web page (HTML + a tiny JS for the Done button)
• MCP server exposing capture/search/recent to Claude
• Memory importers if I have prior data: Claude export, ChatGPT export — dedupe by SHA-256 content hash, support --dry-run
• Pytest tests for at least: capture happy path, sensitivity ceiling enforcement, FLASHCARD parsing, resolve flow
• Smoke test: capture a row, search for it, resolve it, confirm it drops from /recent
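The ceiling-enforcement test can be sketched like this — `write_allowed` is a stand-in for the server-side check; the real suite would run the same asserts against the API with differently-scoped tokens:

```python
# test_sensitivity.py — the shape of the ceiling-enforcement tests.
TIERS = ["public", "internal", "highly_sensitive", "privileged"]

def write_allowed(ceiling: str, tier: str) -> bool:
    # Stand-in for the server check that returns 403 on violation.
    return TIERS.index(tier) <= TIERS.index(ceiling)

def test_internal_token_cannot_write_privileged():
    assert not write_allowed("internal", "privileged")

def test_ceiling_token_writes_everything():
    assert all(write_allowed("privileged", t) for t in TIERS)
```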
═══════════════════════════════════════════════
WHAT YOU DO NOT DO
═══════════════════════════════════════════════
• Do not create paid accounts on my behalf. Signup CAPTCHAs and payment flows are not reliably automatable.
• Do not edit files outside the project directory without showing me the diff first.
• Do not silently retry on failure. Surface the error and ask.
• Do not rush past Phase 1. The diagnosis IS the work.
• Do not assume my stack matches the default. Adapt.
═══════════════════════════════════════════════
WHAT YOU ASK ME FOR
═══════════════════════════════════════════════
• API keys for embedding services — I'll sign up myself
• Paths to memory exports if I want backfill
• Confirmation before any rm or destructive action
• Confirmation before adding edge nodes (those need matching hardware)
═══════════════════════════════════════════════
WHEN YOU FINISH PHASE 3
═══════════════════════════════════════════════
• Print the running MCP tool list
• Print Qdrant collection stats
• Print the smoke-test round-trip latency
• Print "next 3 moves" — voice notes, calendar/email, an edge node, daily digest cron, a second domain
• Commit. Do not push.
Begin Phase 1. Ask the first question.
The agent won't create paid accounts on your behalf. Voyage, hosting, anything with a CAPTCHA — manual. Five minutes of your time beats a brittle automation that fails the night you actually need the brain.
The agent won't push pages to your domain or send messages from your accounts. If something needs to leave your machine, you sign off on it.
The default uses Qdrant + Voyage + FastAPI because that's what's working. Already on Pinecone? Already on OpenAI embeddings? Already on Chroma? The agent adapts. The capture grammar and the sensitivity tiers are the asset; the tools are negotiable.
Have years of Claude or ChatGPT history? Drop the data exports in ./imports/ and the agent backfills, deduped by content hash. Idempotent. Safe to re-run.
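The dedupe is a content hash kept in a seen-set — a sketch of why re-running an import is safe:

```python
import hashlib

def content_hash(text: str) -> str:
    """SHA-256 of normalized content — the dedupe key for backfill."""
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()

def dedupe(incoming: list[str], seen: set[str]) -> list[str]:
    """Idempotent backfill: a note whose hash is already in the brain
    is skipped, so the same export can be imported twice without dupes."""
    fresh = []
    for text in incoming:
        h = content_hash(text)
        if h not in seen:
            seen.add(h)
            fresh.append(text)
    return fresh
```

In practice `seen` would be loaded from hashes already stored in the Qdrant payloads; here it is an in-memory set for illustration.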
If you're a lawyer (or a doctor, or anyone with privileged material), the privileged sensitivity tier means content gets stored without ever leaving your infrastructure. Token ceilings make leakage structurally impossible, not just policy-prevented.
Once the brain is alive, point a low-power device at it. The node holds the narrow view, the main brain holds the wide one, the pattern repeats per project. That's the part that scales to a life, not just a habit.