Marvin

From The Marvin Manual, a working reference. (Revised .)

Marvin is a single-tenant personal assistant that lives inside Telegram. He has read access to your Gmail, the ability to draft (never send) replies in your writing voice, custody of a Google Tasks list named “Marvin”, and a strict daily LLM budget.

What sets him apart from a typical bot is scope: Marvin is also the orchestration side of a small self-extension pipeline. Given a one-sentence description of a new capability, he can write the code, run its tests, deploy it, and restart himself — under your approval in gated mode, or fully autonomously in auto mode (the current setting as of 21 May 2026). The first module shipped end-to-end through this pipeline was hello-demo; the most recent, the help command, was built and deployed entirely hands-off.

How Marvin builds himself, in one example

The headline. Marvin runs in auto mode: you describe a new capability in one sentence, and he writes the code, runs the tests, deploys it to himself, and restarts — with zero approvals from you. You get a DM when it starts and when it’s done.

A real example — the help command this manual links to

The help command was built this exact way. The whole interaction was one message from you and two replies from Marvin:

build help: add a command that lists all of Marvin’s commands
Build “help” queued (auto mode) — id 5c1d9af3. I’ll build, test, and deploy it. No approval needed.
✅ help is deployed and live. Try it: send “help”.

Between his two replies, Marvin did all of this on his own:

Expanded your one sentence into a full structured build brief (LLM call, ~$0.01).
Auto-passed gate 1 — in auto mode there is no “approve” step.
Copied his own code into a sandbox and ran claude -p (capped at $3) to write the new module and its tests.
Ran the full test suite (539 tests) and tsc --noEmit; both were green. Had either failed, the build would have stopped here and nothing would have shipped.
Auto-passed gate 2 — in auto mode there is no “deploy” step.
rsync’d the new module into production, rebuilt the Docker image, and restarted himself. Live.

Real timeline from that run (21 May 2026) — start to finish, untouched:

04:32  queued
04:34  building in sandbox
04:37  built  — tests + types pass
04:38  auto-approved for deploy   ← no human here
04:39  deploying
04:44  done — live    (~11.5 min total)

What you do: send one message, build <name>: <what you want>. That’s the entire interaction. Builds that fail their tests stop safely at status failed and never ship; re-send the request with the error pasted in and Marvin will usually fix it on the next pass. Want a checkpoint back? Use build mode autorun (you approve the deploy) or build mode gated (you approve everything). Full mechanics are in The Self-Build Loop below.

First Contact

Open Telegram and message @marvin192_bot. Only your configured chat ID is authorized; every other inbound message is silently dropped. Commands are plain text with no leading slash: ping, tasks, train persona, build foo: do a thing.
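The two rules above (one authorized chat ID, plain-text commands) can be sketched as a single parsing step. This is a minimal illustration, not Marvin's actual code: the InboundMessage and ParsedCommand shapes, the parseInbound name, and the set of characters allowed in a build name are all assumptions made for the example.

```typescript
// Hypothetical shapes; the manual does not document Marvin's real types.
interface InboundMessage {
  chatId: number;
  text: string;
}

type ParsedCommand =
  | { kind: "build"; name: string; description: string }
  | { kind: "plain"; command: string; args: string };

// Returns null for any sender other than the configured chat ID,
// mirroring the "silently dropped" rule.
function parseInbound(msg: InboundMessage, authorizedChatId: number): ParsedCommand | null {
  if (msg.chatId !== authorizedChatId) return null;

  const text = msg.text.trim();

  // "build <name>: <description>" is the one command with a colon-separated payload.
  // The allowed name characters here are an assumption.
  const build = /^build\s+([A-Za-z0-9_-]+)\s*:\s*([\s\S]+)$/.exec(text);
  if (build) {
    const [, name = "", description = ""] = build;
    return { kind: "build", name: name, description: description.trim() };
  }

  // Everything else: the first word is the command, the rest is its argument string.
  const space = text.indexOf(" ");
  if (space === -1) return { kind: "plain", command: text, args: "" };
  return { kind: "plain", command: text.slice(0, space), args: text.slice(space + 1).trim() };
}
```

With this sketch, build mode and build list fall through to the plain branch (command build, with mode or list as the argument), because only a message containing the name-colon-description form is treated as a new build request.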

The four useful openings

ping ⇒ pong  ·  Marvin is awake
last ⇒ Most recent inbox-y email: subject, from, snippet
tasks ⇒ Open items on your Marvin list, with short ids
build mode ⇒ Current self-build mode: gated  ·  (gated, autorun, auto)

Marvin is conversational where the situation allows and brusque where it calls for it. He won’t pad short answers; he also won’t hold back a long one if the task warrants it.

Commands at a Glance

Every chat command Marvin currently recognises, grouped by the module that owns it. Arguments are wrapped in <…>, with accepted values listed in […]; the right column gives the owning module. Reach for this table when you forget the exact phrasing.

Command | What it does | Module
ping | Replies pong. Confirms the bot is alive. | ping
last | Show the most recent meaningful email in your inbox. | email-drafter
draft <topic> | Draft a reply (saved to Gmail Drafts, not sent) in your voice. | email-drafter
redraft | Re-roll the last draft with different wording. | email-drafter
tasks | List open items on your Google Tasks “Marvin” list. | task-tracker
task <text> | Create a new task on the Marvin list. | task-tracker
task from <email> | Convert an email into a follow-up task with backtrace. | task-tracker
done <id> | Mark a task complete. | task-tracker
draft from task <id> | Generate an email draft seeded by a task. | task-tracker
train persona | Rebuild your writing-voice profile from sent emails. | persona-trainer
persona | Show the current voice profile (tone, signoffs, etc.). | persona-trainer
build <name>: <desc> | Start a new self-build request. | selfbuild
build list | All open and recent build requests with statuses. | selfbuild
build show <id> | Show the expanded brief / latest report. | selfbuild
build approve <id> | Gate 1: release a request to the host runner. | selfbuild
build deploy <id> | Gate 2: ship a built request to production. | selfbuild
build reject <id> | Cancel any non-terminal request (alias: build cancel). | selfbuild
build mode [gated|autorun|auto] | Show or set: gated, autorun, auto. | selfbuild
hello | Demonstration module from the original self-build proof. | hello-demo
help | List all of Marvin's commands in chat, with a link to this manual. | help

Several modules also run on a schedule rather than waiting for a command: gmail-scanner polls every 10 minutes, gcp-watchdog probes every 6 hours, selfbuild watches the host queue every 2 minutes, and task-tracker sends a morning digest each day. signoff-tracker is an internal helper consumed by email-drafter and has no chat commands.

The Modules

Each module is a small, independent unit of behaviour. The kernel loads them at startup, gives each a sandboxed context object, and registers their chat triggers and scheduled jobs. Modules can read and write a small shared key-value store, send DMs, talk to an LLM, and log to the audit trail — nothing more.

email-drafter (drafts replies)

Reads the most recent email in your inbox and produces a reply in your own writing voice. The draft is saved to Gmail Drafts and is never sent automatically. Picks a signoff appropriate to the recipient (see signoff-tracker).

Command | What it does
last | Show the most recent inbox-y email Marvin saw.
draft <topic> | Draft a reply on the given topic, in your voice. Topic is optional.
redraft | Try again on the previous draft: different phrasing, same intent.

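Drafts always land in Gmail Drafts for your review; sending is on you. This is deliberate: Marvin only ever creates drafts. Note that the Gmail OAuth scope used (gmail.compose) is documented by Google as covering both managing drafts and sending, so the never-send guarantee rests on Marvin's code calling only the drafts endpoints, not on the scope alone.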

task-tracker (Google Tasks)

Custody of a Google Tasks list literally named Marvin. Strict scoping: Marvin will never read, write, or surface any task on any other list. Items created by Marvin carry a small [from-email] backtrace when relevant.

Command | What it does
tasks | List open items on the Marvin list with short ids.
task <text> | Create a task with the given title.
task from <email-ref> | Convert a recent email into a follow-up task (annotated with sender + subject).
done <id> | Mark the matching task complete.
draft from task <id> | Generate a Gmail draft seeded by the task and its email backtrace.

A morning digest fires daily on cron 0 8 * * * — Marvin DMs you a short summary of open tasks.

persona-trainer (voice profile)

Builds and stores a writing-voice profile by analysing your sent emails. The profile is consumed by email-drafter when generating replies. Stored at the shared key persona/default.

Command | What it does
train persona | Rebuild the profile from a fresh sample of sent emails.
persona | Show the current profile (tone, formality, signoffs, distinctive phrases, things avoided).

Sampling (current behaviour, as of 20 May 2026)

The current sampler draws roughly 150 sent emails and balances them across recipients. The earlier version sampled only 50 emails with no recipient balancing. The upgrade landed through the self-build pipeline itself, a piece of in-house dogfooding.

gmail-scanner (scheduled · 10m)

Runs on cron */10 * * * *. For each new unread email since the last poll, an LLM-based triage step classifies it as ACT, FYI, or IGNORE. Only ACT mails earn a DM. FYI is logged silently; IGNORE is dropped. This is what keeps your phone from buzzing on every newsletter and cold pitch.

Triage is cost-guarded — a per-email budget keeps the daily LLM cap intact even on busy inbox days. If the daily cap is breached, the scanner short-circuits and reports the breach instead of triaging blindly.
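The routing described above can be sketched in a few lines. This is an illustration under stated assumptions, not the scanner's real code: the Email and ScanResult shapes, the scanInbox name, and the wording of the breach message are hypothetical, and the classifier is passed in as a plain synchronous function standing in for the LLM call.

```typescript
// Hypothetical shapes for the example; not Marvin's real types.
interface Email {
  id: string;
  subject: string;
}

type TriageLabel = "ACT" | "FYI" | "IGNORE";

interface ScanResult {
  dms: string[];     // messages that would be sent as DMs
  logged: string[];  // ids of FYI emails recorded silently
  dropped: number;   // count of IGNORE emails
}

// classify stands in for the LLM triage step; capBreached stands in for
// whatever signal the kernel exposes about the daily budget (an assumption).
function scanInbox(
  emails: Email[],
  classify: (email: Email) => TriageLabel,
  capBreached: boolean
): ScanResult {
  const result: ScanResult = { dms: [], logged: [], dropped: 0 };

  // Short-circuit: report the breach instead of triaging blindly.
  if (capBreached) {
    result.dms.push("Daily LLM cap breached; triage skipped.");
    return result;
  }

  for (const email of emails) {
    switch (classify(email)) {
      case "ACT":
        result.dms.push("ACT: " + email.subject); // only ACT earns a DM
        break;
      case "FYI":
        result.logged.push(email.id);             // logged silently
        break;
      case "IGNORE":
        result.dropped += 1;                      // dropped
        break;
    }
  }
  return result;
}
```

Keeping the classifier as a parameter makes the routing testable without an LLM: a stub that maps subjects to labels is enough to check that only ACT mail produces a DM.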

signoff-tracker (helper · no commands)

Tracks which signoff name (Yuen vs Yuen-Ho, etc.) you use with each recipient context. email-drafter consults it when choosing how to sign a generated draft. Has no chat commands; it observes and informs.

gcp-watchdog (scheduled · 6h)

Cron 23 */6 * * *. Pings the configured Google Cloud project and verifies the OAuth refresh chain is healthy. If the project is deleted, suspended, or its OAuth client revoked, Marvin DMs a clear alert with a recovery pointer. Born of a real May 2026 incident in which the GCP project hosting his OAuth client was deleted by hand.
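To see what a schedule like 23 */6 * * * means in practice, the sketch below lists the times of day at which a five-field cron expression fires. It is a deliberately small matcher, not a full cron implementation: it handles only *, */n, and plain numbers in the minute and hour fields, and it ignores the three date fields, which is valid only because they are all * in the schedules quoted in this manual. The function names are made up for the example.

```typescript
// Matches one cron field against a value. Supports only "*", "*/n",
// and a plain number, which covers the schedules quoted in this manual.
function fieldMatches(field: string, value: number): boolean {
  if (field === "*") return true;
  if (field.slice(0, 2) === "*/") {
    const step = Number(field.slice(2));
    return step > 0 && Math.floor(step) === step && value % step === 0;
  }
  return Number(field) === value;
}

// Left-pads a number to two digits, e.g. 6 becomes "06".
function twoDigits(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Lists the HH:MM times in one day at which the expression fires.
// Simplification: the day-of-month, month, and day-of-week fields are ignored.
function dailyFireTimes(expr: string): string[] {
  const [minute = "*", hour = "*"] = expr.trim().split(/\s+/);
  const times: string[] = [];
  for (let h = 0; h < 24; h++) {
    for (let m = 0; m < 60; m++) {
      if (fieldMatches(hour, h) && fieldMatches(minute, m)) {
        times.push(twoDigits(h) + ":" + twoDigits(m));
      }
    }
  }
  return times;
}

// dailyFireTimes("23 */6 * * *")  returns ["00:23", "06:23", "12:23", "18:23"]
// dailyFireTimes("0 8 * * *")     returns ["08:00"]
```

The same helper applied to */10 * * * * yields 144 entries, one every ten minutes. The times are in whatever timezone the host's scheduler uses, which this manual does not state.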

selfbuild (the lever)

The module that lets Marvin extend himself. Receives a one-sentence build request, expands it into a structured brief via an LLM call, writes the request to a host queue, and tracks its lifecycle (see The Self-Build Loop). The module itself never executes code — all building, testing, and deploying is performed by a host-side runner under a strict budget.

Command | What it does
build <name>: <desc> | Create a new build request (status: pending-approval).
build list | All current and recent requests with statuses.
build show <id> | Brief, history, and latest report for a single request.
build approve <id> | Gate 1. Hands the request to the runner. Only relevant in gated mode.
build deploy <id> | Gate 2. Ships the built artifact to /opt and restarts the bot.
build reject <id> | Reject any non-terminal request. Alias: build cancel.
build mode [gated|autorun|auto] | Show, or change, the current automation level.

ping (liveness)

The smallest possible module. ping ⇒ pong. Useful as a sanity check after a restart, after the network has been flaky, or to confirm the Telegram adapter is connected.

hello-demo (the proof)

A small ceremonial module. hello ⇒ “hello from a module Marvin built itself.” The very first module shipped end-to-end through the self-build pipeline, kept around as a live demonstration that the pipeline works.

help (in-chat reference)

help ⇒ a formatted list of every Marvin command, with a link to this manual. Strictly read-only — no LLM, no state, no Gmail. Notable as the first module built and deployed fully autonomously in auto mode (request 5c1d9af3, 21 May 2026): request → build → test → deploy → restart, with zero human approvals.

The Self-Build Loop

Marvin can write his own modules. The mechanism is intentionally simple: a chat command stores a request on the host filesystem; a small systemd timer runs the host-side claude CLI inside a sandbox; and, in gated and autorun modes, the resulting diff and report come back to you for review before they touch the live container. In auto mode that review is skipped.

Lifecycle of a request

You DM build <name>: <description>. Marvin expands your one-liner into a structured brief via an LLM call (cost ≈ $0.01) and writes req-<id>.json to the queue. Status: pending-approval.
Marvin DMs you the brief preview and the request id. You read it with build show <id> if needed.
Gate 1. You reply build approve <id> or build reject <id>. Skipped in autorun / auto modes.
A systemd timer (every 2 minutes) picks up the approved request. The runner copies /opt/reboot-assistant into a sandbox, installs deps, and runs claude -p with a $3 hard budget cap.
Inside the sandbox, the agent edits the targeted module, writes tests, and reports back. The runner runs pnpm test and tsc --noEmit. Pass → status built; fail → status failed with the error attached.
Marvin DMs you the result with a link to the diff and report. Gate 2. You reply build deploy <id> or build reject <id>. Skipped in auto mode.
The runner rsyncs the sandbox modules-only into /opt, rebuilds the Docker image, and restarts the service. Status: done. New code is live.
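The lifecycle above reads naturally as a small state machine. The sketch below is one way to model it, offered as an illustration rather than the selfbuild module's real code. The statuses pending-approval, built, failed, and done come from this manual; approved, building, deploying, and rejected, along with all event and function names, are assumed for the example.

```typescript
type Mode = "gated" | "autorun" | "auto";

// pending-approval, built, failed, done are named in the manual;
// the remaining status names are assumptions.
type Status =
  | "pending-approval" | "approved" | "building" | "built"
  | "deploying" | "done" | "failed" | "rejected";

// Hypothetical event names for the things that move a request forward.
type BuildEvent =
  | "approve" | "reject" | "runner-pickup"
  | "checks-pass" | "checks-fail" | "deploy" | "deploy-complete";

// True when the given gate needs a human reply in the given mode.
function gateNeedsHuman(gate: 1 | 2, mode: Mode): boolean {
  if (gate === 1) return mode === "gated";
  return mode !== "auto";
}

// Pure transition function: returns the status after an event.
function nextStatus(status: Status, event: BuildEvent): Status {
  // Terminal statuses never change.
  if (status === "done" || status === "failed" || status === "rejected") return status;
  // Any non-terminal request can be rejected.
  if (event === "reject") return "rejected";
  if (status === "pending-approval" && event === "approve") return "approved";
  if (status === "approved" && event === "runner-pickup") return "building";
  if (status === "building" && event === "checks-pass") return "built";
  if (status === "building" && event === "checks-fail") return "failed";
  if (status === "built" && event === "deploy") return "deploying";
  if (status === "deploying" && event === "deploy-complete") return "done";
  return status; // events that do not apply are ignored
}

// The chat commands a human must send for a successful build in each mode.
function humanStepsFor(mode: Mode): string[] {
  const steps: string[] = [];
  if (gateNeedsHuman(1, mode)) steps.push("build approve");
  if (gateNeedsHuman(2, mode)) steps.push("build deploy");
  return steps;
}
```

In this model the mode never changes which statuses a request passes through; it only decides whether the approve and deploy events come from you or are issued automatically.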

The three modes

Mode | Gate 1 (approve) | Gate 2 (deploy) | Use when
gated | requires build approve | requires build deploy | Both gates require a person. (Original default.)
autorun | auto-passed on create | requires build deploy | You trust the brief, want to glance at the diff before it ships.
auto | auto-passed | auto-passed | Fully hands-off. Current setting (since 21 May 2026).

Change modes with build mode gated / autorun / auto; the value is persisted in shared state under selfbuild/mode. Marvin is currently in auto mode — a plain build <name>: <desc> now runs and deploys with no approvals; you just get a DM at start and at done. Switch to autorun or gated any time you want a human checkpoint back.

What the runner will not touch

The brief is scoped to modules/<name>/ only. The runner deploys by rsync-ing the sandbox’s modules/ directory back into /opt — the kernel, the Dockerfile, docker-compose.yml, .git/, systemd/, and secrets/ are all excluded by design. A pre-commit hook on the host repo also blocks any change to protected paths.
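The scoping rule can be expressed as an allow-list check: a changed path is acceptable only if it sits inside the module the brief names. The sketch below is an assumption about how such a check could look, not the runner's or the pre-commit hook's actual logic; the isDeployable name and the rejection of "." and ".." segments are choices made for the example.

```typescript
// Returns true only for relative paths inside modules/<moduleName>/.
// Hypothetical helper; the real runner relies on rsync scoping and a
// pre-commit hook whose rules this manual does not spell out.
function isDeployable(path: string, moduleName: string): boolean {
  const segments = path.split("/");

  // Reject absolute paths, empty segments, and attempts to climb out with "..".
  for (let i = 0; i < segments.length; i++) {
    const segment = segments[i];
    if (segment === "" || segment === "." || segment === "..") return false;
  }

  // Require modules/<moduleName>/<at least one more segment>.
  return segments.length >= 3 && segments[0] === "modules" && segments[1] === moduleName;
}
```

An allow-list has the useful property that anything not explicitly under the module directory is refused by default, so the Dockerfile, docker-compose.yml, .git/, systemd/, and secrets/ never need to be enumerated for the check to exclude them.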

If a build fails on a TypeScript or test error, re-issue the build with the failure pasted into the new brief. The agent fixes its own mistakes far more readily than it gets them right blind. The persona-trainer upgrade landed on the second pass for exactly this reason.

Costs & Limits

Marvin spends real money. Every LLM call is metered against two budgets — a daily cap that protects you from a runaway loop, and a per-build cap that bounds the cost of a single self-build run.

Daily LLM cap: $8/24h
Per-build cap: $3/run
Pre-call buffer: $0.05
Default model: sonnet 4.6

Before any chat call, the kernel checks projected spend against the daily cap with a $0.05 safety buffer. If a call would push the day over, the kernel emits BUDGET_EXCEEDED and Marvin DMs you instead of charging. Self-build runs are also capped individually at $3 via the --max-budget-usd flag on the host runner.
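The pre-call check amounts to one comparison. The sketch below shows one plausible reading, in which the buffer is added to the projected spend before comparing against the cap; exactly how the kernel applies the buffer is not spelled out here, so treat that as an assumption, along with the checkBudget name and the BudgetDecision shape. The figures and the BUDGET_EXCEEDED code are taken from this section.

```typescript
const DAILY_CAP_USD = 8.0;         // daily LLM cap from the table above
const PRE_CALL_BUFFER_USD = 0.05;  // safety buffer from the table above

// Hypothetical result shape for the example.
type BudgetDecision = { ok: true } | { ok: false; code: "BUDGET_EXCEEDED" };

// Decides whether a call may proceed, given today's spend so far and the
// estimated cost of the next call. Assumes the buffer is added to the
// projected total before comparing against the cap.
function checkBudget(spentTodayUsd: number, projectedCallUsd: number): BudgetDecision {
  const projected = spentTodayUsd + projectedCallUsd + PRE_CALL_BUFFER_USD;
  if (projected > DAILY_CAP_USD) {
    return { ok: false, code: "BUDGET_EXCEEDED" };
  }
  return { ok: true };
}
```

For example, with $7.97 already spent, a $0.02 draft projects to $8.04 including the buffer, so it is refused, while the same draft early in the day at $1.00 spent passes easily.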

Approximate per-action costs

Action | Approx. cost | Notes
A single email triage (ACT/FYI/IGNORE) | ≈ $0.001 | Tiny prompt, structured output.
Drafting a reply with draft | ≈ $0.01 – $0.03 | Depends on email length + persona context.
train persona (~150 sampled emails) | ≈ $0.20 – $0.40 | One large LLM call.
Self-build brief expansion | ≈ $0.01 | A focused prompt, very cheap.
One full self-build run (agent) | ≤ $3.00 | Hard-capped. Real cost typically $0.50 – $1.50.

Troubleshooting

A small operating manual for the situations that come up. Most of these are recoverable from chat alone; the rest need a single SSH or a one-line nudge on the host.

Marvin is silent

“Budget exceeded” DM

Self-build said “No request found with id …”

GCP-watchdog alerted

A build failed on tsc or tests

Appendix

Where things live, for the days when chat alone won’t do.

Thing | Location
Source on host | /opt/reboot-assistant
Sandbox copies for in-flight builds | /home/claude2/projects/marvin-selfbuild/<id>/
Self-build queue + diffs + reports | /var/lib/reboot-assistant/state/selfbuild/
Audit logs (one file per day) | /opt/reboot-assistant/audit/
Shared state (cross-module KV) | /opt/reboot-assistant/state/shared/
Secrets (env file) | /etc/reboot-assistant/secrets/.env
Container | docker container reboot-assistant (image v1.0)
Service | systemctl {status,restart} reboot-assistant
Runner timer | systemctl status selfbuild-runner.timer
Source repo | github.com/ianreboot/marvin (private)

The architecture in one paragraph

The kernel is small, immutable, and unfussy. It loads each module, hands it a typed ctx object, registers its chat commands and cron jobs, and otherwise stays out of the way. Modules have access to ctx.chat() for LLM calls (with model and budget mediation), ctx.shared for cross-module key-value state, ctx.state for module-private state, ctx.slack (poorly named — this is the Telegram adapter) for DMs, and ctx.log / ctx.audit for observability. The self-build pipeline writes new modules through that exact same surface area — which is why a brand-new module like hello-demo can be live within minutes of a one-sentence chat command.
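To make that surface area concrete, here is a sketch of what a typed ctx and a module the size of ping might look like. Only the member names (chat, shared, state, slack, log, audit) come from the paragraph above; every signature, the MarvinModule shape, and the in-memory test context are assumptions for illustration. The real ctx.chat() is presumably asynchronous and budget-mediated, and is simplified to a synchronous stub here.

```typescript
// Hypothetical typings: member names are from the manual, signatures are not.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

interface ModuleContext {
  chat(prompt: string): string;        // LLM call (simplified to sync)
  shared: KeyValueStore;               // cross-module state
  state: KeyValueStore;                // module-private state
  slack: { dm(text: string): void };   // the Telegram adapter, despite the name
  log(message: string): void;
  audit(event: string): void;
}

interface MarvinModule {
  name: string;
  commands: { [trigger: string]: (ctx: ModuleContext, args: string) => void };
}

// A module about the size of ping: one trigger, one DM, one audit entry.
const pingModule: MarvinModule = {
  name: "ping",
  commands: {
    ping: (ctx) => {
      ctx.slack.dm("pong");
      ctx.audit("ping handled");
    },
  },
};

// In-memory stand-ins for the kernel, enough to exercise a module in a test.
function makeStore(): KeyValueStore {
  const data: { [key: string]: string } = {};
  return {
    get: (key) => data[key],
    set: (key, value) => { data[key] = value; },
  };
}

function makeTestContext(sent: string[], audited: string[]): ModuleContext {
  return {
    chat: () => "",
    shared: makeStore(),
    state: makeStore(),
    slack: { dm: (text) => { sent.push(text); } },
    log: () => {},
    audit: (event) => { audited.push(event); },
  };
}
```

Because a module touches the outside world only through ctx, a fake context like this one is enough to test it, which is the kind of check the self-build runner's test step depends on.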