Marvin is a single-tenant personal assistant that lives inside Telegram. He has read access to your Gmail, the ability to draft (never send) replies in your writing voice, custody of a Google Tasks list named “Marvin”, and a strict daily LLM budget.
What sets him apart from a typical bot is scope: Marvin is also the orchestration side of a small self-extension pipeline. Given a one-sentence description of a new capability, he can write the code, run its tests, deploy it, and restart himself — under your approval in gated mode, or fully autonomously in auto mode (the current setting as of 21 May 2026). The first module shipped end-to-end through this pipeline was hello-demo; the most recent, the help command, was built and deployed entirely hands-off.
How Marvin builds himself, in one example
The headline. Marvin runs in auto mode: you describe a new capability in one sentence, and he writes the code, runs the tests, deploys it to himself, and restarts — with zero approvals from you. You get a DM when it starts and when it’s done.
A real example — the help command this manual links to
The help command was built this exact way. You send one message and Marvin sends two back:
build help: add a command that lists all of Marvin’s commands
Build “help” queued (auto mode) — id 5c1d9af3. I’ll build, test, and deploy it. No approval needed.
✅ help is deployed and live. Try it: send “help”.
Between those two messages, Marvin did all of this on his own:
- Ran claude -p (capped at $3) to write the new module and its tests.
- Ran the tests and tsc --noEmit; both green. If either fails, the build stops here and nothing ships.

Real timeline from that run (21 May 2026), start to finish, untouched:

- 04:32 queued
- 04:34 building in sandbox
- 04:37 built (tests + types pass)
- 04:38 auto-approved for deploy ← no human here
- 04:39 deploying
- 04:44 done, live (~11.5 min total)
What you do: send one message — build <name>: <what you want>. That’s the entire interaction. Builds that fail their tests stop safely at failed and never ship — re-send with the error pasted in and Marvin fixes it. Want a checkpoint back? build mode autorun (you approve the deploy) or build mode gated (you approve everything). Full mechanics in The Self-Build Loop below.
First Contact
Open Telegram and message @marvin192_bot. Only your configured chat ID is authorized; every other inbound message is silently dropped. Commands are plain text, no leading slash — ping, tasks, train persona, build foo: do a thing.
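The two rules above (single authorized chat, plain-text commands) can be sketched roughly as follows. This is an illustration only: the names `isAuthorizedChat`, `parseBuildCommand`, and `BuildCommand` are hypothetical, and the rule that module names are lowercase letters, digits, and hyphens is an assumption, not something this manual specifies.

```typescript
// Hypothetical shapes: the real kernel's types are not shown in this manual.
interface BuildCommand {
  moduleName: string;
  description: string;
}

// Assumed rule: module names are lowercase letters, digits, and hyphens.
const BUILD_PATTERN = /^build\s+([a-z0-9-]+)\s*:\s*(\S.*)$/;

// Anything that does not come from the single configured chat is dropped.
function isAuthorizedChat(chatId: number, configuredChatId: number): boolean {
  return chatId === configuredChatId;
}

// Returns the parsed request, or null when the text is not a build request
// (for example "ping" or "build list", which have no "<name>: <desc>" part).
function parseBuildCommand(text: string): BuildCommand | null {
  const match = BUILD_PATTERN.exec(text.trim());
  if (match === null) return null;
  return { moduleName: match[1], description: match[2].trim() };
}
```

Under these assumptions, other build subcommands such as build list fall through to their own handlers because they carry no colon-separated description.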
The four useful openings
| You send | Marvin replies with |
|---|---|
| ping | pong · Marvin is awake |
| last | Most recent inbox-y email: subject, from, snippet |
| tasks | Open items on your Marvin list, with short ids |
| build mode | Current self-build mode: gated · (gated, autorun, auto) |
Marvin is conversational where the situation allows and brusque where it calls for it. He won’t pad short answers; he also won’t hold back a long one if the task warrants it.
Commands at a Glance
Every chat command Marvin currently recognises, grouped by the module that owns it. Argument placeholders are wrapped in <…> and optional choices in […]; the right column gives the owning module. Reach for this table when you forget the exact phrasing.
| Command | What it does | Module |
|---|---|---|
| ping | Replies pong. Confirms the bot is alive. | ping |
| last | Show the most recent meaningful email in your inbox. | email-drafter |
| draft <topic> | Draft a reply (saved to Gmail Drafts, not sent) in your voice. | email-drafter |
| redraft | Re-roll the last draft with different wording. | email-drafter |
| tasks | List open items on your Google Tasks “Marvin” list. | task-tracker |
| task <text> | Create a new task on the Marvin list. | task-tracker |
| task from <email> | Convert an email into a follow-up task with backtrace. | task-tracker |
| done <id> | Mark a task complete. | task-tracker |
| draft from task <id> | Generate an email draft seeded by a task. | task-tracker |
| train persona | Rebuild your writing-voice profile from sent emails. | persona-trainer |
| persona | Show the current voice profile (tone, signoffs, etc.). | persona-trainer |
| build <name>: <desc> | Start a new self-build request. | selfbuild |
| build list | All open and recent build requests with statuses. | selfbuild |
| build show <id> | Show the expanded brief / latest report. | selfbuild |
| build approve <id> | Gate 1 — release a request to the host runner. | selfbuild |
| build deploy <id> | Gate 2 — ship a built request to production. | selfbuild |
| build reject <id> | Cancel any non-terminal request (alias: build cancel). | selfbuild |
| build mode [gated\|autorun\|auto] | Show or set: gated, autorun, auto. | selfbuild |
| hello | Demonstration module from the original self-build proof. | hello-demo |
| help | List all of Marvin's commands in chat, with a link to this manual. | help |
Several modules are not commanded directly — they run on cron: gmail-scanner polls every 10 minutes, gcp-watchdog probes every 6 hours, selfbuild watches the host queue every 2 minutes, and task-tracker sends a morning digest each day. signoff-tracker is an internal helper consumed by email-drafter and has no chat commands.
The Modules
Each module is a small, independent unit of behaviour. The kernel loads them at startup, gives each a sandboxed context object, and registers their chat triggers and scheduled jobs. Modules can read and write a small shared key-value store, send DMs, talk to an LLM, and log to the audit trail — nothing more.
email-drafter drafts replies
Reads the most recent email in your inbox and produces a reply in your own writing voice. The draft is saved to Gmail Drafts and is never sent automatically. Picks a signoff appropriate to the recipient (see signoff-tracker).
| Command | What it does |
|---|---|
| last | Show the most recent inbox-y email Marvin saw. |
| draft <topic> | Draft a reply on the given topic, in your voice. Topic is optional. |
| redraft | Try again on the previous draft — different phrasing, same intent. |
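Drafts always land in Gmail Drafts for your review; sending is on you. This is deliberate. One caveat: Google documents the gmail.compose OAuth scope as covering both draft management and sending, so the scope alone does not block a send. The guarantee holds only as long as Marvin's code creates drafts and never calls a send endpoint.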
task-tracker google tasks
Custody of a Google Tasks list literally named Marvin. Strict scoping: Marvin will never read, write, or surface any task on any other list. Items created by Marvin carry a small [from-email] backtrace when relevant.
| Command | What it does |
|---|---|
| tasks | List open items on the Marvin list with short ids. |
| task <text> | Create a task with the given title. |
| task from <email-ref> | Convert a recent email into a follow-up task (annotated with sender + subject). |
| done <id> | Mark the matching task complete. |
| draft from task <id> | Generate a Gmail draft seeded by the task and its email backtrace. |
A morning digest fires daily on cron 0 8 * * * — Marvin DMs you a short summary of open tasks.
persona-trainer voice profile
Builds and stores a writing-voice profile by analysing your sent emails. The profile is consumed by email-drafter when generating replies. Stored at the shared key persona/default.
| Command | What it does |
|---|---|
| train persona | Rebuild the profile from a fresh sample of sent emails. |
| persona | Show the current profile (tone, formality, signoffs, distinctive phrases, things avoided). |
Sampling (current behaviour, as of 20 May 2026)
- Fetches the 300 most recent sent emails (excludes the zz-Old label).
- Excludes forwards, auto-replies, and bodies under 50 chars.
- Downsamples to ~150 for the LLM via a two-phase selector: (1) per-recipient cap of 10 prevents any single contact from dominating, (2) remaining slots are filled preferring longer, more substantive bodies.
- Truncates each body to 1500 chars before the prompt is sent.
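The sampling steps above can be sketched as a small selector. This is one plausible reading of the two-phase description, not the module's actual code: the `SentEmail` shape and the function name `downsampleSentEmails` are hypothetical, the input is assumed to be ordered newest first, and forward/auto-reply filtering is left out.

```typescript
// Hypothetical shape; the real module's types are not shown in this manual.
interface SentEmail {
  recipient: string;
  body: string;
}

function downsampleSentEmails(
  emails: SentEmail[],
  targetCount = 150,
  perRecipientCap = 10,
  minBodyChars = 50,
  maxBodyChars = 1500,
): SentEmail[] {
  // Phase 1: walk the input in order, dropping short bodies and
  // capping how many emails any single recipient can contribute.
  const perRecipient = new Map<string, number>();
  const capped: SentEmail[] = [];
  for (const email of emails) {
    if (email.body.length < minBodyChars) continue;
    const used = perRecipient.get(email.recipient) ?? 0;
    if (used >= perRecipientCap) continue;
    perRecipient.set(email.recipient, used + 1);
    capped.push(email);
  }

  // Phase 2: when more candidates remain than slots, prefer longer bodies.
  const longestFirst = [...capped].sort((a, b) => b.body.length - a.body.length);

  // Truncate only after ranking, so ranking sees each body's true length.
  return longestFirst.slice(0, targetCount).map((email) => ({
    recipient: email.recipient,
    body: email.body.slice(0, maxBodyChars),
  }));
}
```

Capping before ranking matters here: ranking first would let one verbose contact fill the sample before the cap ever applied.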
The earlier version sampled only 50 emails with no recipient balancing. The upgrade landed through the self-build pipeline itself — a piece of in-house dogfooding.
gmail-scanner scheduled · 10m
Runs on cron */10 * * * *. For each new unread email since the last poll, an LLM-based triage step classifies it as ACT, FYI, or IGNORE. Only ACT mails earn a DM. FYI is logged silently; IGNORE is dropped. This is what keeps your phone from buzzing on every newsletter and cold pitch.
Triage is cost-guarded — a per-email budget keeps the daily LLM cap intact even on busy inbox days. If the daily cap is breached, the scanner short-circuits and reports the breach instead of triaging blindly.
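A rough sketch of the routing and the short-circuit, under stated assumptions: the names `outcomeFor` and `scanBatch` are hypothetical, the classifier is passed in as a plain synchronous function (the real one is an LLM call and presumably asynchronous), and the per-email cost defaults to the ≈ $0.001 estimate quoted elsewhere in this manual.

```typescript
type TriageLabel = "ACT" | "FYI" | "IGNORE";
type TriageOutcome = "dm" | "log" | "drop";

// ACT earns a DM, FYI is logged silently, IGNORE is dropped.
function outcomeFor(label: TriageLabel): TriageOutcome {
  switch (label) {
    case "ACT":
      return "dm";
    case "FYI":
      return "log";
    case "IGNORE":
      return "drop";
  }
}

interface ScanResult {
  outcomes: TriageOutcome[];
  capBreached: boolean;
}

// Triage a batch, stopping as soon as the remaining budget cannot
// cover another classification (hypothetical accounting).
function scanBatch(
  subjects: string[],
  classify: (subject: string) => TriageLabel,
  remainingBudgetUsd: number,
  perEmailCostUsd = 0.001,
): ScanResult {
  const outcomes: TriageOutcome[] = [];
  let remaining = remainingBudgetUsd;
  for (const subject of subjects) {
    if (remaining < perEmailCostUsd) {
      return { outcomes, capBreached: true }; // report instead of triaging blindly
    }
    remaining -= perEmailCostUsd;
    outcomes.push(outcomeFor(classify(subject)));
  }
  return { outcomes, capBreached: false };
}
```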
signoff-tracker helper · no commands
Tracks which signoff name (Yuen vs Yuen-Ho, etc.) you use with each recipient context. email-drafter consults it when choosing how to sign a generated draft. Has no chat commands; it observes and informs.
gcp-watchdog scheduled · 6h
Cron 23 */6 * * *. Pings the configured Google Cloud project and verifies the OAuth refresh chain is healthy. If the project is deleted, suspended, or its OAuth client revoked, Marvin DMs a clear alert with a recovery pointer. Born of a real May 2026 incident in which the GCP project hosting his OAuth client was deleted by hand.
selfbuild the lever
The module that lets Marvin extend himself. Receives a one-sentence build request, expands it into a structured brief via an LLM call, writes the request to a host queue, and tracks its lifecycle (see The Self-Build Loop). The module itself never executes code — all building, testing, and deploying is performed by a host-side runner under a strict budget.
| Command | What it does |
|---|---|
| build <name>: <desc> | Create a new build request (status: pending-approval). |
| build list | All current and recent requests with statuses. |
| build show <id> | Brief, history, and latest report for a single request. |
| build approve <id> | Gate 1. Hands the request to the runner. Only relevant in gated mode. |
| build deploy <id> | Gate 2. Ships the built artifact to /opt and restarts the bot. |
| build reject <id> | Reject any non-terminal request. Alias: build cancel. |
| build mode [gated\|autorun\|auto] | Show, or change, the current automation level. |
ping liveness
The smallest possible module. ping ⇒ pong. Useful as a sanity-check after a restart, after the network has been flaky, or to confirm the Telegram adapter is connected.
hello-demo the proof
A small ceremonial module. hello ⇒ “hello from a module Marvin built itself.” The very first module shipped end-to-end through the self-build pipeline, kept around as a live demonstration that the pipeline works.
help in-chat reference
help ⇒ a formatted list of every Marvin command, with a link to this manual. Strictly read-only — no LLM, no state, no Gmail. Notable as the first module built and deployed fully autonomously in auto mode (request 5c1d9af3, 21 May 2026): request → build → test → deploy → restart, with zero human approvals.
The Self-Build Loop
Marvin can write his own modules. The mechanism is intentionally simple: a chat command stores a request on the host filesystem; a small systemd timer runs the host-side claude CLI inside a sandbox; and, in gated and autorun modes, the resulting diff and report come back to you for review before they touch the live container. In auto mode that review is skipped.
Lifecycle of a request
- You send build <name>: <description>. Marvin expands your one-liner into a structured brief via an LLM call (cost ≈ $0.01) and writes req-<id>.json to the queue. Status: pending-approval.
- Review the brief with build show <id> if needed.
- Gate 1: build approve <id> or build reject <id>. Skipped in autorun / auto modes.
- The runner copies /opt/reboot-assistant into a sandbox, installs deps, and runs claude -p with a $3 hard budget cap.
- The runner runs pnpm test and tsc --noEmit. Pass → status built; fail → status failed with the error attached.
- Gate 2: build deploy <id> or build reject <id>. Skipped in auto mode.
- The runner rsyncs the sandbox modules-only into /opt, rebuilds the Docker image, and restarts the service. Status: done. New code is live.

The three modes
| Mode | Gate 1 (approve) | Gate 2 (deploy) | Use when |
|---|---|---|---|
| gated | requires build approve | requires build deploy | Both gates require a person. (Original default.) |
| autorun | auto-passed on create | requires build deploy | You trust the brief, want to glance at the diff before it ships. |
| auto | auto-passed | auto-passed | ← current setting (since 21 May 2026). Fully hands-off. |
Change modes with build mode gated / autorun / auto; the value is persisted in shared state under selfbuild/mode. Marvin is currently in auto mode — a plain build <name>: <desc> now runs and deploys with no approvals; you just get a DM at start and at done. Switch to autorun or gated any time you want a human checkpoint back.
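The mode table can be expressed as a small lookup. This is a sketch of the table's logic only: `GatePolicy`, `GATE_POLICIES`, and `humanCommandsFor` are hypothetical names, not identifiers from the selfbuild module.

```typescript
type BuildMode = "gated" | "autorun" | "auto";

interface GatePolicy {
  approveNeedsHuman: boolean; // Gate 1
  deployNeedsHuman: boolean; // Gate 2
}

// One row per mode, mirroring the table above.
const GATE_POLICIES: Record<BuildMode, GatePolicy> = {
  gated: { approveNeedsHuman: true, deployNeedsHuman: true },
  autorun: { approveNeedsHuman: false, deployNeedsHuman: true },
  auto: { approveNeedsHuman: false, deployNeedsHuman: false },
};

// Lists the chat commands a person still has to send for a given request.
function humanCommandsFor(mode: BuildMode, id: string): string[] {
  const policy = GATE_POLICIES[mode];
  const commands: string[] = [];
  if (policy.approveNeedsHuman) commands.push(`build approve ${id}`);
  if (policy.deployNeedsHuman) commands.push(`build deploy ${id}`);
  return commands;
}
```

For example, `humanCommandsFor("autorun", "5c1d9af3")` yields a single deploy command, while auto mode yields an empty list.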
What the runner will not touch
The brief is scoped to modules/<name>/ only. The runner deploys by rsync-ing the sandbox’s modules/ directory back into /opt — the kernel, the Dockerfile, docker-compose.yml, .git/, systemd/, and secrets/ are all excluded by design. A pre-commit hook on the host repo also blocks any change to protected paths.
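The scoping rule lends itself to a simple path check. The sketch below is illustrative and assumes paths are given relative to the repository root with forward slashes; `isWithinModuleScope` is a hypothetical helper, not the runner's or the pre-commit hook's actual code.

```typescript
// True only for files strictly inside modules/<moduleName>/.
function isWithinModuleScope(relativePath: string, moduleName: string): boolean {
  const segments = relativePath.split("/");
  // Reject absolute paths, empty segments, and any "." or ".." traversal.
  if (segments.some((s) => s === "" || s === "." || s === "..")) return false;
  return (
    segments.length >= 3 &&
    segments[0] === "modules" &&
    segments[1] === moduleName
  );
}
```

Rejecting traversal segments outright, rather than trying to normalise them, keeps the check easy to audit: a path such as modules/help/../../secrets/.env never gets the chance to resolve.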
If a build fails on a TypeScript or test error, re-issue the build with the failure pasted into the new brief. The agent fixes its own mistakes far more readily than it gets them right blind. The persona-trainer upgrade landed on the second pass for exactly this reason.
Costs & Limits
Marvin spends real money. Every LLM call is metered against two budgets — a daily cap that protects you from a runaway loop, and a per-build cap that bounds the cost of a single self-build run.
Before any chat call, the kernel checks projected spend against the daily cap with a $0.05 safety buffer. If a call would push the day over, the kernel emits BUDGET_EXCEEDED and Marvin DMs you instead of charging. Self-build runs are also capped individually at $3 via the --max-budget-usd flag on the host runner.
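The pre-call check might look something like the sketch below. It assumes the buffer is simply added to the projected total before comparing against the cap, which is one reasonable reading of the description; `checkDailyBudget` is a hypothetical name, and the $8 default is the daily cap mentioned in Troubleshooting.

```typescript
type BudgetDecision = "OK" | "BUDGET_EXCEEDED";

// Refuse a call when spend so far, plus the projected cost of this call,
// plus the safety buffer, would exceed the daily cap.
function checkDailyBudget(
  spentTodayUsd: number,
  projectedCallUsd: number,
  dailyCapUsd = 8,
  safetyBufferUsd = 0.05,
): BudgetDecision {
  const projectedTotal = spentTodayUsd + projectedCallUsd + safetyBufferUsd;
  return projectedTotal > dailyCapUsd ? "BUDGET_EXCEEDED" : "OK";
}
```

With $7.94 already spent, a $0.03 draft would be refused under these assumptions, because the buffered total of $8.02 exceeds the cap even though the raw total does not.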
Approximate per-action costs
| Action | Approx. cost | Notes |
|---|---|---|
| A single email triage (ACT/FYI/IGNORE) | ≈ $0.001 | Tiny prompt, structured output. |
| Drafting a reply with draft | ≈ $0.01 – $0.03 | Depends on email length + persona context. |
| train persona (~150 sampled emails) | ≈ $0.20 – $0.40 | One large LLM call. |
| Self-build brief expansion | ≈ $0.01 | A focused prompt — very cheap. |
| One full self-build run (agent) | ≤ $3.00 | Hard-capped. Real cost typically $0.50 – $1.50. |
Troubleshooting
A small operating manual for the situations that come up. Most of these are recoverable from chat alone; the rest need a single SSH or a one-line nudge on the host.
Marvin is silent
- Send ping. A reply means the bot is up and Telegram is connected.
- No reply within ~30 seconds? Container may be restarting. Wait 60 seconds and try again.
- If still silent, SSH to the host and check docker ps --filter name=reboot-assistant; if it’s not healthy, run systemctl restart reboot-assistant.
“Budget exceeded” DM
- The daily LLM cap of $8 has been hit. Subsequent paid calls will be refused until UTC midnight, but cron-driven non-LLM work (e.g. gcp-watchdog) continues.
- If this happens early in the day, look at the audit log for the runaway: grep BUDGET_EXCEEDED /opt/reboot-assistant/audit/*.log.
Self-build said “No request found with id …”
- Fixed 21 May 2026. This was a filesystem-permissions bug: the runner left the queue JSON unreadable to the container (mode 0600) after the built status write, stalling the pipeline (and previously blocking auto mode). The runner now chmod 0666s the file after every status write, so it shouldn’t recur. If it ever does, the manual remedy is chmod 0666 /var/lib/reboot-assistant/state/selfbuild/req-<id>.json on the host, then re-issue the command.
GCP-watchdog alerted
- The GCP project hosting Marvin’s OAuth client is unreachable. Sign in at console.cloud.google.com/cloud-resource-manager; deleted projects can be restored within 30 days.
- Once the project is back, Marvin recovers automatically on his next scheduled poll — no restart needed.
A build failed on tsc or tests
- Use build show <id> to read the error.
- Re-issue the original build <name>: <desc> command with the failure quoted at the end of the new description. The agent is good at fixing its own mistakes when given the specific error.
Appendix
Where things live, for the days when chat alone won’t do.
| Thing | Location |
|---|---|
| Source on host | /opt/reboot-assistant |
| Sandbox copies for in-flight builds | /home/claude2/projects/marvin-selfbuild/<id>/ |
| Self-build queue + diffs + reports | /var/lib/reboot-assistant/state/selfbuild/ |
| Audit logs (one file per day) | /opt/reboot-assistant/audit/ |
| Shared state (cross-module KV) | /opt/reboot-assistant/state/shared/ |
| Secrets (env file) | /etc/reboot-assistant/secrets/.env |
| Container | docker container reboot-assistant (image v1.0) |
| Service | systemctl {status,restart} reboot-assistant |
| Runner timer | systemctl status selfbuild-runner.timer |
| Source repo | github.com/ianreboot/marvin (private) |
The architecture in one paragraph
The kernel is small, immutable, and unfussy. It loads each module, hands it a typed ctx object, registers its chat commands and cron jobs, and otherwise stays out of the way. Modules have access to ctx.chat() for LLM calls (with model and budget mediation), ctx.shared for cross-module key-value state, ctx.state for module-private state, ctx.slack (poorly named — this is the Telegram adapter) for DMs, and ctx.log / ctx.audit for observability. The self-build pipeline writes new modules through that exact same surface area — which is why a brand-new module like hello-demo can be live within minutes of a one-sentence chat command.
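To make the module surface concrete, here is a toy version of a module and the routing around it. Everything in it is an assumption made for illustration: only `ctx.slack`, `ctx.log`, and the ping reply text come from this manual, while `ModuleContext`, `MarvinModule`, `dispatch`, and the method name `dm` are hypothetical. Handlers are also kept synchronous for brevity, where a real adapter would presumably be asynchronous.

```typescript
// Hypothetical slice of the ctx object; the real one also carries
// chat(), shared, state, and audit per the paragraph above.
interface ModuleContext {
  slack: { dm(text: string): void }; // the Telegram adapter, despite the name
  log(message: string): void;
}

type CommandHandler = (ctx: ModuleContext) => void;

interface MarvinModule {
  name: string;
  commands: Map<string, CommandHandler>;
}

// The smallest possible module: ping ⇒ pong.
const pingModule: MarvinModule = {
  name: "ping",
  commands: new Map<string, CommandHandler>([
    [
      "ping",
      (ctx) => {
        ctx.log("ping received");
        ctx.slack.dm("pong · Marvin is awake");
      },
    ],
  ]),
};

// A toy dispatcher standing in for the kernel's command routing.
// Returns true when some module handled the text.
function dispatch(modules: MarvinModule[], text: string, ctx: ModuleContext): boolean {
  for (const mod of modules) {
    const handler = mod.commands.get(text.trim());
    if (handler !== undefined) {
      handler(ctx);
      return true;
    }
  }
  return false;
}
```

Because a module only ever sees its ctx, a newly built module needs no kernel changes: it registers its commands and is reachable on the next load.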