Marvin — A Field Manual

I. Preface

Marvin is a small, single-tenant assistant. He lives in a Docker container on a Contabo VPS in Singapore, speaks to you exclusively over Telegram (@marvin192_bot), and ignores every chat that isn’t yours. He has read access to your Gmail, the ability to draft (never send) replies in your writing voice, custody of a Google Tasks list named “Marvin”, and a strict daily LLM budget.

Where this manual differs from a typical bot help page is in scope. Marvin is not just a chatbot; he is the orchestration side of a small self-extension pipeline. Given a single sentence describing a new capability, and two explicit approvals from you, he can write the code, run its tests, deploy it, and restart himself.

Modules: 9

Daily LLM cap: $8/day

Per-build cap: $3/run

Telegram: long-poll

Plain text, no slashes. Marvin understands prose-style commands — ping, tasks, train persona, build foo: do a thing. No leading /, no @-mention required. You can also paste an email body and ask in natural language.

✦

II. First Contact

Open Telegram and search for @marvin192_bot. The very first message you send is what tells the bot who you are; only the configured Telegram chat ID (yours) is authorized, and every other inbound message is silently dropped.
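The drop-everything-else rule can be pictured as a single comparison. What follows is a minimal sketch, not Marvin's actual adapter code: the InboundMessage shape, the acceptMessage name, and the chat ID value are all hypothetical.

```typescript
// Hypothetical message shape; a real Telegram update carries many more fields.
interface InboundMessage {
  chatId: number;
  text: string;
}

// Returns the text when the message comes from the one authorized chat,
// or null so the caller can drop it without replying.
function acceptMessage(msg: InboundMessage, authorizedChatId: number): string | null {
  return msg.chatId === authorizedChatId ? msg.text : null;
}

const OWNER_CHAT_ID = 1001; // placeholder, not a real chat ID
const fromOwner = acceptMessage({ chatId: 1001, text: "ping" }, OWNER_CHAT_ID); // "ping"
const fromStranger = acceptMessage({ chatId: 2002, text: "ping" }, OWNER_CHAT_ID); // null
```

Returning null rather than an error mirrors the "silently dropped" behaviour: an unauthorized sender gets no signal that anything is listening.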

The four useful openings

ping

pong  ·  Marvin is awake

last

Most recent inbox-y email: subject, from, snippet

tasks

Open items on your Marvin list, with short ids

build mode

Current self-build mode: gated  ·  (gated, autorun, auto)

Marvin is conversational where he can be and brusque where the situation calls for it. He won’t pad short answers; he also won’t hold back a long one if the task wants it.

✦

III. Commands at a Glance

Every chat command Marvin currently recognises, grouped by the module that owns it. Arguments are wrapped in <…>; the right column names the owning module. Reach for this table when you forget the exact phrasing.

Command	What it does	Module / Cron
ping	Replies `pong`. Confirms the bot is alive.	ping
last	Show the most recent meaningful email in your inbox.	email-drafter
draft <topic>	Draft a reply (saved to Gmail Drafts, not sent) in your voice.	email-drafter
redraft	Re-roll the last draft with different wording.	email-drafter
tasks	List open items on your Google Tasks “Marvin” list.	task-tracker
task <text>	Create a new task on the Marvin list.	task-tracker
task from <email>	Convert an email into a follow-up task with backtrace.	task-tracker
done <id>	Mark a task complete.	task-tracker
draft from task <id>	Generate an email draft seeded by a task.	task-tracker
train persona	Rebuild your writing-voice profile from sent emails.	persona-trainer
persona	Show the current voice profile (tone, signoffs, etc.).	persona-trainer
build <name>: <desc>	Start a new self-build request for a module.	selfbuild
build list	All open and recent build requests with statuses.	selfbuild
build show <id>	Show the expanded brief / latest report for a request.	selfbuild
build approve <id>	Gate 1 — release a request to the host runner.	selfbuild
build deploy <id>	Gate 2 — ship a built request to production.	selfbuild
build reject <id>	Cancel any non-terminal request (alias: `build cancel`).	selfbuild
build mode [gated|autorun|auto]	Show or set: `gated`, `autorun`, `auto`.	selfbuild
hello	Demonstration module from the original self-build proof.	hello-demo

Several modules are not commanded directly — they run on cron: gmail-scanner polls your inbox every 10 minutes, gcp-watchdog probes Google Cloud every 6 hours, selfbuild watches the host queue every 2 minutes, and task-tracker sends a morning digest each day. signoff-tracker is an internal helper that supports email-drafter and has no commands of its own.

✦

IV. The Modules

Each module is a small, independent unit of behaviour. The kernel loads them at startup, gives each a sandboxed context object, and registers their chat triggers and scheduled jobs. Modules can read and write a small shared key-value store, send DMs, talk to an LLM, and log to the audit trail — nothing more.

◆ email-drafter · module · drafts replies

Reads the most recent email in your inbox (configurable filter) and produces a reply in your own writing voice. The draft is saved to Gmail Drafts and is never sent automatically. Picks a signoff appropriate to the recipient (see signoff-tracker).

Command	What it does
last	Show the most recent inbox-y email Marvin saw.
draft <topic>	Draft a reply on the given topic, in your voice. Topic is optional.
redraft	Try again on the previous draft — different phrasing, same intent.

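Drafts always land in Gmail Drafts for your review; sending is on you. This is deliberate. Note that Google documents the gmail.compose scope as covering sending as well as drafting, so the never-send guarantee rests on Marvin only ever calling the drafts endpoints, not on the scope alone.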

◆ task-tracker · module · google tasks

Custody of a Google Tasks list literally named Marvin. Strict scoping: Marvin will never read, write, or surface any task on any other list. Items created by Marvin carry a small [from-email] backtrace when relevant, so you can find the original later.

Command	What it does
tasks	List open items on the Marvin list with short ids.
task <text>	Create a task with the given title.
task from <email-ref>	Convert a recent email into a follow-up task (annotated with sender + subject).
done <id>	Mark the matching task complete.
draft from task <id>	Generate a Gmail draft seeded by the task and its email backtrace.

A morning digest fires daily on cron 0 8 * * * — Marvin DMs you a short summary of open tasks.

◆ persona-trainer · module · voice profile

Builds and stores a writing-voice profile by analysing your sent emails. The profile is consumed by email-drafter when generating replies. Stored at the shared key persona/default.

Command	What it does
train persona	Rebuild the profile from a fresh sample of sent emails.
persona	Show the current profile (tone, formality, signoffs, distinctive phrases, things avoided).

Sampling (as of 2026·05·20)

Fetches the 300 most recent sent emails (excludes the zz-Old label).
Excludes forwards, auto-replies, and bodies under 50 chars.
Downsamples to ~150 for the LLM via a two-phase selector: (1) per-recipient cap of 10 prevents any single contact from dominating, (2) remaining slots are filled preferring longer, more substantive bodies.
Truncates each body to 1500 chars before the prompt is sent.
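The sampling steps above can be sketched as one small function. This is one reading of the description, not the module's real code: it assumes the per-recipient cap is applied first, in input order, and that the longest survivors then fill the available slots. The SentEmail fields and every name below are hypothetical.

```typescript
// Hypothetical email shape; the real module presumably works on Gmail API objects.
interface SentEmail {
  recipient: string;
  body: string;
  isForward: boolean;
  isAutoReply: boolean;
}

interface SamplingOptions {
  minBodyChars: number;
  perRecipientCap: number;
  targetCount: number;
  maxBodyChars: number;
}

// Values taken from the manual: 50-char floor, cap of 10, ~150 target, 1500-char truncation.
const SAMPLING_DEFAULTS: SamplingOptions = {
  minBodyChars: 50,
  perRecipientCap: 10,
  targetCount: 150,
  maxBodyChars: 1500,
};

function selectTrainingSample(
  emails: SentEmail[],
  opts: SamplingOptions = SAMPLING_DEFAULTS,
): SentEmail[] {
  // Drop forwards, auto-replies, and bodies under the minimum length.
  const eligible = emails.filter(
    (e) => !e.isForward && !e.isAutoReply && e.body.length >= opts.minBodyChars,
  );

  // Phase 1: keep at most perRecipientCap emails per recipient, in input order.
  const perRecipient = new Map<string, number>();
  const capped: SentEmail[] = [];
  for (const email of eligible) {
    const count = perRecipient.get(email.recipient) ?? 0;
    if (count < opts.perRecipientCap) {
      capped.push(email);
      perRecipient.set(email.recipient, count + 1);
    }
  }

  // Phase 2: prefer longer bodies when filling the available slots.
  const chosen = [...capped]
    .sort((a, b) => b.body.length - a.body.length)
    .slice(0, opts.targetCount);

  // Truncate each body before it would be placed in a prompt.
  return chosen.map((e) => ({ ...e, body: e.body.slice(0, opts.maxBodyChars) }));
}
```

Applying the cap before the length preference is what keeps one prolific correspondent from crowding out everyone else, even if that person also writes the longest emails.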

The earlier version sampled only 50 emails with no recipient balancing. Updated this morning via the self-build pipeline itself — a small piece of dogfooding.

◆ gmail-scanner · module · scheduled · 10m

Runs on cron */10 * * * *. For each new unread email since the last poll, an LLM-based triage step classifies it as ACT, FYI, or IGNORE. Only ACT mails earn a DM. FYI is logged silently; IGNORE is dropped. This is what keeps your phone from buzzing on every newsletter and cold pitch.

Triage is cost-guarded — a per-email budget keeps the daily LLM cap intact even on busy inbox days. If the daily cap is breached, the scanner short-circuits and reports the breach instead of triaging blindly.
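As a rough illustration of the routing and the short-circuit, here is a sketch under stated assumptions. The outcome names, the ScanResult shape, and the classify callback (a stand-in for the LLM triage call, which would be asynchronous in practice) are all hypothetical.

```typescript
type TriageLabel = "ACT" | "FYI" | "IGNORE";
type TriageOutcome = "dm" | "log-only" | "drop";

// Map each triage label to what happens next, following the description above.
function outcomeFor(label: TriageLabel): TriageOutcome {
  switch (label) {
    case "ACT":
      return "dm";
    case "FYI":
      return "log-only";
    case "IGNORE":
      return "drop";
  }
}

type ScanResult =
  | { kind: "budget-breach" }
  | { kind: "triaged"; outcomes: TriageOutcome[] };

// `classify` stands in for the LLM triage step; it is synchronous here for brevity.
function scanBatch(
  subjects: string[],
  classify: (subject: string) => TriageLabel,
  dailyCapBreached: boolean,
): ScanResult {
  // Short-circuit: report the breach instead of triaging blindly.
  if (dailyCapBreached) return { kind: "budget-breach" };
  return { kind: "triaged", outcomes: subjects.map((s) => outcomeFor(classify(s))) };
}
```

Checking the cap before any classification means a breached budget costs nothing further: no email in the batch triggers a paid call.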

◆ signoff-tracker · module · helper · no commands

Tracks which signoff name (Yuen vs Yuen-Ho, etc.) you use with each recipient context. email-drafter consults it when choosing how to sign a generated draft. Has no chat commands; it observes and informs.
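One plausible way to picture the tracker is a tally per recipient context. The manual does not say how signoff-tracker stores or ranks its observations, so the SignoffTally class, the most-frequent-wins rule, and the context labels below are assumptions for illustration only.

```typescript
// Counts how often each signoff is used within a recipient context.
class SignoffTally {
  private counts = new Map<string, Map<string, number>>();

  // Record one observed signoff for a context (e.g. taken from a sent email).
  observe(context: string, signoff: string): void {
    const forContext = this.counts.get(context) ?? new Map<string, number>();
    forContext.set(signoff, (forContext.get(signoff) ?? 0) + 1);
    this.counts.set(context, forContext);
  }

  // Return the most frequently seen signoff for a context, or the fallback
  // when nothing has been observed there yet.
  preferred(context: string, fallback: string): string {
    const forContext = this.counts.get(context);
    if (!forContext) return fallback;
    let best = fallback;
    let bestCount = 0;
    forContext.forEach((count, signoff) => {
      if (count > bestCount) {
        best = signoff;
        bestCount = count;
      }
    });
    return best;
  }
}

const tally = new SignoffTally();
tally.observe("work", "Yuen-Ho"); // "work" is a made-up context label
tally.observe("work", "Yuen-Ho");
tally.observe("work", "Yuen");
const workSignoff = tally.preferred("work", "Yuen"); // "Yuen-Ho"
```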

◆ gcp-watchdog · module · scheduled · 6h

Cron 23 */6 * * *. Pings the configured Google Cloud project and verifies the OAuth refresh chain is healthy. If the project is deleted, suspended, or its OAuth client revoked, Marvin DMs you a clear alert with a recovery pointer. Born of a real May-2026 incident in which the GCP project hosting his OAuth client was deleted by hand.

◆ selfbuild · module · the lever

The module that lets Marvin extend himself. Receives a one-sentence build request, expands it into a structured brief via an LLM call, writes the request to a host queue, and tracks its lifecycle (see § V). The module itself never executes code — all building, testing, and deploying is performed by a host-side runner under a strict budget.

Command	What it does
build <name>: <desc>	Create a new build request (status: `pending-approval`).
build list	All current and recent requests with statuses.
build show <id>	Brief, history, and latest report for a single request.
build approve <id>	Gate 1. Hands the request to the runner. Only relevant in `gated` mode.
build deploy <id>	Gate 2. Ships the built artifact to `/opt` and restarts the bot.
build reject <id>	Reject any non-terminal request. Alias: `build cancel`.
build mode [gated|autorun|auto]	Show, or change, the current automation level.

◆ ping · module · liveness

The smallest possible module. ping ⇒ pong. Useful as a sanity-check after a restart, after the network has been flaky, or to confirm the Telegram adapter is connected.

◆ hello-demo · module · the proof

A small ceremonial module. hello ⇒ “hello from a module Marvin built itself.” The very first module shipped end-to-end through the self-build pipeline, kept around as a live demonstration of it.

✦

V. The Self-Build Loop

Marvin can write his own modules. The mechanism is intentionally simple: a chat command stores a request on the host filesystem; a small systemd timer runs the host-side claude CLI inside a sandbox; the resulting diff and report come back to you for review before they touch the live container.

Lifecycle of a request

1. You DM build <name>: <description>. Marvin expands your one-liner into a structured brief via an LLM call (cost ≈ $0.01) and writes req-<id>.json to the queue. Status: pending-approval.

2. Marvin DMs you the brief preview and the request id. You read it with build show <id> if needed.

3. Gate 1. You reply build approve <id> or build reject <id>. Skipped in autorun / auto modes.

4. A systemd timer (every 2 minutes) picks up the approved request. The runner copies /opt/reboot-assistant into a sandbox, installs deps, and runs claude -p with a $3 hard budget cap.

5. Inside the sandbox, the agent edits the targeted module, writes tests, and reports back. The runner runs pnpm test and tsc --noEmit. Pass → status built; fail → status failed with the error attached.

6. Marvin DMs you the result with a link to the diff and report. Gate 2. You reply build deploy <id> or build reject <id>. Skipped in auto mode.

7. The runner rsyncs the sandbox modules-only into /opt, rebuilds the Docker image, and restarts the service. Status: done. New code is live.
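The lifecycle reads naturally as a small state machine. The sketch below is an interpretation rather than the module's code: pending-approval, built, failed, and done are statuses named in this manual, while approved, rejected, and all of the event names are assumed for illustration.

```typescript
type BuildStatus = "pending-approval" | "approved" | "built" | "failed" | "done" | "rejected";
type BuildEvent = "approve" | "reject" | "tests-pass" | "tests-fail" | "deploy";

// Allowed transitions; statuses with an empty entry are terminal.
const TRANSITIONS: Record<BuildStatus, Partial<Record<BuildEvent, BuildStatus>>> = {
  "pending-approval": { approve: "approved", reject: "rejected" },
  approved: { "tests-pass": "built", "tests-fail": "failed", reject: "rejected" },
  built: { deploy: "done", reject: "rejected" },
  failed: {},
  done: {},
  rejected: {},
};

// Return the next status, or throw if the event is not valid in the current one.
function nextStatus(current: BuildStatus, event: BuildEvent): BuildStatus {
  const next = TRANSITIONS[current][event];
  if (next === undefined) {
    throw new Error(`"${event}" is not valid while status is "${current}"`);
  }
  return next;
}

// Happy path: Gate 1, a passing build, then Gate 2.
let buildStatus: BuildStatus = "pending-approval";
buildStatus = nextStatus(buildStatus, "approve");
buildStatus = nextStatus(buildStatus, "tests-pass");
buildStatus = nextStatus(buildStatus, "deploy"); // "done"
```

Keeping the transitions in a table makes the reject rule easy to audit: every non-terminal status has a reject entry, and terminal ones have none.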

The three modes

Mode	Gate 1 (approve)	Gate 2 (deploy)	Use when
gated	requires `build approve`	requires `build deploy`	Default. Both gates require a person.
autorun	auto-passed on create	requires `build deploy`	You trust the brief, want to see the diff.
auto	auto-passed	auto-passed	Reserved for tightly-scoped trusted briefs only.

Change modes with build mode autorun (or the others); the value is persisted in shared state under selfbuild/mode. Default is gated.
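The mode table reduces to two booleans per mode. A minimal sketch follows; the GatePolicy shape and the gatePolicy function are illustrative names, not part of Marvin's codebase.

```typescript
type BuildMode = "gated" | "autorun" | "auto";

interface GatePolicy {
  approveNeedsHuman: boolean; // Gate 1
  deployNeedsHuman: boolean; // Gate 2
}

// Translate a mode into which gates still wait for a person.
function gatePolicy(mode: BuildMode): GatePolicy {
  switch (mode) {
    case "gated":
      return { approveNeedsHuman: true, deployNeedsHuman: true };
    case "autorun":
      return { approveNeedsHuman: false, deployNeedsHuman: true };
    case "auto":
      return { approveNeedsHuman: false, deployNeedsHuman: false };
  }
}

const autorunPolicy = gatePolicy("autorun");
// autorunPolicy.approveNeedsHuman is false; autorunPolicy.deployNeedsHuman is true
```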

What the runner will not touch

The brief is scoped to modules/<name>/ only. The runner deploys by rsync-ing the sandbox’s modules/ directory back into /opt — the kernel, the Dockerfile, docker-compose.yml, .git/, systemd/, and secrets/ are all excluded by design. A pre-commit hook on the host repo also blocks any change to protected paths.
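The scoping rule can be expressed as an allow-list check on relative paths. This is only a sketch of the idea: the actual protection comes from the rsync excludes and the pre-commit hook described above, and the isDeployablePath helper is hypothetical.

```typescript
// Accept a path only if it sits inside modules/<moduleName>/ and cannot
// escape that directory. Everything else (Dockerfile, .git/, secrets/, ...)
// is refused simply by not matching the allow-listed prefix.
function isDeployablePath(relativePath: string, moduleName: string): boolean {
  // Absolute paths are never acceptable.
  if (relativePath.startsWith("/")) return false;
  // Refuse any ".." segment so a path cannot climb out of the module directory.
  if (relativePath.split("/").includes("..")) return false;
  return relativePath.startsWith(`modules/${moduleName}/`);
}

const okPath = isDeployablePath("modules/hello-demo/index.ts", "hello-demo"); // true
const badPath = isDeployablePath("secrets/.env", "hello-demo"); // false
```

An allow-list is the safer default here: a new protected path added to the repo later is refused automatically, without anyone remembering to extend a deny-list.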

If a build fails on a TypeScript or test error, re-issue the build with the failure pasted into the new brief. The agent fixes a specific, quoted error far more readily than it gets the code right blind. The persona-trainer upgrade landed on the second pass for exactly this reason.

✦

VI. Costs & Limits

Marvin spends real money. Every LLM call is metered against two budgets — a daily cap that protects you from a runaway loop, and a per-build cap that bounds the cost of a single self-build run.

Daily LLM cap: $8/24h

Per-build cap: $3/run

Pre-call gate: $0.05 buffer

Default model: sonnet 4.6

Before any chat call, the kernel checks projected spend against the daily cap with a $0.05 safety buffer. If a call would push the day over, the kernel emits BUDGET_EXCEEDED and Marvin DMs you instead of charging. Self-build runs are also capped individually at $3 via the --max-budget-usd flag on the host runner.
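The pre-call check amounts to one addition and one comparison. The sketch below assumes the buffer is added to the projected spend before comparing against the cap; the manual does not spell out the exact arithmetic, and the function and type names are hypothetical.

```typescript
interface BudgetConfig {
  dailyCapUsd: number;
  bufferUsd: number;
}

type GateDecision = { ok: true } | { ok: false; code: "BUDGET_EXCEEDED" };

// Figures from the manual: $8 daily cap, $0.05 safety buffer.
const DEFAULT_BUDGET: BudgetConfig = { dailyCapUsd: 8, bufferUsd: 0.05 };

// Allow the call only if today's spend plus the estimate plus the buffer
// still fits under the daily cap.
function preCallGate(
  spentTodayUsd: number,
  estimatedCostUsd: number,
  cfg: BudgetConfig = DEFAULT_BUDGET,
): GateDecision {
  const projected = spentTodayUsd + estimatedCostUsd + cfg.bufferUsd;
  return projected <= cfg.dailyCapUsd ? { ok: true } : { ok: false, code: "BUDGET_EXCEEDED" };
}

const earlyInDay = preCallGate(1.0, 0.03); // ok: 1.08 is well under 8
const nearTheCap = preCallGate(7.96, 0.03); // refused: 8.04 exceeds 8
```

The buffer exists because the estimate is only an estimate; refusing slightly early is cheaper than discovering the overrun after the call has been billed.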

Approximate per-action costs

Action	Approx. cost	Notes
A single email triage (ACT/FYI/IGNORE)	≈ $0.001	Tiny prompt, structured output.
Drafting a reply with `draft`	≈ $0.01 – $0.03	Depends on email length + persona context.
`train persona` (~150 sampled emails)	≈ $0.30 – $0.50	One large LLM call.
Self-build brief expansion	≈ $0.01	A focused prompt — very cheap.
One full self-build run (agent)	≤ $3.00	Hard-capped. Real cost typically $0.50 – $1.50.

✦

VII. Troubleshooting

A small operating manual for the situations that come up. Most of these are recoverable from chat alone; the rest need a single SSH session or a one-line nudge on the host.

Marvin is silent

Send ping. A reply means the bot is up and Telegram is connected.
No reply within ~30 seconds? Container may be restarting. Wait 60 seconds and try again.
If still silent — SSH the host and check docker ps --filter name=reboot-assistant; if it’s not healthy, systemctl restart reboot-assistant.

“Budget exceeded” DM

The daily LLM cap of $8 has been hit. Subsequent paid calls will be refused until UTC midnight, but cron-driven non-LLM work (e.g. gcp-watchdog) continues.
If this happens early in the day, look at the audit log for the runaway: grep BUDGET_EXCEEDED /opt/reboot-assistant/audit/*.log.

Self-build said “No request found with id …”

Almost always a known filesystem-permissions edge case where the host runner’s status-write leaves the queue JSON unreadable to the container (mode 0600). Re-issue the original chat command after a single chmod 0666 /var/lib/reboot-assistant/state/selfbuild/req-<id>.json on the host.

GCP-watchdog alerted

The GCP project hosting Marvin’s OAuth client is unreachable. Sign in at console.cloud.google.com/cloud-resource-manager; deleted projects can be restored within 30 days.
Once the project is back, Marvin recovers automatically on his next scheduled poll — no restart needed.

A build failed on `tsc` or tests

Use build show <id> to read the error.
Re-issue the original build <name>: <desc> command with the failure quoted at the end of the new description. The agent is good at fixing its own mistakes when given the specific error.

✦

VIII. Appendix

Where things live, for the days when chat alone won’t do.

Thing	Location
Source on host	/opt/reboot-assistant
Sandbox copies for in-flight builds	/home/claude2/projects/marvin-selfbuild/<id>/
Self-build queue + diffs + reports	/var/lib/reboot-assistant/state/selfbuild/
Audit logs (one file per day)	/opt/reboot-assistant/audit/
Shared state (cross-module KV)	/opt/reboot-assistant/state/shared/
Secrets (env file)	/etc/reboot-assistant/secrets/.env
Container	docker container reboot-assistant (image v1.0)
Service	systemctl {status,restart} reboot-assistant
Runner timer	systemctl status selfbuild-runner.timer
Source repo	github.com/ianreboot/marvin (private)

The architecture in one paragraph

The kernel is small, immutable, and unfussy. It loads each module, hands it a typed ctx object, registers its chat commands and cron jobs, and otherwise stays out of the way. Modules have access to ctx.chat() for LLM calls (with model and budget mediation), ctx.shared for cross-module key-value state, ctx.state for module-private state, ctx.slack (poorly named — this is the Telegram adapter) for DMs, and ctx.log / ctx.audit for observability. The self-build pipeline writes new modules through that exact same surface area — which is why a brand-new module like hello-demo can be live within minutes of a one-sentence chat command.
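To make that surface area concrete, here is a hedged sketch of what a module and its context might look like. Only the property names (chat, shared, state, slack, log, audit) come from this manual; every signature, the KeyValueStore and MarvinModule interfaces, and the dm method are guesses made for illustration.

```typescript
// Assumed key-value interface; the real store may be async or typed differently.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Property names follow the manual; all signatures are assumptions.
interface ModuleContext {
  chat(prompt: string): Promise<string>; // LLM call, budget-mediated by the kernel
  shared: KeyValueStore; // cross-module state
  state: KeyValueStore; // module-private state
  slack: { dm(text: string): Promise<void> }; // the Telegram adapter, despite the name
  log(message: string): void;
  audit(event: string): void;
}

// Hypothetical module shape: a name plus handlers keyed by chat trigger.
interface MarvinModule {
  name: string;
  commands: Record<string, (ctx: ModuleContext, args: string) => Promise<void>>;
}

// A ping-style module written against that assumed surface.
const pingSketch: MarvinModule = {
  name: "ping",
  commands: {
    ping: async (ctx) => {
      ctx.audit("ping received");
      await ctx.slack.dm("pong");
    },
  },
};
```

Because a module only ever sees the context it is handed, a test can pass in a fake context and observe exactly what the module tried to send or record.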