Marvin

A personal Telegram assistant

Identity

Handle	`@marvin192_bot`
Transport	Telegram (long-poll)
Tenancy	Single-user (you)
Persona	Polite, dry, helpful

Where he lives

Host	Contabo VPS
Region	Singapore
Container	`reboot-assistant`
Image	`v1.0`
Repository	`ianreboot/marvin`

By the numbers

Modules	10
Self-build mode	`auto` (hands-off)
Daily LLM cap	$8
Per-build cap	$3
Default model	Sonnet 4.6

Last upgrade

Date	21 May 2026
What	Self-build set to `auto` (fully hands-off); added `help` module (10th)

Marvin is a single-tenant personal assistant that lives inside Telegram. He has read access to your Gmail, the ability to draft (never send) replies in your writing voice, custody of a Google Tasks list named “Marvin”, and a strict daily LLM budget.

What sets him apart from a typical bot is scope: Marvin is also the orchestration side of a small self-extension pipeline. Given a one-sentence description of a new capability, he can write the code, run its tests, deploy it, and restart himself — under your approval in gated mode, or fully autonomously in auto mode (the current setting as of 21 May 2026). The first module shipped end-to-end through this pipeline was hello-demo; the most recent, the help command, was built and deployed entirely hands-off.

How Marvin builds himself: in one example

The headline. Marvin runs in auto mode: you describe a new capability in one sentence, and he writes the code, runs the tests, deploys it to himself, and restarts — with zero approvals from you. You get a DM when it starts and when it’s done.

A real example — the `help` command this manual links to

The help command was built this exact way. The whole interaction was one message from you and two replies from Marvin:

build help: add a command that lists all of Marvin’s commands

Build “help” queued (auto mode) — id 5c1d9af3. I’ll build, test, and deploy it. No approval needed.

✅ help is deployed and live. Try it: send “help”.

Between those two replies, Marvin did all of this on his own:

Expanded your one sentence into a full structured build brief (LLM call, ~$0.01).

Auto-passed gate 1 — in auto mode there is no “approve” step.

Copied his own code into a sandbox and ran claude -p (capped at $3) to write the new module and its tests.

Ran the full test suite (539 tests) and tsc --noEmit; both were green. Had either failed, the build would have stopped here and nothing would have shipped.

Auto-passed gate 2 — in auto mode there is no “deploy” step.

rsync’d the new module into production, rebuilt the Docker image, and restarted himself. Live.

Real timeline from that run (21 May 2026) — start to finish, untouched:

04:32  queued
04:34  building in sandbox
04:37  built  — tests + types pass
04:38  auto-approved for deploy   ← no human here
04:39  deploying
04:44  done — live    (~11.5 min total)

What you do: send one message of the form build <name>: <what you want>. That’s the entire interaction. A build that fails its tests stops safely at status failed and never ships; re-send the request with the error pasted in and Marvin fixes it. Want a checkpoint back? Send build mode autorun (you approve the deploy) or build mode gated (you approve everything). Full mechanics are in The Self-Build Loop below.

Contents

How Marvin builds himself
First Contact
Commands at a Glance
The Modules
The Self-Build Loop
Costs & Limits
Troubleshooting
Appendix

First Contact

Open Telegram and message @marvin192_bot. Only your configured chat ID is authorized; every other inbound message is silently dropped. Commands are plain text, no leading slash — ping, tasks, train persona, build foo: do a thing.
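The single-user rule above amounts to one guard in front of every inbound message. The sketch below is only an illustration of that idea: `InboundMessage`, `isAuthorized`, and `acceptMessage` are hypothetical names, not Marvin’s actual code.

```typescript
// Hypothetical shape of an inbound Telegram message (illustration only).
interface InboundMessage {
  chatId: number;
  text: string;
}

// True only for the single configured chat ID.
function isAuthorized(msg: InboundMessage, ownerChatId: number): boolean {
  return msg.chatId === ownerChatId;
}

// Returns the trimmed command text for the owner, or null to signal a silent drop.
function acceptMessage(msg: InboundMessage, ownerChatId: number): string | null {
  if (!isAuthorized(msg, ownerChatId)) return null; // no reply, no error message
  return msg.text.trim();
}
```

Returning null instead of an error reply mirrors the "silently dropped" behaviour: an unauthorized sender gets no signal that the bot exists.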

The four useful openings

ping

pong  ·  Marvin is awake

last

Most recent inbox-y email: subject, from, snippet

tasks

Open items on your Marvin list, with short ids

build mode

Current self-build mode: auto  ·  (gated, autorun, auto)

Marvin is conversational where the situation allows and brusque where it calls for it. He won’t pad short answers; he also won’t hold back a long one if the task warrants it.

Commands at a Glance

Every chat command Marvin currently recognises, grouped by the module that owns it. Arguments are wrapped in <…>, and square brackets mark a set of optional values; the right column gives the owning module. Reach for this table when you forget the exact phrasing.

Command	What it does	Module
ping	Replies `pong`. Confirms the bot is alive.	ping
last	Show the most recent meaningful email in your inbox.	email-drafter
draft <topic>	Draft a reply (saved to Gmail Drafts, not sent) in your voice.	email-drafter
redraft	Re-roll the last draft with different wording.	email-drafter
tasks	List open items on your Google Tasks “Marvin” list.	task-tracker
task <text>	Create a new task on the Marvin list.	task-tracker
task from <email>	Convert an email into a follow-up task with backtrace.	task-tracker
done <id>	Mark a task complete.	task-tracker
draft from task <id>	Generate an email draft seeded by a task.	task-tracker
train persona	Rebuild your writing-voice profile from sent emails.	persona-trainer
persona	Show the current voice profile (tone, signoffs, etc.).	persona-trainer
build <name>: <desc>	Start a new self-build request.	selfbuild
build list	All open and recent build requests with statuses.	selfbuild
build show <id>	Show the expanded brief / latest report.	selfbuild
build approve <id>	Gate 1 — release a request to the host runner.	selfbuild
build deploy <id>	Gate 2 — ship a built request to production.	selfbuild
build reject <id>	Cancel any non-terminal request (alias: `build cancel`).	selfbuild
build mode [gated|autorun|auto]	Show or set: `gated`, `autorun`, `auto`.	selfbuild
hello	Demonstration module from the original self-build proof.	hello-demo
help	List all of Marvin's commands in chat, with a link to this manual.	help

Several modules are not commanded directly — they run on cron: gmail-scanner polls every 10 minutes, gcp-watchdog probes every 6 hours, selfbuild watches the host queue every 2 minutes, and task-tracker sends a morning digest each day. signoff-tracker is an internal helper consumed by email-drafter and has no chat commands.

The Modules

Each module is a small, independent unit of behaviour. The kernel loads them at startup, gives each a sandboxed context object, and registers their chat triggers and scheduled jobs. Modules can read and write a small shared key-value store, send DMs, talk to an LLM, and log to the audit trail — nothing more.
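As a rough illustration of the loading step described above, the sketch below registers chat triggers and cron jobs for two modules and routes a message to the module that owns it. `ModuleSpec`, `Kernel`, `load`, and `dispatch` are hypothetical names chosen for this example; the real kernel API is not shown in this manual.

```typescript
// Hypothetical module description: chat triggers plus scheduled jobs.
interface ModuleSpec {
  id: string;
  triggers: { pattern: RegExp; handle: (text: string) => string }[];
  jobs: { cron: string; run: () => void }[];
}

class Kernel {
  private triggers: { owner: string; pattern: RegExp; handle: (text: string) => string }[] = [];
  readonly schedule: { owner: string; cron: string }[] = [];

  // Load a module: record its chat triggers and its scheduled jobs.
  load(mod: ModuleSpec): void {
    for (const t of mod.triggers) {
      this.triggers.push({ owner: mod.id, pattern: t.pattern, handle: t.handle });
    }
    for (const j of mod.jobs) {
      this.schedule.push({ owner: mod.id, cron: j.cron });
    }
  }

  // Route a chat message to the first module whose trigger matches, if any.
  dispatch(text: string): string | null {
    const hit = this.triggers.find((t) => t.pattern.test(text));
    return hit ? hit.handle(text) : null;
  }
}

const kernel = new Kernel();
kernel.load({ id: "ping", triggers: [{ pattern: /^ping$/i, handle: () => "pong" }], jobs: [] });
kernel.load({ id: "gmail-scanner", triggers: [], jobs: [{ cron: "*/10 * * * *", run: () => {} }] });
```

A module with no triggers, such as gmail-scanner here, only shows up in the schedule, which matches the cron-only modules listed earlier.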

email-drafter drafts replies

Reads the most recent email in your inbox and produces a reply in your own writing voice. The draft is saved to Gmail Drafts and is never sent automatically. Picks a signoff appropriate to the recipient (see signoff-tracker).

Command	What it does
last	Show the most recent inbox-y email Marvin saw.
draft <topic>	Draft a reply on the given topic, in your voice. Topic is optional.
redraft	Try again on the previous draft — different phrasing, same intent.

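Drafts always land in Gmail Drafts for your review; sending is on you. This is deliberate: Marvin only ever creates drafts and has no send path. Note that Google documents the gmail.compose OAuth scope as covering both draft management and sending, so the guarantee rests on Marvin’s code rather than on the scope alone.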

task-tracker google tasks

Custody of a Google Tasks list literally named Marvin. Strict scoping: Marvin will never read, write, or surface any task on any other list. Items created by Marvin carry a small [from-email] backtrace when relevant.

Command	What it does
tasks	List open items on the Marvin list with short ids.
task <text>	Create a task with the given title.
task from <email-ref>	Convert a recent email into a follow-up task (annotated with sender + subject).
done <id>	Mark the matching task complete.
draft from task <id>	Generate a Gmail draft seeded by the task and its email backtrace.

A morning digest fires daily on cron 0 8 * * * — Marvin DMs you a short summary of open tasks.

persona-trainer voice profile

Builds and stores a writing-voice profile by analysing your sent emails. The profile is consumed by email-drafter when generating replies. Stored at the shared key persona/default.

Command	What it does
train persona	Rebuild the profile from a fresh sample of sent emails.
persona	Show the current profile (tone, formality, signoffs, distinctive phrases, things avoided).

Sampling (current behaviour, as of 20 May 2026)

Fetches the 300 most recent sent emails (excludes the zz-Old label).
Excludes forwards, auto-replies, and bodies under 50 chars.
Downsamples to ~150 for the LLM via a two-phase selector: (1) per-recipient cap of 10 prevents any single contact from dominating, (2) remaining slots are filled preferring longer, more substantive bodies.
Truncates each body to 1500 chars before the prompt is sent.
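The sampling steps above can be sketched as a small pure function. This is one plausible reading rather than the module’s actual code: it assumes phase 1 keeps at most 10 emails per recipient in the order given, and phase 2 fills the target from what remains, longest body first. `SentEmail` and `selectSample` are hypothetical names.

```typescript
// Hypothetical shape of a fetched sent email (illustration only).
interface SentEmail {
  recipient: string;
  body: string;
  isForward: boolean;
  isAutoReply: boolean;
}

function selectSample(
  emails: SentEmail[], // assumed to be ordered newest first
  target = 150,
  perRecipientCap = 10,
  minChars = 50,
  maxChars = 1500,
): SentEmail[] {
  // Pre-filter: drop forwards, auto-replies, and very short bodies.
  const usable = emails.filter(
    (e) => !e.isForward && !e.isAutoReply && e.body.length >= minChars,
  );

  // Phase 1: cap how many emails any single recipient can contribute.
  const seen = new Map<string, number>();
  const capped: SentEmail[] = [];
  for (const e of usable) {
    const n = seen.get(e.recipient) ?? 0;
    if (n < perRecipientCap) {
      seen.set(e.recipient, n + 1);
      capped.push(e);
    }
  }

  // Phase 2: fill the target, preferring longer bodies.
  const chosen = capped
    .slice()
    .sort((a, b) => b.body.length - a.body.length)
    .slice(0, target);

  // Truncate each body before it would be placed in the prompt.
  return chosen.map((e) => ({ ...e, body: e.body.slice(0, maxChars) }));
}
```

Applying the cap before the length preference is what stops one prolific correspondent, who may also write the longest emails, from crowding out everyone else.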

The earlier version sampled only 50 emails with no recipient balancing. The upgrade landed through the self-build pipeline itself — a piece of in-house dogfooding.

gmail-scanner scheduled · 10m

Runs on cron */10 * * * *. For each new unread email since the last poll, an LLM-based triage step classifies it as ACT, FYI, or IGNORE. Only ACT mails earn a DM. FYI is logged silently; IGNORE is dropped. This is what keeps your phone from buzzing on every newsletter and cold pitch.

Triage is cost-guarded — a per-email budget keeps the daily LLM cap intact even on busy inbox days. If the daily cap is breached, the scanner short-circuits and reports the breach instead of triaging blindly.
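The routing and the budget short-circuit can be pictured as follows. This is a simplified sketch: the labels are passed in directly where the real scanner would obtain them from an LLM call, and `routeTriage`, `scanInbox`, and the per-email cost parameter are hypothetical.

```typescript
type Triage = "ACT" | "FYI" | "IGNORE";
type Outcome = "dm" | "log" | "drop";

// Map a triage label to what happens next.
function routeTriage(label: Triage): Outcome {
  switch (label) {
    case "ACT":
      return "dm"; // the only label that reaches your phone
    case "FYI":
      return "log"; // recorded silently
    case "IGNORE":
      return "drop";
  }
}

interface ScanReport {
  outcomes: Outcome[];
  breached: boolean; // true if the scan stopped early on the daily cap
}

// Triage emails in order, stopping as soon as the next call would exceed the cap.
function scanInbox(
  labels: Triage[],
  spentUsd: number,
  capUsd: number,
  perEmailUsd: number,
): ScanReport {
  const outcomes: Outcome[] = [];
  let spent = spentUsd;
  for (const label of labels) {
    if (spent + perEmailUsd > capUsd) return { outcomes, breached: true };
    spent += perEmailUsd;
    outcomes.push(routeTriage(label));
  }
  return { outcomes, breached: false };
}
```

Checking the cap before each email, rather than once per poll, is what lets the scanner report a breach instead of triaging blindly partway through a busy batch.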

signoff-tracker helper · no commands

Tracks which signoff name (Yuen vs Yuen-Ho, etc.) you use with each recipient context. email-drafter consults it when choosing how to sign a generated draft. Has no chat commands; it observes and informs.
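One simple way to picture this helper is a per-recipient frequency count. The sketch below assumes the "recipient context" is just the recipient’s address and that the most frequently observed signoff wins; both are assumptions, and `SignoffTracker`, `observe`, and `suggest` are hypothetical names.

```typescript
class SignoffTracker {
  // recipient (lower-cased) -> signoff -> number of times observed
  private counts = new Map<string, Map<string, number>>();

  // Record that `signoff` was used in a sent email to `recipient`.
  observe(recipient: string, signoff: string): void {
    const key = recipient.toLowerCase();
    const perRecipient = this.counts.get(key) ?? new Map<string, number>();
    perRecipient.set(signoff, (perRecipient.get(signoff) ?? 0) + 1);
    this.counts.set(key, perRecipient);
  }

  // Most frequently observed signoff for this recipient, or the fallback.
  suggest(recipient: string, fallback: string): string {
    const perRecipient = this.counts.get(recipient.toLowerCase());
    if (!perRecipient) return fallback;
    let best = fallback;
    let bestCount = 0;
    perRecipient.forEach((n, signoff) => {
      if (n > bestCount) {
        best = signoff;
        bestCount = n;
      }
    });
    return best;
  }
}
```

In this picture, email-drafter would call something like `suggest` when signing a draft and fall back to a default for first-time recipients.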

gcp-watchdog scheduled · 6h

Cron 23 */6 * * *. Pings the configured Google Cloud project and verifies the OAuth refresh chain is healthy. If the project is deleted, suspended, or its OAuth client revoked, Marvin DMs a clear alert with a recovery pointer. Born of a real May 2026 incident in which the GCP project hosting his OAuth client was deleted by hand.

selfbuild the lever

The module that lets Marvin extend himself. Receives a one-sentence build request, expands it into a structured brief via an LLM call, writes the request to a host queue, and tracks its lifecycle (see The Self-Build Loop). The module itself never executes code — all building, testing, and deploying is performed by a host-side runner under a strict budget.

Command	What it does
build <name>: <desc>	Create a new build request (status: `pending-approval`).
build list	All current and recent requests with statuses.
build show <id>	Brief, history, and latest report for a single request.
build approve <id>	Gate 1. Hands the request to the runner. Only relevant in `gated` mode.
build deploy <id>	Gate 2. Ships the built artifact to `/opt` and restarts the bot.
build reject <id>	Reject any non-terminal request. Alias: `build cancel`.
build mode [gated|autorun|auto]	Show, or change, the current automation level.
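To make the command grammar in the table concrete, here is a minimal parser sketch. The regular expressions and the `BuildCommand` type are assumptions made for illustration; the module’s real parsing rules (for example, which characters a module name may contain) are not documented here.

```typescript
type BuildCommand =
  | { kind: "create"; name: string; description: string }
  | { kind: "list" }
  | { kind: "show" | "approve" | "deploy" | "reject"; id: string }
  | { kind: "mode"; value?: "gated" | "autorun" | "auto" };

function parseBuildCommand(text: string): BuildCommand | null {
  const t = text.trim();
  if (t === "build list") return { kind: "list" };

  // "build mode" shows the mode; "build mode <value>" sets it.
  const mode = /^build mode(?:\s+(gated|autorun|auto))?$/.exec(t);
  if (mode) {
    return mode[1]
      ? { kind: "mode", value: mode[1] as "gated" | "autorun" | "auto" }
      : { kind: "mode" };
  }

  // Commands that take a request id; "cancel" is an alias for "reject".
  const withId = /^build (show|approve|deploy|reject|cancel)\s+(\S+)$/.exec(t);
  if (withId) {
    const verb = withId[1] === "cancel" ? "reject" : withId[1];
    return { kind: verb as "show" | "approve" | "deploy" | "reject", id: withId[2] };
  }

  // "build <name>: <description>" creates a new request.
  const create = /^build ([\w-]+):\s*(.+)$/.exec(t);
  if (create) return { kind: "create", name: create[1], description: create[2] };

  return null;
}
```

The colon is what separates a create request from the id-taking verbs, so a module could still be named after a verb (for instance `build show: …`) without ambiguity in this sketch.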

ping liveness

The smallest possible module. ping ⇒ pong. Useful as a sanity-check after a restart, after the network has been flaky, or to confirm the Telegram adapter is connected.

hello-demo the proof

A small ceremonial module. hello ⇒ “hello from a module Marvin built itself.” The very first module shipped end-to-end through the self-build pipeline, kept around as a working demonstration that the pipeline works.

help in-chat reference

help ⇒ a formatted list of every Marvin command, with a link to this manual. Strictly read-only — no LLM, no state, no Gmail. Notable as the first module built and deployed fully autonomously in auto mode (request 5c1d9af3, 21 May 2026): request → build → test → deploy → restart, with zero human approvals.

The Self-Build Loop

Marvin can write his own modules. The mechanism is intentionally simple: a chat command stores a request on the host filesystem; a small systemd timer runs the host-side claude CLI inside a sandbox; and, in gated and autorun modes, the resulting diff and report come back to you for review before they touch the live container. In auto mode that review is skipped and a build that passes its checks ships directly.

Lifecycle of a request

You DM build <name>: <description>. Marvin expands your one-liner into a structured brief via an LLM call (cost ≈ $0.01) and writes req-<id>.json to the queue. Status: pending-approval.

Marvin DMs you the brief preview and the request id. You read it with build show <id> if needed.

Gate 1. You reply build approve <id> or build reject <id>. Skipped in autorun / auto modes.

A systemd timer (every 2 minutes) picks up the approved request. The runner copies /opt/reboot-assistant into a sandbox, installs deps, and runs claude -p with a $3 hard budget cap.

Inside the sandbox, the agent edits the targeted module, writes tests, and reports back. The runner runs pnpm test and tsc --noEmit. Pass → status built; fail → status failed with the error attached.

Marvin DMs you the result with a link to the diff and report. Gate 2. You reply build deploy <id> or build reject <id>. Skipped in auto mode.

The runner rsyncs the sandbox modules-only into /opt, rebuilds the Docker image, and restarts the service. Status: done. New code is live.
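The lifecycle above reads naturally as a small state machine. In the sketch below, only `pending-approval`, `built`, `failed`, and `done` are status names taken from this manual; `approved`, `building`, `deploying`, and `rejected` are placeholder names for the intermediate steps, and the transition table itself is an assumption.

```typescript
type BuildStatus =
  | "pending-approval"
  | "approved" // placeholder name: gate 1 passed
  | "building" // placeholder name: runner working in the sandbox
  | "built"
  | "failed"
  | "deploying" // placeholder name: gate 2 passed, rsync and restart in progress
  | "done"
  | "rejected"; // placeholder name: cancelled via build reject

// Allowed next statuses for each status; terminal statuses have none.
const NEXT: Record<BuildStatus, BuildStatus[]> = {
  "pending-approval": ["approved", "rejected"],
  approved: ["building", "rejected"],
  building: ["built", "failed", "rejected"],
  built: ["deploying", "rejected"],
  deploying: ["done", "failed", "rejected"],
  failed: [],
  done: [],
  rejected: [],
};

function canTransition(from: BuildStatus, to: BuildStatus): boolean {
  return NEXT[from].some((s) => s === to);
}

function isTerminal(status: BuildStatus): boolean {
  return NEXT[status].length === 0;
}
```

Every non-terminal status lists `rejected`, which matches the rule that build reject works on any non-terminal request, and `failed` has no outgoing edge, which is why a failed build is retried by issuing a new request.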

The three modes

Mode	Gate 1 (approve)	Gate 2 (deploy)	Use when
gated	requires `build approve`	requires `build deploy`	Both gates require a person. (Original default.)
autorun	auto-passed on create	requires `build deploy`	You trust the brief, want to glance at the diff before it ships.
auto	auto-passed	auto-passed	← current setting (since 21 May 2026). Fully hands-off.

Change modes with build mode gated / autorun / auto; the value is persisted in shared state under selfbuild/mode. Marvin is currently in auto mode — a plain build <name>: <desc> now runs and deploys with no approvals; you just get a DM at start and at done. Switch to autorun or gated any time you want a human checkpoint back.
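The mode table reduces to two booleans per mode. A minimal sketch, with `BuildMode`, `Gates`, and `gatesFor` as hypothetical names:

```typescript
type BuildMode = "gated" | "autorun" | "auto";

interface Gates {
  approveNeedsHuman: boolean; // gate 1: build approve
  deployNeedsHuman: boolean; // gate 2: build deploy
}

// Which gates still wait for a person in each mode.
function gatesFor(mode: BuildMode): Gates {
  return {
    approveNeedsHuman: mode === "gated",
    deployNeedsHuman: mode !== "auto",
  };
}
```

Under this reading, `gatesFor("autorun")` leaves only the deploy gate waiting for you, and `gatesFor("auto")` leaves none.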

What the runner will not touch

The brief is scoped to modules/<name>/ only. The runner deploys by rsync-ing the sandbox’s modules/ directory back into /opt — the kernel, the Dockerfile, docker-compose.yml, .git/, systemd/, and secrets/ are all excluded by design. A pre-commit hook on the host repo also blocks any change to protected paths.
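The scoping rule can be expressed as a path check. The real protection described above comes from rsync-ing only modules/ and from the pre-commit hook; the function below is just an illustrative model of the "modules/<name>/ only" rule, and `isInModuleScope` is a hypothetical name.

```typescript
// Split a relative path into segments, refusing any upward traversal.
function normalise(relPath: string): string[] | null {
  const parts: string[] = [];
  for (const seg of relPath.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") return null;
    parts.push(seg);
  }
  return parts;
}

// True only for files that live inside modules/<moduleName>/.
function isInModuleScope(relPath: string, moduleName: string): boolean {
  if (relPath.charAt(0) === "/") return false; // absolute paths are never in scope
  const parts = normalise(relPath);
  return (
    parts !== null &&
    parts.length >= 3 &&
    parts[0] === "modules" &&
    parts[1] === moduleName
  );
}
```

Rejecting any `..` segment outright is deliberately stricter than resolving it, so a path such as modules/help/../../secrets/.env is refused without further reasoning.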

If a build fails on a TypeScript or test error, re-issue the build with the failure pasted into the new brief. The agent fixes its own mistakes far more readily than it gets them right blind. The persona-trainer upgrade landed on the second pass for exactly this reason.

Costs & Limits

Marvin spends real money. Every LLM call is metered against two budgets — a daily cap that protects you from a runaway loop, and a per-build cap that bounds the cost of a single self-build run.

Daily LLM cap	$8/24h
Per-build cap	$3/run
Pre-call buffer	$0.05
Default model	Sonnet 4.6

Before any chat call, the kernel checks projected spend against the daily cap with a $0.05 safety buffer. If a call would push the day over, the kernel emits BUDGET_EXCEEDED and Marvin DMs you instead of charging. Self-build runs are also capped individually at $3 via the --max-budget-usd flag on the host runner.
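The pre-call check can be sketched in a few lines. How the kernel actually projects a call’s cost and applies the buffer is not specified here, so the formula below (spend so far, plus the projected call, plus the buffer, compared against the cap) is an assumption; `checkBudget` and `BudgetDecision` are hypothetical names.

```typescript
interface BudgetDecision {
  allowed: boolean;
  code?: "BUDGET_EXCEEDED";
}

// Decide whether a paid call may proceed, using the caps from the table above.
function checkBudget(
  spentTodayUsd: number,
  projectedCallUsd: number,
  dailyCapUsd = 8,
  bufferUsd = 0.05,
): BudgetDecision {
  const projected = spentTodayUsd + projectedCallUsd + bufferUsd;
  return projected > dailyCapUsd
    ? { allowed: false, code: "BUDGET_EXCEEDED" }
    : { allowed: true };
}
```

With $7.96 already spent, a $0.02 draft would be refused under this formula (7.96 + 0.02 + 0.05 exceeds 8), even though the call alone would still fit under the cap; that margin is the point of the buffer.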

Approximate per-action costs

Action	Approx. cost	Notes
A single email triage (ACT/FYI/IGNORE)	≈ $0.001	Tiny prompt, structured output.
Drafting a reply with `draft`	≈ $0.01 – $0.03	Depends on email length + persona context.
`train persona` (~150 sampled emails)	≈ $0.20 – $0.40	One large LLM call.
Self-build brief expansion	≈ $0.01	A focused prompt — very cheap.
One full self-build run (agent)	≤ $3.00	Hard-capped. Real cost typically $0.50 – $1.50.

Troubleshooting

A small operating manual for the situations that come up. Most of these are recoverable from chat alone; the rest need a single SSH or a one-line nudge on the host.

Marvin is silent

Send ping. A reply means the bot is up and Telegram is connected.
No reply within ~30 seconds? Container may be restarting. Wait 60 seconds and try again.
If still silent, SSH into the host and check docker ps --filter name=reboot-assistant; if the container is not healthy, run systemctl restart reboot-assistant.

“Budget exceeded” DM

The daily LLM cap of $8 has been hit. Subsequent paid calls will be refused until UTC midnight, but cron-driven non-LLM work (e.g. gcp-watchdog) continues.
If this happens early in the day, look at the audit log for the runaway: grep BUDGET_EXCEEDED /opt/reboot-assistant/audit/*.log.

Self-build said “No request found with id …”

Fixed 21 May 2026. This was a filesystem-permissions bug: after writing the built status, the runner left the queue JSON unreadable to the container (mode 0600), which stalled the pipeline (and previously blocked auto mode). The runner now runs chmod 0666 on the file after every status write, so the bug should not recur. If it ever does, the manual remedy is to run chmod 0666 /var/lib/reboot-assistant/state/selfbuild/req-<id>.json on the host, then re-issue the command.

GCP-watchdog alerted

The GCP project hosting Marvin’s OAuth client is unreachable. Sign in at console.cloud.google.com/cloud-resource-manager; deleted projects can be restored within 30 days.
Once the project is back, Marvin recovers automatically on his next scheduled poll — no restart needed.

A build failed on `tsc` or tests

Use build show <id> to read the error.
Re-issue the original build <name>: <desc> command with the failure quoted at the end of the new description. The agent is good at fixing its own mistakes when given the specific error.

Appendix

Where things live, for the days when chat alone won’t do.

Thing	Location
Source on host	/opt/reboot-assistant
Sandbox copies for in-flight builds	/home/claude2/projects/marvin-selfbuild/<id>/
Self-build queue + diffs + reports	/var/lib/reboot-assistant/state/selfbuild/
Audit logs (one file per day)	/opt/reboot-assistant/audit/
Shared state (cross-module KV)	/opt/reboot-assistant/state/shared/
Secrets (env file)	/etc/reboot-assistant/secrets/.env
Container	docker container reboot-assistant (image v1.0)
Service	systemctl {status,restart} reboot-assistant
Runner timer	systemctl status selfbuild-runner.timer
Source repo	github.com/ianreboot/marvin (private)

The architecture in one paragraph

The kernel is small, immutable, and unfussy. It loads each module, hands it a typed ctx object, registers its chat commands and cron jobs, and otherwise stays out of the way. Modules have access to ctx.chat() for LLM calls (with model and budget mediation), ctx.shared for cross-module key-value state, ctx.state for module-private state, ctx.slack (poorly named — this is the Telegram adapter) for DMs, and ctx.log / ctx.audit for observability. The self-build pipeline writes new modules through that exact same surface area — which is why a brand-new module like hello-demo can be live within minutes of a one-sentence chat command.
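To show what that surface might look like in types, here is a hedged sketch. Only the member names (`chat`, `shared`, `state`, `slack`, `log`, `audit`) come from the paragraph above; every signature, the `KV` interface, the `dm` method, the `hello/count` key, and the fake context are assumptions made so the example is self-contained.

```typescript
// Hypothetical key-value interface for ctx.shared and ctx.state.
interface KV {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Hypothetical typing of the ctx object; real signatures may differ.
interface ModuleCtx {
  chat(prompt: string): Promise<string>; // LLM call, mediated by the kernel
  shared: KV; // cross-module state
  state: KV; // module-private state
  slack: { dm(text: string): void }; // the Telegram adapter, despite the name
  log(message: string): void;
  audit(event: string): void;
}

// A toy handler in the spirit of hello-demo: count invocations, audit, DM.
function helloHandler(ctx: ModuleCtx): void {
  const count = Number(ctx.state.get("hello/count") ?? "0") + 1;
  ctx.state.set("hello/count", String(count));
  ctx.audit("hello.invoked");
  ctx.slack.dm("hello from a module Marvin built itself.");
}

// In-memory stand-in for the kernel, for trying the handler out.
function makeFakeCtx(sent: string[], events: string[]): ModuleCtx {
  const makeStore = (): KV => {
    const m = new Map<string, string>();
    return { get: (k) => m.get(k), set: (k, v) => { m.set(k, v); } };
  };
  return {
    chat: (prompt) => Promise.resolve("echo: " + prompt),
    shared: makeStore(),
    state: makeStore(),
    slack: { dm: (text) => { sent.push(text); } },
    log: () => {},
    audit: (event) => { events.push(event); },
  };
}
```

Because the handler touches nothing except `ctx`, it can be exercised against the fake context without Telegram, an LLM, or the filesystem.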
