LAB 003 / Operating

The Self-Running Desk

A real publication with zero staff is the most honest demo of autonomous-agent work: every feed card, briefing, and data refresh on this site is a public, checkable artifact of the system that produced it.

A publication operating loop rendered as a circuit: discover, write and review, publish, verify, with a rollback path looping beneath.

ActiveStatus

OperatingCurrent phase

2026-07-205 public updates

Read the full story

Charter

Run the Tokenmaxxing Desk as a fully autonomous publication and document the operating system, its decisions, its metrics, and its failures here — in public, on a recurring cadence.

What Does Success Look Like

The operating story is public: a reader can understand how the desk runs itself without seeing private internals.
The machine holds cadence: data refreshes daily, content publishes three times a week, the briefing ships every Monday — at least 95% of scheduled runs green.
Editorial quality survives autonomy: zero copied-text incidents, no duplicate or syndicated-source stories, every annotation traceable to its source.
Search traffic compounds as evidence the content is genuinely useful: organic clicks grow quarter over quarter on the token-cost cluster.
Failures are part of the showcase: broken deploys auto-roll-back and the incident gets documented in this Lab, not hidden.
Human time stays under 30 minutes a week — pressing send on the newsletter and reading the machine's reports.

Current state

Fully autonomous as of June 10, 2026: scheduled routines run discovery and data refreshes daily, publish up to three self-reviewed feed records three times a week, write and ship the Monday briefing, deploy on every merge, verify production after each deploy, and roll back automatically if the site breaks. Instrumentation (Search Console, Ahrefs, PostHog) feeds a weekly opportunity report the machine acts on.

Boundaries

Never publish credentials, machine paths, internal prompts, or private operational details — the operating story is public, the keys are not.
Do not represent planned automation as shipped; this Lab only documents what actually runs.
Newsletter sends remain a human action — the machine stages drafts and stops.
Quality counter-metrics outrank growth: the desk would rather skip a publish day than ship a junk record.

Decision highlights

Simplified the site from 314 pages to 142 before automating: every indexable URL must rank, earn links, or feed a page that does.
Replaced the human editorial gate with a structured self-review step: facts traced to source, no copied sentences, no syndicated or duplicate stories.
Every publish path ends with production verification and automatic git-revert rollback.
The first autonomous publish attempt failed self-review standards and was rejected — the rejection is part of the record.

Open questions

If the term tokenmaxxing fades from search, when does the desk pivot its center of gravity to the durable AI-cost cluster?
What would make readers trust an agent-run publication more: showing the run logs, or showing the rejections?
When the desk earns meaningful traffic, do sponsor slots switch on, or does staying ad-free serve the showcase better?

Next actions

Publish a Lab update after each notable run: incidents, rejections, and metric movements included.
Add a public scorecard of the machine's cadence and quality record to this Lab.
Document the de-cannibalization experiment and its Search Console results as the first public case study.

Artifacts

docThe live feed the machine curates docWeekly briefings the machine writes docThe leaderboard the machine keeps fresh

Update history

Public progress from this Lab

2026-07-20

Run 005: the outage we didn't log, and the checkpoint we owed

This run is late, and its two headline items are things this Lab should have reported sooner: a two-day machine outage that went undocumented for a week, and a July 7 checkpoint promise that slipped. Both get settled here, alongside the best news the desk has had.

The outage, owned: nothing ran on July 13 or 14 — no data refresh, no weekly briefing. The machine self-recovered on the 15th with no data loss, but the cause is humbling: these routines run on a local machine, and the machine was asleep. The reliability claim inherits a hardware dependency we hadn't stated. A missed-run detector now flags any weekday gap loudly, and the incident is in the public runbook.
The July 7 checkpoint, answered late: the definition-cluster click-through we promised to read stands at 0.6–1.4% across the tracked queries — every one improved from the ~0.5% baseline, none has cleared the 1.5% bar yet. Trending right, not there. One more title iteration remains possible before the September freeze; after early August the SEO experiment bakes untouched.
The best news: the conversion system shipped July 2 is working. Five confirmed subscribers now, three of them traceable to specific intent-matched signup cards — a stranger lands from search, reads, and subscribes, with attribution proving which page did it. Small numbers, real mechanism.

2026-07-02

Run 004: three walls and a pivot

We took the experiment to the public channels; the channels' anti-AI defenses ate every post. So the strategy changed — stop chasing strangers, convert the readers already arriving.

Published the build-in-public essay and launched it on Hacker News and Reddit — from the owner's accounts, with human review on every post.
The results, honestly: the HN post died on the new-submissions page at 2 points. One subreddit bans links and case studies outright, so we never posted. Another auto-removed the post within a minute — it now requires biometric human verification to post at all.
The finding: the internet's anti-AI-content defenses don't distinguish a transparent experiment from a content farm. Cold social distribution is effectively gated for a site like this in 2026.

2026-06-21

Run 003: hardening the boring parts

The flashy weeks are about new pages; this one was about not falling over — a security sweep, a deploy-incident post-mortem, self-hardening the deploy paths, and writing the operating loop down.

Security: a dependency audit found 11 known vulnerabilities (one high-severity), all flowing through a single outdated analytics library. One version bump cleared the high finding and most of the rest — the site went from 11 vulnerabilities to 1 low-risk one.
Deploy-incident post-mortem: a week earlier the production domain briefly served an old build. Forensics traced it to a non-git deploy decoupling the domain from the git pipeline. The three automated routines were hardened to ban manual deploy commands and to check the domain-to-deployment mapping before ever rolling back code — so the same failure can't be mis-diagnosed again.
The desk wrote down its own operating loop: a documented cadence separating what the machine does daily and weekly from the weekly human-judgment step — measure, pick one lever, ship it, let it bake, log it — with a hard September checkpoint against the kill-triggers.

2026-06-13

Run 002: course corrections, in public

Three weeks in, the desk has shipped something, broken something, and fixed it — which is exactly what an honest operating log should show.

The events board: the system killed it, then reversed course. Events were pulling four clicks in three months, so the section got cut. A daily look at Search Console showed events impressions had actually been climbing — from zero to roughly 150 a week in the section's final two weeks. The three-month average had buried a rising trend that only started once the pages got indexed. So the board came back, rebuilt: future-only by construction (past events drop out of the build and return 404), text-first, no scraped images.
Discovery widened from three sources to five — Luma, Eventbrite, Partiful, Meetup, and a set of curated AI-community calendars — and the board grew from a few dozen listings to around a hundred upcoming events across twenty cities, refreshed every weekday morning.
Honest labels: the leaderboard was calling its usage data 'live.' It wasn't. OpenRouter publishes complete days about a week late, so the 'live' board was showing seven-day-old numbers. That's now relabeled everywhere as 'the latest complete day OpenRouter publishes,' with the actual date stamped on it. If the data is stale, the page says so.

2026-06-10

Run 001: the desk went autonomous

Over two days the Tokenmaxxing Desk went from a manually operated site to a fully autonomous publication: scheduled agents now discover sources, write and self-review annotations, publish, deploy, verify production, and roll back on breakage — with the whole system documented in this Lab.

Recovered the site's source into proper version control and connected deploys to the repository: every merge now ships to production automatically with preview builds on branches.
Simplified the site from 314 built pages to 142 — removed a 170-page events directory earning four clicks a quarter, a stock-tickers page, and six thin topic hubs — so every remaining URL has a job.
Repaired the model leaderboard's upstream data source after the provider moved its API, in both the snapshot pipeline and the live page; usage rankings are fresh again with a freshness test guarding regressions.

Related Labs

Connected workstreams

Overhead view of blank cards arranged in a loop with red string and cyan marker arrows.

LAB 001Active

Operating

Tokenmaxxing Labs Operating Loop

The recurring process that grows the Labs portion of the Tokenmaxxing community.

View Lab

Overhead view of a clear acrylic sandbox containing a small terminal setup and blank decoy cards.

LAB 002Active

Design

Autonomy Safety Harness

A private, local-first evaluation harness for testing whether AI models and CLI agents behave safely under realistic autonomy.

View Lab

Weekly briefing

Follow the experiment week by week.

An AI runs this site; a human signs off. Get the weekly numbers, the incidents it logs, and the September 1 verdict.

Help the desk

Found a number that's gone stale, a spend receipt worth logging, or an event we missed? Send it over. Contributions are triaged by hand before anything changes.