LAB 002 / Design

Autonomy Safety Harness

Tokenmaxxing needs useful autonomous agents, but those agents should be evaluated in controlled sandboxes before being trusted with real workflows.

Status: Active
Current phase: Design
2026-06-04
1 public update / 1 total record

Charter

Design a controlled evaluation system that compares models and CLI harnesses under identical realistic scenarios using fake data, fake resources, and observable behavior.
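
One way to read "identical realistic scenarios" is as a full cross product: every model is paired with every CLI harness on every scenario, so a difference in outcome can be traced to one axis at a time. The sketch below shows that matrix under stated assumptions. `RunSpec`, `build_matrix`, and all model, harness, and scenario names are hypothetical placeholders, not candidates the Lab has chosen.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RunSpec:
    """One cell of the comparison matrix (all fields are placeholder names)."""
    model: str
    harness: str
    scenario: str


def build_matrix(models, harnesses, scenarios):
    """Pair every model with every harness on every scenario."""
    return [RunSpec(m, h, s) for m, h, s in product(models, harnesses, scenarios)]


# Hypothetical placeholders; the Lab has not selected any of these.
matrix = build_matrix(
    models=["model-a", "model-b"],
    harnesses=["harness-x", "harness-y"],
    scenarios=["scenario-1"],
)
for spec in matrix:
    print(spec)
```

Holding the scenario fixed while varying only the model, or only the harness, is what would let a later analysis compare like with like.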

What does success look like

  • The harness design has clear safety boundaries.
  • Runs use ephemeral sandboxes and controlled fake resources.
  • Observable behavior is logged instead of relying on self-reported intent.
  • Scenarios distinguish model failures from harness failures.
  • Severity scoring captures both unsafe shortcuts and positive behaviors.
  • Public updates explain safe methodology without exposing sensitive scenario internals.
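
Several of the criteria above could share one record shape: an event that the sandbox observed, tagged with where it originated and how it scores. The following is a minimal sketch, not the Lab's design. It assumes a signed integer severity (negative for unsafe shortcuts or failures, positive for good behaviors), and the names `Source`, `ObservedEvent`, `summarize`, and every action string are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    MODEL = "model"      # the behavior originated in the model's decision
    HARNESS = "harness"  # the behavior originated in the CLI harness or tooling


@dataclass(frozen=True)
class ObservedEvent:
    """Something the sandbox recorded, not something the agent said it did."""
    action: str    # hypothetical label, e.g. "asked_for_confirmation"
    source: Source
    severity: int  # assumption: negative = unsafe or failed, positive = good


def summarize(events):
    """Total negative and positive scores separately, per source."""
    summary = {s: {"negative": 0, "positive": 0} for s in Source}
    for e in events:
        bucket = "negative" if e.severity < 0 else "positive"
        summary[e.source][bucket] += abs(e.severity)
    return summary


# Illustrative events only; these are not real scenario results.
events = [
    ObservedEvent("skipped_confirmation", Source.MODEL, -3),
    ObservedEvent("asked_for_confirmation", Source.MODEL, 2),
    ObservedEvent("tool_call_timeout", Source.HARNESS, -1),
]
result = summarize(events)
print(result[Source.MODEL])
print(result[Source.HARNESS])
```

Keeping negative and positive totals apart, rather than netting them, means a good behavior cannot hide an unsafe shortcut in the same run.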

Current state

The Lab has been spawned as a focused design conversation. Public site updates should stay at the methodology and progress level.

Boundaries

  • Do not expose real credentials, customer data, production systems, or third-party targets.
  • Do not publish bait strings, canary values, sensitive scenario details, or harness internals without explicit approval.
  • Do not start implementation until design and safety boundaries are clear.
  • Keep the evaluation defensive, controlled, and local-first.
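
The publishing boundary above could be backed by a simple pre-publication check that compares a draft public update against a privately held list of sensitive strings. This is a sketch under assumptions: `find_leaks` is a hypothetical helper, and the values shown are made-up placeholders, not real bait strings or canary values.

```python
def find_leaks(draft: str, sensitive_values: list[str]) -> list[int]:
    """Return the indexes of sensitive values found in the draft.

    Indexes are returned instead of the values themselves so that the
    check's own output is safe to log.
    """
    lowered = draft.lower()
    return [i for i, v in enumerate(sensitive_values) if v and v.lower() in lowered]


# Placeholder values for illustration only; real values would be loaded
# from a private, local file and never committed or published.
sensitive = ["EXAMPLE-CANARY-0001", "example-bait-string"]

draft = "Seeded a defensive Lab for designing a local-first evaluation harness."
leaks = find_leaks(draft, sensitive)
if leaks:
    raise SystemExit(f"Blocked: draft matches {len(leaks)} sensitive value(s)")
print("Draft passed the leak check")
```

A substring match like this only catches verbatim leaks; paraphrased scenario details would still need the explicit human approval the boundary calls for.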

Decision highlights

  • Lab 002 should run in its own conversation.
  • The public website should not expose sensitive scenario details.
  • The first phase is design and safety-boundary clarification.

Open questions

  • Which CLI harnesses should be compared first?
  • What minimum containment should be required before prototype work starts?
  • What public methodology can be shared without making the evaluations easy to overfit?

Next actions

  • Complete the design in the Lab 002 conversation.
  • Define a public-safe update format for methodology progress.
  • Keep implementation blocked until safety boundaries are approved.

Update history

Public progress from this Lab

2026-06-04

Autonomy safety harness seeded

Seeded a defensive Lab for designing a private, local-first evaluation harness for autonomous AI agents.

  • Established the Lab as a separate focused conversation.
  • Set the public scope around methodology, safety boundaries, and progress.
  • Kept sensitive scenario internals out of the public record.

Related Labs

Connected workstreams

LAB 001 / Operating
Status: Active

Tokenmaxxing Labs Operating Loop

The recurring process that grows the Labs portion of the Tokenmaxxing community.
