
Inference-native Tokenmaxxing Agent Harness

Inference-native Tokenmaxxing Loop Engineering
Why loops break

Loops fail when inference is invisible.

Loop engineering works when a model can run against goals, rubrics, feedback, memory, and verification. But every loop is also an inference workload: prefixes drift, cache reuse collapses, stale evidence fills context, routing gets harder, and serving constraints start to shape the result. Inferoa keeps those tokenmaxxing surfaces inside the harness.

Three words, one runtime

Loop Engineering needs Tokenmaxxing.

01

Loop Engineering

Design the goal, feedback, verifier, memory, tools, and stop condition instead of hand-steering every prompt.
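As a rough illustration of the pieces named above, here is a minimal sketch of a loop with a goal, a memory of past results, a verifier, and a stop condition. It is not Inferoa's API: `LoopState`, `run_loop`, `fake_step`, and the turn budget are hypothetical names chosen for this example, and the model call is replaced by a stand-in function.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopState:
    goal: str
    memory: List[str] = field(default_factory=list)  # evidence carried between turns
    turns: int = 0


def run_loop(
    state: LoopState,
    step: Callable[[LoopState], str],
    verify: Callable[[str], bool],
    max_turns: int = 5,
) -> bool:
    """Run until the verifier accepts a result or the turn budget is spent."""
    while state.turns < max_turns:
        state.turns += 1
        result = step(state)         # one model/tool turn, supplied by the caller
        state.memory.append(result)  # record feedback for the next turn
        if verify(result):           # stop condition: the verifier passes
            return True
    return False                     # budget exhausted without a verified result


# Stand-in step function (hypothetical): a real loop would call a model here.
def fake_step(state: LoopState) -> str:
    return f"attempt {state.turns}"


state = LoopState(goal="make the test suite pass")
done = run_loop(state, fake_step, verify=lambda r: r == "attempt 3")
```

The point of the sketch is that the caller designs the verifier and the budget once, and the loop decides when to stop, rather than a person steering each prompt.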

02

Tokenmaxxing

Keep each turn cache-aware, context-bounded, route-conscious, and measurable as the horizon grows.
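One way to picture "cache-aware" and "context-bounded" together is to keep the leading prefix unchanged from turn to turn, which is what allows a prefix cache to be reused, and to fill the remaining token budget with the most recent evidence. The sketch below assumes exactly that policy. `count_tokens` and `build_context` are hypothetical helpers, and counting whitespace-separated words is a crude stand-in for a real tokenizer.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def build_context(prefix: str, evidence: List[str], budget: int) -> List[str]:
    """Keep the prefix fixed and fill the remaining budget with the newest evidence."""
    remaining = budget - count_tokens(prefix)
    kept: List[str] = []
    for item in reversed(evidence):         # walk from newest to oldest
        cost = count_tokens(item)
        if cost > remaining:
            break                           # this item and everything older is dropped
        kept.append(item)
        remaining -= cost
    return [prefix] + list(reversed(kept))  # restore chronological order


context = build_context(
    prefix="system: you are a coding agent",
    evidence=["test A failed", "patched module B", "test A passed"],
    budget=12,
)
```

With a 12-token budget in this example, the oldest evidence item is dropped while the prefix stays identical, so the measurable quantities per turn are the tokens spent and the items evicted.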

03

Inference-native runtime

Expose context windows, prefix cache, model paths, endpoint signals, and serving constraints to the loop.

04

Proof-oriented loops

Use plans, tests, tool evidence, autoresearch metrics, reflection, and completion reports to decide when to stop.

Mission

Design loops with inference feedback.

Inferoa starts with coding because coding exposes loop pressure clearly: changing goals, tool failures, repeated model calls, context limits, memory needs, verifier signals, and proof through tests. The goal is to co-design the agent harness, goal loop, and inference stack so every turn spends context, cache, route choice, and serving capacity deliberately.

01 Goal and rubric feedback

One durable outcome expands through horizons, evidence, reflection, recovery, and completion reports.

02 Verifier-ready evidence

Plans, tests, tool results, and autoresearch metrics give the loop concrete feedback to improve against.

03 Inference stays visible

Prefix cache, context pressure, routing, multimodal endpoints, and serving constraints stay in the loop.

Quick Look

Inside a Session

01

Welcome

A restrained entry point for the configured model, workspace, and core commands.

02

Goal Mode

Run /goal to start a long-horizon recursive goal with horizons, evidence, and reflection.

03

Plan Mode

Ambiguous scope becomes an inspectable plan before execution starts.

04

Autoresearch

Benchmark runs, failures, fixes, and metrics stay in one research loop.

Cross-stack path

Across the Tokenmaxxing Stack

  1. Goal Loop: recursive horizons + reflection
  2. Agent Harness: sessions, tools, evidence
  3. Tokenmaxxing: prefix, context, routing
  4. vLLM Serving: Engine + Omni