Loop Engineering
Design the objective, feedback, verifier, memory, tools, and stop condition instead of hand-steering every prompt.
Loop engineering works when a model can run against goals, rubrics, feedback, memory, and verification. But every loop is also an inference workload: prefixes drift, cache reuse collapses, stale evidence fills context, routing gets harder, and serving constraints start to shape the result. Inferoa keeps those tokenmaxxing surfaces (cache reuse, context budget, routing, and serving limits) inside the harness, where the loop can see and act on them.
Keep each turn cache-aware, context-bounded, route-conscious, and measurable as the loop grows.
Expose context windows, prefix cache, model paths, endpoint signals, and serving constraints to the loop.
Use plans, tests, tool evidence, research metrics, verification, decisions, and completion reports to decide when to stop.
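
As a rough illustration of the loop design described above, the sketch below runs a step function until a verifier accepts its output or an attempt budget runs out. Every name here (LoopState, run_loop, toy_step) is hypothetical and chosen for illustration; none of it is Inferoa's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopState:
    """Hypothetical loop state: the objective plus evidence gathered so far."""
    objective: str
    evidence: List[str] = field(default_factory=list)
    attempts: int = 0
    done: bool = False


def run_loop(
    state: LoopState,
    step: Callable[[LoopState], str],
    verify: Callable[[str], bool],
    max_attempts: int = 5,
) -> LoopState:
    """Run step() until verify() accepts its output or the budget is spent."""
    while not state.done and state.attempts < max_attempts:
        state.attempts += 1
        result = step(state)           # one model/tool turn (stubbed here)
        state.evidence.append(result)  # keep evidence for the completion report
        state.done = verify(result)    # the verifier decides the stop condition
    return state


# Toy step: the stand-in "test suite" starts passing on the third attempt.
def toy_step(state: LoopState) -> str:
    return "tests: pass" if state.attempts >= 3 else "tests: fail"


final = run_loop(
    LoopState("make the test suite pass"),
    toy_step,
    lambda result: result.endswith("pass"),
)
print(final.attempts, final.done)
```

The stop condition is explicit in two places: the verifier result and the attempt budget. A real loop would likely add further budgets (tokens, wall-clock time, cost) alongside them.
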
Inferoa starts with coding because coding exposes loop pressure clearly: changing goals, tool failures, repeated model calls, context limits, memory needs, verifier signals, and proof through tests. The goal is to co-design the agent harness, loop controller, and inference stack so every turn spends context, cache, route choice, and serving capacity deliberately.
One durable outcome expands through loop tasks, evidence, decisions, recovery, and completion reports.
Plans, tests, tool results, and research metrics give the loop concrete feedback to improve against.
Prefix cache, context pressure, routing, multimodal endpoints, and serving constraints stay in the loop.
A restrained entry point for the configured model, workspace, and core commands.

Run /loop to start a long-horizon recursive loop with loop tasks, attempts, evidence, and decisions.

Ambiguous scope becomes an inspectable plan before execution starts.

Benchmark runs, failures, fixes, and metrics stay inside the loop decision flow.


High-performance serving is the base. Inferoa treats prefix-cache stability and endpoint signals as agent state.
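
One way to make prefix-cache stability measurable is to track how much of each turn's prompt repeats the previous turn's prompt from the start. The sketch below is a token-level approximation under stated assumptions: prompts are already tokenized into integer IDs, and the function names are hypothetical. Serving stacks often cache in fixed-size blocks, so actual reuse may be lower than this ratio suggests.

```python
from typing import Sequence


def shared_prefix_len(prev: Sequence[int], curr: Sequence[int]) -> int:
    """Count the leading tokens that two prompts have in common."""
    n = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        n += 1
    return n


def reuse_ratio(prev: Sequence[int], curr: Sequence[int]) -> float:
    """Fraction of the current prompt that could, at best, be served from a prefix cache."""
    if not curr:
        return 0.0
    return shared_prefix_len(prev, curr) / len(curr)


previous_turn = [1, 2, 3, 4]
appended_turn = [1, 2, 3, 4, 5, 6]  # new content appended: the prefix is stable
edited_turn = [1, 9, 3, 4, 5, 6]    # an early token changed: the prefix drifted

print(reuse_ratio(previous_turn, appended_turn))
print(reuse_ratio(previous_turn, edited_turn))
```

Appending to the end of the context preserves the shared prefix, while editing early content (for example, rewriting a system prompt or reordering tool results) discards most of it. That is the kind of signal a loop could record per turn.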

Routing belongs in the loop. Cost, safety, privacy, capability, and session pressure can choose the model path.
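
A minimal sketch of routing as a per-turn decision, assuming a simplified policy: filter model paths by context size and privacy, then pick the cheapest. The types, fields, model names, and prices are all hypothetical; safety and capability checks would be further filters of the same shape.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelPath:
    """Hypothetical description of one route the loop could take."""
    name: str
    cost_per_1k_tokens: float
    max_context: int
    local: bool  # True if data never leaves the operator's environment


@dataclass(frozen=True)
class TurnRequest:
    """Hypothetical per-turn routing inputs."""
    context_tokens: int  # session pressure: how much context this turn carries
    private: bool        # True if the turn must stay on a local path


def choose_path(paths: List[ModelPath], req: TurnRequest) -> Optional[ModelPath]:
    """Return the cheapest path that fits the context and privacy needs, or None."""
    eligible = [
        p for p in paths
        if p.max_context >= req.context_tokens and (p.local or not req.private)
    ]
    return min(eligible, key=lambda p: p.cost_per_1k_tokens, default=None)


paths = [
    ModelPath("small-local", 0.0, 8_000, True),
    ModelPath("large-local", 0.5, 64_000, True),
    ModelPath("hosted", 0.2, 128_000, False),
]

choice = choose_path(paths, TurnRequest(context_tokens=32_000, private=True))
print(choice.name if choice else "no eligible path")
```

Returning None when nothing qualifies lets the loop treat "no route" as evidence to act on, for example by trimming context, instead of silently falling back to a path that violates a constraint.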

Multimodal work stays native. Image, video, and audio understanding or generation live in the same durable session.