Loop Engineering
Design the goal, feedback, verifier, memory, tools, and stop condition instead of hand-steering every prompt.
Loop engineering works when a model can run against goals, rubrics, feedback, memory, and verification. But every loop is also an inference workload: prefixes drift, cache reuse collapses, stale evidence fills context, routing gets harder, and serving constraints start to shape the result. Inferoa keeps those tokenmaxxing surfaces inside the harness.
Keep each turn cache-aware, context-bounded, route-conscious, and measurable as the horizon grows.
Expose context windows, prefix cache, model paths, endpoint signals, and serving constraints to the loop.
Use plans, tests, tool evidence, research metrics, reflection, and completion reports to decide when to stop.
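As a rough illustration of the pieces named above (goal, feedback, verifier, memory, stop condition), here is a minimal sketch of a goal loop. It is not Inferoa's implementation: `LoopState`, `run_loop`, `act`, `verify`, `max_turns`, and `memory_limit` are hypothetical names chosen for this example, and the model call is replaced by a caller-supplied function.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LoopState:
    """Everything the loop carries from one turn to the next (hypothetical shape)."""
    goal: str
    memory: List[str] = field(default_factory=list)  # feedback kept between turns
    turns: int = 0
    done: bool = False


def run_loop(
    goal: str,
    act: Callable[[str, List[str]], object],       # stands in for a model or tool call
    verify: Callable[[object], Tuple[bool, str]],  # returns (passed, feedback)
    max_turns: int = 5,
    memory_limit: int = 3,
) -> LoopState:
    """Run act/verify turns until the verifier passes or the turn budget runs out."""
    state = LoopState(goal=goal)
    while state.turns < max_turns:
        state.turns += 1
        # The actor sees the goal plus a copy of the bounded memory.
        result = act(state.goal, list(state.memory))
        passed, feedback = verify(result)
        # Record feedback, then drop the oldest entries so context stays bounded.
        state.memory.append(feedback)
        state.memory = state.memory[-memory_limit:]
        if passed:
            state.done = True  # stop condition: verifier accepted the result
            break
    return state
```

In this sketch the stop condition is either a passing verifier or an exhausted turn budget, and `memory_limit` is the only guard against stale evidence filling the context. A real harness would likely weigh more signals, such as test results or a completion report.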
Inferoa starts with coding because coding exposes loop pressure clearly: changing goals, tool failures, repeated model calls, context limits, memory needs, verifier signals, and proof through tests. The goal is to co-design the agent harness, goal loop, and inference stack so every turn spends context, cache, route choice, and serving capacity deliberately.
One durable outcome expands through horizons, evidence, reflection, recovery, and completion reports.
Plans, tests, tool results, and research metrics give the loop concrete feedback to improve against.
Prefix cache, context pressure, routing, multimodal endpoints, and serving constraints stay in the loop.
A restrained entry point for the configured model, workspace, and core commands.

Run /goal to start a long-horizon recursive goal with horizons, evidence, and reflection.

Ambiguous scope becomes an inspectable plan before execution starts.

Benchmark runs, failures, fixes, and metrics stay inside the goal reflection loop.


High-performance serving is the base. Inferoa treats prefix-cache stability and endpoint signals as agent state.

Routing belongs in the loop. Cost, safety, privacy, capability, and session pressure can choose the model path.
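One way to picture routing inside the loop is as a filter followed by a cost comparison. The sketch below is an assumption-laden illustration, not Inferoa's router: `ModelPath`, `choose_path`, and every field name are hypothetical, session pressure is reduced to a token count checked against the context window, and safety is left out entirely.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ModelPath:
    """A candidate route to a model endpoint (hypothetical fields)."""
    name: str
    cost_per_1k_tokens: float
    context_window: int
    private: bool                 # True if data stays on infrastructure the user controls
    capabilities: FrozenSet[str]  # e.g. {"code", "vision"}


def choose_path(
    paths: List[ModelPath],
    needed: FrozenSet[str],
    context_tokens: int,
    require_private: bool,
) -> Optional[ModelPath]:
    """Return the cheapest path that satisfies capability, context, and privacy limits."""
    eligible = [
        p for p in paths
        if needed <= p.capabilities                # has every required capability
        and context_tokens <= p.context_window     # current session fits the window
        and (p.private or not require_private)     # privacy constraint, if any
    ]
    if not eligible:
        return None  # no route fits; the caller decides how to recover
    return min(eligible, key=lambda p: p.cost_per_1k_tokens)
```

Because the choice is recomputed from the current session state, the same goal can move to a different path as context grows or as a turn starts to need a new capability.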

Multimodal work stays native. Image, video, and audio understanding or generation live in the same durable session.