Long-running Autonomy & State Adaptation

Multi-step workflows over longer task horizons that involve memory, interruptions, state changes, and replanning.

11 tasks | Hooks on selected tasks | Oracle + LLM scoring