Overview
This page is a pragmatic checklist for shipping Coevolved-based agents in production. Use it to sanity-check the parts that tend to break first: loops, tools, costs, and data handling.
Reliability
- Stop conditions: define explicit stop conditions for every loop; unit test them.
- Retries: wrap flaky steps with retry policies (network calls, external APIs).
- Checkpointing: checkpoint before/after iterations for long-running agents.
- Idempotency: make tools safe to retry (or detect duplicates).
- Timeouts: enforce wall-clock time limits at the loop level.
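The retry, stop-condition, and timeout items above can be sketched in plain Python. This is a minimal illustration, not Coevolved's API: `retry`, `run_loop`, and all of their parameters are hypothetical names chosen for this example, and the retried exception types are an assumption you would adapt to your own tools.

```python
import time


def retry(fn, attempts=3, base_delay=0.1,
          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call fn(), retrying on the listed exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


def run_loop(step, is_done, state, max_steps=10, deadline_s=30.0,
             clock=time.monotonic):
    """Run step(state) until is_done(state), max_steps, or the deadline.

    Returns (state, reason), where reason is "done", "timeout", or "max_steps".
    """
    start = clock()
    for _ in range(max_steps):
        if is_done(state):
            return state, "done"
        if clock() - start > deadline_s:
            return state, "timeout"
        state = step(state)
    return state, ("done" if is_done(state) else "max_steps")
```

Passing `sleep` and `clock` as parameters keeps both functions easy to unit test: a test can substitute a fake clock to trigger the timeout branch without waiting. The deadline is only checked between iterations, so a single slow step still needs its own timeout.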
Cost controls
- Budgets: enforce UsagePolicy caps for steps/time/LLM calls/tool calls.
- Model choices: default to smaller models; promote to bigger models only when needed.
- Tool discipline: avoid unnecessary tool calls; validate tool args to reduce wasted iterations.
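To make the budget and tool-discipline items concrete, here is a rough sketch of a call counter with hard caps and a small argument check. It does not reproduce Coevolved's UsagePolicy; `Budget`, `BudgetExceeded`, and `validate_args` are hypothetical names, and the cap values are placeholders.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push a counter past its cap."""


@dataclass
class Budget:
    # Hypothetical caps for illustration; tune them per agent.
    max_llm_calls: int = 20
    max_tool_calls: int = 50
    used: dict = field(
        default_factory=lambda: {"llm_calls": 0, "tool_calls": 0}
    )

    def charge(self, kind: str, n: int = 1) -> None:
        """Record n units of usage, refusing the charge if it exceeds the cap."""
        cap = getattr(self, f"max_{kind}")
        if self.used[kind] + n > cap:
            raise BudgetExceeded(f"{kind}: {self.used[kind] + n} > {cap}")
        self.used[kind] += n


def validate_args(args: dict, required: dict) -> list:
    """Return a list of problems; an empty list means the args look usable."""
    problems = []
    for name, expected_type in required.items():
        if name not in args:
            problems.append(f"missing: {name}")
        elif not isinstance(args[name], expected_type):
            problems.append(f"wrong type: {name}")
    return problems
```

Calling `budget.charge("llm_calls")` before each model call and `validate_args(...)` before each tool call means a malformed request is rejected without spending an iteration on it.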
Security and data handling
- Logging: avoid logging full prompts and raw responses by default.
- Redaction: scrub sensitive fields before hashing or exporting traces.
- Secrets: don’t pass secrets through state unless required; prefer environment configuration.
- Tool sandboxing: treat tools as privileged code paths; validate inputs and outputs.
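The redaction item above could look something like the following. This is a sketch under assumptions: the set of sensitive key names is illustrative, and `redact` and `trace_hash` are hypothetical helpers rather than part of Coevolved.

```python
import hashlib
import json

# Assumed key names; extend this to match your own payloads.
SENSITIVE_KEYS = {"password", "api_key", "token", "authorization"}


def redact(value):
    """Return a copy with sensitive keys masked, recursing into dicts and lists."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if str(k).lower() in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


def trace_hash(event: dict) -> str:
    """Hash the redacted event, using sorted keys so the digest is stable."""
    payload = json.dumps(redact(event), sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because redaction happens before hashing, two events that differ only in a secret value produce the same digest, so the hash cannot be used to confirm a guessed secret.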
Operations
- Tracing: emit events to your logging/observability pipeline.
- Alerts: monitor error rates, timeouts, and budget-exceeded events.
- Prompt versioning: use stable prompt IDs and bump versions intentionally.
- Run metadata: tag runs with environment/version identifiers in your own sink pipeline.
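One way to combine the tracing, prompt-versioning, and run-metadata items is to emit structured JSON events through the standard `logging` module. The sketch below makes several assumptions: the `APP_ENV` and `APP_VERSION` environment variables, the `agent.events` logger name, and the `id@vN` prompt format are all hypothetical choices, not Coevolved conventions.

```python
import json
import logging
import os

logger = logging.getLogger("agent.events")

# Assumed environment variable names; read once at import time.
RUN_METADATA = {
    "environment": os.environ.get("APP_ENV", "dev"),
    "version": os.environ.get("APP_VERSION", "unknown"),
}


def build_event(kind: str, prompt_id: str, prompt_version: int, **fields) -> dict:
    """Build one event tagged with run metadata and a versioned prompt ID."""
    return {
        "kind": kind,
        "prompt": f"{prompt_id}@v{prompt_version}",
        **RUN_METADATA,
        **fields,
    }


def emit(event: dict) -> None:
    """Write the event as one JSON line for the log pipeline to pick up."""
    logger.info(json.dumps(event, sort_keys=True, default=str))
```

For example, `emit(build_event("budget_exceeded", "summarize", 3, steps=12))` produces a single line that an alerting rule can match on `kind`, while the environment and version tags let you separate staging noise from production incidents.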