Incident Timeline Reconstruction with Claude Code: Cost Controls
A production playbook for incident timeline reconstruction in cross-industry operations using Claude Code: cost controls, run-scoped inputs, logs, typed results, and artifacts.
Audience: SRE and engineering leaders
The problem
SRE and engineering leaders need incident timeline reconstruction to run repeatedly against logs, deploy events, alerts, and chat exports. In cross-industry operations, the hard part is not getting one good answer; it is repeatability, auditability, exception handling, and evidence that survives handoff.
Implementation path
Set explicit limits for incident timeline reconstruction: input size, run time, tool calls, artifacts, retries, and concurrent runs per organization.
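The run limits listed later in this playbook do not include a value for concurrent runs per organization, so the following is only a minimal sketch of how such a cap might be enforced inside a single process. The class name OrgConcurrencyLimiter, its methods, and the cap of 2 used in the usage note are hypothetical, not part of Claude Code or any specific scheduler.

```python
import threading
from collections import defaultdict


class OrgConcurrencyLimiter:
    """Hypothetical in-process cap on concurrent runs per organization."""

    def __init__(self, max_concurrent_per_org: int) -> None:
        self._max = max_concurrent_per_org
        self._active = defaultdict(int)  # org_id -> number of active runs
        self._lock = threading.Lock()

    def try_acquire(self, org_id: str) -> bool:
        """Reserve a slot for org_id; return False if the org is at its cap."""
        with self._lock:
            if self._active[org_id] >= self._max:
                return False
            self._active[org_id] += 1
            return True

    def release(self, org_id: str) -> None:
        """Free a slot when a run finishes, whether it succeeded or not."""
        with self._lock:
            if self._active[org_id] > 0:
                self._active[org_id] -= 1
```

With a cap of 2, a third try_acquire for the same organization returns False until one of the earlier runs calls release. A multi-worker deployment would need a shared store rather than in-process state.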
Tradeoffs and failure modes
Limits reject pathological runs, and occasionally a legitimate but oversized one, but they keep one workflow from turning into an unbounded infrastructure bill. For incident timeline reconstruction, the practical test is whether a second run can be debugged, retried, and consumed by a product without reading the raw agent transcript.
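One way to make a run consumable without the raw transcript is to have every run emit a small typed result. The sketch below is illustrative only: RunStatus, RunResult, the field names, and the should_retry rule are assumptions made for this example, not a schema defined by Claude Code.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional, Tuple


class RunStatus(str, Enum):
    SUCCEEDED = "succeeded"
    REJECTED_LIMIT = "rejected_limit"
    FAILED = "failed"


@dataclass(frozen=True)
class RunResult:
    """Hypothetical typed outcome of one timeline-reconstruction run."""

    run_id: str
    status: RunStatus
    limit_hit: Optional[str] = None            # e.g. "max_tool_calls"
    retry_after_seconds: Optional[int] = None  # set only when a retry makes sense
    artifact_paths: Tuple[str, ...] = ()

    def to_json(self) -> str:
        """Serialize to JSON so downstream products never parse transcripts."""
        payload = asdict(self)
        payload["status"] = self.status.value
        payload["artifact_paths"] = list(self.artifact_paths)
        return json.dumps(payload, sort_keys=True)


def should_retry(result: RunResult) -> bool:
    # Assumption: the producer sets retry_after_seconds only for retryable outcomes.
    return result.retry_after_seconds is not None
```

A rejected run then carries the name of the limit it hit and a retry hint, which is usually enough to debug or reschedule it without opening the transcript.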
Run limits
max_run_seconds=1800
max_input_bytes=104857600
max_artifact_bytes=104857600
max_tool_calls=120
retry_after_seconds=60
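As a rough sketch of how these values could be loaded and checked, the example below parses the key=value lines into a typed object and reports which caps a run exceeded. RunLimits, parse_limits, and violations are hypothetical names introduced for illustration; only the five keys come from the limits above.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RunLimits:
    max_run_seconds: int
    max_input_bytes: int
    max_artifact_bytes: int
    max_tool_calls: int
    retry_after_seconds: int


def parse_limits(text: str) -> RunLimits:
    """Parse key=value lines; missing or unknown keys raise ValueError."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, raw = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {line!r}")
        values[key.strip()] = int(raw.strip())
    expected = {f.name for f in fields(RunLimits)}
    if set(values) != expected:
        raise ValueError(f"limit keys mismatch: {sorted(set(values) ^ expected)}")
    return RunLimits(**values)


def violations(limits: RunLimits, *, run_seconds: int, input_bytes: int,
               artifact_bytes: int, tool_calls: int) -> list[str]:
    """Return the names of every cap that the observed usage exceeds."""
    checks = [
        ("max_run_seconds", run_seconds, limits.max_run_seconds),
        ("max_input_bytes", input_bytes, limits.max_input_bytes),
        ("max_artifact_bytes", artifact_bytes, limits.max_artifact_bytes),
        ("max_tool_calls", tool_calls, limits.max_tool_calls),
    ]
    return [name for name, used, cap in checks if used > cap]
```

Note that 104857600 bytes is 100 MiB, so the input and artifact caps are identical here. Usage exactly at a cap is treated as allowed in this sketch; only values strictly above the cap count as violations.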