State & Checkpoints Guide
State checkpoints let you save and restore agent data at any point. This is essential for recovering from corrupted outputs, debugging issues, and auditing what changed.
The Problem
Without checkpoints:
- Agent writes bad data → data corrupted
- No way to see what changed
- No way to roll back
- Weeks of work lost
With checkpoints:
- Agent writes bad data
- You identify the issue in audit logs
- Roll back to the checkpoint before corruption
- Continue from known-good state
Creating Checkpoints
from anchor import Anchor

anchor = Anchor(api_key="anc_...")

# Create a checkpoint before risky operations
checkpoint = anchor.checkpoints.create(
    agent_id="agent-123",
    label="before-migration"  # Optional human-readable label
)
print(f"Checkpoint ID: {checkpoint.id}")
print(f"Created: {checkpoint.created_at}")
print(f"Data entries: {checkpoint.entry_count}")

When to Checkpoint
Create checkpoints at natural boundaries:
from datetime import date

# Before batch operations
checkpoint = anchor.checkpoints.create(agent.id, label="pre-batch-import")
try:
    for item in large_dataset:
        anchor.data.write(agent.id, item.key, item.value)
except Exception:
    # Undo any partial writes, then re-raise the original error
    anchor.checkpoints.restore(agent.id, checkpoint.id)
    raise

# Before deployments
checkpoint = anchor.checkpoints.create(agent.id, label="pre-deploy-v2.1")
deploy_new_agent_version()

# Daily automatic checkpoints
checkpoint = anchor.checkpoints.create(agent.id, label=f"daily-{date.today()}")

Rolling Back
# Roll back to a specific checkpoint
result = anchor.checkpoints.restore(
    agent_id="agent-123",
    checkpoint_id="chk_abc123"
)
print(f"Rolled back to: {result.checkpoint_id}")
print(f"Entries restored: {result.entries_restored}")
print(f"Entries removed: {result.entries_removed}")

Safe Operations Pattern
def safe_batch_operation(agent_id: str, items: list):
    # 1. Create checkpoint
    checkpoint = anchor.checkpoints.create(agent_id, label="pre-batch")
    try:
        # 2. Perform operation
        for item in items:
            anchor.data.write(agent_id, item.key, item.value)
        # 3. Verify results
        if not verify_data_integrity(agent_id):
            raise ValueError("Data integrity check failed")
    except Exception as e:
        # 4. Roll back on failure, keeping the original error as the cause
        anchor.checkpoints.restore(agent_id, checkpoint.id)
        raise RuntimeError(f"Batch failed, rolled back: {e}") from e

Comparing Checkpoints
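The SDK calls in this section list checkpoints and fetch their details; the comparison step itself is left to the caller. One way to approach it is sketched here, assuming the entries of each checkpoint have already been loaded into a plain dict of key to value. How to obtain that mapping from the checkpoint details is not specified in this guide, and `diff_entries` and `EntryDiff` are hypothetical helpers, not part of the SDK.

```python
from typing import Any, Dict, List, NamedTuple


class EntryDiff(NamedTuple):
    added: List[str]    # keys present only in the newer snapshot
    removed: List[str]  # keys present only in the older snapshot
    changed: List[str]  # keys present in both, with different values


def diff_entries(older: Dict[str, Any], newer: Dict[str, Any]) -> EntryDiff:
    """Compare two key/value snapshots and report what differs."""
    older_keys, newer_keys = set(older), set(newer)
    added = sorted(newer_keys - older_keys)
    removed = sorted(older_keys - newer_keys)
    changed = sorted(k for k in older_keys & newer_keys if older[k] != newer[k])
    return EntryDiff(added, removed, changed)


# Hypothetical snapshot contents, for illustration only.
before = {"user:1": "alice", "user:2": "bob", "plan": "free"}
after = {"user:1": "alice", "plan": "pro", "user:3": "carol"}

diff = diff_entries(before, after)
print(diff)
# EntryDiff(added=['user:3'], removed=['user:2'], changed=['plan'])
```

The same helper can back a post-rollback check: diffing the restored state against the checkpoint it came from should yield three empty lists.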
# List checkpoints
checkpoints = anchor.checkpoints.list(agent.id)
for cp in checkpoints:
    print(f"{cp.id}: {cp.label} ({cp.created_at})")

# Get checkpoint details to compare
# (after the loop, cp is the last checkpoint listed; this fails if the list is empty)
details = anchor.checkpoints.get(agent.id, cp.id)

Best Practices
- Checkpoint before risky operations (batch imports, deployments, experiments)
- Use descriptive labels (e.g., "pre-deploy-v2.1" not "backup")
- Set up retention policies to avoid keeping checkpoints forever
- Verify after rollback to ensure it worked correctly
- Document rollbacks in audit logs
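The retention bullet above can be made concrete with a small selection function. This is only a sketch: `CheckpointInfo` is a hypothetical stand-in for whatever the list call returns, `created_at` is assumed to be a `datetime`, and the IDs and dates are invented for illustration. The guide does not show a delete call, so the sketch stops at choosing which checkpoints are eligible for removal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class CheckpointInfo:
    # Hypothetical record; field names mirror the ones printed in this guide.
    id: str
    label: str
    created_at: datetime


def select_expired(
    checkpoints: List[CheckpointInfo],
    now: datetime,
    max_age_days: int = 30,
    keep_latest: int = 5,
) -> List[CheckpointInfo]:
    """Return checkpoints older than max_age_days, always sparing the newest keep_latest."""
    newest_first = sorted(checkpoints, key=lambda cp: cp.created_at, reverse=True)
    cutoff = now - timedelta(days=max_age_days)
    protected = {cp.id for cp in newest_first[:keep_latest]}
    return [
        cp for cp in newest_first
        if cp.id not in protected and cp.created_at < cutoff
    ]


now = datetime(2024, 6, 30)
history = [
    CheckpointInfo("chk_1", "daily-2024-01-01", datetime(2024, 1, 1)),
    CheckpointInfo("chk_2", "pre-deploy-v2.1", datetime(2024, 5, 1)),
    CheckpointInfo("chk_3", "daily-2024-06-29", datetime(2024, 6, 29)),
]

expired = select_expired(history, now, max_age_days=30, keep_latest=1)
print([cp.id for cp in expired])
# ['chk_2', 'chk_1']
```

Protecting the newest few checkpoints regardless of age means an agent that has been idle for months still has at least one restore point.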
For more details, see the Checkpoints API reference.