Decision · TL;DR
A first-class AgentPlan any agent can produce: research → present_plan → approve → execute. Lives in mods/ai_agent.
"Let this agent plan" is a config flag (any agent, default off). On → a planning prompt tier + the present_plan tool. The agent does the tool-use itself; no bespoke planner extractor.
HumanReview · Auto · RiskGated. Bulk SMS → HumanReview (sends real messages). Low-risk plans can auto-approve.
A BulkSmsCampaign plan action. On approval it creates the bulk operation + spawns today's drafting engine. Engine barely changes.
What this buys over v1
Reusable across every agent (not bulk-only) · the agent's real reasoning + tools build the audience (not a one-shot extractor) · auto-approval for low-risk tasks falls out of the same model · and the bulk engine shrinks back to "create + draft + send," with the pre-drafting work owned by the plan.
01 Why a planning primitive
Planning a complex task is a capability we want for any agent, not a feature of one workflow. Claude Code already models this well — we adapt it.
What we borrow from Claude Code plan mode
- Read-only plan phase. The agent explores with safe/read tools; mutations are withheld until a plan is approved. (In CC, writes never auto-approve in plan mode.)
- Present → approve → execute. A discrete proposal, a decision, then execution under a chosen permission posture.
- Approval is a policy, not always a human. CC's permission modes (default · acceptEdits · plan · auto · dontAsk · bypassPermissions) span interactive → autonomous, and canUseTool / hooks decide approval programmatically per action.
Where we deliberately diverge
In Claude Code the plan is free-form markdown, consumed once by a human while the same agent keeps going and executes its own tool calls. We can't do that here — a separate durable engine executes (we rejected "agent sends 200 messages in a thread," option ⑤). So our plan is a structured, executable artifact a downstream worker consumes. We take CC's lifecycle and approval model; the plan itself is typed. This is the "plan-and-execute" agent pattern rather than "plan-then-continue."
Approval policy ← permission modes
| Claude Code mode | Posture | Our ApprovalPolicy |
|---|---|---|
| plan → review & approve | human gates the plan | HumanReview: persist plan, surface card, pause for approve/edit/reject |
| auto / acceptEdits | proceeds with soft/no gate | Auto: approved on submission, execute immediately |
| canUseTool / permission_policy | programmatic per-action | RiskGated: a hook inspects the plan (gate high-impact, auto low-impact) |
"Some plans auto-approve so the human isn't in the loop" = Auto / RiskGated. Sending real SMS is high-impact → HumanReview in v1, but the enum + hook exist so other agents/tasks can auto-approve without a rebuild.
Lands directly on #1476's tiered system prompt
#1476 is rebuilding agent instructions as layers in build_system_prompt: Tier 0 PLATFORM_CORE (identity · "you act only by calling tools" · operator-vs-contact trust · grounding), Tier 0.5 org_rules + agent_rules (from config_payload), Tier 1 delivery_directive. Planning is just another tier — a PLANNING_BLOCK the executor injects when the agent is planning-enabled, exactly like it injects PLATFORM_CORE. No new prompt machinery; we add one gated block + one tool.
02 Anatomy of an AgentPlan
A general envelope (human-readable + auditable) wrapping a typed, extensible action. The action is what an executor knows how to run.
struct AgentPlan {
    id: Uuid,
    organization_id: Uuid,
    agent_id: Uuid,                  // which agent authored it (e.g. the assistant clone)
    thread_id: Option<Uuid>,         // the conversation it came from
    title: String,                   // "Follow up with 142 quiet quote leads"
    summary: String,                 // human-readable rationale, rendered at review
    action: PlanAction,              // the typed, executable payload (below)
    approval_policy: ApprovalPolicy,
    status: PlanStatus,
    created_by: Option<Uuid>,
    approved_by: Option<Uuid>,
    approved_at: Option<DateTime>,
    executed_ref: Option<Uuid>,      // e.g. the assistant_operation it spawned
}

enum PlanAction { // extensible; one variant implemented in v1
    BulkSmsCampaign {
        audience: Vec<ContactDecision>, // { contact_id, include, reason, kind }
        message_concept: String,        // the idea, not the final text
    },
    // future: ContactCleanup { … }, ScheduleFollowUps { … }, EnrichContacts { … }
}

enum ApprovalPolicy { HumanReview, Auto, RiskGated }

enum PlanStatus { Drafting, AwaitingApproval, Approved, Rejected, Executing, Done, Failed }
Structure comes free from the tool schema
The agent fills this by calling present_plan(plan) — a built-in ai_agent tool whose input schema is the plan. So we get the agent's tool-use reasoning and a typed artifact, with none of the "force a submit tool to get JSON" fragility — the plan submission is that tool. (CC's ExitPlanMode is the same idea; ours just carries structure because an engine executes it.)
03 The plan lifecycle
Two gates remain, but they're now different kinds: plan approval (general, in the thread — the audience + the idea) and draft review (bulk-specific, the existing sheet — the actual messages). Auto-approval skips the first; the second stays a deliberate safety stop for sending real SMS.
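The plan half of this lifecycle can be read as a small state machine over PlanStatus. The sketch below is one possible reading: the statuses come from the enum above, but the edge list and the `can_transition` helper are assumptions for illustration, not something this document specifies.

```rust
/// Mirrors the PlanStatus enum from the anatomy section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PlanStatus {
    Drafting,
    AwaitingApproval,
    Approved,
    Rejected,
    Executing,
    Done,
    Failed,
}

/// Hypothetical helper: true when `from -> to` is an allowed step.
/// The edge list is an assumption, not taken from the source.
fn can_transition(from: PlanStatus, to: PlanStatus) -> bool {
    use PlanStatus::*;
    matches!(
        (from, to),
        (Drafting, AwaitingApproval)       // HumanReview, or RiskGated that gates
            | (Drafting, Approved)         // Auto, or RiskGated that proceeds
            | (AwaitingApproval, Approved)
            | (AwaitingApproval, Rejected)
            | (Approved, Executing)
            | (Executing, Done)
            | (Executing, Failed)
    )
}
```

Under these assumed edges a plan can never reach Executing without passing through Approved, which is the property the first gate is meant to guarantee.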
04 Approval & auto-approve
One resolver decides whether a plan pauses for a human. It's the seam where "agent autonomy" is tuned.
fn resolve_approval(agent: &Agent, plan: &AgentPlan) -> Decision {
    match plan.approval_policy {
        ApprovalPolicy::HumanReview => Decision::Gate,
        ApprovalPolicy::Auto => Decision::Proceed,
        ApprovalPolicy::RiskGated => assess_risk(plan), // the canUseTool / permission_policy analog
    }
}
// v1 assess_risk: sends messages OR spends money OR mutates > N rows ⇒ Gate; else Proceed.
- v1 stance: BulkSmsCampaign is hardwired to HumanReview (real messages, real cost, irreversible). No auto-send in v1.
- The hook is the autonomy dial. Per-agent default policy + per-action override + assess_risk. Today it always gates bulk; tomorrow a trusted agent can auto-approve a read-only enrichment plan.
- Audit: every plan records policy + decision + approver + timestamp, so an auto-approved plan is as inspectable as a gated one.
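The v1 assess_risk rule quoted above ("sends messages OR spends money OR mutates > N rows ⇒ Gate") might look roughly like this. The `PlanRisk` struct, its field names, and the threshold value are hypothetical; the document only states the rule, not how a plan's risk is derived, and the real function takes the plan itself.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Gate,
    Proceed,
}

/// Hypothetical risk summary of a plan; the field names are illustrative.
struct PlanRisk {
    sends_messages: bool,
    spends_money: bool,
    rows_mutated: usize,
}

/// Assumed value standing in for the "N" in the v1 rule.
const MAX_AUTO_MUTATED_ROWS: usize = 25;

/// v1 rule: sends messages OR spends money OR mutates > N rows => Gate.
fn assess_risk(risk: &PlanRisk) -> Decision {
    if risk.sends_messages
        || risk.spends_money
        || risk.rows_mutated > MAX_AUTO_MUTATED_ROWS
    {
        Decision::Gate
    } else {
        Decision::Proceed
    }
}
```

A bulk SMS plan trips the first condition and always gates, while a read-only enrichment plan with no sends, no spend, and zero mutated rows would proceed.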
Read-only planning is the safety invariant
The planning turn must not mutate — it only proposes. The single mutation (creating + sending the operation) happens in PlanExecutor after approval. That mirrors CC's "no writes in plan mode" and makes auto-approval safe to reason about: an un-approved plan has changed nothing.
05 Bulk SMS as the first consumer
The feature you started from is now one PlanAction + one executor handler. Everything campaign-specific lives here; the primitive stays generic.
v1 design (retired)
- Bespoke bulk_plan_engine_service running a one-shot CampaignPlan extractor.
- New planning / awaiting_plan_approval statuses on the bulk operation.
- Audience + concept baked into the bulk engine.
v2 design (this)
- The assistant agent plans via its own tool-use; emits present_plan(BulkSmsCampaign).
- Plan lifecycle lives in ai_agent; the operation is created only after approval and starts at drafting as today.
- delegate_bulk_operation is no longer agent-facing; it becomes the executor handler.
What happens to delegate_bulk_operation
Today the agent calls it to create-and-send directly. After: the agent calls present_plan instead; the PlanExecutor calls the existing create_bulk_operation + spawn_draft_engine path (the body of today's delegate) on approval. The create/spawn code is reused verbatim — only its trigger moves from "agent tool" to "plan execution."
06 The executor we reuse
The bulk engine is already durable and well-built. It executes the approved plan unchanged, save for consuming the concept.
| Piece | Where | Role under the plan | Change |
|---|---|---|---|
| create_bulk_operation + spawn_draft_engine | services/bulk_operation_engine_service.rs | Executor handler for BulkSmsCampaign | reuse |
| Drafter extractor | services/follow_up_drafter_extractor_service.rs | Per-contact draft; now also takes the concept | small change |
| Gate 2 review sheet | components/operation_review_sheet_component.rs | Per-message review before send | reuse |
| Cards · websockets · reaper | operation_card · bulk_operation_event_type · reap_stale…job | Live progress + durability | reuse |
| Statuses | bulk_operation_status_type.rs | Unchanged: drafting → awaiting_approval → sending | reuse |
Compliance gap to close regardless (Phase 1)
create_bulk_operation pre-skips only no_phone — it never consults contact_channel_opt_out. Keep a deterministic opt-out hard-filter in the executor as a safety net even though the planner already avoids opted-out contacts. Defense in depth; ships on its own.
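A minimal sketch of that deterministic hard-filter, assuming the opted-out contact ids have already been loaded into a set (for example from contact_channel_opt_out). The `ContactId` alias and the `filter_opted_out` name are hypothetical, and a plain integer stands in for a UUID to keep the example std-only.

```rust
use std::collections::HashSet;

/// Stand-in for a contact UUID; a plain integer keeps the sketch std-only.
type ContactId = u64;

/// Splits the approved audience into (kept, skipped) using a deterministic
/// opt-out lookup. Order within each half follows the input order.
fn filter_opted_out(
    audience: &[ContactId],
    opted_out: &HashSet<ContactId>,
) -> (Vec<ContactId>, Vec<ContactId>) {
    audience
        .iter()
        .copied()
        .partition(|id| !opted_out.contains(id))
}
```

Returning the skipped ids as well as the kept ones leaves room to record why a contact the planner included was dropped at execution time.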
07 Data model
One new general table; a two-column touch on the bulk operation. The audience/exclusions live in the plan, so the item schema is untouched.
New: ai_agent_plan (general)
CREATE TABLE ai_agent_plan (
    id               UUID PRIMARY KEY,
    organization_id  UUID NOT NULL REFERENCES organization(id),
    agent_id         UUID NOT NULL REFERENCES ai_agent(id),
    thread_id        UUID REFERENCES ai_thread(id),
    title            VARCHAR(255) NOT NULL,
    summary          TEXT NOT NULL,
    action_type      VARCHAR(48) NOT NULL,  -- "bulk_sms_campaign"
    action_payload   JSONB NOT NULL,        -- the typed PlanAction (audience + concept)
    approval_policy  VARCHAR(24) NOT NULL,  -- human_review | auto | risk_gated
    status           VARCHAR(24) NOT NULL,  -- drafting | awaiting_approval | …
    created_by       UUID REFERENCES "user"(id),
    approved_by      UUID REFERENCES "user"(id),
    approved_at      timestamptz,
    executed_ref     UUID,                  -- the assistant_operation it spawned
    created_at       timestamptz NOT NULL,
    updated_at       timestamptz NOT NULL
);
CREATE INDEX idx_ai_agent_plan_org_status ON ai_agent_plan (organization_id, status);
CREATE INDEX idx_ai_agent_plan_org_status ON ai_agent_plan (organization_id, status);
Touch change — assistant_operation
ALTER TABLE assistant_operation
    ADD COLUMN message_concept TEXT,                        -- the approved idea, used by the drafter
    ADD COLUMN plan_id UUID REFERENCES ai_agent_plan(id);   -- provenance
No new column — the toggle lives in config_payload reuse
Planning is opt-in per agent via the existing ai_agent.config_payload JSONB (same place #1476 reads agent_rules): planning_enabled: bool (default false) + plan_approval: "human_review"|"auto"|"risk_gated" (default human_review). No schema change on ai_agent — mirrors how agent_rules is stored, and keeps the cached static prompt prefix byte-stable when off.
What we no longer need (vs v1)
No new bulk-operation statuses; no item excluded status; no item exclusion columns; no AiArea::AssistantBulkPlan. The pre-drafting phase is the plan's, not the operation's — items exist only for included contacts and draft as today.
Schemas are gitignored & auto-generated
After the migration, run just generate (the dev DB must be migrated first). Never hand-edit src/bases/db/schemas/.
08 Planning as an agent capability
Not assistant-only. Any agent can be given planning — the owner toggles it on, and the runtime layers the capability onto #1476's tiered prompt.
The toggle (per-agent config)
Planning rides on the same config_payload that #1476/#1478 use for agent_rules. Two keys, default off:
// ai_agent.config_payload
{
"agent_rules": "…", // (#1476)
"planning_enabled": true, // "Let this agent plan complex tasks" — default false
"plan_approval": "human_review" // human_review | auto | risk_gated — default human_review
}
- Surfaced in the agent config UI (the #1478 instructions/rules editor): a "Planning" toggle + an approval-mode selector ("Always review my agent's plans" / "Let it proceed on low-risk plans"). This is the "user can check if they want the agents to plan" control.
- Generic across agent types — Text Reply, voice, the assistant, custom agents. Default off means today's behaviour is unchanged until an owner opts in.
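The two keys and their defaults can be sketched as follows. This assumes the values have already been pulled out of config_payload as optional fields; `resolve_planning_config` and `PlanningConfig` are hypothetical names, and falling back to human_review on an unrecognised string is an assumption about the safest behaviour rather than something specified here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ApprovalPolicy {
    HumanReview,
    Auto,
    RiskGated,
}

impl ApprovalPolicy {
    /// Parses the snake_case strings stored under `plan_approval`.
    fn parse(raw: &str) -> Option<Self> {
        match raw {
            "human_review" => Some(Self::HumanReview),
            "auto" => Some(Self::Auto),
            "risk_gated" => Some(Self::RiskGated),
            _ => None,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
struct PlanningConfig {
    planning_enabled: bool,
    plan_approval: ApprovalPolicy,
}

/// Hypothetical resolver: applies the documented defaults
/// (planning off, human review) to the two optional keys.
fn resolve_planning_config(enabled: Option<bool>, approval: Option<&str>) -> PlanningConfig {
    PlanningConfig {
        planning_enabled: enabled.unwrap_or(false),
        // Missing or unknown values fall back to HumanReview (an assumption).
        plan_approval: approval
            .and_then(|raw| ApprovalPolicy::parse(raw))
            .unwrap_or(ApprovalPolicy::HumanReview),
    }
}
```

An agent whose config_payload has neither key resolves to planning off with human review, which matches the "today's behaviour is unchanged" default.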
What turning it on does — two gated additions
1 · A planning prompt tier
- The executor injects a PLANNING_BLOCK into build_system_prompt: a sibling tier to #1476's PLATFORM_CORE, gated by planning_enabled exactly like include_platform_core.
- It tells the agent: for a complex, multi-step, or high-impact task, don't act immediately; research with read tools, then call present_plan and wait for approval.
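As an illustration of a gated tier, the sketch below assembles a prompt from optional blocks. The tier texts are placeholders, `PromptOptions` is a hypothetical input type, and appending the planning block last is an assumption; the actual ordering and signature belong to #1476's build_system_prompt.

```rust
/// Placeholder tier texts; the real blocks live in build_system_prompt (#1476).
const PLATFORM_CORE: &str = "[platform core]";
const PLANNING_BLOCK: &str =
    "[planning: research with read tools, then call present_plan and wait]";

/// Hypothetical inputs; only the two gates matter for this sketch.
struct PromptOptions<'a> {
    include_platform_core: bool,
    planning_enabled: bool,
    agent_rules: &'a str,
}

/// Joins the enabled tiers in a fixed order (the order itself is assumed).
fn build_system_prompt(opts: &PromptOptions) -> String {
    let mut tiers: Vec<&str> = Vec::new();
    if opts.include_platform_core {
        tiers.push(PLATFORM_CORE);
    }
    if !opts.agent_rules.is_empty() {
        tiers.push(opts.agent_rules);
    }
    if opts.planning_enabled {
        tiers.push(PLANNING_BLOCK);
    }
    tiers.join("\n\n")
}
```

With the block appended last, the prompt for a planning-disabled agent is a strict prefix of the planning-enabled one, so turning the flag off leaves the earlier tiers byte-identical.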
2 · The present_plan tool
- Added to the agent's toolset when planning_enabled (like attach_domain_tools gates the domain bundle).
- Input schema = the AgentPlan envelope. On call the runtime persists the plan, resolves plan_approval, then gates (card) or auto-dispatches.
The planning turn itself
- List-aware via existing domain tools. A domain agent already attaches the Session-bound toolset (attach_domain_tools). Planning reuses those reads (contacts, tags, custom fields, recent messages, opt-out) to build and justify the audience. No new planner model area.
- Read-only invariant. During a planning turn the runtime withholds mutating tools (send SMS, write contact) and offers present_plan as the terminal action. The only mutation is post-approval execution. (This is the one place #1476's "you act only by calling tools" contract needs a planning-mode complement: withhold the writing tools.)
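The read-only invariant comes down to filtering the toolset before a planning turn. A minimal sketch, assuming each tool carries a read/mutating classification; the `Tool` and `ToolKind` types, the tool names, and `tools_for_turn` are hypothetical and do not reflect the real tool registry.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToolKind {
    Read,
    Mutating,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Tool {
    name: &'static str,
    kind: ToolKind,
}

/// Returns the tools offered for a turn. During a planning turn, mutating
/// tools are withheld and present_plan is appended as the terminal action.
fn tools_for_turn(all: &[Tool], planning_turn: bool) -> Vec<Tool> {
    if !planning_turn {
        return all.to_vec();
    }
    let mut offered: Vec<Tool> = all
        .iter()
        .filter(|t| t.kind == ToolKind::Read)
        .cloned()
        .collect();
    // present_plan only records a proposal, so this sketch classifies it as Read.
    offered.push(Tool { name: "present_plan", kind: ToolKind::Read });
    offered
}
```

Filtering by an explicit classification, rather than by a deny-list of names, means a newly added mutating tool is withheld by default as long as it is tagged correctly.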
This is where "the agent does the tool use" lives
Unlike v1's one-shot extractor, the audience is the product of the agent actually querying the CRM with its tools and reasoning over the results — then committing the result as the present_plan argument. Structured output, real reasoning, no extra model area — and it's the same mechanism whether the agent is the assistant or a Text Reply agent.
09 Execution & the worker
On approval, a thin dispatcher maps the action to a handler. The per-contact drafter barely changes.
async fn dispatch(plan: AgentPlan) -> Result<(), AppError> {
    match plan.action {
        PlanAction::BulkSmsCampaign { audience, message_concept } => {
            let included = audience.iter().filter(|d| d.include).map(|d| d.contact_id);
            let op = create_bulk_operation(included, message_concept, …).await?; // opt-out safety net inside
            spawn_draft_engine(op.operation_id);
            set_executed_ref(plan.id, op.operation_id).await?;
        }
    }
}
Drafter today
- draft_follow_up_message(org, persona, instruction, context)
- system = persona + HARD_RULES
Drafter after
- draft_follow_up_message(org, persona, concept, instruction, context)
- system = persona + concept + HARD_RULES (concept from the approved plan)
Fan-out (20 concurrent), claim batch (50), prefetch, counters, websockets, send path — all untouched.
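The drafter change amounts to one extra segment in the system prompt. A rough sketch, where `drafter_system_prompt` is a hypothetical helper, the HARD_RULES text is a placeholder, and modelling the concept as an Option (so the "today" shape is the None case) is an assumption made for illustration.

```rust
/// Placeholder for the drafter's fixed rules; the real text is not shown here.
const HARD_RULES: &str = "[hard rules]";

/// Builds the drafter system prompt. With `concept` set to None this is
/// today's `persona + HARD_RULES`; with Some it inserts the approved concept.
fn drafter_system_prompt(persona: &str, concept: Option<&str>) -> String {
    match concept {
        Some(c) => format!("{}\n\n{}\n\n{}", persona, c, HARD_RULES),
        None => format!("{}\n\n{}", persona, HARD_RULES),
    }
}
```

Keeping HARD_RULES last in both shapes means the concept can add direction but never follows, and so never appears to override, the fixed rules.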
10 UI surfaces
Plan card (new · in the agent thread)
- Title + summary; audience counts ("142 in · 31 out").
- Grouped exclusions w/ reasons (opted out · already signed · off-campaign).
- Editable message concept; Approve · Edit · Reject.
- Generic to AgentPlan: renders any action's summary; bulk adds an audience detail view.
Draft sheet (existing · Gate 2)
- Reused unchanged — per-message review before send.
- OperationCard + use_realtime_operation keep live progress.
- Surfaces once the executor has created the operation.
Progressive disclosure: the plan card defaults to summary + concept + counts; the full included/excluded lists expand on demand (mirrors the engine review sheet and the UX-principles disclosure rules). Auto-approved plans skip the card and post a "plan auto-approved → executing" note instead.
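The collapsed card needs only counts and exclusions grouped by reason, which can be derived from the audience in one pass. A minimal sketch, using a trimmed-down ContactDecision (only `include` and `reason`); `AudienceSummary`, `summarize`, and `count_label` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Trimmed-down ContactDecision; `reason` is only read for exclusions here.
struct ContactDecision {
    include: bool,
    reason: String,
}

/// Counts shown on the collapsed card plus exclusions grouped by reason.
struct AudienceSummary {
    included: usize,
    excluded: usize,
    excluded_by_reason: BTreeMap<String, usize>,
}

fn summarize(audience: &[ContactDecision]) -> AudienceSummary {
    let mut summary = AudienceSummary {
        included: 0,
        excluded: 0,
        excluded_by_reason: BTreeMap::new(),
    };
    for decision in audience {
        if decision.include {
            summary.included += 1;
        } else {
            summary.excluded += 1;
            *summary
                .excluded_by_reason
                .entry(decision.reason.clone())
                .or_insert(0) += 1;
        }
    }
    summary
}

/// Renders the short count line in the "N in · M out" shape.
fn count_label(summary: &AudienceSummary) -> String {
    format!("{} in · {} out", summary.included, summary.excluded)
}
```

A BTreeMap keeps the reason groups in a stable alphabetical order, so the expanded view does not reshuffle between renders.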
11 Learning loop (later)
Both decisions are signal, and now they attach to a clean entity — the plan.
Plan-level signal
- Audience edits (operator drops / re-includes) → audience-judgment lessons.
- Concept rewrites + approve/reject → strategy lessons.
Draft-level signal
- Per-message edits in the existing sheet → voice lessons.
Both feed the authoring agent's ai_agent_learning / ai_agent_lesson (evergreen, versioned; weekly agent cadence). Because the plan records its agent_id, this works for any planning agent, not just the campaign case.
12 Build phases
Build the primitive thin, shaped by the one real consumer; prove it with bulk SMS. Compliance fix lands first.
- P1 · Opt-out pre-filter (compliance) · S
  Add opt-out + dupe to create_bulk_operation's deterministic skip. Independent of everything else; closes the live gap now.
  Files: bulk_operation_engine_service.rs · opt-out read helper
- P2 · ai_agent_plan entity + lifecycle · M
  Migration + types (PlanStatus, ApprovalPolicy, PlanAction envelope); persistence; resolve_approval + assess_risk stub. Then just generate.
  Files: migration/… · mods/ai_agent/types/… · services/ai_agent_plan_service.rs (new)
- P3 · present_plan tool + PLANNING_BLOCK tier + read-only posture · L
  New ai_agent tool (schema = AgentPlan); runtime persists, resolves policy, gates or auto-dispatches. Add the PLANNING_BLOCK prompt tier to build_system_prompt + offer present_plan, both gated by config_payload.planning_enabled; withhold mutating tools during a planning turn. Builds on #1476's tiered build_system_prompt; rebase on it.
  Files: mods/ai_agent/tools/present_plan_tool.rs (new) · build_system_prompt_service.rs (#1476) · run_ai_agent_thread_service.rs · tool_registry
- P4 · PlanExecutor + BulkSmsCampaign handler · M
  Dispatcher; BulkSmsCampaign → create_bulk_operation + spawn_draft_engine; thread message_concept into the drafter; set executed_ref. Retarget delegate_bulk_operation as the handler.
  Files: services/plan_executor_service.rs (new) · bulk_operation_engine_service.rs · follow_up_drafter_extractor_service.rs · delegate_bulk_operation.rs
- P5 · Plan APIs + review card · M
  GET plan · PUT plan (edit concept / include-exclude, opt-out re-include blocked) · POST approve / reject. Plan card in the thread; auto-approve note path.
  Files: mods/ai_agent/api/ai_agent_plan_api.rs (new) · components/agent_plan_card_component.rs (new)
- P6 · Per-agent toggle + config UI · M
  Read/write config_payload.planning_enabled + plan_approval; add a "Planning" toggle + approval-mode selector to the agent instructions/rules editor. Lands in #1478's config UI (or extends it). Default off → no behaviour change until an owner opts in.
  Files: #1478 agent config UI · ai_agent config DTOs / api
- P7 · Learning signals (later) · M
  Plan-gate + draft-gate edits → ai_agent_lesson on the authoring agent. Gated behind the flow shipping.
  Files: ai_agent learning services · plan card + draft sheet
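P5's "opt-out re-include blocked" rule for PUT plan can be sketched as a validation step that runs before an audience edit is persisted. The `validate_reinclusions` function and `EditError` type are hypothetical, integers stand in for contact UUIDs, and the opted-out set is assumed to be loaded by the caller.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
enum EditError {
    /// The operator tried to re-include a contact that has opted out.
    OptedOut(u64),
}

/// Hypothetical validation for a PUT plan edit that flips `include` flags.
/// `reincluded` lists contact ids being switched from excluded to included;
/// the first opted-out id found is reported.
fn validate_reinclusions(
    reincluded: &[u64],
    opted_out: &HashSet<u64>,
) -> Result<(), EditError> {
    match reincluded.iter().find(|id| opted_out.contains(*id)) {
        Some(id) => Err(EditError::OptedOut(*id)),
        None => Ok(()),
    }
}
```

Rejecting the edit at the API keeps the plan itself clean, while the executor-side opt-out filter from P1 remains the backstop if a contact opts out between approval and execution.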
13 Open decisions
Plan = structured artifact, or free-form text like Claude Code?
- Structured (PlanAction typed payload): a separate engine executes it, so it must be machine-runnable.
- Free-form markdown: only works if the same agent re-reads & executes (the ⑤ path we rejected).
The summary field covers the "readable plan" need; the typed action covers execution.
How general in v1?
- Thin-but-general: ship the AgentPlan envelope + lifecycle + present_plan, with exactly one PlanAction (BulkSmsCampaign). Generalize when a 2nd consumer appears.
- Build the full multi-action planner framework now.
Does the planning turn pause-and-resume, or hand off?
- Hand off: the turn ends at present_plan; approval + execution happen out-of-band (APIs + executor + websockets). Simpler with the durable engine.
- Pause-and-resume the same agent turn after approval (closer to CC, but holds a turn open across human latency).
Keep Gate 2 (per-message draft review) under the plan model?
- Keep it for bulk SMS — plan approval is about who + the idea; draft review is the last stop before real messages send.
- Let plan approval subsume drafts (auto-approve sends) — faster, riskier; make it a policy later.
ApprovalPolicy choice later.
"Already signed / off-campaign" signal source
- Whatever the agent's tools can read: tags, custom fields, notes, messages. Plan review is the human catch.
- Block semantic exclusions until a first-class deal/contract-stage signal exists.
Who can plan, and what's the default?
- A per-agent toggle (config_payload.planning_enabled), default off, available to every agent type; surfaced in the #1478 config UI.
- Hardwire planning to the assistant only.
The approval mode (plan_approval) is a sibling setting on the same toggle.
14 Files to touch
New
- migration/src/m{ts}_create_ai_agent_plan.rs (+ assistant_operation columns)
- mods/ai_agent/types/agent_plan_*.rs (plan, status, policy, action)
- mods/ai_agent/services/ai_agent_plan_service.rs
- mods/ai_agent/services/plan_executor_service.rs
- mods/ai_agent/tools/present_plan_tool.rs
- mods/ai_agent/api/ai_agent_plan_api.rs
- components/agent_plan_card_component.rs
Changed
- build_system_prompt_service.rs (#1476: add the PLANNING_BLOCK tier, gated by planning_enabled)
- run_ai_agent_thread_service.rs (read-only posture · gate present_plan + block on planning_enabled)
- tool_registry_service.rs (offer present_plan when enabled)
- #1478 agent config UI + config DTOs/api (the planning toggle + approval selector)
- services/bulk_operation_engine_service.rs (opt-out filter · concept)
- services/follow_up_drafter_extractor_service.rs (concept arg)
- ai/rig/tools/assistant/delegate_bulk_operation.rs (becomes executor handler)
15 Risks & non-goals
Scope creep — keep the primitive thin.
A general planning subsystem invites over-design. v1 ships one action and a hand-off lifecycle. Resist building a multi-step plan DAG, plan templates, or a risk-scoring engine before a second consumer exists.
Read-only planning isn't enforced for free.
The runtime must actually withhold mutating tools during a planning turn (or an agent could send before approval). This is the one place the CC "no writes in plan mode" guarantee has to be re-implemented; get it right in P3.
Semantic exclusion is bounded by signals.
"Contract already signed" needs that state where the agent's tools can read it (tag/custom field/note/message). Plans/deals aren't a prod signal. Promise a reviewable first pass, not perfect curation — Gate 1 is the catch.
- Non-goal: agent-per-contact drafting (⑤) or a live operation thread (④). Drafting stays a cheap extractor the executor fans out.
- Non-goal: auto-sending bulk SMS in v1. BulkSmsCampaign is always HumanReview; auto-approve is for future low-risk actions.
- Non-goal: per-segment concepts, multi-action plans, plan templates. The envelope leaves room; none built now.