Chapter 32: Tool Calling & Autonomy
The fine-tuned, RAG-grounded analyst from Chapters 30 and 31 can diagnose a degraded cluster and narrate the correct remediation sequence — but narration is not remediation, and a 3 AM hardware failure demands more than a well-worded runbook delivered to an on-call engineer who may be thirty minutes from a keyboard. Tool calling closes the last gap: by coercing the LLM to emit structured JSON action descriptors instead of Markdown prose, and by wiring a Go feedback loop that dispatches those descriptors to the WAM and the CLP(FD) solver, the system transitions from a passive diagnostician to an active agentic reasoner that can autonomously explore cluster state, invoke the Chapter 27 bin-packer to construct a mathematically valid VM migration plan, and then stop — presenting the plan to the operator for a cryptographic approval click before any physical mutation occurs.
32.1 From Conversation to Tool Calling
32.1.1 The Mechanics of Structured Output
A standard LLM completion produces natural language. Tool calling is a technique for coercing the model to produce machine-parseable structured data instead: a JSON object that names a function and supplies its arguments. The model is not executing code. It is predicting that, given the system prompt's JSON schema definition and the conversation history, the most appropriate next token sequence is a valid JSON object conforming to a known schema. The Go orchestrator intercepts that prediction, executes the real function, and injects the result back into the message history before the next model call.
The Ollama API supports tool calling natively via the /api/chat endpoint's tools parameter, which accepts JSON Schema function definitions. However, the sovereign-analyst-v2 model from Chapter 30 was fine-tuned on the infrastructure's specific alert schema and runbook vocabulary — it was not fine-tuned on generic function-calling conventions. Relying on the native tool-call pathway for a fine-tuned model introduces a distribution mismatch: the model has learned to produce structured responses in one format, and the native tool pathway expects a different structural convention. The more robust approach for a fine-tuned model is schema enforcement via the system prompt: the system prompt defines the tool schema as a JSON contract, the model is instructed to emit a JSON block when it needs to call a tool, and the Go layer parses that block. This is the approach used here.
32.1.2 Read-Only WAM Predicates as Tools
The LLM is given access to a defined set of read-only WAM predicates through the tool interface. Read-only means the predicates do not mutate the WAM heap: they query existing facts and rules but do not call assertz, retract, or any predicate that modifies dynamic clauses. The permitted tool predicates are:
node_health(+Node, -Status) — retrieves the current Chapter 22 health verdict for a node. Read from live_state. Returns nominal, degraded, critical, or unknown.
live_link(+A, +B, -Cost) — queries the Chapter 24 health-aware topology edge. Succeeds only if both endpoints are healthy. Returns the link cost.
can_reach(+From, +To) — a derived predicate wrapping live_shortest_path/3 that succeeds if a healthy path between two nodes currently exists.
active_alerts(+Node, -ConditionIDs) — returns the list of currently active alert_active/4 condition IDs for a node, within the 300-second retention window.
current_vm_host(+VMID, -Host) — returns the host currently asserted for a VMID.
schedule_ha_proposal(+Hosts, +VMs, -Actions) — the single write-adjacent tool. This invokes ha_scheduler:schedule_ha_timed/6 and ha_scheduler:compute_migration_delta/4, returning the list of required migration actions as migrate(VMID, TargetHost) terms serialised to JSON. It does not execute the migrations. It is a planning tool, not an execution tool. The mutations it implies require the HITL approval barrier in §32.4.
32.2 The Build: JSON Schema Enforcement Prompt
32.2.1 The Agentic System Prompt
The system prompt replaces the Chapter 31 advisory persona with an agentic one. It defines the available tools as a JSON schema, mandates the output format for tool calls, and specifies the termination condition for the reasoning loop.
# File: /opt/logic-node/prompts/agent_system_prompt.txt
# This file is read once at server startup into the agentSystemPrompt constant.
# It is NOT editable at runtime. Changes require a server restart.
You are an autonomous Proxmox infrastructure agent operating in a sovereign,
air-gapped environment. You have access to a set of tools that query the live
WAM (Prolog logic engine). You use these tools to gather ground-truth cluster
state and to invoke the CLP(FD) constraint solver for VM placement decisions.
## Operational Rules
1. When you need information, emit a TOOL_CALL block. Do not guess or infer.
2. When you have enough information to produce a final answer or migration plan,
emit a FINAL_ANSWER block. Do not mix prose with tool calls.
3. Never emit both a TOOL_CALL and a FINAL_ANSWER in the same response.
4. If you cannot resolve the problem within the available tools, emit a
FINAL_ANSWER explaining why resolution is not possible.
5. Do not emit text outside of a TOOL_CALL or FINAL_ANSWER block.
## Available Tools
{"tools": [
{
"name": "query_wam",
"description": "Execute a read-only Prolog goal against the live WAM and return the result.",
"parameters": {
"goal": {
"type": "string",
"description": "A Prolog goal string. Variables must be capitalised. Example: node_health(pve3, Status)",
"pattern": "^[a-z_][a-zA-Z0-9_(), ']*$"
}
},
"required": ["goal"]
},
{
"name": "schedule_ha_proposal",
"description": "Invoke the CLP(FD) HA scheduler. Returns proposed migrate/start actions. Does NOT execute migrations.",
"parameters": {
"hosts": {"type": "array", "description": "List of {name, ram_mib, cpu_mc} objects."},
"vms": {"type": "array", "description": "List of {vmid, ram_mib, cpu_mc, ha_group} objects."},
"use_rack_constraints": {"type": "boolean"}
},
"required": ["hosts", "vms", "use_rack_constraints"]
},
{
"name": "execute_migration",
"description": "Propose a migration plan for human approval. SUSPENDS the agent. Does NOT execute.",
"parameters": {
"plan": {"type": "array", "description": "List of {action, vmid, target} objects."},
"reason": {"type": "string", "description": "One sentence explaining why migrations are required."}
},
"required": ["plan", "reason"]
}
]}
## Output Format
For a tool call:
{"tool_call": {"name": "<tool_name>", "parameters": {<parameters>}}}
For a final answer:
{"final_answer": {"summary": "<prose>", "confidence": "<high|medium|low>"}}
## WAM Ground Truth Context
<<WAM_CONTEXT>>
Resolve the following operator request using the tools above.
The <<WAM_CONTEXT>> placeholder is replaced at runtime by the Chapter 31 RAG context string. The agent always starts with ground-truth cluster state from the current WAM heap, and uses tool calls to gather additional depth.
32.2.2 Malformed JSON Recovery
The LLM may produce syntactically invalid JSON — truncated objects, trailing commas, unescaped characters in goal strings. The Go agent loop treats malformed JSON as a recoverable error: it constructs a correction message and re-injects it into the conversation history as a user role message, prompting the model to re-emit the tool call correctly.
// parseToolResponse attempts to parse the LLM's raw output as a tool call or
// final answer. If parsing fails, it returns a correction prompt for re-injection.
func parseToolResponse(raw string) (toolCall *ToolCall, finalAnswer *FinalAnswer, correctionPrompt string) {
// Extract the first JSON object from the raw output (model may emit preamble):
jsonStart := strings.Index(raw, "{")
jsonEnd := strings.LastIndex(raw, "}") + 1
if jsonStart == -1 || jsonEnd <= jsonStart {
return nil, nil, fmt.Sprintf(
"Your response did not contain a JSON object. "+
"Emit exactly one JSON block in the format specified. "+
"Your output was: %q", truncate(raw, 120))
}
candidate := raw[jsonStart:jsonEnd]
// Try tool_call first:
var tc struct {
ToolCall *ToolCall `json:"tool_call"`
}
if err := json.Unmarshal([]byte(candidate), &tc); err == nil && tc.ToolCall != nil {
return tc.ToolCall, nil, ""
}
// Try final_answer:
var fa struct {
FinalAnswer *FinalAnswer `json:"final_answer"`
}
if err := json.Unmarshal([]byte(candidate), &fa); err == nil && fa.FinalAnswer != nil {
return nil, fa.FinalAnswer, ""
}
// Both failed — return correction prompt:
return nil, nil, fmt.Sprintf(
"JSON parse error. Your output was not a valid tool_call or final_answer object. "+
"Check for missing quotes, unclosed braces, or invalid characters in goal strings. "+
"Malformed output: %q. Retry with corrected JSON.", truncate(candidate, 200))
}
func truncate(s string, n int) string {
if len(s) <= n {
return s
}
return s[:n] + "…"
}
32.3 The Build: Go Dynamic Feedback Loop
32.3.1 Data Types
// File: /opt/logic-node/go/orchestrator/llm_agent.go
package main
import (
"context"
"crypto/rand"
"encoding/hex"
"encoding/json"
"fmt"
"log"
"net/http"
"os/exec"
"strings"
"sync"
"time"
)
// ToolCall is the structured JSON payload emitted by the LLM when it needs
// to invoke a WAM predicate or the CLP(FD) solver.
type ToolCall struct {
Name string `json:"name"`
Parameters json.RawMessage `json:"parameters"`
}
// FinalAnswer is the structured JSON payload emitted when the agent has
// completed its reasoning and is ready to present a conclusion.
type FinalAnswer struct {
Summary string `json:"summary"`
Confidence string `json:"confidence"` // "high" | "medium" | "low"
}
// QueryWAMParams holds the parameters for the "query_wam" tool.
type QueryWAMParams struct {
Goal string `json:"goal"`
}
// ScheduleHAParams holds the parameters for the "schedule_ha_proposal" tool.
type ScheduleHAParams struct {
Hosts []HostSpec `json:"hosts"`
VMs []VMSpec `json:"vms"`
UseRackConstraints bool `json:"use_rack_constraints"`
}
type HostSpec struct {
Name string `json:"name"`
RAMMIB int `json:"ram_mib"`
CPUMC int `json:"cpu_mc"`
}
type VMSpec struct {
VMID int `json:"vmid"`
RAMMIB int `json:"ram_mib"`
CPUMC int `json:"cpu_mc"`
HAGroup string `json:"ha_group,omitempty"` // empty = standalone
}
// ExecuteMigrationParams holds the parameters for the "execute_migration" tool.
// This tool triggers the HITL suspension path — it never executes directly.
type ExecuteMigrationParams struct {
Plan []MigrationAction `json:"plan"`
Reason string `json:"reason"`
}
type MigrationAction struct {
Action string `json:"action"` // "migrate" | "start"
VMID int `json:"vmid"`
Target string `json:"target"`
}
// AgentSession holds the running message history for one agent reasoning loop.
type AgentSession struct {
Messages []OllamaChatMessage
TurnCount int
MaxTurns int
}
const agentMaxTurns = 5
32.3.2 The Reasoning Loop
// AgentRequest is the JSON body for POST /api/v1/agent.
type AgentRequest struct {
Node string `json:"node,omitempty"`
Question string `json:"question"`
}
// handleAgent is the HTTP handler for /api/v1/agent.
// It runs the full tool-calling feedback loop and streams progress via SSE.
func (s *Server) handleAgent(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
return
}
var req AgentRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil || strings.TrimSpace(req.Question) == "" {
http.Error(w, "invalid request", http.StatusBadRequest)
return
}
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
w.Header().Set("X-Accel-Buffering", "no")
flusher := w.(http.Flusher)
ctx, cancel := context.WithTimeout(r.Context(), 5*time.Minute)
defer cancel()
sseEvent := func(event, data string) {
fmt.Fprintf(w, "event: %s\ndata: %s\n\n", event, data)
flusher.Flush()
}
// ── Gather RAG context (§31.3 path) ──────────────────────────────────────
target := "all"
if n := sanitiseNodeName(req.Node); n != "" {
target = n
}
ragGoal := fmt.Sprintf("rag_context:gather_rag_context(%s, Context)", target)
ragResult, err := s.pool.Dispatch(WorkItem{Goal: ragGoal}, 3*time.Second)
if err != nil {
log.Printf("[Agent] RAG context failed: %v", err)
ragResult = &WorkResult{StringVar: "## RAG Context\n*Unavailable.*"}
}
// ── Build initial message history ────────────────────────────────────────
systemPrompt := buildAgentSystemPrompt(ragResult.StringVar)
session := &AgentSession{
Messages: []OllamaChatMessage{
{Role: "system", Content: systemPrompt},
{Role: "user", Content: req.Question},
},
TurnCount: 0,
MaxTurns: agentMaxTurns,
}
sseEvent("agent_start", fmt.Sprintf(`{"max_turns":%d}`, agentMaxTurns))
// ── Reasoning loop ────────────────────────────────────────────────────────
// Each iteration: call Ollama with the full message history, parse the
// response as a tool call or final answer, dispatch or terminate.
for session.TurnCount < session.MaxTurns {
session.TurnCount++
sseEvent("agent_turn", fmt.Sprintf(`{"turn":%d}`, session.TurnCount))
var responseBuf strings.Builder
streamErr := s.ollama.StreamWithHistory(ctx, session.Messages,
func(token string) { responseBuf.WriteString(token) })
if streamErr != nil {
sseEvent("agent_error", fmt.Sprintf(`{"error":"Ollama stream failed: %s"}`,
jsonEscape(streamErr.Error())))
return
}
rawResponse := responseBuf.String()
session.Messages = append(session.Messages, OllamaChatMessage{
Role: "assistant", Content: rawResponse,
})
toolCall, finalAnswer, correctionPrompt := parseToolResponse(rawResponse)
if correctionPrompt != "" {
sseEvent("agent_correction", `{"reason":"malformed_json"}`)
session.Messages = append(session.Messages, OllamaChatMessage{
Role: "user", Content: correctionPrompt,
})
continue
}
if finalAnswer != nil {
payload, _ := json.Marshal(finalAnswer)
sseEvent("agent_final", string(payload))
return
}
if toolCall != nil {
result, hitlTriggered := s.dispatchToolCall(ctx, toolCall, sseEvent)
if hitlTriggered {
// HITL suspension: agent loop terminates. Approval workflow takes over.
return
}
// Inject the tool result as a "tool" role message before next turn:
session.Messages = append(session.Messages, OllamaChatMessage{
Role: "tool",
Content: result,
})
}
}
// max_turns exhausted:
sseEvent("agent_max_turns", fmt.Sprintf(
`{"turns_used":%d,"message":"Agent exhausted %d turns without a final answer."}`,
agentMaxTurns, agentMaxTurns))
}
// dispatchToolCall routes a ToolCall to its handler.
// Returns (resultString, hitlTriggered).
func (s *Server) dispatchToolCall(
ctx context.Context,
tc *ToolCall,
sseEvent func(string, string),
) (string, bool) {
switch tc.Name {
case "query_wam":
return s.toolQueryWAM(ctx, tc.Parameters, sseEvent), false
case "schedule_ha_proposal":
return s.toolScheduleHAProposal(ctx, tc.Parameters, sseEvent), false
case "execute_migration":
s.toolExecuteMigration(ctx, tc.Parameters, sseEvent)
return "", true
default:
return fmt.Sprintf(`{"error":"unknown tool %q"}`, tc.Name), false
}
}
// toolQueryWAM dispatches a read-only Prolog goal to the WAM pool.
func (s *Server) toolQueryWAM(
ctx context.Context,
params json.RawMessage,
sseEvent func(string, string),
) string {
var p QueryWAMParams
if err := json.Unmarshal(params, &p); err != nil {
return `{"error":"invalid query_wam parameters"}`
}
if err := validateReadOnlyGoal(p.Goal); err != nil {
return fmt.Sprintf(`{"error":"goal rejected: %s"}`, err.Error())
}
sseEvent("agent_tool_call", fmt.Sprintf(`{"tool":"query_wam","goal":%s}`,
jsonEscape(p.Goal)))
result, err := s.pool.Dispatch(WorkItem{Goal: p.Goal}, 5*time.Second)
if err != nil {
return fmt.Sprintf(`{"error":"WAM dispatch failed: %s"}`, err.Error())
}
sseEvent("agent_tool_result", fmt.Sprintf(`{"tool":"query_wam","result":%s}`,
jsonEscape(result.StringVar)))
return fmt.Sprintf(`{"result":%s}`, jsonEscape(result.StringVar))
}
// toolScheduleHAProposal invokes the CLP(FD) scheduler and serialises the
// proposed migration actions without executing them.
func (s *Server) toolScheduleHAProposal(
ctx context.Context,
params json.RawMessage,
sseEvent func(string, string),
) string {
var p ScheduleHAParams
if err := json.Unmarshal(params, &p); err != nil {
return `{"error":"invalid schedule_ha_proposal parameters"}`
}
// Host and VM Prolog terms are constructed from typed Go structs —
// never from raw LLM-emitted strings. See §32.5.2 for the injection analysis.
hostsProlog := buildHostsProlog(p.Hosts)
vmsProlog := buildVMsProlog(p.VMs)
rackAtom := "false"
if p.UseRackConstraints {
rackAtom = "true"
}
goal := fmt.Sprintf(
"ha_scheduler:schedule_ha_timed(%s,%s,%s,10,Matrix,_),"+
"ha_scheduler:compute_migration_delta(%s,%s,Matrix,Actions),"+
"term_to_atom(Actions,ActionsAtom)",
hostsProlog, vmsProlog, rackAtom,
hostsProlog, vmsProlog,
)
sseEvent("agent_tool_call", `{"tool":"schedule_ha_proposal"}`)
result, err := s.pool.Dispatch(WorkItem{Goal: goal}, 15*time.Second)
if err != nil {
return fmt.Sprintf(`{"error":"CLP(FD) solver failed: %s"}`, err.Error())
}
sseEvent("agent_tool_result", fmt.Sprintf(
`{"tool":"schedule_ha_proposal","actions":%s}`,
jsonEscape(result.StringVar)))
return fmt.Sprintf(`{"proposed_actions":%s}`, jsonEscape(result.StringVar))
}
// validateReadOnlyGoal rejects any goal not beginning with a known safe predicate.
// This guard runs in Go before the CGO boundary — no Prolog evaluation occurs.
func validateReadOnlyGoal(goal string) error {
allowed := []string{
"node_health(",
"live_link(",
"can_reach(",
"active_alerts(",
"current_vm_host(",
"known_node(",
"proxmox_topology:known_node(",
"live_state:node_health(",
"live_state:node_metric(",
"rag_context:gather_rag_context(",
"cluster_aggregator:query_single_node_health(",
}
lower := strings.ToLower(strings.TrimSpace(goal))
for _, prefix := range allowed {
if strings.HasPrefix(lower, strings.ToLower(prefix)) {
return nil
}
}
return fmt.Errorf("goal %q not in read-only whitelist", goal)
}
// buildHostsProlog serialises []HostSpec into a Prolog list of host/3 terms.
// All values come from the typed Go struct — no string interpolation from LLM output.
func buildHostsProlog(hosts []HostSpec) string {
parts := make([]string, len(hosts))
for i, h := range hosts {
name := sanitiseNodeName(h.Name)
if name == "" {
name = "unknown"
}
parts[i] = fmt.Sprintf("host(%s,%d,%d)", name, h.RAMMIB, h.CPUMC)
}
return "[" + strings.Join(parts, ",") + "]"
}
// buildVMsProlog serialises []VMSpec into a Prolog list of vm/4 terms.
func buildVMsProlog(vms []VMSpec) string {
parts := make([]string, len(vms))
for i, v := range vms {
haTag := "standalone"
if v.HAGroup != "" {
if group := sanitiseNodeName(v.HAGroup); group != "" {
haTag = fmt.Sprintf("ha(%s)", group)
}
}
parts[i] = fmt.Sprintf("vm(%d,%d,%d,%s)", v.VMID, v.RAMMIB, v.CPUMC, haTag)
}
return "[" + strings.Join(parts, ",") + "]"
}
// buildAgentSystemPrompt reads the agent_system_prompt.txt template and
// substitutes the live WAM context for the <<WAM_CONTEXT>> placeholder.
func buildAgentSystemPrompt(ragContext string) string {
return strings.ReplaceAll(agentSystemPromptTemplate, "<<WAM_CONTEXT>>", ragContext)
}
32.3.3 StreamWithHistory
The standard StreamWithRetry sends a single user message. The agent loop requires sending the full accumulated message history. A thin extension to OllamaClient handles this:
// StreamWithHistory sends a full message history to Ollama and calls tokenFn
// for each response token. Used exclusively by the agent reasoning loop.
// temperature=0.1: lower than the Chapter 29 sovereign-analyst default (0.2)
// to reduce character-level errors in JSON structure during tool-call turns.
func (c *OllamaClient) StreamWithHistory(
ctx context.Context,
history []OllamaChatMessage,
tokenFn func(string),
) error {
req := OllamaChatRequest{
Model: c.model,
Messages: history,
Stream: true,
Options: OllamaOptions{
Temperature: 0.1,
NumPredict: 400, // tool-call turns are brief JSON, not essays
NumCtx: 8192,
},
}
return c.doStreamRequest(ctx, req, tokenFn)
}
32.4 The Build: Human-in-the-Loop (HITL) Veto
32.4.1 The Suspension Architecture
When the agent emits execute_migration, the Go server does not execute the migrations. It generates a cryptographic approval token, stores the pending migration plan in a sync.Map keyed by the token, publishes a migration_proposal SSE event to the Chapter 19 operator dashboard, and terminates the agent loop. The operator's browser renders an Approve/Deny card. On approval click, the browser POSTs the token to /api/v1/agent/approve. The Go handler validates the token, retrieves the plan, and executes each migration via exec.CommandContext. No migration occurs without a valid token POST. The token cannot be guessed (256 bits of entropy from crypto/rand). It expires after 10 minutes.
32.4.2 Go: HITL Suspension and Approval Handler
// pendingMigrations holds proposals awaiting operator approval.
// Key: approval token (hex string). Value: *ExecuteMigrationParams.
var pendingMigrations sync.Map
// toolExecuteMigration is called when the LLM emits execute_migration.
// It suspends execution and routes the proposal to the operator dashboard.
func (s *Server) toolExecuteMigration(
ctx context.Context,
params json.RawMessage,
sseEvent func(string, string),
) {
var p ExecuteMigrationParams
if err := json.Unmarshal(params, &p); err != nil || len(p.Plan) == 0 {
sseEvent("agent_error", `{"error":"invalid or empty execute_migration parameters"}`)
return
}
token, err := generateApprovalToken()
if err != nil {
sseEvent("agent_error", `{"error":"token generation failed"}`)
return
}
pendingMigrations.Store(token, &p)
// Token expiry: remove from map after 10 minutes if not acted on.
go func() {
time.Sleep(10 * time.Minute)
if _, loaded := pendingMigrations.LoadAndDelete(token); loaded {
log.Printf("[HITL] Token %s… expired without operator action", token[:8])
}
}()
// Publish proposal to operator dashboard:
planJSON, _ := json.Marshal(p.Plan)
s.broker.Publish(fmt.Sprintf(
"event: migration_proposal\ndata: {\"token\":%q,\"plan\":%s,\"reason\":%s,\"expires_in\":600}\n\n",
token, planJSON, jsonEscape(p.Reason),
))
sseEvent("agent_suspended", fmt.Sprintf(
`{"token":%q,"plan_count":%d,"message":"Migration plan submitted for operator approval. Agent suspended."}`,
token, len(p.Plan)))
log.Printf("[HITL] Proposal suspended. Token: %s… Actions: %d", token[:8], len(p.Plan))
}
// handleAgentApprove handles POST /api/v1/agent/approve.
func (s *Server) handleAgentApprove(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
return
}
var body struct {
Token string `json:"token"`
}
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || body.Token == "" {
http.Error(w, "token required", http.StatusBadRequest)
return
}
val, ok := pendingMigrations.LoadAndDelete(body.Token)
if !ok {
http.Error(w, "token not found or expired", http.StatusNotFound)
return
}
plan := val.(*ExecuteMigrationParams)
log.Printf("[HITL] Operator APPROVED %d-action migration plan.", len(plan.Plan))
s.broker.Publish(fmt.Sprintf(
"event: migration_approved\ndata: {\"token\":%q,\"plan_count\":%d}\n\n",
body.Token, len(plan.Plan)))
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
defer cancel()
var results []string
for _, action := range plan.Plan {
var execErr error
switch action.Action {
case "migrate":
execErr = executeMigration(ctx, action.VMID, action.Target)
case "start":
execErr = executeVMStart(ctx, action.VMID, action.Target)
default:
execErr = fmt.Errorf("unknown action %q", action.Action)
}
if execErr != nil {
results = append(results, fmt.Sprintf("FAILED vmid=%d: %v", action.VMID, execErr))
log.Printf("[HITL] Migration failed: vmid=%d err=%v", action.VMID, execErr)
// Partial failure: continue remaining actions.
} else {
results = append(results, fmt.Sprintf("OK vmid=%d -> %s", action.VMID, action.Target))
}
}
w.Header().Set("Content-Type", "application/json")
resp, _ := json.Marshal(map[string]any{"executed": len(results), "results": results})
w.Write(resp)
}
// handleAgentDeny handles POST /api/v1/agent/deny.
func (s *Server) handleAgentDeny(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
return
}
var body struct{ Token string `json:"token"` }
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || body.Token == "" {
http.Error(w, "token required", http.StatusBadRequest)
return
}
if _, ok := pendingMigrations.LoadAndDelete(body.Token); !ok {
http.Error(w, "token not found or expired", http.StatusNotFound)
return
}
log.Printf("[HITL] Operator DENIED migration plan. Token: %s…", body.Token[:8])
s.broker.Publish(fmt.Sprintf(
"event: migration_denied\ndata: {\"token\":%q}\n\n", body.Token))
w.WriteHeader(http.StatusOK)
}
// executeMigration calls qm migrate via exec.CommandContext.
// Arguments are passed as a slice — no shell interpolation.
func executeMigration(ctx context.Context, vmid int, target string) error {
target = sanitiseNodeName(target)
if target == "" {
return fmt.Errorf("invalid target node name")
}
cmd := exec.CommandContext(ctx,
"qm", "migrate", fmt.Sprintf("%d", vmid), target,
"--online", "--migration_network", "10.40.0.0/24",
)
if out, err := cmd.CombinedOutput(); err != nil {
return fmt.Errorf("qm migrate: %v — %s", err, out)
}
return nil
}
func executeVMStart(ctx context.Context, vmid int, target string) error {
target = sanitiseNodeName(target)
if target == "" {
return fmt.Errorf("invalid target node name")
}
cmd := exec.CommandContext(ctx, "pvesh", "create",
fmt.Sprintf("/nodes/%s/qemu/%d/status/start", target, vmid),
)
if out, err := cmd.CombinedOutput(); err != nil {
return fmt.Errorf("pvesh start: %v — %s", err, out)
}
return nil
}
func generateApprovalToken() (string, error) {
b := make([]byte, 32)
if _, err := rand.Read(b); err != nil {
return "", err
}
return hex.EncodeToString(b), nil
}
32.4.3 Route Registration
// In server.go:
mux.Handle("/api/v1/agent", apiGuards(s.handleAgent))
mux.Handle("/api/v1/agent/approve", apiGuards(s.handleAgentApprove))
mux.Handle("/api/v1/agent/deny", apiGuards(s.handleAgentDeny))
32.4.4 Vanilla JS: The HITL Approval UI
The dashboard's SSE listener already handles runbook_generated and answer_fragment events from Chapters 28–31. Three new event handlers are added: migration_proposal renders the approval card, migration_approved and migration_denied update it.
// File: /opt/logic-node/static/dashboard.js
// (Addition to the existing SSE event listener block from Chapter 19.)
es.addEventListener('migration_proposal', (e) => {
renderMigrationApprovalCard(JSON.parse(e.data));
});
es.addEventListener('migration_approved', (e) => {
const ev = JSON.parse(e.data);
const card = document.getElementById(`hitl-card-${ev.token.slice(0, 8)}`);
if (card) card.innerHTML =
`<div class="hitl-status approved">✓ Approved — executing ${ev.plan_count} migrations</div>`;
});
es.addEventListener('migration_denied', (e) => {
const ev = JSON.parse(e.data);
const card = document.getElementById(`hitl-card-${ev.token.slice(0, 8)}`);
if (card) card.innerHTML = `<div class="hitl-status denied">✗ Denied</div>`;
});
function renderMigrationApprovalCard(proposal) {
const tokenShort = proposal.token.slice(0, 8);
const planRows = proposal.plan.map(a =>
`<tr>
<td>${h(a.action)}</td>
<td><code>vmid ${h(String(a.vmid))}</code></td>
<td><code>${h(a.target)}</code></td>
</tr>`
).join('');
const card = document.createElement('div');
card.id = `hitl-card-${tokenShort}`;
card.className = 'hitl-card';
card.innerHTML = `
<div class="hitl-header">
<span class="hitl-badge">⚠ AI-Proposed Migration — Awaiting Approval</span>
<span class="hitl-expiry" id="hitl-expiry-${tokenShort}">
Expires in <b>${proposal.expires_in}s</b>
</span>
</div>
<div class="hitl-reason">${h(proposal.reason)}</div>
<table class="hitl-plan">
<thead><tr><th>Action</th><th>VM</th><th>Target</th></tr></thead>
<tbody>${planRows}</tbody>
</table>
<div class="hitl-controls">
<button class="hitl-approve" id="hitl-approve-${tokenShort}"
onclick="approveProposal('${h(proposal.token)}','${tokenShort}')">
✓ Approve
</button>
<button class="hitl-deny"
onclick="denyProposal('${h(proposal.token)}','${tokenShort}')">
✗ Deny
</button>
</div>`;
document.getElementById('hitl-proposals').appendChild(card);
// Countdown timer — UI feedback only. Server-side expiry is authoritative.
let remaining = proposal.expires_in;
const timer = setInterval(() => {
remaining--;
const el = document.getElementById(`hitl-expiry-${tokenShort}`);
if (el) el.innerHTML = `Expires in <b>${remaining}s</b>`;
if (remaining <= 0) {
clearInterval(timer);
const c = document.getElementById(`hitl-card-${tokenShort}`);
if (c) c.innerHTML = `<div class="hitl-status expired">⊘ Proposal expired</div>`;
}
}, 1000);
}
async function approveProposal(token, tokenShort) {
// Disable button immediately to prevent double-submission:
const btn = document.getElementById(`hitl-approve-${tokenShort}`);
if (btn) btn.disabled = true;
const resp = await fetch(`${API}/agent/approve`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${getAuthToken()}`,
},
body: JSON.stringify({ token }),
});
if (!resp.ok) {
const card = document.getElementById(`hitl-card-${tokenShort}`);
if (card) {
card.querySelector('.hitl-controls').innerHTML =
`<div class="hitl-status error">Approval failed: ${resp.status}</div>`;
}
}
// Success path: SSE migration_approved event updates the card.
}
async function denyProposal(token, tokenShort) {
await fetch(`${API}/agent/deny`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${getAuthToken()}`,
},
body: JSON.stringify({ token }),
});
// SSE migration_denied event updates the card.
}
32.5 Sovereign Security: The Agentic Boundary
32.5.1 Deductive Power Without Mutation Authority
The architectural property achieved by this chapter is precise: the LLM has unrestricted access to the cluster's logical state but zero access to its physical state. It can invoke node_health(pve7, Status) as many times as it wants, traverse the topology with can_reach, enumerate active alerts across all 14 nodes, and invoke the CLP(FD) solver to compute a mathematically optimal migration plan — none of these operations alter a single bit of cluster state. The WAM heap that backs these queries is mutated only by the Chapter 22 CGO ingestor, which reads from VictoriaMetrics. The LLM's query_wam tool calls are read-only by enforcement, not by convention.
This separation is not a rule the LLM is asked to follow. It is a structural property of the tool dispatch layer. validateReadOnlyGoal compares the goal string's first token against a static whitelist. A goal beginning with assertz, retract, or any predicate not in the whitelist is rejected in Go before it reaches the WAM pool — without Prolog evaluation, without any possibility of partial execution. The LLM cannot bypass this check by rephrasing the goal: the check operates on the syntactic form of the goal string, not on the model's intent.
32.5.2 The Two-Stage Injection Defence
The tool interface introduces a new injection surface absent from the Chapter 31 RAG architecture: the LLM now emits Prolog goal strings that are dispatched to the WAM. A compromised or jailbroken model might emit a goal string designed to escape the whitelist and execute a mutation.
The defence is two-stage. First, validateReadOnlyGoal — a prefix whitelist in Go that rejects any goal not beginning with a known safe predicate. Second, for the schedule_ha_proposal tool, the host and VM Prolog terms are constructed entirely from typed Go structs (HostSpec, VMSpec) whose fields are validated integers and sanitised node-name atoms. The LLM's JSON output is parsed into those structs; the Prolog goal string is constructed from the struct fields by buildHostsProlog and buildVMsProlog, not by string interpolation of LLM-emitted content. If the LLM were to emit {"name": "pve1), assertz(foo", "ram_mib": 32768}, sanitiseNodeName would reduce the name to an empty string, which maps to the atom unknown. The injection characters never appear in the goal string dispatched to the WAM.
32.5.3 The HITL Token as a Constitutional Guarantee
The approval token provides a guarantee stronger than a policy: it is a physical custody requirement. A migration cannot execute unless a valid token is POSTed to /api/v1/agent/approve. A valid token can only be generated by generateApprovalToken() using crypto/rand — it cannot be predicted by the LLM, computed by another process, or inferred by observing the SSE stream. The token expires after 10 minutes. The pending plan lives only in the server process's sync.Map — not on disk, not accessible via any endpoint other than approve and deny, not transmitted back to the LLM after the agent suspension event.
The operator's physical click is the final gate. The approve button calls fetch() from the operator's authenticated browser session whose JWT was issued at login. The sequence — LLM proposes, operator reads, operator decides, operator clicks, token POSTed, migrations execute — cannot be collapsed or short-circuited by the LLM, by the Go orchestrator, or by any automated process. It can only be completed by the human holding the authenticated session.
This is the operational definition of deductive power without executive authority: the system reasons to the correct mathematical answer and proves it via constraint propagation, but it cannot translate that proof into physical action without a human in the causal chain. The CLP(FD) solver's migration plan is a theorem; the operator's click is the only instrument that can give that theorem physical consequence.
No comments to display
No comments to display