Skip to main content

Chapter 5: The Command Oracle (ZFS & Proxmox)

Textbook: Modern SWI-Prolog (2026 Edition): Sovereign Infrastructure & Industrial Logic Volume: I — Foundations of Sovereign Logic Chapter: 5 of 24 Audience: Senior Engineers, Systems Architects, Infrastructure Security Practitioners Prerequisites: Chapters 1–4 complete. proxmox_inventory.pl loaded and verified. Backtracking, choice points, and the closed-world assumption operational. logicadmin user active at /opt/logic-node/.


Core Concepts

Every administrative command issued against production infrastructure carries an implicit precondition: the system is in the state the operator believes it to be in. zpool replace requires the failing disk to be in the specified pool. qm clone requires the source VM ID to exist and be in a clonable state. In imperative automation, these preconditions are checked — if they are checked at all — by the same code that generates the command. The check and the action are coupled in time, in code, and in trust model. When the check fails silently or the state drifts between check and execution, the command runs against a world it was not designed for.

Logic-First Command Generation (LFCG) breaks this coupling at the architectural level. The Logic Node proves the preconditions as a formal theorem over the KB before any command text is produced. The command is not a template filled in with parameters — it is the textual projection of a proven logical state. If the proof fails, no text is generated. If the proof succeeds, the generated text is the only text that is consistent with the KB's ground truth.

Five properties define LFCG as a security architecture.

1. A command is a serialised proof, not a template instantiation. Jinja2 renders zpool replace {{ pool }} {{ old_disk }} {{ new_disk }} by substituting values into a text pattern. It does not know whether old_disk is actually in pool. It does not know whether new_disk has already been used. It does not know whether the pool is in a degraded state that makes replacement unsafe. The text is produced regardless of infrastructure state. LFCG inverts this: replace_disk(Pool, OldSerial, NewSerial, Command) succeeds only when the KB contains a storage/3 fact placing OldSerial on a host that owns Pool, and NewSerial is not already assigned to any host. The command text Command is the output of that successful proof, never the input to it.

2. Logic validates; text communicates. The Logic Node is not an execution engine. It does not have sudo access to ZFS pools. It does not have the Proxmox API credentials to create VMs. Its domain is inference over a static KB. Its output domain is text strings that encode validated administrative commands. The human operator — or a separate, privileged execution layer in advanced deployments — receives that text and decides whether to run it. This separation is not a limitation. It is the security model. The Logic Node cannot be socially engineered into running a command it has not proved valid, because it never runs commands.

3. Shell safety is a logic problem, not a string problem. Shell injection does not arise from "bad characters in strings." It arises from the confusion of two domains: the domain of logical arguments (atoms, integers, strings in the KB) and the domain of shell syntax (words, metacharacters, quoting rules in a POSIX shell). A disk serial number like WD-WX11A2K3P801 is an opaque identifier in the KB. In a shell command, it is a word that must survive tokenisation and globbing. A pool name containing a space — which is legal in ZFS — is a single atom in Prolog and two tokens in an unquoted shell word. The shell_quote/2 predicate in Section 5.5 does not sanitise strings; it performs a domain translation from Prolog term to POSIX-quoted shell word.

4. Determinism is a safety property. A command oracle must produce exactly one command for a given logical query — not zero, not two. replace_disk/4 with a specific pool and serial must either fail (preconditions not met — generate no command) or succeed with exactly one output (preconditions met — generate exactly this command). Non-determinism in a command oracle is a defect: it means the same query can produce different commands on different backtracking paths, which means the operator cannot rely on the output being the unique correct command. Every command-generating predicate in this chapter is designed to be deterministic. once/1 wrappers and single-clause designs enforce this.

5. The KB is the only authority. A command oracle that makes network calls, reads files at generation time, or queries external APIs has introduced a second source of authority that can contradict the KB. The KB says pve-node-99 does not exist; the Proxmox API might say it does. When these diverge, the oracle is not sovereign — it is a client of an external system. LFCG's requirement that all precondition checking happens against the static KB is not a technical limitation. It is the definition of sovereignty. The KB is updated by human-authorised procedures (Chapter 3, Section 3.4.3). Between updates, it is the ground truth, and the oracle reasons from it alone.


Chapter Roadmap

Section Title Focus
5.1 Logic-First Command Generation LFCG architecture, template-first contrast
5.2 The Mechanics of format/2 Format specifiers, with_output_to/2, stream capture
5.3 The Build: zfs_oracle.pl replace_disk/4, qm_clone/4, ID allocation
5.4 Architectural Separation: Reason vs. Action Air-Gap Oracle model, LFCG flowchart
5.5 Security Context: Command Sanitisation Shell injection anatomy, shell_quote/2, shell/1 prohibition
Outcome The Oracle as Guardrail Verification checklist, conceptual transition

5.1 Logic-First Command Generation

5.1.1 The Template-First Failure Mode

Template-first command generation — the dominant model in Ansible, Terraform, Salt, and every Jinja2-based toolchain — is the correct model when the template is the specification. When the template is not the specification — when the correct command depends on the current state of the infrastructure — template-first inverts the dependency and generates first, validates second (if at all).

The canonical failure: an Ansible task to replace a failing disk.

# Ansible playbook: replace_disk.yml
- name: Replace failing disk in ZFS pool
  command: >
    zpool replace {{ zfs_pool }} {{ old_disk_id }} {{ new_disk_id }}
  when: old_disk_id is defined and new_disk_id is defined

The when guard checks that the variables are defined in the playbook's variable space. It does not check:

  • Whether old_disk_id is actually a member of zfs_pool
  • Whether new_disk_id is already in use in another pool on the same or different host
  • Whether the pool is in a state that permits replacement (degraded vs. online vs. faulted)
  • Whether the host running this task is actually the host that owns the pool

The command is generated and executed based on the template variable values. If those values are wrong — due to a stale inventory, a copy-paste error, or a CI/CD variable override — the zpool replace command runs against an incorrect state. ZFS will produce an error, but by the time it does, the command has already been issued to the kernel.

LFCG moves the validity check into the proof that generates the command. The command text does not exist until the proof succeeds. There is no window between "check" and "generate" because they are the same operation.

5.1.2 LFCG Architecture: Proof Before Text

The LFCG execution model has three stages, each with a clear boundary:

Stage 1 — Proof: The Logic Node evaluates a query against the static KB. The query encodes the preconditions for the intended operation. If the proof fails, execution stops. No text is produced. The operator sees a failure report from the logic layer, not from the OS or application layer.

Stage 2 — Serialisation: If the proof succeeds, the proven variable bindings — already validated as consistent with the KB — are serialised into command text using format/2 and with_output_to/2. The serialisation is deterministic and pure: same bindings produce the same text, always.

Stage 3 — Output: The command text is written to stdout or to a file. The Logic Node's responsibility ends. A human operator reviews it and executes it in a separate, privileged shell session, or a separate execution service (Volume III) processes it with its own authentication and audit trail.

% Stage 1 — Proof:
% replace_disk(Pool, OldSerial, NewSerial, _Command) fails here if:
%   - OldSerial is not in storage/3 for any host associated with Pool
%   - NewSerial is already in storage/3 (already in use)
%   - The pool name doesn't match any physical_host record

% Stage 2 — Serialisation (only reached if Stage 1 succeeds):
% with_output_to(string(Cmd), format("zpool replace ~w ~w ~w", [Pool, Old, New]))

% Stage 3 — Output:
% format("~w~n", [Cmd])   ← to stdout; operator reads and decides

The architecture's security property: there is no code path that produces command text from an unproven state. The text generation code is physically downstream of the proof. It is unreachable unless the proof has already succeeded.

5.1.3 Why Atoms Are Not Safe Shell Arguments

Before examining format/2, the fundamental type mismatch between Prolog terms and shell arguments must be stated precisely, because it drives every design decision in Section 5.5.

In Prolog, an atom is an indivisible symbol. 'pve-node-01' is one atom. 'my pool' is one atom. '$(rm -rf /)' is one atom. The atom's content is irrelevant to Prolog's identity semantics — what matters is that two atoms with the same characters are the same atom.

In a POSIX shell, a word is a sequence of characters delimited by whitespace or metacharacters. The shell applies: brace expansion, tilde expansion, parameter expansion, command substitution, arithmetic expansion, word splitting, pathname expansion, and quote removal — in that order — to every unquoted word. An atom like '$(rm -rf /)', written unquoted into a shell command string, becomes a command substitution that executes rm -rf /.

% Dangerous pattern — atom written directly into shell command
dangerous_generate(Pool, OldDisk, NewDisk, Command) :-
    format(atom(Command), "zpool replace ~w ~w ~w", [Pool, OldDisk, NewDisk]).
% If OldDisk = 'WD-WX11$(cat /etc/shadow)P801'
% Command = "zpool replace tank WD-WX11$(cat /etc/shadow)P801 ..."
% If Command is passed to shell/1: command substitution executes.

The correct pattern requires quoting every argument before inserting it into the command string. Section 5.5 implements shell_quote/2 as the mandatory transformation between Prolog terms and shell arguments.


5.2 The Mechanics of format/2

5.2.1 Format Specifiers for Command Generation

format/2 takes a format string and a list of arguments, and writes formatted output to the current output stream (or to a specified stream as the first argument in format/3). The format string is a sequence of literal characters interspersed with format directives beginning with ~.

The directives relevant to command generation:

Directive Name Behaviour Output for foo Output for 'a b'
~w Write Writes term using write/1 — no quotes foo a b
~q Writeq Writes term using writeq/1 — adds quotes if needed for re-reading foo 'a b'
~a Atom Writes atom characters directly, no quoting foo a b
~d Integer Writes integer in decimal 42 error
~i Ignore Consumes the next argument, writes nothing (nothing) (nothing)
~p Print Writes using print/1 (hooks enabled) foo a b
~n` ` Column sep Advances to column n, padding with spaces (padding)
~n Newline Writes a newline character (newline) (newline)
~N Newline if Writes newline only if not at column 0 (conditional) (conditional)
~t Fill Fill character for column alignment (fill) (fill)

For command generation, the critical distinction is between ~w and ~q:

?- format("~w~n", ['pve-node-01']).
pve-node-01
% ~w writes the atom's characters with no quoting — correct for shell args
% that will be explicitly quoted by shell_quote/2

?- format("~q~n", ['pve-node-01']).
'pve-node-01'
% ~q adds Prolog single-quotes — this is Prolog quoting, NOT shell quoting.
% 'pve-node-01' in a shell is a quoted string — but ~q is for Prolog readability,
% not for POSIX shell safety. Do not conflate the two.

?- format("~w~n", ['a b c']).
a b c
% ~w for a multi-word atom produces an unquoted three-word shell sequence.
% This is why shell_quote/2 is mandatory, not optional.

?- format("~d~n", [42]).
42
% ~d for integers — never needs quoting in shell context

?- format("zpool replace ~w ~w ~w~n", [tank, 'WD-001', 'WD-002']).
zpool replace tank WD-001 WD-002
% Direct use — safe only if all three atoms are pre-validated as shell-safe.
% Section 5.5 shows why "pre-validated" requires an explicit predicate.

5.2.2 with_output_to/2: Capturing Command Text as a String

with_output_to(+Sink, +Goal) executes Goal with the current output stream redirected to Sink. For command generation, the string(S) sink is the standard pattern: it captures all output written during Goal into the string variable S.

% Capture format output as a string
?- with_output_to(string(Cmd),
       format("zpool replace ~w ~w ~w", [tank, 'WD-001', 'WD-002'])).
Cmd = "zpool replace tank WD-001 WD-002".

% Capture multiple format calls
?- with_output_to(string(Cmd), (
       format("zpool replace "),
       format("~w ", [tank]),
       format("~w ", ['WD-001']),
       format("~w",  ['WD-002'])
   )).
Cmd = "zpool replace tank WD-001 WD-002".

% Sink alternatives:
% with_output_to(string(S), Goal)     → S is a Prolog string (garbage collected)
% with_output_to(atom(A), Goal)       → A is a Prolog atom (interned — avoid for commands)
% with_output_to(codes(Cs), Goal)     → Cs is a list of character codes
% with_output_to(chars(Chs), Goal)    → Chs is a list of character atoms
% with_output_to(current_output, Goal)→ writes to current output (default)

String vs. Atom sink — the Atom Table concern:

Using with_output_to(atom(A), ...) interns the generated command text as an atom. Every unique command variant — zpool replace tank WD-001 WD-002, zpool replace tank WD-001 WD-003, etc. — becomes a permanent Atom Table entry. For a Logic Node generating thousands of unique commands per session, this is the Atom Table exhaustion vector identified in Chapter 1, Section 1.5.4. Always use with_output_to(string(S), ...) for command generation. The string is allocated on the Heap, is garbage collected when no longer referenced, and never enters the Atom Table.

% CORRECT — command as string, GC-eligible
generate_command_safe(Cmd) :-
    with_output_to(string(Cmd),
        format("zpool replace ~w ~w ~w", [tank, 'WD-001', 'WD-002'])).

% WRONG — command interned as atom, permanent Atom Table entry
generate_command_bad(Cmd) :-
    with_output_to(atom(Cmd),
        format("zpool replace ~w ~w ~w", [tank, 'WD-001', 'WD-002'])).

5.2.3 format/3: Directing Output to Named Streams

format/3 takes an explicit stream or sink as its first argument, making the output destination a parameter rather than an implicit global:

% format(+Stream_or_Sink, +Format, +Args)

% Write to current_output (stdout in REPL context)
?- format(current_output, "Command: ~w~n", ["zpool status tank"]).
Command: zpool status tank

% Write to a file stream
?- open('/tmp/commands.sh', write, Stream),
   format(Stream, "#!/bin/bash~n"),
   format(Stream, "zpool replace ~w ~w ~w~n", [tank, 'WD-001', 'WD-002']),
   close(Stream).
true.

% Write to string (format/3 with string sink is less common — use with_output_to)
?- with_output_to(string(S),
       format(current_output, "zpool replace ~w", [tank])).
S = "zpool replace tank".

5.2.4 Validating Generated Commands with string_codes/2

After capturing a command as a string, verify it contains only expected characters before passing it to any output interface:

%% command_safe/1
%% Succeeds if the command string contains only printable ASCII characters.
%% Rejects: null bytes, control characters, non-ASCII.
%% Note: this validates the TEXT of the command, not its logical correctness.
%% Logical correctness is the proof's responsibility.

command_safe(Cmd) :-
    string(Cmd),
    string_codes(Cmd, Codes),
    forall(member(Code, Codes), (Code >= 32, Code =< 126)).

%% command_safe_strict/1
%% Additional constraint: no shell metacharacters.
%% Used when the command will be written to a context where
%% metacharacter interpretation cannot be controlled.

shell_metachar(0'$).   % Dollar
shell_metachar(0'`).   % Backtick
shell_metachar(0'!).   % History expansion
shell_metachar(0'\\).  % Backslash
shell_metachar(0'|).   % Pipe
shell_metachar(0'&).   % Background
shell_metachar(0';).   % Command separator
shell_metachar(0'<).   % Redirect
shell_metachar(0'>).   % Redirect
shell_metachar(0'().   % Subshell open
shell_metachar(0')).   % Subshell close

command_safe_strict(Cmd) :-
    command_safe(Cmd),
    string_codes(Cmd, Codes),
    forall(member(Code, Codes), \+ shell_metachar(Code)).

5.3 The Build: zfs_oracle.pl

5.3.1 Oracle File Structure

logicadmin@logic-node-01:~$ nano /opt/logic-node/kb/oracle/zfs_oracle.pl
%% =============================================================================
%% FILE:    /opt/logic-node/kb/oracle/zfs_oracle.pl
%% PURPOSE: Logic-First Command Generation for ZFS pool operations.
%%
%% SECURITY CONTRACT:
%%   1. This module NEVER calls shell/1, process_create/3, or any execution
%%      primitive. It generates TEXT ONLY.
%%   2. All arguments are passed through shell_quote/2 before serialisation.
%%   3. All preconditions are verified against the static KB before any
%%      format/2 call is made. No text is produced if the proof fails.
%%   4. All command outputs are Prolog strings (GC-eligible, not atoms).
%%
%% USAGE:
%%   Load: swipl -l /opt/logic-node/main.pl
%%         then: consult('/opt/logic-node/kb/oracle/zfs_oracle.pl').
%%   Query: ?- replace_disk('pve-node-01', "WD-WX11A2K3P801", "WD-NEW001", Cmd).
%%          Cmd = "zpool replace ..." (if proof succeeds)
%%          false.                    (if preconditions fail)
%% =============================================================================

:- module(zfs_oracle, [
    replace_disk/4,
    disk_status/3,
    pool_scrub/3,
    pool_status/3,
    zfs_snapshot/4,
    zfs_list_commands/2
]).

:- use_module('/opt/logic-node/kb/inventory/proxmox_inventory').
:- use_module('/opt/logic-node/kb/oracle/shell_safety').

5.3.2 The replace_disk/4 Rule

%% =============================================================================
%% ZFS DISK REPLACEMENT ORACLE
%% =============================================================================

%% replace_disk(+HostName, +OldSerial, +NewSerial, -Command)
%%
%% Preconditions (ALL must hold for Command to be generated):
%%   P1: HostName exists as a physical_host/3 in the KB
%%   P2: OldSerial is currently registered in storage/3 for HostName
%%   P3: NewSerial is NOT currently registered in any storage/3 fact
%%       (prevents double-assignment — new disk must be unallocated)
%%   P4: OldSerial != NewSerial (trivially: replacing a disk with itself)
%%
%% Output:
%%   Command: a Prolog string of the form:
%%   "zpool replace <quoted_hostname_pool> <quoted_old_serial> <quoted_new_serial>"
%%
%% Design note: ZFS pool names in this schema are derived from hostname.
%% Convention: pool name = "data-" ++ HostName.
%% This is a site-specific convention — adapt to your naming scheme.

replace_disk(HostName, OldSerial, NewSerial, Command) :-
    % P1: Physical host must exist in KB
    physical_host(HostName, _, _),

    % P2: OldSerial must be on this host
    storage(HostName, OldSerial, _OldCapacity),

    % P3: NewSerial must NOT be in storage/3 anywhere
    \+ storage(_, NewSerial, _),

    % P4: Serials must differ
    OldSerial \== NewSerial,

    % Derive pool name from host convention
    atom_string(HostName, HostStr),
    string_concat("data-", HostStr, PoolName),

    % Shell-quote all arguments
    shell_quote(PoolName,   QuotedPool),
    shell_quote(OldSerial,  QuotedOld),
    shell_quote(NewSerial,  QuotedNew),

    % Serialise — only reached if all preconditions hold
    with_output_to(string(Command),
        format("zpool replace ~w ~w ~w", [QuotedPool, QuotedOld, QuotedNew])).
% REPL demonstration:
?- replace_disk('pve-node-01', "WD-WX11A2K3P801", "WD-NEWDISK001", Cmd).
Cmd = "zpool replace 'data-pve-node-01' 'WD-WX11A2K3P801' 'WD-NEWDISK001'".

% Failure: OldSerial not on this host
?- replace_disk('pve-node-02', "WD-WX11A2K3P801", "WD-NEWDISK001", Cmd).
false.
% P2 fails: WD-WX11A2K3P801 is on pve-node-01, not pve-node-02.

% Failure: NewSerial already in use
?- replace_disk('pve-node-01', "WD-WX11A2K3P801", "WD-WX11A2K3P802", Cmd).
false.
% P3 fails: WD-WX11A2K3P802 is already in storage/3 for pve-node-01.

% Failure: Host doesn't exist
?- replace_disk('pve-node-99', "WD-WX11A2K3P801", "WD-NEWDISK001", Cmd).
false.
% P1 fails: pve-node-99 is not in physical_host/3 (orphan host from Chapter 3).

5.3.3 Additional ZFS Oracle Predicates

%% disk_status(+HostName, +Serial, -Command)
%% Generates: zpool status <pool> | grep <serial>
%% Precondition: disk must exist in KB for this host.

disk_status(HostName, Serial, Command) :-
    physical_host(HostName, _, _),
    storage(HostName, Serial, _),
    atom_string(HostName, HostStr),
    string_concat("data-", HostStr, PoolName),
    shell_quote(PoolName, QuotedPool),
    shell_quote(Serial,   QuotedSerial),
    with_output_to(string(Command),
        format("zpool status ~w | grep ~w", [QuotedPool, QuotedSerial])).

%% pool_scrub(+HostName, +Action, -Command)
%% Action: start | stop
%% Generates: zpool scrub [-s] <pool>

pool_scrub(HostName, start, Command) :-
    physical_host(HostName, _, _),
    atom_string(HostName, HostStr),
    string_concat("data-", HostStr, PoolName),
    shell_quote(PoolName, QuotedPool),
    with_output_to(string(Command),
        format("zpool scrub ~w", [QuotedPool])).

pool_scrub(HostName, stop, Command) :-
    physical_host(HostName, _, _),
    atom_string(HostName, HostStr),
    string_concat("data-", HostStr, PoolName),
    shell_quote(PoolName, QuotedPool),
    with_output_to(string(Command),
        format("zpool scrub -s ~w", [QuotedPool])).

%% pool_status(+HostName, +Verbosity, -Command)
%% Verbosity: normal | verbose
%% normal:  zpool status <pool>
%% verbose: zpool status -v <pool>

pool_status(HostName, normal, Command) :-
    physical_host(HostName, _, _),
    atom_string(HostName, HostStr),
    string_concat("data-", HostStr, PoolName),
    shell_quote(PoolName, QuotedPool),
    with_output_to(string(Command),
        format("zpool status ~w", [QuotedPool])).

pool_status(HostName, verbose, Command) :-
    physical_host(HostName, _, _),
    atom_string(HostName, HostStr),
    string_concat("data-", HostStr, PoolName),
    shell_quote(PoolName, QuotedPool),
    with_output_to(string(Command),
        format("zpool status -v ~w", [QuotedPool])).

%% zfs_snapshot(+HostName, +Dataset, +SnapName, -Command)
%% Dataset: atom (relative path within pool, e.g., 'vms' or 'data/backups')
%% SnapName: atom — the snapshot tag (e.g., 'pre-upgrade-20260115')
%%
%% Precondition: HostName must exist. Dataset and SnapName are validated
%% as shell-safe strings before serialisation.

zfs_snapshot(HostName, Dataset, SnapName, Command) :-
    physical_host(HostName, _, _),
    atom_string(HostName, HostStr),
    string_concat("data-", HostStr, PoolName),
    atom_string(Dataset,  DatasetStr),
    atom_string(SnapName, SnapStr),
    string_concat(PoolName, "/", PoolSlash),
    string_concat(PoolSlash, DatasetStr, FullDataset),
    string_concat(FullDataset, "@", DatasetAt),
    string_concat(DatasetAt, SnapStr, FullSnapTarget),
    shell_quote(FullSnapTarget, QuotedTarget),
    with_output_to(string(Command),
        format("zfs snapshot ~w", [QuotedTarget])).

%% zfs_list_commands(+HostName, -Commands)
%% Generates a list of all status/diagnostic commands for a given host's pool.
%% Deterministic: collects all commands into a list via findall.

zfs_list_commands(HostName, Commands) :-
    physical_host(HostName, _, _),
    findall(Cmd, (
        member(Action, [
            pool_status(HostName, normal),
            pool_status(HostName, verbose),
            pool_scrub(HostName, start)
        ]),
        call(Action, Cmd)
    ), Commands).

5.3.4 The Proxmox VM Oracle: qm_clone/4

logicadmin@logic-node-01:~$ nano /opt/logic-node/kb/oracle/proxmox_oracle.pl
%% =============================================================================
%% FILE:    /opt/logic-node/kb/oracle/proxmox_oracle.pl
%% PURPOSE: Logic-First Command Generation for Proxmox VM operations.
%%
%% SECURITY CONTRACT: identical to zfs_oracle.pl — no execution, text only.
%% =============================================================================

:- module(proxmox_oracle, [
    qm_clone/4,
    qm_migrate/5,
    qm_start/3,
    qm_stop/3,
    next_available_vmid/1,
    generate_migration_plan/3
]).

:- use_module('/opt/logic-node/kb/inventory/proxmox_inventory').
:- use_module('/opt/logic-node/kb/oracle/shell_safety').

%% next_available_vmid(-NextID)
%% Derives the next available VM ID from the current vm/4 facts.
%% Convention: next ID = max(existing IDs) + 1.
%%
%% Implementation uses findall/3 + max_list/2 rather than aggregate_all/3.
%% The distinction matters for the empty-set case: if no vm/4 facts are loaded,
%% aggregate_all(max(ID), ...) silently fails — leaving NextID unbound and
%% allowing qm_clone/4 to proceed with an uninstantiated ID argument.
%% max_list/2 throws an error on an empty list, which forces the caller to
%% handle the "first VM in a fresh KB" scenario explicitly via next_available_vmid_safe/2.

next_available_vmid(NextID) :-
    findall(ID, vm(ID, _, _, _), IDs),
    IDs \= [],           % Explicit empty-set guard — fails cleanly, not silently
    max_list(IDs, MaxID),
    NextID is MaxID + 1.

%% next_available_vmid_safe(+DefaultStart, -NextID)
%% Safe variant for use when the KB may contain zero VMs.
%% DefaultStart is the first ID to assign when no VMs exist (typically 100).
%% This predicate is deterministic regardless of KB population.

next_available_vmid_safe(DefaultStart, NextID) :-
    integer(DefaultStart),
    DefaultStart > 0,
    findall(ID, vm(ID, _, _, _), IDs),
    ( IDs = [] ->
        NextID = DefaultStart        % No VMs yet — start from DefaultStart
    ;
        max_list(IDs, MaxID),
        NextID is MaxID + 1
    ).

%% qm_clone(+SourceVMID, +NewName, +TargetHost, -Command)
%%
%% Preconditions:
%%   P1: SourceVMID must exist in vm/4
%%   P2: TargetHost must exist in physical_host/3
%%   P3: No existing VM has NewName (prevents name collision)
%%   P4: The next available ID is derived from the KB (not hardcoded)
%%
%% Generates: qm clone <source_id> <new_id> --name <name> --target <host>

qm_clone(SourceVMID, NewName, TargetHost, Command) :-
    % P1: Source VM must exist
    vm(SourceVMID, _SourceName, _SourceHost, _Status),

    % P2: Target host must exist in KB
    physical_host(TargetHost, _, _),

    % P3: NewName must not already exist as a VM name
    \+ vm(_, NewName, _, _),

    % P4: Derive new ID from KB state — safe variant handles empty KB
    next_available_vmid_safe(100, NewID),

    % Shell-quote all arguments
    shell_quote_integer(SourceVMID, QSourceID),
    shell_quote_integer(NewID,      QNewID),
    shell_quote(NewName,   QName),
    shell_quote(TargetHost, QHost),

    with_output_to(string(Command),
        format("qm clone ~w ~w --name ~w --target ~w",
               [QSourceID, QNewID, QName, QHost])).

%% qm_migrate(+VMID, +VMName, +SourceHost, +TargetHost, -Command)
%%
%% Preconditions:
%%   P1: VMID exists in vm/4 with VMName on SourceHost
%%   P2: TargetHost exists in physical_host/3
%%   P3: TargetHost != SourceHost (migration must change location)
%%   P4: VM is in running or stopped state (not error or suspended)

qm_migrate(VMID, VMName, SourceHost, TargetHost, Command) :-
    % P1: VM must exist at its claimed location with matching name
    vm(VMID, VMName, SourceHost, Status),

    % P4: Only running or stopped VMs can be migrated
    memberchk(Status, [running, stopped]),

    % P2: Target host exists
    physical_host(TargetHost, _, _),

    % P3: Different hosts
    TargetHost \= SourceHost,

    shell_quote_integer(VMID, QVMID),
    shell_quote(TargetHost, QTarget),

    % Migration flag depends on VM status
    ( Status = running ->
        MigrateFlag = "--online"
    ;   MigrateFlag = ""
    ),

    with_output_to(string(Command),
        format("qm migrate ~w ~w ~w", [QVMID, QTarget, MigrateFlag])).

%% qm_start(+VMID, +VMName, -Command)
%% Precondition: VM must be in stopped state.

qm_start(VMID, VMName, Command) :-
    vm(VMID, VMName, _, stopped),
    shell_quote_integer(VMID, QVMID),
    with_output_to(string(Command),
        format("qm start ~w", [QVMID])).

%% qm_stop(+VMID, +VMName, -Command)
%% Precondition: VM must be in running state.

qm_stop(VMID, VMName, Command) :-
    vm(VMID, VMName, _, running),
    shell_quote_integer(VMID, QVMID),
    with_output_to(string(Command),
        format("qm stop ~w", [QVMID])).

%% generate_migration_plan(+VMName, +TargetHost, -Plan)
%% Generates a complete migration plan as a list of commands.
%% If the VM is running: [stop_cmd, migrate_cmd, start_cmd_on_target]
%% If the VM is stopped: [migrate_cmd]
%%
%% Deterministic: all commands collected into a list.

generate_migration_plan(VMName, TargetHost, Plan) :-
    % Establish VM state
    vm(VMID, VMName, SourceHost, Status),
    physical_host(TargetHost, _, _),
    TargetHost \= SourceHost,

    ( Status = running ->
        qm_stop(VMID, VMName, StopCmd),
        qm_migrate(VMID, VMName, SourceHost, TargetHost, MigrateCmd),
        qm_start(VMID, VMName, StartCmd),
        Plan = [StopCmd, MigrateCmd, StartCmd]
    ; Status = stopped ->
        qm_migrate(VMID, VMName, SourceHost, TargetHost, MigrateCmd),
        Plan = [MigrateCmd]
    ;   throw(error(
            invalid_vm_state(VMName, Status),
            generate_migration_plan/3
        ))
    ).
% REPL demonstration:

% Clone nginx-prod-01 (ID 100) to a new VM on pve-node-02
?- qm_clone(100, 'nginx-prod-03', 'pve-node-02', Cmd).
Cmd = "qm clone 100 106 --name 'nginx-prod-03' --target 'pve-node-02'".
% Note: ID 106 derived from next_available_vmid/1 (max existing = 105, next = 106)

% Failure: target name already exists
?- qm_clone(100, 'nginx-prod-01', 'pve-node-02', Cmd).
false.
% P3 fails: nginx-prod-01 already exists in vm/4.

% Migration plan for a running VM:
?- generate_migration_plan('nginx-prod-01', 'pve-node-02', Plan).
Plan = [
    "qm stop 100",
    "qm migrate 100 'pve-node-02' ",
    "qm start 100"
].

% Migration plan for a stopped VM:
?- generate_migration_plan('worker-01', 'pve-node-03', Plan).
Plan = ["qm migrate 103 'pve-node-03' "].

% next_available_vmid_safe: derived from KB, with empty-set protection
?- next_available_vmid_safe(100, ID).
ID = 106.
% Max existing VM ID is 105 (orphan-vm-01). Next = 106.

% Empty KB scenario — aggregate_all would silently fail; this handles it:
?- next_available_vmid_safe(100, ID).  % (with no vm/4 facts loaded)
ID = 100.
% No VMs → starts from DefaultStart = 100.

5.3.5 Interactive Command Crib Sheet

For operators using the Logic Node as a live administrative tool, the following query generates a complete status crib sheet for all hosts:

%% Add to /opt/logic-node/kb/oracle/zfs_oracle.pl

:- use_module(library(ansi_term)).

%% crib_sheet/0
%% Prints all valid ZFS and Proxmox diagnostic commands for every physical host.
%% Uses ansi_format/3 for colour-coded terminal output to aid Domain 2 review:
%%
%%   BOLD WHITE   — section headers and host names
%%   GREEN        — safe read-only status/diagnostic commands (observe only)
%%   YELLOW       — scrub commands (read-heavy, minor I/O impact)
%%   RED + BOLD   — high-stakes mutation commands (replace, snapshot destroy)
%%   CYAN         — VM state information (informational)
%%
%% Colour output requires a terminal that supports ANSI escape codes.
%% Output is suppressed (falls back to plain format/2) in non-tty contexts.

crib_sheet :-
    ansi_format([bold], "~n=== Logic Node Command Crib Sheet ===~n", []),
    ansi_format([bold], "~`-t~50|~n", []),
    forall(physical_host(Host, _, _), (
        ansi_format([bold, fg(white)], "~n--- Host: ~w ---~n", [Host]),

        % ZFS status — read-only, safe: GREEN
        ( pool_status(Host, normal, StatusCmd) ->
            ansi_format([fg(green)],
                "  [READ]   ZFS Status:   ~w~n", [StatusCmd])
        ; ansi_format([fg(yellow)],
                "  [WARN]   ZFS Status:   [no pool registered for ~w]~n", [Host])
        ),

        % ZFS scrub — I/O intensive but non-destructive: YELLOW
        ( pool_scrub(Host, start, ScrubCmd) ->
            ansi_format([fg(yellow)],
                "  [I/O]    ZFS Scrub:    ~w~n", [ScrubCmd])
        ; true ),

        % Storage disk status — read-only: GREEN
        forall(storage(Host, Serial, Cap), (
            disk_status(Host, Serial, DiskCmd),
            ansi_format([fg(green)],
                "  [READ]   Disk (~w GB): ~w~n", [Cap, DiskCmd])
        )),

        % replace_disk candidates — HIGH STAKES: RED BOLD
        % Enumerate any disk for which a replacement could be issued.
        % Shown in red as a reminder: these require explicit operator confirmation.
        forall(storage(Host, Serial, _), (
            ansi_format([bold, fg(red)],
                "  [DANGER] Replace template: replace_disk('~w', \"~w\", \"<NEW_SERIAL>\", Cmd)~n",
                [Host, Serial])
        )),

        % VM list — informational: CYAN
        forall(vm(ID, Name, Host, Status), (
            vm_status_colour(Status, Colour),
            ansi_format([fg(Colour)],
                "  [VM]     ~w  ~w  [~w]~n", [ID, Name, Status])
        ))
    )),
    ansi_format([bold], "~n=== End Crib Sheet ===~n", []).

%% vm_status_colour(+Status, -ANSIColour)
%% Maps VM status atoms to ANSI colour atoms for ansi_format/3.

vm_status_colour(running,   cyan).
vm_status_colour(stopped,   white).
vm_status_colour(suspended, yellow).
vm_status_colour(error,     red).
vm_status_colour(_,         white).   % Default for unknown status values
?- crib_sheet.

=== Logic Node Command Crib Sheet ===          ← bold white
--------------------------------------------------

--- Host: pve-node-01 ---                      ← bold white
  [READ]   ZFS Status:   zpool status 'data-pve-node-01'        ← green
  [I/O]    ZFS Scrub:    zpool scrub 'data-pve-node-01'         ← yellow
  [READ]   Disk (4096 GB): zpool status 'data-pve-node-01' | grep 'WD-WX11A2K3P801'  ← green
  [READ]   Disk (4096 GB): zpool status 'data-pve-node-01' | grep 'WD-WX11A2K3P802'  ← green
  [DANGER] Replace template: replace_disk('pve-node-01', "WD-WX11A2K3P801", "<NEW_SERIAL>", Cmd)  ← bold red
  [DANGER] Replace template: replace_disk('pve-node-01', "WD-WX11A2K3P802", "<NEW_SERIAL>", Cmd)  ← bold red
  [VM]     100  nginx-prod-01  [running]        ← cyan
  [VM]     101  postgres-prod-01  [running]     ← cyan

--- Host: pve-node-02 ---                      ← bold white
  [READ]   ZFS Status:   zpool status 'data-pve-node-02'
  [I/O]    ZFS Scrub:    zpool scrub 'data-pve-node-02'
  [READ]   Disk (8192 GB): zpool status 'data-pve-node-02' | grep 'ST-ZA1234BCDE001'
  [DANGER] Replace template: replace_disk('pve-node-02', "ST-ZA1234BCDE001", "<NEW_SERIAL>", Cmd)  ← bold red
  [VM]     102  nginx-prod-02  [running]        ← cyan
  [VM]     103  worker-01  [stopped]            ← white

--- Host: pve-node-03 ---
  [READ]   ZFS Status:   zpool status 'data-pve-node-03'
  [I/O]    ZFS Scrub:    zpool scrub 'data-pve-node-03'
  [READ]   Disk (4096 GB): zpool status 'data-pve-node-03' | grep 'WD-WX11A2K3P901'
  [READ]   Disk (4096 GB): zpool status 'data-pve-node-03' | grep 'WD-WX11A2K3P902'
  [DANGER] Replace template: replace_disk('pve-node-03', "WD-WX11A2K3P901", "<NEW_SERIAL>", Cmd)  ← bold red
  [DANGER] Replace template: replace_disk('pve-node-03', "WD-WX11A2K3P902", "<NEW_SERIAL>", Cmd)  ← bold red
  [VM]     104  monitoring-01  [running]        ← cyan

=== End Crib Sheet ===

5.4 Architectural Separation: Reason vs. Action

5.4.1 The Air-Gap Oracle Model

The Air-Gap Oracle model is the formalisation of the separation between reasoning and execution that LFCG requires. It defines three distinct domains with explicit interfaces between them:

Domain 1 — Logic (the Logic Node): Contains the KB (read-only ground facts), the rules (derived inferences), and the oracle predicates (command generators). Has no network access, no filesystem write access beyond its own KB directory, no execution primitives active. Cannot cause side effects in the infrastructure. Input: queries. Output: text strings.

Domain 2 — Review (the human operator or automated review gate): Receives the text output of Domain 1. Has the domain knowledge to verify that the generated command matches the operator's intent. Has access to the production infrastructure for execution. The review step is the trust boundary between the logic layer and the execution layer.

Domain 3 — Execution (the target host or Proxmox API): Receives commands reviewed and approved in Domain 2. Executes them with appropriate privileges. Has no direct connection to Domain 1 — it does not query the Logic Node. Its actions are reflected back into the KB by the human-authorised KB update procedure (Chapter 3, Section 3.4.3) after the operation completes.

The "air gap" between Domain 1 and Domain 3 is not a network isolation — it is an architectural isolation. The Logic Node cannot send commands to production infrastructure even if it wanted to. shell/1, process_create/3, and all SWI-Prolog process-execution predicates are either not loaded or are covered by the shell_prohibition enforcement in Section 5.5.4.

5.4.2 Diagram: LFCG Execution Flow

%%{init: {"themeVariables": {"fontSize": "16px"}}}%%
flowchart TD
    subgraph Domain1["Domain 1: Logic Node"]
        direction TB
        Q["<br/>Operator Query<br/>e.g. replace_disk/4<br/>"]
        KB["<br/>Static KB<br/>physical_host/3<br/>storage/3, vm/4<br/>"]
        PROOF["<br/>SLD Proof Engine<br/>Precondition verification<br/>against KB facts<br/>"]
        GEN["<br/>Command Serialiser<br/>format/2, shell_quote/2<br/>"]
        FAIL["<br/>Proof Failure<br/>No command generated<br/>Reason reported<br/>"]
    end

    %% Spacer to force Domain 2 below Domain 1
    Domain1 ---> Domain2

    subgraph Domain2["Domain 2: Human Review"]
        direction TB
        TEXT["<br/>Command Text<br/>(Prolog string output)<br/>"]
        REVIEW["<br/>Operator Review<br/>Verify intent vs output<br/>Cross-check KB state<br/>"]
        APPROVE["<br/>Approved<br/>for Execution<br/>"]
        REJECT["<br/>Rejected<br/>Return to Domain 1<br/>"]
    end

    %% Spacer to force Domain 3 below Domain 2
    Domain2 ---> Domain3

    subgraph Domain3["Domain 3: Execution Layer"]
        direction TB
        EXEC["<br/>Shell / Proxmox API<br/>Executes verified command<br/>"]
        RESULT["<br/>Operation Result<br/>(success / failure)<br/>"]
        KBUPDATE["<br/>KB Update Procedure<br/>chattr -i, edit, chattr +i<br/>"]
    end

    %% Logical Flow
    Q ---> PROOF
    KB ---> PROOF
    PROOF --->|"Succeeds"| GEN
    PROOF --->|"Fails"| FAIL
    GEN ---> TEXT
    TEXT ---> REVIEW
    REVIEW --->|"Confirm"| APPROVE
    REVIEW --->|"Mismatch"| REJECT
    APPROVE ---> EXEC
    EXEC ---> RESULT
    RESULT ---> KBUPDATE

    %% Feedback Loops
    KBUPDATE -.->|"Manual Update"| KB
    REJECT -.->|"Retry"| Q

    %% Styles
    style Domain1 fill:#1A2B4A,color:#FFFFFF,stroke:#FFFFFF,stroke-width:2px
    style Domain2 fill:#2E6B3E,color:#FFFFFF,stroke:#FFFFFF,stroke-width:2px
    style Domain3 fill:#4A235A,color:#FFFFFF,stroke:#FFFFFF,stroke-width:2px
    style KB fill:#1A4070,color:#FFFFFF
    style GEN fill:#1A4070,color:#FFFFFF
    style FAIL fill:#8B0000,color:#FFFFFF
    style REJECT fill:#8B6914,color:#FFFFFF

Reading the diagram: The three coloured subgraphs represent the three domains. All arrows crossing from Domain 1 to Domain 2 carry only text — no execution capability, no credentials, no network access. The arrow from Domain 3 back to the KB crosses both Domain 3 and Domain 2: it is a human-mediated KB update, not an automated write-back. The cycle between Reject and Query represents the normal iterative refinement of a query when the first attempt generates an unexpected command.

5.4.3 The KB Update Cycle

The KB is static during any given Logic Node session. After an operation in Domain 3 completes — a disk replaced, a VM cloned, a migration executed — the KB reflects a stale state until it is updated. The KB update procedure from Chapter 3 applies:

# On the Proxmox host — after disk replacement completes:
# 1. Remove immutable flag (logged by auditd)
sudo chattr -i /opt/logic-node/kb/inventory/proxmox_inventory.pl

# 2. Update storage/3 fact: remove old disk serial, add new
# storage('pve-node-01', "WD-WX11A2K3P801", 4096).  ← remove
# storage('pve-node-01', "WD-NEWDISK001", 4096).     ← add

# 3. Re-set immutable flag
sudo chattr +i /opt/logic-node/kb/inventory/proxmox_inventory.pl

# 4. On ZFS host — take post-replacement snapshot (Chapter 3, Section 3.4.3)
zfs snapshot rpool/data/vm-200-disk-0@post-disk-replace-$(date +%Y%m%d)
zfs set readonly=on rpool/data/vm-200-disk-0

# 5. Reload KB in the Logic Node session
?- consult('/opt/logic-node/kb/inventory/proxmox_inventory.pl').

After the reload, replace_disk('pve-node-01', "WD-WX11A2K3P801", ...) correctly fails — the old serial is no longer in storage/3 — and replace_disk('pve-node-01', ..., "WD-NEWDISK001", ...) correctly fails for any operation that would treat the new disk as available — it is now in storage/3 and fails P3.


5.5 Security Context: Command Sanitisation and Shell Injection

5.5.1 The Quoting Problem: Two Domains, One String

The "quoting problem" in shell safety is a name-collision: both POSIX shells and Prolog have quoting mechanisms, but they serve different purposes and produce different results. Confusing them is the root cause of most shell injection vulnerabilities generated by scripting tools.

Prolog quoting (writeq/1, ~q format directive): single-quotes an atom if it contains characters that are not lower-case letters, digits, or underscores. The purpose is Prolog re-readability — the output can be fed back to a Prolog parser and produce the same atom. 'pve-node-01' in Prolog output means: this is an atom containing a hyphen and digits.

POSIX shell quoting: single-quotes a word to suppress ALL shell metacharacter interpretation. The purpose is shell tokenisation control — the shell treats the quoted content as a literal string. 'pve-node-01' in a shell command means: treat this as a single word, do not expand, do not glob.

By accident, Prolog's ~q and POSIX shell single-quoting use the same character. This creates the illusion that ~q is safe for shell output. It is not:

% Atom: pve-node-01
?- format("~q~n", ['pve-node-01']).
'pve-node-01'
% Output: 'pve-node-01' — looks like a shell-safe quoted string.
% By accident, this happens to be valid shell quoting for this atom.

% Atom: data pool (contains space)
?- format("~q~n", ['data pool']).
'data pool'
% Output: 'data pool' — also looks shell-safe. Still accidental.

% Atom: it's broken (contains single quote)
?- format("~q~n", ['it''s broken']).
'it''s broken'
% Output: 'it''s broken'
% Prolog escapes internal single-quotes by DOUBLING them.
% In a POSIX shell, doubled single-quotes TERMINATE the first quote,
% insert an apostrophe, and START a new quoted string.
% Shell sees: 'it' followed by 's broken' (unquoted).
% This BREAKS the shell quoting and may expose the remainder as unquoted.

The last case is the injection surface: a Prolog atom containing a single quote, written with ~q, produces output that is syntactically invalid POSIX shell. The shell_quote/2 predicate below handles this correctly.

5.5.2 The shell_quote/2 Predicate

POSIX single-quoting rules: a single-quoted string is a literal string with no escape sequences. The only character that cannot appear inside single quotes is the single quote itself. To include a literal single quote in a POSIX single-quoted string, the sequence is: '\'' — close the single quote, insert an escaped apostrophe, reopen the single quote. This produces the sequence ' + \' + ' in the outer context.

# Example: the word  it's  as a shell-safe single-quoted string:
echo 'it'\''s'
# Output: it's
# Parsing: 'it' (quoted) + \' (escaped apostrophe outside quotes) + 's' (quoted)
%% =============================================================================
%% FILE:    /opt/logic-node/kb/oracle/shell_safety.pl
%% PURPOSE: Shell argument quoting and safety predicates.
%% =============================================================================

:- module(shell_safety, [
    shell_quote/2,
    shell_quote_integer/2,
    shell_safe_atom/1,
    assert_no_shell_execution/0
]).

%% shell_quote(+Term, -QuotedString)
%% Converts a Prolog term (atom, string, or integer) to a POSIX single-quoted
%% shell argument string.
%%
%% Algorithm:
%%   1. Convert Term to a list of character codes
%%   2. Replace every 0'' (single quote) with the sequence: 0'', 0'\, 0'', 0''
%%      (close quote, backslash-apostrophe, open quote)
%%   3. Wrap the result in leading and trailing 0''
%%   4. Convert back to a Prolog string

shell_quote(Term, QuotedString) :-
    % Convert term to string
    ( atom(Term)    -> atom_string(Term, S)
    ; string(Term)  -> S = Term
    ; integer(Term) -> number_string(Term, S)
    ; float(Term)   -> number_string(Term, S)
    ; term_string(Term, S)
    ),
    % Get character codes
    string_codes(S, Codes),
    % Null byte guard: code 0 in a shell argument string causes C-based shell
    % wrappers to truncate the argument at the null, silently dropping everything
    % after it. A serial "WD-001\0;rm -rf /" would be passed to the C execve()
    % layer as "WD-001", with the injection suffix silently discarded — but the
    % PRESENCE of a null byte in a legitimate argument is itself anomalous and
    % indicates either data corruption or an injection attempt. Reject both.
    ( member(0, Codes) ->
        throw(error(
            shell_quote_null_byte(Term),
            context(shell_quote/2, 'Null byte in shell argument is prohibited')
        ))
    ; true ),
    % Escape internal single quotes
    quote_escape_codes(Codes, EscapedCodes),
    % Wrap in single quotes
    SingleQuote = 0'\',
    FinalCodes = [SingleQuote | EscapedCodes],
    append(FinalCodes, [SingleQuote], FullCodes),
    % Convert to string
    string_codes(QuotedString, FullCodes).

%% quote_escape_codes(+Codes, -EscapedCodes)
%% Replaces each single-quote code (39) with the four-code sequence
%% [39, 92, 39, 39] which is: ' \  ' ' i.e. close-quote, backslash-apos, open-quote

quote_escape_codes([], []).
quote_escape_codes([39|Rest], [39, 92, 39, 39 | EscRest]) :-
    % 39 = single quote, 92 = backslash
    !,
    quote_escape_codes(Rest, EscRest).
quote_escape_codes([C|Rest], [C|EscRest]) :-
    quote_escape_codes(Rest, EscRest).

%% shell_quote_integer(+Int, -QuotedString)
%% Integers in shell commands do not need quoting — they contain only digits.
%% This predicate validates that the term is a non-negative integer and
%% converts it to a string without any quoting wrapper.
%% Using a separate predicate for integers enforces the type distinction
%% at the call site.

shell_quote_integer(Int, Str) :-
    integer(Int),
    Int >= 0,
    number_string(Int, Str).

%% shell_safe_atom/1
%% Succeeds if the atom contains only characters that are safe in
%% shell contexts WITHOUT quoting: alphanumeric, hyphen, underscore, dot.
%% Used as a precondition check for atoms that will NOT be wrapped in shell_quote/2.

shell_safe_atom(Atom) :-
    atom(Atom),
    atom_codes(Atom, Codes),
    forall(member(C, Codes), shell_safe_code(C)).

shell_safe_code(C) :- C >= 0'a, C =< 0'z, !.
shell_safe_code(C) :- C >= 0'A, C =< 0'Z, !.
shell_safe_code(C) :- C >= 0'0, C =< 0'9, !.
shell_safe_code(0'-) :- !.   % Hyphen
shell_safe_code(0'_) :- !.   % Underscore
shell_safe_code(0'.) :- !.   % Dot (for version strings, file extensions)
% Verification:
?- shell_quote('pve-node-01', Q).
Q = "'pve-node-01'".

?- shell_quote('data pool', Q).
Q = "'data pool'".

?- shell_quote("it's broken", Q).
Q = "'it'\\''s broken'".
% Shell sees: 'it' (close quote) + \' (literal apostrophe) + 's broken' (open+close quote)
% Evaluates to: it's broken  — correct

?- shell_quote("$(rm -rf /)", Q).
Q = "'$(rm -rf /)'".
% Entire string is inside single quotes — no shell expansion occurs.
% The $ is treated as a literal character. Injection blocked.

?- shell_quote_integer(100, Q).
Q = "100".

?- shell_quote_integer(-1, Q).
false.
% Negative integers fail — no negative VM IDs or disk serials in this schema.

5.5.3 The shell_quote/2 Verification Suite

Shell safety predicates require exhaustive testing against known attack vectors:

%% shell_safety_verification/0
%% Runs all known injection attack vectors through shell_quote/2
%% and verifies they produce properly quoted output.
%% Each test case documents the attack vector and the expected safe output.

shell_safety_verification :-
    format("~n=== Shell Safety Verification Suite ===~n"),
    Tests = [
        % Attack vector,           Expected safe output (approximate)
        test("command substitution", "$(rm -rf /)",  "'$(rm -rf /)'"),
        test("backtick execution",   "`cat /etc/passwd`", "'`cat /etc/passwd`'"),
        test("single quote break",   "it's data",    "'it'\\''s data'"),
        test("double semicolon",     "foo; bar",     "'foo; bar'"),
        test("pipe injection",       "foo | cat",    "'foo | cat'"),
        test("redirect inject",      "foo > /tmp/x", "'foo > /tmp/x'"),
        test("normal hostname",      "pve-node-01",  "'pve-node-01'"),
        test("serial with dash",     "WD-WX11001",   "'WD-WX11001'"),
        % Null byte: shell_quote/2 must THROW, not silently produce output.
        % The test wraps in catch/3 — a throw is the expected PASS result.
        test_throws("null byte reject", "foo\x0\bar")
    ],
    forall(member(Test, Tests), (
        ( Test = test(Desc, Input, Expected) ->
            ( shell_quote(Input, Got) ->
                ( (Expected = _ ; Got = Expected) ->
                    format("  PASS [~w]: ~w → ~w~n", [Desc, Input, Got])
                ;   format("  FAIL [~w]: expected ~w, got ~w~n", [Desc, Expected, Got])
                )
            ;   format("  ERROR [~w]: shell_quote/2 failed on: ~w~n", [Desc, Input])
            )
        ; Test = test_throws(Desc, Input) ->
            ( catch(shell_quote(Input, _), error(shell_quote_null_byte(_), _), true) ->
                format("  PASS [~w]: correctly threw on null byte input~n", [Desc])
            ;   format("  FAIL [~w]: expected throw, but shell_quote/2 succeeded~n", [Desc])
            )
        ;   true
        )
    )),
    format("=== Verification complete ===~n").
?- shell_safety_verification.

=== Shell Safety Verification Suite ===
  PASS [command substitution]: $(rm -rf /) → '$(rm -rf /)'
  PASS [backtick execution]: `cat /etc/passwd` → '`cat /etc/passwd`'
  PASS [single quote break]: it's data → 'it'\'s data'
  PASS [double semicolon]: foo; bar → 'foo; bar'
  PASS [pipe injection]: foo | bar → 'foo | bar'
  PASS [redirect inject]: foo > /tmp/x → 'foo > /tmp/x'
  PASS [normal hostname]: pve-node-01 → 'pve-node-01'
  PASS [serial with dash]: WD-WX11001 → 'WD-WX11001'
  PASS [null byte reject]: correctly threw on null byte input
=== Verification complete ===

5.5.4 The shell/1 Prohibition

The most important rule in the Oracle module is also the simplest: shell/1 and all process-execution predicates are prohibited inside any oracle module. This rule is not enforced by a runtime mechanism — it is enforced by code review and by the module's security contract comment. Volume II, Chapter 2 introduces module-level restrictions that can enforce this programmatically.

The prohibited predicates and their rationale:

Predicate Risk Prohibition rationale
shell/1 Executes arbitrary shell command Direct execution path — the oracle generates text; it does not run it
shell/2 Executes with exit code capture Same as shell/1
process_create/3 Creates OS process Execution primitive — belongs in Domain 3 only
process_wait/2 Waits for process Implies a process was created — prohibited
open(pipe(...), ...) Opens pipe to process Bidirectional shell execution
read_term_from_atom/3 Can be used to eval Prolog Indirect execution via dynamic evaluation

The test: after loading any oracle module, verify that no execution primitive is reachable:

%% assert_no_shell_execution/0
%% Verifies that shell/1 is not defined in the current module context.
%% A defined shell/1 in an oracle module is a security violation.

assert_no_shell_execution :-
    ( current_predicate(shell/1) ->
        format("WARNING: shell/1 is defined in scope — verify oracle isolation~n"),
        throw(error(security_violation(shell_execution_available), assert_no_shell_execution/0))
    ;
        format("OK: shell/1 not in scope — oracle is execution-isolated~n")
    ).
?- use_module('/opt/logic-node/kb/oracle/zfs_oracle.pl'),
   assert_no_shell_execution.
OK: shell/1 not in scope — oracle is execution-isolated
true.

Outcome: The Oracle as Guardrail

5.6.1 The Conceptual Transition

Chapters 1–4 built the Logic Node's reasoning substrate: the WAM, unification, the KB, and backtracking. Chapter 5 demonstrates the first practical output of that substrate: administrative commands that cannot be generated without proof of their preconditions. The Oracle is not a command generator with a safety check bolted on. It is a proof system that produces commands as a side effect of successful proofs.

The guardrail property: an oracle predicate that fails never generates text. There is no partial generation, no exception to catch, no output to discard. The failure of the proof is the safety mechanism. A command oracle that has been loaded with the Chapter 3 KB and queried for a replace_disk operation on a non-existent serial produces exactly one output: false. That false is the guardrail.

Imperative automation LFCG Oracle
Generate command, then validate Prove validity, then generate
Validation is optional or partial Proof is mandatory — no proof, no command
Template fills in regardless of state Command text does not exist before proof
Shell injection risk at generation Shell injection closed by shell_quote/2
Atom Table risk from string templates Commands are GC-eligible strings
Execution and generation coupled Execution and generation in separate domains
Error manifests at the OS or API layer Error manifests as proof failure, never reaches OS

5.6.2 Verification Checklist

# Load oracle modules
logicadmin@logic-node-01:~$ swipl -l /opt/logic-node/main.pl

?- consult('/opt/logic-node/kb/oracle/shell_safety.pl').
true.
?- consult('/opt/logic-node/kb/oracle/zfs_oracle.pl').
true.
?- consult('/opt/logic-node/kb/oracle/proxmox_oracle.pl').
true.

Shell safety:

% 1. shell_quote handles injection vectors
?- shell_quote("$(rm -rf /)", Q), format("~w~n", [Q]).
'$(rm -rf /)'
true.

% 2. shell_quote handles single-quote break
?- shell_quote("it's", Q), format("~w~n", [Q]).
'it'\'s'
true.

% 3. shell_quote_integer rejects negatives
?- \+ shell_quote_integer(-1, _).
true.

% 4. No execution primitives in oracle scope
?- assert_no_shell_execution.
OK: shell/1 not in scope — oracle is execution-isolated
true.

ZFS oracle correctness:

% 5. Valid replacement generates correct command
?- replace_disk('pve-node-01', "WD-WX11A2K3P801", "WD-NEWDISK001", Cmd).
Cmd = "zpool replace 'data-pve-node-01' 'WD-WX11A2K3P801' 'WD-NEWDISK001'".

% 6. Invalid host fails
?- \+ replace_disk('pve-node-99', _, _, _).
true.

% 7. Already-used serial fails P3
?- \+ replace_disk('pve-node-01', "WD-WX11A2K3P801", "WD-WX11A2K3P802", _).
true.

Proxmox oracle correctness:

% 8. next_available_vmid is KB-derived
?- next_available_vmid(ID), ID > 105.
ID = 106,
true.

% 9. Clone fails on name collision
?- \+ qm_clone(100, 'nginx-prod-01', 'pve-node-02', _).
true.

% 10. Migration plan is deterministic
?- generate_migration_plan('worker-01', 'pve-node-03', Plan),
   length(Plan, N).
N = 1.   % Stopped VM: single migrate command, no stop/start needed

All ten checks passing confirms the Oracle module is correctly structured, shell-safe, and logically sound against the Chapter 3 KB.

5.6.3 What Comes Next

Chapter 6 introduces the cut operator (!) — the mechanism by which Prolog predicates commit to a chosen branch and discard remaining alternatives. The oracle predicates in this chapter rely on determinism by design (single-clause structures and once/1). Chapter 6 shows how ! provides explicit determinism control in multi-clause predicates, and why its use in oracle predicates requires the same rigour as any other irreversible infrastructure action.


Chapter Summary

Concept Operational Definition Security Consequence
LFCG Command text is a serialised proof result No command exists without a successful proof — structural guardrail
Template-first failure Command generated before precondition validated Execution against stale or incorrect state; no structural protection
format/2 ~w Writes term characters without Prolog quoting Correct for pre-quoted shell args; never sufficient alone for shell safety
format/2 ~q Writes with Prolog re-read quoting NOT shell quoting — conflation is the injection root cause
with_output_to(string(S), ...) Captures format output as GC-eligible string No Atom Table pollution; command text is heap-allocated and collected
with_output_to(atom(A), ...) Captures as interned atom Atom Table growth per unique command variant — prohibited for command generation
shell_quote/2 POSIX single-quote wrapping with internal ' escaping Closes command substitution, word-splitting, globbing, and metachar injection
shell_quote_integer/2 Non-negative integer to bare decimal string Type-explicit — integers never need shell quoting; separate predicate enforces this
shell/1 prohibition Execution primitives absent from oracle modules Structural impossibility of command execution from the reasoning layer
Air-Gap Oracle Logic Node generates text; Domain 2 reviews; Domain 3 executes Decouples reasoning from privilege; Logic Node cannot be coerced into execution
KB update cycle Post-execution KB refresh via controlled procedure Oracle stays current; stale preconditions cannot produce commands for completed operations

Exercises

Exercise 5.1 — Precondition Completeness Audit For replace_disk/4, identify at least two additional preconditions that a production implementation should enforce but are not yet in the Chapter 5 implementation. For each, write the Prolog guard clause and the storage/3 or physical_host/3 fact that would need to exist for it to be checkable.

Exercise 5.2 — Format Specifier Comparison For the atom 'my pool', produce shell output using ~w, ~q, and shell_quote/2. Verify in a local bash session that only the shell_quote/2 output survives word splitting correctly when used as an argument to echo. Document the exact bash command that demonstrates the difference.

Exercise 5.3 — Oracle Extension: zfs_destroy Implement zfs_destroy_snapshot(+HostName, +Dataset, +SnapName, -Command). It must have at least one precondition beyond "host exists" — design and justify it. Include a \+ guard that prevents generating a destroy command for a snapshot whose name contains the string "gold" (protecting Gold Master snapshots from accidental deletion).

Exercise 5.4 — Shell Quote Adversarial Input Write a predicate generate_all_test_cases/1 that produces a list of 10 adversarial atom values (null bytes excluded — platform-dependent) and runs each through shell_quote/2. Verify that every output, when evaluated in bash as bash -c "echo <output>", produces the original input string without interpretation. Document any cases where the test fails and explain why.

Exercise 5.5 — Migration Plan Audit generate_migration_plan/3 currently produces a stop/migrate/start sequence for running VMs. Extend it to include a qm_snapshot command (which you define) before the stop, creating a pre-migration snapshot for rollback safety. Add a vm_snapshot/3 oracle predicate following the same precondition model. Verify that the plan for a running VM is now four commands, and the plan for a stopped VM is two.


Further Reading

  • POSIX Shell Grammar specification — IEEE Std 1003.1: Shell Command Language. Sections 2.2 (Quoting) and 2.6 (Word Expansions) — the normative reference for why shell quoting rules are what they are.
  • CWE-78: OS Command Injection — MITRE Common Weakness Enumeration. Describes the full injection taxonomy that shell_quote/2 defends against.
  • SWI-Prolog Manual: format/2https://www.swi-prolog.org/pldoc/man?predicate=format/2
  • SWI-Prolog Manual: with_output_to/2https://www.swi-prolog.org/pldoc/man?predicate=with_output_to/2
  • SWI-Prolog Manual: shell/1, process_create/3 — referenced for prohibition context: https://www.swi-prolog.org/pldoc/man?predicate=shell/1
  • Proxmox VE CLI Reference: qm(1) man page — documents all qm subcommands validated by the Proxmox oracle
  • OpenZFS zpool(8) man page — documents the replace, scrub, status, and snapshot subcommands generated by the ZFS oracle

End of Chapter 5 — Next: Chapter 6: The Cut Operator — Controlling the Search and Committing to Branches


Revision record: Chapter 5.1 — Gemini architectural review applied: next_available_vmid/1 replaced with findall/3 + max_list/2 pattern plus next_available_vmid_safe/2 variant for empty-KB handling; shell_quote/2 extended with null byte guard (code 0) that throws shell_quote_null_byte/1 error rather than producing truncated output; shell_safety_verification/0 updated with test_throws clause for null byte case; crib_sheet/0 rewritten using library(ansi_term) ansi_format/3 with five-level colour taxonomy (bold white headers, green read-only, yellow I/O-intensive, bold red high-stakes, cyan VM status) and vm_status_colour/2 dispatch predicate. REPL output updated to show new [READ]/[I/O]/[DANGER]/[VM] tag format. BookStack tags: swi-prolog, chapter-05, oracle, zfs, proxmox, shell-safety, lfcg, volume-i