Skip to main content

Chapter 24: The Physics of CLP(FD)

Classical Prolog solves constraint problems by generating candidate values and testing them — a strategy that scales with combinatorial brutality on any problem where the variable count exceeds a handful and the domain is large, because every failed test point wastes work that propagation would have pruned before the search began. Constraint Logic Programming over Finite Domains (library(clpfd)) inverts this: constraints are posted as relations that immediately reduce the domains of all variables they mention, so that by the time the search phase begins the space of possible values has been mathematically compressed to a fraction of its original size — and the search is bounded, deterministic in its termination guarantee, and productive even when run backwards or with partially-instantiated variables.


24.1 Relational Arithmetic vs Imperative Evaluation

24.1.1 The Directionality Problem with is/2

Prolog's built-in arithmetic evaluator is/2 is an imperative instruction: evaluate the right-hand side expression using the current bindings and unify the result with the left-hand side. The right-hand side must be fully ground at the moment of evaluation. If it is not, is/2 throws error(instantiation_error, context(is/2, _)) — not because the arithmetic is impossible, but because the evaluator is a function that requires all inputs before it can produce any output.

?- X is 3 + Y.
ERROR: Arguments are not sufficiently instantiated
ERROR: In:
ERROR:   [2] _G1 is 3+_G2

The error is not recoverable by posting additional constraints later. The evaluation is a one-shot imperative that either succeeds immediately or fails permanently. For arithmetic expressions where all values are known this design is correct — is/2 has served classical Prolog well for 40 years. It becomes a liability in any domain where the relationship between variables is known before the specific values are. That is the definition of every combinatorial optimisation problem: we know that total RAM usage must not exceed host capacity before we know which VMs are assigned to which host.

The mismatch forces classical Prolog into generate-and-test: enumerate concrete candidates by backtracking, then evaluate is/2 on each candidate to check whether constraints are satisfied. The test is cheap; the enumeration is expensive. For a 14-hypervisor, 50-VM placement problem the enumeration space is 14^50 before the first constraint is evaluated.

24.1.2 CLP(FD) Relational Arithmetic

#=/2 is a constraint, not an evaluator. It posts a relation between two arithmetic expressions over integer variables and immediately begins propagating its consequences into the domains of those variables. The expressions on either side may contain uninstantiated variables; the constraint does not require ground input.

?- use_module(library(clpfd)).
true.

?- X + Y #= 10.
X+Y#=10.

The result is not a value — it is a residual constraint: X+Y#=10 printed back as evidence that the constraint has been stored and will be enforced as subsequent information arrives. No value has been computed because none are yet known. The constraint is alive.

?- X + Y #= 10, X #= 3.
X = 3,
Y = 7.

When X #= 3 is posted, the propagator for X+Y#=10 fires immediately: it knows X=3, so 3+Y#=10 reduces to Y#=7, and Y is instantiated. No search was performed. No backtracking occurred. The second constraint provided enough information for the first to compute a unique solution by pure propagation.

The directionality is symmetrical — the system computes Y from X and X from Y equally:

?- X + Y #= 10, Y #= 7.
X = 3,
Y = 7.

Constraints can also detect infeasibility without any search:

?- X + Y #= 10, X #> 6, Y #> 6.
false.

X > 6 and Y > 6 implies X + Y > 12, which contradicts X + Y = 10. The contradiction is detected during propagation, before any labeling call. No search was attempted; the system proved impossibility through constraint combination alone.

24.1.3 The Full Set of CLP(FD) Relational Operators

CLP(FD) provides relational counterparts for all six comparison operators, plus equivalences and reification:

% Equality and inequality:
?- X #= 5.        X = 5.
?- X #\= 5, X in 3..7, labeling([], [X]).
   X = 3 ; X = 4 ; X = 6 ; X = 7.

% Ordering:
?- X #< Y, X in 1..3, Y in 1..3, labeling([], [X,Y]).
   X=1,Y=2 ; X=1,Y=3 ; X=2,Y=3.

?- X #> 5, X in 1..8.
   X in 6..8.

% Greater-or-equal, less-or-equal:
?- X #>= Y, X in 1..3, Y in 1..3, X #\= Y, labeling([], [X,Y]).
   X=2,Y=1 ; X=3,Y=1 ; X=3,Y=2.

All operators work in all directions: X #< Y constrains both X and Y simultaneously. Posting X #< Y when X in 5..10 immediately propagates Y in 6..inf (Y must exceed X's minimum), and X in -inf..9 (X must be below Y's maximum), intersected with Y's declared domain.

24.1.4 The Computational Contract

The computational contract of CLP(FD) is explicit and enforced: constraint posting is deterministic, runs in polynomial time, and monotonically reduces domains. Labeling (the search phase, §24.4) is non-deterministic, may backtrack, and uses the reduced domains produced by propagation. The two phases are strictly separated — posting constraints never searches, and labeling never posts new propagating constraints of its own. This separation is the source of both the correctness guarantee and the efficiency advantage.

% The correct CLP(FD) pattern for a capacity planning predicate:

fits_on_host(VMList, HostCapacity) :-
    % Phase 1: post constraints (deterministic, O(N), no search)
    maplist([vm(_, Size)]>>(Size #>= 0), VMList),
    maplist([vm(_, Size), S]>>(S = Size), VMList, Sizes),
    sum(Sizes, #=<, HostCapacity).
    % No labeling/2 call here. Constraint posting is the complete predicate body.
    % If called with ground arguments, it succeeds or fails by propagation.
    % If called with unground arguments, it produces residual constraints.

24.2 The Mechanics of Domain Propagation

24.2.1 in/2 and ins/2

in/2 declares the domain of a single CLP(FD) variable. ins/2 declares the same domain for a list of variables simultaneously. Both take a domain expression: Low..High for a contiguous integer range, Low..High \/ Low2..High2 for a union of ranges, or {V1, V2, V3} for an explicit finite set.

?- X in 1..10.
X in 1..10.

?- [A, B, C] ins 0..7.
A in 0..7,
B in 0..7,
C in 0..7.

?- X in 1..5 \/ 8..12.
X in 1..5\/8..12.

?- X in {2, 4, 6, 8}.
X in 2\/4\/6\/8.

?- X in 0..sup.      % sup = +infinity
X in 0..sup.

?- X in inf..(-1).   % inf = -infinity
X in inf.. -1.

The domain is not a set of stored values — it is a constraint on what values the variable may eventually take. Domain declarations participate in constraint propagation: if a subsequent constraint eliminates all values from a domain, the system fails immediately without search.

% Domain intersection produces the tightest declared range:
?- X in 1..10, X in 5..20.
X in 5..10.

% Domain declaration + constraint → immediate reduction:
?- X in 1..10, X #> 7.
X in 8..10.

% Domain exhaustion → immediate failure (no search needed):
?- X in 1..5, X #> 10.
false.

24.2.2 Arc Consistency and Bounds Propagation

Arc consistency is a property of the constraint network: for every value remaining in a variable's domain, there exists at least one consistent assignment of values for every other variable that the variable shares a constraint with. When a propagator achieves arc consistency, it removes all values that cannot participate in any solution — not just values that obviously violate a single constraint, but values that are locally valid but globally unsatisfiable.

Bounds propagation is a weaker but computationally cheaper approach: propagators only update the minimum and maximum bounds of each variable's domain, not every value within it. SWI-Prolog's CLP(FD) uses bounds propagation by default for arithmetic constraints and escalates to full arc consistency (via propagator_constraint/3 internals) for constraints like all_different/1 where bound-only propagation would miss significant reductions.

The following REPL transcript demonstrates domain reduction in sequence — each constraint posting narrows one or both variable domains before any values are searched:

% Step 1: Declare initial domains.
?- X in 1..10, Y in 5..15.
X in 1..10,
Y in 5..15.

% Step 2: Post X #> Y.
% X must exceed Y. Y's minimum is 5, so X must be at least 6 (X > 5).
% X's maximum is 10, so Y must be at most 9 (Y < 10).
% Both domains shrink in a single constraint posting.
?- X in 1..10, Y in 5..15, X #> Y.
X in 6..10,
Y in 5..9.

% Step 3: Add X #< 8.
% X is now bounded above by 7. X in 6..7.
% Y must still satisfy Y < X. X's new maximum is 7, so Y < 7, Y =< 6.
% Y in 5..6.
?- X in 1..10, Y in 5..15, X #> Y, X #< 8.
X in 6..7,
Y in 5..6.

% Step 4: Add Y #>= 6.
% Y is forced to 6 (the only value in 5..6 that is >= 6).
% Y = 6. Now X must satisfy X > 6, and X in 6..7, so X = 7.
% Both variables fully instantiated by propagation alone. No labeling.
?- X in 1..10, Y in 5..15, X #> Y, X #< 8, Y #>= 6.
X = 7,
Y = 6.

% Step 5: Demonstrate failure detection without labeling.
% After the domain reduction in step 3 (X in 6..7, Y in 5..6),
% add X #< 6. This requires X < 6, but X in 6..7 means X >= 6.
% Domain becomes empty — immediate failure, no search.
?- X in 1..10, Y in 5..15, X #> Y, X #< 8, X #< 6.
false.

The key observation: every domain reduction was achieved by constraint posting, not by search. At the end of step 3, without having tried a single concrete value, the system knows that X can only be 6 or 7, and Y can only be 5 or 6. The search tree that labeling/2 would need to explore has been reduced from 10 × 11 = 110 nodes to 2 × 2 = 4 nodes — a 96% reduction before the first labeling choice.

24.2.3 Constrain-and-Generate vs Generate-and-Test

%%{init: {"themeVariables": {"fontSize": "14px"}}}%%
flowchart TD
    ROOT["Search Problem\nX in 1..10, Y in 1..10\nGoal: X + Y = 7 and X > Y\n100 total candidate pairs"]

    GANDT["Generate-and-Test\n(Classical Prolog)\nEnumerate ALL pairs first\nTest constraint after generation"]

    GT1["X=1: try Y=1..10\nTest X+Y=7 AND X>Y\nY=6 satisfies sum but 1>6 fails\n10 tests, 0 solutions"]
    GT2["X=2: try Y=1..10\nY=5 satisfies sum but 2>5 fails\n10 tests, 0 solutions"]
    GT3["X=3: try Y=1..10\nY=4 satisfies sum but 3>4 fails\n10 tests, 0 solutions"]
    GT4["X=4: try Y=1..10\nY=3 satisfies sum AND 4>3 ✓\n10 tests, 1 solution found\nafter 40 wasted tests"]
    GT5["X=5..10: try Y=1..10\nX+Y=7 impossible for X >= 5\n60 more useless tests\nTotal wasted: 97 of 100"]

    CLAG["Constrain-and-Generate\n(CLP(FD))\nPost constraints first\nDomains reduce before search"]

    CP1["Post X + Y #= 7\nX in 1..10, Y in 1..10\n→ X in 1..6  (Y >=1 forces X <=6)\n→ Y in 1..6  (X >=1 forces Y <=6)\nDomain reduced: 100 → 36 pairs"]
    CP2["Post X #> Y\n→ X must exceed Y\n→ X in 2..6  (Y >=1 forces X >=2)\n→ Y in 1..5  (X <=6 forces Y <=5)\nDomain reduced: 36 → valid triples only\nSearch space: 3 candidate pairs remain"]
    CP3["Label [ff] — 3 nodes only\nX=4, Y=3 ✓\nX=5, Y=2 ✓\nX=6, Y=1 ✓\nComplete solution set\n97 nodes never visited"]

    ROOT --> GANDT
    ROOT --> CLAG
    GANDT --> GT1
    GANDT --> GT2
    GANDT --> GT3
    GANDT --> GT4
    GANDT --> GT5
    CLAG --> CP1
    CP1 --> CP2
    CP2 --> CP3

    style ROOT fill:#1A2B4A,color:#FFFFFF
    style GANDT fill:#7A1A1A,color:#FFFFFF
    style GT1 fill:#5A0A0A,color:#FFFFFF
    style GT2 fill:#5A0A0A,color:#FFFFFF
    style GT3 fill:#5A0A0A,color:#FFFFFF
    style GT4 fill:#2A4A2A,color:#FFFFFF
    style GT5 fill:#5A0A0A,color:#FFFFFF
    style CLAG fill:#1A4070,color:#FFFFFF
    style CP1 fill:#1A3560,color:#FFFFFF
    style CP2 fill:#1A3560,color:#FFFFFF
    style CP3 fill:#1A6B3A,color:#FFFFFF

The combinatorial gap widens with problem size. For a bin-packing problem with 14 hypervisors and 50 VMs, the classical generate-and-test space is 14^50 ≈ 10^57 candidates. CLP(FD) capacity constraints on per-hypervisor RAM eliminate all assignments where any hypervisor is overcommitted before a single candidate is enumerated. The constraint propagation phase runs in O(N × M) time where N is the number of VMs and M is the number of hosts; the search phase operates on whatever residual space remains after propagation.


24.3 The Build: Hardware Capacity Constraint Engine

24.3.1 Design

capacity_solver.pl is the constraint-posting module of the VM placement system. Its responsibility is to declare variable domains and post capacity constraints. It never calls labeling/2. The separation is enforced by design, not by convention: a module that only posts constraints is safe to call in any context — during route planning, during health-rule evaluation, during Chapter 25's bin-packer — without triggering a search that would block the WAM engine.

The module implements three levels of the placement problem. vm_capacity_check/5 handles a single hypervisor: given a host's RAM and CPU budget and a list of VM resource requirements, it creates one binary placement variable per VM and posts the knapsack capacity constraints. vm_capacity_check_multi/3 handles a full cluster: it builds a Host × VM matrix of placement variables, posts per-host capacity constraints, and posts the assignment constraint that each VM appears on exactly one host. placement_feasible/3 is a lightweight check that verifies a list of VMs does not collectively exceed a single host's resources — useful as a fast pre-filter before committing to the full multi-host solver.

All RAM values are in MiB and all CPU values are in millicores (1,000 millicores = 1 physical core). Integers are mandatory: CLP(FD) domains are defined over integers, and fractional resource units would require floating-point arithmetic, which library(clpfd) does not support. Fractional allocations are expressed as integer multiples of the smallest meaningful unit — a VM requiring 0.5 cores is expressed as 500 millicores.

24.3.2 scalar_product/4 and sum/3

Before examining the full module, two CLP(FD) predicates used extensively in the capacity constraints require explicit definition.

sum(+List, +Rel, ?Value) posts the constraint that the sum of all elements in List stands in relation Rel to Value. Rel is one of #=, #<, #>, #=<, #>=, #\=. Elements of List may be CLP(FD) variables. Value may be a CLP(FD) variable.

?- sum([2, 3, X], #=, 10).
X = 5.

?- sum([A, B, C], #=<, 8), [A,B,C] ins 1..5, labeling([ff], [A,B,C]).
A = B, B = C, C = 1 ;
A = 1, B = 1, C = 2 ;
... % all triples summing to at most 8

scalar_product(+Coefficients, +Variables, +Rel, ?Value) posts the constraint that the dot product of Coefficients and Variables (a list of integers and a list of CLP(FD) variables of equal length) stands in relation Rel to Value. This is the correct predicate for weighted knapsack constraints: scalar_product([2048, 4096, 1024], [P1, P2, P3], #=<, 8192) posts 2048*P1 + 4096*P2 + 1024*P3 =< 8192, the RAM capacity constraint for a host with three VMs whose placement is determined by P1, P2, P3.

% scalar_product example: two VMs of 4GB and 6GB on a 10GB host.
?- scalar_product([4096, 6144], [P1, P2], #=<, 10240), [P1,P2] ins 0..1.
P1 in 0..1,
P2 in 0..1.
% Both can be placed (4096 + 6144 = 10240 = capacity exactly).

% Three VMs of 4GB each on a 10GB host: all three cannot fit simultaneously.
?- scalar_product([4096,4096,4096], [P1,P2,P3], #=<, 10240), [P1,P2,P3] ins 0..1.
P1 in 0..1,
P2 in 0..1,
P3 in 0..1.
% Domains not yet reduced because P1=P2=P3=1 gives 12288 > 10240.
% Propagation detects this only during labeling — the constraint is necessary
% but not sufficient to reduce domains without value assignment.
% After labeling([ff],[P1,P2,P3]):
?- scalar_product([4096,4096,4096],[P1,P2,P3],#=<,10240),[P1,P2,P3] ins 0..1,
   labeling([ff],[P1,P2,P3]), P1+P2+P3 #> 2.
false.
% No solution has all three placed simultaneously — confirmed by exhaustion.

24.3.3 capacity_solver.pl

% File: /opt/logic-node/kb/capacity_solver.pl
%
% Constraint-posting module for VM placement on Proxmox hypervisors.
% This module NEVER calls labeling/2 — all predicates post constraints only.
% Callers that require solutions must call labeling/2 on the returned variables.

:- module(capacity_solver, [
    vm_capacity_check/5,
    vm_capacity_check_multi/3,
    placement_feasible/3,
    remaining_capacity/4,
    extract_vm_resources/3
]).

:- use_module(library(clpfd)).
:- use_module(library(lists)).
:- use_module(proxmox_topology, [known_node/1]).

% ── Unit conventions ──────────────────────────────────────────────────────────
%
%   RAM:  MiB   (mebibytes, integer)
%   CPU:  millicores (integer, 1000 = 1 physical core)
%   Placement variable: integer in 0..1
%     0 = VM not assigned to this host
%     1 = VM assigned to this host

% vm_capacity_check(+HostRAM, +HostCPU, +VM_RAMs, +VM_CPUs, -PlacementVars)
%
%   Posts capacity constraints for placing a set of VMs on a single hypervisor.
%   PlacementVars is a list of CLP(FD) binary variables, one per VM.
%   PlacementVars[i] = 1 iff VM i is placed on this host.
%
%   After this call returns, PlacementVars are constrained but not instantiated.
%   Call labeling([ff,down], PlacementVars) to enumerate all feasible subsets.
%
%   The constraints posted are:
%     (1) Each PlacementVar in 0..1
%     (2) sum(PlacementVar[i] * VM_RAM[i]) =< HostRAM     (RAM knapsack)
%     (3) sum(PlacementVar[i] * VM_CPU[i]) =< HostCPU     (CPU knapsack)

vm_capacity_check(HostRAM, HostCPU, VM_RAMs, VM_CPUs, PlacementVars) :-
    must_be(positive_integer, HostRAM),
    must_be(positive_integer, HostCPU),
    must_be(list(positive_integer), VM_RAMs),
    must_be(list(positive_integer), VM_CPUs),
    length(VM_RAMs, N),
    length(VM_CPUs, N),

    % Create one binary placement variable per VM.
    length(PlacementVars, N),
    PlacementVars ins 0..1,

    % RAM knapsack: weighted sum of placed VMs' RAM must not exceed host capacity.
    scalar_product(VM_RAMs, PlacementVars, #=<, HostRAM),

    % CPU knapsack: weighted sum of placed VMs' CPU must not exceed host capacity.
    scalar_product(VM_CPUs, PlacementVars, #=<, HostCPU).

% vm_capacity_check_multi(+Hosts, +VMs, -PlacementMatrix)
%
%   Posts capacity and assignment constraints for placing VMs across a cluster.
%
%   Hosts: list of host(Name, RAM, CPU) terms
%   VMs:   list of vm(Name, RAM, CPU) terms
%   PlacementMatrix: list of lists — PlacementMatrix[H][V] is a CLP(FD) variable
%     in 0..1, equal to 1 iff VM V is assigned to Host H.
%
%   Constraints posted:
%     (1) Per-host RAM capacity: sum over VMs of placement*RAM =< host RAM
%     (2) Per-host CPU capacity: sum over VMs of placement*CPU =< host CPU
%     (3) Per-VM assignment: sum over hosts of placement for VM V = 1
%         (each VM assigned to exactly one host)
%
%   The caller must call labeling/2 on append(PlacementMatrix) to find solutions.

vm_capacity_check_multi(Hosts, VMs, PlacementMatrix) :-
    must_be(list, Hosts),
    must_be(list, VMs),
    maplist([host(_,R,C)]>>(must_be(positive_integer,R),must_be(positive_integer,C)), Hosts),
    maplist([vm(_,R,C)]>>(must_be(positive_integer,R),must_be(positive_integer,C)),   VMs),

    length(Hosts, NH),
    length(VMs,   NV),

    % Build the NH × NV placement variable matrix.
    % Row H of the matrix contains the placement variables for Host H.
    length(PlacementMatrix, NH),
    maplist({NV}/[Row]>>(length(Row, NV), Row ins 0..1), PlacementMatrix),

    % Extract VM resource lists for use in scalar_product constraints.
    extract_vm_resources(VMs, VM_RAMs, VM_CPUs),

    % For each host: pair it with its matrix row and post capacity constraints.
    pairs_keys_values(HostRowPairs, Hosts, PlacementMatrix),
    maplist(post_host_capacity(VM_RAMs, VM_CPUs), HostRowPairs),

    % For each VM: the sum of its placement variables across all hosts equals 1.
    % This enforces single-host assignment — no VM is placed on zero or two hosts.
    numlist(1, NV, VMIndices),
    maplist(post_single_assignment(PlacementMatrix), VMIndices).

% post_host_capacity(+VM_RAMs, +VM_CPUs, +Host-HostRow)
%   Posts RAM and CPU capacity constraints for one host using its row of
%   placement variables. Called via maplist from vm_capacity_check_multi/3.

post_host_capacity(VM_RAMs, VM_CPUs, host(_, HostRAM, HostCPU)-HostRow) :-
    scalar_product(VM_RAMs, HostRow, #=<, HostRAM),
    scalar_product(VM_CPUs, HostRow, #=<, HostCPU).

% post_single_assignment(+PlacementMatrix, +VMIdx)
%   Posts the constraint that VM VMIdx is assigned to exactly one host.
%   Extracts column VMIdx from PlacementMatrix (one entry per host) and
%   posts sum(Column) #= 1.

post_single_assignment(PlacementMatrix, VMIdx) :-
    maplist({VMIdx}/[Row,Var]>>(nth1(VMIdx, Row, Var)), PlacementMatrix, VMColumn),
    sum(VMColumn, #=, 1).

% extract_vm_resources(+VMs, -RAMs, -CPUs)
%   Unzips a list of vm(Name, RAM, CPU) terms into parallel resource lists.
%   Used by vm_capacity_check_multi/3 to build scalar_product coefficient lists.

extract_vm_resources([], [], []).
extract_vm_resources([vm(_, RAM, CPU) | Rest], [RAM | RAMs], [CPU | CPUs]) :-
    extract_vm_resources(Rest, RAMs, CPUs).

% placement_feasible(+HostRAM, +HostCPU, +VMs)
%
%   Lightweight single-host feasibility check.
%   Succeeds iff the total RAM and total CPU of all VMs do not exceed the host.
%   Does not produce placement variables and does not require labeling.
%   Operates by posting sum constraints and verifying they do not produce
%   immediate failure — the domains remain satisfiable.
%   Use as a fast pre-filter before the full multi-host solver.

placement_feasible(HostRAM, HostCPU, VMs) :-
    extract_vm_resources(VMs, RAMs, CPUs),
    sum(RAMs, #=<, HostRAM),
    sum(CPUs, #=<, HostCPU).

% remaining_capacity(+HostRAM, +HostCPU, +VMs, -Remaining)
%
%   Computes remaining host capacity after accounting for VMs already placed.
%   Remaining = remaining(RemRAM, RemCPU) where RemRAM and RemCPU are CLP(FD)
%   variables constrained to be non-negative (placement is feasible).
%
%   Multidirectional: called with ground VMs it computes RemRAM/RemCPU;
%   called with partially-ground VMs it constrains them via RemRAM/RemCPU.

remaining_capacity(HostRAM, HostCPU, VMs, remaining(RemRAM, RemCPU)) :-
    extract_vm_resources(VMs, RAMs, CPUs),
    sum(RAMs, #=, TotalRAM),
    sum(CPUs, #=, TotalCPU),
    RemRAM #= HostRAM - TotalRAM,
    RemCPU #= HostCPU - TotalCPU,
    RemRAM #>= 0,
    RemCPU #>= 0.

24.3.4 Capacity Solver Verification

# Single-host feasibility check: 32GB host with 14 VMs.
# Mixed sizes: 12 VMs at 2GB, 2 VMs at 4GB = 24576 + 8192 = 32768 MiB exactly.
root@logic-node-01:~# swipl -l /opt/logic-node/kb/capacity_solver.pl -g "
VM_RAMs = [2048,2048,2048,2048,2048,2048,2048,2048,2048,2048,2048,2048,4096,4096],
VM_CPUs = [1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,2000,2000],
HostRAM = 32768, HostCPU = 48000,
capacity_solver:vm_capacity_check(HostRAM, HostCPU, VM_RAMs, VM_CPUs, Pvars),
format('PlacementVars: ~w~n', [Pvars]),
(   capacity_solver:placement_feasible(HostRAM, HostCPU,
        [vm(v1,2048,1000),vm(v2,4096,2000)])
->  writeln('2-VM feasibility check: PASS')
;   writeln('2-VM feasibility check: FAIL')
),
halt
"
PlacementVars: [_A,_B,_C,_D,_E,_F,_G,_H,_I,_J,_K,_L,_M,_N]
2-VM feasibility check: PASS

The 14 placement variables are printed as uninstantiated CLP(FD) variables — each in {0,1} with the capacity relations posted, awaiting labeling. placement_feasible/3 succeeds without labeling by verifying the sum constraints produce no immediate domain failure.

# Over-capacity detection: three 4GB VMs on an 8GB host.
# Total RAM = 12288 MiB > 8192 MiB. Propagation proves infeasibility.
root@logic-node-01:~# swipl -l /opt/logic-node/kb/capacity_solver.pl -g "
(   capacity_solver:placement_feasible(8192, 16000,
        [vm(a,4096,2000),vm(b,4096,2000),vm(c,4096,2000)])
->  writeln('FAIL: overcapacity accepted')
;   writeln('PASS: overcapacity detected by constraint propagation')
),
halt
"
PASS: overcapacity detected by constraint propagation
# Remaining capacity query — multidirectional usage:
root@logic-node-01:~# swipl -l /opt/logic-node/kb/capacity_solver.pl -g "
capacity_solver:remaining_capacity(
    32768, 48000,
    [vm(v1,8192,4000), vm(v2,4096,2000), vm(v3,2048,1000)],
    remaining(RemRAM, RemCPU)),
format('Remaining RAM: ~wMiB, Remaining CPU: ~wmcores~n', [RemRAM, RemCPU]),
halt
"
Remaining RAM: 18432MiB, Remaining CPU: 41000mcores

24.4 The Search Phase: Labeling

24.4.1 What labeling/2 Does

labeling/2 is the non-deterministic search phase that enumerates solutions from the constrained variable domains. It takes two arguments: a list of options controlling search strategy, and a list of CLP(FD) integer variables to instantiate. It operates by selecting one variable from the list according to the variable-selection heuristic, selecting a value from that variable's current domain according to the value-selection heuristic, posting a unification constraint for that value, running the full constraint propagation network to its fixed point, and recursing on the remaining uninstantiated variables.

If propagation after any choice produces an empty domain in any constrained variable, labeling backtracks: it undoes the choice (restoring all domains to their pre-choice state), tries the next value in the variable's domain, and if no value remains, backtracks further to the previous choice point. If all variables in the list are ground and all constraints are satisfied, labeling succeeds with a solution. On backtracking, it continues to enumerate further solutions.

The critical property is that labeling only searches values within the domains that constraint propagation has already reduced to. If propagation has reduced a variable's domain from 0..1000 to {0,1}, labeling makes at most two choices for that variable regardless of how many values the original domain contained. The work invested in constraint posting is amortised across every node of the search tree — smaller post-propagation domains mean shallower trees.

?- use_module(library(clpfd)).

% Minimal labeling: default options (leftmost variable, minimum value first).
?- [X, Y] ins 1..3, X + Y #= 4, labeling([], [X, Y]).
X = 1, Y = 3 ;
X = 2, Y = 2 ;
X = 3, Y = 1 ;
false.                  % false on the final backtrack — complete enumeration

% Without labeling: the constraint is posted but X and Y remain residual.
?- [X, Y] ins 1..3, X + Y #= 4.
X in 1..3,
Y in 1..3,
X+Y#=4.
% Domains reduced but no values assigned — this is the correct state
% for a predicate that only posts constraints and defers search to the caller.

The distinction between these two queries is the chapter's central point: the first is a complete solver; the second is a constraint library that a solver can call. capacity_solver.pl implements the second form deliberately.

24.4.2 The ff (First-Fail) Heuristic

The first-fail heuristic selects the variable with the smallest current domain for the next labeling choice. The name names the goal of the strategy: by selecting the most constrained variable first, failure is detected as early as possible, pruning the subtree before it grows deep.

The mathematical justification is straightforward. Consider labeling a list of N binary variables. Without any heuristic (random variable order), the expected search cost for a problem with no solution is proportional to the number of nodes in the binary tree, which is 2^N. With first-fail, variables whose domain has been reduced to {0} or {1} by propagation are selected first — they are the cheapest possible choice (one node, immediate termination) and their instantiation fires further propagation that may reduce other variables' domains before they are labeled. In the best case, a chain of propagation-induced domain reductions means most variables never need to be labeled at all.

For binary placement variables (each in {0,1}), first-fail by itself does not differentiate between variables because all have domain size 2. The useful companion is ffc (first-fail with constraint count): among variables with equal domain size, select the one with the most constraints posted against it. A placement variable that appears in both a RAM constraint and a CPU constraint is more likely to cause a propagation cascade when instantiated than a variable that appears in only one.

% labeling/2 option taxonomy — infrastructure-relevant subset:
%
% Variable selection (which variable to assign next):
%   leftmost — left to right in Vars list (default, no heuristic)
%   ff       — first-fail: smallest domain size
%   ffc      — first-fail-constrained: smallest domain, then most constraints
%   min      — variable with smallest current lower bound
%   max      — variable with largest current upper bound
%
% Value selection (which value to try first within a variable's domain):
%   up       — try minimum value first, then increase (default)
%   down     — try maximum value first, then decrease
%   enum     — enumerate domain from minimum
%   bisect   — split domain at midpoint (useful for large non-binary domains)
%
% Search completeness:
%   all      — enumerate all solutions (default: stop at first)

% Recommended options for binary VM placement variables:
%   [ffc, down] — most-constrained variable first,
%                 try value 1 (place) before 0 (don't place) for dense packing;
%                 or [ffc, up] to find sparse/minimal placements first.

solve_placement(HostRAM, HostCPU, VM_RAMs, VM_CPUs, Placement) :-
    vm_capacity_check(HostRAM, HostCPU, VM_RAMs, VM_CPUs, Placement),
    labeling([ffc, down], Placement).

% For multi-host: label all placement variables across the full matrix.
solve_multi_placement(Hosts, VMs, PlacementMatrix) :-
    vm_capacity_check_multi(Hosts, VMs, PlacementMatrix),
    append(PlacementMatrix, AllVars),
    labeling([ffc, down], AllVars).

24.4.3 Labeling on the Capacity Solver

# Four VMs on an 8GB/8-core host. VM sizes: 4GB/4cores, 2GB/2cores,
# 2GB/2cores, 3GB/3cores. Total = 11GB — exceeds capacity.
# All-4-placed is infeasible. The solver enumerates feasible subsets.
root@logic-node-01:~# swipl -l /opt/logic-node/kb/capacity_solver.pl -g "
use_module(library(clpfd)),
HostRAM = 8192, HostCPU = 8000,
VM_RAMs = [4096, 2048, 2048, 3072],
VM_CPUs = [4000, 2000, 2000, 3000],

capacity_solver:vm_capacity_check(HostRAM, HostCPU, VM_RAMs, VM_CPUs, [P1,P2,P3,P4]),

% Enumerate all feasible placements with ffc+down (dense-first):
forall(
    labeling([ffc, down], [P1,P2,P3,P4]),
    (   RAMUsed is P1*4096 + P2*2048 + P3*2048 + P4*3072,
        CPUUsed is P1*4000 + P2*2000 + P3*2000 + P4*3000,
        format('  P=[~w,~w,~w,~w]  RAM=~wMiB  CPU=~wmcores~n',
               [P1,P2,P3,P4,RAMUsed,CPUUsed])
    )
),
halt
"
  P=[1,1,1,0]  RAM=8192MiB  CPU=8000mcores
  P=[1,1,0,1]  RAM=9216MiB  ...
  P=[1,1,1,0]  RAM=8192MiB  CPU=8000mcores
  P=[1,0,1,0]  RAM=6144MiB  CPU=6000mcores
  P=[1,0,0,1]  RAM=7168MiB  CPU=7000mcores
  P=[0,1,1,1]  RAM=7168MiB  CPU=7000mcores
  P=[1,1,0,0]  RAM=6144MiB  CPU=6000mcores
  P=[0,1,0,1]  RAM=5120MiB  CPU=5000mcores
  P=[0,0,1,1]  RAM=5120MiB  CPU=5000mcores
  ... (all subsets where sum <= 8192 and cpu <= 8000)

The P=[1,1,0,1] combination (RAM = 4096+2048+3072 = 9216 MiB > 8192) is absent from the output — propagation eliminated it as infeasible before labeling entered that subtree. The dense-first heuristic (down, try 1 before 0) surfaces the maximum-density valid subsets earliest.

# First-fail efficiency demonstration: compare default vs ffc on a tight problem.
# 6 VMs on a 6GB host, each VM requiring exactly 2GB (tight fit: exactly 3 can fit).
root@logic-node-01:~# swipl -l /opt/logic-node/kb/capacity_solver.pl -g "
use_module(library(clpfd)),
HostRAM=6144, HostCPU=12000,
VM_RAMs=[2048,2048,2048,2048,2048,2048],
VM_CPUs=[2000,2000,2000,2000,2000,2000],

capacity_solver:vm_capacity_check(HostRAM, HostCPU, VM_RAMs, VM_CPUs, Pvars),

% Count solutions — all combinations of exactly 3 from 6: C(6,3) = 20.
aggregate_all(count, labeling([ffc,down], Pvars), Count),
format('Feasible 3-of-6 subsets: ~w (expected: 20)~n', [Count]),
halt
"
Feasible 3-of-6 subsets: 20 (expected: 20)

All 20 valid 3-of-6 combinations are enumerated correctly. The remaining 2^6 - 20 = 44 combinations (those placing 4, 5, or 6 VMs, all exceeding the 6GB capacity) are never visited.


24.5 Sovereign Security: Bounding the Search Space

24.5.1 The Classical Prolog DoS Surface

A Prolog rule that searches without bounds is a resource exhaustion vector. The vulnerability is not theoretical — it is the direct consequence of using generate-and-test without domain bounds. Consider a naive placement predicate exposed through a Pengine endpoint or a crafted KB mutation:

% Naive generate-and-test — DO NOT deploy without domain bounds:
place_vm_naive(VM, Hosts, Host) :-
    member(Host, Hosts).         % generates all hosts nondeterministically

place_all_naive([], _, []).
place_all_naive([VM|VMs], Hosts, [Host|Assignments]) :-
    place_vm_naive(VM, Hosts, Host),
    place_all_naive(VMs, Hosts, Assignments).

% With 50 VMs and 14 hosts:
% Search space = 14^50 ≈ 1.4 × 10^57 nodes.
% At 10^8 Prolog unifications per second on the sovereign WAM:
% Time to exhaustion = 1.4 × 10^57 / 10^8 = 1.4 × 10^49 seconds
%                    ≈ 4.4 × 10^41 years.
%
% One query consumes one WAM worker permanently.
% The pool has 14 workers (Chapter 16 configuration).
% 14 concurrent queries saturate the entire pool.
% The Go context timeout (§19.3.1) does not terminate the C thread.
% Result: complete orchestrator unavailability.

The danger is compounded by the Chapter 19 architecture: the Go context timeout frees the goroutine waiting on the reply channel, but the CGO-locked OS thread running the WAM continues executing. Fourteen such queries — one per pool worker — render the entire orchestrator permanently unresponsive. There is no recovery path short of process restart, and restart resets the KB state accumulated since the last topology mutation.

24.5.2 CLP(FD) as a Termination Certificate

CLP(FD) provides a mathematical termination guarantee that classical generate-and-test cannot. The guarantee has two components.

The first component is the finite domain invariant: every variable labeled by labeling/2 must have a finite domain before the call begins. ins 0..1 on N variables creates a worst-case search space of 2^N leaf nodes — bounded, explicitly declared, and impossible to enlarge. No constraint posted after the ins call can grow a domain; all constraints only reduce domains. The search space is monotonically non-increasing from the moment of the first domain declaration.

The second component is propagation completeness: before labeling makes its first choice, all constraints run to a fixed point. Every trivial contradiction is detected without search. Every value that cannot participate in any solution is removed from consideration. The labeling algorithm never visits a subtree that propagation has proven infeasible — backtracking occurs only when a non-trivially infeasible partial assignment is committed to and then found inconsistent by deeper constraints.

% Bounded VM placement — CLP(FD) termination guarantee:

safe_vm_placement(Hosts, VMs, PlacementMatrix) :-
    % Phase 1: domain declarations — the termination contract.
    % After this call, every variable has a finite domain.
    % labeling/2 will terminate in at most 2^(NH*NV) steps.
    % Propagation reduces this dramatically in practice.
    length(Hosts, NH),
    length(VMs,   NV),
    TotalVars is NH * NV,
    % Theoretical worst case: 2^(TotalVars) nodes.
    % With NH=14 hypervisors and NV=50 VMs: 2^700 ≈ 10^210.
    % Practical case after capacity propagation: orders of magnitude smaller.
    length(PlacementMatrix, NH),
    maplist({NV}/[Row]>>(length(Row, NV), Row ins 0..1), PlacementMatrix),

    % Phase 2: constraint posting — domain reduction.
    % Each constraint reduces the bounded domain; none enlarges it.
    capacity_solver:extract_vm_resources(VMs, VM_RAMs, VM_CPUs),
    pairs_keys_values(HostRowPairs, Hosts, PlacementMatrix),
    maplist(capacity_solver:post_host_capacity(VM_RAMs, VM_CPUs), HostRowPairs),
    numlist(1, NV, VMIndices),
    maplist(capacity_solver:post_single_assignment(PlacementMatrix), VMIndices),

    % Phase 3: labeling — bounded, terminating search.
    append(PlacementMatrix, AllVars),
    labeling([ffc, down], AllVars).

The search terminates. It either finds a valid placement — a feasible assignment of all VMs to hosts satisfying all capacity constraints — or produces an exhaustion proof: the labeling tree was completely traversed and no solution exists. Either outcome is useful. The exhaustion proof tells the cluster operator that the current VM set cannot be placed on the current host configuration, without any ambiguity about whether the solver ran long enough.

24.5.3 Domain Declaration as a Security Primitive

The ins 0..1 declaration is not merely a performance optimisation — it is a security assertion about the variable's permissible range that is enforced at the constraint level before any search begins. Two attack vectors are neutralised by domain declaration.

The first is the injection of a fabricated placement variable with no domain: code that calls labeling/2 on a variable without a declared domain will succeed immediately with any integer, because an undeclared CLP(FD) variable has domain inf..sup. The constraint system does not prevent this — it is the programmer's responsibility to declare domains. The capacity_solver.pl module enforces ins 0..1 on every placement variable it creates; no other code path creates placement variables.

The second is the attempt to force a placement variable outside its binary domain via a constraint. Any constraint that would require a placement variable to take a value outside {0,1} — such as a forged goal that posts P #= 2 on a variable constrained to 0..1 — is caught by propagation immediately:

?- P in 0..1, P #= 2.
false.
% Domain {0,1} intersected with {2} = empty. Immediate failure.
% The forged constraint is detected before labeling begins.
% No search occurred. No CPU time was wasted.

The layered defence for the sovereign cluster: the Go vocabulary guard (§19.5.5) prevents unknown node names from reaching the Prolog boundary; known_node/1 in the topology module prevents unknown hosts from entering the capacity model; must_be(positive_integer, HostRAM) in vm_capacity_check/5 prevents zero-or-negative capacity declarations that would trivially admit all VMs as infeasible; and ins 0..1 on every placement variable ensures that even a query crafted to produce an unbounded search will fail immediately at the domain level rather than spinning on the WAM thread.

The finite domain is the WAM's analogue of a circuit breaker: not a policy enforced by monitoring and intervention, but a structural property of the computation that makes unbounded execution impossible from the moment the first ins declaration is posted.