Skip to main content

Chapter 17: Tabling (SLG Resolution)

A Proxmox cluster with redundant network bridges is not a tree. It is a graph — specifically, a graph that contains cycles by design. SDN VLAN bridges redundantly connect hypervisors. Bonded uplinks form rings. Spine-leaf topologies are deliberately cyclic. Every infrastructure network worth operating has cycles, because a network without cycles has no redundancy.

Classical SLD resolution — the execution model used by every Prolog predicate that does not carry a :- table directive — explores the proof tree depth-first, left-to-right. Given link(a, b) and link(b, a), the query path(a, X) unfolds to path(a, b) (direct), then to path(a, a) via b (one hop back), then to path(a, b) again (two hops), then to path(a, a) again, indefinitely. The SLD tree is infinite. The WAM local stack fills. The WAM throws ERROR: Stack limit exceeded. Chapter 16's panic recovery catches it and destroys the worker's engine. The pool manager respawns a replacement. The same query arrives again. The same stack overflow occurs. The respawn loop from Section 16.5.3 fires. The backoff accumulates. The pool degrades.

Tabling — SLG (Semantic Lattice Grounding) resolution — eliminates this class of failure at the resolution engine level, without any change to the calling Go code, without any change to the fact database, and without any procedural cycle-detection logic in the Prolog rules. The :- table path/3 directive instructs the WAM to maintain two data structures per tabled predicate: a Call Trie and an Answer Trie. When a call path(a, X) is first encountered, it is recorded in the Call Trie. When the same call pattern is encountered again during recursive evaluation — the cycle — the engine detects the existing entry, suspends the recursive branch, and waits for answers to be propagated from completed subgoals. No infinite descent. The SLD tree is replaced by a finite SLG forest.

Five properties define tabling as the correct resolution strategy for infrastructure graph queries.

1. SLG resolution terminates on any cyclic Datalog program. A Prolog program is in the Datalog fragment if all predicates have a finite Herbrand base — all ground atoms that could be derived are finite in number. Network topology predicates satisfy this condition: the number of hypervisors, switches, and VLANs in a Proxmox cluster is finite. For any tabled Datalog program, SLG resolution terminates in time proportional to the size of the Herbrand base — bounded by the number of distinct ground answer tuples, not by the depth of the proof tree. Cycles in the graph do not create cycles in the SLG evaluation — they create shared subgoals that are resolved once and reused.

2. The Call Trie detects cycles by recognising repeated subgoal patterns. When the WAM evaluates a tabled call G, it first hashes G against the Call Trie. If G is not present, it adds G to the Call Trie and proceeds with normal SLD evaluation. If G is already present (the cycle case), the current continuation is suspended and registered as a consumer of G's answers. When new answers for G are derived, all consumers are resumed. This suspension-and-propagation mechanism is the core of SLG — it converts the infinite SLD tree into a finite fixed-point computation.

3. Answer Subsumption replaces answer accumulation with lattice operations. Without Answer Subsumption, tabling records every distinct answer to a subgoal. For path/3 over a 100-node graph, this could be O(n²) distinct paths, each with a different cost. For routing decisions, all paths except the minimum-cost one are useless. lattice(min/3) instructs the Answer Trie to apply a meet operation when a new answer is derived: if a new answer path(a, b, Cost_new) arrives and the trie already contains path(a, b, Cost_old), the trie keeps only min(Cost_old, Cost_new). The Answer Trie never grows beyond O(n²) entries and contains only Pareto-optimal answers with respect to the cost lattice. The minimum-cost path is computed incrementally during tabling, not as a post-processing step over a list of all paths.

4. Table space is bounded heap, not stack — but it is not free. SLD resolution uses WAM local stack for recursion depth. Tabling trades stack depth for heap space: the Call Trie and Answer Trie are allocated on the WAM table space (a dedicated heap region, controlled by set_prolog_flag(table_space, N)). For a 100-node topology with edges labelled by VLAN cost, the Answer Trie for shortest_path/3 with lattice(min/3) stores at most 100 × 100 = 10,000 entries at approximately 200 bytes each — 2MB total. A complete traversal of all pairs costs 2MB of table space permanently allocated until the tables are abolished. For a 10,000-node topology, the cost scales to 200MB. The table space limit is the security boundary that prevents an ungrounded query from computing and caching the entire Cartesian product.

5. Distributed cache invalidation requires coordinated abolish_all_tables/0 across all WAM engines in the worker pool. Each Chapter 16 worker goroutine owns a private WAM engine with independent table space. When a network link changes — a hypervisor goes offline, a VLAN bridge fails — every engine's shortest_path/3 table contains stale answers. A ControlFlushTables broadcast via the pool's ctrlCh reaches each worker between query iterations. Each worker calls abolish_all_tables/0 on its own engine, releasing the stale table space and forcing recomputation on the next query. The semantics are eventually consistent: during the broadcast propagation window (at most pool_size × avg_query_time, approximately 320μs at 16 workers), some engines serve stale paths. This is operationally identical to the KB reload consistency window from Chapter 16, Section 16.4.2 — and operationally acceptable for routing decisions that change on the timescale of seconds, not microseconds.


Chapter Roadmap

Section Title Focus
17.1 SLD vs. SLG Resolution Infinite loop anatomy, Call Trie, Answer Trie, SLG fixed-point
17.2 Answer Subsumption lattice(min/3), meet operation, Pareto-optimal answer sets
17.3 The Build: proxmox_topology.pl Cyclic facts, tabled rules, grounding guards, REPL demonstration
17.4 Distributed Cache Invalidation ControlFlushTables, abolish_all_tables/0, Go pool extension
17.5 Security: Table Space Exhaustion Ungrounded query DoS, table_space limit, must_be/2 guards
Outcome Declarative Algorithmic Optimisation Verification checklist, complexity comparison

17.1 SLD vs. SLG Resolution: The Physics of Infinite Loops

17.1.1 Why SLD Dies on link(a, b). link(b, a).

% A minimal cyclic topology — two hypervisors connected by a redundant bridge.
link(pve1, pve2, 1).
link(pve2, pve1, 1).   % ← the return edge that kills SLD

% A naive reachability rule:
path_sld(X, Y) :- link(X, Y, _).
path_sld(X, Y) :- link(X, Z, _), path_sld(Z, Y).
?- path_sld(pve1, pve2).

The SLD execution tree:

path_sld(pve1, pve2)
├── link(pve1, pve2, _)           → true  [first clause, succeeds]
└── link(pve1, Z, _), path_sld(Z, pve2)
    └── Z = pve2
        └── path_sld(pve2, pve2)
            ├── link(pve2, pve2, _) → false
            └── link(pve2, Z2, _), path_sld(Z2, pve2)
                └── Z2 = pve1
                    └── path_sld(pve1, pve2)   ← IDENTICAL TO ROOT CALL
                        └── ... ∞ recursion

The first solution is found immediately — link(pve1, pve2, _) succeeds at depth 1. The problem is backtracking: when the caller asks for the next solution (or when the rule is embedded in a findall/3 that exhausts all solutions), the engine explores the second clause, recurses through the cycle, and never terminates.

The stack growth is linear in the number of recursive calls. Default local stack: 128MB. Stack frame per call: approximately 200 bytes. Stack exhaustion at depth: 128MB / 200 bytes = 655,360 recursive calls. At 10 recursive calls per millisecond: exhaustion in 65 seconds. For an HTTP handler waiting on a response, this manifests as a request that hangs for 65 seconds before the worker panics — not a clean timeout, not a useful error, just a locked worker that the pool manager eventually replaces.

The pathological case for infrastructure networks is not the two-node example — it is a 50-node Proxmox cluster with 120 VLAN bridges, 8-way bonded uplinks, and three spine switches. In that topology, SLD's exploration of all path combinations before exhaustion takes longer than any reasonable HTTP timeout, and the memory pressure from the accumulating stack frames degrades the WAM engines of other workers sharing the same NUMA node.

17.1.2 How SLG Terminates: The Call Trie and Answer Propagation

% The same topology with tabling:
:- table path_slg/2.

path_slg(X, Y) :- link(X, Y, _).
path_slg(X, Y) :- link(X, Z, _), path_slg(Z, Y).

The SLG evaluation of path_slg(pve1, pve2):

Step 1: Call path_slg(pve1, pve2).
        Not in Call Trie → add to Call Trie. Evaluate.

Step 2: Clause 1: link(pve1, pve2, _) → Answer: path_slg(pve1, pve2).
        Add to Answer Trie. Propagate to consumers (none yet).

Step 3: Clause 2: link(pve1, Z, _) → Z = pve2.
        Call path_slg(pve2, pve2).
        Not in Call Trie → add to Call Trie. Evaluate.

Step 4: Clause 1: link(pve2, pve2, _) → false.
Step 5: Clause 2: link(pve2, Z2, _) → Z2 = pve1.
        Call path_slg(pve1, pve2).
        *** FOUND IN CALL TRIE *** → SUSPEND. Register as consumer.
        No new derivation. Continue.

Step 6: path_slg(pve2, pve2) has no new answers. Mark COMPLETE.

Step 7: path_slg(pve1, pve2) completion: all clauses exhausted.
        Answer Trie for path_slg(pve1, pve2) = {path_slg(pve1, pve2)}.
        Mark COMPLETE. Propagate answers to consumers.
        (Suspended call at Step 5 receives answer: path_slg(pve1, pve2) — already known.)

Result: path_slg(pve1, pve2) = true.   Derived in 7 steps. No stack overflow.

The suspension in Step 5 is the cycle-breaking mechanism. The engine does not recurse into path_slg(pve1, pve2) a second time — it notes that this call is already being evaluated and waits for its answers to be propagated through the completion mechanism. The SLD tree, which was infinite, becomes an SLG forest of three nodes: {path_slg(pve1, pve2), path_slg(pve2, pve2), path_slg(pve1, pve2)} with a back-edge from the third to the first. The back-edge is never followed — it is converted into a dependency that is resolved by answer propagation.

17.1.3 Diagram: SLG Cycle Detection and Answer Propagation

%%{init: {"themeVariables": {"fontSize": "14px"}}}%%
flowchart TD
    Q["Query: path_slg(pve1, pve2)\nNew call — not in Call Trie\nAdd to Call Trie\nBegin SLD evaluation"]

    C1["Clause 1: link(pve1, pve2, _)\nSucceeds immediately\nAnswer: path_slg(pve1, pve2)\nAdd to Answer Trie"]

    C2["Clause 2: link(pve1, Z, _)\nZ = pve2\nSubgoal: path_slg(pve2, pve2)\nNew call — add to Call Trie"]

    C2A["path_slg(pve2, pve2)\nClause 1: link(pve2, pve2) → false\nClause 2: link(pve2, Z2) → Z2 = pve1\nSubgoal: path_slg(pve1, pve2)"]

    CYCLE["CYCLE DETECTED\npath_slg(pve1, pve2)\nalready in Call Trie\nSUSPEND — register as consumer\nDo NOT recurse"]

    COMPLETE1["path_slg(pve2, pve2)\nNo answers found\nMark COMPLETE\nReturn to parent"]

    COMPLETE2["path_slg(pve1, pve2)\nAll clauses exhausted\nAnswer Trie: {path_slg(pve1, pve2)}\nMark COMPLETE\nPropagate to suspended consumers"]

    RESULT["Result: true\nStack depth: O(1) — no recursion\nTable space used: 2 entries\nAnswer retrievable on next query\nfrom Answer Trie — O(1) lookup"]

    Q --->|"Clause 1 succeeds"| C1
    Q --->|"Clause 2 fires"| C2
    C1 --->|"Answer stored"| COMPLETE2
    C2 --->|"Recursive call"| C2A
    C2A --->|"Recursive call"| CYCLE
    CYCLE --->|"Suspension — no recursion"| COMPLETE1
    COMPLETE1 --->|"Completion propagates up"| COMPLETE2
    COMPLETE2 --->|"Answer propagated to consumers"| RESULT

    style Q fill:#1A2B4A,color:#FFFFFF
    style C1 fill:#1A4070,color:#FFFFFF
    style C2 fill:#1A4070,color:#FFFFFF
    style C2A fill:#1A4070,color:#FFFFFF
    style CYCLE fill:#7A1A1A,color:#FFFFFF
    style COMPLETE1 fill:#8B6914,color:#FFFFFF
    style COMPLETE2 fill:#8B6914,color:#FFFFFF
    style RESULT fill:#1A6B3A,color:#FFFFFF

Reading the diagram: Dark blue nodes are normal SLD evaluation steps. The red node is the cycle detection point — the only place where the normal recursion is intercepted. Amber nodes are the completion phase — answers propagate upward through the dependency graph. Green is the result: a finite, correct answer retrieved from the Answer Trie on any subsequent identical call in O(1) time.


17.2 Answer Subsumption: The Shortest Path

17.2.1 Why All-Paths Tabling Is Wrong for Routing

Without Answer Subsumption, tabling accumulates every distinct answer:

:- table path/3.   % WITHOUT answer subsumption

path(X, Y, Cost) :- link(X, Y, Cost).
path(X, Y, Cost) :- link(X, Z, C1), path(Z, Y, C2), Cost is C1 + C2.

For a 10-node topology with average degree 3, the number of distinct acyclic paths between any two nodes is O(n!) in the worst case. In practice, for a Proxmox SDN with 50 hypervisors and 3 redundant uplinks each: hundreds of distinct paths per source-destination pair, each stored in the Answer Trie. The trie grows proportionally to the number of distinct paths. Most of these paths are dominated — longer, higher latency, never selected by any routing decision. They consume table space, slow down answer propagation, and are retrieved by no downstream consumer. They are waste.

The findall(Cost, path(pve1, pve50, Cost), Costs), min_list(Costs, Min) pattern retrieves all stored paths and post-processes them. This is procedural graph algorithm reimplemented on top of a declarative engine.

17.2.2 lattice(min/3) — The Meet Operation

lattice(+MeetPred) is Answer Subsumption mode for a tabled predicate. The meet predicate has the signature meet(+OldAnswer, +NewAnswer, -Result): given an existing answer in the trie and a newly derived answer, it returns the result to retain. For shortest-path routing, the meet is min/3: min(Old, New, Result) :- Result is min(Old, New).

%% Directive syntax:
:- table shortest_path(+, +, lattice(min/3)).
%%                     ^  ^  ^^^^^^^^^^^^^^^^
%%                     |  |  Answer Subsumption: apply min/3 to the Cost argument
%%                     |  Dst: mode (+) — must be ground at call time
%%                     Src: mode (+) — must be ground at call time

The (+) mode annotations are enforced by the tabling engine: if Src or Dst is unbound when shortest_path/3 is called, SWI-Prolog raises an instantiation error before any table lookup. This mode enforcement is a first line of defence against the ungrounded query DoS described in Section 17.5 — but it is not the only defence, because mode errors are catchable and the enforcement is not guaranteed across all SWI-Prolog versions before 9.0.

When a new Cost_new arrives for a pair (Src, Dst) that already has Cost_old in the Answer Trie, the engine calls min(Cost_old, Cost_new, Result) and stores only Result. If Cost_new >= Cost_old, the new answer is discarded immediately — it never enters the trie, never consumes additional table space, never propagates to consumers. The trie entry count is bounded by the number of distinct (Src, Dst) pairs: O(n²) entries, each storing a single integer cost. For 50 nodes: 2,500 entries × ~64 bytes = 160KB total table space for the complete shortest-path matrix.

17.2.3 The Lattice Contract

For lattice(min/3) to be semantically correct, min/3 must be a meet operation on a bounded semilattice:

  1. Commutativity: min(X, Y, R) and min(Y, X, R) produce the same R. ✓
  2. Associativity: min(min(X, Y), Z) = min(X, min(Y, Z)). ✓
  3. Idempotency: min(X, X, X). ✓ (if X arrives again, result is X — no change)
  4. Bounded below: there exists a minimum element (0 for non-negative costs). ✓

Any routing cost metric satisfying these four properties is a valid lattice for Answer Subsumption. Latency in microseconds: ✓. Hop count: ✓. Bandwidth (maximise, so use max/3): ✓. Composite metrics that mix additive costs with multiplicative reliability: requires custom meet/3 that operates on compound terms — valid but requires careful implementation.

The one invalid case is a cost function that is not monotone: if adding a hop can decrease the total cost (negative edge weights), the min/3 meet may discard a path that later becomes cheaper through additional hops. Dijkstra's algorithm requires non-negative edge weights for the same reason. For infrastructure latency, hop count, and bandwidth, all weights are non-negative.


17.3 The Build: proxmox_topology.pl

17.3.1 Network Topology Facts

logicadmin@logic-node-01:~$ nano /opt/logic-node/kb/topology/proxmox_topology.pl
%% =============================================================================
%% FILE:    /opt/logic-node/kb/topology/proxmox_topology.pl
%% PURPOSE: Proxmox SDN topology — hypervisors, VLAN bridges, inter-node links.
%%          Tabled shortest_path/3 for lowest-latency route computation.
%%
%% TOPOLOGY:
%%   Three-rack Proxmox cluster, dual spine, redundant leaf uplinks.
%%   All links are bidirectional. Costs are in microseconds (L2 latency).
%%   Cycles are intentional: redundant uplinks create rings at every layer.
%%
%% TABLING STRATEGY:
%%   shortest_path/3 uses lattice(min/3) Answer Subsumption.
%%   Answer Trie stores ONE entry per (Src, Dst) pair — the minimum-cost path cost.
%%   Table space: O(N²) entries, N = number of nodes in the topology.
%%   For this 14-node topology: 196 entries maximum ≈ 12KB.
%%
%% GROUNDING CONTRACT:
%%   shortest_path/3 and reachable/2 MUST NOT be called with unbound arguments.
%%   The public API is query_path/4 and query_reachable/3, which enforce
%%   ground/1 on Src and Dst before delegating to the tabled predicates.
%%   See Section 17.5 for the full security analysis.
%% =============================================================================

:- module(proxmox_topology, [
    query_path/4,          % Public API: query_path(+Src, +Dst, -Cost, -Hops)
    query_reachable/3,     % Public API: query_reachable(+Src, +Dst, -Bool)
    assert_link/3,         % Dynamic: add a new link at runtime
    retract_link/3,        % Dynamic: remove a failed link
    flush_topology_tables/0 % Invalidate all tabled answers
]).

:- use_module(library(tabling)).
:- use_module(library(error), [must_be/2]).
:- use_module(library(aggregate), [aggregate_all/3]).

%% ---------------------------------------------------------------------------
%% TOPOLOGY FACTS
%% link(+Node1, +Node2, +CostMicroseconds)
%% Bidirectional — both directions are asserted.
%% Dynamic so that live_state.pl can retract failed links.
%% ---------------------------------------------------------------------------

:- dynamic link/3.

%% Rack A hypervisors → Rack A leaf switch
link(pve1,  leaf_a, 10).  link(leaf_a, pve1,  10).
link(pve2,  leaf_a, 10).  link(leaf_a, pve2,  10).
link(pve3,  leaf_a, 12).  link(leaf_a, pve3,  12).

%% Rack B hypervisors → Rack B leaf switch
link(pve4,  leaf_b, 10).  link(leaf_b, pve4,  10).
link(pve5,  leaf_b, 11).  link(leaf_b, pve5,  11).
link(pve6,  leaf_b, 10).  link(leaf_b, pve6,  10).

%% Rack C hypervisors → Rack C leaf switch
link(pve7,  leaf_c, 14).  link(leaf_c, pve7,  14).
link(pve8,  leaf_c, 10).  link(leaf_c, pve8,  10).

%% Leaf switches → Spine switches (dual uplinks — redundant ring)
link(leaf_a, spine1, 5).  link(spine1, leaf_a, 5).
link(leaf_a, spine2, 5).  link(spine2, leaf_a, 5).
link(leaf_b, spine1, 5).  link(spine1, leaf_b, 5).
link(leaf_b, spine2, 6).  link(spine2, leaf_b, 6).
link(leaf_c, spine1, 7).  link(spine1, leaf_c, 7).
link(leaf_c, spine2, 5).  link(spine2, leaf_c, 5).

%% Spine interconnect (east-west traffic between spines)
link(spine1, spine2, 2).  link(spine2, spine1, 2).

%% Storage network: dedicated NVMe-oF fabric (lower latency for storage traffic)
link(pve1, storage1, 3).   link(storage1, pve1, 3).
link(pve2, storage1, 3).   link(storage1, pve2, 3).
link(pve4, storage1, 8).   link(storage1, pve4, 8).
link(pve7, storage1, 12).  link(storage1, pve7, 12).

%% ---------------------------------------------------------------------------
%% TABLED PREDICATES
%% ---------------------------------------------------------------------------

%% shortest_path(+Src, +Dst, -Cost)
%% Finds the minimum-cost path from Src to Dst using Answer Subsumption.
%%
%% lattice(min/3): when two answers (Src, Dst, Cost1) and (Src, Dst, Cost2)
%% are derived, the engine calls min(Cost1, Cost2, Result) and retains only
%% Result in the Answer Trie. The trie never grows beyond one entry per
%% (Src, Dst) pair — regardless of how many distinct paths exist.
%%
%% Mode (+, +, lattice(min/3)) enforced by the tabling engine:
%%   Src and Dst must be ground at call time.
%%   Unground calls raise instantiation_error before table lookup.
:- table shortest_path(+, +, lattice(min/3)).

%% Base case: direct link.
shortest_path(Src, Dst, Cost) :-
    link(Src, Dst, Cost).

%% Recursive case: path via intermediate node.
%% The tabling engine handles cycles — no visited list required.
%% SLG suspension prevents infinite recursion on cyclic topologies.
shortest_path(Src, Dst, Cost) :-
    link(Src, Mid, C1),
    shortest_path(Mid, Dst, C2),
    Cost is C1 + C2.

%% DESIGN NOTE — The Action Void and live_link/3:
%%
%% shortest_path/3 above traverses link/3 — static physical topology facts.
%% It has no knowledge of whether the hypervisors at each endpoint are healthy.
%% A link from pve3 to leaf_a still appears traversable even when Chapter 22's
%% live_state module has asserted node_health(pve3, critical).
%%
%% This is intentional for this module. proxmox_topology.pl is the authority on
%% static physical connectivity — what cables exist. live_state.pl is the
%% authority on live node health. Mixing the two concerns in this predicate
%% would introduce a circular module dependency (live_state already imports
%% proxmox_topology for known_node/1).
%%
%% The health-aware routing predicate is live_state:live_query_path/4, defined
%% in Chapter 22 §22.3.4. It traverses live_state:live_link/3, which gates
%% each link on healthy_node(A) and healthy_node(B). Chapter 26 builds the
%% full Closed-Loop Remediation layer on live_query_path/4 — routing decisions
%% made during active eviction use live_query_path/4, never this predicate.
%%
%% This predicate (shortest_path/3) remains useful for:
%%   - Visualisation tools that show the full static topology
%%   - Debugging: confirming a physical path exists before checking health
%%   - REPL exploration during cluster bring-up before any live metrics arrive

%% reachable(+Src, +Dst)
%% Succeeds if any path exists from Src to Dst.
%% Uses simple tabling (no subsumption) — boolean answer per pair.
:- table reachable(+, +).

reachable(Src, Dst) :- link(Src, Dst, _).
reachable(Src, Dst) :- link(Src, Mid, _), reachable(Mid, Dst).

%% ---------------------------------------------------------------------------
%% PUBLIC API — GROUNDING GUARDS
%% (See Section 17.5 for the security rationale)
%% ---------------------------------------------------------------------------

%% query_path(+Src, +Dst, -Cost, -Path)
%% Public API for shortest_path.
%% Enforces ground(Src) and ground(Dst) before delegating to tabled predicate.
%% Throws instantiation_error if either argument is unbound.
%%
%% Path reconstruction: once Cost is known, path_trace/4 reconstructs
%% the explicit sequence of hops. This is NOT tabled — it is computed
%% on demand from the link facts using the known minimum cost.
query_path(Src, Dst, Cost, Path) :-
    must_be(ground, Src),   % Throws instantiation_error if unbound
    must_be(ground, Dst),   % Throws instantiation_error if unbound
    shortest_path(Src, Dst, Cost),
    path_trace(Src, Dst, Cost, Path).

%% path_trace(+Src, +Dst, +Cost, -Path)
%% Reconstructs the explicit hop sequence achieving Cost.
%% Non-tabled — called once after shortest_path establishes the minimum Cost.
path_trace(Src, Dst, Cost, [Src, Dst]) :-
    link(Src, Dst, Cost),
    !.
path_trace(Src, Dst, Cost, [Src | Rest]) :-
    link(Src, Mid, C1),
    C2 is Cost - C1,
    C2 > 0,
    path_trace(Mid, Dst, C2, Rest),
    !.

%% query_reachable(+Src, +Dst, -Bool)
%% Public API for reachable.
%% Returns Bool = true/false rather than succeed/fail for Go interop.
query_reachable(Src, Dst, true) :-
    must_be(ground, Src),
    must_be(ground, Dst),
    reachable(Src, Dst),
    !.
query_reachable(Src, Dst, false) :-
    must_be(ground, Src),
    must_be(ground, Dst).

%% ---------------------------------------------------------------------------
%% DYNAMIC LINK MANAGEMENT
%% ---------------------------------------------------------------------------

%% assert_link(+Node1, +Node2, +Cost)
%% Adds a new bidirectional link and flushes topology tables.
%% Tables are stale after any structural change — flush before next query.
assert_link(N1, N2, Cost) :-
    must_be(ground, N1),
    must_be(ground, N2),
    must_be(positive_integer, Cost),
    ( link(N1, N2, _) -> true ; assertz(link(N1, N2, Cost)) ),
    ( link(N2, N1, _) -> true ; assertz(link(N2, N1, Cost)) ),
    flush_topology_tables.

%% retract_link(+Node1, +Node2, +Cost)
%% Removes a failed link and flushes topology tables.
retract_link(N1, N2, Cost) :-
    must_be(ground, N1),
    must_be(ground, N2),
    retractall(link(N1, N2, Cost)),
    retractall(link(N2, N1, Cost)),
    flush_topology_tables.

%% flush_topology_tables/0
%% Abolishes all tabled answers for this module's tabled predicates.
%% Called after any structural change to the topology.
%% In the worker pool context, this is called by each worker upon receiving
%% a ControlFlushTables broadcast from the Go pool manager.
%%
%% Note: live_state:live_shortest_path/3 is NOT tabled (see §22.3.4 commentary)
%% so no live_state table flush is required here. If live_shortest_path/3 is
%% tabled in a future iteration with incremental dependency tracking, the Chapter
%% 19 §19.3.3 incremental tabling mechanism handles invalidation automatically
%% when node_metric/4 facts are retracted/asserted.
flush_topology_tables :-
    abolish_table_subgoals(shortest_path(_, _, _)),
    abolish_table_subgoals(reachable(_, _)).
    %% abolish_table_subgoals/1 removes specific predicate tables.
    %% abolish_all_tables/0 would remove ALL tabled predicates system-wide —
    %% too broad if other modules also use tabling (e.g., archive_ingestor.pl).
    %% Prefer abolish_table_subgoals/1 scoped to this module's predicates.

17.3.2 REPL Demonstration

?- use_module('/opt/logic-node/kb/topology/proxmox_topology').
true.

% Direct link query — O(1) from link/3 fact
?- query_path(pve1, leaf_a, Cost, Path).
Cost = 10,
Path = [pve1, leaf_a].   % Direct link, 10μs

% Cross-rack query — routes through spine
?- query_path(pve1, pve4, Cost, Path).
Cost = 30,
Path = [pve1, leaf_a, spine1, leaf_b, pve4].
% 10 (pve1→leaf_a) + 5 (leaf_a→spine1) + 5 (spine1→leaf_b) + 10 (leaf_b→pve4) = 30μs
% Answer Subsumption discarded the spine2 route (10+5+6+10 = 31μs)
% and the mixed route (10+5+2+6+10 = 33μs) during table construction

% Answer Trie is populated — subsequent identical query is O(1) trie lookup
?- query_path(pve1, pve4, Cost, Path).   % Second call
Cost = 30,
Path = [pve1, leaf_a, spine1, leaf_b, pve4].
% Retrieved from Answer Trie — no recomputation

% Storage path query
?- query_path(pve7, storage1, Cost, Path).
Cost = 12,
Path = [pve7, storage1].   % Direct storage fabric link

% Verify cycle termination — this query would loop infinitely under SLD
?- query_reachable(pve1, pve1, Bool).
Bool = true.   % Reachable from itself via any cycle — terminates in O(1)

% Topology modification: spine1 fails
?- retract_link(leaf_a, spine1, 5).
true.
% Tables flushed — next query recomputes via spine2

?- query_path(pve1, pve4, Cost, Path).
Cost = 32,
Path = [pve1, leaf_a, spine2, spine1, leaf_b, pve4].
% New minimum: 10+5+2+5+10 = 32μs via spine2→spine1 east-west hop
% (spine1 still reachable via spine2's interconnect)

% Ungrounded query — rejected by grounding guard
?- query_path(X, pve4, Cost, Path).
ERROR: type_error(ground, _G123)
% must_be(ground, X) fired before shortest_path/3 was called.
% No table lookup occurred. No table space allocated.

% Table space inspection
?- table_statistics(shortest_path/3).
% Variant count: 6   (pve1→leaf_a, pve1→pve4, etc.)
% Answer count:  6   (one per variant, due to lattice(min/3))
% Memory:        ~384 bytes

17.3.3 Complexity Comparison

Approach Cycle handling Memory Query latency Re-query latency
SLD without visited list Stack overflow 128MB stack (then crash) ∞ (loops) N/A
SLD with explicit visited list Manual cycle detection O(path_len) per query O(n! paths) O(n! paths)
BFS in Prolog (procedural) Visited set O(n²) O(n²) O(n²)
SLG tabling without subsumption Guaranteed termination O(paths per pair) O(n!) first O(1)
SLG + lattice(min/3) Guaranteed termination O(n²) entries O(n²) first O(1)

The SLG + lattice(min/3) column is the correct operating point: first-query cost is O(n²) as the engine populates the Answer Trie; all subsequent queries for any previously-computed (Src, Dst) pair are O(1) trie lookups. For a stable Proxmox topology, the trie is populated on the first batch of routing queries after startup (or after a flush_topology_tables) and serves all subsequent queries without recomputation until the next topology change.


17.4 Distributed Cache Invalidation

17.4.1 The Staleness Problem

Each Chapter 16 worker goroutine owns a private WAM engine with independent table space. After worker 0's engine evaluates shortest_path(pve1, pve4, Cost) for the first time, the answer Cost = 30 is stored in worker 0's Answer Trie. Workers 1 through 15 have empty tries — they have not received this query yet.

When spine1 fails — retract_link(leaf_a, spine1, 5) executes on the Prolog side, triggered by a topology change event from the monitoring pipeline — two operations must happen atomically from the routing layer's perspective:

  1. The link(leaf_a, spine1, 5) and link(spine1, leaf_a, 5) facts are retracted from the shared clause database. This affects all engines simultaneously — link/3 is a dynamic predicate in the shared database.

  2. Every engine's shortest_path/3 and reachable/2 Answer Tries must be flushed. These are per-engine, per-table-space structures. Flushing one engine's tables does not affect any other.

Step 1 is automatic: the shared clause database change is visible to all engines immediately upon retractall. Step 2 requires explicit abolish_table_subgoals/1 on every engine. Until step 2 completes on a given engine, that engine's shortest_path(pve1, pve4, Cost) query returns Cost = 30 from the stale trie — the pre-failure shortest path through the now-retracted spine1 link.

The Go ControlFlushTables broadcast is the mechanism that drives step 2 across all 16 workers.

17.4.2 ControlFlushTables — Go Pool Extension

// engine_pool.go — additions to Chapter 16's ControlKind constants

const (
    ControlSync         ControlKind = iota // No-op ping
    ControlReload                          // Reload KB
    ControlShutdown                        // Graceful shutdown
    ControlFlushTables                     // NEW: abolish tabled answers on this engine
)
// handleControl — add the ControlFlushTables case to the existing switch:

case ControlFlushTables:
    // Execute abolish_table_subgoals scoped to the topology module.
    // This is a Prolog-level operation — must go through the FLI.
    // Uses the existing consultFile/executeGoal infrastructure from Ch 15.
    log.Printf("[Worker %d] Flushing topology tables", workerID)
    if err := executeGoalOnEngine("proxmox_topology:flush_topology_tables"); err != nil {
        // Flush failure is logged but not fatal.
        // The worker continues with potentially stale tables.
        // The pool manager will see this in the next status report.
        log.Printf("[Worker %d] Table flush FAILED: %v", workerID, err)
    } else {
        log.Printf("[Worker %d] Topology tables flushed OK", workerID)
    }
    return nil
// executeGoalOnEngine: executes a Prolog goal string on the calling thread's engine.
// The calling goroutine must be locked to an OS thread with an attached engine.
// Used for control operations like flush_topology_tables — not for data queries.
func executeGoalOnEngine(goalStr string) error {
    runtime.LockOSThread()
    // Note: already locked in workerLoop — this call is a no-op if already locked.
    // Included for clarity and defensive correctness.

    frame := C.PL_open_foreign_frame()
    defer C.PL_discard_foreign_frame(frame)

    args := C.PL_new_term_refs(1)
    cGoal := C.CString(goalStr)
    defer C.free(unsafe.Pointer(cGoal))

    // Build the goal term as an atom, then call call/1.
    // call/1 accepts an atom and executes it as a Prolog goal.
    if C.PL_put_atom_chars(args, cGoal) == 0 {
        // goalStr is a compile-time constant — PL_put_atom_chars is safe here.
        // This is NOT user-controlled input — it is a fixed goal string.
        return fmt.Errorf("PL_put_atom_chars failed for goal %q", goalStr)
    }

    pred := C.PL_predicate(C.CString("call"), 1, C.CString("system"))
    qid  := C.PL_open_query(nil, C.PL_Q_CATCH_EXCEPTION|C.PL_Q_NODEBUG, pred, args)
    rc   := C.PL_next_solution(qid)

    if rc == 0 {
        exStr := C.pl_exception_to_string(qid)
        C.PL_close_query(qid)
        if exStr != nil {
            msg := C.GoString(exStr)
            C.free(unsafe.Pointer(exStr))
            return fmt.Errorf("goal %q threw: %s", goalStr, msg)
        }
        return fmt.Errorf("goal %q failed", goalStr)
    }
    C.PL_close_query(qid)
    return nil
}

17.4.3 Triggering Invalidation from the HTTP API

// In main.go — add topology change endpoints that broadcast ControlFlushTables

// topologyChangeHandler: called when a link failure is reported by monitoring.
// Executes the Prolog retract on one engine (the retract affects the shared DB),
// then broadcasts ControlFlushTables to flush stale tabled answers on all engines.
func topologyChangeHandler(w http.ResponseWriter, r *http.Request) {
    var body struct {
        Action string `json:"action"` // "remove_link" | "add_link"
        Node1  string `json:"node1"`
        Node2  string `json:"node2"`
        Cost   int    `json:"cost"`
    }
    if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
        http.Error(w, "invalid JSON", http.StatusBadRequest)
        return
    }

    // Validate: node names must be from the known topology vocabulary.
    // This prevents injection of arbitrary Prolog goals via the node name field.
    // known_topology_nodes is a static set defined at startup from the KB.
    if !knownTopologyNode(body.Node1) || !knownTopologyNode(body.Node2) {
        http.Error(w, "unknown topology node", http.StatusBadRequest)
        return
    }

    // Build and dispatch the topology mutation goal.
    // Goal string is constructed from validated tokens — not from raw user input.
    var goal string
    switch body.Action {
    case "remove_link":
        goal = fmt.Sprintf(
            "proxmox_topology:retract_link(%s, %s, %d)",
            body.Node1, body.Node2, body.Cost)
    case "add_link":
        goal = fmt.Sprintf(
            "proxmox_topology:assert_link(%s, %s, %d)",
            body.Node1, body.Node2, body.Cost)
    default:
        http.Error(w, "unknown action", http.StatusBadRequest)
        return
    }

    // Execute the topology mutation on one worker.
    // The retract/assert affects the shared clause database — all engines see it.
    result, err := globalPool.Dispatch(FirewallRequest{
        // Topology mutations are routed as a special request type.
        // See Exercise 17.3 for the typed request union implementation.
    }, 500*time.Millisecond)
    _ = result
    if err != nil {
        http.Error(w, err.Error(), http.StatusServiceUnavailable)
        return
    }

    // Broadcast ControlFlushTables to all 16 workers.
    // Each worker calls flush_topology_tables on its own engine between queries.
    // Consistency window: ~320μs at 16 workers × 20μs/query.
    globalPool.Broadcast(ControlMsg{
        Kind:    ControlFlushTables,
        Payload: goal, // Carried for logging — not re-executed by workers
    })

    w.WriteHeader(http.StatusAccepted)
    w.Write([]byte(`{"status":"topology_updated","tables":"invalidated"}`))
}

17.4.4 Table Invalidation Sequencing

Timeline: spine1 link failure propagated through the system

T=0ms    Monitoring pipeline detects spine1 port failure.
         POST /topology {"action":"remove_link","node1":"leaf_a","node2":"spine1","cost":5}

T=1ms    topologyChangeHandler executes:
         retractall(link(leaf_a, spine1, 5)) → shared DB updated, all engines see it.
         retractall(link(spine1, leaf_a, 5)) → shared DB updated.
         Broadcast ControlFlushTables sent to ctrlCh (16 copies queued).

T=1-2ms  Workers 0-15 finish current queries, read ControlFlushTables from ctrlCh.
         Each calls flush_topology_tables:
           abolish_table_subgoals(shortest_path(_, _, _))
           abolish_table_subgoals(reachable(_, _))
         Answer Tries cleared. Table space released.

T=2ms    First new routing query arrives: query_path(pve1, pve4, Cost, Path).
         shortest_path/3 Answer Trie is empty — recomputation begins.
         link(leaf_a, spine1, _) is now absent from the DB.
         New minimum path: Cost = 32 via spine2.
         Answer Trie populated with new result.

T=2-300ms  All 14-node topology pair costs recomputed across worker engines
           as queries for each pair arrive. Total table space: ~12KB.

INVARIANT: No routing decision uses the pre-failure Cost = 30 path
           after T=2ms. The spine1 link is absent from the DB.
           Any worker that had Cost = 30 cached had its table flushed
           before T=2ms. Zero stale routing decisions after flush completes.

17.4.5 Shared Tabling: Eliminating the Broadcast Entirely

The ControlFlushTables broadcast works correctly but carries a 320μs consistency window during which engines hold stale answers. SWI-Prolog 9.0 introduced Shared Tabling — a directive option that moves the Answer Trie from per-engine thread-local storage into the global shared clause database, making it accessible and mutable from all WAM engines simultaneously.

%% Per-thread tabling (default — Chapter 17.4.1 through 17.4.4):
:- table shortest_path(+, +, lattice(min/3)).
%% Answer Trie lives in each engine's private table space.
%% Flush requires broadcast to all N engines.

%% Shared tabling (SWI-Prolog 9.0+):
:- table shortest_path(+, +, lattice(min/3)) as shared.
%% Answer Trie lives in the global shared clause database.
%% One abolish_table_subgoals call invalidates the trie for ALL engines.

Under shared tabling, the invalidation sequence for a spine1 failure collapses from a 16-worker broadcast to a single operation:

%% flush_topology_tables — shared tabling variant:
flush_topology_tables :-
    abolish_table_subgoals(shortest_path(_, _, _)),
    abolish_table_subgoals(reachable(_, _)).
    %% Both calls operate on the shared global trie.
    %% All 16 engines see the empty trie on their next query — instantly.
    %% No ControlFlushTables broadcast required.
    %% No 320μs consistency window.

The Go side simplifies correspondingly: ControlFlushTables and handleControl's flush case are no longer needed. The topologyChangeHandler calls flush_topology_tables once via executeGoalOnEngine on any single worker, and the invalidation is complete globally. The Broadcast call in topologyChangeHandler is replaced by a single Dispatch:

// topologyChangeHandler — shared tabling variant (replaces the Broadcast call):
result, err := globalPool.Dispatch(FirewallRequest{ /* topology mutation */ },
    500*time.Millisecond)
if err != nil { /* handle */ }
// Single dispatch executes retractall + flush_topology_tables.
// flush_topology_tables operates on the shared global trie.
// All 16 workers see the empty trie on their next routing query.
// No broadcast. No consistency window.

The concurrency contract for shared tabling differs from per-thread tabling in one important respect: the shared Answer Trie is protected by a SWI-Prolog internal read-write lock. Concurrent reads (multiple workers querying shortest_path/3 simultaneously) proceed without contention — readers do not block each other. A write — abolish_table_subgoals or an answer insertion during the first post-flush population — acquires the write lock and briefly serialises all concurrent readers waiting on that trie entry. For the 14-node topology, this serialisation window is microseconds. For a 5,000-node topology under high concurrency, the write lock during full population becomes a bottleneck: all 16 workers racing to insert answers into the shared trie serialise through the write lock O(n²) times.

When to prefer shared vs. per-thread tabling:

Condition Use shared tabling Use per-thread tabling
Topology changes infrequent (seconds/minutes) ✓ Simpler invalidation, zero consistency window ✗ Broadcast overhead acceptable but unnecessary
Topology changes frequent (≤10/second) ✗ Write-lock contention during rapid reloads ✓ Per-engine isolation absorbs reload bursts
Topology ≤ 1,000 nodes ✓ Shared trie population fast; write-lock window short ✓ Either works
Topology > 5,000 nodes, 16+ workers ✗ Write-lock contention during O(n²) insertions ✓ Parallel per-engine population, no shared contention

The Proxmox production topology (14–100 nodes, topology changes on minute timescales) sits firmly in the shared-tabling-preferred quadrant. The as shared annotation is a one-token change to proxmox_topology.pl that eliminates the Go broadcast infrastructure entirely for this use case.


17.5 Security: Table Space Exhaustion and Grounding Guards

17.5.1 The Ungrounded Query DoS — Physics

shortest_path/3 with lattice(min/3) mode-checks its first two arguments. An ungrounded call slips through if the mode annotation is not present or if the caller bypasses query_path/4. The DoS vector operates as follows:

% DANGEROUS: direct call to tabled predicate with unbound arguments
?- shortest_path(X, Y, Cost).

With 14 nodes in the topology, this query asks the engine to compute shortest_path/3 for all 14 × 14 = 196 distinct (X, Y) pairs simultaneously. With Answer Subsumption, each pair produces one entry in the Answer Trie. Total cost: 196 entries × ~64 bytes = 12.5KB. For 14 nodes, this is trivial.

Scale the attack:

Ungrounded query against a topology with N nodes:

N = 50  (medium Proxmox cluster):
  196 pairs → 2,500 pairs.
  Each entry: 64 bytes.
  Table space: 160KB.  (Still trivial)

N = 500  (large SDN with VLAN decomposition into virtual nodes):
  Pairs: 250,000.
  Table space: 16MB.   (Acceptable at first)

N = 5,000  (full BGP routing table decomposed to hop-by-hop nodes):
  Pairs: 25,000,000.
  Table space: 1.6GB.  (Process killed by OOM killer)

N = 10,000  (threat model: attacker submits fabricated topology facts via
            a KB injection vulnerability, then issues ungrounded query):
  Pairs: 100,000,000.
  Table space: 6.4GB.  ← OOM kill before completion
  During allocation: Go GC and WAM GC compete for physical RAM.
  All 16 worker engines on the same NUMA node experience memory pressure.
  Other services on the host begin failing.

The set_prolog_flag(table_space, 64MB) flag limits the total table space across all tabled predicates on a single engine. When the limit is reached, the WAM throws ERROR: Resource error: max_table_space. Chapter 16's panic recovery catches this and destroys the engine. The pool manager respawns a replacement. But the OOM pressure during the allocation window — before the limit is hit — has already affected other workers. The table_space flag is the last defence, not the first.

The first defence is the grounding guard, which must execute before the tabled predicate is called.

17.5.2 The Grounding Guard Pattern

%% CORRECT: query_path/4 — public API with mandatory grounding guards
query_path(Src, Dst, Cost, Path) :-
    %% must_be(ground, X) throws type_error(ground, X) if X is not ground.
    %% type_error is a Prolog standard error — caught by PL_exception(qid)
    %% in cgo_bridge.go and converted to a Go error before reaching the HTTP handler.
    %% The Go caller receives a 400 Bad Request. No table lookup occurs.
    %% No table space is allocated.
    must_be(ground, Src),
    must_be(ground, Dst),
    %% Additional guards: Src and Dst must be known topology nodes.
    %% An attacker who submits a valid ground atom that is NOT in the topology
    %% causes shortest_path/3 to fail immediately (no matching link facts)
    %% without allocating table space.
    ( known_node(Src) -> true
    ; type_error(topology_node, Src) ),
    ( known_node(Dst) -> true
    ; type_error(topology_node, Dst) ),
    shortest_path(Src, Dst, Cost),
    path_trace(Src, Dst, Cost, Path).

%% known_node/1: closed vocabulary of valid topology nodes.
%% Pre-compiled at startup from the link/3 facts.
%% Do NOT use link(Node, _, _) as the guard — that would allow any ground atom
%% that appears as a source in the link facts, including attacker-injected ones.
:- dynamic known_node/1.

initialise_known_nodes :-
    retractall(known_node(_)),
    aggregate_all(set(N), (link(N, _, _) ; link(_, N, _)), Nodes),
    maplist([Node]>>(assertz(known_node(Node))), Nodes).

:- initialise_known_nodes.

The known_node/1 guard provides two protections simultaneously:

  1. Ungrounded query prevention: must_be(ground, Src) fires first, rejecting any unbound argument before known_node/1 or shortest_path/3 is reached.

  2. Fabricated node injection prevention: An attacker who injects a new link(attacker_node, pve1, 0) fact via a KB mutation vulnerability cannot then issue query_path(attacker_node, pve1, Cost, Path)known_node(attacker_node) fails because initialise_known_nodes was called at load time and attacker_node was not present in the link facts then. The closed vocabulary is frozen at startup. Adding initialise_known_nodes to the assert_link/3 handler keeps the vocabulary consistent with the legitimate topology.

17.5.3 table_space as the Final Backstop

%% Set per-engine table space limit at KB load time.
%% 64MB accommodates a 700-node topology with full shortest_path/3 population:
%%   700² pairs × 64 bytes = 31.4MB — within the 64MB limit.
%% A 1,000-node topology would require 64MB and may hit the limit under full population.
%% Increase proportionally with topology scale; reduce if memory is constrained.

:- set_prolog_flag(table_space, 64_000_000).  % 64MB per engine
// In engine_pool.go — table_space flag set during engine initialisation.
// InitEngine's argv includes the table_space flag as a command-line parameter
// equivalent to set_prolog_flag. SWI-Prolog accepts --table-space=64M:

argv := []*C.char{
    C.CString("swipl"),
    C.CString("--quiet"),
    C.CString("--table-space=64M"),  // Per-engine table space limit
    C.CString("-g"),
    C.CString("true"),
    C.CString("-t"),
    C.CString("halt"),
}

When the table_space limit is hit:

?- shortest_path(X, Y, Cost).   % Ungrounded — bypassed query_path/4 somehow
ERROR: Resource error: max_table_space
%   Overflow: table_space limit (64,000,000) exceeded
// In QueryFirewall / executeGoalOnEngine — PL_exception(qid) catches this:
// exStr = "Resource error: max_table_space"
// err = fmt.Errorf("prolog exception: Resource error: max_table_space")
// HTTP response: 500 Internal Server Error with the error string.
// Worker engine is NOT destroyed — the error is a Prolog exception, not a WAM crash.
// The table_space limit is enforced at the Prolog level, above the C crash boundary.

The distinction is important: table_space exhaustion throws a catchable Prolog exception (error(resource_error(max_table_space), _)). The engine continues operating. Only the query that triggered the exhaustion receives the error. Other workers are unaffected. The panic recovery from Section 16.5 is not triggered. The table space allocated before the limit was hit is still present and must be released — the worker should call flush_topology_tables after receiving a table_space exhaustion error, to return to a clean state.

17.5.4 OOMScoreAdjust: Protecting the Orchestrator Under System Memory Pressure

Tabling trades WAM stack depth for WAM table space. Under normal operation, the per-engine table_space flag prevents individual engines from exhausting memory. Under abnormal operation — a misconfigured flag, a proxmox_topology.pl update that doubles the node count without updating the limit, or a shared-tabling population surge on a 5,000-node topology — the Go binary's RSS (Resident Set Size) can grow faster than the OOM killer's sampling interval on a hypervisor under VM memory pressure.

Linux's OOM killer scores processes on a 0–1000 scale: higher scores are killed first. A process with a score of 0 is killed only as a last resort; a process at 1000 is killed immediately when memory is exhausted. By default, all processes start at a baseline score derived from their RSS and runtime. A short-lived metrics exporter with small RSS and a low oom_score_adj may still receive a lower kill score than the long-running, larger-RSS Go orchestrator binary, making the orchestrator a more attractive kill target despite being operationally critical.

OOMScoreAdjust in the systemd service unit shifts the process's OOM kill score by a fixed offset. Negative values make the process harder to kill:

# /etc/systemd/system/firewall-bridge.service
[Service]
ExecStart=/usr/local/bin/firewall-bridge
CPUAffinity=0-15
NUMAPolicy=bind
NUMAMask=0
OOMScoreAdjust=-500
# -500: shifts OOM kill score 500 points toward "kill last".
# Range: -1000 (never kill) to +1000 (kill first).
# -500 ensures that logging daemons, metrics exporters, and
# monitoring agents (typically OOMScoreAdjust=0) are killed
# well before the orchestrator binary when the host is under
# extreme memory pressure from VM balloon drivers.
# Do NOT use -1000: that setting prevents the OOM killer from
# ever targeting this process, which can cause system-wide deadlock
# if the orchestrator itself is the source of the memory leak.
# Verify the active OOM score after service start:
logicadmin@logic-node-01:~$ cat /proc/$(pgrep firewall-bridge)/oom_score_adj
-500   # ✓ Correct

# Verify the computed OOM kill score (lower = harder to kill):
logicadmin@logic-node-01:~$ cat /proc/$(pgrep firewall-bridge)/oom_score
17     # e.g., baseline 517 - 500 adjustment = 17

# Compare with a typical metrics exporter (no adjustment):
logicadmin@logic-node-01:~$ cat /proc/$(pgrep node_exporter)/oom_score
234    # Higher score — killed before the orchestrator

The OOMScoreAdjust=-500 entry belongs in the same [Service] stanza as the CPUAffinity and NUMAPolicy directives established in Chapter 16, Section 16.1.2.1. Together they constitute the complete systemd hardening baseline for the sovereign orchestrator: NUMA-bound execution, CPU-affinitised threads, and OOM-protected process lifetime.


Outcome: Declarative Algorithmic Optimisation

17.6.1 The Conceptual Transition

Chapters 1 through 16 built the infrastructure: the parsers, the KB, the CGO bridge, the worker pool, the panic recovery, the distributed control channel. All of it delivers Prolog queries into the hands of Go HTTP handlers at 20μs latency. Chapter 17 changes what those queries are allowed to ask.

Before tabling, infrastructure graph queries required procedural implementation: a BFS in Prolog with an explicit visited set, or a Dijkstra implementation accumulating a priority queue in a list, or a memoisation wrapper that manually checks whether a result has already been computed. Every one of these implementations replicates, less correctly, what the SLG resolution engine already does — and none of them is declarative. A procedural BFS is 80 lines of Prolog. A tabled shortest_path/3 is 3 lines.

The 3-line declarative implementation replaces the 80-line procedural implementation entirely.

Procedural BFS in Prolog Tabled SLG with lattice(min/3)
~80 lines of explicit graph traversal 3 lines: two rules + :- table directive
Manual visited set — programmer's responsibility Cycle detection automatic — Call Trie
Returns one path, not necessarily optimal Returns minimum-cost path — Answer Subsumption
Must be re-run for each query Memoised — O(1) on cached pairs
Stack overflow on deep recursion O(n²) table space, O(1) stack depth
Cache invalidation: manual abolish_table_subgoals/1 — one call
Distributing invalidation: ad-hoc ControlFlushTables broadcast — pool-wide

17.6.2 Verification Checklist

% 1. Cycle termination: pve1→pve1 terminates
?- query_reachable(pve1, pve1, Bool).
Bool = true.   % ✓ Terminates; does not loop

% 2. Shortest path correctness: pve1→pve4
?- query_path(pve1, pve4, Cost, Path).
Cost = 30, Path = [pve1, leaf_a, spine1, leaf_b, pve4].   % ✓

% 3. Answer Subsumption: suboptimal paths discarded
%    Verify only one answer exists in the trie for this pair
?- aggregate_all(count, shortest_path(pve1, pve4, _), N).
N = 1.   % ✓ Exactly one answer per pair — lattice(min/3) enforced

% 4. Table reuse: second query for same pair is O(1) trie lookup
%    (Measure via get_time/1 — should be < 1μs vs ~20μs for first call)
?- query_path(pve1, pve4, Cost, _).
Cost = 30.   % ✓ Returns immediately from trie

% 5. Link failure propagation: retract spine1
?- retract_link(leaf_a, spine1, 5).
true.   % ✓ Shared DB updated, tables flushed

% 6. Post-failure re-routing: new path avoids spine1
?- query_path(pve1, pve4, Cost, Path).
Cost = 32.   % ✓ Rerouted via spine2

% 7. Grounding guard: unbound Src rejected before table lookup
?- query_path(_, pve4, _, _).
ERROR: type_error(ground, _G123)   % ✓ must_be(ground, Src) fired

% 8. Unknown node rejected
?- query_path(attacker_node, pve4, _, _).
ERROR: type_error(topology_node, attacker_node)   % ✓ known_node guard fired

% 9. table_space flag active
?- current_prolog_flag(table_space, N).
N = 64000000.   % ✓ 64MB per-engine limit set

% 10. ControlFlushTables via Go pool broadcasts abolish to all engines
%     (Verified via HTTP endpoint after topology change)

17.6.3 What Comes Next

The proxmox_topology.pl rules built in this chapter are pure Prolog: declarative facts and tabled predicates with no dependency on the WAM threading model, the CGO bridge, or the Go runtime. They are portable. The SLG engine that evaluates them is not tied to a server process or a CGO binary.

Chapter 18 takes these exact routing rules and compiles them to WebAssembly using the SWI-Prolog WASM target (swipl-wasm). A network operations engineer opening a browser-based topology dashboard receives the compiled proxmox_topology.wasm module. The tabled shortest_path/3 predicate executes entirely client-side — no round trip to the server for each routing query, no HTTP latency, no load on the CGO worker pool. The WASM runtime provides a sandboxed WAM with its own table space, bounded by the browser tab's memory allocation. The same lattice(min/3) Answer Subsumption that makes server-side routing O(1) on cached pairs makes browser-side routing equally efficient after the first population sweep.

The security contract changes at the WASM boundary: input validation and grounding guards still apply, but the threat model shifts from server memory exhaustion to browser tab memory exhaustion. The known_node/1 vocabulary, exported as part of the WASM module's initialisation goal, is static and tamper-evident — the operator's browser verifies the module's hash before loading it. The table space limit, set at compile time in the WASM module's Prolog flags, bounds the maximum routing state the browser can accumulate. Both constraints from Section 17.5 carry forward unchanged.


Chapter Summary

Concept Operational Definition Performance / Security Consequence
SLD resolution on cyclic graphs Depth-first evaluation; no cycle detection; infinite recursion on link(a,b). link(b,a). WAM local stack exhaustion in O(depth/frame_size) time; triggers Ch 16 panic recovery; wrong tool for graph queries
SLG resolution (tabling) Fixed-point evaluation via Call Trie and Answer Trie; cycles detected by Call Trie membership check Terminates on any finite Herbrand base; stack depth O(1); table space O(distinct answers)
Call Trie Per-engine hash trie of in-progress calls; keyed by call pattern Cycle detected when call pattern already present; continuation suspended rather than recursed
Answer Trie Per-engine storage of derived answers for tabled predicates O(1) lookup on cached calls; persists until abolish_table_subgoals/1 or abolish_all_tables/0
:- table shortest_path(+, +, lattice(min/3)) Enables SLG for shortest_path/3; mode (+,+) enforced; lattice(min/3) applies meet on Cost One Answer Trie entry per (Src, Dst) pair; O(n²) total entries; dominated paths discarded at derivation time
lattice(min/3) meet operation min(Old, New, Result) :- Result is min(Old, New) applied when new answer arrives Answer Trie never exceeds O(n²) entries; minimum-cost path computed incrementally, not post-hoc
Answer Subsumption correctness requirements Meet must be commutative, associative, idempotent, bounded below Non-negative edge weights required; negative weights invalidate the lattice — use Bellman-Ford instead
abolish_table_subgoals/1 Removes Answer Trie entries for a specific predicate pattern Preferred over abolish_all_tables/0 (too broad); scoped to affected predicate; safe to call between queries
ControlFlushTables broadcast Go pool broadcasts N copies to ctrlCh; each worker calls flush_topology_tables between queries Consistency window: N × avg_query_time ≈ 320μs at 16 workers; operationally acceptable for topology changes on second timescales
set_prolog_flag(table_space, N) Per-engine ceiling on total table space allocation Throws resource_error(max_table_space) when exceeded; catchable Prolog exception — engine survives; last-resort backstop
must_be(ground, Src) guard Throws type_error(ground, Src) if Src is unbound; executes before tabled call Zero table allocation on invalid call; ungrounded query DoS blocked at guard; error converted to Go error by PL_exception(qid)
known_node/1 closed vocabulary Pre-computed at load time from link/3 facts; checked in query_path/4 Blocks fabricated-node injection even after a KB mutation vulnerability; topology vocabulary frozen at startup
Ungrounded query DoS vector shortest_path(X, Y, Cost) with unbound X, Y forces full (Src×Dst) trie population 10,000 nodes → 100M entries → 6.4GB → OOM kill; NUMA pressure affects all workers; prevented by must_be + known_node before table_space backstop fires
:- table ... as shared (SWI 9.0+) Answer Trie allocated in global shared clause database; readable by all engines concurrently; write-locked during insertion/abolish One abolish_table_subgoals invalidates all 16 engines instantly; eliminates ControlFlushTables broadcast and 320μs consistency window; prefer for ≤ 1,000 nodes and infrequent changes
Shared tabling write-lock contention Concurrent answer insertions serialise through shared-trie write lock during first post-flush population sweep At 5,000 nodes + 16 workers: O(n²) write-lock acquisitions; use per-thread tabling for large topologies under high concurrency
OOMScoreAdjust=-500 (systemd) Shifts OOM kill score 500 points toward "kill last"; complements CPUAffinity and NUMAPolicy in [Service] stanza Logging daemons and metrics exporters (score 0) killed before orchestrator; never use -1000 — that prevents OOM kill entirely and can deadlock the host

Exercises

Exercise 17.1 — Weighted Betweenness Centrality Add a centrality/2 predicate that computes the betweenness centrality of each topology node: the number of shortest paths between all pairs (Src, Dst) that pass through the node. Use findall/3 over the shortest_path/3 Answer Trie (after full population) and path_trace/4 to enumerate which nodes each path passes through. Identify the highest-centrality node in the proxmox_topology.pl example. Verify that it matches intuition: for the spine-leaf topology, the highest-centrality nodes should be the spine switches.

Exercise 17.2 — Maximum Bandwidth Path with lattice(max/3) Extend proxmox_topology.pl to include a bandwidth_mbps attribute on each link. Add a new tabled predicate max_bandwidth_path(+Src, +Dst, -BW) using :- table max_bandwidth_path(+, +, lattice(max/3)). The bandwidth of a path is the minimum bandwidth of any link on the path (the bottleneck). Implement min_on_path_bandwidth/3 to compute this. Verify that max_bandwidth_path/3 returns the path with the highest bottleneck bandwidth, not necessarily the fewest hops.

Exercise 17.3 — Typed Request Union for the Go Dispatcher Section 17.4.3's topologyChangeHandler notes that topology mutations are routed as a special request type. Implement a Request sum type in Go:

type Request struct {
    Kind    RequestKind
    Firewall *FirewallRequest     // for Kind = KindFirewall
    Topology *TopologyRequest    // for Kind = KindTopology
}

Update engine_pool.go to dispatch KindTopology requests to executeGoalOnEngine rather than QueryFirewall. Verify that a KindFirewall request and a KindTopology request can be in-flight simultaneously across different workers, with the topology mutation broadcasting ControlFlushTables upon completion.

Exercise 17.4 — Incremental Table Invalidation flush_topology_tables calls abolish_table_subgoals(shortest_path(_, _, _)) which removes all cached answers. For a topology change that only affects links adjacent to spine1, only pairs (Src, Dst) whose shortest path passes through spine1 need recomputation. Implement flush_affected_paths(+Node) that calls path_trace/4 to find all currently-cached pairs whose path includes Node, and calls abolish_table_subgoals(shortest_path(Src, Dst, _)) only for those specific pairs. Benchmark the recomputation time after a full flush vs. an incremental flush on the 14-node topology, reporting the ratio.

Exercise 17.5 — Table Space Regression Test Write a test TestTableSpaceStability that: (1) calls query_path/4 for all 14×14 = 196 pairs in the topology; (2) records the table space used via table_statistics/2; (3) asserts that the used space is ≤ 64KB (the known bound for this topology); (4) calls query_path(_, _, _, _) directly (bypassing query_path/4's guards) on a patched version without the grounding guard; (5) verifies that table_space exhaustion throws resource_error(max_table_space) with a reduced set_prolog_flag(table_space, 4096) limit. This test must run before any deployment of an updated proxmox_topology.pl.


Further Reading

  • SWI-Prolog Manual: Tabling — https://www.swi-prolog.org/pldoc/man?section=tabling — complete reference for :- table, mode directives, answer subsumption, abolish_table_subgoals/1, abolish_all_tables/0, table_statistics/2
  • SWI-Prolog Manual: Answer Subsumption — https://www.swi-prolog.org/pldoc/man?section=tabling-subsumptionlattice/1, first/0, last/0, and custom meet predicates; the authoritative specification for all subsumption modes used in this chapter
  • Chen, W., Swift, T., & Warren, D.S. (1995). "Efficient Top-Down Computation of Queries Under the Well-Founded Semantics." Journal of Logic Programming 24(3). — the original SLG resolution paper; the Call Trie/Answer Trie model is introduced here
  • Swift, T. & Warren, D.S. (2012). "XSB: Extending Prolog with Tabled Logic Programming." Theory and Practice of Logic Programming 12(1-2). — the XSB system paper; SWI-Prolog's tabling implementation follows the same SLG semantics
  • Dijkstra, E.W. (1959). "A note on two problems in connexion with graphs." Numerische Mathematik 1(1). — the original shortest-path algorithm; lattice(min/3) Answer Subsumption implements Dijkstra's optimality condition declaratively
  • Baader, F. & Nipkow, T. (1998). Term Rewriting and All That. Cambridge University Press. Chapter 2: Terms, Substitutions, and Unification — the theoretical foundation for Herbrand bases and the finite Herbrand base requirement that guarantees SLG termination

End of Chapter 17 — Next: Chapter 18: Prolog at the Edge — Compiling Sovereign Routing Rules to WebAssembly for Client-Side Execution


Revision record: Chapter 17.1 — Architect’s review applied. Fix 1: philosophical paragraph in Section 17.6.1 deleted; replaced with "The 3-line declarative implementation replaces the 80-line procedural implementation entirely." Fix 2: Section 17.6.3 inserted pointing to Chapter 18 Prolog at the Edge (WASM) — explains portability of the tabled routing rules to browser-side WASM execution, client-side table space bounds, and known_node/1 vocabulary as a tamper-evident module export. Footer corrected. Improvement 1: Section 17.4.5 inserted — Shared Tabling via :- table ... as shared; full Prolog and Go simplification code; read-write lock concurrency contract; decision table covering 4 conditions (topology size × change frequency) for shared vs. per-thread preference. Improvement 2: Section 17.5.4 inserted — OOMScoreAdjust=-500 in systemd service unit; OOM score arithmetic; oom_score_adj and oom_score verification commands; cross-reference to Chapter 16 CPUAffinity/NUMAPolicy as the complete systemd hardening baseline. Three new Chapter Summary rows. BookStack tags: swi-prolog, chapter-17, tabling, slg-resolution, answer-subsumption, shortest-path, proxmox-topology, cache-invalidation, table-space, grounding-guards, shared-tabling, wasm, oom-protection, volume-iii