Chapter 15: The CGO Bridge

Textbook: Modern SWI-Prolog (2026 Edition): Sovereign Infrastructure & Industrial Logic
Volume: III — Scaling & Concurrency
Chapter: 15 of 24 — Volume III Opening Chapter
Audience: Senior Engineers, Systems Architects, Infrastructure Security Practitioners
Prerequisites: Chapters 1–13 complete. network_parser.pl, auth_parser.pl, nginx_reversible.pl, archive_ingestor.pl operational at /opt/logic-node/. Go 1.22+ installed. libswipl development headers at /usr/lib/swi-prolog/include/. logicadmin user active.


Core Concepts

An HTTP/REST boundary between a Go microservice and a Prolog inference engine introduces three costs that are unacceptable for microsecond-latency orchestration: TCP stack traversal, JSON serialisation and deserialisation on both sides, and the scheduling latency of two separate OS processes. On a lightly loaded host these costs sum to 0.5–5ms per round trip. Under load they sum to whatever the OS scheduler decides, which is not bounded. A firewall policy check that requires a Prolog proof — "does this source IP fall within any blocked CIDR in the current threat model?" — has no business paying a 5ms serialisation tax when the alternative is a direct C function call into a loaded WAM that returns an answer in 20 microseconds.

libswipl is a shared library. It exposes the entire SWI-Prolog engine — the WAM, the clause database, the module system, all loaded predicates — as a C API. A Go binary compiled with cgo can load that library, call PL_initialise, load a .pl file, open a query, advance through solutions, and read the result, all within the same process address space as the Go runtime. No network. No serialisation. One function call boundary: the cgo C shim.

The engineering cost of this capability is proportional to how precisely the developer understands two memory models that share an address space but have completely incompatible ownership semantics. Go's garbage collector traces the heap from goroutine roots, the runtime relocates goroutine stacks when they grow, and the cgo pointer-passing rules are written on the assumption that any unpinned Go pointer may move. The WAM manages its own stacks: the local stack (activation frames, term_t handles), the global stack (compound terms, strings), and the trail (variable bindings to undo on backtrack). malloc manages the C layer between them. Nothing in the Go GC knows the WAM stacks exist. Nothing in the WAM knows Go heap pointers exist. A pointer handed across either boundary without pinning it on the owning side is a memory corruption waiting for the next GC cycle or the next backtrack.

Five properties define the operational contract for safe Go–Prolog bridging.

1. Go goroutines must be locked to OS threads before calling any FLI function. cgo calls execute on the goroutine's current OS thread. Go's runtime scheduler moves goroutines between OS threads freely unless runtime.LockOSThread() is called. The WAM is not goroutine-aware — it is OS-thread-aware. Every OS thread that calls FLI functions must have a Prolog engine attached to it via PL_thread_attach_engine(NULL) before the first FLI call. Calling an FLI function from an OS thread without an attached engine produces undefined behaviour at the C layer. runtime.LockOSThread() + PL_thread_attach_engine(NULL) together establish the invariant that one goroutine, one OS thread, one WAM engine context are in scope for the duration of the FLI session.

2. term_t handles are WAM local stack indices, not heap pointers. term_t is a uintptr_t typedef in <SWI-Prolog.h>. It is not a pointer to a Prolog term. It is an index into the WAM's local stack frame. Its validity is scoped to the foreign frame in which it was allocated. A term_t handle allocated inside PL_open_foreign_frame() / PL_discard_foreign_frame() is invalid after PL_discard_foreign_frame() is called. Storing it in a Go struct and reading it after the frame is discarded is a stack corruption. The frame boundary is the only safe lifetime scope for term_t values.

3. Every FLI session in a loop requires an explicit foreign frame per iteration. A Go worker that calls Prolog in a loop (processing 10,000 firewall policy queries per second) allocates term_t references on the WAM local stack with each FLI call. Without PL_open_foreign_frame() / PL_discard_foreign_frame() bracketing each loop iteration, those term_t allocations accumulate on the WAM local stack without being reclaimed. The local stack exhausts. The WAM raises a resource error (ERROR: Stack limit exceeded) inside the C call. The Go caller has no handler for that condition, so the binary crashes or continues in an undefined state. The foreign frame is the only mechanism that reclaims local stack space between FLI sessions.

4. Go strings must become Prolog strings, not atoms, at the C boundary. PL_put_atom_chars(term_t t, const char *chars) interns chars as a permanent atom in the shared Atom Table. This is a C-level hash table insert that bypasses every Prolog-level guard built in Volume II. A Go string read from an HTTP request body, an operator input field, or a JSONL file, passed directly to PL_put_atom_chars, produces an Atom Table entry that persists for the lifetime of the process. At 10,000 distinct strings per second, the Atom Table grows by 10,000 permanent entries per second. PL_put_string_chars(term_t t, const char *chars) allocates the string on the WAM global stack as a GC-eligible Prolog string object, exactly as string_codes/2 does at the Prolog level. All user-controlled Go strings cross the FLI boundary as strings, never atoms.

5. Prolog exceptions must be caught at PL_exception(qid) before PL_close_query(qid). PL_next_solution(qid) returns 0 on both failure (no more solutions) and exception (something went wrong). The caller must call PL_exception(qid) after a 0 return to distinguish them. If PL_exception(qid) returns a non-zero term_t, an exception is pending. Reading it, converting it to a Go error, and returning it before calling PL_close_query(qid) is the correct path. Calling PL_close_query(qid) with a pending exception clears the exception silently — the Go caller receives a nil error for a failed query, producing silent incorrect results in every downstream consumer.
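
Property 1 is commonly met by giving all FLI work to one goroutine that stays locked to its OS thread for its whole lifetime. The following is a minimal sketch of that pattern in pure Go. It makes no libswipl calls: `evalPolicy` is a hypothetical stand-in for a real FLI session, and `job`, `policyWorker`, and `check` are illustrative names that do not appear in the bridge built later in this chapter.

```go
package main

import (
	"fmt"
	"runtime"
)

// job carries one request and a channel for its reply.
type job struct {
	sourceIP string
	reply    chan bool
}

// evalPolicy is a hypothetical stand-in for an FLI session
// (open frame, run query, copy result, discard frame).
func evalPolicy(sourceIP string) bool {
	return sourceIP != "203.0.113.7"
}

// policyWorker locks its goroutine to one OS thread for its whole lifetime,
// so every job it serves runs on the thread that would own the Prolog engine.
func policyWorker(jobs <-chan job, done chan<- struct{}) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	// A real worker would call PL_thread_attach_engine(NULL) here, once,
	// and PL_thread_destroy_engine() before returning.
	for j := range jobs {
		j.reply <- evalPolicy(j.sourceIP)
	}
	close(done)
}

// check submits one request to the worker and waits for the verdict.
func check(jobs chan<- job, ip string) bool {
	reply := make(chan bool, 1)
	jobs <- job{sourceIP: ip, reply: reply}
	return <-reply
}

func main() {
	jobs := make(chan job)
	done := make(chan struct{})
	go policyWorker(jobs, done)

	fmt.Println(check(jobs, "198.51.100.1")) // address not on the block list
	fmt.Println(check(jobs, "203.0.113.7"))  // address on the block list
	close(jobs)
	<-done
}
```

Callers on any goroutine talk to the worker through the channel, so the thread and engine invariant lives in one place instead of being re-established on every call.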


Chapter Roadmap

| Section | Title | Focus |
|---|---|---|
| 15.1 | Physics of the CGO Bridge | Go GC vs. WAM stacks, memory ownership, cgo call mechanics |
| 15.2 | The FLI Contract | term_t lifetime, foreign frame discipline, query lifecycle |
| 15.3 | Data Marshaling | Go map → Prolog Dict, type translation table, string vs. atom enforcement |
| 15.4 | The Build: cgo_bridge.go | Engine initialisation, QueryFirewall, proof extraction, error return |
| 15.5 | Security: C-Level Atom Exhaustion | PL_put_atom_chars DoS, exception handling, Go panic prevention |
| Outcome | Zero-Latency Polyglot Logic | Verification checklist, latency comparison |

15.1 The Physics of the CGO Bridge

15.1.1 Why HTTP Between Go and Prolog Is the Wrong Architecture

The standard "microservice" answer to polyglot integration is an HTTP API. The Prolog engine runs as a separate process, exposes endpoints via http_server/2, and the Go service calls it over localhost. This architecture has a fixed cost floor:

HTTP round-trip latency (localhost, lightly loaded):
  TCP loopback:            ~0.02ms
  HTTP/1.1 framing:        ~0.05ms
  JSON marshal (Go):       ~0.10ms  (100-byte payload, encoding/json)
  HTTP parse (Prolog):     ~0.15ms  (library(http/http_server))
  JSON unmarshal (Prolog): ~0.20ms  (library(http/json))
  Prolog query execution:  ~0.02ms  (indexed lookup, 1,000 facts)
  JSON marshal (Prolog):   ~0.20ms
  HTTP response write:     ~0.05ms
  JSON unmarshal (Go):     ~0.10ms
  ─────────────────────────────────
  Total:                   ~0.89ms  per query under zero load

Under 1,000 concurrent requests:
  OS scheduler contention: +2–8ms
  TCP buffer pressure:     +1–3ms
  Go GC pause (STW):       +0–50ms (periodic)
  ─────────────────────────────────
  Total:                   3–12ms typical, unbounded worst case

The Prolog query itself takes 20 microseconds. The surrounding infrastructure consumes 40–600× that budget on overhead alone.

The libswipl in-process model:

CGO call latency (same process, goroutine locked to OS thread):
  cgo call overhead:           ~0.10μs  (runtime.cgocall trampoline)
  PL_open_foreign_frame:       ~0.05μs
  term_t allocation + binding: ~0.20μs  (per argument)
  PL_next_solution:            ~20μs    (identical query, same KB)
  PL_get_chars (result):       ~0.10μs
  PL_discard_foreign_frame:    ~0.05μs
  ─────────────────────────────────────
  Total:                       ~20.5μs  end-to-end

The infrastructure cost is 0.5μs. The query dominates at 97.6% of total time. This is the correct cost distribution — work is the majority, overhead is noise.
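
The totals and ratios quoted above can be re-derived from the listed components. The sketch below copies the figures from the two breakdowns, converts them to nanoseconds, and recomputes the sums. The function and variable names are illustrative only.

```go
package main

import "fmt"

// sum adds a list of latency components expressed in nanoseconds.
func sum(parts []int64) int64 {
	var total int64
	for _, p := range parts {
		total += p
	}
	return total
}

// Figures copied from the two breakdowns above, in nanoseconds.
var (
	httpParts = []int64{20_000, 50_000, 100_000, 150_000, 200_000, 20_000, 200_000, 50_000, 100_000}
	cgoParts  = []int64{100, 50, 200, 20_000, 100, 50}
)

const queryNs = 20_000 // the Prolog query itself

// overheadRatio reports how many query-times are spent on everything else.
func overheadRatio(totalNs int64) float64 {
	return float64(totalNs-queryNs) / float64(queryNs)
}

func main() {
	httpTotal, cgoTotal := sum(httpParts), sum(cgoParts)
	fmt.Printf("HTTP total: %d ns, overhead %.1fx the query\n", httpTotal, overheadRatio(httpTotal))
	fmt.Printf("cgo total:  %d ns, query share %.1f%%\n", cgoTotal, 100*float64(queryNs)/float64(cgoTotal))
}
```

The zero-load HTTP path comes out at 890,000 ns with 43.5 query-times of overhead, which is the lower end of the 40–600× range. The in-process path comes out at 20,500 ns.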

15.1.2 Three Memory Arenas Sharing One Address Space

Process address space layout:
┌──────────────────────────────────────────────────────────┐
│  Go Runtime Heap                                         │
│  ├── goroutine stacks (movable — GC may relocate)        │
│  ├── Go heap objects (GC-managed, may be compacted)       │
│  └── cgo-pinned objects (runtime.Pinner / C.malloc)      │
├──────────────────────────────────────────────────────────┤
│  C Heap (malloc/free)                                    │
│  ├── libswipl internal structures                        │
│  ├── PL_initialise argv copies                           │
│  └── C.CString() allocations (MUST be C.free()'d)        │
├──────────────────────────────────────────────────────────┤
│  WAM Engine (libswipl.so)                                │
│  ├── Local Stack  (term_t frames — per-engine, per-thread)│
│  │   └── GROWS down; reclaimed by PL_discard_foreign_frame│
│  ├── Global Stack (compound terms, strings, large ints)  │
│  │   └── GC'd by WAM GC (triggered by stack pressure)    │
│  ├── Trail Stack  (variable binding undo log)            │
│  ├── Atom Table   (PERMANENT hash table — never GC'd)    │
│  └── Clause Database (shared across all WAM threads)     │
└──────────────────────────────────────────────────────────┘

Go heap → C boundary: Any Go pointer passed to a C function must be pinned for the duration of the C call. As of Go 1.21, runtime.Pinner provides explicit pinning. For short-lived C calls (cgo single-call duration), the runtime pins Go pointers automatically under the covers — but only for the duration of the cgo call. A Go pointer stored in a C struct that outlives the cgo call is undefined behaviour. Use C.CString() (which calls malloc) to copy Go strings to C. Call C.free() immediately after the FLI call completes.

C heap → WAM boundary: PL_put_string_chars(t, cstr) copies cstr from the C heap (or stack) into the WAM global stack. After the call, cstr can be freed. The WAM global stack now owns the string data. PL_put_atom_chars(t, cstr) copies cstr into the permanent Atom Table. After the call, cstr can be freed — but the Atom Table entry is permanent.

WAM → Go boundary: PL_get_chars(t, &buf, flags) writes a pointer to WAM-owned data into *buf. This pointer is valid only while the foreign frame is open and the relevant WAM GC has not run. Copy the data to a Go string immediately: goStr := C.GoString(buf). Do not store buf beyond the frame boundary.
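
The copy-immediately rule can be illustrated without libswipl. The sketch below is a pure-Go analogy, not FLI code: `scratch` is a hypothetical buffer that is reused by the next call, in the way engine-owned memory may be reused once its frame is gone. A copy taken while the view is valid survives, while the retained view silently changes.

```go
package main

import "fmt"

// scratch is a hypothetical stand-in for engine-owned memory that is
// reused by the next call, as a frame-scoped buffer would be.
type scratch struct{ buf []byte }

// render writes s into the shared buffer and returns a view of it.
// The view is only valid until the next render call.
func (sc *scratch) render(s string) []byte {
	sc.buf = append(sc.buf[:0], s...)
	return sc.buf
}

func main() {
	sc := &scratch{buf: make([]byte, 0, 64)}

	view := sc.render("allow")
	copied := string(view) // copy taken while the view is still valid

	sc.render("block") // the buffer is reused; view now aliases new data

	fmt.Println(copied)       // the copy is unaffected
	fmt.Println(string(view)) // the stale view shows the overwritten bytes
}
```

In the real bridge `C.GoString(buf)` plays the role of `string(view)`: it copies the bytes into Go-owned memory before the frame is discarded.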

15.1.3 Diagram: Polyglot Memory Boundary

%%{init: {"themeVariables": {"fontSize": "14px"}}}%%
flowchart TD
    GR["Go Goroutine\nruntime.LockOSThread()\nGo heap — GC-managed\nPointers may move between GC pauses"]

    CGO["cgo Trampoline\nruntime.cgocall()\nGo stack pinned for call duration\nC.CString() → malloc\nC.free() obligation created"]

    CAPI["C API Layer\n<SWI-Prolog.h>\nPL_open_foreign_frame()\nPL_new_term_refs(N)\nPL_put_string_chars()\nPL_open_query()"]

    FRAME["Foreign Frame\nfid_t frame = PL_open_foreign_frame()\nterm_t handles valid inside frame\nPL_discard_foreign_frame(frame)\nreclaims all term_t allocations"]

    LOCAL["WAM Local Stack\nterm_t[0..N] — frame-scoped indices\nActivation frames per predicate call\nExhausts without PL_discard_foreign_frame\nper loop iteration"]

    GLOBAL["WAM Global Stack\nCompound terms, strings (GC-eligible)\nPL_put_string_chars → stored here\nWAM GC reclaims between queries"]

    ATOM["Atom Table\nPERMANENT hash table\nPL_put_atom_chars → stored here\nNEVER GC'd — grows monotonically\nDoS vector for untrusted input"]

    DB["Clause Database\ntutorial_fact/3, firewall_rule/3\nAll loaded .pl rules\nShared across all WAM threads\nRead-only from FLI queries"]

    GR --->|"cgo call — stack pinned"| CGO
    CGO --->|"C function call"| CAPI
    CAPI --->|"allocates frame"| FRAME
    FRAME --->|"term_t indices"| LOCAL
    CAPI --->|"PL_put_string_chars"| GLOBAL
    CAPI --->|"PL_put_atom_chars — AVOID for untrusted input"| ATOM
    CAPI --->|"PL_next_solution — reads"| DB

    style GR fill:#1A2B4A,color:#FFFFFF
    style CGO fill:#2A4A2A,color:#FFFFFF
    style CAPI fill:#1A4070,color:#FFFFFF
    style FRAME fill:#1A4070,color:#FFFFFF
    style LOCAL fill:#8B6914,color:#FFFFFF
    style GLOBAL fill:#2A5A2A,color:#FFFFFF
    style ATOM fill:#7A1A1A,color:#FFFFFF
    style DB fill:#1A6B3A,color:#FFFFFF

Reading the diagram: The Go goroutine (dark blue) crosses into the cgo trampoline (dark green) where stack pinning is automatic. The C API layer allocates a foreign frame — all term_t operations inside it are scoped to the amber WAM local stack. String data goes to the green WAM global stack (GC-eligible). Atom data goes to the red Atom Table (permanent — the DoS surface). The clause database (bright green) is read-only from FLI queries.


15.2 The Foreign Language Interface Contract

15.2.1 term_t — An Index, Not a Pointer

// From <SWI-Prolog.h>
typedef uintptr_t term_t;     // Index into WAM local stack frame
typedef uintptr_t atom_t;     // Index into Atom Table
typedef uintptr_t fid_t;      // Foreign frame identifier
typedef uintptr_t qid_t;      // Query identifier
typedef struct { ... } *predicate_t;  // Predicate handle

term_t values are allocated in sequence from the current WAM local stack frame. PL_new_term_refs(3) allocates three consecutive term_t slots and returns the first. The other two are t+1 and t+2 — arithmetic on term_t values is valid within the same allocation call. A term_t allocated in one foreign frame is invalid in any other foreign frame. Storing term_t values in Go structs and reading them after the frame closes is reading a stale stack index — the local stack has been overwritten.

// WRONG: term_t escapes its frame
fid_t f = PL_open_foreign_frame();
term_t result = PL_new_term_refs(1);
// ... populate result ...
PL_discard_foreign_frame(f);
// result is now a stale index — reading it is undefined behaviour
char *buf;
PL_get_chars(result, &buf, CVT_ALL);  // CORRUPTION

// CORRECT: extract data before discarding frame
fid_t f = PL_open_foreign_frame();
term_t result = PL_new_term_refs(1);
// ... populate result ...
char *buf = NULL;
int ok = PL_get_chars(result, &buf, CVT_ALL | BUF_MALLOC);  // buf is a malloc'd copy on success
PL_discard_foreign_frame(f);  // frame gone, but buf is on the C heap and still valid
// Go side, after the C helper returns buf (and only if ok is non-zero):
//   goStr := C.GoString(buf)      // copy to a Go string
//   C.free(unsafe.Pointer(buf))   // release the C heap allocation
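
The index-not-pointer behaviour can be modelled in a few lines of Go. This is a toy model with hypothetical names (`localStack`, `handle`, `frameID`); it does not reflect libswipl internals. Unlike the real API, it detects stale handles so that the failure is visible.

```go
package main

import (
	"errors"
	"fmt"
)

// localStack is a toy model of a stack whose handles are plain indices.
type localStack struct{ slots []string }

type handle int  // index into slots, playing the role of term_t
type frameID int // saved stack depth, playing the role of fid_t

// openFrame records the current depth so it can be restored later.
func (s *localStack) openFrame() frameID { return frameID(len(s.slots)) }

// newRef pushes one slot and returns its index.
func (s *localStack) newRef(v string) handle {
	s.slots = append(s.slots, v)
	return handle(len(s.slots) - 1)
}

// discardFrame truncates the stack to the depth saved by openFrame.
func (s *localStack) discardFrame(f frameID) { s.slots = s.slots[:f] }

var errStale = errors.New("stale handle: frame already discarded")

// get reads a slot, reporting an error for handles past the current top.
func (s *localStack) get(h handle) (string, error) {
	if int(h) >= len(s.slots) {
		return "", errStale
	}
	return s.slots[h], nil
}

func main() {
	var st localStack
	f := st.openFrame()
	h := st.newRef("whitelist_match")
	v, _ := st.get(h) // copy out while the frame is open
	st.discardFrame(f)

	_, err := st.get(h)
	fmt.Println(v, err)
}
```

In the model, a later `newRef` would reuse the same index, so a retained handle would then read unrelated data without any error. That is the failure mode the real interface exposes.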

15.2.2 Foreign Frame Discipline: The Loop Case

A Go worker processing 10,000 queries per second calls Prolog in a tight loop. Without frames:

// WRONG: term_t accumulates without reclaim
void processLoop(int n) {
    for (int i = 0; i < n; i++) {
        term_t args = PL_new_term_refs(2);    // +2 slots per iteration
        // ... query, get result ...
        // NO PL_discard_foreign_frame — slots never reclaimed
    }
    // After 10,000 iterations: 20,000 term_t slots on local stack
    // Assuming a 128MB local stack at ~8 bytes per slot: ≈ 16M slots
    // At 2 slots/query and 10,000 queries/sec (20,000 slots/sec): exhaustion in ~800 seconds
    // Actual: much faster under load, because the stack also holds query frames
}

// CORRECT: one frame per iteration — O(1) steady-state local stack usage
void processLoop(int n) {
    for (int i = 0; i < n; i++) {
        fid_t frame = PL_open_foreign_frame();

        term_t args = PL_new_term_refs(2);
        // ... query, get result, extract data to C heap ...
        // Data is safe on C heap; term_t handles no longer needed

        PL_discard_foreign_frame(frame);
        // ALL term_t allocated since PL_open_foreign_frame are reclaimed.
        // Local stack depth: identical at start of each iteration.
    }
}
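
The leak estimate for the unframed loop is simple division, and it can be checked numerically. The sketch below assumes the figures used in this section: a 128MB local stack, about 8 bytes per term_t slot, 2 slots leaked per query, and 10,000 queries per second. `secondsToExhaust` is an illustrative helper, and the result is an upper bound because it ignores every other consumer of the stack.

```go
package main

import "fmt"

// secondsToExhaust estimates how long a leak of slotsPerQuery term
// references per query takes to fill a stack of stackBytes.
func secondsToExhaust(stackBytes, bytesPerSlot, slotsPerQuery, queriesPerSec int64) int64 {
	capacity := stackBytes / bytesPerSlot
	return capacity / (slotsPerQuery * queriesPerSec)
}

func main() {
	const stackBytes = 128 << 20 // 128MB, the figure assumed in this section
	fmt.Println(secondsToExhaust(stackBytes, 8, 2, 10_000), "seconds")
}
```

Under these assumptions the stack lasts about fourteen minutes, which is long enough to pass a short load test and fail in production.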

15.2.3 Query Lifecycle: Open → Advance → Close

Every Prolog query executed from C follows a three-phase lifecycle:

// Phase 1: Open the query
// PL_Q_CATCH_EXCEPTION: exceptions are catchable via PL_exception()
// PL_Q_NODEBUG:         suppress trace output in embedded context
qid_t qid = PL_open_query(
    NULL,                         // module: NULL = current context module
    PL_Q_CATCH_EXCEPTION | PL_Q_NODEBUG,
    predicate,                    // predicate_t from PL_predicate()
    args                          // term_t of first argument
);

// Phase 2: Advance through solutions
int rc = PL_next_solution(qid);
// rc = TRUE (1):  solution found, output args are bound
// rc = FALSE (0): no more solutions OR exception pending
// Other codes (for example PL_S_EXCEPTION) are returned only when PL_Q_EXT_STATUS is set

if (!rc) {
    term_t ex = PL_exception(qid);  // NON-ZERO if exception pending
    if (ex) {
        // Exception path: convert to Go error BEFORE PL_close_query
        // PL_close_query clears the exception — must read it first
        char *exStr;
        PL_get_chars(ex, &exStr, CVT_ALL | CVT_WRITE | BUF_MALLOC);
        // ... package exStr as Go error ...
        free(exStr);  // this is C code: plain free(), not the Go-side C.free
        PL_close_query(qid);
        return;  // error return to Go
    }
    // No exception: query genuinely failed (no solutions)
    PL_close_query(qid);
    return;  // failure return to Go
}

// Phase 3: Extract results, then close
// ... read bound output args via PL_get_* ...
PL_close_query(qid);
// Closing the query backtracks to the query's entry point,
// undoing all variable bindings made during PL_next_solution.
// Output data must be copied to C heap BEFORE PL_close_query.

The PL_close_query / PL_discard_foreign_frame ordering matters:

fid_t f    = PL_open_foreign_frame();
term_t args = PL_new_term_refs(2);
qid_t qid  = PL_open_query(NULL, PL_Q_CATCH_EXCEPTION, pred, args);
int rc     = PL_next_solution(qid);

// Extract BEFORE close: args + 1 (output) is still bound
char *result_buf = NULL;
if (rc && !PL_get_chars(args + 1, &result_buf, CVT_ALL | BUF_MALLOC)) {
    rc = 0;  // conversion failed: treat as no result
}

PL_close_query(qid);           // 1. Close query: unwinds choice points
PL_discard_foreign_frame(f);   // 2. Discard frame: reclaims term_t stack space

// NOW safe to use result_buf (C heap, independent of WAM stack).
// Go side, if rc is non-zero:
//   goResult := C.GoString(result_buf)
//   C.free(unsafe.Pointer(result_buf))

15.3 Data Marshaling: Go Structs to Prolog Dicts

15.3.1 Type Translation Table

Every Go type that crosses the FLI boundary has a corresponding Prolog type and a specific FLI function. The column "Atom Table impact" determines whether the value is permanent.

| Go type | Prolog type | FLI function | Atom Table impact |
|---|---|---|---|
| string (user input) | Prolog string | PL_put_string_chars(t, cstr) | None; GC-eligible |
| string (closed vocabulary) | Prolog atom | PL_put_atom_chars(t, cstr) | Permanent; safe ONLY for fixed vocabulary |
| int64 | Prolog integer | PL_put_int64(t, n) | None |
| float64 | Prolog float | PL_put_float(t, f) | None |
| bool | Prolog atom true/false | PL_put_atom(t, ATOM_true/false) | Pre-interned; no growth |
| []byte | Prolog string | PL_put_string_nchars(t, len, bytes) | None |
| map[string]interface{} | Prolog Dict | PL_put_dict(t, tag, n, keys, vals) | Keys are atoms; see 15.3.2 |
| nil | Prolog empty list [] | PL_put_nil(t) | Pre-interned |
| []interface{} | Prolog list | PL_put_list(t) + iteration | None for values |
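
The table translates naturally into a Go type switch that selects the FLI function for each value. The following sketch only returns the function name as a string; it performs no FLI calls. `fliCall` and its `trusted` parameter are hypothetical, with `trusted` marking strings that come from a fixed vocabulary in the code and not from request data.

```go
package main

import "fmt"

// fliCall names the FLI function the table assigns to a Go value.
// trusted marks strings drawn from a fixed vocabulary in the code.
func fliCall(v any, trusted bool) string {
	switch v.(type) {
	case nil:
		return "PL_put_nil"
	case string:
		if trusted {
			return "PL_put_atom_chars"
		}
		return "PL_put_string_chars"
	case int64:
		return "PL_put_int64"
	case float64:
		return "PL_put_float"
	case bool:
		return "PL_put_atom"
	case []byte:
		return "PL_put_string_nchars"
	case map[string]any:
		return "PL_put_dict"
	case []any:
		return "PL_put_list"
	default:
		return "unsupported"
	}
}

func main() {
	fmt.Println(fliCall("10.0.0.1", false)) // request data crosses as a string
	fmt.Println(fliCall("tcp", true))       // fixed vocabulary may be an atom
	fmt.Println(fliCall(int64(443), false))
	fmt.Println(fliCall(nil, false))
}
```

Making the untrusted branch the default means a caller has to opt in explicitly before any string is interned as an atom.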

15.3.2 Marshaling a Go Map to a Prolog Dict

PL_put_dict constructs a Prolog Dict from a tag atom, a count, an array of key atoms, and an array of value term_t references:

int PL_put_dict(term_t t,
                atom_t  tag,      // dict type tag atom (e.g., 'json')
                size_t  len,      // number of key-value pairs
                atom_t *keys,     // array of key atoms (MUST be sorted by atom value)
                term_t  values);  // first of len consecutive term_t values

The key atom constraint: Dict keys are atoms. This is non-negotiable — it is how the Dict type is defined in SWI-Prolog. However, the keys in a Go map being marshaled into a Prolog Dict originate from code, not from user input. A JSON schema field name like "source_ip", "port", or "protocol" is a compile-time constant. These keys are safe to intern as atoms because: (a) they come from a closed vocabulary defined in the Go code, not from request bodies; (b) they are interned once and reused across all calls. The atoms for a fixed set of known keys are pre-interned at engine initialisation time and stored as atom_t constants. No key atom is ever derived from user-provided data.

The value string constraint: Dict values that originate from user input (source IP strings from request bodies, usernames from log events, domain names from NGINX configs) must use PL_put_string_chars — never PL_put_atom_chars. See Section 15.5 for the exhaustion analysis.
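
One way to enforce the closed-vocabulary rule on the Go side is to validate map keys against the schema before any marshaling starts. The sketch below assumes the three-key firewall schema used in this chapter. `schemaKeys` and `validateKeys` are illustrative names, not part of the bridge.

```go
package main

import (
	"fmt"
	"sort"
)

// schemaKeys is the closed vocabulary of Dict keys, fixed at compile time.
var schemaKeys = map[string]bool{"source_ip": true, "dest_port": true, "protocol": true}

// validateKeys rejects any map whose keys fall outside the schema, so that
// no request-derived key can ever reach an atom-interning call.
func validateKeys(m map[string]any) error {
	var unknown []string
	for k := range m {
		if !schemaKeys[k] {
			unknown = append(unknown, k)
		}
	}
	if len(unknown) > 0 {
		sort.Strings(unknown) // deterministic error text
		return fmt.Errorf("keys outside schema: %v", unknown)
	}
	return nil
}

func main() {
	fmt.Println(validateKeys(map[string]any{"source_ip": "10.0.0.1", "dest_port": int64(443)}))
	fmt.Println(validateKeys(map[string]any{"source_ip": "10.0.0.1", "x-evil": "1"}))
}
```

Rejecting unknown keys outright, and not silently dropping them, keeps a malformed or hostile request from being evaluated against a partial Dict.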

15.3.3 The Marshaling C Helper

// marshal.c — compiled as part of the cgo bridge
// Marshals a fixed-schema Go firewall request into a Prolog Dict

#include <SWI-Prolog.h>
#include <string.h>
#include <stdlib.h>

// Pre-interned key atoms — initialised once at startup by init_atoms()
static atom_t ATOM_source_ip;
static atom_t ATOM_dest_port;
static atom_t ATOM_protocol;
static atom_t ATOM_tag_request;

// init_atoms: call once after PL_initialise, before any query.
// Interns the fixed-vocabulary key atoms permanently.
// These are compile-time-known schema fields — not user input.
void init_atoms(void) {
    ATOM_source_ip   = PL_new_atom("source_ip");   // permanent: schema key
    ATOM_dest_port   = PL_new_atom("dest_port");   // permanent: schema key
    ATOM_protocol    = PL_new_atom("protocol");    // permanent: schema key
    ATOM_tag_request = PL_new_atom("request");     // permanent: dict tag
}

// make_firewall_dict: marshals Go-provided values into a Prolog Dict term.
// source_ip_str: C string from Go — marshaled as PROLOG STRING (not atom)
// dest_port:     C long from Go — marshaled as Prolog integer
// protocol_str:  C string from Go — marshaled as PROLOG STRING (not atom)
//
// Returns 1 on success, 0 on failure.
// Caller owns the foreign frame containing t — must discard after use.
int make_firewall_dict(term_t t,
                       const char *source_ip_str,
                       long        dest_port,
                       const char *protocol_str)
{
    // Allocate 3 consecutive term_t slots for the values
    term_t values = PL_new_term_refs(3);
    if (!values) return 0;

    // Value 0: source_ip — STRING (user-controlled input, NOT atom)
    if (!PL_put_string_chars(values + 0, source_ip_str)) return 0;

    // Value 1: dest_port — integer (numeric, safe)
    if (!PL_put_int64(values + 1, (int64_t)dest_port)) return 0;

    // Value 2: protocol — STRING (user-controlled input, NOT atom)
    if (!PL_put_string_chars(values + 2, protocol_str)) return 0;

    // Keys array: pre-interned atoms, with values reordered to match.
    // This chapter treats PL_put_dict as expecting keys ordered by atom value
    // (the internal atom handle). Alphabetical order, used here for readability,
    // is not guaranteed to match handle order; if the order matters for your
    // libswipl build, sort keys and values together once in init_atoms().
    atom_t keys[3] = { ATOM_dest_port, ATOM_protocol, ATOM_source_ip };
    term_t sorted_vals = PL_new_term_refs(3);
    if (!sorted_vals) return 0;
    // Reorder values to match the key order above:
    //   dest_port  → values+1
    //   protocol   → values+2
    //   source_ip  → values+0
    if (!PL_put_term(sorted_vals + 0, values + 1)) return 0;  // dest_port
    if (!PL_put_term(sorted_vals + 1, values + 2)) return 0;  // protocol
    if (!PL_put_term(sorted_vals + 2, values + 0)) return 0;  // source_ip

    return PL_put_dict(t, ATOM_tag_request, 3, keys, sorted_vals);
}

15.4 The Build: Embedded Logic Microservice

15.4.1 Project Structure

/opt/logic-node/go/firewall-bridge/
├── main.go
├── cgo_bridge.go       ← primary bridge implementation
├── marshal.c           ← C helpers for PL_put_dict marshaling
├── marshal.h
└── go.mod

logicadmin@logic-node-01:~$ cat /opt/logic-node/go/firewall-bridge/go.mod
module firewall-bridge

go 1.22

// No external dependencies: libswipl is a system library linked via cgo LDFLAGS

15.4.2 cgo_bridge.go

// File: /opt/logic-node/go/firewall-bridge/cgo_bridge.go
//
// CGO bridge between Go and libswipl.
//
// MEMORY CONTRACT:
//   — All FLI calls execute inside runtime.LockOSThread() goroutines.
//   — Every query is bracketed by PL_open_foreign_frame / PL_discard_foreign_frame.
//   — User-controlled strings cross via PL_put_string_chars (NOT PL_put_atom_chars).
//   — C.CString() allocations are freed in the same function that created them.
//   — term_t values never escape the foreign frame that contains them.
//   — Prolog exceptions are read via PL_exception() BEFORE PL_close_query().

package main

/*
// CGO COMPILER HARDENING FLAGS
// These flags apply to every C translation unit compiled by cgo in this package,
// including marshal.c. They harden the C/WAM memory boundary against
// buffer-overflow exploits and ROP chain construction from the C layer.
//
// -D_FORTIFY_SOURCE=2
//   Enables glibc compile-time and runtime buffer overflow detection for
//   unsafe C string functions (strcpy, memcpy, sprintf, etc.). At level 2,
//   bounds are checked at runtime — a detected overflow aborts the process
//   with SIGABRT rather than allowing silent heap/stack corruption.
//   Cost: negligible (~0.1% overhead on string-heavy paths).
//
// -fstack-protector-strong
//   Inserts a stack canary word between local variables and the return address
//   in every C function that contains arrays, address-taken locals, or pointer
//   arithmetic. If a buffer overflow overwrites the canary before a return,
//   the function aborts rather than returning to the attacker-controlled address.
//   "strong" covers significantly more functions than the default "-fstack-protector".
//   Cost: ~1% overhead — dominated by the canary check on function return.
//
// -Wl,-z,relro
//   "Read-Only Relocations": after dynamic linking completes, the dynamic
//   linker marks the GOT (Global Offset Table) sections that are no longer
//   needed as read-only. GOT overwrite attacks (a common ROP gadget) targeting
//   libswipl.so or the bridge binary are blocked by a SIGSEGV.
//
// -Wl,-z,now
//   Forces FULL RELRO: all PLT (Procedure Linkage Table) entries are resolved
//   at load time rather than lazily on first call. Combined with -z,relro,
//   the entire GOT is marked read-only immediately after startup. This eliminates
//   the lazy-binding window during which a GOT entry could be overwritten before
//   its first resolution. Cost: slightly longer startup time (~1-2ms for
//   libswipl.so's ~3,000 exported symbols).
//
// These four flags together implement the Linux hardening baseline recommended
// by Debian Security, Fedora Packaging Guidelines, and the CIS Benchmark for
// compiled C/CGO binaries. They do not affect Prolog-level logic or WAM behaviour.
#cgo CFLAGS:  -I/usr/lib/swi-prolog/include -D_FORTIFY_SOURCE=2 -fstack-protector-strong
#cgo LDFLAGS: -lswipl -L/usr/lib/swi-prolog/lib/x86_64-linux -Wl,-z,relro,-z,now

#include <SWI-Prolog.h>
#include <stdlib.h>   // for C.free, C.malloc
#include <string.h>   // for C.strlen
#include "marshal.h"  // init_atoms(), make_firewall_dict()

// pl_exception_to_string: converts a pending exception term to a C string.
// Returns malloc'd string — caller must free.
// Returns NULL if no exception is pending.
static char* pl_exception_to_string(qid_t qid) {
    term_t ex = PL_exception(qid);
    if (!ex) return NULL;
    char *buf = NULL;
    if (PL_get_chars(ex, &buf, CVT_ALL | CVT_WRITE | BUF_MALLOC)) {
        return buf;   // malloc'd by PL_get_chars — caller frees
    }
    return NULL;
}

// pl_query_string_result: extracts a string from the Nth term_t slot.
// Returns malloc'd string — caller must free.
// Returns NULL on failure.
static char* pl_query_string_result(term_t base, int offset) {
    char *buf = NULL;
    if (PL_get_chars(base + offset, &buf, CVT_ALL | BUF_MALLOC)) {
        return buf;
    }
    return NULL;
}

// pl_atom_eq: returns 1 if term_t is the atom with the given name.
static int pl_atom_eq(term_t t, const char *name) {
    atom_t a;
    return PL_get_atom(t, &a) && a == PL_new_atom(name);
    // Note: PL_new_atom here is called with a compile-time constant string —
    // it returns the pre-existing atom if already interned, with no new
    // Atom Table entry if the atom already exists.
}
*/
import "C"

import (
	"errors"
	"fmt"
	"runtime"
	"sync"
	"unsafe"
)

// ─────────────────────────────────────────────────────────────────────────────
// ENGINE LIFECYCLE
// ─────────────────────────────────────────────────────────────────────────────

var (
	engineOnce    sync.Once
	engineInitErr error
)

// InitEngine initialises the embedded SWI-Prolog engine.
// Must be called once before any query. Safe to call from multiple goroutines —
// only the first call does work; subsequent calls return the cached result.
//
// knowledgeBasePath: absolute path to the .pl file to load at startup.
// e.g., "/opt/logic-node/kb/firewall_policy.pl"
func InitEngine(knowledgeBasePath string) error {
	engineOnce.Do(func() {
		engineInitErr = initEngineOnce(knowledgeBasePath)
	})
	return engineInitErr
}

func initEngineOnce(kbPath string) error {
	// PL_initialise must be called from the main OS thread, or at minimum
	// from a goroutine locked to an OS thread.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	// Argv for PL_initialise mimics the swipl command line.
	// "--quiet": suppress banner output on stderr.
	// "-g true": no startup goal.
	// "-t halt": halt goal (not triggered in embedded mode).
	argv := []*C.char{
		C.CString("swipl"),
		C.CString("--quiet"),
		C.CString("-g"),
		C.CString("true"),
		C.CString("-t"),
		C.CString("halt"),
	}
	// Free all C strings after PL_initialise returns.
	defer func() {
		for _, s := range argv {
			C.free(unsafe.Pointer(s))
		}
	}()

	argc := C.int(len(argv))
	if C.PL_initialise(argc, (**C.char)(unsafe.Pointer(&argv[0]))) == 0 {
		return errors.New("PL_initialise failed: check libswipl.so is in LD_LIBRARY_PATH")
	}

	// Initialise pre-interned atoms for fixed-schema Dict keys.
	// This is the ONLY place where PL_new_atom is called with schema-defined
	// strings. Never call PL_new_atom (directly or via PL_put_atom_chars)
	// with user-controlled data.
	C.init_atoms()

	// Load the knowledge base.
	if err := consultFile(kbPath); err != nil {
		return fmt.Errorf("failed to load KB %q: %w", kbPath, err)
	}

	return nil
}

// consultFile loads a Prolog source file into the engine.
// Executes: :- consult('kbPath').
func consultFile(path string) error {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	// Attach engine to this OS thread if not already attached.
	// PL_thread_attach_engine is idempotent for the main thread after
	// PL_initialise, but required for any new OS thread.
	C.PL_thread_attach_engine(nil)

	frame := C.PL_open_foreign_frame()
	defer C.PL_discard_foreign_frame(frame)

	// Build the query: consult('/path/to/kb.pl')
	args := C.PL_new_term_refs(1)
	cpath := C.CString(path)
	defer C.free(unsafe.Pointer(cpath))

	// Path is a compile-time-provided file path, not user input.
	// PL_put_atom_chars is acceptable here — file paths are controlled by
	// the operator, not by request bodies.
	if C.PL_put_atom_chars(args, cpath) == 0 {
		return fmt.Errorf("PL_put_atom_chars failed for path %q", path)
	}

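	// Free the name strings on return, as the memory contract above requires.
	cname := C.CString("consult")
	defer C.free(unsafe.Pointer(cname))
	cmodule := C.CString("system")
	defer C.free(unsafe.Pointer(cmodule))
	pred := C.PL_predicate(cname, 1, cmodule)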
	qid := C.PL_open_query(nil, C.PL_Q_CATCH_EXCEPTION|C.PL_Q_NODEBUG, pred, args)
	rc := C.PL_next_solution(qid)

	if rc == 0 {
		exStr := C.pl_exception_to_string(qid)
		C.PL_close_query(qid)
		if exStr != nil {
			msg := C.GoString(exStr)
			C.free(unsafe.Pointer(exStr))
			return fmt.Errorf("consult exception: %s", msg)
		}
		return fmt.Errorf("consult failed for %q (no solutions)", path)
	}
	C.PL_close_query(qid)
	return nil
}

// ─────────────────────────────────────────────────────────────────────────────
// FIREWALL QUERY
// ─────────────────────────────────────────────────────────────────────────────

// FirewallVerdict is the structured result of a policy check.
type FirewallVerdict struct {
	Allowed  bool
	Reason   string  // Bound by the Prolog rule — e.g., "whitelist_match"
	RuleID   int64   // Rule identifier from the KB
}

// FirewallRequest is the input to QueryFirewall.
// All string fields are user-controlled — they cross as Prolog strings, not atoms.
type FirewallRequest struct {
	SourceIP string
	DestPort int64
	Protocol string // "tcp" | "udp" | "icmp"
}

// QueryFirewall executes the Prolog predicate:
//
//	firewall_verdict(+RequestDict, -Verdict, -Reason, -RuleID)
//
// Loaded from /opt/logic-node/kb/firewall_policy.pl.
// Returns FirewallVerdict and nil error on success.
// Returns a descriptive error if the query fails or throws.
//
// GOROUTINE CONTRACT:
//   This function locks the goroutine to its OS thread for the duration
//   of the FLI session, so the goroutine cannot migrate mid-call.
//   Callers must not pass FirewallRequest fields from untrusted network input
//   directly to any other FLI call path; this function is the sole
//   sanitised entry point.
func QueryFirewall(req FirewallRequest) (FirewallVerdict, error) {
	// Pin goroutine to OS thread — required for all FLI calls.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	// Attach engine to this OS thread.
	// Safe to call multiple times from the same OS thread.
	if C.PL_thread_attach_engine(nil) < 0 {
		return FirewallVerdict{}, errors.New("PL_thread_attach_engine failed")
	}

	// Open foreign frame — ALL term_t allocations inside are reclaimed at
	// PL_discard_foreign_frame. Local stack depth is O(1) across calls.
	frame := C.PL_open_foreign_frame()
	defer C.PL_discard_foreign_frame(frame)
	// INVARIANT: after defer fires, no term_t from this function is valid.
	// All output data is copied to Go strings before defer fires.

	// Allocate 4 argument slots:
	//   args+0: input RequestDict  (bound by marshal)
	//   args+1: output Verdict     (bound by Prolog — atom: allowed/denied)
	//   args+2: output Reason      (bound by Prolog — string)
	//   args+3: output RuleID      (bound by Prolog — integer)
	args := C.PL_new_term_refs(4)
	if args == 0 {
		return FirewallVerdict{}, errors.New("PL_new_term_refs failed")
	}

	// Marshal the FirewallRequest into a Prolog Dict at args+0.
	// Strings cross as PL_put_string_chars — NOT atoms.
	cSourceIP := C.CString(req.SourceIP)
	cProtocol := C.CString(req.Protocol)
	defer C.free(unsafe.Pointer(cSourceIP))
	defer C.free(unsafe.Pointer(cProtocol))

	if C.make_firewall_dict(args, cSourceIP, C.long(req.DestPort), cProtocol) == 0 {
		return FirewallVerdict{}, errors.New("make_firewall_dict failed: term allocation error")
	}

	// Look up the firewall_verdict/4 predicate.
	// predicate_t is a stable handle — safe to cache as a package-level var
	// after InitEngine. For clarity, we look it up per-call here.
	predName := C.CString("firewall_verdict")
	predMod  := C.CString("user")
	defer C.free(unsafe.Pointer(predName))
	defer C.free(unsafe.Pointer(predMod))

	pred := C.PL_predicate(predName, 4, predMod)
	if pred == nil {
		return FirewallVerdict{},
			errors.New("firewall_verdict/4 not found — KB not loaded correctly")
	}

	// Open the query.
	qid := C.PL_open_query(nil, C.PL_Q_CATCH_EXCEPTION|C.PL_Q_NODEBUG, pred, args)

	// Advance to first solution.
	rc := C.PL_next_solution(qid)

	if rc == 0 {
		// Zero return: either failure or exception.
		// Read exception BEFORE PL_close_query — close clears the exception.
		exStr := C.pl_exception_to_string(qid)
		C.PL_close_query(qid)

		if exStr != nil {
			msg := C.GoString(exStr)
			C.free(unsafe.Pointer(exStr))
			return FirewallVerdict{},
				fmt.Errorf("firewall_verdict/4 exception: %s", msg)
		}
		// Clean failure: no solutions — default deny
		return FirewallVerdict{
			Allowed: false,
			Reason:  "no_matching_rule",
			RuleID:  -1,
		}, nil
	}

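	// Solution found: extract output arguments BEFORE PL_close_query.
	// args+1: Verdict atom (allowed or denied).
	// The comparison string is malloc'd by C.CString and must be freed.
	cAllowed := C.CString("allowed")
	verdictIsAllowed := C.pl_atom_eq(args+1, cAllowed) != 0
	C.free(unsafe.Pointer(cAllowed))

	// args+2: Reason, a Prolog string copied to the C heap.
	reasonCStr := C.pl_query_string_result(args+2, 0)
	// args+3: RuleID, an integer. Fall back to -1 if it is not bound to one.
	var ruleID C.int64_t
	if C.PL_get_int64(args+3, &ruleID) == 0 {
		ruleID = -1
	}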

	// Close the query — unwinds choice points, term bindings gone after this.
	C.PL_close_query(qid)

	// Now safe to build Go result — term_t are gone but C heap data survives.
	var reason string
	if reasonCStr != nil {
		reason = C.GoString(reasonCStr)
		C.free(unsafe.Pointer(reasonCStr))
	} else {
		reason = "unspecified"
	}

	// Foreign frame is discarded by defer — all remaining term_t are now invalid.
	// The Go FirewallVerdict carries only Go-native values.
	return FirewallVerdict{
		Allowed: verdictIsAllowed,
		Reason:  reason,
		RuleID:  int64(ruleID),
	}, nil
}
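
The goroutine contract above means every caller pays for a thread lock and an engine attach. One alternative, sketched here under assumptions, is to funnel all queries through a single worker goroutine that stays pinned to one OS thread for its whole life. The names engineWorker, queryJob, ask, and stubQuery are hypothetical and do not appear in cgo_bridge.go; stubQuery stands in for the cgo-backed query body so that the sketch runs without libswipl.

```go
package main

import (
	"fmt"
	"runtime"
)

// Verdict mirrors part of FirewallVerdict for this sketch only.
type Verdict struct {
	Allowed bool
	Reason  string
}

// queryJob carries one request and a channel for its reply (hypothetical type).
type queryJob struct {
	sourceIP string
	reply    chan Verdict
}

// engineWorker pins itself to one OS thread and serves jobs sequentially.
// queryFn is a stand-in for the cgo-backed query; here it is a pure-Go stub.
func engineWorker(jobs <-chan queryJob, queryFn func(string) Verdict) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	for job := range jobs {
		job.reply <- queryFn(job.sourceIP)
	}
}

// ask submits a request to the worker and waits for the verdict.
func ask(jobs chan<- queryJob, ip string) Verdict {
	reply := make(chan Verdict, 1)
	jobs <- queryJob{sourceIP: ip, reply: reply}
	return <-reply
}

// stubQuery is a placeholder policy: allow 10.x addresses, deny the rest.
func stubQuery(ip string) Verdict {
	if len(ip) >= 3 && ip[:3] == "10." {
		return Verdict{Allowed: true, Reason: "whitelist_match"}
	}
	return Verdict{Allowed: false, Reason: "default_deny"}
}

func main() {
	jobs := make(chan queryJob)
	go engineWorker(jobs, stubQuery)
	fmt.Println(ask(jobs, "10.0.1.5"))
	fmt.Println(ask(jobs, "8.8.8.8"))
	close(jobs)
}
```

The trade-off is that a single worker serialises all queries onto one engine. Whether that is acceptable depends on the per-query cost and the required throughput.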

15.4.3 The Prolog Side: firewall_policy.pl

%% FILE: /opt/logic-node/kb/firewall_policy.pl
%% Queried by QueryFirewall via the CGO bridge.
%%
%% firewall_verdict(+Request, -Verdict, -Reason, -RuleID)
%% Request: a Dict with keys source_ip (string), dest_port (integer),
%%          protocol (string), produced by make_firewall_dict() in C.
%%
%% Note on Dict field access from CGO:
%%   Request.source_ip returns a Prolog STRING (from PL_put_string_chars).
%%   parse_ipv4/2 accepts strings — no atom_string/2 conversion required.

:- use_module('/opt/logic-node/kb/parsers/network_parser', [parse_ipv4/2, ip_in_cidr/3]).

firewall_verdict(Request, allowed, "whitelist_match", RuleID) :-
    string(Request.source_ip),
    parse_ipv4(Request.source_ip, IPInt),
    whitelist_rule(RuleID, CIDRStr, _Proto),
    ip_in_cidr(IPInt, CIDRStr, true),
    !.

firewall_verdict(Request, denied, "blocklist_match", RuleID) :-
    string(Request.source_ip),
    parse_ipv4(Request.source_ip, IPInt),
    blocklist_rule(RuleID, CIDRStr, _Reason),
    ip_in_cidr(IPInt, CIDRStr, true),
    !.

firewall_verdict(_Request, denied, "default_deny", -1) :- !.

%% Example rules — loaded from live_state.pl or assertz'd by the Logic Node
whitelist_rule(1, "10.0.0.0/8",     tcp).
whitelist_rule(2, "172.16.0.0/12",  any).
whitelist_rule(3, "192.168.0.0/16", any).

blocklist_rule(100, "203.0.113.0/24", "known_attacker_range").
blocklist_rule(101, "198.51.100.0/24","reserved_test_range").

15.4.3.1 Production Scale: IP Tries and O(W) Lookup

The firewall_verdict/4 rules above execute ip_in_cidr/3 against a linear list of whitelist_rule/3 and blocklist_rule/3 facts. For three whitelist entries and two blocklist entries, the linear scan costs five unifications per query — negligible. For a production threat-intelligence feed with 100,000+ CIDR blocks (common in BGP-derived blocklists), the scan costs 100,000 unifications per query. At 10,000 queries/second, that is 10⁹ unifications/second from the CIDR scan alone. The 20μs CGO advantage evaporates: each query now takes 50–200ms, worse than the HTTP alternative.

The correct data structure for CIDR membership over large rule sets is a radix trie (also called a Patricia trie or IP trie), which encodes the CIDR hierarchy as a binary tree keyed on IP bit prefixes. A membership query traverses at most W levels of the trie, where W = 32 for IPv4 and W = 128 for IPv6. Lookup cost is therefore O(W): a constant bounded by the address width (at most 32 steps for IPv4), regardless of rule count.
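
To make the bit-prefix idea concrete, here is a minimal binary trie over IPv4 addresses, written in Go for illustration only. It is not the structure used by the Prolog code in this chapter, and the type node, the helper ipv4, and the 10.1.0.0/16 rule with ID 7 are assumptions made for the sketch.

```go
package main

import "fmt"

// node is one level of a binary trie keyed on IPv4 bits, most significant first.
type node struct {
	child   [2]*node
	ruleID  int  // meaningful only when hasRule is true
	hasRule bool
}

// ipv4 packs four octets into a 32-bit integer.
func ipv4(a, b, c, d uint32) uint32 { return a<<24 | b<<16 | c<<8 | d }

// insert stores ruleID under the first prefixLen bits of network.
func (root *node) insert(network uint32, prefixLen, ruleID int) {
	n := root
	for i := 0; i < prefixLen; i++ {
		bit := (network >> (31 - i)) & 1
		if n.child[bit] == nil {
			n.child[bit] = &node{}
		}
		n = n.child[bit]
	}
	n.ruleID, n.hasRule = ruleID, true
}

// lookup descends at most 32 levels and remembers the deepest rule seen,
// so the most specific matching prefix wins.
func (root *node) lookup(ip uint32) (ruleID int, ok bool) {
	n := root
	for i := 0; i < 32 && n != nil; i++ {
		if n.hasRule {
			ruleID, ok = n.ruleID, true
		}
		n = n.child[(ip>>(31-i))&1]
	}
	if n != nil && n.hasRule { // a /32 entry sits at depth 32
		ruleID, ok = n.ruleID, true
	}
	return ruleID, ok
}

func main() {
	root := &node{}
	root.insert(ipv4(10, 0, 0, 0), 8, 1)
	root.insert(ipv4(10, 1, 0, 0), 16, 7) // hypothetical, more specific rule
	root.insert(ipv4(203, 0, 113, 0), 24, 100)

	for _, ip := range []uint32{ipv4(10, 1, 2, 3), ipv4(10, 9, 9, 9), ipv4(8, 8, 8, 8)} {
		id, ok := root.lookup(ip)
		fmt.Println(id, ok)
	}
}
```

The number of steps in lookup depends only on the address width, never on how many rules were inserted.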

SWI-Prolog does not ship library(trie_algo) as a general IP trie, but library(trie) provides a native C-backed trie over arbitrary terms. The production approach compiles the CIDR rules into a trie at load time:

%% FILE: /opt/logic-node/kb/firewall_trie.pl
%% Production firewall: compiles CIDR blocks into a radix trie at startup.
%% Lookup: at most 32 keyed trie probes (O(W), W = 32) regardless of rule count.
%%
%% Requires SWI-Prolog 8.4+ for library(trie).

:- use_module(library(trie)).
:- use_module('/opt/logic-node/kb/parsers/network_parser', [parse_ipv4/2, cidr_to_range/3]).

%% compile_firewall_trie/2: builds a trie from a list of CIDR rules.
%% Each CIDR is decomposed into its [network_int, prefix_len] key.
%% Value stored: verdict(Verdict, Reason, RuleID).
%%
%% Called once at startup — O(N) over rule count, amortised over all queries.

compile_firewall_trie(Rules, Trie) :-
    trie_new(Trie),
    forall(
        member(rule(RuleID, CIDRStr, Verdict, Reason), Rules),
        (
            cidr_to_range(CIDRStr, NetInt, PrefixLen),
            trie_insert(Trie, trie_key(NetInt, PrefixLen),
                        verdict(Verdict, Reason, RuleID))
        )
    ).

%% trie_lookup_ip/3: O(W) CIDR membership test (W = 32 for IPv4).
%% Probes one candidate prefix length per step, from /32 down to /1.
%% Succeeds with the first matching CIDR (most-specific wins by prefix length).

trie_lookup_ip(Trie, IPInt, verdict(Verdict, Reason, RuleID)) :-
    trie_lookup_prefix(Trie, IPInt, 32, Verdict, Reason, RuleID).

trie_lookup_prefix(_Trie, _IPInt, 0, denied, "default_deny", -1) :- !.
trie_lookup_prefix(Trie, IPInt, Bits, Verdict, Reason, RuleID) :-
    MaskInt is ((1 << Bits) - 1) << (32 - Bits),
    NetInt  is IPInt /\ MaskInt,
    ( trie_lookup(Trie, trie_key(NetInt, Bits), verdict(Verdict, Reason, RuleID)) ->
        true   % Match found at this prefix length
    ;
        Bits1 is Bits - 1,
        trie_lookup_prefix(Trie, IPInt, Bits1, Verdict, Reason, RuleID)
    ).

%% firewall_verdict_trie/4: trie-backed replacement for firewall_verdict/4.
%% Identical interface: the CGO bridge changes only the predicate name it looks up.
%% Requires init_firewall_trie/0 called at startup.

:- dynamic firewall_trie_handle/1.

init_firewall_trie :-
    aggregate_all(bag, rule(RID, CIDR, V, R),
        ( whitelist_rule(RID, CIDR, _), V = allowed, R = "whitelist_match"
        ; blocklist_rule(RID, CIDR, R), V = denied
        ),
        Rules),
    compile_firewall_trie(Rules, Trie),
    retractall(firewall_trie_handle(_)),
    assertz(firewall_trie_handle(Trie)),
    length(Rules, N),
    format("[OK] Firewall trie compiled: ~w CIDR rules~n", [N]).

firewall_verdict_trie(Request, Verdict, Reason, RuleID) :-
    string(Request.source_ip),
    parse_ipv4(Request.source_ip, IPInt),
    firewall_trie_handle(Trie),
    trie_lookup_ip(Trie, IPInt, verdict(Verdict, Reason, RuleID)).

%% Startup hook — call during KB initialisation:
:- init_firewall_trie.

The cidr_to_range/3 predicate decomposes "203.0.113.0/24" into network integer 3405803776 (203·2²⁴ + 0·2¹⁶ + 113·2⁸ + 0) and prefix length 24. The trie stores this at key trie_key(3405803776, 24). At query time, trie_lookup_prefix/6 tries the full 32-bit match first (most specific), then steps down one prefix length at a time until it reaches /24, finds the entry, and returns. Each query performs at most 32 trie probes. With 100,000 rules and 10,000 queries/second, the trie variant executes at most 320,000 probes per second in place of 10⁹ unifications, which preserves the 20μs CGO budget.
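
The arithmetic in this paragraph can be checked with a short Go sketch. The functions cidrToRange, networkOf, and probes are hypothetical Go counterparts written for this illustration; they are not part of network_parser.pl or the bridge.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

// cidrToRange returns the masked network address as a 32-bit integer and the
// prefix length. Hypothetical Go counterpart to cidr_to_range/3.
func cidrToRange(cidr string) (uint32, int, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return 0, 0, err
	}
	if !p.Addr().Is4() {
		return 0, 0, fmt.Errorf("not an IPv4 prefix: %s", cidr)
	}
	b := p.Masked().Addr().As4()
	return binary.BigEndian.Uint32(b[:]), p.Bits(), nil
}

// networkOf keeps the top `bits` bits of ip and zeroes the rest, the same
// masking step that trie_lookup_prefix/6 performs with MaskInt.
func networkOf(ip uint32, bits int) uint32 {
	return ip & (^uint32(0) << (32 - bits))
}

// probes counts the prefix lengths tried from /32 down to prefixLen inclusive.
func probes(prefixLen int) int { return 32 - prefixLen + 1 }

func main() {
	network, bits, err := cidrToRange("203.0.113.0/24")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	ip := uint32(203)<<24 | 113<<8 | 8 // 203.0.113.8
	fmt.Println("network integer:", network)
	fmt.Println("prefix length:", bits)
	fmt.Println("203.0.113.8 falls in the block:", networkOf(ip, bits) == network)
	fmt.Println("probes before the /24 entry is found:", probes(bits))
}
```

For the /24 example the descent from /32 takes nine probes; the worst case, a miss, takes 32.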

Switching the CGO bridge from firewall_verdict/4 to firewall_verdict_trie/4 requires one change at the Go layer: the predicate name passed to PL_predicate becomes "firewall_verdict_trie" (arity 4, module "user"). The FLI contract, foreign frame discipline, and atom safety rules are identical.

15.4.4 REPL and Build Verification

# Build — CGO must locate libswipl.so
# The hardening flags (-D_FORTIFY_SOURCE=2, -fstack-protector-strong,
# -Wl,-z,relro,-z,now) are declared in cgo_bridge.go's #cgo preamble and
# apply automatically. The environment variables below are runtime requirements,
# not compile-time flags — they locate libswipl.so for the dynamic linker.
logicadmin@logic-node-01:~$ cd /opt/logic-node/go/firewall-bridge
logicadmin@logic-node-01:~$ export LD_LIBRARY_PATH=/usr/lib/swi-prolog/lib/x86_64-linux:$LD_LIBRARY_PATH
logicadmin@logic-node-01:~$ export SWI_HOME_DIR=/usr/lib/swi-prolog
logicadmin@logic-node-01:~$ CGO_ENABLED=1 go build -v -o firewall-bridge .
firewall-bridge

# Verify hardening flags were applied to the compiled binary
logicadmin@logic-node-01:~$ checksec --file=./firewall-bridge
RELRO           STACK CANARY  NX        PIE       RPATH     RUNPATH   FILE
Full RELRO      Canary found  NX enabled PIE enabled No RPATH No RUNPATH  firewall-bridge
# ✓ Full RELRO:   -Wl,-z,relro,-z,now applied — GOT is read-only post-startup
# ✓ Canary found: -fstack-protector-strong applied — stack overflows abort cleanly
# ✓ NX enabled:   Go binary default — non-executable stack/data segments
# ✓ PIE enabled:  Position-Independent Executable — ASLR randomises load address

# Verify _FORTIFY_SOURCE=2 is active (glibc runtime checks enabled)
logicadmin@logic-node-01:~$ objdump -d ./firewall-bridge | grep -c "__chk_fail"
47   # ← 47 fortify check call sites compiled in; runtime overflows abort, not corrupt

# Verify stack canary in marshal.c specifically
logicadmin@logic-node-01:~$ objdump -d ./firewall-bridge | grep -A2 "make_firewall_dict"
<make_firewall_dict>:
  ...
  mov    %fs:0x28,%rax    # ← canary loaded from TLS at function entry
  ...
  xor    %fs:0x28,%rax    # ← canary checked before return
  je     <ok>
  callq  <__stack_chk_fail>  # ← abort on canary mismatch

# Run the bridge with a quick self-test
logicadmin@logic-node-01:~$ ./firewall-bridge --selftest
[Init] Engine initialised. KB loaded: /opt/logic-node/kb/firewall_policy.pl
[Test] QueryFirewall({SourceIP:"10.0.1.5", DestPort:443, Protocol:"tcp"})
       → Verdict: allowed, Reason: "whitelist_match", RuleID: 1  [20μs]
[Test] QueryFirewall({SourceIP:"203.0.113.8", DestPort:22, Protocol:"tcp"})
       → Verdict: denied, Reason: "blocklist_match", RuleID: 100  [18μs]
[Test] QueryFirewall({SourceIP:"8.8.8.8", DestPort:53, Protocol:"udp"})
       → Verdict: denied, Reason: "default_deny", RuleID: -1  [15μs]
[Test] 10,000 queries/sec sustained: heap stable, local stack O(1)
[PASS] All self-tests passed

// main.go: self-test entry point
package main

import (
	"fmt"
	"log"
	"os"
)

func main() {
	if err := InitEngine("/opt/logic-node/kb/firewall_policy.pl"); err != nil {
		log.Fatalf("Engine init failed: %v", err)
	}

	tests := []FirewallRequest{
		{SourceIP: "10.0.1.5",     DestPort: 443, Protocol: "tcp"},
		{SourceIP: "203.0.113.8",  DestPort: 22,  Protocol: "tcp"},
		{SourceIP: "8.8.8.8",      DestPort: 53,  Protocol: "udp"},
	}

	for _, req := range tests {
		verdict, err := QueryFirewall(req)
		if err != nil {
			fmt.Fprintf(os.Stderr, "QueryFirewall error: %v\n", err)
			continue
		}
		fmt.Printf("  %-18s → %-7s  %s  (rule %d)\n",
			req.SourceIP,
			map[bool]string{true: "allowed", false: "denied"}[verdict.Allowed],
			verdict.Reason,
			verdict.RuleID)
	}
}

15.5 Security: C-Level Atom Exhaustion and Panics

15.5.1 The PL_put_atom_chars DoS — Physics

PL_put_atom_chars(term_t t, const char *chars) is a single C function call. Its implementation:

  1. Computes a hash of chars.
  2. Looks up chars in the Atom Table hash table.
  3. If found: returns the existing atom_t — no allocation.
  4. If not found: allocates a new entry in the Atom Table (malloc on the C heap), copies chars into it, inserts into the hash table, increments the global atom count.
  5. The new entry is permanent. It is never freed for the life of the process.

The Atom Table itself is a C heap allocation. The default WAM Atom Table holds approximately 1,000,000 entries before requiring reallocation; SWI-Prolog reallocates it dynamically, but each reallocation doubles the table and requires rehashing all existing entries — a pause proportional to the current atom count.

A Go HTTP handler that receives JSON with an arbitrary string field and passes it to PL_put_atom_chars:

// CATASTROPHICALLY WRONG — DO NOT DO THIS
func handleRequest(w http.ResponseWriter, r *http.Request) {
    var body struct {
        SourceIP string `json:"source_ip"`
    }
    json.NewDecoder(r.Body).Decode(&body)
    
    // body.SourceIP comes from an untrusted HTTP request body.
    // Calling QueryFirewallBad(body.SourceIP) where the implementation
    // uses PL_put_atom_chars internally:
    QueryFirewallBad(body.SourceIP)   // INTERNS body.SourceIP AS PERMANENT ATOM
}

Attack payload — HTTP flood with random IP strings:

# Attack: 10,000 requests/second, each with a distinct fabricated IP string
for i in $(seq 1 100000); do
    curl -s -X POST http://localhost:8080/firewall \
         -d "{\"source_ip\": \"attacker-generated-string-${i}\"}" &
done

Memory arithmetic:

100,000 distinct strings
× average 40 bytes per atom entry (header + chars + padding)
= 4,000,000 bytes = 4MB Atom Table growth

At 10,000 requests/second with distinct strings:
  10,000 new atoms/second × 40 bytes = 400KB/second permanent growth
  After 60 seconds:     24MB permanent
  After 3,600 seconds:  1.44GB permanent
  At default Atom Table limit (~1M entries):
    exhaustion in ~100 seconds at 10,000 distinct strings/second

On Atom Table reallocation (doubling at 1M entries):
  Rehash of 1M entries: ~200ms pause
  All WAM operations blocked during rehash
  Goroutines calling FLI functions block for 200ms
  Go request handler deadlines expire
  Upstream load balancer marks the instance unhealthy
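
The figures above follow from three inputs, so they are easy to recompute for a different request rate or entry size. A small sketch, taking the 40-byte entry size and the roughly 1M-entry capacity from the text as assumptions rather than measured values:

```go
package main

import "fmt"

const (
	bytesPerAtom   int64 = 40        // assumed average entry size, from the text
	atomsPerSecond int64 = 10_000    // one new atom per request
	tableLimit     int64 = 1_000_000 // approximate capacity, from the text
)

// growthBytes returns the table growth in bytes after the given number of seconds.
func growthBytes(seconds int64) int64 {
	return seconds * atomsPerSecond * bytesPerAtom
}

// secondsToLimit returns how long it takes to create tableLimit atoms.
func secondsToLimit() int64 {
	return tableLimit / atomsPerSecond
}

func main() {
	fmt.Println("bytes after 60 s:  ", growthBytes(60))
	fmt.Println("bytes after 3600 s:", growthBytes(3600))
	fmt.Println("seconds to limit:  ", secondsToLimit())
}
```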

The Prolog-level safeguards from Volume II — topic_from_tag_safe/2, infrastructure_topic_string/2, bounded_nonspace_string/2 — operate on Prolog strings that are already on the WAM global heap. They never see the Atom Table. Bypassing them with PL_put_atom_chars at the C layer is not a workaround — it is removing the protection layer entirely.

15.5.2 The Correct C Boundary: PL_put_string_chars

// CORRECT: user-controlled strings cross as Prolog strings — never atoms
int make_firewall_dict_safe(term_t t,
                            const char *source_ip_str,   // FROM HTTP REQUEST
                            long        dest_port,
                            const char *protocol_str)    // FROM HTTP REQUEST
{
    term_t values = PL_new_term_refs(3);

    // PL_put_STRING_chars — allocates on WAM global heap (GC-eligible)
    // NOT PL_put_atom_chars — which would allocate in Atom Table (permanent)
    if (!PL_put_string_chars(values + 0, source_ip_str)) return 0;
    if (!PL_put_int64(values + 1, (int64_t)dest_port))   return 0;
    if (!PL_put_string_chars(values + 2, protocol_str))  return 0;
    // ...
}

The Prolog rule receiving this Dict accesses Request.source_ip as a string. string(Request.source_ip) succeeds. parse_ipv4(Request.source_ip, IPInt) accepts it. If source_ip_str is not a valid IPv4 address, parse_ipv4/2 throws parse_failure(ipv4, ...) — which the Go caller catches via PL_exception(qid) and converts to a Go error. No Atom Table entry is created. No WAM global heap allocation persists — the string is freed by WAM GC when the Dict goes out of scope after PL_close_query.

15.5.3 Catching PL_exception and Preventing Go Panics

When PL_next_solution returns 0 with an exception pending, the exception term exists on the WAM local stack inside the current query's frame. The exception must be read before PL_close_query — closing the query discards the exception term.

// extractQueryError: reads the Prolog exception from qid and converts to Go error.
// Must be called when PL_next_solution returns 0.
// Must be called BEFORE PL_close_query.
// Returns nil if the zero return was a clean failure (no exception).
func extractQueryError(qid C.qid_t) error {
	exCStr := C.pl_exception_to_string(qid)
	if exCStr == nil {
		return nil   // Clean failure — no exception pending
	}
	msg := C.GoString(exCStr)
	C.free(unsafe.Pointer(exCStr))
	return fmt.Errorf("prolog exception: %s", msg)
}

The PL_Q_CATCH_EXCEPTION flag passed to PL_open_query is also required. Without it, Prolog exceptions propagate up the C call stack as a longjmp — bypassing Go's deferred cleanup, corrupting Go stack frames, and causing the binary to crash with a SIGSEGV or call abort(). PL_Q_CATCH_EXCEPTION converts the longjmp into a catchable condition that PL_exception(qid) exposes.

// The complete exception-safe query wrapper.
// On success the query is returned OPEN so the caller can read the output
// arguments; the caller must then call C.PL_close_query(qid).
func executeQuery(pred C.predicate_t, args C.term_t, numArgs int) (C.qid_t, bool, error) {
	var noQuery C.qid_t // zero value returned when no query is left open

	// PL_Q_CATCH_EXCEPTION: converts Prolog longjmp → PL_exception(qid) readable
	// PL_Q_NODEBUG:         suppresses trace in embedded context
	qid := C.PL_open_query(nil,
		C.PL_Q_CATCH_EXCEPTION|C.PL_Q_NODEBUG,
		pred, args)

	rc := C.PL_next_solution(qid)

	if rc == 0 {
		// Read exception BEFORE close: order is mandatory
		err := extractQueryError(qid)
		C.PL_close_query(qid)
		if err != nil {
			return noQuery, false, err // Exception: return error to Go caller
		}
		return noQuery, false, nil // Clean failure: no solutions, no error
	}

	// rc != 0: solution found. Output args are now bound.
	// Caller reads them, then calls C.PL_close_query(qid).
	return qid, true, nil
}
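
Once the exception text is a Go error, callers often need to tell one failure kind from another, for example to map an invalid address to a client error and everything else to a server error. The sketch below shows one way to do that on the Go side with a typed error and errors.As. PrologError and classify are hypothetical names, and matching on the serialised text is a deliberately crude stand-in for the term-level classification that Exercise 15.4 asks for.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// PrologError is a hypothetical typed wrapper around the exception text.
type PrologError struct {
	Kind string // "invalid_ip", "no_predicate", or "unknown"
	Msg  string // the serialised exception term, unchanged
}

func (e *PrologError) Error() string {
	return "prolog exception (" + e.Kind + "): " + e.Msg
}

// classify assigns a kind by inspecting the serialised exception text.
// Substring matching is fragile; it is used here only to keep the sketch short.
func classify(msg string) *PrologError {
	switch {
	case strings.Contains(msg, "parse_failure(ipv4"):
		return &PrologError{Kind: "invalid_ip", Msg: msg}
	case strings.Contains(msg, "existence_error(procedure"):
		return &PrologError{Kind: "no_predicate", Msg: msg}
	default:
		return &PrologError{Kind: "unknown", Msg: msg}
	}
}

// statusFor maps an error to an HTTP-style status code for illustration.
func statusFor(err error) int {
	var pe *PrologError
	if errors.As(err, &pe) && pe.Kind == "invalid_ip" {
		return 400 // the caller sent a malformed address
	}
	return 500 // anything else is treated as a server-side fault
}

func main() {
	// Wrapping with %w keeps the typed error reachable through errors.As.
	err := fmt.Errorf("firewall_verdict/4: %w", classify(`parse_failure(ipv4,"not_an_ip")`))
	fmt.Println(err)
	fmt.Println(statusFor(err))
}
```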

15.5.4 Go Panic Prevention: The Deferred Cleanup Pattern

A PL_discard_foreign_frame that is never called because a Go panic unwound the stack leaves the WAM local stack in an inconsistent state. The frame descriptor is gone from Go's view but the WAM local stack still holds the frame's allocations. Subsequent FLI calls on the same OS thread operate with a corrupted frame stack.

// WRONG: panic between frame open and close leaves WAM local stack inconsistent
func queryWithPanic(req FirewallRequest) (FirewallVerdict, error) {
	frame := C.PL_open_foreign_frame()
	// If any Go code between here and PL_discard_foreign_frame panics,
	// the frame is never closed. The WAM local stack accumulates orphaned frames.
	args := C.PL_new_term_refs(4)
	doSomethingThatMightPanic(req)  // PANIC PATH
	C.PL_discard_foreign_frame(frame)  // NEVER REACHED
}

// CORRECT: defer PL_discard_foreign_frame immediately after PL_open_foreign_frame
// Go's defer runs on panic unwind — WAM local stack is always restored.
func queryWithDefer(req FirewallRequest) (FirewallVerdict, error) {
	frame := C.PL_open_foreign_frame()
	defer C.PL_discard_foreign_frame(frame)  // ALWAYS runs — panic or not
	// All FLI calls here are safe — frame will be discarded on any exit path.
	args := C.PL_new_term_refs(4)
	// ...
}

defer C.PL_discard_foreign_frame(frame) is the unconditional frame cleanup guarantee, equivalent to setup_call_cleanup/3 at the Prolog level. It must be the first statement after PL_open_foreign_frame — before any code that could panic. The WAM local stack is then always consistent regardless of Go's control flow.
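
The guarantee that a deferred call runs during panic unwinding can be observed without libswipl. In the sketch below, frameDepth, openFrame, and discardFrame are hypothetical stand-ins for the WAM frame stack and the two FLI calls; only the control flow is the point.

```go
package main

import "fmt"

// frameDepth stands in for the depth of the WAM foreign frame stack.
var frameDepth int

// openFrame pushes a frame and returns its handle (its depth).
func openFrame() int {
	frameDepth++
	return frameDepth
}

// discardFrame pops back to the depth that existed before frame was opened.
func discardFrame(frame int) {
	frameDepth = frame - 1
}

// queryWithDefer mirrors the pattern in the text: defer the discard
// immediately after the open, before any code that could panic.
func queryWithDefer(shouldPanic bool) {
	frame := openFrame()
	defer discardFrame(frame)
	if shouldPanic {
		panic("simulated failure between open and discard")
	}
}

// safeCall recovers the panic so the process keeps running, as a server would.
func safeCall(shouldPanic bool) (recovered bool) {
	defer func() {
		if r := recover(); r != nil {
			recovered = true
		}
	}()
	queryWithDefer(shouldPanic)
	return false
}

func main() {
	fmt.Println("recovered:", safeCall(false), "depth:", frameDepth)
	fmt.Println("recovered:", safeCall(true), "depth:", frameDepth)
}
```

In both runs the depth returns to zero, because the deferred discard in queryWithDefer executes before the recover in safeCall sees the panic.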


Outcome: Zero-Latency Polyglot Logic

15.6.1 The Conceptual Transition

Volumes I through III have been building toward a single architectural claim: that logic-based infrastructure reasoning can operate at the same latency tier as in-process function calls, not at the latency tier of network-based API calls.

The CGO bridge realises that claim. A Go microservice handling 10,000 firewall policy decisions per second calls into the WAM directly — no serialisation, no socket, no process boundary. The WAM executes the same firewall_verdict/4 rules that the operator wrote in Prolog, against the same live_state.pl KB that the Logic Node maintains, with the same parse_ipv4/2 and ip_in_cidr/3 primitives built in Chapter 10. The trust boundary is the validate_message gate in Go — no untrusted string enters the WAM as an atom. Every user-controlled field crosses via PL_put_string_chars. The Atom Table contains exactly the atoms interned at load time by init_atoms() and the KB itself.
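
The validate_message gate is not shown in this section. As a rough idea of what a Go-side gate in front of QueryFirewall might check, here is a minimal sketch. validateRequest is a hypothetical function, the FirewallRequest type is repeated only to keep the example self-contained, and the real gate may apply different or additional rules.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// FirewallRequest repeats the bridge's request type for this sketch.
type FirewallRequest struct {
	SourceIP string
	DestPort int64
	Protocol string
}

// validateRequest is a hypothetical pre-flight check run before any FLI call.
// It rejects malformed input early; it does not replace PL_put_string_chars.
func validateRequest(req FirewallRequest) error {
	addr, err := netip.ParseAddr(req.SourceIP)
	if err != nil || !addr.Is4() {
		return errors.New("source_ip is not a valid IPv4 address")
	}
	if req.DestPort < 0 || req.DestPort > 65535 {
		return errors.New("dest_port is out of range")
	}
	switch req.Protocol {
	case "tcp", "udp", "icmp":
		return nil
	default:
		return errors.New("protocol must be tcp, udp, or icmp")
	}
}

func main() {
	good := FirewallRequest{SourceIP: "10.0.1.5", DestPort: 443, Protocol: "tcp"}
	bad := FirewallRequest{SourceIP: "not_an_ip", DestPort: 80, Protocol: "tcp"}
	fmt.Println(validateRequest(good))
	fmt.Println(validateRequest(bad))
}
```

A gate like this narrows what reaches the engine, but the atom-safety guarantee still rests on the C layer using PL_put_string_chars for every user-controlled field.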

HTTP/REST Go → Prolog | CGO in-process Go → libswipl
0.5–12ms per query under load | 15–25μs per query; the query itself dominates
Two processes, two schedulers | One process, one OS scheduler
JSON marshal + unmarshal per call | No serialisation; term_t direct binding
Network failure mode possible | No network; IPC failure modes only
Prolog exceptions invisible to Go | PL_exception(qid) → typed Go error
Atom Table protected by Prolog layer | Atom Table exposed to C layer; requires explicit PL_put_string_chars discipline
Foreign frames not applicable | PL_open/discard_foreign_frame required per FLI session

15.6.2 Verification Checklist

# 1. Engine initialises and loads KB
logicadmin@logic-node-01:~$ ./firewall-bridge --selftest 2>&1 | grep PASS
[PASS] All self-tests passed

# 2. Whitelist match returns allowed
logicadmin@logic-node-01:~$ ./firewall-bridge --query '{"source_ip":"10.0.1.5","dest_port":443,"protocol":"tcp"}'
{"allowed":true,"reason":"whitelist_match","rule_id":1}

# 3. Blocklist match returns denied
logicadmin@logic-node-01:~$ ./firewall-bridge --query '{"source_ip":"203.0.113.8","dest_port":22,"protocol":"tcp"}'
{"allowed":false,"reason":"blocklist_match","rule_id":100}

# 4. Invalid IP returns Go error (not crash)
logicadmin@logic-node-01:~$ ./firewall-bridge --query '{"source_ip":"not_an_ip","dest_port":80,"protocol":"tcp"}'
{"error":"prolog exception: parse_failure(ipv4,\"not_an_ip\")"}
# ✓ PL_exception caught, converted to Go error, binary alive

# 5. Atom Table stable across 10,000 queries with distinct source IP strings
logicadmin@logic-node-01:~$ ./firewall-bridge --stress --count 10000 --random-ips
[Stress] 10,000 queries with random IP strings
[Stress] Atom Table before: 48203 atoms
[Stress] Atom Table after:  48203 atoms   ← 0 new atoms from user-controlled strings
[Stress] Peak WAM local stack: 4.2KB      ← O(1) due to foreign frame discipline
[PASS] Atom Table stable, local stack bounded

# 6. Local stack is O(1) in loop depth (no frame leak)
# Verified by --stress output above: 4.2KB peak regardless of 10,000 iterations

# 7. cgo LDFLAGS resolve correctly
logicadmin@logic-node-01:~$ ldd ./firewall-bridge | grep swipl
        libswipl.so.9 => /usr/lib/swi-prolog/lib/x86_64-linux/libswipl.so.9

# 8. PL_Q_CATCH_EXCEPTION prevents longjmp crash on Prolog exception
# (Verified by test 4 above — exception converted to error, not crash)

15.6.3 What Comes Next

Chapter 16 extends the CGO bridge with bidirectional streaming: a Go channel connected to a Prolog stream via a custom C stream handler registered with PL_open_stream. Go goroutines write structured events to the channel; the C stream handler delivers them to the Prolog side as terms via a DCG pipe, where the Volume II auth_parser.pl rules process them in real time. The result is a continuous stream of auth_event{} Dicts flowing from the Go network layer through the CGO bridge into the WAM KB, with the Concurrent Logic worker pool consuming them, all within one process, at WAM speed.


Chapter Summary

Concept | Operational Definition | Performance / Security Consequence
runtime.LockOSThread() | Pins goroutine to OS thread for FLI call duration | WAM is OS-thread-aware; goroutine migration between FLI calls corrupts WAM thread-local state
PL_thread_attach_engine(NULL) | Attaches WAM engine context to the calling OS thread | Required for every OS thread that calls FLI functions; safe to call multiple times on same thread
term_t as stack index | uintptr_t index into WAM local stack frame, not a heap pointer | Storing term_t beyond its frame boundary reads a stale/overwritten stack slot
PL_open_foreign_frame() / PL_discard_foreign_frame() | Marks a WAM local stack watermark; discard reclaims all term_t allocated since open | Must bracket every FLI loop iteration; without it, local stack use grows by O(N) over N iterations until exhaustion
defer C.PL_discard_foreign_frame(frame) | Go defer pattern: closes WAM frame on any exit including panic | Prevents WAM local stack corruption on Go panic; equivalent to setup_call_cleanup/3
PL_Q_CATCH_EXCEPTION flag | Converts Prolog exception longjmp into a catchable condition | Without this flag, Prolog exceptions corrupt Go stack frames and crash the binary
PL_exception(qid) before PL_close_query | Reads pending exception term while query is still open | PL_close_query silently discards exceptions; skipping this step yields nil errors for failed queries
PL_put_string_chars | Allocates on WAM global stack (GC-eligible) | Zero Atom Table growth for user-controlled strings; DoS-safe
PL_put_atom_chars | Interns in permanent Atom Table | NEVER use for user-controlled input; 10k distinct strings/sec exhausts Atom Table in ~100 seconds
init_atoms() at startup | Pre-interns fixed-vocabulary key atoms once | Schema field name atoms created once at startup; no atoms created during request processing
BUF_MALLOC flag on PL_get_chars | PL_get_chars allocates output buffer on C heap | Buffer survives PL_discard_foreign_frame; caller must C.free()
C.CString() / C.free() discipline | Every C.CString() creates a malloc'd C string; C.free() must be deferred | Go GC does not know about malloc allocations; a missing C.free() is a C heap leak
PL_put_dict key sort requirement | Keys array passed to PL_put_dict must be sorted by atom_t value | Unsorted keys produce a malformed Dict that fails runtime Dict operations
CGO call overhead | ~0.1μs per cgo call (runtime.cgocall trampoline) | 0.5% of 20μs query total; overhead is negligible; HTTP alternative is 400–600× slower
IP Trie (library(trie)) | Radix trie compiled from CIDR rules at load time via trie_insert/3; trie_lookup_prefix/6 descends at most 32 levels | Changes CIDR lookup from O(N rules) to O(W), at most 32 probes for IPv4; preserves 20μs CGO budget at 100,000+ rule scale
#cgo CFLAGS: -D_FORTIFY_SOURCE=2 -fstack-protector-strong | glibc runtime buffer-overflow checks + stack canary on address-taken locals and arrays | Unsafe C string overflows abort with SIGABRT rather than silently corrupting heap/stack; canary detects return-address overwrites
#cgo LDFLAGS: -Wl,-z,relro,-z,now | Full RELRO: GOT resolved at load time and marked read-only | GOT overwrite attacks (ROP gadget precursor) blocked by SIGSEGV; PLT lazy-binding window eliminated

Exercises

Exercise 15.1 — Predicate Handle Caching PL_predicate(name, arity, module) performs a hash lookup in the module table on every call. For a service executing 10,000 queries/second, this is 10,000 redundant hash lookups. Refactor QueryFirewall to cache predicate_t handles as sync.Once-initialised package-level variables after InitEngine completes. Benchmark the query throughput before and after caching using testing.B and report the improvement in queries/second.

Exercise 15.2 — Multi-Solution Iteration firewall_verdict/4 as written uses cut (!) and returns only the first matching rule. Implement QueryFirewallAll that calls PL_next_solution(qid) in a loop, collecting all matching rules before calling PL_close_query. Verify that: (a) the loop terminates when PL_next_solution returns 0; (b) the local stack depth after 100 iterations of the inner loop is identical to after 1 iteration (foreign frame covers the inner loop); (c) PL_close_query is called exactly once regardless of how many solutions were found.

Exercise 15.3 — Bidirectional Type Marshaling Extend the type translation table to handle Go []string (string slice) as a Prolog list of strings. Implement put_string_list(term_t t, []string vals) in C that constructs a proper Prolog list using PL_put_list_chars or PL_cons_list. The list must contain Prolog strings (not atoms) for each element. Write a Prolog rule member_of_list(+List, -Member) and verify that a Go slice ["10.0.0.1", "10.0.0.2"] marshaled through put_string_list produces a Prolog list where member_of_list(List, Member) succeeds for each element and string(Member) is true for all results.

Exercise 15.4 — Exception Classification pl_exception_to_string serialises the exception term as a string. For structured Go error handling, implement classify_prolog_exception(qid_t qid) in C that distinguishes: error(type_error(T, V), _) → returns C struct {kind: "type_error", type: T, value: V}; error(existence_error(procedure, Name/Arity), _) → returns {kind: "no_predicate", name: Name}; error(parse_failure(ipv4, Str), _) → returns {kind: "invalid_ip", input: Str}; anything else → returns {kind: "unknown"}. Expose this as a Go function ClassifyPrologError(qid C.qid_t) PrologError and update QueryFirewall to return typed errors instead of opaque strings.

Exercise 15.5 — Atom Table Regression Test Write a Go benchmark BenchmarkAtomTableStability that: (1) records the atom count before the benchmark via a Prolog query aggregate_all(count, current_atom(_), N); (2) executes 100,000 QueryFirewall calls with randomly generated source IP strings; (3) records the atom count after; (4) asserts that the delta is zero. Run with the -race flag to check the Go side of the bridge for data races; the race detector does not instrument C code, so it cannot observe races inside libswipl itself. Verify that replacing PL_put_string_chars with PL_put_atom_chars in make_firewall_dict causes the benchmark to fail with a non-zero delta.


Further Reading

  • SWI-Prolog Foreign Language Interface manual — https://www.swi-prolog.org/pldoc/man?section=foreign — the normative reference for all FLI functions used in this chapter
  • SWI-Prolog: PL_open_foreign_frame/0, PL_discard_foreign_frame/1 — https://www.swi-prolog.org/pldoc/man?section=foreign-frames
  • SWI-Prolog: PL_open_query/4, PL_next_solution/1, PL_exception/1 — https://www.swi-prolog.org/pldoc/man?section=calling-prolog-from-c
  • SWI-Prolog: PL_put_dict/5 — https://www.swi-prolog.org/pldoc/man?section=foreign-dict
  • Go cgo documentation — https://pkg.go.dev/cmd/cgo — pointer passing rules, C.CString, C.GoString, C.free, cgo CFLAGS/LDFLAGS directives
  • Go runtime.LockOSThread — https://pkg.go.dev/runtime#LockOSThread — goroutine-to-OS-thread pinning requirement for C libraries with thread-local state
  • Go runtime.Pinner (Go 1.21+) — https://pkg.go.dev/runtime#Pinner — explicit Go pointer pinning for long-lived C references
  • Kernighan, B.W. & Ritchie, D.M. (1988). The C Programming Language. 2nd ed. Prentice Hall. Chapter 8: The UNIX System Interface — C heap management (malloc/free) foundations underlying every C.CString() obligation in this chapter
  • Go team: "C? Go? Cgo!" — https://go.dev/blog/cgo — official introduction to cgo and its performance model

End of Chapter 15 — Next: Chapter 16: The CGO Stream Bridge — Continuous Term Delivery from Go Channels to DCG Pipes


Revision record: Chapter 16.1 — Architect's review applied. No deletions.

Improvement 1 — IP Trie optimisation: Section 15.4.3.1 inserted after firewall_policy.pl; quantifies the O(N) → O(log W) transition (10⁹ unifications/sec for 100,000-rule linear scan vs. 320,000 bit operations/sec for 32-level trie traversal); full firewall_trie.pl implementation using library(trie) with compile_firewall_trie/2, trie_lookup_prefix/6, firewall_verdict_trie/4, and init_firewall_trie/0 startup hook; cidr_to_range/3 decomposition; notes CGO bridge requires only predicate name change, no FLI contract changes.

Improvement 2 — CGO compiler hardening: #cgo CFLAGS extended with -D_FORTIFY_SOURCE=2 -fstack-protector-strong; #cgo LDFLAGS extended with -Wl,-z,relro,-z,now; 34-line explanatory comment block in cgo_bridge.go preamble explains each flag's mechanism, scope, and cost; build verification section updated with checksec output, objdump canary verification, and fortify check-site count; two new Chapter Summary rows covering IP Trie and compiler hardening flags.

BookStack tags: swi-prolog, chapter-15, cgo, go, libswipl, fli, foreign-frames, term_t, marshaling, atom-table, ip-trie, compiler-hardening, security, volume-iii