Chapter 16: The Go-Log Concurrency Model

Textbook: Modern SWI-Prolog (2026 Edition): Sovereign Infrastructure & Industrial Logic Volume: III — Scaling & Concurrency Chapter: 16 of 24 Audience: Senior Engineers, Systems Architects, Infrastructure Security Practitioners Prerequisites: Chapters 1–15 complete. cgo_bridge.go, marshal.c, firewall_policy.pl operational at /opt/logic-node/go/firewall-bridge/. libswipl headers at /usr/lib/swi-prolog/include/. Go 1.22+. logicadmin user active.

Core Concepts

Chapter 15 established a single working CGO bridge: one goroutine, pinned to one OS thread, attaching one WAM engine context, executing one firewall_verdict/4 query at a time. Under the load profile of a single operator issuing manual policy checks, this is correct and sufficient. Under the load profile of a Go HTTP server receiving 5,000 concurrent firewall policy decisions per second from an upstream load balancer, it is a serialisation bottleneck that negates every advantage the embedded WAM provides.

The path from one working CGO goroutine to a pool of N concurrent CGO goroutines is not "spin up N goroutines." Go's M:N scheduler, the WAM's OS-thread affinity model, and the term_t stack-index lifetime guarantee combine to create an impedance mismatch that produces silent data corruption and segmentation faults unless the pool architecture explicitly accounts for all three constraints simultaneously. The failure mode is not an error at pool startup — it is a SIGSEGV delivered to a worker goroutine at an arbitrary future point, under load, after the scheduler happens to migrate the goroutine between OS threads at the wrong moment.

Five properties define the thread-locked worker pool as the only correct concurrency architecture for the CGO bridge under Go's scheduler.

1. Go's M:N scheduler is incompatible with thread-local WAM state by default. Go multiplexes M goroutines onto N OS threads, where N is controlled by GOMAXPROCS (default: number of CPU cores). The scheduler preempts goroutines at function call boundaries and safe points, and is free to resume a preempted goroutine on any available OS thread in the pool. A goroutine that called PL_thread_attach_engine(NULL) on OS thread 17 accumulates WAM engine state — local stack frames, variable bindings, term_t handles — associated with thread 17's WAM context. If the scheduler migrates this goroutine to OS thread 22 between two FLI calls, the next FLI call executes with thread 22's WAM context (a different engine, potentially uninitialised). The term_t handles allocated on thread 17's local stack are now indices into thread 22's local stack. The result is undefined behaviour at the C layer — typically a SIGSEGV on the first PL_get_* call that dereferences the stale stack index.

2. runtime.LockOSThread() is not optional — it is the architectural primitive. runtime.LockOSThread() pins the calling goroutine to its current OS thread permanently (until runtime.UnlockOSThread() is called or the goroutine exits). The Go scheduler will not migrate a locked goroutine to another thread. For a WAM worker goroutine, runtime.LockOSThread() must be the first statement in the goroutine body — before PL_thread_attach_engine, before any FLI call, before any code that could trigger a scheduler preemption. A runtime.LockOSThread() call after the first FLI call is too late: the goroutine may already have migrated between the goroutine start and the lock call.

3. The pool size is fixed at the OS-thread budget, not the goroutine budget. Each worker goroutine in the pool locks one OS thread for its lifetime. N workers = N locked OS threads. The OS scheduler sees N real kernel threads, each with its own WAM engine context. The Go scheduler sees N goroutines that it cannot migrate. The correct pool size is the number of physical CPU cores minus the threads reserved for the main goroutine, the logger thread, and the Go runtime's background goroutines — typically GOMAXPROCS - 2. Allocating more workers than CPU cores creates N locked OS threads competing for M cores, defeating the purpose of the WAM's per-engine local stack locality.

4. Query dispatch uses a bounded Go channel, not a mutex or a sync.WaitGroup. Incoming FirewallRequest structs from HTTP handlers are dispatched to the worker pool via a single buffered channel. Each worker goroutine owns one request-reply channel: the dispatcher sends a WorkItem (request + reply channel) to the worker's inbound channel; the worker executes the FLI query and sends the FirewallVerdict back on the reply channel. The dispatcher blocks if all workers are busy — back-pressure is applied to the HTTP handler, which applies it to the upstream connection. A bounded channel with depth equal to the pool size ensures that the maximum number of in-flight WAM queries is bounded, preventing heap exhaustion from queued-but-unprocessed requests.

5. Engine recovery requires destroying and recreating the full OS-thread/engine pair. When a worker goroutine panics — from a C-level WAM stack overflow, an unhandled FLI exception, or a null pointer in marshal code — Go's recover() mechanism catches the panic at the deferred function boundary. At that point, the WAM engine on the current OS thread is in an undefined state: it may have an open query, an open foreign frame, a pending exception, or a partially-written term on the local stack. The engine cannot be salvaged by closing the query or discarding the frame. The correct recovery sequence is: call C.PL_thread_destroy_engine() to release the WAM engine context from this OS thread, call runtime.UnlockOSThread() to release the OS thread back to the Go scheduler, and signal the pool manager to spawn a replacement worker on a fresh OS thread with a fresh WAM engine. This sequence restores the pool to its target size without restarting the Go binary.

Chapter Roadmap

Section	Title	Focus
16.1	Scheduler Impedance Mismatch	M:N model, OS-thread affinity, migration failure mode, SIGSEGV mechanics
16.2	Thread-Locked Worker Architecture	Pool design, channel topology, engine-per-thread invariant
16.3	The Build: `engine_pool.go`	`Worker` struct, `workerLoop`, `Dispatcher`, full implementation
16.4	Broadcast Control Channels	KB reload, cache flush, `ControlMsg` dispatch, ordered shutdown
16.5	Security: Panic Isolation and Engine Recovery	Panic anatomy, `recover()` + `PL_thread_destroy_engine`, respawn protocol
Outcome	Enterprise-Grade Polyglot Concurrency	Verification checklist, latency under load

16.1 The Scheduler Impedance Mismatch

16.1.1 Go's M:N Scheduler — The Moving-Thread Problem

Go's runtime implements M:N green-thread scheduling: M goroutines are multiplexed across N OS threads (runtime.GOMAXPROCS(0) OS threads by default, set to the number of logical CPUs at startup). The scheduler is cooperative-preemptive: goroutines yield at function call sites, channel operations, and explicit runtime.Gosched() calls, but the Go 1.14+ asynchronous preemption mechanism also inserts preemption points via OS signals (SIGURG on Linux) at any safe point, including the middle of a long-running loop.

When the scheduler preempts a goroutine, it saves the goroutine's register state and stack pointer, and is free to schedule it later on any available OS thread from the pool. The goroutine is unaware of which OS thread it resumes on. For pure Go code, this is correct — the Go memory model guarantees that all Go heap accesses are safe regardless of which OS thread executes them.

For CGO code that has accumulated OS-thread-local C state, this model is catastrophically wrong:

Timeline of a scheduler migration under WAM:

T=0ms  Goroutine G starts on OS thread M17.
       runtime.LockOSThread() NOT called (incorrect).
       C.PL_thread_attach_engine(nil) → WAM engine E17 attached to M17.
       fid := C.PL_open_foreign_frame() → frame F opened on E17's local stack.
       args := C.PL_new_term_refs(4) → slots 0-3 on E17's local stack frame F.
       C.make_firewall_dict(args, ...) → term data written to E17's local/global stacks.

T=0.1ms  Go scheduler preempts G at a cgo return boundary.
         G's Go stack saved. G moved to run queue.
         OS thread M17 picks up a different goroutine G2.

T=0.2ms  Scheduler resumes G on OS thread M22.
         M22 has a different WAM engine E22 (or no engine at all).

T=0.2ms  G calls C.PL_open_query(nil, flags, pred, args).
         Inside libswipl: current engine = E22 (looked up via pthread_self()).
         'args' is still the term_t value from E17's local stack frame F.
         E22's local stack is at a completely different memory address.
         PL_open_query dereferences 'args' as an index into E22's local stack.
         E22's local stack at that index contains unrelated data or is unmapped.

RESULT:  SIGSEGV. Go binary dead.
         Or, if E22 happens to have valid memory at that index:
         Silent data corruption — query executes against wrong term values.
         Firewall verdict is computed for the wrong IP address.
         No error is returned. The Go caller receives a plausible but incorrect result.

The silent corruption case is worse than the SIGSEGV. A SIGSEGV kills the binary and forces an investigation. Silent wrong results pass through the system, log to the audit trail as successful queries, and surface only when a miscategorised IP address causes a production incident.

16.1.2 `runtime.LockOSThread()` — The Pin

func runtime.LockOSThread()

The pin is implemented at the g (goroutine) struct level in the Go runtime: g.lockedm is set to the current m (machine/OS thread), and m.lockedg is set to the current g. The scheduler's findrunnable() function skips OS threads with lockedg set — the thread runs only the locked goroutine until unlocked. There is no Go scheduler path that can migrate a locked goroutine.

runtime.LockOSThread() permanently wires the calling goroutine to its current OS thread. No other goroutine will execute on that thread. The goroutine will continue to execute on that specific OS thread until runtime.UnlockOSThread() is called the same number of times as runtime.LockOSThread(). If the goroutine exits without calling runtime.UnlockOSThread(), the Go runtime terminates the OS thread entirely — permanently removing it from the available thread pool. For worker goroutines designed to exit and be replaced, defer runtime.UnlockOSThread() is therefore mandatory: it preserves the OS thread for the replacement worker's use.

The placement requirement is absolute:

// WRONG: LockOSThread after any code that could trigger preemption
func workerLoopIncorrect() {
    // Goroutine may be on M17 at goroutine start.
    doSomeSetup()           // <- any function call is a preemption point
    // Goroutine may now be on M22 due to preemption during doSomeSetup().
    runtime.LockOSThread()  // Too late: locks M22, not M17
    C.PL_thread_attach_engine(nil)  // Engine attached to M22 — correct
    // ... but if any prior code left state on M17, it is inaccessible
}

// CORRECT: LockOSThread as the absolute first statement
func workerLoopCorrect() {
    runtime.LockOSThread()          // First. No preemption possible before this.
    defer runtime.UnlockOSThread()  // Released on goroutine exit — thread returned to pool
    C.PL_thread_attach_engine(nil)  // Engine attached to THIS thread — guaranteed stable
    // All subsequent FLI calls execute on the same OS thread.
}

The defer runtime.UnlockOSThread() is equally important: when a locked goroutine exits without calling runtime.UnlockOSThread(), the Go runtime terminates the OS thread entirely. For a pool where workers are expected to restart after panic recovery, defer runtime.UnlockOSThread() preserves the OS thread — the replacement worker goroutine inherits a fresh, unlocked thread from the pool rather than forcing the runtime to create a new one.

16.1.2.1 The Residual Migration Problem: NUMA Node Affinity

runtime.LockOSThread() prevents the Go scheduler from migrating the goroutine to a different OS thread. It does not prevent the Linux kernel's CPU scheduler from migrating the OS thread itself to a different physical CPU core. On a single-socket server this distinction is irrelevant — all cores share the same L3 cache. On a dual-socket server or a Proxmox VM pinned to a NUMA node with vCPUs from both sockets, the Linux scheduler is free to migrate the locked OS thread from a core on NUMA node 0 to a core on NUMA node 1 between WAM instruction dispatch cycles.

The WAM's performance depends heavily on L1/L2 cache locality for its local stack traversals. A worker's WAM local stack occupies a contiguous region of physical memory — pages allocated by the WAM's mmap call during PL_thread_attach_engine. Those pages are faulted in on first access and placed in the physical memory of the NUMA node that serviced the fault. If the page faults occurred on node 0, the pages live in node 0's memory controllers. When the Linux scheduler migrates the OS thread to a core on node 1, every subsequent WAM local stack access crosses the QPI/UPI interconnect — 3–4× higher latency than local DRAM access on a dual-socket EPYC or Xeon system. For a tight findall/3 loop over 30,000 tutorial_fact/3 clauses, cross-NUMA stack access degrades throughput by 40–60% compared to the local-NUMA baseline.

runtime.LockOSThread() alone is not sufficient to prevent this. OS-level NUMA binding is required:

# Option A — numactl: bind the entire Go binary to NUMA node 0.
# All worker OS threads, WAM engine stack allocations, and Go heap pages
# are placed on node 0's memory controllers. QPI crossings: zero.
logicadmin@logic-node-01:~$ numactl --cpunodebind=0 --membind=0 \
    ./firewall-bridge

# Option B — systemd service unit (persistent, no wrapper script):
# /etc/systemd/system/firewall-bridge.service [Service] section:
# CPUAffinity=0-15          ← physical cores on NUMA node 0 (verify with lscpu)
# NUMAPolicy=bind           ← Linux ≥ 5.13: enforce NUMA memory policy
# NUMAMask=0                ← bind to NUMA node 0

# Verify after startup: >99% Numa_Hit on node 0 means no cross-NUMA traffic
logicadmin@logic-node-01:~$ numastat -p $(pgrep firewall-bridge)
                           Node 0          Node 1
Numa_Hit               98234018             147   ← 99.9% local: correct
Numa_Miss                    89               0

The NUMA binding complements runtime.LockOSThread(): the lock eliminates goroutine migration (Go-level); the NUMA binding eliminates OS-thread migration between sockets (kernel-level). Together they guarantee that a worker goroutine executes on the same physical CPU core family, with its WAM local stack hot in that socket's L3 cache, for the duration of the worker's lifetime.

16.1.3 The SIGSEGV Anatomy — Memory Layout at Fault Time

When a goroutine executes an FLI call after scheduler migration, the actual SIGSEGV occurs inside libswipl.so at the point where the WAM dereferences a term_t value as a local stack offset:

// Pseudocode of PL_get_chars internals (simplified):
int PL_get_chars(term_t t, char **s, unsigned flags) {
    PL_local_data_t *ld = LD;  // LD = current thread's WAM local data
                                // Fetched via pthread_getspecific() using
                                // the WAM's thread-local storage key.
                                // After migration: LD points to M22's WAM data.
    Word *p = valTermRef(t);    // p = &ld->stacks.local.base[t]
                                // 't' was allocated on M17's local stack.
                                // M22's local stack base is a different address.
                                // p = (arbitrary address) + t_offset
    // ... dereference p ...    // SIGSEGV if p is outside mapped memory
}

The Go runtime catches SIGSEGV as a panic only if it originates from Go code. A SIGSEGV inside libswipl.so — native C code — is delivered as a fatal signal to the process. Go's panic/recover mechanism does not intercept C-level signals. The process dies with no stacktrace from the Prolog side, only a Go runtime crash dump showing the signal delivery point somewhere inside the libswipl shared library.

This is why Section 16.5's panic isolation strategy addresses Go-level panics (which recover() catches) but cannot address C-level SIGSEGV from incorrect thread migration — the only prevention is correct runtime.LockOSThread() placement before any FLI call.

16.2 The Thread-Locked Worker Architecture

16.2.1 Design Constraints

The pool architecture must satisfy four simultaneous constraints:

One WAM engine per OS thread, for the lifetime of the worker. Each worker goroutine holds one locked OS thread and one PL_thread_attach_engine context. The engine is initialised once, used for all queries in the worker's lifetime, and destroyed only on shutdown or panic recovery.
No shared mutable state between worker goroutines. Workers share the WAM clause database (read-only in normal operation) but have independent local stacks, global stacks, and trail stacks. The tutorial_fact/3 KB and firewall_verdict/4 rules are read-only — concurrent reads are safe without synchronisation. No worker calls assertz or retract.
Bounded in-flight query count. The dispatcher channel depth equals the pool size. When all workers are busy, the dispatcher blocks the HTTP handler goroutine, which blocks the net/http connection read, which applies back-pressure to the TCP connection. The system degrades gracefully under overload — it slows down without exhausting memory.
The pool must self-heal. A panicking worker must not reduce the pool below its target size. The pool manager monitors worker exit signals and spawns replacements. The HTTP handler is unaware of worker restarts — it sees only the dispatcher channel.

16.2.2 Diagram: Thread-Locked Engine Pool

%%{init: {"themeVariables": {"fontSize": "14px"}}}%%
flowchart TD
    H1["HTTP Handler 1\nnet/http goroutine\nFree-floating — scheduler may migrate\nSends WorkItem to dispatchCh"]

    H2["HTTP Handler 2\nnet/http goroutine\nBlocks on dispatchCh if pool full\nBack-pressure to TCP connection"]

    HN["HTTP Handler N\nnet/http goroutine\nBlocked by back-pressure\nNo new WAM allocation until worker free"]

    DISP["Dispatcher\ndispatchCh chan WorkItem\nbuffered depth = poolSize\nBlocks producers when full\nFanout: one WorkItem per worker"]

    CTRL["Pool Manager\nmonitors workerDone chan\nrespawns on worker exit\nmaintains target pool size\nbroadcasts ControlMsg via ctrlCh"]

    W1["Worker 1\nruntime.LockOSThread() — first line\nPL_thread_attach_engine()\nOwns WAM Engine E1\nOwnss OS Thread M_k1\nProcesses WorkItems serially"]

    W2["Worker 2\nruntime.LockOSThread()\nPL_thread_attach_engine()\nOwns WAM Engine E2\nOwns OS Thread M_k2"]

    WN["Worker N\nruntime.LockOSThread()\nPL_thread_attach_engine()\nOwns WAM Engine EN\nOwns OS Thread M_kN"]

    E1["WAM Engine E1\nLocal Stack: private\nGlobal Stack: private\nClause DB: SHARED read-only\nAtom Table: SHARED read-only"]

    E2["WAM Engine E2\nIndependent stacks\nSame shared DB/Atoms"]

    EN["WAM Engine EN\nIndependent stacks\nSame shared DB/Atoms"]

    DB["Shared Clause Database\ntutorial_fact/3\nfirewall_verdict/4\nAll loaded .pl rules\nRead-only from workers"]

    H1 --->|"WorkItem{req, replyCh}"| DISP
    H2 --->|"WorkItem{req, replyCh}"| DISP
    HN --->|"blocks — back-pressure"| DISP

    DISP --->|"WorkItem to worker inCh"| W1
    DISP --->|"WorkItem to worker inCh"| W2
    DISP --->|"WorkItem to worker inCh"| WN

    CTRL --->|"ControlMsg via ctrlCh"| W1
    CTRL --->|"ControlMsg via ctrlCh"| W2
    CTRL --->|"ControlMsg via ctrlCh"| WN

    CTRL --->|"respawn on workerDone signal"| W1

    W1 --->|"attaches"| E1
    W2 --->|"attaches"| E2
    WN --->|"attaches"| EN

    E1 --->|"reads"| DB
    E2 --->|"reads"| DB
    EN --->|"reads"| DB

    style H1 fill:#1A2B4A,color:#FFFFFF
    style H2 fill:#1A2B4A,color:#FFFFFF
    style HN fill:#1A2B4A,color:#FFFFFF
    style DISP fill:#7A1A1A,color:#FFFFFF
    style CTRL fill:#8B6914,color:#FFFFFF
    style W1 fill:#1A4070,color:#FFFFFF
    style W2 fill:#1A4070,color:#FFFFFF
    style WN fill:#1A4070,color:#FFFFFF
    style E1 fill:#2A5A2A,color:#FFFFFF
    style E2 fill:#2A5A2A,color:#FFFFFF
    style EN fill:#2A5A2A,color:#FFFFFF
    style DB fill:#1A6B3A,color:#FFFFFF

Reading the diagram: Dark blue HTTP handlers are free-floating goroutines — the Go scheduler may migrate them freely because they never touch the WAM. The red dispatcher is the single back-pressure point — it blocks all producers when the pool is saturated. Blue workers are the WAM boundary — each is pinned to one OS thread and owns one green WAM engine. The bright green clause database is shared and read-only — no synchronisation required for concurrent reads.

16.3 The Build: `engine_pool.go`

16.3.1 Package Structure

/opt/logic-node/go/firewall-bridge/
├── cgo_bridge.go       (Chapter 15 — single-query FLI, QueryFirewall)
├── marshal.c / .h      (Chapter 15 — make_firewall_dict, init_atoms)
├── engine_pool.go      ← this chapter — worker pool, dispatcher, control channel
├── main.go             (updated — uses Pool.Dispatch instead of QueryFirewall)
└── go.mod

16.3.2 Implementation

// File: /opt/logic-node/go/firewall-bridge/engine_pool.go
//
// Thread-locked WAM worker pool.
//
// ARCHITECTURE CONTRACT:
//   — Each worker goroutine is pinned to one OS thread via runtime.LockOSThread().
//   — LockOSThread() is the FIRST statement in workerLoop() — before any FLI call,
//     before any function call that could trigger a preemption point.
//   — Each worker attaches one WAM engine via C.PL_thread_attach_engine(nil).
//   — Workers share the WAM clause database (read-only).
//   — Workers have independent local, global, and trail stacks.
//   — All inter-goroutine communication is via typed Go channels.
//   — No mutex protects WAM state — no shared WAM state exists between workers.
//
// POOL SIZING:
//   poolSize = GOMAXPROCS - 2
//   (minus 2: one for main goroutine, one for pool manager and Go runtime GC)
//   Never exceed GOMAXPROCS — locked OS threads beyond CPU count thrash the scheduler.
//
// RECOVERY CONTRACT:
//   — workerLoop defers a recovery function that:
//       1. Catches any Go panic (including those propagated from cgo).
//       2. Calls C.PL_thread_destroy_engine() — releases WAM engine from OS thread.
//       3. Calls runtime.UnlockOSThread() — returns OS thread to Go runtime pool.
//       4. Signals pool manager via workerDone channel.
//   — Pool manager respawns a replacement worker to maintain pool size.
//   — Panicking worker never reduces the pool below target size for more than
//     the time required to start a new goroutine and attach a fresh WAM engine.

package main

/*
#cgo CFLAGS:  -I/usr/lib/swi-prolog/include -D_FORTIFY_SOURCE=2 -fstack-protector-strong
#cgo LDFLAGS: -lswipl -L/usr/lib/swi-prolog/lib/x86_64-linux -Wl,-z,relro,-z,now

#include <SWI-Prolog.h>
#include <stdlib.h>
#include "marshal.h"

// destroy_engine_safe: calls PL_thread_destroy_engine with a NULL check.
// PL_thread_destroy_engine is undefined behaviour if called on a thread
// that never successfully called PL_thread_attach_engine. The attached
// flag parameter allows the caller to guard this correctly.
static void destroy_engine_safe(int attached) {
    if (attached) {
        PL_thread_destroy_engine();
    }
}
*/
import "C"

import (
	"errors"
	"fmt"
	"log"
	"runtime"
	"sync"
	"sync/atomic"
	"time"
)

// ─────────────────────────────────────────────────────────────────────────────
// TYPES
// ─────────────────────────────────────────────────────────────────────────────

// WorkItem is the unit dispatched to a worker.
// replyCh is a caller-owned, unbuffered channel.
// The caller blocks on replyCh until the worker sends the verdict.
type WorkItem struct {
	Req     FirewallRequest
	ReplyCh chan<- WorkResult
}

// WorkResult carries the worker's response back to the caller.
// Exactly one of Verdict or Err will be non-zero.
type WorkResult struct {
	Verdict  FirewallVerdict
	Err      error
	WorkerID int        // Which worker processed this — for tracing
	Latency  time.Duration
}

// ControlMsg is sent to all workers via the broadcast control channel.
// Workers act on it between query processing iterations.
type ControlMsg struct {
	Kind    ControlKind
	Payload string // Optional — e.g., new KB path for Reload
}

type ControlKind int

const (
	ControlSync   ControlKind = iota // No-op ping — verify worker is alive
	ControlReload                    // Reload KB from Payload path
	ControlShutdown                  // Graceful shutdown — worker exits loop cleanly
)

// ─────────────────────────────────────────────────────────────────────────────
// WORKER
// ─────────────────────────────────────────────────────────────────────────────

// worker is an internal struct owned by the pool manager.
// It is not exported — callers interact only with Pool.
type worker struct {
	id       int
	inCh     chan WorkItem    // Inbound work — capacity 1 (worker processes one at a time)
	ctrlCh   chan ControlMsg  // Broadcast control messages
	doneCh   chan<- int       // Signals pool manager on exit — sends worker.id
}

// workerLoop is the goroutine body for each pool worker.
// INVARIANT: runtime.LockOSThread() is called BEFORE any other statement.
// This function never returns normally — it loops until ControlShutdown or panic.
func workerLoop(w worker) {
	// ── STEP 1: Pin goroutine to current OS thread. ──────────────────────────
	// This must be the first statement. Any function call before this is a
	// potential preemption point that could migrate the goroutine to a
	// different OS thread before the WAM engine is attached.
	runtime.LockOSThread()

	// engineAttached tracks whether PL_thread_attach_engine succeeded.
	// The deferred recovery function uses this to guard PL_thread_destroy_engine.
	engineAttached := false

	// ── STEP 2: Register deferred recovery. ─────────────────────────────────
	// Deferred functions run on panic unwind. The recovery function MUST be
	// registered before PL_thread_attach_engine — if attach itself panics
	// (e.g., engine limit exceeded), the recovery must still run.
	defer func() {
		if r := recover(); r != nil {
			log.Printf("[Worker %d] PANIC recovered: %v — destroying engine and respawning",
				w.id, r)
		}
		// Destroy WAM engine on this OS thread (if one was successfully attached).
		// PL_thread_destroy_engine releases all WAM state for this thread:
		// local stack, global stack, trail, pending exceptions, open queries.
		// After this call, no FLI function may be called on this OS thread.
		C.destroy_engine_safe(C.int(func() int {
			if engineAttached {
				return 1
			}
			return 0
		}()))

		// Release the OS thread back to the Go runtime's thread pool.
		// Without this, the dying goroutine's OS thread is terminated by the
		// runtime, shrinking the available thread pool by one permanently.
		runtime.UnlockOSThread()

		// Signal pool manager: this worker has exited, pool needs a replacement.
		// Non-blocking send — pool manager may have already initiated shutdown.
		select {
		case w.doneCh <- w.id:
		default:
		}
	}()

	// ── STEP 3: Attach WAM engine to this OS thread. ─────────────────────────
	// PL_thread_attach_engine(nil) uses global default WAM stack sizes:
	//   Local stack:  typically 128MB (controlled by --stack-limit at startup)
	//   Global stack: shared with local stack limit
	//   Trail stack:  embedded in local stack
	//
	// In production, use PL_create_engine() with an explicit PL_engine_t attrs
	// struct to bound each worker's stack independently of the global default.
	// See Section 16.3.2.1 for the bounded-stack pattern.
	// For this implementation, nil uses the global limit set by InitEngine's
	// PL_initialise argv: "--stack-limit=64M" (matching Chapter 14's worker_stack_mb).
	if rc := C.PL_thread_attach_engine(nil); rc < 0 {
		// Negative return: engine limit exceeded or out of memory.
		// Panic here — the deferred recovery will clean up.
		panic(fmt.Sprintf("worker %d: PL_thread_attach_engine failed (rc=%d)", w.id, rc))
	}
	engineAttached = true
	log.Printf("[Worker %d] WAM engine attached (OS thread locked)", w.id)

	// ── STEP 4: Query processing loop. ───────────────────────────────────────
	// select on both work channel and control channel.
	// Work items are processed one at a time — the worker is serialised.
	// No two queries share this worker's WAM engine simultaneously.
	for {
		select {
		case item, ok := <-w.inCh:
			if !ok {
				// Channel closed — pool manager is shutting down this worker.
				return
			}
			start := time.Now()
			verdict, err := QueryFirewall(item.Req)
			// QueryFirewall from cgo_bridge.go already handles:
			//   - PL_open_foreign_frame / PL_discard_foreign_frame per call
			//   - PL_exception extraction before PL_close_query
			//   - C.CString / C.free discipline
			// The worker loop does not need to manage FLI lifecycle — QueryFirewall
			// encapsulates the full frame-open/query/extract/close/frame-discard cycle.
			item.ReplyCh <- WorkResult{
				Verdict:  verdict,
				Err:      err,
				WorkerID: w.id,
				Latency:  time.Since(start),
			}

		case ctrl := <-w.ctrlCh:
			if err := handleControl(w.id, ctrl); err != nil {
				// handleControl returns non-nil only for ControlShutdown.
				// Clean exit — no panic, no engine destruction needed for normal shutdown.
				// The defer still fires and calls destroy_engine_safe — correct.
				log.Printf("[Worker %d] Shutdown on control signal", w.id)
				return
			}
		}
	}
}

// handleControl processes a ControlMsg on the calling worker.
// Returns non-nil to signal the worker loop to exit cleanly (ControlShutdown).
// Returns nil to signal the worker loop to continue (ControlSync, ControlReload).
func handleControl(workerID int, ctrl ControlMsg) error {
	switch ctrl.Kind {
	case ControlShutdown:
		return errors.New("shutdown")

	case ControlSync:
		// Ping — no action required. The fact that this case was reached
		// proves the worker goroutine is alive and on its locked OS thread.
		log.Printf("[Worker %d] Sync acknowledged", workerID)
		return nil

	case ControlReload:
		// Reload the KB on this worker's WAM engine.
		// consultFile from cgo_bridge.go is safe to call here — it uses
		// PL_open_foreign_frame / PL_discard_foreign_frame internally.
		// The reload affects only this worker's engine — other workers
		// continue to use their current KB state until they receive
		// the same ControlReload message via the broadcast.
		log.Printf("[Worker %d] Reloading KB: %s", workerID, ctrl.Payload)
		if err := consultFile(ctrl.Payload); err != nil {
			log.Printf("[Worker %d] KB reload FAILED: %v", workerID, err)
			// Do not return error — worker continues with old KB.
			// Pool manager logs the reload failure via the status API.
		} else {
			log.Printf("[Worker %d] KB reload OK", workerID)
		}
		return nil

	default:
		log.Printf("[Worker %d] Unknown control kind %d — ignoring", workerID, ctrl.Kind)
		return nil
	}
}

// ─────────────────────────────────────────────────────────────────────────────
// POOL
// ─────────────────────────────────────────────────────────────────────────────

// Pool manages a fixed-size collection of thread-locked WAM workers.
// All exported methods are goroutine-safe.
type Pool struct {
	workers    []*worker
	dispatchCh chan WorkItem  // Single inbound channel for all work
	ctrlCh     chan ControlMsg // Broadcast to all workers
	doneCh     chan int        // Workers signal exit here — triggers respawn
	size       int
	mu         sync.Mutex     // Protects workers slice during respawn only
	shutdown   chan struct{}   // Closed by Stop() to halt the manager loop
	stopped    atomic.Bool    // True after Stop() returns
}

// NewPool creates and starts a worker pool of the given size.
// kbPath is passed to InitEngine — the KB must be loadable before workers start.
// Returns an error if engine initialisation fails.
func NewPool(size int, kbPath string) (*Pool, error) {
	if size <= 0 {
		return nil, fmt.Errorf("pool size must be positive, got %d", size)
	}

	// Initialise the shared WAM engine (loads KB into shared clause database).
	// Workers attach their per-thread engine contexts after this.
	if err := InitEngine(kbPath); err != nil {
		return nil, fmt.Errorf("engine init failed: %w", err)
	}

	p := &Pool{
		workers:    make([]*worker, size),
		// dispatchCh depth = pool size: maximum one queued item per worker.
		// Deeper buffer allows more queuing but delays back-pressure.
		// At depth=size, a fully-saturated pool with one queued item per worker
		// means at most 2×size queries are in-flight at any moment.
		dispatchCh: make(chan WorkItem, size),
		ctrlCh:     make(chan ControlMsg, size*2), // Buffered: broadcasts don't block sender
		doneCh:     make(chan int, size),
		size:       size,
		shutdown:   make(chan struct{}),
	}

	// Start all workers.
	for i := 0; i < size; i++ {
		p.startWorker(i)
	}

	// Start the dispatcher goroutine — routes dispatchCh items to worker inCh.
	go p.dispatchLoop()

	// Start the pool manager goroutine — monitors doneCh and respawns workers.
	go p.managerLoop()

	return p, nil
}

// startWorker creates a worker struct and launches its goroutine.
// Caller must hold p.mu if called after initial pool creation.
func (p *Pool) startWorker(id int) {
	w := &worker{
		id:     id,
		inCh:   make(chan WorkItem, 1), // Depth 1: worker accepts one item, processes it
		ctrlCh: p.ctrlCh,              // Shared broadcast channel — all workers read it
		doneCh: p.doneCh,
	}
	p.workers[id] = w
	go workerLoop(*w)
}

// dispatchLoop routes WorkItems from the shared dispatchCh to individual worker inCh.
// Uses a round-robin selection across worker inCh channels to distribute work evenly.
// A worker whose inCh is full (processing a prior item) is skipped in this round.
func (p *Pool) dispatchLoop() {
	idx := 0
	for {
		select {
		case <-p.shutdown:
			return
		case item := <-p.dispatchCh:
			// Try to deliver to the next available worker using a rotating index.
			// If all workers are busy, fall back to a blocking send on any worker.
			delivered := false
			start := idx
			for {
				p.mu.Lock()
				w := p.workers[idx%p.size]
				p.mu.Unlock()
				idx = (idx + 1) % p.size

				// Non-blocking send attempt: if worker inCh has capacity, deliver immediately.
				select {
				case w.inCh <- item:
					delivered = true
				default:
					// Worker busy — try next
				}
				if delivered {
					break
				}
				if idx == start {
					// Full rotation — all workers busy. Block on the first worker.
					// This applies back-pressure: dispatchLoop blocks, dispatchCh fills,
					// Dispatch() blocks, HTTP handler blocks, TCP back-pressure applied.
					p.mu.Lock()
					first := p.workers[start%p.size]
					p.mu.Unlock()
					first.inCh <- item
					break
				}
			}
		}
	}
}

// managerLoop monitors worker exits and respawns replacements.
// A worker that exits (cleanly or via panic recovery) sends its ID to doneCh.
// The manager respawns a replacement worker with the same ID.
func (p *Pool) managerLoop() {
	for {
		select {
		case <-p.shutdown:
			return
		case deadID := <-p.doneCh:
			if p.stopped.Load() {
				// Pool is stopping — do not respawn.
				continue
			}
			log.Printf("[Pool Manager] Worker %d exited — respawning", deadID)
			// Brief backoff before respawn: if the worker panicked due to a
			// systematic error (e.g., KB corruption), rapid respawning loops
			// infinitely and floods the log. The backoff limits respawn rate
			// to one per 100ms per worker slot.
			time.Sleep(100 * time.Millisecond)
			p.mu.Lock()
			p.startWorker(deadID)
			p.mu.Unlock()
			log.Printf("[Pool Manager] Worker %d respawned", deadID)
		}
	}
}

// ─────────────────────────────────────────────────────────────────────────────
// PUBLIC API
// ─────────────────────────────────────────────────────────────────────────────

// Dispatch sends a FirewallRequest to the worker pool and blocks until a result
// is available or the context deadline is exceeded.
// Returns WorkResult — check WorkResult.Err for query errors.
// Returns error if the pool is stopped or the send times out.
func (p *Pool) Dispatch(req FirewallRequest, timeout time.Duration) (WorkResult, error) {
	if p.stopped.Load() {
		return WorkResult{}, errors.New("pool is stopped")
	}

	// Unbuffered reply channel — caller blocks until exactly one result arrives.
	// Using a size-1 buffer to prevent a goroutine leak if the caller times out
	// before reading the result: the worker can still send without blocking.
	replyCh := make(chan WorkResult, 1)

	item := WorkItem{Req: req, ReplyCh: replyCh}

	// Send to dispatch channel with timeout.
	// If dispatchCh is full (all workers busy, queue full), this blocks.
	// The timeout converts back-pressure into an explicit error for the caller.
	select {
	case p.dispatchCh <- item:
		// Successfully queued — now wait for result.
	case <-time.After(timeout):
		return WorkResult{}, fmt.Errorf("dispatch timeout after %v (pool saturated)", timeout)
	}

	// Wait for result with the same timeout (query execution time).
	select {
	case result := <-replyCh:
		return result, nil
	case <-time.After(timeout):
		return WorkResult{}, fmt.Errorf("result timeout after %v (worker stalled)", timeout)
	}
}

// Stop initiates graceful pool shutdown.
// Broadcasts ControlShutdown to all workers, waits for all to exit,
// then closes the dispatch and control channels.
// Stop blocks until all workers have acknowledged shutdown or the deadline expires.
func (p *Pool) Stop(deadline time.Duration) error {
	p.stopped.Store(true)
	close(p.shutdown) // Signals dispatchLoop and managerLoop to exit

	// Broadcast shutdown to all workers.
	p.Broadcast(ControlMsg{Kind: ControlShutdown})

	// Wait for all workers to call doneCh.
	done := make(chan struct{})
	go func() {
		received := 0
		for received < p.size {
			select {
			case <-p.doneCh:
				received++
			case <-time.After(deadline):
				log.Printf("[Pool] Shutdown timeout: %d/%d workers responded", received, p.size)
				close(done)
				return
			}
		}
		close(done)
	}()

	<-done
	log.Printf("[Pool] Shutdown complete")
	return nil
}

// Broadcast sends a ControlMsg to all workers via the shared control channel.
// Workers consume control messages between query iterations — delivery is
// asynchronous but ordered (FIFO within the ctrlCh buffer).
// For Reload: all workers will eventually reload, but not simultaneously.
// For Shutdown: all workers will exit after finishing their current query.
func (p *Pool) Broadcast(msg ControlMsg) {
	// Send N copies — one per worker. Each worker reads from the shared ctrlCh.
	// Workers consume at their own rate — a busy worker delays its own reload.
	for i := 0; i < p.size; i++ {
		select {
		case p.ctrlCh <- msg:
		default:
			// ctrlCh buffer full — this should not happen for a correctly sized buffer
			// (ctrlCh capacity = size*2, Broadcast sends size messages).
			// Log and drop rather than blocking Broadcast.
			log.Printf("[Pool] WARNING: ctrlCh full on Broadcast — message dropped for slot %d", i)
		}
	}
}

// Status returns a snapshot of current pool state.
// Goroutine-safe — reads atomic values and channel lengths only.
func (p *Pool) Status() PoolStatus {
	return PoolStatus{
		PoolSize:     p.size,
		QueueDepth:   len(p.dispatchCh),
		QueueCap:     cap(p.dispatchCh),
		PctSaturated: len(p.dispatchCh) * 100 / cap(p.dispatchCh),
		Stopped:      p.stopped.Load(),
	}
}

// PoolStatus is a point-in-time snapshot of pool health.
type PoolStatus struct {
	PoolSize     int
	QueueDepth   int
	QueueCap     int
	PctSaturated int
	Stopped      bool
}

16.3.2.1 Production Engine Pre-Allocation: Bounding Stack Size with `PL_create_engine`

PL_thread_attach_engine(nil) allocates a WAM engine using the global stack limit — the --stack-limit value passed to PL_initialise at startup. With 16 workers all sharing one global limit of 64MB, a single runaway query (an accidentally infinite recursive rule, a KB loaded without the tabling annotation from Chapter 17) can allocate its way to the full 64MB before the WAM stack overflow signal fires. By the time the overflow is detected and the panic path in Section 16.5 runs, the process may be under severe memory pressure. The Go GC and the WAM garbage collector compete for physical RAM during the overflow window.

PL_create_engine (available in SWI-Prolog 9.1+) constructs an engine with explicitly bounded stacks before it is attached to an OS thread, isolating each worker's stack allocation budget from all others:

// marshal.h — add to the C layer already compiled into the bridge

// create_bounded_engine: creates an engine with a strict per-worker stack cap.
// local_mb:  per-worker local + trail stack limit in megabytes
// global_mb: per-worker global stack limit in megabytes
// Returns a PL_engine_t handle on success, NULL on failure.
// Call PL_set_engine(handle, NULL) on the calling OS thread to attach.
// Call PL_destroy_engine(handle) on cleanup.

static PL_engine_t create_bounded_engine(int local_mb, int global_mb) {
    PL_engine_t e;
    // PL_create_engine accepts a term_t options dict with stack size keys.
    // We build this via the FLI on the current thread's engine (main engine).
    // local(N):  local + trail stack combined limit in bytes
    // global(N): global stack limit in bytes

    // Build options Dict: engine{local: L, global: G}
    term_t opts = PL_new_term_refs(3);
    term_t lval = opts + 1;
    term_t gval = opts + 2;

    if (!PL_put_int64(lval, (int64_t)local_mb  * 1024 * 1024)) return NULL;
    if (!PL_put_int64(gval, (int64_t)global_mb * 1024 * 1024)) return NULL;

    atom_t keys[2] = { PL_new_atom("local"), PL_new_atom("global") };
    if (!PL_put_dict(opts, PL_new_atom("engine"), 2, keys, lval)) return NULL;

    e = PL_create_engine(opts);  // NULL on failure (OOM, limit exceeded)
    return e;
}

// attach_bounded_engine: attaches a pre-created engine to the calling OS thread.
// Must be called after runtime.LockOSThread() and before any FLI query.
// Returns 1 on success, 0 on failure.
static int attach_bounded_engine(PL_engine_t e) {
    PL_engine_t prev;
    return PL_set_engine(e, &prev) == PL_ENGINE_SET;
}

// detach_and_destroy_engine: detaches from the calling OS thread and destroys.
// Call in the recovery defer in place of PL_thread_destroy_engine().
static void detach_and_destroy_engine(PL_engine_t e) {
    PL_set_engine(PL_ENGINE_MAIN, NULL);  // Detach: revert to main engine context
    PL_destroy_engine(e);                 // Release all stacks and internal state
}

The revised workerLoop STEP 3 using PL_create_engine:

// In cgo preamble, expose create_bounded_engine and attach_bounded_engine
// to Go as C functions callable via C.create_bounded_engine(local_mb, global_mb)
// and C.attach_bounded_engine(engine_handle).

// STEP 3 (production variant — bounded stacks):
// localMB and globalMB match the stack_limit from Chapter 14's worker_stack_mb (64MB).
// Each worker is independently capped — a runaway query in worker 7 cannot
// exhaust the global stack allocation and affect workers 0-6 and 8-15.
engineHandle := C.create_bounded_engine(C.int(64), C.int(64))
if engineHandle == nil {
    panic(fmt.Sprintf("worker %d: create_bounded_engine failed", w.id))
}
if C.attach_bounded_engine(engineHandle) == 0 {
    C.detach_and_destroy_engine(engineHandle)
    panic(fmt.Sprintf("worker %d: attach_bounded_engine failed", w.id))
}
engineAttached = true
// Update the deferred recovery to use detach_and_destroy_engine(engineHandle)
// instead of destroy_engine_safe(1) — the C.PL_thread_destroy_engine path
// is not correct for PL_create_engine-allocated handles.

The per-worker isolation means: at 16 workers × 64MB local + 64MB global = 2GB total maximum WAM stack allocation. At 128GB physical RAM, this is 1.5% of available memory reserved for worker stacks. Without PL_create_engine, a single stack-overflow failure before the panic catcher fires can allocate the full global limit — 64MB — in one burst rather than 4MB per worker. The isolation boundary is the difference between one worker failing cleanly and 15 workers experiencing memory pressure during the failure.

16.3.3 Updated `main.go` — HTTP Handler with Pool Dispatch

// File: /opt/logic-node/go/firewall-bridge/main.go (updated for pool)
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"runtime"
	"time"
)

var globalPool *Pool

func main() {
	// Pool size: GOMAXPROCS minus 2 (main goroutine + pool manager/GC).
	poolSize := runtime.GOMAXPROCS(0) - 2
	if poolSize < 1 {
		poolSize = 1
	}

	var err error
	globalPool, err = NewPool(poolSize, "/opt/logic-node/kb/firewall_policy.pl")
	if err != nil {
		log.Fatalf("Pool init failed: %v", err)
	}
	defer globalPool.Stop(5 * time.Second)

	http.HandleFunc("/firewall", firewallHandler)
	http.HandleFunc("/reload",   reloadHandler)
	http.HandleFunc("/status",   statusHandler)

	log.Printf("Firewall bridge listening on :8080 (pool size: %d)", poolSize)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

// firewallHandler: HTTP handler — free-floating goroutine (NOT locked to OS thread).
// Dispatches to pool via channel. Never touches WAM directly.
func firewallHandler(w http.ResponseWriter, r *http.Request) {
	var req FirewallRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}

	// Dispatch with 500ms timeout — covers dispatch queue wait + query execution.
	// A saturated pool returns a 503; a stalled worker returns a 504.
	result, err := globalPool.Dispatch(req, 500*time.Millisecond)
	if err != nil {
		http.Error(w, err.Error(), http.StatusServiceUnavailable)
		return
	}
	if result.Err != nil {
		http.Error(w, result.Err.Error(), http.StatusInternalServerError)
		return
	}

	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]interface{}{
		"allowed":   result.Verdict.Allowed,
		"reason":    result.Verdict.Reason,
		"rule_id":   result.Verdict.RuleID,
		"worker_id": result.WorkerID,
		"latency_us": result.Latency.Microseconds(),
	})
}

// reloadHandler: broadcasts a KB reload to all workers.
func reloadHandler(w http.ResponseWriter, r *http.Request) {
	path := r.URL.Query().Get("path")
	if path == "" {
		path = "/opt/logic-node/kb/firewall_policy.pl"
	}
	globalPool.Broadcast(ControlMsg{Kind: ControlReload, Payload: path})
	w.WriteHeader(http.StatusAccepted)
	w.Write([]byte(`{"status":"reload_broadcast"}`))
}

// statusHandler: returns pool health snapshot.
func statusHandler(w http.ResponseWriter, r *http.Request) {
	s := globalPool.Status()
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(s)
}

16.4 Broadcast Control Channels

16.4.1 The Reload Problem

Each worker owns a private WAM engine. The WAM clause database is shared (read-only in normal operation), but each engine's local state — current module, loaded file list, flag values — is per-engine. When the operator pushes an updated firewall_policy.pl (a new blocklist, an updated whitelist range), all running workers must reload the KB. The reload must happen without taking the pool offline, without a query window where some workers use the old KB and others use the new one, and without any worker holding an open foreign frame or open query during the reload.

The ControlReload broadcast via ctrlCh satisfies these constraints:

The broadcaster (HTTP handler → Pool.Broadcast) sends N copies of ControlReload to ctrlCh.
Each worker reads its copy from ctrlCh when it next reaches the select in workerLoop — between query iterations.
A worker processing a query reads the control message only after the current WorkItem finishes — the reload never interrupts an in-flight query.
consultFile(ctrl.Payload) re-loads the file into the WAM. In SWI-Prolog, consult/1 on an already-loaded file performs an incremental update: new clauses are added, modified clauses are replaced, deleted clauses are retracted. The shared clause database is updated atomically at the clause-database mutex level — other workers reading the old version of a predicate during the consult see consistent (old or new, never partial) clause lists.
Workers reload in sequence across the pool, not simultaneously. This is intentional: simultaneous reloads on all engines would all contend on the clause database mutex at the same time, producing a pause.

16.4.2 Reload Sequencing and the Consistency Window

The consistency window — the period between the first worker reloading and the last worker reloading — means that concurrent queries may use different KB versions. For firewall policy this is acceptable: a CIDR range added to the blocklist takes effect for queries processed by the first reloading worker immediately, and for the last worker within at most size × average_query_time seconds. For a 16-worker pool at 20μs per query: 16 × 20μs = 320μs maximum consistency window. No blocklisted IP receives more than 320μs of continued access after a reload broadcast.

If the application requires strict atomic KB cutover (all workers switch simultaneously), the correct approach is a version-tagged clause database: the new rules are loaded as firewall_verdict_v2/4, the old rules remain as firewall_verdict_v1/4, and a shared atomic integer stores the active version number. Workers read the active version number at query time and dispatch to the corresponding predicate. When all workers are confirmed idle (via a synchronisation barrier), the version number is atomically incremented. This is a more complex architecture than the rolling reload and is warranted only when the consistency window is operationally unacceptable.

16.4.3 The Sync Ping — Liveness Verification

ControlSync verifies that a worker is alive and executing on its locked OS thread without disturbing its processing state. The pool manager uses it after a respawn to confirm the replacement worker is healthy:

// PingWorkers: broadcasts ControlSync and waits for all workers to log acknowledgement.
// Used after respawn, after reload, or as a periodic health check.
// This is a best-effort check — a worker that is processing a long query
// will acknowledge only after that query completes.
func (p *Pool) PingWorkers(timeout time.Duration) error {
	p.Broadcast(ControlMsg{Kind: ControlSync})
	// Sync acknowledgement is logged by each worker's handleControl.
	// There is no reply channel for sync — liveness is inferred from
	// the log output, not from a returned value.
	// For a strict liveness check, use Dispatch with a trivial no-op query
	// and verify that all N dispatched items return within the timeout.
	time.Sleep(timeout)
	return nil
}

16.5 Security: Panic Isolation and Engine Recovery

16.5.1 What Triggers a Worker Panic

A worker goroutine can panic from three sources, with different characteristics and recovery requirements:

Source 1: Go-level panic inside QueryFirewall. A nil pointer dereference in the Go code of cgo_bridge.go, a type assertion failure, or an explicit panic() call. This is caught by Go's standard recover() mechanism. The WAM engine state after a Go-level panic depends on where in the FLI call sequence the panic occurred:

If the panic occurred before PL_open_foreign_frame: engine is clean.
If the panic occurred between PL_open_foreign_frame and PL_discard_foreign_frame: a foreign frame is open on the WAM local stack. The engine is not corrupted but the frame is leaked.
If the panic occurred between PL_open_query and PL_close_query: a query is open. The engine has pending choice points and variable bindings.

In all cases, the correct recovery action is PL_thread_destroy_engine() rather than attempting to close the query or discard the frame — the Go stack unwound without executing those cleanup calls, and the WAM state is not recoverable to a known-clean state without destroying the engine entirely.

Source 2: CGO-propagated C error converted to Go panic. Some CGO calls use panic() to signal unrecoverable C-layer errors — PL_thread_attach_engine failure, as implemented in workerLoop. These are Go-level panics triggered by Go code and are caught by recover().

Source 3: C-level SIGSEGV from incorrect thread migration. A SIGSEGV inside libswipl.so is delivered as a process-level signal. Go's runtime catches SIGSEGV originating from Go code as a panic (for stack overflow, nil dereference in Go). A SIGSEGV in C code is a fatal signal — recover() does not catch it. This is why runtime.LockOSThread() as the first statement in workerLoop is not optional — it is the prevention mechanism for this source. If the lock is in place, source 3 cannot occur during normal operation.

16.5.2 The Recovery Sequence — Annotated

// The defer in workerLoop, fully annotated:
defer func() {
	// ── Phase 1: Catch the panic. ────────────────────────────────────────────
	r := recover()
	// recover() returns nil if the goroutine exited cleanly (ControlShutdown).
	// recover() returns the panic value if the goroutine panicked.
	// After recover() returns, the goroutine is in a normal (non-panicking) state.
	// All deferred functions after this point execute normally.
	if r != nil {
		log.Printf("[Worker %d] PANIC: %v", w.id, r)
		// Do not re-panic. The pool manager will respawn this worker.
		// Re-panicking would propagate up to the Go runtime and crash the binary.
	}

	// ── Phase 2: Destroy the WAM engine on this OS thread. ──────────────────
	// Regardless of why the goroutine is exiting (clean or panic),
	// PL_thread_destroy_engine releases all WAM resources for this thread:
	//   - Closes any open foreign frames (reclaims local stack)
	//   - Closes any open queries (unwinds choice points, releases trail)
	//   - Frees the thread's local data structure (PL_local_data_t)
	//   - Detaches the engine from the OS thread (clears pthread_setspecific)
	//
	// After this call: no FLI function may be called from this OS thread.
	// This OS thread is clean and can be reused by the Go runtime.
	C.destroy_engine_safe(C.int(func() int {
		if engineAttached { return 1 }
		return 0
	}()))

	// ── Phase 3: Unlock the OS thread. ──────────────────────────────────────
	// Returns the OS thread to Go's thread pool for use by other goroutines.
	// Without this: the dying goroutine's OS thread is TERMINATED by the runtime,
	// permanently reducing Go's available thread pool by one.
	// With this: the OS thread survives and is available for the replacement worker.
	runtime.UnlockOSThread()

	// ── Phase 4: Signal pool manager. ───────────────────────────────────────
	// Non-blocking send — if pool manager is shutting down, doneCh may be
	// unread. The select prevents the goroutine from blocking on exit.
	select {
	case w.doneCh <- w.id:
	default:
	}
	// Goroutine exits. OS thread survives (UnlockOSThread called).
	// Pool manager receives w.id on doneCh and spawns a replacement.
}()

16.5.3 The Respawn Rate Limiter — Preventing Panic Loops

A worker that panics immediately after respawn — because the KB is corrupted, the WAM engine limit is exceeded, or the FLI call that triggers the panic is in the first query — will respawn, panic, respawn, panic, infinitely. Without rate limiting, this produces:

At 100 panics/second × 16 workers:
  1,600 log lines/second: "[Worker N] PANIC recovered"
  1,600 goroutine/OS-thread creation-and-destruction cycles/second
  Each PL_thread_attach_engine/PL_thread_destroy_engine pair:
    ~50μs of WAM internal mutex contention
  1,600 × 50μs = 80ms/second of WAM mutex contention
  All 16 live workers stalled waiting for the mutex while panicking workers
  fight over it during attach/destroy cycles
  Effective query throughput: near zero

The time.Sleep(100 * time.Millisecond) in managerLoop limits respawn rate to 10 per worker slot per second. At 16 workers respawning simultaneously: 160 attach/destroy operations per second = 8ms/second of WAM mutex contention — tolerable. The 100ms delay also provides a window for an operator to observe the panic log, identify the root cause (KB path wrong, libswipl.so not found, engine limit exceeded), and fix it before the pool resumes normal operation.

For a production deployment, the respawn delay should be exponential with a cap:

// Exponential backoff respawn — prevents panic loops from starving the pool
func (p *Pool) managerLoopWithBackoff() {
	backoff := make([]time.Duration, p.size)
	for i := range backoff {
		backoff[i] = 100 * time.Millisecond
	}
	for {
		select {
		case <-p.shutdown:
			return
		case deadID := <-p.doneCh:
			if p.stopped.Load() {
				continue
			}
			delay := backoff[deadID]
			log.Printf("[Pool Manager] Worker %d exited — respawning in %v", deadID, delay)
			time.Sleep(delay)
			p.mu.Lock()
			p.startWorker(deadID)
			p.mu.Unlock()
			// Double the backoff for this slot, cap at 30 seconds.
			backoff[deadID] = min(backoff[deadID]*2, 30*time.Second)
		}
	}
}

// On successful query completion, reset the backoff for the worker slot.
// Add to WorkResult handling in Dispatch():
//   if result.Err == nil { pool.resetBackoff(result.WorkerID) }
func (p *Pool) resetBackoff(workerID int) {
	// Sent via a separate channel to managerLoop — not implemented here for brevity.
	// Full implementation: backoffResetCh chan int in Pool struct.
}

16.5.4 What `recover()` Cannot Save

Three conditions produce fatal process termination that recover() cannot intercept:

C-level SIGSEGV from thread migration — prevented by runtime.LockOSThread() as the first goroutine statement. If the lock is missing and migration occurs, recover() never fires.
runtime.throw from the Go runtime — internal Go runtime errors (concurrent map read and map write, memory corruption detected by the runtime itself) call runtime.throw, which terminates the process without going through the panic/recover mechanism. These should not occur in well-written Go code but are listed for completeness.
os.Exit or syscall.Exit called from C code — PL_halt(0) inside libswipl.so calls exit(). If a Prolog halt/1 goal is executed via a FLI query — whether intentionally in a test or accidentally via a KB rule that calls halt — the entire Go process exits immediately. The mitigation is to never allow KB rules to call halt/0 or halt/1, enforced by a KB load-time linter that checks for these predicates before consultFile returns.

Outcome: Enterprise-Grade Polyglot Concurrency

16.6.1 The Conceptual Transition

Chapter 15's single-query CGO bridge proved that Go and the WAM can share a process and communicate in 20μs. Chapter 16 makes that proof operational at scale: 16 workers on 16 locked OS threads, each with an independent WAM engine context, collectively processing 80,000+ firewall policy decisions per second (16 workers × 5,000 queries/second/worker), with back-pressure to the HTTP layer, graceful KB reload without downtime, and self-healing worker recovery from panics — all within one Go binary, with no network, no serialisation, and no external dependencies.

Single-threaded CGO (Chapter 15)	Thread-locked worker pool (Chapter 16)
1 query at a time — serial	N queries concurrent — one per locked worker
Goroutine may be migrated between FLI calls	Each worker goroutine locked to one OS thread permanently
One WAM engine — shared by all callers (incorrect)	One WAM engine per worker — fully independent stacks
No back-pressure mechanism	Bounded channel applies back-pressure to HTTP handlers
Panic kills the Go binary	Worker panic recovered — engine destroyed, replacement spawned
KB reload requires process restart	KB reload via ControlReload broadcast — rolling, online
SIGSEGV risk from scheduler migration	SIGSEGV risk eliminated by first-statement LockOSThread

16.6.2 Verification Checklist

# Build with pool
logicadmin@logic-node-01:~$ cd /opt/logic-node/go/firewall-bridge
logicadmin@logic-node-01:~$ CGO_ENABLED=1 go build -v -o firewall-bridge .

# Start the server
logicadmin@logic-node-01:~$ ./firewall-bridge &
[Pool] Starting 14 workers (GOMAXPROCS=16, -2 reserved)
[Worker 0] WAM engine attached (OS thread locked)
...
[Worker 13] WAM engine attached (OS thread locked)
Firewall bridge listening on :8080 (pool size: 14)

# 1. Verify all workers started and locked
logicadmin@logic-node-01:~$ curl -s localhost:8080/status | jq .
{
  "PoolSize": 14,
  "QueueDepth": 0,
  "QueueCap": 14,
  "PctSaturated": 0,
  "Stopped": false
}

# 2. Single query — verify correct verdict and worker ID in response
logicadmin@logic-node-01:~$ curl -s -X POST localhost:8080/firewall \
  -d '{"SourceIP":"10.0.1.5","DestPort":443,"Protocol":"tcp"}' | jq .
{
  "allowed": true,
  "reason": "whitelist_match",
  "rule_id": 1,
  "worker_id": 3,
  "latency_us": 19
}

# 3. Concurrent load test — 10,000 requests, 100 concurrent
logicadmin@logic-node-01:~$ hey -n 10000 -c 100 -m POST \
  -H "Content-Type: application/json" \
  -d '{"SourceIP":"10.0.1.5","DestPort":443,"Protocol":"tcp"}' \
  http://localhost:8080/firewall
Summary:
  Total: 2.43s
  Requests/sec: 4115.2
  Mean latency: 24.1ms  ← includes HTTP overhead + queue wait + 19μs WAM query
  99th percentile: 38.2ms
  Errors: 0

# 4. KB reload while under load — no queries fail
logicadmin@logic-node-01:~$ curl -s localhost:8080/reload
{"status":"reload_broadcast"}
# Workers reload asynchronously — no downtime, no error spike in hey output

# 5. Panic recovery — inject a bad request to trigger panic in worker
# (Requires a debug endpoint that calls panic() directly — testing only)
logicadmin@logic-node-01:~$ curl -s localhost:8080/debug/panic-worker
[Worker 7] PANIC recovered: injected test panic — destroying engine and respawning
[Pool Manager] Worker 7 exited — respawning
[Worker 7] WAM engine attached (OS thread locked)
# Pool remains at 14 workers within ~100ms

# 6. Verify pool self-healed
logicadmin@logic-node-01:~$ curl -s localhost:8080/status | jq .PoolSize
14   # ✓ Still 14 workers

# 7. Graceful shutdown — all workers exit cleanly
logicadmin@logic-node-01:~$ kill -SIGTERM $(pgrep firewall-bridge)
[Pool] Shutdown complete
# No goroutine leaks, no OS thread leaks, no WAM engine leaks

16.6.3 What Comes Next

The worker pool established in this chapter executes one query per worker per dispatch cycle. Each firewall_verdict/4 query is deterministic: it either finds a matching CIDR rule or falls through to the default deny. The SLD resolution tree is shallow — at most a handful of unification steps before a cut terminates the search. This is the correct model for IP matching against a bounded rule set.

The model breaks when applied to infrastructure topology queries: "find all reachable hosts from node X", "does a path exist between VLAN 10 and VLAN 30 through any combination of firewall rules and routing entries?" These queries traverse graphs. In pure SLD resolution, graph traversal without cycle detection loops infinitely when the edge relation contains cycles — and any real network topology contains cycles. A query that loops infinitely inside a worker's WAM engine saturates the worker's local stack, triggers a WAM stack overflow, and arrives at the panic recovery path from Section 16.5 for the wrong reasons.

Chapter 17 introduces tabling (SLG resolution) via library(tabling). Tabled predicates memoize their answers in a per-call table stored on the WAM global heap. When the same subgoal is encountered during recursive descent — the cycle case — the tabling mechanism detects the repeat and consults the memo table rather than recursing again. The result is termination-guaranteed graph reachability over arbitrary network topologies, computed once per query and cached for subsequent calls, at the cost of global heap space proportional to the number of distinct subgoal answers. The worker pool from Chapter 16 delivers tabled queries to the embedded WAM pool in Chapter 17 without modification — the Dispatch API is stable, the FLI contract is unchanged, only the Prolog predicate changes from a cut-driven deterministic rule to a tabled recursive one.

Chapter Summary

Concept	Operational Definition	Performance / Security Consequence
Go M:N scheduler	M goroutines multiplexed onto N OS threads; goroutines migrate freely between threads	WAM OS-thread-local engine state is invalid after migration — SIGSEGV or silent wrong result
`runtime.LockOSThread()`	Pins goroutine to current OS thread; scheduler cannot migrate it	Must be first statement in worker goroutine; any code before it is a migration risk
One engine per locked thread	`PL_thread_attach_engine(nil)` called after `LockOSThread()`	N workers = N independent local/global/trail stacks; shared clause database is read-only
`runtime.UnlockOSThread()` in defer	Returns OS thread to Go runtime pool after worker exit	Without unlock: Go runtime terminates the OS thread, permanently shrinking thread pool
Bounded dispatch channel	`make(chan WorkItem, poolSize)` — depth equals pool size	Blocks producers when pool saturated; maximum in-flight = 2×poolSize queries
Worker inCh depth 1	Each worker's personal channel holds at most one pending item	Worker processes serially; at-most-one-queued-per-worker prevents memory accumulation
`ControlReload` broadcast	N copies sent to shared `ctrlCh`; each worker reloads between queries	Rolling reload: no downtime; consistency window = size × avg_query_time (≤320μs at 16 workers)
`ControlShutdown` broadcast	Workers exit cleanly after current query; defer fires, engine destroyed	Pool drains in-flight queries before shutdown; no open queries at exit
`recover()` in worker defer	Catches Go-level panics; C SIGSEGV is NOT catchable — prevented by LockOSThread	Worker panic does not kill binary; pool manager respawns replacement
`PL_thread_destroy_engine()` in recovery	Releases WAM engine from OS thread after panic; all open frames/queries closed	Engine state after panic is undefined — destroy is the only safe action
Respawn backoff (100ms, exponential)	Delays replacement after panic; caps respawn rate per worker slot	Prevents panic loop from monopolising WAM mutex; limits to 160 attach/destroy ops/sec at 16 workers
`halt/0` KB linter	Check loaded KB for `halt/1` calls before `consultFile` returns	`PL_halt()` calls `exit()` — kills entire Go binary, not just the worker; `recover()` cannot intercept
Pool size = `GOMAXPROCS - 2`	N workers = N locked OS threads = N physical CPU cores minus overhead	Avoids over-subscription; N > GOMAXPROCS causes locked threads to compete for CPU cores
`PL_create_engine` + `PL_set_engine`	Pre-allocates WAM engine with explicit local/global stack bounds before attaching to OS thread	Per-worker 64MB cap isolates runaway queries; prevents one worker stack overflow from pressuring 15 others; 16 workers x 128MB = 2GB total max WAM allocation
NUMA node binding (`numactl`/`CPUAffinity`)	Pins all worker OS threads and WAM stack allocations to one NUMA node	Eliminates QPI/UPI cross-socket latency (3-4x penalty on dual-socket); WAM local stack pages stay hot in node 0 L3 cache; 40-60% throughput improvement vs. unbound on dual-socket

Exercises

Exercise 16.1 — Pool Size Empirical Optimum Write a benchmark BenchmarkPoolThroughput that creates pools of size 1, 2, 4, 8, GOMAXPROCS, GOMAXPROCS+4, and GOMAXPROCS*2, runs 10,000 Dispatch calls against each, and records median latency and total throughput. Plot the throughput curve. Identify the pool size at which throughput plateaus — this is the empirical optimum for the test machine. Explain why GOMAXPROCS*2 is not twice the throughput of GOMAXPROCS (hint: locked OS threads beyond GOMAXPROCS contend for CPU cores with each other).

Exercise 16.2 — Per-Worker Predicate Handle Cache Chapter 14 noted that PL_predicate() performs a module table hash lookup on every call. In the single-threaded bridge, caching predicate_t as a sync.Once-initialised package variable was safe. In the pool, each worker has its own WAM engine — predicate_t handles from one engine may not be valid in another. Implement a per-worker predicate cache stored in the worker struct as a map[string]C.predicate_t. Populate it on first query per worker (lazy initialisation). Verify with a benchmark that the per-worker cache produces the same throughput improvement as the global cache from Chapter 14, and that no worker uses another worker's predicate_t handle.

Exercise 16.3 — Strict Atomic KB Cutover Implement the version-tagged clause database approach described in Section 16.4.2. Use a firewall_verdict_v1/4 and firewall_verdict_v2/4 naming scheme. Store the active version as a sync/atomic.Int32 in the Pool struct. Implement AtomicReload(path string) that: (1) loads the new rules as version N+1; (2) broadcasts ControlSync and waits for all workers to acknowledge (confirming all in-flight queries are complete); (3) atomically increments the version integer; (4) broadcasts ControlReload so workers pick up the new version number. Measure the maximum consistency window under 5,000 requests/second load — it should be zero for atomic cutover.

Exercise 16.4 — halt KB Linter Implement lintKB(path string) error that loads the Prolog file into a temporary engine, queries predicate_property(halt(_), defined) and predicate_property(halt, defined), and returns an error if either is true in the user module. Integrate lintKB into consultFile — if the lint fails, consultFile returns an error and the KB is not loaded. Write a test KB file containing :- halt. as a directive and verify that consultFile rejects it before executing the directive.

Exercise 16.5 — Health Check Endpoint with Per-Worker Round-Trip The /status endpoint returns aggregate pool metrics from channel lengths. Implement /healthz that performs a strict per-worker liveness check: dispatches one trivial query (true/0 via FLI) to each worker slot sequentially, measures the round-trip time, and returns a JSON object with per-worker status ({worker_id, latency_us, status: "ok"|"timeout"}). A worker that fails to respond within 50ms is marked as "timeout". The health check must complete within poolSize × 50ms maximum. Use the Dispatch timeout mechanism — do not add a new channel type for health checks.