Chapter 16: The Go-Log Concurrency Model
Textbook: Modern SWI-Prolog (2026 Edition): Sovereign Infrastructure & Industrial Logic Volume: III — Scaling & Concurrency Chapter: 16 of 24 Audience: Senior Engineers, Systems Architects, Infrastructure Security Practitioners Prerequisites: Chapters 1–15 complete.
cgo_bridge.go,marshal.c,firewall_policy.ploperational at/opt/logic-node/go/firewall-bridge/.libswiplheaders at/usr/lib/swi-prolog/include/. Go 1.22+.logicadminuser active.
Core Concepts
Chapter 15 established a single working CGO bridge: one goroutine, pinned to one OS thread, attaching one WAM engine context, executing one firewall_verdict/4 query at a time. Under the load profile of a single operator issuing manual policy checks, this is correct and sufficient. Under the load profile of a Go HTTP server receiving 5,000 concurrent firewall policy decisions per second from an upstream load balancer, it is a serialisation bottleneck that negates every advantage the embedded WAM provides.
The path from one working CGO goroutine to a pool of N concurrent CGO goroutines is not "spin up N goroutines." Go's M:N scheduler, the WAM's OS-thread affinity model, and the term_t stack-index lifetime guarantee combine to create an impedance mismatch that produces silent data corruption and segmentation faults unless the pool architecture explicitly accounts for all three constraints simultaneously. The failure mode is not an error at pool startup — it is a SIGSEGV delivered to a worker goroutine at an arbitrary future point, under load, after the scheduler happens to migrate the goroutine between OS threads at the wrong moment.
Five properties define the thread-locked worker pool as the only correct concurrency architecture for the CGO bridge under Go's scheduler.
1. Go's M:N scheduler is incompatible with thread-local WAM state by default.
Go multiplexes M goroutines onto N OS threads, where N is controlled by GOMAXPROCS (default: number of CPU cores). The scheduler preempts goroutines at function call boundaries and safe points, and is free to resume a preempted goroutine on any available OS thread in the pool. A goroutine that called PL_thread_attach_engine(NULL) on OS thread 17 accumulates WAM engine state — local stack frames, variable bindings, term_t handles — associated with thread 17's WAM context. If the scheduler migrates this goroutine to OS thread 22 between two FLI calls, the next FLI call executes with thread 22's WAM context (a different engine, potentially uninitialised). The term_t handles allocated on thread 17's local stack are now indices into thread 22's local stack. The result is undefined behaviour at the C layer — typically a SIGSEGV on the first PL_get_* call that dereferences the stale stack index.
2. runtime.LockOSThread() is not optional — it is the architectural primitive.
runtime.LockOSThread() pins the calling goroutine to its current OS thread permanently (until runtime.UnlockOSThread() is called or the goroutine exits). The Go scheduler will not migrate a locked goroutine to another thread. For a WAM worker goroutine, runtime.LockOSThread() must be the first statement in the goroutine body — before PL_thread_attach_engine, before any FLI call, before any code that could trigger a scheduler preemption. A runtime.LockOSThread() call after the first FLI call is too late: the goroutine may already have migrated between the goroutine start and the lock call.
3. The pool size is fixed at the OS-thread budget, not the goroutine budget.
Each worker goroutine in the pool locks one OS thread for its lifetime. N workers = N locked OS threads. The OS scheduler sees N real kernel threads, each with its own WAM engine context. The Go scheduler sees N goroutines that it cannot migrate. The correct pool size is the number of physical CPU cores minus the threads reserved for the main goroutine, the logger thread, and the Go runtime's background goroutines — typically GOMAXPROCS - 2. Allocating more workers than CPU cores creates N locked OS threads competing for M cores, defeating the purpose of the WAM's per-engine local stack locality.
4. Query dispatch uses a bounded Go channel, not a mutex or a sync.WaitGroup.
Incoming FirewallRequest structs from HTTP handlers are dispatched to the worker pool via a single buffered channel. Each worker goroutine owns one request-reply channel: the dispatcher sends a WorkItem (request + reply channel) to the worker's inbound channel; the worker executes the FLI query and sends the FirewallVerdict back on the reply channel. The dispatcher blocks if all workers are busy — back-pressure is applied to the HTTP handler, which applies it to the upstream connection. A bounded channel with depth equal to the pool size ensures that the maximum number of in-flight WAM queries is bounded, preventing heap exhaustion from queued-but-unprocessed requests.
5. Engine recovery requires destroying and recreating the full OS-thread/engine pair.
When a worker goroutine panics — from a C-level WAM stack overflow, an unhandled FLI exception, or a null pointer in marshal code — Go's recover() mechanism catches the panic at the deferred function boundary. At that point, the WAM engine on the current OS thread is in an undefined state: it may have an open query, an open foreign frame, a pending exception, or a partially-written term on the local stack. The engine cannot be salvaged by closing the query or discarding the frame. The correct recovery sequence is: call C.PL_thread_destroy_engine() to release the WAM engine context from this OS thread, call runtime.UnlockOSThread() to release the OS thread back to the Go scheduler, and signal the pool manager to spawn a replacement worker on a fresh OS thread with a fresh WAM engine. This sequence restores the pool to its target size without restarting the Go binary.
Chapter Roadmap
| Section | Title | Focus |
|---|---|---|
| 16.1 | Scheduler Impedance Mismatch | M:N model, OS-thread affinity, migration failure mode, SIGSEGV mechanics |
| 16.2 | Thread-Locked Worker Architecture | Pool design, channel topology, engine-per-thread invariant |
| 16.3 | The Build: engine_pool.go |
Worker struct, workerLoop, Dispatcher, full implementation |
| 16.4 | Broadcast Control Channels | KB reload, cache flush, ControlMsg dispatch, ordered shutdown |
| 16.5 | Security: Panic Isolation and Engine Recovery | Panic anatomy, recover() + PL_thread_destroy_engine, respawn protocol |
| Outcome | Enterprise-Grade Polyglot Concurrency | Verification checklist, latency under load |
16.1 The Scheduler Impedance Mismatch
16.1.1 Go's M:N Scheduler — The Moving-Thread Problem
Go's runtime implements M:N green-thread scheduling: M goroutines are multiplexed across N OS threads (runtime.GOMAXPROCS(0) OS threads by default, set to the number of logical CPUs at startup). The scheduler is cooperative-preemptive: goroutines yield at function call sites, channel operations, and explicit runtime.Gosched() calls, but the Go 1.14+ asynchronous preemption mechanism also inserts preemption points via OS signals (SIGURG on Linux) at any safe point, including the middle of a long-running loop.
When the scheduler preempts a goroutine, it saves the goroutine's register state and stack pointer, and is free to schedule it later on any available OS thread from the pool. The goroutine is unaware of which OS thread it resumes on. For pure Go code, this is correct — the Go memory model guarantees that all Go heap accesses are safe regardless of which OS thread executes them.
For CGO code that has accumulated OS-thread-local C state, this model is catastrophically wrong:
Timeline of a scheduler migration under WAM:
T=0ms Goroutine G starts on OS thread M17.
runtime.LockOSThread() NOT called (incorrect).
C.PL_thread_attach_engine(nil) → WAM engine E17 attached to M17.
fid := C.PL_open_foreign_frame() → frame F opened on E17's local stack.
args := C.PL_new_term_refs(4) → slots 0-3 on E17's local stack frame F.
C.make_firewall_dict(args, ...) → term data written to E17's local/global stacks.
T=0.1ms Go scheduler preempts G at a cgo return boundary.
G's Go stack saved. G moved to run queue.
OS thread M17 picks up a different goroutine G2.
T=0.2ms Scheduler resumes G on OS thread M22.
M22 has a different WAM engine E22 (or no engine at all).
T=0.2ms G calls C.PL_open_query(nil, flags, pred, args).
Inside libswipl: current engine = E22 (looked up via pthread_self()).
'args' is still the term_t value from E17's local stack frame F.
E22's local stack is at a completely different memory address.
PL_open_query dereferences 'args' as an index into E22's local stack.
E22's local stack at that index contains unrelated data or is unmapped.
RESULT: SIGSEGV. Go binary dead.
Or, if E22 happens to have valid memory at that index:
Silent data corruption — query executes against wrong term values.
Firewall verdict is computed for the wrong IP address.
No error is returned. The Go caller receives a plausible but incorrect result.
The silent corruption case is worse than the SIGSEGV. A SIGSEGV kills the binary and forces an investigation. Silent wrong results pass through the system, log to the audit trail as successful queries, and surface only when a miscategorised IP address causes a production incident.
16.1.2 runtime.LockOSThread() — The Pin
func runtime.LockOSThread()
The pin is implemented at the g (goroutine) struct level in the Go runtime: g.lockedm is set to the current m (machine/OS thread), and m.lockedg is set to the current g. The scheduler's findrunnable() function skips OS threads with lockedg set — the thread runs only the locked goroutine until unlocked. There is no Go scheduler path that can migrate a locked goroutine.
runtime.LockOSThread() permanently wires the calling goroutine to its current OS thread. No other goroutine will execute on that thread. The goroutine will continue to execute on that specific OS thread until runtime.UnlockOSThread() is called the same number of times as runtime.LockOSThread(). If the goroutine exits without calling runtime.UnlockOSThread(), the Go runtime terminates the OS thread entirely — permanently removing it from the available thread pool. For worker goroutines designed to exit and be replaced, defer runtime.UnlockOSThread() is therefore mandatory: it preserves the OS thread for the replacement worker's use.
The placement requirement is absolute:
// WRONG: LockOSThread after any code that could trigger preemption
func workerLoopIncorrect() {
// Goroutine may be on M17 at goroutine start.
doSomeSetup() // <- any function call is a preemption point
// Goroutine may now be on M22 due to preemption during doSomeSetup().
runtime.LockOSThread() // Too late: locks M22, not M17
C.PL_thread_attach_engine(nil) // Engine attached to M22 — correct
// ... but if any prior code left state on M17, it is inaccessible
}
// CORRECT: LockOSThread as the absolute first statement
func workerLoopCorrect() {
runtime.LockOSThread() // First. No preemption possible before this.
defer runtime.UnlockOSThread() // Released on goroutine exit — thread returned to pool
C.PL_thread_attach_engine(nil) // Engine attached to THIS thread — guaranteed stable
// All subsequent FLI calls execute on the same OS thread.
}
The defer runtime.UnlockOSThread() is equally important: when a locked goroutine exits without calling runtime.UnlockOSThread(), the Go runtime terminates the OS thread entirely. For a pool where workers are expected to restart after panic recovery, defer runtime.UnlockOSThread() preserves the OS thread — the replacement worker goroutine inherits a fresh, unlocked thread from the pool rather than forcing the runtime to create a new one.
16.1.2.1 The Residual Migration Problem: NUMA Node Affinity
runtime.LockOSThread() prevents the Go scheduler from migrating the goroutine to a different OS thread. It does not prevent the Linux kernel's CPU scheduler from migrating the OS thread itself to a different physical CPU core. On a single-socket server this distinction is irrelevant — all cores share the same L3 cache. On a dual-socket server or a Proxmox VM pinned to a NUMA node with vCPUs from both sockets, the Linux scheduler is free to migrate the locked OS thread from a core on NUMA node 0 to a core on NUMA node 1 between WAM instruction dispatch cycles.
The WAM's performance depends heavily on L1/L2 cache locality for its local stack traversals. A worker's WAM local stack occupies a contiguous region of physical memory — pages allocated by the WAM's mmap call during PL_thread_attach_engine. Those pages are faulted in on first access and placed in the physical memory of the NUMA node that serviced the fault. If the page faults occurred on node 0, the pages live in node 0's memory controllers. When the Linux scheduler migrates the OS thread to a core on node 1, every subsequent WAM local stack access crosses the QPI/UPI interconnect — 3–4× higher latency than local DRAM access on a dual-socket EPYC or Xeon system. For a tight findall/3 loop over 30,000 tutorial_fact/3 clauses, cross-NUMA stack access degrades throughput by 40–60% compared to the local-NUMA baseline.
runtime.LockOSThread() alone is not sufficient to prevent this. OS-level NUMA binding is required:
# Option A — numactl: bind the entire Go binary to NUMA node 0.
# All worker OS threads, WAM engine stack allocations, and Go heap pages
# are placed on node 0's memory controllers. QPI crossings: zero.
logicadmin@logic-node-01:~$ numactl --cpunodebind=0 --membind=0 \
./firewall-bridge
# Option B — systemd service unit (persistent, no wrapper script):
# /etc/systemd/system/firewall-bridge.service [Service] section:
# CPUAffinity=0-15 ← physical cores on NUMA node 0 (verify with lscpu)
# NUMAPolicy=bind ← Linux ≥ 5.13: enforce NUMA memory policy
# NUMAMask=0 ← bind to NUMA node 0
# Verify after startup: >99% Numa_Hit on node 0 means no cross-NUMA traffic
logicadmin@logic-node-01:~$ numastat -p $(pgrep firewall-bridge)
Node 0 Node 1
Numa_Hit 98234018 147 ← 99.9% local: correct
Numa_Miss 89 0
The NUMA binding complements runtime.LockOSThread(): the lock eliminates goroutine migration (Go-level); the NUMA binding eliminates OS-thread migration between sockets (kernel-level). Together they guarantee that a worker goroutine executes on the same physical CPU core family, with its WAM local stack hot in that socket's L3 cache, for the duration of the worker's lifetime.
16.1.3 The SIGSEGV Anatomy — Memory Layout at Fault Time
When a goroutine executes an FLI call after scheduler migration, the actual SIGSEGV occurs inside libswipl.so at the point where the WAM dereferences a term_t value as a local stack offset:
// Pseudocode of PL_get_chars internals (simplified):
int PL_get_chars(term_t t, char **s, unsigned flags) {
PL_local_data_t *ld = LD; // LD = current thread's WAM local data
// Fetched via pthread_getspecific() using
// the WAM's thread-local storage key.
// After migration: LD points to M22's WAM data.
Word *p = valTermRef(t); // p = &ld->stacks.local.base[t]
// 't' was allocated on M17's local stack.
// M22's local stack base is a different address.
// p = (arbitrary address) + t_offset
// ... dereference p ... // SIGSEGV if p is outside mapped memory
}
The Go runtime catches SIGSEGV as a panic only if it originates from Go code. A SIGSEGV inside libswipl.so — native C code — is delivered as a fatal signal to the process. Go's panic/recover mechanism does not intercept C-level signals. The process dies with no stacktrace from the Prolog side, only a Go runtime crash dump showing the signal delivery point somewhere inside the libswipl shared library.
This is why Section 16.5's panic isolation strategy addresses Go-level panics (which recover() catches) but cannot address C-level SIGSEGV from incorrect thread migration — the only prevention is correct runtime.LockOSThread() placement before any FLI call.
16.2 The Thread-Locked Worker Architecture
16.2.1 Design Constraints
The pool architecture must satisfy four simultaneous constraints:
-
One WAM engine per OS thread, for the lifetime of the worker. Each worker goroutine holds one locked OS thread and one
PL_thread_attach_enginecontext. The engine is initialised once, used for all queries in the worker's lifetime, and destroyed only on shutdown or panic recovery. -
No shared mutable state between worker goroutines. Workers share the WAM clause database (read-only in normal operation) but have independent local stacks, global stacks, and trail stacks. The
tutorial_fact/3KB andfirewall_verdict/4rules are read-only — concurrent reads are safe without synchronisation. No worker callsassertzorretract. -
Bounded in-flight query count. The dispatcher channel depth equals the pool size. When all workers are busy, the dispatcher blocks the HTTP handler goroutine, which blocks the net/http connection read, which applies back-pressure to the TCP connection. The system degrades gracefully under overload — it slows down without exhausting memory.
-
The pool must self-heal. A panicking worker must not reduce the pool below its target size. The pool manager monitors worker exit signals and spawns replacements. The HTTP handler is unaware of worker restarts — it sees only the dispatcher channel.
16.2.2 Diagram: Thread-Locked Engine Pool
%%{init: {"themeVariables": {"fontSize": "14px"}}}%%
flowchart TD
H1["HTTP Handler 1\nnet/http goroutine\nFree-floating — scheduler may migrate\nSends WorkItem to dispatchCh"]
H2["HTTP Handler 2\nnet/http goroutine\nBlocks on dispatchCh if pool full\nBack-pressure to TCP connection"]
HN["HTTP Handler N\nnet/http goroutine\nBlocked by back-pressure\nNo new WAM allocation until worker free"]
DISP["Dispatcher\ndispatchCh chan WorkItem\nbuffered depth = poolSize\nBlocks producers when full\nFanout: one WorkItem per worker"]
CTRL["Pool Manager\nmonitors workerDone chan\nrespawns on worker exit\nmaintains target pool size\nbroadcasts ControlMsg via ctrlCh"]
W1["Worker 1\nruntime.LockOSThread() — first line\nPL_thread_attach_engine()\nOwns WAM Engine E1\nOwnss OS Thread M_k1\nProcesses WorkItems serially"]
W2["Worker 2\nruntime.LockOSThread()\nPL_thread_attach_engine()\nOwns WAM Engine E2\nOwns OS Thread M_k2"]
WN["Worker N\nruntime.LockOSThread()\nPL_thread_attach_engine()\nOwns WAM Engine EN\nOwns OS Thread M_kN"]
E1["WAM Engine E1\nLocal Stack: private\nGlobal Stack: private\nClause DB: SHARED read-only\nAtom Table: SHARED read-only"]
E2["WAM Engine E2\nIndependent stacks\nSame shared DB/Atoms"]
EN["WAM Engine EN\nIndependent stacks\nSame shared DB/Atoms"]
DB["Shared Clause Database\ntutorial_fact/3\nfirewall_verdict/4\nAll loaded .pl rules\nRead-only from workers"]
H1 --->|"WorkItem{req, replyCh}"| DISP
H2 --->|"WorkItem{req, replyCh}"| DISP
HN --->|"blocks — back-pressure"| DISP
DISP --->|"WorkItem to worker inCh"| W1
DISP --->|"WorkItem to worker inCh"| W2
DISP --->|"WorkItem to worker inCh"| WN
CTRL --->|"ControlMsg via ctrlCh"| W1
CTRL --->|"ControlMsg via ctrlCh"| W2
CTRL --->|"ControlMsg via ctrlCh"| WN
CTRL --->|"respawn on workerDone signal"| W1
W1 --->|"attaches"| E1
W2 --->|"attaches"| E2
WN --->|"attaches"| EN
E1 --->|"reads"| DB
E2 --->|"reads"| DB
EN --->|"reads"| DB
style H1 fill:#1A2B4A,color:#FFFFFF
style H2 fill:#1A2B4A,color:#FFFFFF
style HN fill:#1A2B4A,color:#FFFFFF
style DISP fill:#7A1A1A,color:#FFFFFF
style CTRL fill:#8B6914,color:#FFFFFF
style W1 fill:#1A4070,color:#FFFFFF
style W2 fill:#1A4070,color:#FFFFFF
style WN fill:#1A4070,color:#FFFFFF
style E1 fill:#2A5A2A,color:#FFFFFF
style E2 fill:#2A5A2A,color:#FFFFFF
style EN fill:#2A5A2A,color:#FFFFFF
style DB fill:#1A6B3A,color:#FFFFFF
Reading the diagram: Dark blue HTTP handlers are free-floating goroutines — the Go scheduler may migrate them freely because they never touch the WAM. The red dispatcher is the single back-pressure point — it blocks all producers when the pool is saturated. Blue workers are the WAM boundary — each is pinned to one OS thread and owns one green WAM engine. The bright green clause database is shared and read-only — no synchronisation required for concurrent reads.
16.3 The Build: engine_pool.go
16.3.1 Package Structure
/opt/logic-node/go/firewall-bridge/
├── cgo_bridge.go (Chapter 15 — single-query FLI, QueryFirewall)
├── marshal.c / .h (Chapter 15 — make_firewall_dict, init_atoms)
├── engine_pool.go ← this chapter — worker pool, dispatcher, control channel
├── main.go (updated — uses Pool.Dispatch instead of QueryFirewall)
└── go.mod
16.3.2 Implementation
// File: /opt/logic-node/go/firewall-bridge/engine_pool.go
//
// Thread-locked WAM worker pool.
//
// ARCHITECTURE CONTRACT:
// — Each worker goroutine is pinned to one OS thread via runtime.LockOSThread().
// — LockOSThread() is the FIRST statement in workerLoop() — before any FLI call,
// before any function call that could trigger a preemption point.
// — Each worker attaches one WAM engine via C.PL_thread_attach_engine(nil).
// — Workers share the WAM clause database (read-only).
// — Workers have independent local, global, and trail stacks.
// — All inter-goroutine communication is via typed Go channels.
// — No mutex protects WAM state — no shared WAM state exists between workers.
//
// POOL SIZING:
// poolSize = GOMAXPROCS - 2
// (minus 2: one for main goroutine, one for pool manager and Go runtime GC)
// Never exceed GOMAXPROCS — locked OS threads beyond CPU count thrash the scheduler.
//
// RECOVERY CONTRACT:
// — workerLoop defers a recovery function that:
// 1. Catches any Go panic (including those propagated from cgo).
// 2. Calls C.PL_thread_destroy_engine() — releases WAM engine from OS thread.
// 3. Calls runtime.UnlockOSThread() — returns OS thread to Go runtime pool.
// 4. Signals pool manager via workerDone channel.
// — Pool manager respawns a replacement worker to maintain pool size.
// — Panicking worker never reduces the pool below target size for more than
// the time required to start a new goroutine and attach a fresh WAM engine.
package main
/*
#cgo CFLAGS: -I/usr/lib/swi-prolog/include -D_FORTIFY_SOURCE=2 -fstack-protector-strong
#cgo LDFLAGS: -lswipl -L/usr/lib/swi-prolog/lib/x86_64-linux -Wl,-z,relro,-z,now
#include <SWI-Prolog.h>
#include <stdlib.h>
#include "marshal.h"
// destroy_engine_safe: calls PL_thread_destroy_engine with a NULL check.
// PL_thread_destroy_engine is undefined behaviour if called on a thread
// that never successfully called PL_thread_attach_engine. The attached
// flag parameter allows the caller to guard this correctly.
static void destroy_engine_safe(int attached) {
if (attached) {
PL_thread_destroy_engine();
}
}
*/
import "C"
import (
"errors"
"fmt"
"log"
"runtime"
"sync"
"sync/atomic"
"time"
)
// ─────────────────────────────────────────────────────────────────────────────
// TYPES
// ─────────────────────────────────────────────────────────────────────────────
// WorkItem is the unit dispatched to a worker.
// replyCh is a caller-owned, unbuffered channel.
// The caller blocks on replyCh until the worker sends the verdict.
type WorkItem struct {
Req FirewallRequest
ReplyCh chan<- WorkResult
}
// WorkResult carries the worker's response back to the caller.
// Exactly one of Verdict or Err will be non-zero.
type WorkResult struct {
Verdict FirewallVerdict
Err error
WorkerID int // Which worker processed this — for tracing
Latency time.Duration
}
// ControlMsg is sent to all workers via the broadcast control channel.
// Workers act on it between query processing iterations.
type ControlMsg struct {
Kind ControlKind
Payload string // Optional — e.g., new KB path for Reload
}
type ControlKind int
const (
ControlSync ControlKind = iota // No-op ping — verify worker is alive
ControlReload // Reload KB from Payload path
ControlShutdown // Graceful shutdown — worker exits loop cleanly
)
// ─────────────────────────────────────────────────────────────────────────────
// WORKER
// ─────────────────────────────────────────────────────────────────────────────
// worker is an internal struct owned by the pool manager.
// It is not exported — callers interact only with Pool.
type worker struct {
id int
inCh chan WorkItem // Inbound work — capacity 1 (worker processes one at a time)
ctrlCh chan ControlMsg // Broadcast control messages
doneCh chan<- int // Signals pool manager on exit — sends worker.id
}
// workerLoop is the goroutine body for each pool worker.
// INVARIANT: runtime.LockOSThread() is called BEFORE any other statement.
// This function never returns normally — it loops until ControlShutdown or panic.
func workerLoop(w worker) {
// ── STEP 1: Pin goroutine to current OS thread. ──────────────────────────
// This must be the first statement. Any function call before this is a
// potential preemption point that could migrate the goroutine to a
// different OS thread before the WAM engine is attached.
runtime.LockOSThread()
// engineAttached tracks whether PL_thread_attach_engine succeeded.
// The deferred recovery function uses this to guard PL_thread_destroy_engine.
engineAttached := false
// ── STEP 2: Register deferred recovery. ─────────────────────────────────
// Deferred functions run on panic unwind. The recovery function MUST be
// registered before PL_thread_attach_engine — if attach itself panics
// (e.g., engine limit exceeded), the recovery must still run.
defer func() {
if r := recover(); r != nil {
log.Printf("[Worker %d] PANIC recovered: %v — destroying engine and respawning",
w.id, r)
}
// Destroy WAM engine on this OS thread (if one was successfully attached).
// PL_thread_destroy_engine releases all WAM state for this thread:
// local stack, global stack, trail, pending exceptions, open queries.
// After this call, no FLI function may be called on this OS thread.
C.destroy_engine_safe(C.int(func() int {
if engineAttached {
return 1
}
return 0
}()))
// Release the OS thread back to the Go runtime's thread pool.
// Without this, the dying goroutine's OS thread is terminated by the
// runtime, shrinking the available thread pool by one permanently.
runtime.UnlockOSThread()
// Signal pool manager: this worker has exited, pool needs a replacement.
// Non-blocking send — pool manager may have already initiated shutdown.
select {
case w.doneCh <- w.id:
default:
}
}()
// ── STEP 3: Attach WAM engine to this OS thread. ─────────────────────────
// PL_thread_attach_engine(nil) uses global default WAM stack sizes:
// Local stack: typically 128MB (controlled by --stack-limit at startup)
// Global stack: shared with local stack limit
// Trail stack: embedded in local stack
//
// In production, use PL_create_engine() with an explicit PL_engine_t attrs
// struct to bound each worker's stack independently of the global default.
// See Section 16.3.2.1 for the bounded-stack pattern.
// For this implementation, nil uses the global limit set by InitEngine's
// PL_initialise argv: "--stack-limit=64M" (matching Chapter 14's worker_stack_mb).
if rc := C.PL_thread_attach_engine(nil); rc < 0 {
// Negative return: engine limit exceeded or out of memory.
// Panic here — the deferred recovery will clean up.
panic(fmt.Sprintf("worker %d: PL_thread_attach_engine failed (rc=%d)", w.id, rc))
}
engineAttached = true
log.Printf("[Worker %d] WAM engine attached (OS thread locked)", w.id)
// ── STEP 4: Query processing loop. ───────────────────────────────────────
// select on both work channel and control channel.
// Work items are processed one at a time — the worker is serialised.
// No two queries share this worker's WAM engine simultaneously.
for {
select {
case item, ok := <-w.inCh:
if !ok {
// Channel closed — pool manager is shutting down this worker.
return
}
start := time.Now()
verdict, err := QueryFirewall(item.Req)
// QueryFirewall from cgo_bridge.go already handles:
// - PL_open_foreign_frame / PL_discard_foreign_frame per call
// - PL_exception extraction before PL_close_query
// - C.CString / C.free discipline
// The worker loop does not need to manage FLI lifecycle — QueryFirewall
// encapsulates the full frame-open/query/extract/close/frame-discard cycle.
item.ReplyCh <- WorkResult{
Verdict: verdict,
Err: err,
WorkerID: w.id,
Latency: time.Since(start),
}
case ctrl := <-w.ctrlCh:
if err := handleControl(w.id, ctrl); err != nil {
// handleControl returns non-nil only for ControlShutdown.
// Clean exit — no panic, no engine destruction needed for normal shutdown.
// The defer still fires and calls destroy_engine_safe — correct.
log.Printf("[Worker %d] Shutdown on control signal", w.id)
return
}
}
}
}
// handleControl processes a ControlMsg on the calling worker.
// Returns non-nil to signal the worker loop to exit cleanly (ControlShutdown).
// Returns nil to signal the worker loop to continue (ControlSync, ControlReload).
func handleControl(workerID int, ctrl ControlMsg) error {
switch ctrl.Kind {
case ControlShutdown:
return errors.New("shutdown")
case ControlSync:
// Ping — no action required. The fact that this case was reached
// proves the worker goroutine is alive and on its locked OS thread.
log.Printf("[Worker %d] Sync acknowledged", workerID)
return nil
case ControlReload:
// Reload the KB on this worker's WAM engine.
// consultFile from cgo_bridge.go is safe to call here — it uses
// PL_open_foreign_frame / PL_discard_foreign_frame internally.
// The reload affects only this worker's engine — other workers
// continue to use their current KB state until they receive
// the same ControlReload message via the broadcast.
log.Printf("[Worker %d] Reloading KB: %s", workerID, ctrl.Payload)
if err := consultFile(ctrl.Payload); err != nil {
log.Printf("[Worker %d] KB reload FAILED: %v", workerID, err)
// Do not return error — worker continues with old KB.
// Pool manager logs the reload failure via the status API.
} else {
log.Printf("[Worker %d] KB reload OK", workerID)
}
return nil
default:
log.Printf("[Worker %d] Unknown control kind %d — ignoring", workerID, ctrl.Kind)
return nil
}
}
// ─────────────────────────────────────────────────────────────────────────────
// POOL
// ─────────────────────────────────────────────────────────────────────────────
// Pool manages a fixed-size collection of thread-locked WAM workers.
// All exported methods are goroutine-safe.
type Pool struct {
workers []*worker
dispatchCh chan WorkItem // Single inbound channel for all work
ctrlCh chan ControlMsg // Broadcast to all workers
doneCh chan int // Workers signal exit here — triggers respawn
size int
mu sync.Mutex // Protects workers slice during respawn only
shutdown chan struct{} // Closed by Stop() to halt the manager loop
stopped atomic.Bool // True after Stop() returns
}
// NewPool creates and starts a worker pool of the given size.
// kbPath is passed to InitEngine — the KB must be loadable before workers start.
// Returns an error if engine initialisation fails.
func NewPool(size int, kbPath string) (*Pool, error) {
if size <= 0 {
return nil, fmt.Errorf("pool size must be positive, got %d", size)
}
// Initialise the shared WAM engine (loads KB into shared clause database).
// Workers attach their per-thread engine contexts after this.
if err := InitEngine(kbPath); err != nil {
return nil, fmt.Errorf("engine init failed: %w", err)
}
p := &Pool{
workers: make([]*worker, size),
// dispatchCh depth = pool size: maximum one queued item per worker.
// Deeper buffer allows more queuing but delays back-pressure.
// At depth=size, a fully-saturated pool with one queued item per worker
// means at most 2×size queries are in-flight at any moment.
dispatchCh: make(chan WorkItem, size),
ctrlCh: make(chan ControlMsg, size*2), // Buffered: broadcasts don't block sender
doneCh: make(chan int, size),
size: size,
shutdown: make(chan struct{}),
}
// Start all workers.
for i := 0; i < size; i++ {
p.startWorker(i)
}
// Start the dispatcher goroutine — routes dispatchCh items to worker inCh.
go p.dispatchLoop()
// Start the pool manager goroutine — monitors doneCh and respawns workers.
go p.managerLoop()
return p, nil
}
// startWorker creates a worker struct and launches its goroutine.
// Caller must hold p.mu if called after initial pool creation.
func (p *Pool) startWorker(id int) {
w := &worker{
id: id,
inCh: make(chan WorkItem, 1), // Depth 1: worker accepts one item, processes it
ctrlCh: p.ctrlCh, // Shared broadcast channel — all workers read it
doneCh: p.doneCh,
}
p.workers[id] = w
go workerLoop(*w)
}
// dispatchLoop routes WorkItems from the shared dispatchCh to individual worker inCh.
// Uses a round-robin selection across worker inCh channels to distribute work evenly.
// A worker whose inCh is full (processing a prior item) is skipped in this round.
func (p *Pool) dispatchLoop() {
idx := 0
for {
select {
case <-p.shutdown:
return
case item := <-p.dispatchCh:
// Try to deliver to the next available worker using a rotating index.
// If all workers are busy, fall back to a blocking send on any worker.
delivered := false
start := idx
for {
p.mu.Lock()
w := p.workers[idx%p.size]
p.mu.Unlock()
idx = (idx + 1) % p.size
// Non-blocking send attempt: if worker inCh has capacity, deliver immediately.
select {
case w.inCh <- item:
delivered = true
default:
// Worker busy — try next
}
if delivered {
break
}
if idx == start {
// Full rotation — all workers busy. Block on the first worker.
// This applies back-pressure: dispatchLoop blocks, dispatchCh fills,
// Dispatch() blocks, HTTP handler blocks, TCP back-pressure applied.
p.mu.Lock()
first := p.workers[start%p.size]
p.mu.Unlock()
first.inCh <- item
break
}
}
}
}
}
// managerLoop monitors worker exits and respawns replacements.
// A worker that exits (cleanly or via panic recovery) sends its ID to doneCh.
// The manager respawns a replacement worker with the same ID.
func (p *Pool) managerLoop() {
for {
select {
case <-p.shutdown:
return
case deadID := <-p.doneCh:
if p.stopped.Load() {
// Pool is stopping — do not respawn.
continue
}
log.Printf("[Pool Manager] Worker %d exited — respawning", deadID)
// Brief backoff before respawn: if the worker panicked due to a
// systematic error (e.g., KB corruption), rapid respawning loops
// infinitely and floods the log. The backoff limits respawn rate
// to one per 100ms per worker slot.
time.Sleep(100 * time.Millisecond)
p.mu.Lock()
p.startWorker(deadID)
p.mu.Unlock()
log.Printf("[Pool Manager] Worker %d respawned", deadID)
}
}
}
// ─────────────────────────────────────────────────────────────────────────────
// PUBLIC API
// ─────────────────────────────────────────────────────────────────────────────
// Dispatch sends a FirewallRequest to the worker pool and blocks until a result
// is available or the context deadline is exceeded.
// Returns WorkResult — check WorkResult.Err for query errors.
// Returns error if the pool is stopped or the send times out.
func (p *Pool) Dispatch(req FirewallRequest, timeout time.Duration) (WorkResult, error) {
if p.stopped.Load() {
return WorkResult{}, errors.New("pool is stopped")
}
// Unbuffered reply channel — caller blocks until exactly one result arrives.
// Using a size-1 buffer to prevent a goroutine leak if the caller times out
// before reading the result: the worker can still send without blocking.
replyCh := make(chan WorkResult, 1)
item := WorkItem{Req: req, ReplyCh: replyCh}
// Send to dispatch channel with timeout.
// If dispatchCh is full (all workers busy, queue full), this blocks.
// The timeout converts back-pressure into an explicit error for the caller.
select {
case p.dispatchCh <- item:
// Successfully queued — now wait for result.
case <-time.After(timeout):
return WorkResult{}, fmt.Errorf("dispatch timeout after %v (pool saturated)", timeout)
}
// Wait for result with the same timeout (query execution time).
select {
case result := <-replyCh:
return result, nil
case <-time.After(timeout):
return WorkResult{}, fmt.Errorf("result timeout after %v (worker stalled)", timeout)
}
}
// Stop initiates graceful pool shutdown.
// Broadcasts ControlShutdown to all workers, waits for all to exit,
// then closes the dispatch and control channels.
// Stop blocks until all workers have acknowledged shutdown or the deadline expires.
func (p *Pool) Stop(deadline time.Duration) error {
p.stopped.Store(true)
close(p.shutdown) // Signals dispatchLoop and managerLoop to exit
// Broadcast shutdown to all workers.
p.Broadcast(ControlMsg{Kind: ControlShutdown})
// Wait for all workers to call doneCh.
done := make(chan struct{})
go func() {
received := 0
for received < p.size {
select {
case <-p.doneCh:
received++
case <-time.After(deadline):
log.Printf("[Pool] Shutdown timeout: %d/%d workers responded", received, p.size)
close(done)
return
}
}
close(done)
}()
<-done
log.Printf("[Pool] Shutdown complete")
return nil
}
// Broadcast sends a ControlMsg to all workers via the shared control channel.
// Workers consume control messages between query iterations — delivery is
// asynchronous but ordered (FIFO within the ctrlCh buffer).
// For Reload: all workers will eventually reload, but not simultaneously.
// For Shutdown: all workers will exit after finishing their current query.
func (p *Pool) Broadcast(msg ControlMsg) {
// Send N copies — one per worker. Each worker reads from the shared ctrlCh.
// Workers consume at their own rate — a busy worker delays its own reload.
for i := 0; i < p.size; i++ {
select {
case p.ctrlCh <- msg:
default:
// ctrlCh buffer full — this should not happen for a correctly sized buffer
// (ctrlCh capacity = size*2, Broadcast sends size messages).
// Log and drop rather than blocking Broadcast.
log.Printf("[Pool] WARNING: ctrlCh full on Broadcast — message dropped for slot %d", i)
}
}
}
// Status returns a snapshot of current pool state.
// Goroutine-safe — reads atomic values and channel lengths only.
func (p *Pool) Status() PoolStatus {
return PoolStatus{
PoolSize: p.size,
QueueDepth: len(p.dispatchCh),
QueueCap: cap(p.dispatchCh),
PctSaturated: len(p.dispatchCh) * 100 / cap(p.dispatchCh),
Stopped: p.stopped.Load(),
}
}
// PoolStatus is a point-in-time snapshot of pool health.
type PoolStatus struct {
PoolSize int
QueueDepth int
QueueCap int
PctSaturated int
Stopped bool
}
16.3.2.1 Production Engine Pre-Allocation: Bounding Stack Size with PL_create_engine
PL_thread_attach_engine(nil) allocates a WAM engine using the global stack limit — the --stack-limit value passed to PL_initialise at startup. With 16 workers all sharing one global limit of 64MB, a single runaway query (an accidentally infinite recursive rule, a KB loaded without the tabling annotation from Chapter 17) can allocate its way to the full 64MB before the WAM stack overflow signal fires. By the time the overflow is detected and the panic path in Section 16.5 runs, the process may be under severe memory pressure. The Go GC and the WAM garbage collector compete for physical RAM during the overflow window.
PL_create_engine (available in SWI-Prolog 9.1+) constructs an engine with explicitly bounded stacks before it is attached to an OS thread, isolating each worker's stack allocation budget from all others:
// marshal.h — add to the C layer already compiled into the bridge
// create_bounded_engine: creates an engine with a strict per-worker stack cap.
// local_mb: per-worker local + trail stack limit in megabytes
// global_mb: per-worker global stack limit in megabytes
// Returns a PL_engine_t handle on success, NULL on failure.
// Call PL_set_engine(handle, NULL) on the calling OS thread to attach.
// Call PL_destroy_engine(handle) on cleanup.
static PL_engine_t create_bounded_engine(int local_mb, int global_mb) {
PL_engine_t e;
// PL_create_engine accepts a term_t options dict with stack size keys.
// We build this via the FLI on the current thread's engine (main engine).
// local(N): local + trail stack combined limit in bytes
// global(N): global stack limit in bytes
// Build options Dict: engine{local: L, global: G}
term_t opts = PL_new_term_refs(3);
term_t lval = opts + 1;
term_t gval = opts + 2;
if (!PL_put_int64(lval, (int64_t)local_mb * 1024 * 1024)) return NULL;
if (!PL_put_int64(gval, (int64_t)global_mb * 1024 * 1024)) return NULL;
atom_t keys[2] = { PL_new_atom("local"), PL_new_atom("global") };
if (!PL_put_dict(opts, PL_new_atom("engine"), 2, keys, lval)) return NULL;
e = PL_create_engine(opts); // NULL on failure (OOM, limit exceeded)
return e;
}
// attach_bounded_engine: attaches a pre-created engine to the calling OS thread.
// Must be called after runtime.LockOSThread() and before any FLI query.
// Returns 1 on success, 0 on failure.
static int attach_bounded_engine(PL_engine_t e) {
PL_engine_t prev;
return PL_set_engine(e, &prev) == PL_ENGINE_SET;
}
// detach_and_destroy_engine: detaches from the calling OS thread and destroys.
// Call in the recovery defer in place of PL_thread_destroy_engine().
static void detach_and_destroy_engine(PL_engine_t e) {
PL_set_engine(PL_ENGINE_MAIN, NULL); // Detach: revert to main engine context
PL_destroy_engine(e); // Release all stacks and internal state
}
The revised workerLoop STEP 3 using PL_create_engine:
// In cgo preamble, expose create_bounded_engine and attach_bounded_engine
// to Go as C functions callable via C.create_bounded_engine(local_mb, global_mb)
// and C.attach_bounded_engine(engine_handle).
// STEP 3 (production variant — bounded stacks):
// localMB and globalMB match the stack_limit from Chapter 14's worker_stack_mb (64MB).
// Each worker is independently capped — a runaway query in worker 7 cannot
// exhaust the global stack allocation and affect workers 0-6 and 8-15.
engineHandle := C.create_bounded_engine(C.int(64), C.int(64))
if engineHandle == nil {
panic(fmt.Sprintf("worker %d: create_bounded_engine failed", w.id))
}
if C.attach_bounded_engine(engineHandle) == 0 {
C.detach_and_destroy_engine(engineHandle)
panic(fmt.Sprintf("worker %d: attach_bounded_engine failed", w.id))
}
engineAttached = true
// Update the deferred recovery to use detach_and_destroy_engine(engineHandle)
// instead of destroy_engine_safe(1) — the C.PL_thread_destroy_engine path
// is not correct for PL_create_engine-allocated handles.
The per-worker isolation means: at 16 workers × 64MB local + 64MB global = 2GB total maximum WAM stack allocation. At 128GB physical RAM, this is 1.5% of available memory reserved for worker stacks. Without PL_create_engine, a single stack-overflow failure before the panic catcher fires can allocate the full global limit — 64MB — in one burst rather than 4MB per worker. The isolation boundary is the difference between one worker failing cleanly and 15 workers experiencing memory pressure during the failure.
16.3.3 Updated main.go — HTTP Handler with Pool Dispatch
// File: /opt/logic-node/go/firewall-bridge/main.go (updated for pool)
package main
import (
"encoding/json"
"log"
"net/http"
"runtime"
"time"
)
var globalPool *Pool
func main() {
// Pool size: GOMAXPROCS minus 2 (main goroutine + pool manager/GC).
poolSize := runtime.GOMAXPROCS(0) - 2
if poolSize < 1 {
poolSize = 1
}
var err error
globalPool, err = NewPool(poolSize, "/opt/logic-node/kb/firewall_policy.pl")
if err != nil {
log.Fatalf("Pool init failed: %v", err)
}
defer globalPool.Stop(5 * time.Second)
http.HandleFunc("/firewall", firewallHandler)
http.HandleFunc("/reload", reloadHandler)
http.HandleFunc("/status", statusHandler)
log.Printf("Firewall bridge listening on :8080 (pool size: %d)", poolSize)
log.Fatal(http.ListenAndServe(":8080", nil))
}
// firewallHandler: HTTP handler — free-floating goroutine (NOT locked to OS thread).
// Dispatches to pool via channel. Never touches WAM directly.
func firewallHandler(w http.ResponseWriter, r *http.Request) {
var req FirewallRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "invalid JSON", http.StatusBadRequest)
return
}
// Dispatch with 500ms timeout — covers dispatch queue wait + query execution.
// A saturated pool returns a 503; a stalled worker returns a 504.
result, err := globalPool.Dispatch(req, 500*time.Millisecond)
if err != nil {
http.Error(w, err.Error(), http.StatusServiceUnavailable)
return
}
if result.Err != nil {
http.Error(w, result.Err.Error(), http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"allowed": result.Verdict.Allowed,
"reason": result.Verdict.Reason,
"rule_id": result.Verdict.RuleID,
"worker_id": result.WorkerID,
"latency_us": result.Latency.Microseconds(),
})
}
// reloadHandler: broadcasts a KB reload to all workers.
func reloadHandler(w http.ResponseWriter, r *http.Request) {
path := r.URL.Query().Get("path")
if path == "" {
path = "/opt/logic-node/kb/firewall_policy.pl"
}
globalPool.Broadcast(ControlMsg{Kind: ControlReload, Payload: path})
w.WriteHeader(http.StatusAccepted)
w.Write([]byte(`{"status":"reload_broadcast"}`))
}
// statusHandler: returns pool health snapshot.
func statusHandler(w http.ResponseWriter, r *http.Request) {
s := globalPool.Status()
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(s)
}
16.4 Broadcast Control Channels
16.4.1 The Reload Problem
Each worker owns a private WAM engine. The WAM clause database is shared (read-only in normal operation), but each engine's local state — current module, loaded file list, flag values — is per-engine. When the operator pushes an updated firewall_policy.pl (a new blocklist, an updated whitelist range), all running workers must reload the KB. The reload must happen without taking the pool offline, without a query window where some workers use the old KB and others use the new one, and without any worker holding an open foreign frame or open query during the reload.
The ControlReload broadcast via ctrlCh satisfies these constraints:
- The broadcaster (HTTP handler →
Pool.Broadcast) sends N copies ofControlReloadtoctrlCh. - Each worker reads its copy from
ctrlChwhen it next reaches theselectinworkerLoop— between query iterations. - A worker processing a query reads the control message only after the current
WorkItemfinishes — the reload never interrupts an in-flight query. consultFile(ctrl.Payload)re-loads the file into the WAM. In SWI-Prolog,consult/1on an already-loaded file performs an incremental update: new clauses are added, modified clauses are replaced, deleted clauses are retracted. The shared clause database is updated atomically at the clause-database mutex level — other workers reading the old version of a predicate during the consult see consistent (old or new, never partial) clause lists.- Workers reload in sequence across the pool, not simultaneously. This is intentional: simultaneous reloads on all engines would all contend on the clause database mutex at the same time, producing a pause.
16.4.2 Reload Sequencing and the Consistency Window
The consistency window — the period between the first worker reloading and the last worker reloading — means that concurrent queries may use different KB versions. For firewall policy this is acceptable: a CIDR range added to the blocklist takes effect for queries processed by the first reloading worker immediately, and for the last worker within at most size × average_query_time seconds. For a 16-worker pool at 20μs per query: 16 × 20μs = 320μs maximum consistency window. No blocklisted IP receives more than 320μs of continued access after a reload broadcast.
If the application requires strict atomic KB cutover (all workers switch simultaneously), the correct approach is a version-tagged clause database: the new rules are loaded as firewall_verdict_v2/4, the old rules remain as firewall_verdict_v1/4, and a shared atomic integer stores the active version number. Workers read the active version number at query time and dispatch to the corresponding predicate. When all workers are confirmed idle (via a synchronisation barrier), the version number is atomically incremented. This is a more complex architecture than the rolling reload and is warranted only when the consistency window is operationally unacceptable.
16.4.3 The Sync Ping — Liveness Verification
ControlSync verifies that a worker is alive and executing on its locked OS thread without disturbing its processing state. The pool manager uses it after a respawn to confirm the replacement worker is healthy:
// PingWorkers: broadcasts ControlSync and waits for all workers to log acknowledgement.
// Used after respawn, after reload, or as a periodic health check.
// This is a best-effort check — a worker that is processing a long query
// will acknowledge only after that query completes.
func (p *Pool) PingWorkers(timeout time.Duration) error {
p.Broadcast(ControlMsg{Kind: ControlSync})
// Sync acknowledgement is logged by each worker's handleControl.
// There is no reply channel for sync — liveness is inferred from
// the log output, not from a returned value.
// For a strict liveness check, use Dispatch with a trivial no-op query
// and verify that all N dispatched items return within the timeout.
time.Sleep(timeout)
return nil
}
16.5 Security: Panic Isolation and Engine Recovery
16.5.1 What Triggers a Worker Panic
A worker goroutine can panic from three sources, with different characteristics and recovery requirements:
Source 1: Go-level panic inside QueryFirewall.
A nil pointer dereference in the Go code of cgo_bridge.go, a type assertion failure, or an explicit panic() call. This is caught by Go's standard recover() mechanism. The WAM engine state after a Go-level panic depends on where in the FLI call sequence the panic occurred:
- If the panic occurred before
PL_open_foreign_frame: engine is clean. - If the panic occurred between
PL_open_foreign_frameandPL_discard_foreign_frame: a foreign frame is open on the WAM local stack. The engine is not corrupted but the frame is leaked. - If the panic occurred between
PL_open_queryandPL_close_query: a query is open. The engine has pending choice points and variable bindings.
In all cases, the correct recovery action is PL_thread_destroy_engine() rather than attempting to close the query or discard the frame — the Go stack unwound without executing those cleanup calls, and the WAM state is not recoverable to a known-clean state without destroying the engine entirely.
Source 2: CGO-propagated C error converted to Go panic.
Some CGO calls use panic() to signal unrecoverable C-layer errors — PL_thread_attach_engine failure, as implemented in workerLoop. These are Go-level panics triggered by Go code and are caught by recover().
Source 3: C-level SIGSEGV from incorrect thread migration.
A SIGSEGV inside libswipl.so is delivered as a process-level signal. Go's runtime catches SIGSEGV originating from Go code as a panic (for stack overflow, nil dereference in Go). A SIGSEGV in C code is a fatal signal — recover() does not catch it. This is why runtime.LockOSThread() as the first statement in workerLoop is not optional — it is the prevention mechanism for this source. If the lock is in place, source 3 cannot occur during normal operation.
16.5.2 The Recovery Sequence — Annotated
// The defer in workerLoop, fully annotated:
defer func() {
// ── Phase 1: Catch the panic. ────────────────────────────────────────────
r := recover()
// recover() returns nil if the goroutine exited cleanly (ControlShutdown).
// recover() returns the panic value if the goroutine panicked.
// After recover() returns, the goroutine is in a normal (non-panicking) state.
// All deferred functions after this point execute normally.
if r != nil {
log.Printf("[Worker %d] PANIC: %v", w.id, r)
// Do not re-panic. The pool manager will respawn this worker.
// Re-panicking would propagate up to the Go runtime and crash the binary.
}
// ── Phase 2: Destroy the WAM engine on this OS thread. ──────────────────
// Regardless of why the goroutine is exiting (clean or panic),
// PL_thread_destroy_engine releases all WAM resources for this thread:
// - Closes any open foreign frames (reclaims local stack)
// - Closes any open queries (unwinds choice points, releases trail)
// - Frees the thread's local data structure (PL_local_data_t)
// - Detaches the engine from the OS thread (clears pthread_setspecific)
//
// After this call: no FLI function may be called from this OS thread.
// This OS thread is clean and can be reused by the Go runtime.
C.destroy_engine_safe(C.int(func() int {
if engineAttached { return 1 }
return 0
}()))
// ── Phase 3: Unlock the OS thread. ──────────────────────────────────────
// Returns the OS thread to Go's thread pool for use by other goroutines.
// Without this: the dying goroutine's OS thread is TERMINATED by the runtime,
// permanently reducing Go's available thread pool by one.
// With this: the OS thread survives and is available for the replacement worker.
runtime.UnlockOSThread()
// ── Phase 4: Signal pool manager. ───────────────────────────────────────
// Non-blocking send — if pool manager is shutting down, doneCh may be
// unread. The select prevents the goroutine from blocking on exit.
select {
case w.doneCh <- w.id:
default:
}
// Goroutine exits. OS thread survives (UnlockOSThread called).
// Pool manager receives w.id on doneCh and spawns a replacement.
}()
16.5.3 The Respawn Rate Limiter — Preventing Panic Loops
A worker that panics immediately after respawn — because the KB is corrupted, the WAM engine limit is exceeded, or the FLI call that triggers the panic is in the first query — will respawn, panic, respawn, panic, infinitely. Without rate limiting, this produces:
At 100 panics/second × 16 workers:
1,600 log lines/second: "[Worker N] PANIC recovered"
1,600 goroutine/OS-thread creation-and-destruction cycles/second
Each PL_thread_attach_engine/PL_thread_destroy_engine pair:
~50μs of WAM internal mutex contention
1,600 × 50μs = 80ms/second of WAM mutex contention
All 16 live workers stalled waiting for the mutex while panicking workers
fight over it during attach/destroy cycles
Effective query throughput: near zero
The time.Sleep(100 * time.Millisecond) in managerLoop limits respawn rate to 10 per worker slot per second. At 16 workers respawning simultaneously: 160 attach/destroy operations per second = 8ms/second of WAM mutex contention — tolerable. The 100ms delay also provides a window for an operator to observe the panic log, identify the root cause (KB path wrong, libswipl.so not found, engine limit exceeded), and fix it before the pool resumes normal operation.
For a production deployment, the respawn delay should be exponential with a cap:
// Exponential backoff respawn — prevents panic loops from starving the pool
func (p *Pool) managerLoopWithBackoff() {
backoff := make([]time.Duration, p.size)
for i := range backoff {
backoff[i] = 100 * time.Millisecond
}
for {
select {
case <-p.shutdown:
return
case deadID := <-p.doneCh:
if p.stopped.Load() {
continue
}
delay := backoff[deadID]
log.Printf("[Pool Manager] Worker %d exited — respawning in %v", deadID, delay)
time.Sleep(delay)
p.mu.Lock()
p.startWorker(deadID)
p.mu.Unlock()
// Double the backoff for this slot, cap at 30 seconds.
backoff[deadID] = min(backoff[deadID]*2, 30*time.Second)
}
}
}
// On successful query completion, reset the backoff for the worker slot.
// Add to WorkResult handling in Dispatch():
// if result.Err == nil { pool.resetBackoff(result.WorkerID) }
func (p *Pool) resetBackoff(workerID int) {
// Sent via a separate channel to managerLoop — not implemented here for brevity.
// Full implementation: backoffResetCh chan int in Pool struct.
}
16.5.4 What recover() Cannot Save
Three conditions produce fatal process termination that recover() cannot intercept:
-
C-level SIGSEGV from thread migration — prevented by
runtime.LockOSThread()as the first goroutine statement. If the lock is missing and migration occurs,recover()never fires. -
runtime.throwfrom the Go runtime — internal Go runtime errors (concurrent map read and map write, memory corruption detected by the runtime itself) callruntime.throw, which terminates the process without going through the panic/recover mechanism. These should not occur in well-written Go code but are listed for completeness. -
os.Exitorsyscall.Exitcalled from C code —PL_halt(0)insidelibswipl.socallsexit(). If a Prologhalt/1goal is executed via a FLI query — whether intentionally in a test or accidentally via a KB rule that callshalt— the entire Go process exits immediately. The mitigation is to never allow KB rules to callhalt/0orhalt/1, enforced by a KB load-time linter that checks for these predicates beforeconsultFilereturns.
Outcome: Enterprise-Grade Polyglot Concurrency
16.6.1 The Conceptual Transition
Chapter 15's single-query CGO bridge proved that Go and the WAM can share a process and communicate in 20μs. Chapter 16 makes that proof operational at scale: 16 workers on 16 locked OS threads, each with an independent WAM engine context, collectively processing 80,000+ firewall policy decisions per second (16 workers × 5,000 queries/second/worker), with back-pressure to the HTTP layer, graceful KB reload without downtime, and self-healing worker recovery from panics — all within one Go binary, with no network, no serialisation, and no external dependencies.
| Single-threaded CGO (Chapter 15) | Thread-locked worker pool (Chapter 16) |
|---|---|
| 1 query at a time — serial | N queries concurrent — one per locked worker |
| Goroutine may be migrated between FLI calls | Each worker goroutine locked to one OS thread permanently |
| One WAM engine — shared by all callers (incorrect) | One WAM engine per worker — fully independent stacks |
| No back-pressure mechanism | Bounded channel applies back-pressure to HTTP handlers |
| Panic kills the Go binary | Worker panic recovered — engine destroyed, replacement spawned |
| KB reload requires process restart | KB reload via ControlReload broadcast — rolling, online |
| SIGSEGV risk from scheduler migration | SIGSEGV risk eliminated by first-statement LockOSThread |
16.6.2 Verification Checklist
# Build with pool
logicadmin@logic-node-01:~$ cd /opt/logic-node/go/firewall-bridge
logicadmin@logic-node-01:~$ CGO_ENABLED=1 go build -v -o firewall-bridge .
# Start the server
logicadmin@logic-node-01:~$ ./firewall-bridge &
[Pool] Starting 14 workers (GOMAXPROCS=16, -2 reserved)
[Worker 0] WAM engine attached (OS thread locked)
...
[Worker 13] WAM engine attached (OS thread locked)
Firewall bridge listening on :8080 (pool size: 14)
# 1. Verify all workers started and locked
logicadmin@logic-node-01:~$ curl -s localhost:8080/status | jq .
{
"PoolSize": 14,
"QueueDepth": 0,
"QueueCap": 14,
"PctSaturated": 0,
"Stopped": false
}
# 2. Single query — verify correct verdict and worker ID in response
logicadmin@logic-node-01:~$ curl -s -X POST localhost:8080/firewall \
-d '{"SourceIP":"10.0.1.5","DestPort":443,"Protocol":"tcp"}' | jq .
{
"allowed": true,
"reason": "whitelist_match",
"rule_id": 1,
"worker_id": 3,
"latency_us": 19
}
# 3. Concurrent load test — 10,000 requests, 100 concurrent
logicadmin@logic-node-01:~$ hey -n 10000 -c 100 -m POST \
-H "Content-Type: application/json" \
-d '{"SourceIP":"10.0.1.5","DestPort":443,"Protocol":"tcp"}' \
http://localhost:8080/firewall
Summary:
Total: 2.43s
Requests/sec: 4115.2
Mean latency: 24.1ms ← includes HTTP overhead + queue wait + 19μs WAM query
99th percentile: 38.2ms
Errors: 0
# 4. KB reload while under load — no queries fail
logicadmin@logic-node-01:~$ curl -s localhost:8080/reload
{"status":"reload_broadcast"}
# Workers reload asynchronously — no downtime, no error spike in hey output
# 5. Panic recovery — inject a bad request to trigger panic in worker
# (Requires a debug endpoint that calls panic() directly — testing only)
logicadmin@logic-node-01:~$ curl -s localhost:8080/debug/panic-worker
[Worker 7] PANIC recovered: injected test panic — destroying engine and respawning
[Pool Manager] Worker 7 exited — respawning
[Worker 7] WAM engine attached (OS thread locked)
# Pool remains at 14 workers within ~100ms
# 6. Verify pool self-healed
logicadmin@logic-node-01:~$ curl -s localhost:8080/status | jq .PoolSize
14 # ✓ Still 14 workers
# 7. Graceful shutdown — all workers exit cleanly
logicadmin@logic-node-01:~$ kill -SIGTERM $(pgrep firewall-bridge)
[Pool] Shutdown complete
# No goroutine leaks, no OS thread leaks, no WAM engine leaks
16.6.3 What Comes Next
The worker pool established in this chapter executes one query per worker per dispatch cycle. Each firewall_verdict/4 query is deterministic: it either finds a matching CIDR rule or falls through to the default deny. The SLD resolution tree is shallow — at most a handful of unification steps before a cut terminates the search. This is the correct model for IP matching against a bounded rule set.
The model breaks when applied to infrastructure topology queries: "find all reachable hosts from node X", "does a path exist between VLAN 10 and VLAN 30 through any combination of firewall rules and routing entries?" These queries traverse graphs. In pure SLD resolution, graph traversal without cycle detection loops infinitely when the edge relation contains cycles — and any real network topology contains cycles. A query that loops infinitely inside a worker's WAM engine saturates the worker's local stack, triggers a WAM stack overflow, and arrives at the panic recovery path from Section 16.5 for the wrong reasons.
Chapter 17 introduces tabling (SLG resolution) via library(tabling). Tabled predicates memoize their answers in a per-call table stored on the WAM global heap. When the same subgoal is encountered during recursive descent — the cycle case — the tabling mechanism detects the repeat and consults the memo table rather than recursing again. The result is termination-guaranteed graph reachability over arbitrary network topologies, computed once per query and cached for subsequent calls, at the cost of global heap space proportional to the number of distinct subgoal answers. The worker pool from Chapter 16 delivers tabled queries to the embedded WAM pool in Chapter 17 without modification — the Dispatch API is stable, the FLI contract is unchanged, only the Prolog predicate changes from a cut-driven deterministic rule to a tabled recursive one.
Chapter Summary
| Concept | Operational Definition | Performance / Security Consequence |
|---|---|---|
| Go M:N scheduler | M goroutines multiplexed onto N OS threads; goroutines migrate freely between threads | WAM OS-thread-local engine state is invalid after migration — SIGSEGV or silent wrong result |
runtime.LockOSThread() |
Pins goroutine to current OS thread; scheduler cannot migrate it | Must be first statement in worker goroutine; any code before it is a migration risk |
| One engine per locked thread | PL_thread_attach_engine(nil) called after LockOSThread() |
N workers = N independent local/global/trail stacks; shared clause database is read-only |
runtime.UnlockOSThread() in defer |
Returns OS thread to Go runtime pool after worker exit | Without unlock: Go runtime terminates the OS thread, permanently shrinking thread pool |
| Bounded dispatch channel | make(chan WorkItem, poolSize) — depth equals pool size |
Blocks producers when pool saturated; maximum in-flight = 2×poolSize queries |
| Worker inCh depth 1 | Each worker's personal channel holds at most one pending item | Worker processes serially; at-most-one-queued-per-worker prevents memory accumulation |
ControlReload broadcast |
N copies sent to shared ctrlCh; each worker reloads between queries |
Rolling reload: no downtime; consistency window = size × avg_query_time (≤320μs at 16 workers) |
ControlShutdown broadcast |
Workers exit cleanly after current query; defer fires, engine destroyed | Pool drains in-flight queries before shutdown; no open queries at exit |
recover() in worker defer |
Catches Go-level panics; C SIGSEGV is NOT catchable — prevented by LockOSThread | Worker panic does not kill binary; pool manager respawns replacement |
PL_thread_destroy_engine() in recovery |
Releases WAM engine from OS thread after panic; all open frames/queries closed | Engine state after panic is undefined — destroy is the only safe action |
| Respawn backoff (100ms, exponential) | Delays replacement after panic; caps respawn rate per worker slot | Prevents panic loop from monopolising WAM mutex; limits to 160 attach/destroy ops/sec at 16 workers |
halt/0 KB linter |
Check loaded KB for halt/1 calls before consultFile returns |
PL_halt() calls exit() — kills entire Go binary, not just the worker; recover() cannot intercept |
Pool size = GOMAXPROCS - 2 |
N workers = N locked OS threads = N physical CPU cores minus overhead | Avoids over-subscription; N > GOMAXPROCS causes locked threads to compete for CPU cores |
PL_create_engine + PL_set_engine |
Pre-allocates WAM engine with explicit local/global stack bounds before attaching to OS thread | Per-worker 64MB cap isolates runaway queries; prevents one worker stack overflow from pressuring 15 others; 16 workers x 128MB = 2GB total max WAM allocation |
NUMA node binding (numactl/CPUAffinity) |
Pins all worker OS threads and WAM stack allocations to one NUMA node | Eliminates QPI/UPI cross-socket latency (3-4x penalty on dual-socket); WAM local stack pages stay hot in node 0 L3 cache; 40-60% throughput improvement vs. unbound on dual-socket |
Exercises
Exercise 16.1 — Pool Size Empirical Optimum
Write a benchmark BenchmarkPoolThroughput that creates pools of size 1, 2, 4, 8, GOMAXPROCS, GOMAXPROCS+4, and GOMAXPROCS*2, runs 10,000 Dispatch calls against each, and records median latency and total throughput. Plot the throughput curve. Identify the pool size at which throughput plateaus — this is the empirical optimum for the test machine. Explain why GOMAXPROCS*2 is not twice the throughput of GOMAXPROCS (hint: locked OS threads beyond GOMAXPROCS contend for CPU cores with each other).
Exercise 16.2 — Per-Worker Predicate Handle Cache
Chapter 14 noted that PL_predicate() performs a module table hash lookup on every call. In the single-threaded bridge, caching predicate_t as a sync.Once-initialised package variable was safe. In the pool, each worker has its own WAM engine — predicate_t handles from one engine may not be valid in another. Implement a per-worker predicate cache stored in the worker struct as a map[string]C.predicate_t. Populate it on first query per worker (lazy initialisation). Verify with a benchmark that the per-worker cache produces the same throughput improvement as the global cache from Chapter 14, and that no worker uses another worker's predicate_t handle.
Exercise 16.3 — Strict Atomic KB Cutover
Implement the version-tagged clause database approach described in Section 16.4.2. Use a firewall_verdict_v1/4 and firewall_verdict_v2/4 naming scheme. Store the active version as a sync/atomic.Int32 in the Pool struct. Implement AtomicReload(path string) that: (1) loads the new rules as version N+1; (2) broadcasts ControlSync and waits for all workers to acknowledge (confirming all in-flight queries are complete); (3) atomically increments the version integer; (4) broadcasts ControlReload so workers pick up the new version number. Measure the maximum consistency window under 5,000 requests/second load — it should be zero for atomic cutover.
Exercise 16.4 — halt KB Linter
Implement lintKB(path string) error that loads the Prolog file into a temporary engine, queries predicate_property(halt(_), defined) and predicate_property(halt, defined), and returns an error if either is true in the user module. Integrate lintKB into consultFile — if the lint fails, consultFile returns an error and the KB is not loaded. Write a test KB file containing :- halt. as a directive and verify that consultFile rejects it before executing the directive.
Exercise 16.5 — Health Check Endpoint with Per-Worker Round-Trip
The /status endpoint returns aggregate pool metrics from channel lengths. Implement /healthz that performs a strict per-worker liveness check: dispatches one trivial query (true/0 via FLI) to each worker slot sequentially, measures the round-trip time, and returns a JSON object with per-worker status ({worker_id, latency_us, status: "ok"|"timeout"}). A worker that fails to respond within 50ms is marked as "timeout". The health check must complete within poolSize × 50ms maximum. Use the Dispatch timeout mechanism — do not add a new channel type for health checks.
Further Reading
- Go runtime source:
src/runtime/proc.go—https://github.com/golang/go/blob/master/src/runtime/proc.go—LockOSThread,UnlockOSThread,findrunnable, and goroutine migration mechanics; the authoritative source for understanding when and why goroutines move between OS threads - Go documentation:
runtime.LockOSThread—https://pkg.go.dev/runtime#LockOSThread— the formal specification including the thread-termination-on-exit guarantee - SWI-Prolog FLI: Multi-threading —
https://www.swi-prolog.org/pldoc/man?section=foreign-thread—PL_thread_attach_engine,PL_thread_destroy_engine, per-thread WAM state model - SWI-Prolog:
PL_thread_attach_engine/1—https://www.swi-prolog.org/pldoc/man?predicate=PL_thread_attach_engine/1— return codes, failure conditions, engine limit configuration - Cox, R. (2009). "The Go Memory Model." —
https://go.dev/ref/mem— the formal specification of which Go concurrent memory accesses are safe; the foundation for understanding why CGO requires explicit OS-thread pinning - Klabnik, S. & Nichols, C. (2019). The Rust Programming Language. Chapter 16: Fearless Concurrency — the alternative-language treatment of thread safety; useful contrast with Go's goroutine model for engineers evaluating polyglot concurrency approaches
- Linux
pthread_setspecific(3)man page —https://man7.org/linux/man-pages/man3/pthread_setspecific.3.html— the POSIX thread-local storage mechanism underlying WAM per-thread engine attachment
End of Chapter 16 — Next: Chapter 17: Tabling and SLG Resolution — Memoised Infrastructure Graph Queries, Cycle Detection, and Termination Guarantees over Network Topologies
Revision record:Chapter 16.1 — Architect’s review applied. Fix 1: direct Go documentation quote deleted from Section 16.1.2; replaced with mechanical statement of goroutine-wiring behaviour, thread-termination-on-exit consequence, and mandatorydefer runtime.UnlockOSThread()rationale. Fix 2: syllabus alignment — Section 16.6.3 inserted pointing to Chapter 17 Tabling/SLG Resolution; footer corrected. Improvement 1: Section 16.3.2.1 inserted —PL_create_enginebounded stack pattern with full C helpers (create_bounded_engine,attach_bounded_engine,detach_and_destroy_engine); STEP 3 comment updated; per-worker isolation arithmetic (16 x 128MB = 2GB max). Improvement 2: Section 16.1.2.1 inserted — NUMA node affinity; explains Linux kernel OS-thread migration vs Go scheduler migration; 3–4x QPI latency penalty; 40–60% throughput degradation quantified;numactland systemd options;numastat -pverification. Three new Chapter Summary rows.BookStack tags:swi-prolog,chapter-16,go,cgo,concurrency,worker-pool,LockOSThread,WAM,engine-pool,panic-recovery,broadcast-channels,PL_create_engine,numa,tabling,volume-iii