Chapter 4: Modern Structures — Dicts, Strings, and Go-Interop Data

Overview

The knowledge bases we built in Part I were powerful for what they were, but they shared a common characteristic: every entity was represented as a flat compound term with a fixed number of positional arguments. The predicate app(swi_prolog, '10.2.1', open_source, language_runtime) works well when there are four attributes, but consider what happens when we need to model something richer — a server with a hostname, an IP address, a role, an operating system, a RAM capacity, a list of installed packages, a last-seen timestamp, and a health status. A compound term with nine positional arguments is legal Prolog, but it is a maintenance burden: every predicate that touches it must remember that the fifth argument is RAM and the seventh is packages, and adding a tenth attribute means updating every clause in the codebase that mentions that predicate.

In SWI-Prolog 10.x, there is a better way. The Dict type — introduced in SWI-Prolog 7 and matured through the 9.x and 10.x series into the stable, high-performance structure it is today — provides named field access, O(log N) lookup, and a syntax that will look immediately familiar to anyone who has written JSON or Go structs. This chapter introduces Dicts properly, examines SWI-Prolog's string handling, and builds the data layer that the Go integration in Part III will depend on. By the end of the chapter, the knowledge base will be able to represent complex entities, manipulate text from real system sources, and exchange data with the outside world in formats that require no translation layer.

This is also the chapter where the gap between "Prolog as a teaching language" and "Prolog as an industrial tool" begins to close visibly. The patterns introduced here — Dict-based entity modelling, string processing, and structured term construction — are the same patterns that appear in production SWI-Prolog systems managing infrastructure, processing legal documents, and routing LLM requests.

4.1 The Problem with Positional Arguments

Before introducing Dicts, it is worth being precise about why positional compound terms become a problem at scale, because the alternative we are moving to only makes sense in the context of the limitation it solves.

Consider the vm/4 predicate from Chapter 1: vm(mint_logic_lab, online, 16, 4). The four arguments represent name, status, RAM in gigabytes, and CPU core count. This is fine. Now imagine the requirement evolves — we need to add the VM's IP address, its Proxmox node, its primary storage pool, its last snapshot timestamp, and whether it has the QEMU guest agent active. We now have a vm/9 predicate. The clause vm(mint_logic_lab, online, 16, 4, '192.168.10.20', pve_node1, local_lvm, '2026-06-01T14:30:00', true) is syntactically valid, but any rule that pattern-matches it must either bind all nine arguments or scatter anonymous variables across the positions it does not care about: vm(Name, online, _, _, IP, _, _, _, _). This is fragile. Insert a new argument between position 4 and 5 and every clause in the codebase that mentions vm/9 needs updating.

The practical consequence is that large Prolog codebases written in the positional style accumulate a particular kind of technical debt: predicates that accept more arguments than they logically need, simply to thread values through to deeper rules that do need them. Dicts eliminate this problem entirely.

THE POSITIONAL ARGUMENT FRAGILITY PROBLEM
─────────────────────────────────────────────────────────────────
  Original predicate (4 arguments — manageable):

  vm(mint_logic_lab,  online,  16,  4)
     │               │        │    └─ arg 4: cores
     │               │        └────── arg 3: ram_gb
     │               └─────────────── arg 2: status
     └─────────────────────────────── arg 1: name

  ──────────────────────────────────────────────────────────────
  After adding 5 new fields (9 arguments — fragile):

  vm(mint_logic_lab, online, 16, 4, '192.168.10.20', pve_node1, local_lvm, '2026-06-01', true)
     arg1            arg2   arg3 arg4 arg5             arg6       arg7       arg8          arg9

  Every rule that touches this predicate must now track 9 positions:
  vm(Name, online, _, _, IP, _, _, _, _)   ← status at arg2, IP at arg5
  vm(Name, _,      _, _, _,  Node, _, _, _) ← node at arg6

  Insert a new field at position 5?
  → Every clause referencing vm/9 in the entire codebase breaks.
  ──────────────────────────────────────────────────────────────
  With Dicts: access only the fields you need, by name.
  New fields added → zero existing rules require changes.

4.2 Dicts: Named Fields and Structural Flexibility

A Dict in SWI-Prolog is written as a tag followed by a set of key-value pairs enclosed in curly braces. The tag identifies the type or role of the Dict, the keys are atoms (small integers are also allowed), and the values can be any Prolog term: atoms, numbers, strings, lists, compound terms, or other Dicts.

VM = vm{
    name:     mint_logic_lab,
    status:   online,
    ram_gb:   16,
    cores:    4,
    ip:       '192.168.10.20',
    node:     pve_node1,
    storage:  local_lvm,
    agent:    true
}.

This is a complete, valid Prolog term. The tag vm gives the Dict its identity — it tells both the programmer and the engine what kind of entity this Dict represents. The tag is not a module name or a type declaration enforced by the runtime; it is a symbolic label that participates in unification. Two Dicts with different tags will not unify with each other even if their fields are identical, which means we can use tags as lightweight type discriminators without needing a separate type system.

Field access uses the . operator: VM.name evaluates to mint_logic_lab, VM.ram_gb evaluates to 16. This is O(log N) access because SWI-Prolog keeps a Dict's key-value pairs in a sorted array and locates a key by binary search, rather than scanning them linearly as a list would require. For a Dict with eight fields, as in the VM record above, this is a minor distinction. For a Dict representing a parsed JSON object with two hundred fields, or a knowledge base that performs millions of field accesses per second across a large reasoning task, the difference between O(N) and O(log N) is the difference between a system that scales and one that does not.

The tag can itself be a variable (as in the anonymous Dict _{name: test}), and the tag of any Dict can be extracted at runtime, which is occasionally useful when writing predicates that must operate on Dicts of different types:

?- D = vm{name: test, status: online},
   is_dict(D, Tag).
Tag = vm.

?- D = sensor{id: cpu0, value: 62.0},
   is_dict(D, Tag).
Tag = sensor.

The is_dict/2 predicate extracts the tag without committing to any particular tag value, which is how we can write generic Dict-handling predicates that work across the entire knowledge base regardless of what type of entity they receive.
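As a minimal sketch of such a generic predicate, the following dispatches on whatever tag is_dict/2 reports. The names entity_label/2 and tag_label/2, and the label texts, are hypothetical and are not part of infrastructure.pl. The atom/1 guard is an assumption about how anonymous Dicts (whose tag is an unbound variable) should be treated: they are reported as unknown rather than being bound to the first known tag.

```prolog
% tag_label(?Tag, ?Label)
% Hypothetical lookup table from Dict tag to a human-readable label.
tag_label(vm,     "virtual machine").
tag_label(sensor, "hardware sensor").

% entity_label(+Dict, -Label)
% Label describes the kind of entity Dict represents, based on its tag.
% Dicts with an unbound or unrecognised tag get "unknown entity".
entity_label(Dict, Label) :-
    is_dict(Dict, Tag),
    (   atom(Tag),
        tag_label(Tag, Known)
    ->  Label = Known
    ;   Label = "unknown entity"
    ).
```

Because the predicate never mentions a field name, it keeps working when new fields are added to either kind of Dict.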

Dicts can be nested — a field value can itself be a Dict. This is the natural representation for hierarchical data and maps directly to nested JSON objects:

?- Config = network{
       interface: eth0,
       addresses: addresses{
           ipv4: '192.168.10.20',
           ipv6: 'fe80::1'
       },
       mtu: 1500
   },
   Config.addresses.ipv4 = IPv4.
IPv4 = '192.168.10.20'.

Chained dot access (Config.addresses.ipv4) works as expected, evaluating left to right. Each . operator performs a single field lookup on the result of the previous one. This keeps deeply nested structures readable without the chain of explicit get_dict/3 calls that the equivalent lookup would otherwise require.
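For comparison, the same two-level lookup can be written with explicit get_dict/3 calls. This is only a sketch: ipv4_of/2 is a hypothetical helper, and it assumes the nested layout network{addresses: addresses{ipv4: ...}} used in the query above.

```prolog
% ipv4_of(+Config, -IPv4)
% Walks the nested Dict one level at a time.
% Fails (rather than raising an error) if either key is missing.
ipv4_of(Config, IPv4) :-
    get_dict(addresses, Config, Addresses),
    get_dict(ipv4, Addresses, IPv4).
```

The explicit form is longer, but it fails quietly on a Config that has no addresses field, which the dot form does not.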

Create a new file ~/logic-lab/prolog/infrastructure.pl and begin building a Dict-based infrastructure knowledge base:

% infrastructure.pl
% Dict-based infrastructure knowledge base for the Proxmox homelab.
% Part II, Chapter 4 - Modern SWI-Prolog (2026 Edition)

:- module(infrastructure, [
    vm_record/2,
    vms_by_status/2,
    vms_by_node/2,
    vm_summary/2,
    sensor_reading/2,
    sensor_alert/2,
    update_vm_status/4
]).

:- use_module(library(dicts)).
:- use_module(library(aggregate)).

% vm_record(+Name, -VMDict)
% The primary knowledge base of VM records, modelled as Dicts.
vm_record(mint_logic_lab, vm{
    name:      mint_logic_lab,
    status:    online,
    ram_gb:    16,
    cores:     4,
    ip:        '192.168.10.20',
    node:      pve_node1,
    storage:   local_lvm,
    agent:     true,
    role:      development
}).
vm_record(debian_core, vm{
    name:      debian_core,
    status:    offline,
    ram_gb:    8,
    cores:     2,
    ip:        '192.168.10.21',
    node:      pve_node1,
    storage:   local_lvm,
    agent:     true,
    role:      orchestrator
}).
vm_record(pfsense_fw, vm{
    name:      pfsense_fw,
    status:    online,
    ram_gb:    4,
    cores:     2,
    ip:        '192.168.10.1',
    node:      pve_node1,
    storage:   local_lvm,
    agent:     false,
    role:      firewall
}).
vm_record(backup_target, vm{
    name:      backup_target,
    status:    online,
    ram_gb:    32,
    cores:     4,
    ip:        '192.168.10.30',
    node:      pve_node2,
    storage:   zfs_pool,
    agent:     true,
    role:      storage
}).

The immediate difference from the positional style is readability. Each field is self-documenting. A rule that only cares about a VM's status and IP address can access exactly those fields without acknowledging the others:

% vms_by_status(+Status, -VMList)
% Returns a list of VM Dicts filtered by status.
vms_by_status(Status, VMList) :-
    findall(VM,
        (vm_record(_, VM), VM.status = Status),
        VMList).

% vms_by_node(+Node, -VMList)
% Returns all VM Dicts assigned to a given Proxmox node.
vms_by_node(Node, VMList) :-
    findall(VM,
        (vm_record(_, VM), VM.node = Node),
        VMList).

Load the file and test these:

?- vms_by_status(online, VMs),
   maplist([VM]>>(format("  ~w (~w)~n", [VM.name, VM.ip])), VMs).
  mint_logic_lab (192.168.10.20)
  pfsense_fw (192.168.10.1)
  backup_target (192.168.10.30)
VMs = [vm{agent:true, cores:4, ip:'192.168.10.20', ...}, ...].

The maplist/2 call here uses a lambda expression — the [VM]>>(...) syntax. This is SWI-Prolog's lambda notation, provided by library(yall) (Yet Another Lambda Library), which is loaded automatically in 10.x. The [VM]>>(Goal) term reads as "a goal that, given VM, executes Goal." It allows us to pass an inline goal to higher-order predicates like maplist/2 without defining a separate named predicate for it. We will use this syntax frequently from this point forward because it significantly reduces the number of small helper predicates that would otherwise clutter the module's namespace.
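Lambdas are not limited to maplist/2. As a small sketch, the following folds a list of VM Dicts into a RAM total with foldl/4. The predicate total_ram_gb/2 is hypothetical and not part of infrastructure.pl; it assumes every Dict in the list carries a numeric ram_gb field.

```prolog
:- use_module(library(apply)).   % foldl/4
:- use_module(library(yall)).    % Params>>Goal lambdas

% total_ram_gb(+VMs, -Total)
% Sums the ram_gb field over a list of VM Dicts.
total_ram_gb(VMs, Total) :-
    foldl([VM, Acc0, Acc]>>( get_dict(ram_gb, VM, GB),
                             Acc is Acc0 + GB ),
          VMs, 0, Total).
```

The three lambda parameters correspond to the list element, the accumulator before the step, and the accumulator after it.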

4.3 Field Access, Pattern Matching, and Dict Unification

Dict field access with the . operator (VM.status) is functional notation that the compiler rewrites into a call to the built-in ./3 predicate, placed just before the goal in which the expression appears. It is closely related to get_dict/3, as in get_dict(status, VM, Value), but the two have different failure characteristics that matter in practice. The dot notation VM.status throws a type error if VM is not a Dict, and throws an existence error if the key status is absent from the Dict. The get_dict/3 predicate, by contrast, simply fails when the key is absent. It still raises a type error if its second argument is not a Dict at all, so a predicate that cannot trust its input should check is_dict/1 first. In rule bodies where the shape of an incoming Dict is guaranteed by the caller, dot notation is the cleaner choice. In defensive predicates that must handle incomplete input, such as a predicate that validates data arriving from an external source before it enters the knowledge base, get_dict/3 is the safer tool.

% safe_get_status(+Dict, -Status)
% Retrieves the status field from a Dict if present, otherwise unknown.
safe_get_status(Dict, Status) :-
    (   get_dict(status, Dict, Status)
    ->  true
    ;   Status = unknown
    ).

?- safe_get_status(vm{name: test, status: online}, S).
S = online.

?- safe_get_status(vm{name: test}, S).
S = unknown.

This if-then-else pattern around get_dict/3 will appear frequently in the Go integration chapters, where Prolog receives Dicts constructed from external JSON data that may not always contain every expected field.
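The same pattern generalises to any key and any default. The sketch below is one possible shape for such a helper; dict_field_or/4 is a hypothetical name. It also guards with is_dict/1, on the assumption that data arriving from outside may not be a Dict at all.

```prolog
% dict_field_or(+Key, +Term, +Default, -Value)
% Value is the Key field of Term when Term is a Dict that has Key;
% otherwise Value is Default. Never raises on non-Dict input.
dict_field_or(Key, Term, Default, Value) :-
    (   is_dict(Term),
        get_dict(Key, Term, Found)
    ->  Value = Found
    ;   Value = Default
    ).
```

With this helper, safe_get_status(Dict, Status) could be expressed as dict_field_or(status, Dict, unknown, Status).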

Dicts also participate in unification, but the semantics are stricter than they first appear. Plain unification with =/2 succeeds only if the two Dicts have unifiable tags, exactly the same set of keys, and unifiable values for every key. A two-key pattern therefore does not unify with a four-key Dict. For partial matching, SWI-Prolog provides the :< operator, a close relative of select_dict/3: Pattern :< Dict succeeds if the tags unify and every key in Pattern is present in Dict with a unifiable value. Keys that appear only in Dict are ignored. This is what makes Dicts useful as flexible patterns:

?- vm{status: S, role: R} :< vm{name: test, status: online, role: development, cores: 4}.
S = online,
R = development.

The pattern Dict on the left has only two keys; the data Dict on the right has four. The match succeeds, binding S and R, and ignoring the name and cores fields that the pattern does not mention. This is structurally analogous to inspecting only certain fields of a Go struct: a partial specification that matches any Dict containing at least those keys with compatible values.

DICT PARTIAL MATCHING WITH :<
─────────────────────────────────────────────────────────────────
  Pattern (2 keys):          Data (4 keys):
  ┌─────────────────┐        ┌─────────────────────────────┐
  │ vm{}            │        │ vm{}                        │
  │  status: S  ────┼──────▶ │  name:   test         (ignored)
  │  role:   R  ────┼──────▶ │  status: online    →  S = online
  └─────────────────┘        │  role:   development →  R = development
                             │  cores:  4            (ignored)
                             └─────────────────────────────┘

  Keys present in pattern but not data  → match FAILS
  Keys present in data but not pattern  → silently IGNORED
  Keys present in both                  → values must UNIFY

  Result: S = online, R = development
─────────────────────────────────────────────────────────────────
  A Dict pattern is a CONTRACT, not a complete description.
  Rules only declare the fields they actually need.

Partial matching with :< belongs in a rule body or inside a findall/3 or aggregate_all/3 goal. A Dict written directly in a clause head is matched by plain unification and so must list every key. To count the online VMs, for example, the goal is aggregate_all(count, (vm_record(_, VM), vm{status: online} :< VM), Count): the pattern vm{status: online} matches any VM Dict whose status field is online, regardless of what other fields it has.
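A minimal sketch of pattern-based counting follows. It uses SWI-Prolog's :< operator, which succeeds when every key of the left-hand Dict appears in the right-hand Dict with a unifiable value. The facts demo_vm/2 and the predicate count_by_status/2 are hypothetical stand-ins so that the example is self-contained; in the chapter's own module the facts would be vm_record/2.

```prolog
:- use_module(library(aggregate)).   % aggregate_all/3

% demo_vm(?Name, ?VMDict): illustrative facts only.
demo_vm(a, vm{name: a, status: online,  cores: 4}).
demo_vm(b, vm{name: b, status: offline, cores: 2}).
demo_vm(c, vm{name: c, status: online,  cores: 2}).

% count_by_status(+Status, -Count)
% Counts the VM Dicts whose status field matches Status.
% The pattern names only the field it cares about.
count_by_status(Status, Count) :-
    aggregate_all(count,
        ( demo_vm(_, VM),
          vm{status: Status} :< VM ),
        Count).
```

Adding a new field to the demo_vm/2 facts would not require any change to count_by_status/2.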

The dict_pairs/3 predicate is the bridge between the Dict world and the list world, and it is worth understanding because it will appear in the Go serialisation code in Chapter 6. It converts a Dict to a list of Key-Value pairs and back:

?- dict_pairs(vm{name: test, status: online, cores: 4}, Tag, Pairs).
Tag = vm,
Pairs = [cores-4, name-test, status-online].

?- dict_pairs(D, vm, [name-mint_logic_lab, status-online, ram_gb-16]).
D = vm{name:mint_logic_lab, ram_gb:16, status:online}.

The pairs are always in key-sorted order in the output. The reverse direction, constructing a Dict from a list of pairs, is equally direct. This bidirectionality means we can freely convert between Dicts and lists of Key-Value pairs, which is useful when interfacing with predicates that predate the Dict type and still operate on such lists.
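One use of the round trip is trimming a Dict down to a chosen set of keys using ordinary list predicates. The following is a sketch under that assumption; dict_keep_keys/3 and key_in/2 are hypothetical helpers, not library predicates.

```prolog
:- use_module(library(apply)).   % include/3
:- use_module(library(lists)).   % memberchk/2

% dict_keep_keys(+Keys, +Dict, -Slim)
% Slim has the same tag as Dict and only the pairs whose key is in Keys.
dict_keep_keys(Keys, Dict, Slim) :-
    dict_pairs(Dict, Tag, Pairs),          % Dict -> sorted Key-Value list
    include(key_in(Keys), Pairs, Kept),    % plain list filtering
    dict_pairs(Slim, Tag, Kept).           % list -> Dict again

% key_in(+Keys, +Pair): true if the pair's key occurs in Keys.
key_in(Keys, Key-_) :-
    memberchk(Key, Keys).
```

For example, dict_keep_keys([name, status], VM, Slim) keeps just the two fields a status report might need.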

Dict modification uses put_dict/4:

% update_vm_status(+Name, +NewStatus, -OldVM, -NewVM)
% Returns a new VM Dict with the status field updated.
update_vm_status(Name, NewStatus, OldVM, NewVM) :-
    vm_record(Name, OldVM),
    put_dict(status, OldVM, NewStatus, NewVM).

?- update_vm_status(debian_core, online, Old, New).
Old = vm{agent:true, cores:2, ip:'192.168.10.21', name:debian_core,
         node:pve_node1, ram_gb:8, role:orchestrator, status:offline,
         storage:local_lvm},
New = vm{agent:true, cores:2, ip:'192.168.10.21', name:debian_core,
         node:pve_node1, ram_gb:8, role:orchestrator, status:online,
         storage:local_lvm}.

The original Dict is unchanged. put_dict/4 creates a new Dict with the updated field, leaving the original intact. This immutability is consistent with Prolog's single-assignment semantics: we never destructively modify a term, we produce a new version of it. In the context of the Go-Prolog bridge we build in Part III, this property is valuable: a Dict handed across the bridge is never mutated in place, which removes one class of concurrency bugs that would otherwise require explicit mutex protection.

put_dict/3 (three-argument form) allows updating multiple fields simultaneously from a list of pairs:

?- vm_record(debian_core, VM),
   put_dict([status-online, ram_gb-16], VM, Updated).
Updated = vm{agent:true, cores:2, ip:'192.168.10.21', name:debian_core,
             node:pve_node1, role:orchestrator, ram_gb:16,
             status:online, storage:local_lvm}.

Notice that ram_gb was updated from 8 to 16 at the same time as status moved from offline to online. The three-argument put_dict/3 takes a list of Key-Value pairs as its first argument and applies all updates atomically, returning a single new Dict. This is the form to reach for whenever more than one field needs changing — it is cleaner and more efficient than chaining multiple put_dict/4 calls.
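put_dict/3 also accepts a Dict as its first argument, which reads naturally when the update itself arrives as structured data. A short sketch, with bring_online/3 as a hypothetical helper:

```prolog
% bring_online(+VM0, +RamGB, -VM)
% VM is VM0 with status set to online and ram_gb set to RamGB.
% VM0 itself is left untouched.
bring_online(VM0, RamGB, VM) :-
    put_dict(_{status: online, ram_gb: RamGB}, VM0, VM).
```

The anonymous Dict _{...} acts purely as a carrier for the new key-value pairs.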

4.4 Modelling Hardware Sensors

A practical application of Dicts that connects directly to homelab infrastructure management is sensor data modelling. Modern Proxmox hosts expose hardware sensor data — CPU temperature, fan speeds, power draw, drive temperatures — through the Linux lm-sensors subsystem. In Chapter 6, a Go process will read this data and pass it to the Prolog engine as Dicts. For now, we model representative sensor readings as static facts to establish the data structure that Go will eventually populate dynamically.

Add the following to infrastructure.pl:

% sensor_reading(+SensorID, -SensorDict)
% Models hardware sensor readings from the Proxmox host.
% In production, these facts are asserted dynamically by the Go process.
sensor_reading(cpu0_temp, sensor{
    id:       cpu0_temp,
    type:     temperature,
    value:    62.0,
    unit:     celsius,
    source:   coretemp,
    node:     pve_node1,
    critical: 95.0,
    warning:  80.0
}).
sensor_reading(cpu1_temp, sensor{
    id:       cpu1_temp,
    type:     temperature,
    value:    58.5,
    unit:     celsius,
    source:   coretemp,
    node:     pve_node1,
    critical: 95.0,
    warning:  80.0
}).
sensor_reading(nvme0_temp, sensor{
    id:       nvme0_temp,
    type:     temperature,
    value:    41.0,
    unit:     celsius,
    source:   nvme,
    node:     pve_node1,
    critical: 70.0,
    warning:  55.0
}).
sensor_reading(psu_power, sensor{
    id:       psu_power,
    type:     power,
    value:    187.0,
    unit:     watts,
    source:   acpi,
    node:     pve_node1,
    critical: 400.0,
    warning:  320.0
}).

% sensor_alert(+SensorID, -AlertDict)
% True if a sensor reading exceeds its warning or critical threshold.
sensor_alert(ID, alert{
    sensor:    ID,
    level:     critical,
    value:     S.value,
    threshold: S.critical,
    unit:      S.unit
}) :-
    sensor_reading(ID, S),
    S.value >= S.critical.

sensor_alert(ID, alert{
    sensor:    ID,
    level:     warning,
    value:     S.value,
    threshold: S.warning,
    unit:      S.unit
}) :-
    sensor_reading(ID, S),
    S.value >= S.warning,
    S.value < S.critical.

Query the alert system:

?- sensor_alert(ID, Alert).
false.

No current alerts — all sensor values are within their warning thresholds. We can verify the margins directly:

?- sensor_reading(cpu0_temp, S),
   Margin is S.warning - S.value.
S = sensor{critical:95.0, id:cpu0_temp, ...},
Margin = 18.0.

The CPU temperature of 62°C is 18 degrees below the warning threshold of 80°C. The is/2 operator performs arithmetic evaluation here — as we noted in Chapter 3, S.warning - S.value without is/2 would produce the compound term -(80.0, 62.0), not the number 18.0. Dict field access expressions like S.warning are evaluated to their bound values before arithmetic is applied, so the expression works as expected.

The sensor_alert/2 predicate produces a new Dict as its output — an alert Dict constructed from the fields of the sensor Dict. This is precisely the pattern that the Go integration will use: Go sends a sensor Dict in, Prolog evaluates it against threshold rules and returns an alert Dict out. The entire exchange is structured data in, structured data out, with the logic in Prolog and the I/O in Go.
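The "Dict in, result out" shape can also be sketched as a predicate that takes a sensor Dict directly as an argument, which is closer to what a caller passing in freshly read data would need. reading_level/2 is a hypothetical helper; it assumes the Dict carries numeric value, warning, and critical fields as in the facts above.

```prolog
% reading_level(+Sensor, -Level)
% Level is critical, warning, or ok for the given sensor Dict.
% Fails if any of the three required fields is missing.
reading_level(Sensor, Level) :-
    get_dict(value,    Sensor, Value),
    get_dict(warning,  Sensor, Warning),
    get_dict(critical, Sensor, Critical),
    (   Value >= Critical -> Level = critical
    ;   Value >= Warning  -> Level = warning
    ;   Level = ok
    ).
```

Because the thresholds travel inside the Dict, the rule itself contains no sensor-specific constants.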

4.5 Composing Dicts: The vm_summary Predicate

To close the Dict section, we build a vm_summary/2 predicate that aggregates information from multiple sources — the VM record and the sensor readings — into a single summary Dict. This demonstrates Dict construction as a compositional tool and produces a term that is directly serialisable to JSON.

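% vm_summary(+VMName, -SummaryDict)
% Constructs a comprehensive summary Dict for a named VM.
% alert_count is the number of active alerts across all sensors;
% it is not filtered by the VM's node.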
vm_summary(Name, summary{
    name:        VM.name,
    status:      VM.status,
    role:        VM.role,
    ip:          VM.ip,
    node:        VM.node,
    ram_gb:      VM.ram_gb,
    cores:       VM.cores,
    alert_count: AlertCount
}) :-
    vm_record(Name, VM),
    aggregate_all(count,
        sensor_alert(_, _),
        AlertCount).

?- vm_summary(mint_logic_lab, Summary).
Summary = summary{
    alert_count: 0,
    cores: 4,
    ip: '192.168.10.20',
    name: mint_logic_lab,
    node: pve_node1,
    ram_gb: 16,
    role: development,
    status: online
}.

This Dict is directly serialisable to JSON — which is exactly what will happen in Chapter 6 when we wire the Go HTTP server to the Prolog engine. SWI-Prolog's library(http/json) can convert a Dict to a JSON object with a single predicate call, and the inverse — from a Go-generated JSON payload to a Prolog Dict — is equally direct. The data model we are establishing here is not incidental. It is the contract between the Prolog reasoning layer and the Go I/O layer, and defining it clearly now means that the Go integration code in Part III has a stable, well-understood target to work against.
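As a preview, and only as a sketch, the round trip might look like the following. json_write_dict/3 and atom_json_dict/3 are predicates of library(http/json); the wrapper names dict_json_string/2 and json_string_dict/2 are hypothetical. Note two assumptions that follow from the library's defaults: the Dict's tag is not written to the JSON text, and atom values such as mint_logic_lab come back as strings when the JSON is read again.

```prolog
:- use_module(library(http/json)).

% dict_json_string(+Dict, -Json)
% Serialises Dict to a JSON string on a single line.
dict_json_string(Dict, Json) :-
    with_output_to(string(Json),
        json_write_dict(current_output, Dict, [width(0)])).

% json_string_dict(+Json, -Dict)
% Parses JSON text into a Dict.
json_string_dict(Json, Dict) :-
    atom_json_dict(Json, Dict, []).
```

A caller would typically pass the result of vm_summary/2 to dict_json_string/2 before writing it to an HTTP response.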

4.6 Strings in SWI-Prolog 10.x

String handling in SWI-Prolog has a history worth knowing, because it explains some behaviour that can initially seem inconsistent. Before version 7, double-quoted text like "hello" was interpreted as a list of character codes: integers representing ASCII or Unicode values. This is the traditional behaviour of ISO-style Prolog systems and is useful for character-level text processing, but it is verbose and inefficient for the modern use case of handling log messages, JSON values, and file paths. SWI-Prolog 7 introduced a proper string type, and in 10.x, double-quoted text creates a native string object by default. The Prolog flag double_quotes controls this behaviour, and its default value in 10.x is string. If a source file written for an older SWI-Prolog version expects code lists and behaves unexpectedly under the new default, adding :- set_prolog_flag(double_quotes, codes). at the top of that file restores the old interpretation for that file.

The distinction between atoms and strings is practical rather than cosmetic, and it is worth being precise about it. Atoms are interned identifiers: when the engine loads the atom open_source, it stores it once in a global atom table and thereafter represents every occurrence of open_source as a single pointer to that entry. Comparing two atoms for equality is therefore a pointer comparison, which is extremely fast regardless of the length of the atom's name. Atoms are suited for symbolic values: categories, states, identifiers, keys. Strings, by contrast, are sequences of Unicode code points stored as ordinary data rather than in the atom table. They are not interned: two strings with identical content are two separate objects, and comparing them means comparing their contents. This makes strings better suited for text that arrives from the outside world and needs to be processed: log lines, configuration file contents, HTTP response bodies, file paths, and the natural language input from the LLM interface that we introduce in Part V.

The practical rule of thumb is: if a value is a label, a key, or a state that will be compared symbolically, use an atom. If a value is text that will be split, searched, concatenated, or passed to an external system, use a string. When in doubt, atom_string/2 converts in either direction.
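The rule of thumb can be sketched as a boundary predicate: text comes in as a string, is normalised, and becomes an atom once it is known to be a symbolic state. status_atom/2 is a hypothetical helper; lower-casing is an assumption about how the external source spells its states.

```prolog
% status_atom(+Text, -Atom)
% Converts external text such as "ONLINE" into the atom online.
status_atom(Text, Atom) :-
    string_lower(Text, Lower),    % processing happens on the string
    atom_string(Atom, Lower).     % the symbolic value is an atom
```

Note that a string and an atom with the same characters are different terms: "online" == online is false, which is why the conversion step matters before comparing against facts that store atoms.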

Let us examine the string predicates systematically with examples drawn from real infrastructure management tasks. The first and most frequently used is split_string/4:

% Splitting a CSV line from a system report
?- split_string("mint_logic_lab,online,16,4", ",", "", Fields).
Fields = ["mint_logic_lab", "online", "16", "4"].

% Splitting with padding removal (strips spaces around separators)
?- split_string("mint_logic_lab , online , 16 , 4", ",", " ", Fields).
Fields = ["mint_logic_lab", "online", "16", "4"].

% Splitting a file path into components
?- split_string("/home/logicdev/logic-lab/prolog", "/", "", Parts).
Parts = ["", "home", "logicdev", "logic-lab", "prolog"].

Notice the empty string at the head of the path split result — the leading / produces an empty component before the first separator. When processing file paths, this is handled by filtering with exclude(=(""), Parts, CleanParts) or simply pattern-matching with Parts = ["" | Meaningful].
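Wrapped up as a reusable predicate, the filtering approach might look like this. path_components/2 is a hypothetical helper; it also drops the empty components produced by doubled or trailing slashes, which is an assumption about what callers want.

```prolog
:- use_module(library(apply)).   % exclude/3

% path_components(+Path, -Components)
% Splits Path on "/" and removes empty components.
path_components(Path, Components) :-
    split_string(Path, "/", "", Parts),
    exclude(==(""), Parts, Components).
```

The result is a list of strings, so it can be passed straight to further string predicates.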

String construction is equally important. The format/3 predicate, the three-argument sibling of the format/2 we have been using for terminal output throughout this book, works as a string builder when given string(S) as its first argument:

?- format(string(Msg), "VM ~w is ~w on node ~w", [mint_logic_lab, online, pve_node1]).
Msg = "VM mint_logic_lab is online on node pve_node1".

This is the cleanest way to construct strings that incorporate variable values, and it is the pattern we will use in Chapter 5 when building log entry formatters and in the LLM prompt constructors of Part V.

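The string_concat/3 predicate concatenates two strings and is useful for simple two-part joins, but for joining more than two parts, atomic_list_concat/2 (which accepts a list) is more readable. Note that atomic_list_concat always produces an atom, even when its inputs are strings; atom_string/2 converts the result if a string is needed: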

% Two-part join
?- string_concat("backup_", "job_001", Name).
Name = "backup_job_001".

% Multi-part join — atomic_list_concat handles mixed atoms and strings
?- atomic_list_concat(['/backups', '/', mint_logic_lab, '/', '20260615'], Path).
Path = '/backups/mint_logic_lab/20260615'.

% With a separator — join a list of components with a delimiter
?- atomic_list_concat(["INFO", "sshd", "Connection accepted"], " | ", Line).
Line = 'INFO | sshd | Connection accepted'.

For working with the individual characters of a string, string_codes/2 gives access to the list of Unicode code points, and string_chars/2 gives the list of single-character atoms. The code-point approach is faster for algorithmic text processing; the character atom approach is more readable for pattern matching:

?- string_chars("online", Chars).
Chars = [o, n, l, i, n, e].

?- string_codes("AB", Codes).
Codes = [65, 66].

A common real-world task is extracting a numeric value from a string that mixes text and digits — for example, a memory string like "16GB" from a system report. The combination of split_string/4 and number_string/2 handles this cleanly:

% parse_memory_string(+MemStr, -Bytes)
% Parses strings like "16GB", "512MB", "2048KB" into byte counts.
parse_memory_string(MemStr, Bytes) :-
    string_upper(MemStr, Upper),
    (   split_string(Upper, "G", "", [NumStr | _]),
        number_string(N, NumStr)
    ->  Bytes is N * 1_073_741_824
    ;   split_string(Upper, "M", "", [NumStr | _]),
        number_string(N, NumStr)
    ->  Bytes is N * 1_048_576
    ;   split_string(Upper, "K", "", [NumStr | _]),
        number_string(N, NumStr)
    ->  Bytes is N * 1_024
    ;   number_string(Bytes, MemStr)
    ).

?- parse_memory_string("16GB", Bytes).
Bytes = 17179869184.

?- parse_memory_string("512MB", Bytes).
Bytes = 536870912.

The 1_073_741_824 notation uses underscore separators in integer literals, which SWI-Prolog 10.x supports for readability — 1_073_741_824 is the same value as 1073741824. This is a minor but welcome addition that makes large constant values in system code much easier to read and audit.
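An alternative formulation, offered as a sketch rather than a replacement, moves the unit multipliers into a small fact table and strips the suffix with string_concat/3. The names unit_multiplier/2 and parse_memory/2 are hypothetical. Unlike parse_memory_string/2, this version assumes the full two-letter suffix (GB, MB, KB) and does not accept a bare "16G".

```prolog
% unit_multiplier(?Suffix, ?BytesPerUnit)
unit_multiplier("GB", 1_073_741_824).
unit_multiplier("MB", 1_048_576).
unit_multiplier("KB", 1_024).

% parse_memory(+MemStr, -Bytes)
% Parses "16GB", "512mb", "2048KB", or a plain number of bytes.
parse_memory(MemStr, Bytes) :-
    string_upper(MemStr, Upper),
    (   unit_multiplier(Suffix, Mult),
        string_concat(NumStr, Suffix, Upper),   % strip the known suffix
        number_string(N, NumStr)
    ->  Bytes is N * Mult
    ;   number_string(Bytes, Upper)             % no suffix: already bytes
    ).
```

Supporting a new unit then means adding one fact rather than another branch.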

Now add the version parsing predicates we will need for the infrastructure knowledge base:

% parse_version(+VersionString, -Major, -Minor, -Patch)
% Parses a semantic version string into its numeric components.
parse_version(VersionStr, Major, Minor, Patch) :-
    split_string(VersionStr, ".", "", Parts),
    Parts = [MajorStr, MinorStr, PatchStr | _],
    number_string(Major, MajorStr),
    number_string(Minor, MinorStr),
    number_string(Patch, PatchStr).

% version_at_least(+InstalledStr, +MinMajor, +MinMinor)
% True if the installed version meets a minimum requirement.
version_at_least(InstalledStr, MinMajor, MinMinor) :-
    parse_version(InstalledStr, Major, Minor, _),
    (   Major > MinMajor
    ->  true
    ;   Major =:= MinMajor, Minor >= MinMinor
    ).

% version_compare(+VersionStrA, +VersionStrB, -Order)
% Order is `gt`, `lt`, or `eq` comparing A to B.
version_compare(A, B, Order) :-
    parse_version(A, MajA, MinA, PatA),
    parse_version(B, MajB, MinB, PatB),
    compare_triples(MajA-MinA-PatA, MajB-MinB-PatB, Order).

compare_triples(X-_-_, Y-_-_, gt) :- X > Y, !.
compare_triples(X-_-_, Y-_-_, lt) :- X < Y, !.
compare_triples(_-X-_, _-Y-_, gt) :- X > Y, !.
compare_triples(_-X-_, _-Y-_, lt) :- X < Y, !.
compare_triples(_-_-X, _-_-Y, gt) :- X > Y, !.
compare_triples(_-_-X, _-_-Y, lt) :- X < Y, !.
compare_triples(_,     _,     eq).

?- parse_version("10.2.1", Major, Minor, Patch).
Major = 10, Minor = 2, Patch = 1.

?- version_at_least("10.2.1", 10, 0).
true.

?- version_at_least("9.6.0", 10, 0).
false.

?- version_compare("10.2.1", "10.1.9", Order).
Order = gt.

?- version_compare("9.6.0", "10.0.0", Order).
Order = lt.

The =:=/2 operator used in version_at_least is numeric equality — it evaluates both sides as arithmetic expressions before comparing. This sits alongside related operators that are frequently confused by developers coming from other languages. =/2 performs unification without arithmetic evaluation: ?- X = 1+2. binds X to the compound term +(1,2), not the integer 3. ==/2 tests structural identity without binding any variables: ?- 1+2 == 1+2. succeeds, but ?- X == 1+2. fails if X is unbound. =:=/2 evaluates both sides arithmetically: ?- 1+2 =:= 3. succeeds. The arithmetic inequality operators follow the same evaluation model: =\= (not equal), <, >, =< (less than or equal), and >=.

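The version_compare/3 predicate plays the same role as the built-in compare/3, but it uses explicit comparison steps because semantic versioning requires comparing three numeric components in priority order rather than a single value, and it reports gt, lt, or eq rather than the >, <, and = of compare/3. The cuts in compare_triples/3 are red cuts, not green ones: each clause relies on the earlier clauses having failed, so the minor-version clauses are only correct once the major versions are known to be equal. Removing the cuts would let the engine backtrack into later clauses and report additional, incorrect answers. For the same reason, the predicate should be called with Order unbound, as its mode declaration indicates.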
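A cut-free variant can be sketched with the built-in compare/3, relying on the fact that the standard order of terms compares two lists of equal length element by element. The names version_order/3, version_numbers/2, and delta_order/2 are hypothetical; this is an alternative formulation, not the chapter's version_compare/3.

```prolog
% version_numbers(+VersionStr, -Numbers)
% "10.2.1" -> [10, 2, 1]. Extra components after the third are ignored.
version_numbers(VersionStr, [Major, Minor, Patch]) :-
    split_string(VersionStr, ".", "", [MajS, MinS, PatS | _]),
    number_string(Major, MajS),
    number_string(Minor, MinS),
    number_string(Patch, PatS).

% delta_order(?Delta, ?Order): maps compare/3 results to gt, lt, eq.
delta_order((>), gt).
delta_order((<), lt).
delta_order((=), eq).

% version_order(+VersionA, +VersionB, ?Order)
% Order is gt, lt, or eq comparing A to B.
version_order(A, B, Order) :-
    version_numbers(A, NumsA),
    version_numbers(B, NumsB),
    compare(Delta, NumsA, NumsB),
    delta_order(Delta, Order).
```

Because there are no cuts and compare/3 yields exactly one result, this version also behaves correctly when Order is already bound.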

4.7 The Compound Term as a Domain Specific Language

Before moving to the chapter project, it is worth examining one more data modelling pattern: using compound terms to create a small Domain Specific Language for system administration tasks. A DSL in this context is not a separate language — it is a set of Prolog terms whose structure encodes the semantics of a domain so naturally that reading the terms feels like reading the domain's own vocabulary. This is one of the areas where Prolog's homoiconicity — the property that code and data share the same syntactic representation — becomes a genuine practical asset rather than a theoretical curiosity.

The question of when to use a Dict versus a compound term DSL is worth addressing directly, because both are legitimate tools and choosing between them affects readability and maintainability. Dicts are the right choice for entities — things with a stable identity and a set of named attributes that may change over time. The VM records in section 4.2 are Dicts because a VM is an entity: it has a name, a status, an IP, and the set of attributes we track may expand as requirements grow without breaking existing rules. Compound term DSLs are the right choice for structured actions or specifications — things whose hierarchical structure carries semantic meaning beyond just a bag of attributes. A backup job specification is not just a collection of values; it has a source and a destination and a schedule and a retention policy, and the relationships between these components are part of the meaning. Nested compound terms express those relationships more clearly than a flat Dict would.

Consider backup operations. A backup job has a source, a destination, a schedule, a retention policy, and a method. We model this with nested compound terms:

% backup_job(JobID, Spec)
% Models scheduled backup jobs using a compound term DSL.
backup_job(job_001,
    backup(
        source(vm(mint_logic_lab), path('/home/logicdev/logic-lab')),
        destination(vm(backup_target), path('/backups/mint')),
        schedule(daily, time(02, 30)),
        retention(keep_last(7), keep_weekly(4)),
        method(rsync, options([compress, checksum, delete_stale]))
    )
).

backup_job(job_002,
    backup(
        source(vm(debian_core), path('/etc')),
        destination(vm(backup_target), path('/backups/debian-etc')),
        schedule(hourly, minute(0)),
        retention(keep_last(24), keep_daily(7)),
        method(rsync, options([compress, checksum]))
    )
).

These terms are readable without supplementary documentation. schedule(daily, time(02, 30)) means "daily at 02:30" (Prolog reads 02 as the integer 2, so the leading zero is purely cosmetic). retention(keep_last(7), keep_weekly(4)) means "keep the last 7 backups and 4 weekly backups." The structure is checked by the rules that consume the terms: nothing stops someone from writing retention(7, 4) as a fact, but it cannot be silently accepted as meaning the same thing. A rule that pattern-matches on retention(keep_last(N), _) will simply fail to match retention(7, 4), so the malformed value is rejected by the first rule that inspects it rather than propagating through the system.
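The failure to match can also be turned into an explicit check. The following is a minimal sketch, assuming two hypothetical helpers, valid_retention/1 and valid_policy/1, that are not part of the chapter's code. The list of accepted policy names is taken from the two jobs above:

```prolog
% valid_retention(+Term)
% Hypothetical helper: accepts only retention/2 terms whose arguments
% are both tagged keep_* policies.
valid_retention(retention(First, Second)) :-
    valid_policy(First),
    valid_policy(Second).

% valid_policy(+Policy)
% A policy is a one-argument term with a known name and a positive
% integer count, for example keep_last(7).
valid_policy(Policy) :-
    Policy =.. [Kind, Count],
    memberchk(Kind, [keep_last, keep_daily, keep_weekly]),
    integer(Count),
    Count > 0.
```

?- valid_retention(retention(keep_last(7), keep_weekly(4))).
true.

?- valid_retention(retention(7, 4)).
false.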

Rules that query and validate this DSL are equally readable:

% jobs_for_source_vm(+VMName, -JobIDs)
% Returns all backup job IDs where the source is the named VM.
jobs_for_source_vm(VMName, JobIDs) :-
    findall(ID,
        backup_job(ID, backup(source(vm(VMName), _), _, _, _, _)),
        JobIDs).

% next_backup_path(+JobID, -FullPath)
% Constructs the full destination path for a job's next backup run.
next_backup_path(JobID, FullPath) :-
    backup_job(JobID, backup(_, destination(_, path(BasePath)), _, _, _)),
    get_time(Now),
    format_time(atom(DateStr), '%Y%m%d', Now),
    atomic_list_concat([BasePath, '/', DateStr], FullPath).

% job_uses_compression(+JobID)
% True if the job's rsync method includes the compress option.
job_uses_compression(JobID) :-
    backup_job(JobID, backup(_, _, _, _, method(rsync, options(Opts)))),
    member(compress, Opts).

% validate_backup_job(+JobID, -Errors)
% Validates a backup job spec against the live infrastructure knowledge base.
% Returns a list of error terms; empty list means the job is valid.
validate_backup_job(JobID, Errors) :-
    backup_job(JobID, backup(
        source(vm(SrcVM), _),
        destination(vm(DstVM), _),
        _Schedule, _Retention, _Method
    )),
    findall(Error,
        (   \+ vm_record(SrcVM, _), Error = error(unknown_source_vm, SrcVM)
        ;   \+ vm_record(DstVM, _), Error = error(unknown_destination_vm, DstVM)
        ;   vm_record(SrcVM, SrcDict), SrcDict.status \= online,
            Error = error(source_vm_offline, SrcVM)
        ;   vm_record(DstVM, DstDict), DstDict.status \= online,
            Error = error(destination_vm_offline, DstVM)
        ),
        Errors
    ).

?- jobs_for_source_vm(mint_logic_lab, Jobs).
Jobs = [job_001].

?- next_backup_path(job_001, Path).
Path = '/backups/mint/20260615'.

?- job_uses_compression(job_001).
true.

?- validate_backup_job(job_001, Errors).
Errors = [].

?- validate_backup_job(job_002, Errors).
Errors = [error(source_vm_offline, debian_core)].

The validate_backup_job/2 predicate is the most important addition here. It uses the infrastructure knowledge base to validate the backup job specification before any execution takes place. debian_core is currently offline in our knowledge base, so job_002 fails validation with a source_vm_offline error. This is the fundamental pattern of the entire book stated at the level of a single predicate: declare a specification as a compound term, validate it against a live knowledge base before acting on it, and return structured error terms that both the logging system and the human operator can understand. The Go process in Part III will never issue a backup command without first querying validate_backup_job/2 and confirming the error list is empty.
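The "validate before acting" step can be sketched on the Prolog side as a small guard. This is an illustration only: backup_decision/2 and guarded_backup/2 are hypothetical names, and the way Part III actually consumes the error list may differ.

```prolog
% backup_decision(+Errors, -Decision)
% Hypothetical helper: maps the error list produced by
% validate_backup_job/2 to a decision term.
backup_decision([], run).
backup_decision([E|Es], refuse([E|Es])).

% guarded_backup(+JobID, -Decision)
% Assumes validate_backup_job/2 from this section is loaded.
guarded_backup(JobID, Decision) :-
    validate_backup_job(JobID, Errors),
    backup_decision(Errors, Decision).
```

With the knowledge base shown above, guarded_backup(job_001, D) would give D = run, while job_002 would give D = refuse([error(source_vm_offline, debian_core)]).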

The get_time/1 and format_time/3 predicates handle date and time in the path construction. get_time/1 returns the current POSIX timestamp as a float. format_time/3 formats it according to a strftime-style format string, with atom(DateStr) as the output specifier producing an atom. The atomic_list_concat/2 predicate joins the list of path components into a single atom and accepts any mix of atomic values: if BasePath or DateStr happen to be strings rather than atoms, the call still succeeds and FullPath is still an atom.
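To make the path construction reproducible, the timestamp can be passed in instead of read from the clock. The following is a minimal sketch using a hypothetical dated_path/3 that mirrors the body of next_backup_path/2:

```prolog
% dated_path(+BasePath, +Stamp, -FullPath)
% Hypothetical variant of next_backup_path/2: Stamp is a POSIX
% timestamp supplied by the caller, BasePath may be an atom or a string.
dated_path(BasePath, Stamp, FullPath) :-
    format_time(atom(DateStr), '%Y%m%d', Stamp),
    atomic_list_concat([BasePath, '/', DateStr], FullPath).
```

The mixing of strings and atoms can be checked directly at the toplevel:

?- atomic_list_concat(["/backups/mint", '/', '20260615'], Path).
Path = '/backups/mint/20260615'.

Note that format_time/3 renders the timestamp in the local time zone, so the date produced for a given Stamp depends on where the code runs.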

4.8 The Mint System Health Monitor: Chapter Project

The chapter project brings together Dicts, strings, and compound terms into a unified system health monitor. The monitor aggregates VM status, sensor readings, and backup job status into a single health report, demonstrating how the data modelling techniques of this chapter compose into a coherent whole that could stand alone as a useful homelab tool even before the Go integration arrives.

Create ~/logic-lab/prolog/health_monitor.pl:

% health_monitor.pl
% Unified system health monitor for the Proxmox homelab.
% Part II, Chapter 4 - Modern SWI-Prolog (2026 Edition)

:- module(health_monitor, [
    system_health/1,
    health_report/0,
    component_status/3
]).

:- use_module(library(aggregate)).
:- use_module(infrastructure).

% component_status(+Type, +ID, -StatusDict)
% Unified status interface for any system component.
component_status(vm, Name, status{
    type:  vm,
    id:    Name,
    state: VM.status,
    role:  VM.role,
    node:  VM.node
}) :-
    vm_record(Name, VM).

component_status(sensor, ID, status{
    type:  sensor,
    id:    ID,
    state: State,
    value: S.value,
    unit:  S.unit
}) :-
    sensor_reading(ID, S),
    (   S.value >= S.critical -> State = critical
    ;   S.value >= S.warning  -> State = warning
    ;                            State = normal
    ).

% system_health(-HealthDict)
% Computes an overall system health summary.
system_health(health{
    overall:      Overall,
    vms_online:   OnlineCount,
    vms_offline:  OfflineCount,
    alerts:       Alerts,
    alert_count:  AlertCount
}) :-
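    % Dict unification requires identical key sets, so vm{status: online}
    % would never unify with a full VM record. Test the field instead.
    aggregate_all(count, (vm_record(_, VM1), get_dict(status, VM1, online)),  OnlineCount),
    aggregate_all(count, (vm_record(_, VM2), get_dict(status, VM2, offline)), OfflineCount),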
    findall(A, sensor_alert(_, A), Alerts),
    length(Alerts, AlertCount),
    (   AlertCount =:= 0, OfflineCount =:= 0 -> Overall = healthy
    ;   AlertCount  >  0                      -> Overall = degraded
    ;                                            Overall = warning
    ).

% health_report/0
% Prints a formatted health report to the terminal.
health_report :-
    system_health(H),
    format("~n╔══════════════════════════════════╗~n"),
    format("║     HOMELAB HEALTH REPORT        ║~n"),
    format("╠══════════════════════════════════╣~n"),
    format("║ Overall Status : ~w~n", [H.overall]),
    format("║ VMs Online     : ~w~n", [H.vms_online]),
    format("║ VMs Offline    : ~w~n", [H.vms_offline]),
    format("║ Active Alerts  : ~w~n", [H.alert_count]),
    format("╠══════════════════════════════════╣~n"),
    format("║ VM DETAILS~n"),
    forall(
        vm_record(_, VM),
        format("║  [~w] ~w  ~w  ~w~n",
               [VM.status, VM.role, VM.name, VM.ip])
    ),
    format("╚══════════════════════════════════╝~n~n").

From the ~/logic-lab/prolog directory, load the module and run the report:

?- [health_monitor].
true.

?- health_report.

╔══════════════════════════════════╗
║     HOMELAB HEALTH REPORT        ║
╠══════════════════════════════════╣
║ Overall Status : warning
║ VMs Online     : 3
║ VMs Offline    : 1
║ Active Alerts  : 0
╠══════════════════════════════════╣
║ VM DETAILS
║  [online] development  mint_logic_lab  192.168.10.20
║  [offline] orchestrator  debian_core  192.168.10.21
║  [online] firewall  pfsense_fw  192.168.10.1
║  [online] storage  backup_target  192.168.10.30
╚══════════════════════════════════╝

true.

The overall status is warning because debian_core is offline and there are no sensor alerts. The if-then-else in system_health/1 tests its conditions in order: a system with no alerts and no offline VMs is healthy, any active sensor alert makes it degraded regardless of VM state, and the remaining case (offline VMs but no alerts) yields warning. When we wire this to Go in Chapter 6, the system_health/1 predicate will be called by the Go HTTP handler, the resulting health Dict will be serialised to JSON, and a monitoring endpoint will serve it. The Prolog code shown here does not change at all for that transition; only the I/O wrapper changes, from a terminal format/2 call to a JSON HTTP response.
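The decision logic is easier to test when it is pulled out of system_health/1. The following is a minimal sketch using a hypothetical overall_status/3 that contains the same if-then-else and nothing else:

```prolog
% overall_status(+AlertCount, +OfflineCount, -Overall)
% Hypothetical extraction of the if-then-else in system_health/1.
overall_status(AlertCount, OfflineCount, Overall) :-
    (   AlertCount =:= 0, OfflineCount =:= 0 -> Overall = healthy
    ;   AlertCount > 0                       -> Overall = degraded
    ;                                           Overall = warning
    ).
```

?- overall_status(0, 0, S).
S = healthy.

?- overall_status(0, 1, S).
S = warning.

?- overall_status(2, 1, S).
S = degraded.

The last query shows that an alert outranks an offline VM: with both present, the result is degraded, not warning.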

HEALTH MONITOR DATA FLOW
─────────────────────────────────────────────────────────────────
  SOURCE FACTS                   DERIVED DICTS
  ────────────────               ──────────────────────────────
  vm_record/2        ──────────▶ component_status(vm, ...)
  (infrastructure.pl)            vm{status, role, node, ...}

  sensor_reading/2   ──────────▶ component_status(sensor, ...)
  (infrastructure.pl)            sensor_alert/2
                                 alert{level, value, threshold}

                    both feed ──▶ system_health/1
                                  health{
                                    overall:     warning,
                                    vms_online:  3,
                                    vms_offline: 1,
                                    alerts:      [],
                                    alert_count: 0
                                  }
                                        │
                          ┌─────────────┴──────────────┐
                          │  NOW (Chapter 4)            │  LATER (Chapter 6)
                          │  health_report/0            │  Go HTTP handler
                          │  format/2 → terminal        │  json_write_dict/2
                          │                             │  → HTTP/JSON response
                          └─────────────────────────────┘
─────────────────────────────────────────────────────────────────
  The Prolog logic layer does not change between these two modes.
  Only the I/O wrapper changes. This is the separation of concerns
  that makes the Go-Prolog architecture practical at scale.

That separation of concerns — logic entirely in Prolog, I/O entirely in Go — is the architectural principle that the rest of this book is built on.
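The JSON side of that wrapper can already be tried from the toplevel with json_write_dict/2 from library(http/json). This is a minimal sketch, assuming a hypothetical emit_health_json/2; the Chapter 6 handler may be structured differently:

```prolog
:- use_module(library(http/json)).

% emit_health_json(+Stream, +HealthDict)
% Hypothetical wrapper: writes the Dict as a JSON object followed by
% a newline. Atom values such as warning are emitted as JSON strings.
emit_health_json(Stream, Health) :-
    json_write_dict(Stream, Health),
    nl(Stream).
```

?- emit_health_json(current_output, health{overall: warning, vms_online: 3, vms_offline: 1, alerts: [], alert_count: 0}).

The query prints a JSON object with the same five keys. Nothing in system_health/1 needs to know which of the two output paths is in use.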

4.9 Chapter Summary and What Comes Next

The transition from positional compound terms to Dicts is not just a syntactic improvement. It is a shift in how we think about data in the knowledge base. A Dict is a self-describing entity whose field names carry the semantics of the data rather than relying on positional convention that exists only in the programmer's memory. The put_dict/4 pattern for immutable updates, the partial matching offered by get_dict/3, :</2, and select_dict/3 (plain unification of two Dicts succeeds only when their key sets are identical), and the direct correspondence between Prolog Dicts and JSON objects together make this data model the natural foundation for the Go integration ahead.
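Two of these points fit in a few lines. The following is a minimal sketch with a hypothetical sample record; sample_vm/1, bring_online/2, and is_offline/1 are illustrative names, not part of infrastructure.pl:

```prolog
% sample_vm(-VM): hypothetical record used only for this illustration.
sample_vm(vm{name: debian_core, status: offline, ip: '192.168.10.21'}).

% bring_online(-Old, -New)
% put_dict/4 returns a new Dict; Old is left unchanged.
bring_online(Old, New) :-
    sample_vm(Old),
    put_dict(status, Old, online, New).

% is_offline(+VM)
% :</2 succeeds when every key of the left-hand Dict occurs in VM
% with a unifying value, whatever other keys VM carries.
is_offline(VM) :-
    _{status: offline} :< VM.
```

After bring_online(Old, New), is_offline(Old) still succeeds and is_offline(New) fails. By contrast, the query sample_vm(VM), VM = vm{status: offline} fails, because unifying two Dicts requires them to have the same set of keys.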

The string handling tools introduced in section 4.6 — split_string/4, number_string/2, format_time/3 — are the same tools we will use in Chapter 5 to parse real log files from /var/log. And the backup job DSL of section 4.7 is a direct preview of the orchestration architecture in Part IV, where compound terms carry structured plans from the Prolog reasoning engine to the Go executor.

Chapter 5 takes us into Definite Clause Grammars. A DCG is a notation for writing parsers in Prolog that is so natural it almost disappears into the text being parsed. We will write a parser for real Linux log entries from /var/log/syslog, turning raw text like Jun 15 14:32:01 mint-logic-lab sshd[1234]: Accepted publickey for logicdev from 192.168.10.5 into a structured Prolog fact that can be queried, filtered, and reasoned over. The combination of DCG-parsed log data with the Dict-based infrastructure knowledge base of this chapter gives us a system that genuinely knows what is happening on the running machine — a foundation for the automated monitoring and recovery logic that arrives in Part III.


Appendix 4A: Dict Predicate Reference

The following are the core Dict API predicates in SWI-Prolog 10.x. All are built in and need no import, except dict_keys/2, which is provided by library(dicts).

Predicate                                  Description
─────────────────────────────────────────  ─────────────────────────────────────────
get_dict(+Key, +Dict, -Value)              Access a field; fails if key absent
put_dict(+Key, +Dict, +Value, -NewDict)    Return new Dict with single field set
put_dict(+Pairs, +Dict, -NewDict)          Return new Dict with multiple fields set
del_dict(+Key, +Dict, ?Value, -NewDict)    Return new Dict with field removed
dict_keys(+Dict, -Keys)                    Get sorted list of all keys
dict_pairs(+Dict, ?Tag, ?Pairs)            Convert between Dict and Key-Value pair list
is_dict(+Term)                             True if Term is a Dict
is_dict(+Term, ?Tag)                       True if Term is a Dict with given tag
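
Several of these predicates compose naturally. The following is a minimal sketch, assuming a hypothetical rename_key/4 that is not part of the chapter's code:

```prolog
% rename_key(+Old, +New, +Dict0, -Dict)
% Hypothetical helper: moves the value stored under Old to New.
% Fails if Old is not a key of Dict0; Dict0 itself is not modified.
rename_key(Old, New, Dict0, Dict) :-
    del_dict(Old, Dict0, Value, Dict1),
    put_dict(New, Dict1, Value, Dict).
```

After rename_key(ip, address, vm{name: debian_core, ip: '192.168.10.21'}, D), the goal get_dict(address, D, A) binds A to '192.168.10.21' and get_dict(ip, D, _) fails.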

Appendix 4B: String Predicate Reference

Predicate                                               Description
──────────────────────────────────────────────────────  ──────────────────────────────────────
split_string(+Str, +SepChars, +PadChars, -SubStrings)   Split string on separator characters
string_concat(+A, +B, -C)                               Concatenate two strings
string_length(+Str, -Len)                               Length in characters
string_lower(+Str, -Lower)                              Convert to lowercase
string_upper(+Str, -Upper)                              Convert to uppercase
string_codes(+Str, -Codes)                              Convert to list of Unicode code points
atom_string(?Atom, ?Str)                                Convert between atom and string
number_string(?Number, ?Str)                            Convert between number and string
format(string(S), Fmt, Args)                            Build a string using format/2 notation
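
As a quick illustration of how these predicates combine, here is a minimal sketch that parses a "name = value" line. The line format and the predicate parse_reading/3 are assumptions made for this example only:

```prolog
% parse_reading(+Line, -Name, -Value)
% Hypothetical helper: splits on "=", strips surrounding spaces,
% lowercases the name and converts the value to a number.
parse_reading(Line, Name, Value) :-
    split_string(Line, "=", " ", [NameStr, ValueStr]),
    string_lower(NameStr, Lower),
    atom_string(Name, Lower),
    number_string(Value, ValueStr).
```

?- parse_reading("CPU_Temp = 71.5", Name, Value).
Name = cpu_temp,
Value = 71.5.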

Appendix 4C: Snapshot Checkpoint

Snapshot name: 05-chapter-4-complete
Description:   Dict-based infrastructure KB, sensor monitor,
               backup DSL, and health monitor complete.
               Files: infrastructure.pl, health_monitor.pl