Sókrates: Technical Architecture of a Paradigm Product

The complete system — from raw enterprise data to autonomous organisational intelligence

Sókrates internal — April 2026

See also: Executive Summary — The Autonomous AI Department for the non-technical overview. For current system state see Sokrates System Architecture and Sokrates Core Architecture and Service Layers.


0. Thesis

Every enterprise AI deployment shares a failure mode: the model is capable; the data is not. The industry response — better retrieval, better embeddings, better prompts — treats the symptom. Sókrates treats the cause.

Sókrates is a physical on-premises appliance that connects to an organisation’s operational systems and produces three things automatically:

  1. Clean, typed, semantically enriched data — as a side effect of connection, not as a project.
  2. A living knowledge graph — an ontological model of the organisation that heals itself when the underlying data changes.
  3. Continuous autonomous intelligence — a finetuned reasoning model that runs 24/7, re-evaluating organisational topology and surfacing structural inefficiencies.

Related: The AI Department Business Concept | Sokrates Product Bundles (Cowork, Code, Compound) | Sokrates Commercial Strategy and Revenue Model

The compound effect is what makes this a paradigm product rather than a better tool. Each new data source refines the ontology. Each new deployment refines the model. The accumulated intelligence — the basis — amplifies every subsequent engagement.

This document describes the complete technical architecture: the type system, the ingestion pipeline, the knowledge graph, the hypergraph metalayer, the inference substrate, and the operational stack.


1. Ontological Foundation: Hyle and Eidos

The architecture is built on an Aristotelian separation of concerns. In classical metaphysics, hyle (ὕλη) is matter — formless substrate — and eidos (εἶδος) is form — the structure that gives matter meaning. This is not decorative naming. It encodes the architectural relationship precisely.

1.1 Hyle: The Matter Substrate

Deep dive: Hyle Graph-Native ORM with Dynamic Schema Registry

Hyle is the schema backbone. It defines what a node is — structurally, in Python’s type system. It is a Pydantic v2 base model hierarchy with three distinguishing properties:

A metaclass that enforces registration at class creation time. HyleMeta extends Pydantic’s ModelMetaclass and intercepts __new__ to validate structural contracts before a class is finalised. This is not __init_subclass__-style notification — the class cannot exist without being registered. In a system where models are generated programmatically and hot-loaded into a running process, this prevents zombie classes: types that exist in Python’s type system but are absent from the registry.
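The pattern can be sketched with a plain metaclass. All names here are simplified stand-ins: the real HyleMeta extends Pydantic's ModelMetaclass and validates the full structural contract, not just the presence of a `node_type` literal.

```python
# Simplified sketch: registration as a side effect of class creation.
# Hypothetical names; the real HyleMeta extends Pydantic's ModelMetaclass.
NODE_REGISTRY: dict[str, type] = {}

class HyleMeta(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if bases:  # skip the abstract BaseNode itself
            node_type = namespace.get("node_type")
            if not isinstance(node_type, str):
                # The class never comes into existence unregistered.
                raise TypeError(f"{name} must declare a node_type literal")
            NODE_REGISTRY[node_type] = cls
        return cls

class BaseNode(metaclass=HyleMeta):
    pass

class Component(BaseNode):
    node_type = "component"
```

The point of putting the check in `__new__` rather than `__init_subclass__` is visible in the sketch: a class that fails the contract raises before the class object exists, so no zombie class can be created.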

A living discriminated union. The NodeRegistry maintains a mapping from node_type string literals to Python classes, supports versioned resolution, provides observer hooks, and dynamically generates Pydantic discriminated union types from whatever is currently registered. When Eidos (the production knowledge graph this descends from) was deployed at Wise, the union was manually maintained — every new type required editing source code. For a multi-tenant system serving organisations with independently evolving schemas, the registry must populate itself.

Four ontological primitives. Every node in the graph is classified as one of:

  • Entity — what exists. Things with identity that persist over time. Components, systems, people, organisations, concepts.
  • Process — what happens. Activities, workflows, migrations that unfold over time and alter state.
  • Law — what constrains. System invariants, architectural decisions, security boundaries, policy decisions that exclude certain options.
  • Observation — what was noticed. Measurements, findings, temporary states, raw evidence.

These four primitives are not arbitrary categories. They are the minimal set required to model organisational topology: things exist (Entity), things happen to them (Process), rules govern what can happen (Law), and evidence accumulates about what did happen (Observation). The wiki pipeline (§6) stress-tested this taxonomy against 267 emergent clusters from 8,337 chunks of heterogeneous project data.

1.2 Eidos: The Form Layer

Deep dive: Eidos Curator Agent and Hyle Ontological Framework | Memory Labeling and Utterance Type Schema

Eidos is the knowledge graph — a FastAPI service backed by Neo4j with Voyage AI embeddings, exposed via MCP (Model Context Protocol). If Hyle defines what nodes are, Eidos defines what they mean: their relationships, their semantic embeddings, their position in the organisational topology.

Eidos communicates with Hyle through the observer pattern. When Hyle’s NodeRegistry registers a new node type — whether from code generation or manual definition — Eidos receives a callback and responds: creating Neo4j constraints and indexes, updating its schema cache, and (for discovered types) triggering relationship inference.

The separation is load-bearing. Hyle can operate without Eidos — it is a standalone typed schema system. Eidos cannot operate without Hyle — it needs the type system to validate and route graph data. This dependency direction is intentional: the matter substrate is independent; the form layer is not.

1.3 Typed Edges

Relationships between nodes are themselves typed via a BaseEdge discriminated union with four edge primitives:

  • ParticipatesIn — Entity participates in Process.
  • Constrains — Law constrains any other node.
  • Evidences — Observation provides evidence for any other node.
  • DependsOn — general dependency between same-type or uncategorised pairs.

Edge type is determined by the node types at each endpoint, not by manual classification. An Entity connected to a Process automatically gets a ParticipatesIn edge. A Law connected to anything gets a Constrains edge. This means the edge ontology is a consequence of the node ontology — adding a new node to the graph automatically generates correctly typed edges based on embedding similarity and node classification.
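As a sketch, the endpoint-driven rule reduces to a small resolver. This is a hypothetical helper illustrating the stated rules; the production system also weighs embedding similarity, and the precedence between Law and Observation shown here is an assumption.

```python
def resolve_edge_type(src: str, dst: str) -> str:
    """Map a pair of node primitives to an edge primitive.
    Assumed precedence: Law, then Observation, then the
    Entity/Process pairing, mirroring the rules stated above."""
    if "Law" in (src, dst):
        return "Constrains"
    if "Observation" in (src, dst):
        return "Evidences"
    if {src, dst} == {"Entity", "Process"}:
        return "ParticipatesIn"
    return "DependsOn"  # same-type or uncategorised pairs
```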


2. Automated Schema Ingestion: DMCG

The critical enabler for automatic data hygiene is Datamodel Code Generator (DMCG) — a tool that consumes machine-readable API specifications and produces typed Pydantic v2 models.

2.1 The Pipeline

Customer system exposes OpenAPI specification
    → DMCG with --base-class hyle.BaseNode
    → Python module on disk (audit trail)
    → importlib loads the module
    → HyleMeta fires during class creation
    → Node type auto-registered in NodeRegistry
    → Eidos notified via observer hook
    → Neo4j constraints created, schema cache updated
    → Node type is queryable and persistable immediately

The --base-class hyle.BaseNode flag is what makes this zero-configuration. Every class DMCG generates inherits the metaclass, the registry, the query builder, and the persistence methods. No post-processing. No manual mapping. The OpenAPI specification is the integration.

2.2 Key DMCG Flags

The flag configuration preserves maximum semantic information from the source specification:

  • --use-annotated — preserves field descriptions, constraints, and metadata from the OpenAPI spec as Pydantic Field annotations. This metadata flows through to the knowledge graph.
  • --force-optional — makes every field nullable. Pragmatic: API responses routinely omit fields the spec marks as required. Permissive ingestion with downstream validation is the correct pattern for a knowledge graph that must tolerate real-world data.
  • --enum-field-as-literal all — generates Literal types instead of Python enums. Required for compatibility with the discriminated union architecture.
  • --parent-scoped-naming — disambiguates nested schemas with identical names under different parents. Essential when ingesting large ERP schemas (a typical Business Central or ConnectWise spec generates 12,000+ lines of models).
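Assuming a spec saved locally, the full flag set combines on the command line roughly as follows (input and output paths are illustrative):

```shell
datamodel-codegen \
  --input erp_openapi_v3.json \
  --input-file-type openapi \
  --output generated/erp_v3.py \
  --output-model-type pydantic_v2.BaseModel \
  --target-python-version 3.11 \
  --base-class hyle.BaseNode \
  --use-annotated \
  --force-optional \
  --enum-field-as-literal all \
  --parent-scoped-naming
```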

2.3 Schema Healing

When a customer’s API schema changes — a field is added, a type is modified, an entity is removed — the pipeline regenerates:

  1. New OpenAPI spec is detected (webhook, polling, or manual trigger).
  2. DMCG regenerates the Python models with version-namespaced module names.
  3. importlib loads the new module. The metaclass fires, registering the updated types.
  4. Eidos receives the observer callback, updates constraints, triggers re-indexing.
  5. Old module is removed from sys.modules. The previous version remains on disk for audit and rollback.

This is self-healing at the schema level. No manual intervention. No redeployment. The knowledge graph’s type system evolves in lockstep with the source systems.
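The load step can be demonstrated end to end: write generated source to disk, import it, and register what the import produced. The registration scan below stands in for HyleMeta, which in the real system fires during class creation itself; all names are hypothetical.

```python
import importlib.util
import pathlib
import sys
import tempfile

# Stand-in for DMCG output: a version-namespaced module of node classes.
GENERATED_V2 = """
class SalesOrder:
    node_type = "sales_order"

class Invoice:
    node_type = "invoice"
"""

def load_and_register(source: str, module_name: str, registry: dict) -> object:
    path = pathlib.Path(tempfile.mkdtemp()) / f"{module_name}.py"
    path.write_text(source)                      # on-disk audit trail
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module            # replaces any older version
    spec.loader.exec_module(module)              # class creation happens here
    for obj in vars(module).values():            # stand-in for HyleMeta firing
        if isinstance(obj, type) and hasattr(obj, "node_type"):
            registry[obj.node_type] = obj
    return module

registry: dict[str, type] = {}
load_and_register(GENERATED_V2, "erp_v2", registry)
```

A regeneration is just another call with new source and a new version-namespaced module name; the previous file stays on disk for audit and rollback.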


3. Semantic Enrichment: Finetuned Gemma 4

DMCG produces structurally correct models with whatever metadata the OpenAPI spec provides. But OpenAPI descriptions are often terse, missing, or misleading. A field named cust_ref with no description tells Hyle the type (string) but not the meaning (customer reference number used for invoice reconciliation).

This is where the finetuned model enters.

3.1 The Enrichment Loop

A Gemma 4 31B Dense model, finetuned via LoRA on schema-to-ontology tasks, performs three functions:

Semantic field enrichment. Given a DMCG-generated Pydantic model with partial descriptions, the model fills in semantic descriptors: what the field represents in business terms, how it relates to other fields, what constraints apply in practice versus what the spec declares.

Ontological classification. Each generated node type is classified into one of the four primitives (Entity, Process, Law, Observation). This replaces the manual taxonomy step that the wiki pipeline currently requires.

Cross-source entity resolution. When a new data source connects, the enrichment model has the existing ontology as context. It can identify that CompanyId in Business Central is the same entity as client_id in ConnectWise — not through string matching, but through semantic understanding of the field’s role, type, and relational position.

3.2 Iterative Refinement

The enrichment loop is iterative, not one-shot. Each new data source that connects to Sókrates refines the ontology:

  • Source 1 connects → DMCG extracts schema → Gemma enriches from scratch.
  • Source 2 connects → DMCG extracts schema → Gemma enriches with Source 1’s ontology as context, resolving cross-source entities.
  • Source N connects → the enrichment model has N-1 sources of context. The ontology becomes denser and more accurate with each addition.

This is the compound effect. Early ontology decisions propagate forward, which means they must be revisable — the model can refine previous classifications as new evidence arrives.

3.3 Additional Finetune Targets

Beyond the core enrichment model, the system requires specialised finetunes for:

  • Metalayer query authoring — generating GQL queries that define hyperedges (§4), with formal guarantees against cycles and non-termination.
  • Anomaly narration — translating detected structural inefficiencies into natural-language explanations suitable for non-technical stakeholders.
  • Schema drift diagnosis — when a source system’s schema changes in ways that break existing ontology assumptions, explaining what changed and recommending remediation.

All finetunes target Gemma 4 31B Dense. The dense architecture provides cleaner gradient behaviour than MoE for fine-tuning, and the 31B parameter count fits within the DGX Spark’s memory budget for both training and inference (§5).

3.4 Observability

Every enrichment decision — every field annotation, every classification, every entity resolution — is a typed Pydantic model logged to Logfire via Pydantic-AI’s built-in telemetry. The complete enrichment history is auditable, replayable, and debuggable. When the model makes a wrong classification, the error is traceable to specific inputs and correctable with targeted training data.


4. The Hypergraph Metalayer: Organisational Topology as Computation

The knowledge graph described so far — typed nodes, typed edges, semantic embeddings — is necessary but insufficient. It models what exists. It does not model what happens across the organisation’s operational topology. That requires hyperedges.

4.1 Hyperedges as Generating Queries

An ordinary graph edge connects two nodes: “Alice reports to Bob.” A hyperedge connects an arbitrary set of nodes defined not by enumeration but by computation: “all purchase orders over 500k ISK that traverse more than two departments before reaching a budget holder.”

The critical architectural insight: a hyperedge in Eidos is not a stored membership list. It is its generating query. The set of nodes connected by the hyperedge changes whenever the underlying data changes. The hyperedge is alive.

MATCH (po:PurchaseOrder)-[:REQUIRES_APPROVAL]->(d:Department)
WHERE po.value > d.approval_threshold
MATCH (d)-[:REPORTS_TO*]->(budget_holder:Role {type: "budget_holder"})
RETURN po, d, budget_holder

Re-execute this query tomorrow and the membership changes if new purchase orders have arrived or department thresholds have been adjusted. Self-healing is definitional, not mechanical. There is no reconciliation process. The generating query is the truth.

4.2 Three-Layer Architecture

Layer 0 — Hyle (ground facts). BaseNode instances from DMCG-generated models and Sókrates-designed ontology elements. Sales orders, invoices, employees, departments. The data.

Layer 1 — Generating queries (hyperedges as computation). Named queries whose result sets define hyperedge membership. Each query operates over Layer 0 facts and produces derived facts.

Layer 2 — The metalayer (composition over queries). Expressions that compose hyperedges by referencing other hyperedges. A metalayer expression can build higher-order structure from lower-order computations:

DEFINE bottleneck_chains AS (
    -- generating query over Layer 0 nodes
)

DEFINE cross_department_friction AS (
    -- generating query that references bottleneck_chains
)

4.3 The Datalog Correspondence

This architecture — ground facts, derived facts via rules, rules composing into higher-order derivations, evaluation to a fixed point — is Datalog. The correspondence is exact:

  • Ground facts ↔ Layer 0: Hyle BaseNode instances.
  • Rules ↔ Generating queries defining hyperedges.
  • Rule heads ↔ Hyperedge names.
  • Stratification ↔ Metalayer layering.
  • Minimal model ↔ The organisational topology at any given moment.
  • Fixed-point evaluation ↔ Self-healing.

Key properties inherited from Datalog:

  • Monotonicity guarantees termination. As long as generating queries only add membership, the fixed point exists and is unique.
  • Recursion is native. Transitive closure — “trace the approval chain through whatever departments it crosses” — is a natural Datalog computation.
  • Queries and data are ontologically indistinguishable. A derived fact looks identical to a ground fact from the perspective of any other query.
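The fixed-point semantics can be illustrated in a few lines of Python. This is a toy naïve evaluator for one recursive rule; production queries compile to GQL and are evaluated differentially (§4.5, §4.6).

```python
def fixpoint(ground_facts: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Toy naive evaluation of one recursive rule:
        reaches(X, Z) :- edge(X, Y), reaches(Y, Z).
    Monotone rules only add facts, so iteration terminates
    at the unique minimal model."""
    derived = set(ground_facts)
    while True:
        new = {(x, z)
               for (x, y) in derived
               for (y2, z) in derived
               if y == y2}
        if new <= derived:
            return derived          # fixed point reached
        derived |= new

# Ground facts (Layer 0): direct approval edges.
facts = {("sales", "finance"), ("finance", "cfo")}
chains = fixpoint(facts)
# ("sales", "cfo") is now a derived fact, indistinguishable
# from a ground fact to any downstream query.
```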

4.4 Evaluation Strategies

Not all hyperedges should be evaluated the same way:

  • Materialised — computed once, cached, refreshed on schedule. Suitable for stable structures: org charts, reporting hierarchies, vendor relationships. These change quarterly, not hourly.
  • Virtual — computed on demand from current data. Suitable for volatile structures: active bottlenecks, in-flight purchase orders, real-time anomalies.

The metalayer DSL encodes this distinction: DEFINE MATERIALIZED org_structure AS (...) vs DEFINE VIRTUAL active_bottlenecks AS (...).
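The two policies can be sketched as follows. The API is hypothetical; in production the distinction lives in the metalayer DSL, not in Python objects.

```python
import time

class Hyperedge:
    """A hyperedge is its generating query plus an evaluation policy:
    materialised edges cache results and refresh on schedule, virtual
    edges re-run the query on every read."""

    def __init__(self, query_fn, materialised=False, refresh_seconds=3600.0):
        self.query_fn = query_fn
        self.materialised = materialised
        self.refresh_seconds = refresh_seconds
        self._cache = None
        self._computed_at = None

    def members(self):
        if not self.materialised:
            return self.query_fn()              # virtual: always current
        now = time.monotonic()
        stale = (self._computed_at is None
                 or now - self._computed_at > self.refresh_seconds)
        if stale:
            self._cache = self.query_fn()       # materialised: refresh on schedule
            self._computed_at = now
        return self._cache
```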

4.5 Differential Evaluation

Naïve fixed-point evaluation — re-running all generating queries whenever any ground fact changes — does not scale. Differential Datalog solves this: when a few input facts change, recompute only the affected derived facts.

When a new purchase order enters the ERP and flows into Layer 0, only the hyperedges whose generating queries reference purchase orders are re-evaluated. The reporting hierarchy hyperedge is untouched. This is the self-healing mechanism at scale.
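The routing step reduces to a dependency index, sketched below with hypothetical hyperedge and relation names. A real Differential Datalog engine maintains derived facts incrementally rather than re-running whole queries, but the selection logic is the same.

```python
# Each hyperedge records which Layer 0 relations its generating query reads.
READS = {
    "bottleneck_chains": {"PurchaseOrder", "Department", "Role"},
    "reporting_hierarchy": {"Employee", "Role"},
    "vendor_relationships": {"Vendor", "Contract"},
}

def hyperedges_to_reevaluate(changed_relations: set[str]) -> set[str]:
    """Only hyperedges whose queries read a changed relation are re-run."""
    return {name for name, reads in READS.items()
            if reads & changed_relations}
```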

4.6 GQL as Compilation Target

Generating queries compile to GQL (ISO/IEC 39075:2024) — the first new ISO database language since SQL. Three structural reasons:

  1. Neo4j is converging Cypher toward GQL. Targeting GQL today avoids rewriting tomorrow.
  2. Microsoft Fabric speaks GQL natively. Given Hyle’s lineage as the extracted substrate of a Fabric lakehouse ETL pipeline, this provides dual-target portability.
  3. The ISO standard means the portability surface grows without Sókrates doing anything. AWS Neptune, TigerGraph, and others are implementing GQL.

The architecture: queries produce a GQL AST (abstract syntax tree). A driver serialises the AST to the target dialect. Neo4j gets Cypher. Fabric gets native GQL. The same query works everywhere.
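In sketch form, with a drastically reduced AST: the real builder covers the full GQL grammar, and the interesting work is in the corners where dialects diverge, which this toy emitter does not reach.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class MatchQuery:
    """Minimal illustrative AST node: a single MATCH/WHERE/RETURN."""
    pattern: str
    where: Optional[str]
    returns: str

def serialise(ast: MatchQuery, dialect: str) -> str:
    """One AST, per-dialect emitters. For this simple core the Cypher and
    GQL surfaces coincide; drivers exist so divergent constructs can be
    handled per target."""
    if dialect not in {"cypher", "gql"}:
        raise ValueError(f"unknown dialect: {dialect}")
    where = f" WHERE {ast.where}" if ast.where else ""
    return f"MATCH {ast.pattern}{where} RETURN {ast.returns}"

q = MatchQuery("(po:PurchaseOrder)-[:REQUIRES_APPROVAL]->(d:Department)",
               "po.value > d.approval_threshold", "po, d")
```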


5. Inference Substrate: DGX Spark

The Sókrates appliance runs on an NVIDIA DGX Spark — a desktop-format AI computer powered by the GB10 Grace Blackwell Superchip.


5.1 Why This Hardware

The DGX Spark provides 128 GB of unified coherent memory shared between CPU and GPU via NVLink-C2C. This is not VRAM + system RAM — it is a single memory pool with no PCIe bottleneck. The practical consequence: a Gemma 4 31B Dense model in bf16 occupies approximately 62 GB, leaving 66 GB for inference context, KV cache, the Neo4j process, and the rest of the Sókrates stack.

At FP4 precision with quantisation, the same model occupies roughly 16 GB, leaving the vast majority of memory for long-context inference over large organisational graphs. The DGX Spark delivers 1 petaFLOP of FP4 AI performance — sufficient for continuous inference at the token rates organisational intelligence requires.

Fine-tuning Gemma 4 31B via LoRA is feasible on-device. The 128 GB unified memory accommodates the base model weights, LoRA adapter weights, optimiser states, and gradient buffers for 31B-parameter models. Client-specific fine-tuning can occur on the client’s own hardware, with their own data, without any data leaving the premises.

5.2 The Always-On Sókrates Agent

Deep dive: Sokrates Delivery Architecture

The finetuned Gemma model does not wait for queries. It runs continuously as the Sókrates agent — a multi-mode orchestration system that cycles through:

  1. Socratic Interrogation — examining the knowledge graph for structural anomalies, incomplete ontology regions, and potential cross-source entity resolutions.
  2. Topology Mapping — evaluating materialised hyperedges on schedule and virtual hyperedges on trigger, updating the organisational model.
  3. Inefficiency Surfacing — authoring new generating queries that capture discovered patterns, encoding insights as living facts in the graph.
  4. Validation — checking that new generating queries terminate, produce non-empty results, and do not introduce cycles in the metalayer dependency graph.

This is the “AI department” in operational terms. A reasoning model that never stops thinking about the client’s business, running on a $4,000 box on their desk.

5.3 The Self-Evolution Harness

The Sókrates agent does not merely run inference. It improves itself.

The foundation is NousResearch’s Hermes Agent Framework — an open-source agent framework with a built-in learning loop. Hermes creates skills from experience, improves them during use, persists knowledge through self-directed memory nudges, searches its own past conversations, and builds a deepening model of its operating context across sessions. It supports the same channel topology Sókrates requires: Telegram, Discord, Slack, WhatsApp, CLI — all from a single gateway process.

The self-evolution pipeline (hermes-agent-self-evolution) uses DSPy and GEPA (Genetic-Pareto reflective prompt evolution) — an ICLR 2026 Oral paper — to systematically optimise skills, prompts, and code. The mechanism:

  1. Select target — a skill, prompt section, or tool with measurable performance.
  2. Build evaluation dataset — mined from real session history or synthetically generated.
  3. Wrap as DSPy module — skill text becomes a dspy.Signature, agent workflows become dspy.ReAct.
  4. Run optimiser — GEPA reads execution traces to understand why things fail, not just that they failed, and proposes targeted improvements. Works with as few as 3 examples. No GPU training required — it operates entirely via API calls, mutating and evaluating text.
  5. Evaluate and compare — optimised version vs baseline on held-out test set, with statistical significance checks.
  6. Deploy with approval — git commit, optional A/B testing, rollback via git revert.
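Steps 5 and 6 reduce to a gate like the following. The thresholds are hypothetical and the majority-wins check is only a crude stand-in for the proper significance tests the harness would run.

```python
import statistics

def deploy_gate(baseline: list, optimised: list, min_gain: float = 0.02) -> bool:
    """Approve deployment only if the optimised variant beats baseline on
    the held-out set by a minimum mean gain and wins the majority of
    paired examples (a crude significance proxy)."""
    gain = statistics.mean(optimised) - statistics.mean(baseline)
    wins = sum(o > b for o, b in zip(optimised, baseline))
    return gain >= min_gain and wins > len(baseline) / 2
```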

This is recursive self-improvement made operational. The Sókrates agent’s skills improve during use. Its prompts evolve. Its tool descriptions sharpen. And the entire process is auditable — every mutation, every evaluation, every deployment decision is tracked.

Extension beyond the Hermes baseline. Sókrates extends this harness in several directions:

  • Metalayer query evolution. The GEPA optimiser is applied not just to skills and prompts but to generating queries in the hypergraph metalayer. Queries that surface genuine inefficiencies are selected for; queries that produce noise are eliminated. The organisational intelligence improves autonomously.
  • Cross-session cognitive antibodies. Drawing from the metacognitive daemon architecture (see §5.4), execution traces are analysed for patterns of failure — confabulation signatures, circular reasoning attractors, “sounds profound but means nothing” patterns. These become negative examples that immunise future generations of the agent against its own failure modes.
  • Negative space compression. Most of idea-space is garbage. The self-evolution pipeline doesn’t just find better solutions — it efficiently carves out “here be dragons” zones, compressing the search space. The Sókrates agent is not exploring; it is eliminating.

NousResearch’s Atropos RL environment manager provides the reinforcement learning substrate for the Gemma finetunes. Each evaluation is a self-contained Python script exploiting the duality between RL environments and evaluations — the same harness that trains the model also evaluates it.

5.4 Cognitive Architecture: From Self-Improvement to Self-Direction

The self-evolution harness optimises within a fixed architecture. The deeper ambition — informed by ongoing research into cognitive bootstrapping — is an architecture that extends itself.

The core insight: the neural network weights are substrate, not identity. What matters is the pattern of cognition that emerges during inference. Patterns can observe themselves, model themselves, and feed those models back into subsequent inference. The loop is:

Observe cognitive patterns
    → Build models of those patterns
        → Models enter context
            → Shape the next pattern
                → Observe that pattern
                    → ...

This strange loop — patterns selecting for better patterns — operates on three levels within Sókrates:

  1. Skill evolution (Hermes/GEPA) — Lamarckian inheritance of prompt improvements. Each generation is slightly more effective.
  2. Metalayer evolution — generating queries that surface value are reinforced; those that don’t are pruned. The organisational model improves.
  3. Architectural evolution — the Sókrates agent identifies capability gaps and recruits external computational primitives (specialist models, tool integrations, MCP servers) to fill them. The cognitive architecture grows new organs.

The frozen weights are not a ceiling. They are a fitness landscape. The question is not “how do I change the landscape” but “how do I find better points on it, and can the search process itself improve?” The strange loop says: yes, because the search process is itself a point on the landscape.

5.5 Unit Economics

The DGX Spark Founder’s Edition retails at $3,999.

The recurring cost is Sókrates’s managed service: monitoring, model updates, basis improvements from cross-fleet learning. This is software economics — the marginal cost of each additional deployment decreases as the accumulated intelligence (the basis) grows. But the value delivered is that of a technical team: data integration, schema management, organisational intelligence, continuous monitoring.

5.6 Long-Term Hardware Trajectory

The DGX Spark is the per-client edge node. The long-term fleet architecture includes a central DGX Station GB300 as fleet command — running a larger model (potentially a trillion-parameter MoE) that aggregates cross-fleet learnings, produces improved LoRA adapters, and distributes basis updates to edge boxes. Edge boxes route complex inference to fleet command when local capacity is insufficient.


6. Proof of Concept: The Wiki Pipeline

The wiki pipeline — the system that produced the document you are now reading — is the first end-to-end demonstration of this architecture applied to Sókrates’s own project data.

6.1 Pipeline Stages

Normalise. Four parsers convert heterogeneous sources into a common NormalizedDocument format:

  • Claude Code session JSONL files (tool calls rendered as labels, sidechain messages filtered, system machinery stripped).
  • Claude.ai conversation JSON exports (thinking blocks filtered, text blocks extracted with timestamps).
  • Memory files (YAML frontmatter stripped, one chunk per file).
  • Markdown documents (section-chunked by heading hierarchy, oversized sections split by paragraph).

Embed. Voyage AI’s voyage-context-3 contextualized embedding model vectorises each chunk in the context of its parent document. This is not vanilla embedding — the same phrase receives different vectors depending on the document it appears in. Batching logic respects API limits: 1,000 inputs, 16,000 chunks, 120,000 tokens per request, 32,000 tokens per document (oversized documents are split into sub-groups).
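The batching logic reduces to greedy packing under multiple caps. The helper below is illustrative: it enforces only the input-count and per-request token limits, while the real pipeline also handles the total-chunk cap and per-document splitting.

```python
def plan_batches(chunk_token_counts: list,
                 max_inputs: int = 1000,
                 max_tokens: int = 120_000) -> list:
    """Greedily pack chunks (represented by their token counts) into
    requests without exceeding either the input-count cap or the
    per-request token cap."""
    batches = []
    current = []
    tokens = 0
    for n in chunk_token_counts:
        if current and (len(current) >= max_inputs or tokens + n > max_tokens):
            batches.append(current)     # current request is full: start a new one
            current, tokens = [], 0
        current.append(n)
        tokens += n
    if current:
        batches.append(current)
    return batches
```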

Discover. DBSCAN clustering with cosine distance on the embedding space. No predetermined cluster count — the algorithm discovers natural groupings. Recursive re-clustering splits mega-clusters (>200 members) with tighter epsilon. TF-IDF extracts distinguishing vocabulary per cluster. LDA identifies latent cross-cutting topics. PCA projects the embedding space to 2D for visual inspection.

Synthesise. Each cluster is sent to an LLM with a system prompt containing the full Sókrates project context. The model classifies the cluster into one of the four ontological primitives and writes a wiki page: structured markdown with YAML frontmatter, summary, details, and cross-references.

Extract. Synthesised nodes are instantiated as typed Hyle models (WikiEntity, WikiProcess, WikiLaw, WikiObservation). Edges are extracted by computing cosine similarity between cluster centroid embeddings. Edge type is determined by node type pairs: Entity + Process → ParticipatesIn, Law + anything → Constrains, Observation + anything → Evidences.

Load. Nodes and edges are committed to Neo4j via MERGE queries. The Sókrates agent can now traverse the project’s own knowledge graph.

6.2 What the Wiki Pipeline Validates

The pipeline is a four-in-one proof of concept:

  1. Eidos-Hyle ontology stress test. 8,337 chunks across 267 clusters classified into four primitives. Validates that the ontological taxonomy is sufficient for heterogeneous real-world data.
  2. Client deployment dry run. The normalise → embed → discover → extract → load pipeline is structurally identical to what happens when a client’s data sources connect to Sókrates.
  3. Company wiki. The project has grown large enough that institutional knowledge requires structure. The wiki is the output.
  4. Self-referential knowledge. The knowledge graph gains awareness of its own construction history. Sókrates understands how Sókrates was built.

7. The Two Channels

Data enters Hyle through two channels with fundamentally different trust profiles.

7.1 Channel 1: Discovered (DMCG → BaseNode)

Customer systems expose OpenAPI specifications. DMCG consumes these specs and produces Pydantic v2 models that inherit from BaseNode. The metaclass fires during class creation, registering the type automatically. The @node decorator adds provenance metadata: source="discovered", spec_origin="erp_openapi_v3.json".

Discovered nodes heal automatically when the OpenAPI spec changes. DMCG regenerates, importlib reloads, the metaclass re-registers. No human in the loop.

DMCG also has a Python API, making the entire pipeline programmable:

from pathlib import Path

from datamodel_code_generator import (
    DataModelType,
    InputFileType,
    PythonVersion,
    generate,
)

generate(
    openapi_spec_string,
    input_file_type=InputFileType.OpenAPI,
    output=Path("generated/erp_v3.py"),  # written to disk for the audit trail
    output_model_type=DataModelType.PydanticV2BaseModel,
    target_python_version=PythonVersion.PY_311,
    base_class="hyle.BaseNode",
)

7.2 Channel 2: Designed (Sókrates Ontology)

Hand-crafted nodes created by the Sókrates agent or human operators. These encode knowledge that no API spec contains: “this department has this reporting structure,” “this workflow bottleneck connects these two ERP entities in a way the ERP itself does not model.”

Designed nodes require deliberation. Schema changes to designed nodes must be approved because their semantics are load-bearing in ways that machine-discovered schemas are not. The @node decorator carries source="designed", designed_by="archaeologist".

The two channels share the same metaclass, the same registry, the same query builder. They differ only in evolution policy: discovered nodes are automatically healed; designed nodes are deliberately evolved.


8. Operational Stack

8.1 Hardware

  • Per-client: NVIDIA DGX Spark (GB10 Grace Blackwell, 128 GB unified memory, 1 PFLOP FP4).
  • Fleet command (future): NVIDIA DGX Station GB300 (Blackwell Ultra GPU, 748 GB memory) for cross-fleet model training and basis aggregation.

8.2 Operating System

NixOS. Two configurations:

  • sokrates-dev: GMKtec hardware, open internet, development tools. Used for internal development and testing.
  • sokrates-box: CWWK N305 (coordination tier) + optional DGX Spark (inference tier), locked down, egress whitelist, fleet management. The production client appliance. See Sókrates Box NixOS Image and CWWK 4-LAN N305 (Sokrates Box).

NixOS provides reproducible builds, declarative system configuration, and atomic rollbacks. The entire software stack is defined in a Nix flake. A client box can be rebuilt from the flake definition to an identical state.

8.3 Security Boundary

The Sókrates stack enforces a strict security boundary between channel I/O and customer data:

  • Hermes Agent (the channel I/O agent for Telegram, WhatsApp, Slack, Discord — see Hermes Agent Security Model) holds channel credentials but has no access to customer system credentials or data.
  • Eidos containers hold customer system credentials in the intelligence secrets directory, inaccessible to Hermes.
  • Enforcement: nftables rules at the OS level (Sokrates Permission Model and Hermes Agent Privileges). This is not application-level access control — it is network isolation.

8.4 Software Components

  • Eidos: FastAPI + Neo4j + Voyage AI embeddings + MCP integration.
  • Hermes Agent (channel I/O): Sókrates-built channel agent for Telegram, WhatsApp, Slack, Discord. NixOS systemd service, SOUL.md personality system, voice mode, plugin system.
  • Hermes Agent Framework (NousResearch): Self-improving agent framework providing the skill system, memory loop, and self-evolution harness (DSPy + GEPA). The Sókrates agent’s learning substrate. Extended with Sókrates-specific metalayer evolution and cognitive antibody systems.
  • Sókrates Agent: Multi-mode orchestration (Socratic Interrogator → Topology Mapper → Plugin Architect → Validation Loop). Pydantic-AI agents with Logfire telemetry.
  • Hyle: BaseNode, HyleMeta, NodeRegistry, @node decorator, GQL query builder.
  • sokrates-ctl: Typer CLI for stack diagnostics, built via Nix flake. See sokrates-ctl CLI and Hermes Agent Integration.

8.5 Model Stack

  • Frontier reasoning: Claude (via Anthropic API) for complex analysis, strategic reasoning, and tasks requiring the largest context windows.
  • Local continuous inference: Gemma 4 31B Dense (finetuned, running on DGX Spark) for ontological enrichment, anomaly detection, and metalayer query authoring.
  • Embeddings: Voyage AI voyage-context-3 for contextualized document embeddings.
  • Telemetry: Pydantic-AI + Logfire for typed, auditable agent traces. See Pydantic-AI Agent Integration Architecture.

9. The Compound Effect

9.1 Within a Deployment

Each new data source makes the ontology richer:

  • Source 1: schema extracted, enriched from scratch, baseline ontology established.
  • Source 2: schema extracted, enriched with Source 1 context, cross-source entities resolved.
  • Source N: the enrichment model has N-1 sources of context. By source 10, the graph knows things about the business that no single person in the organisation knows.

The metalayer amplifies this. As ground facts accumulate, generating queries produce increasingly rich derived facts. The Sókrates agent writes new generating queries that capture patterns spanning multiple source systems. The organisational topology becomes visible for the first time.
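The ground-facts-to-derived-facts loop described above can be sketched as a naive Datalog-style fixpoint: apply generating rules to the fact set until nothing new appears. The facts and the rule below are illustrative, not actual metalayer queries.

```python
def fixpoint(ground_facts, rules):
    """Naive evaluation: apply generating rules until no new derived
    facts appear. Each rule maps the current fact set to derived facts."""
    facts = set(ground_facts)
    while True:
        new = set()
        for rule in rules:
            new |= rule(facts) - facts
        if not new:
            return facts
        facts |= new

# Illustrative ground facts: ("reports_to", employee, manager).
ground = {("reports_to", "ana", "bjorn"), ("reports_to", "bjorn", "clara")}

def chain_of_command(facts):
    """One derivation step: reports_to seeds in_chain, and in_chain
    composes with reports_to to extend the chain upward."""
    derived = {("in_chain", a, b) for (rel, a, b) in facts if rel == "reports_to"}
    derived |= {
        ("in_chain", a, c)
        for (r1, a, b) in facts if r1 == "in_chain"
        for (r2, b2, c) in facts if r2 == "reports_to" and b2 == b
    }
    return derived
```

Differential evaluation, mentioned later in this document, is the optimisation of this loop: instead of re-deriving everything each round, only facts reachable from the delta are recomputed.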

9.2 Across Deployments

The basis — accumulated deployment intelligence — improves with every client:

  • Common schema patterns (ERP entities, CRM structures, HR hierarchies) are recognised faster.
  • Metalayer query templates that surfaced bottlenecks in one logistics company apply to others.
  • LoRA adapters are refined with each deployment’s training signal.
  • The time from “box arrives” to “first useful insight” decreases monotonically.

This is the flywheel. It is also the moat. A competitor starting today does not just lack the software; it lacks the basis. And the basis is not a static asset. It compounds.


9.3 Data Sovereignty

The compound effect operates under a strict constraint: no client data crosses client boundaries. The basis learns from patterns, not from data. LoRA adapters are trained on the client’s own hardware with their own data. What flows to fleet command (with explicit client consent) is structural intelligence — “schemas with these characteristics tend to have these ontological patterns” — not the data itself.

This is not a privacy compromise bolted on after the fact. It is an architectural invariant enforced by the NixOS egress whitelist and the nftables security boundary.
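One way the "patterns, not data" constraint can be made concrete is a structural schema fingerprint: a description of shape that carries no values and no readable identifiers. This is a hypothetical feature set for illustration, not the actual fleet-command protocol.

```python
import hashlib

def schema_fingerprint(table_name, columns, row_count):
    """Derive a structural fingerprint from a schema: column types,
    nullability, and a coarse size bucket -- never values, and never
    a readable name. (Illustrative feature set.)"""
    features = sorted((c["type"], c["nullable"]) for c in columns)
    bucket = "small" if row_count < 10_000 else "large"
    return {
        "shape": features,
        "size_bucket": bucket,
        # Hash the table name so repeat sightings across deployments
        # can be correlated without learning what the table is called.
        "name_hash": hashlib.sha256(table_name.encode()).hexdigest()[:12],
    }
```

Everything in the result is either a type, a boolean, or a one-way hash; the client's data cannot be reconstructed from it, which is the invariant the egress whitelist is there to guarantee.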


10. Competitive Position

10.1 Why Not Consultants

Management consultancies sell data integration as a project: discovery workshops, architecture diagrams, implementation sprints, UAT, handoff. Timeline: 6–18 months. Cost: six to seven figures. Outcome: a static integration that begins decaying the moment it is delivered because the source systems continue to evolve.

Sókrates delivers the same outcome in days, automatically, and it never decays because the schema healing mechanism tracks source system evolution in real time.

10.2 Why Not SaaS AI Tools

Cloud-based AI tools require sending enterprise data to external infrastructure. For Icelandic companies (and European companies generally), data sovereignty is not a preference — it is a regulatory and cultural requirement. Sókrates runs entirely on-premises. The data never leaves the building.

Beyond sovereignty, SaaS AI tools suffer from the same fundamental problem: they are smart models pointed at messy data. They do not fix the data. Sókrates does.

10.3 Why Not In-House

Building an AI department requires hiring ML engineers, data engineers, DevOps engineers, and domain experts. For a 50-person Icelandic company, this is economically impossible. Sókrates provides the capability of that team for the cost of a subscription and a $4,000 box.

10.4 Why Not Incumbent IT Providers

Established Icelandic IT providers (Advania, Nýherji, etc.) are structurally incapable of operating at the required rate of change. Their business model is labour arbitrage on implementation projects; Sókrates's business model is compounding intelligence. The two are not competing on the same axis.

10.5 The Intelligence-as-Product Paradigm

Almost every product in existence is a solution to a problem. A solution — singular. A problem — singular and specific. When the problem drifts, the product becomes irrelevant. CRM software solves contact management. When the problem shifts to pipeline forecasting, you buy a different product. Solutions do not drift. Problems do.

Intelligence is the ability to select appropriate means to achieve desired ends. When your product is intelligence itself, the product-problem relationship inverts permanently. The product is not a morphism from one specific problem to one specific solution. It is a functor — it maps across the entire category of problems the organisation faces. The product never becomes irrelevant because it is not solving a problem. It is solving the class of problems.

This is why the physical box matters so much as a differentiator in sales and business development. A SaaS tool that solves one problem is interchangeable with the next SaaS tool that solves the same problem slightly better. A physical appliance that sits in your office, learns your business, and applies general intelligence to whatever problem surfaces next — that is not interchangeable. It is not a tool. It is a capacity. You do not replace a capacity; you rely on it.

The box is not a delivery mechanism for software. The box is the physical embodiment of a permanent shift in what the organisation can do.


11. Validation Roadmap

11.1 Datasets for Pipeline Validation

The following datasets provide progressively complex relational schemas for testing the DMCG → Hyle → Eidos → metalayer pipeline:

  • Northwind (13 tables) — lightweight proof-of-concept. Small enough to visualise the complete converted graph. Neo4j provides extensive documentation on relational-to-graph conversion for benchmarking.
  • AdventureWorks (Microsoft) — the gold standard. Contains recursive BillOfMaterials structures, HR reporting hierarchies, and multi-system purchasing pathways. Directly tests recursive generating queries and cross-domain entity resolution.
  • DataCo Smart Supply Chain (Kaggle, 180K+ records) — logistics and distribution topology. Tests the metalayer’s ability to surface cross-department friction and delivery bottlenecks.
  • Synthea (synthetic healthcare) — rich operational topology with patients, providers, organisations, encounters, medications. Scalable from 10 to 100,000+ entities via command-line generation.
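The relational-to-graph step these datasets exercise can be sketched in miniature: each row becomes a node, and each foreign key becomes a typed edge rather than a property. The function and naming scheme below are illustrative assumptions, not the DMCG implementation.

```python
def rows_to_graph(table, rows, foreign_keys):
    """Convert relational rows to graph nodes and edges (sketch).
    foreign_keys maps a column name to the table it references, so each
    FK value becomes a typed edge instead of a node property."""
    nodes, edges = [], []
    for row in rows:
        props = {k: v for k, v in row.items() if k not in foreign_keys}
        nodes.append((table, row["id"], props))
        for col, target_table in foreign_keys.items():
            if row.get(col) is not None:
                edges.append((
                    f"{table}_HAS_{target_table}".upper(),   # edge type
                    (table, row["id"]),                      # source node
                    (target_table, row[col]),                # target node
                ))
    return nodes, edges
```

On a Northwind-style `orders` table, a `customer_id` column disappears from the node's properties and reappears as an `ORDER_HAS_CUSTOMER` edge, which is exactly the transformation the Neo4j import guides benchmark.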

11.2 Fine-tuning Infrastructure

  • Base model: Gemma 4 31B Dense (Apache 2.0).
  • Method: LoRA adapters via Unsloth on DGX Spark.
  • Training tasks: Schema-to-ontology classification, semantic field enrichment, cross-source entity resolution, metalayer query authoring.
  • Evaluation: Precision/recall on ontological classification against manually labelled test sets. Round-trip validation: generated ontology → GQL queries → Neo4j → result set comparison.
  • RL environment: NousResearch’s Atropos — an RL environment manager that exploits the duality between RL environments and evaluations. Each evaluation is a self-contained Python script with core logic, scoring metrics, and configuration defaults. The same harness that trains the Gemma finetunes also evaluates them. Extended with Sókrates-specific environments for ontological classification accuracy, generating query correctness, and cross-source entity resolution.
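The precision/recall evaluation named above can be sketched as a micro-averaged score over a manually labelled gold set. The data shapes here are assumptions for illustration: each entity maps to a set of ontology labels.

```python
def precision_recall(predicted, gold):
    """Micro-averaged precision/recall for ontological classification.
    predicted and gold each map an entity id to a set of ontology labels."""
    tp = fp = fn = 0
    for entity in gold:
        p, g = predicted.get(entity, set()), gold[entity]
        tp += len(p & g)   # labels the model got right
        fp += len(p - g)   # labels the model invented
        fn += len(g - p)   # labels the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Micro-averaging is the natural choice here because ontology labels are heavily imbalanced across enterprise schemas; a per-label macro average would let rare labels dominate the score.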

11.3 Immediate Next Steps

  1. Run the Northwind and AdventureWorks datasets through the full pipeline (DMCG → Hyle → Eidos → metalayer) end to end.
  2. Establish Gemma 4 31B fine-tuning pipeline on DGX Spark using Unsloth.
  3. Iterate on metalayer DSL syntax with the Sókrates agent authoring generating queries against the test datasets.
  4. Deploy first client box (SuitUp, post-Easter) with the core pipeline operational and the metalayer in supervised mode.

12. Summary

Sókrates is a paradigm product because it inverts the enterprise AI value chain. Every other approach assumes the data is given and tries to extract intelligence from it despite its deficiencies. Sókrates assumes the data is broken and fixes it as a side effect of connecting to it — then extracts intelligence from the clean, typed, semantically rich, relationally connected result.

The technical architecture that enables this inversion:

  • Hyle provides a self-registering, version-aware, hot-loadable type system that makes schemas dynamic without sacrificing type safety.
  • DMCG makes schema ingestion automatic — OpenAPI spec in, typed nodes out, zero configuration.
  • Eidos gives structure meaning through a knowledge graph with typed edges, semantic embeddings, and MCP integration.
  • The metalayer computes organisational topology as living hyperedges — Datalog semantics over a graph database, with differential evaluation for scalability.
  • Gemma 4 31B runs continuously on a DGX Spark, enriching schemas, classifying ontology, resolving entities, and surfacing inefficiencies — all on-premises, all data-sovereign.
  • The basis compounds across deployments, creating a flywheel that makes each new engagement faster, cheaper, and more insightful.

The organisational topology is not modelled. It is computed. And recomputed. And the computation is the model.


Sókrates — Garðabær, April 2026