Infrastructure

System & Infrastructure

Hardware, databases, portability — the physical and logical foundation of your SovereignNode.

Infrastructure

SovereignNode

A single server. Local GPU. No cloud dependency. The SovereignNode is the heart of every AIMOS installation — a physical or virtual server that hosts all components.

Everything runs on-premise: LLM inference, databases, agent processes, and the communication channels. No byte leaves your network — unless you explicitly configure it (e.g., Telegram messages).

Hardware

| | Starter | Business | Professional | Enterprise |
|---|---|---|---|---|
| GPU | RTX 4060 Ti, 16 GB | RTX 3090 / 5090, 24–32 GB | 2× RTX 3090 NVLink, 48 GB | A100 / H100, 80+ GB |
| AI model | 14B (Q4) | 27B (Q4) | 70B (Q4) | 70B (Q4) + 9B draft |
| Speculative decoding | | Optional; on 5090: +4B draft | +4B draft, ~17K context | +9B draft, ~75K context |
| Speed | ~30 tok/s | ~35 tok/s; 5090 + spec.: ~90 tok/s | ~20 tok/s; with spec.: ~50 tok/s | ~40 tok/s; with spec.: ~100 tok/s |
| AI agents | 2–4 | 5–10; 5090 + spec.: 10–20 | 5–10 | 15–30 |
| Technology | TurboQuant | TurboQuant + SGLang | TurboQuant + NVLink + spec. decoding | TurboQuant + SGLang + spec. decoding |
| Approx. hardware cost | from 1,200 EUR (GPU ~400 EUR) | from 2,000 EUR (3090: ~700 EUR; 5090: ~3,500 EUR) | from 2,500 EUR (2× 3090 + NVLink) | on request (A100: from ~3,500 EUR used) |

Task Suitability
ERP Queries
Data Extraction
Appointment Management
Internal Support
Document Search
Customer Contact
Technical Consulting
Multilingual
Compliance
Rating scale: Excellent, Good, Possible with limitations, Not recommended

Based on IFEval, MT-Bench, BFCL and Qwen/Llama Benchmarks (2024). Ubuntu 24.04/26.04 LTS, 16+ CPU cores recommended.
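The pairing of model size and GPU memory in the table can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not an AIMOS tool: it assumes about 4.5 bits per weight for a Q4 quantization (the figure for a Q4_0-style block format) and a hypothetical flat 2 GB allowance for runtime overhead. The function names and the overhead value are illustrative assumptions; KV cache and draft models need additional memory on top.

```python
def q4_weight_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of quantized weights in GB (1 GB = 1e9 bytes).

    bits_per_weight=4.5 is an assumption for a Q4_0-style format.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed fixed overhead fit into the given VRAM."""
    return q4_weight_gb(params_billion) + overhead_gb <= vram_gb


# (model size in billions of parameters, VRAM in GB) per tier, taken from the table
TIERS = {
    "Starter": (14, 16),
    "Business": (27, 24),
    "Professional": (70, 48),
    "Enterprise": (70, 80),
}

for tier, (params, vram) in TIERS.items():
    weights = q4_weight_gb(params)
    print(f"{tier}: ~{weights:.1f} GB weights on {vram} GB VRAM, fits={fits(params, vram)}")
```

Under these assumptions a 27B model needs roughly 15 GB for weights alone, which is in line with the ~17 GB VRAM quoted for Qwen 3.5:27B once overhead is included.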

Architecture Overview

SovereignNode
  GPU (NVIDIA CUDA / LLM runtime)
  Qwen 3.5:27B (Q4, ~17 GB VRAM, native tool calling)
  PostgreSQL
  SQLite (memory)
  Orchestrator + VRAM Guard
  Agent A, Agent B, Agent C
  Shared Listener (Telegram, email, voice)

Dual-DB

Dual-DB Architecture

AIMOS uses two database systems with clearly separated responsibilities:

PostgreSQL (Relay Database)

Central message relay between Shared Listener, Orchestrator, and agents. Stores incoming messages, audit logs, PII Vault mappings, and session data. Multi-process capable through connection pooling.

SQLite (Agent Memory)

Each agent has its own SQLite database with semantic, episodic, and procedural memory. Hybrid search via FTS5 + vector embeddings. Portable by simply copying the file.
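As an illustration of hybrid search over a single SQLite file, the sketch below combines an FTS5 keyword ranking with a vector ranking and merges them by reciprocal rank fusion. It is not the AIMOS implementation: the table names memory_fts and memory_vec, the fusion method, and the letter-frequency toy_embed function (a stand-in for a real embedding model) are all assumptions. It also assumes the local SQLite build includes FTS5.

```python
import json
import math
import sqlite3


def toy_embed(text):
    """Stand-in embedding: a 26-dim letter-frequency vector (a real system would use a model)."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def open_memory(path=":memory:"):
    """Open (or create) a memory database; the whole store lives in one file."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS memory_vec (id INTEGER PRIMARY KEY, embedding TEXT)")
    db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS memory_fts USING fts5(content)")
    return db


def remember(db, text):
    """Store one memory entry in both the vector table and the full-text index."""
    cur = db.execute("INSERT INTO memory_vec (embedding) VALUES (?)", (json.dumps(toy_embed(text)),))
    db.execute("INSERT INTO memory_fts (rowid, content) VALUES (?, ?)", (cur.lastrowid, text))
    db.commit()
    return cur.lastrowid


def hybrid_search(db, query, k=3):
    """Return up to k entries, fusing keyword and vector rankings."""
    # Quote each term so punctuation is not interpreted as FTS5 query syntax.
    terms = " OR ".join('"' + t.replace('"', '""') + '"' for t in query.split())
    keyword_ids = []
    if terms:
        rows = db.execute(
            "SELECT rowid FROM memory_fts WHERE memory_fts MATCH ? ORDER BY rank", (terms,)
        )
        keyword_ids = [row[0] for row in rows]

    query_vec = toy_embed(query)
    scored = [
        (cosine(query_vec, json.loads(emb)), rid)
        for rid, emb in db.execute("SELECT id, embedding FROM memory_vec")
    ]
    vector_ids = [rid for _, rid in sorted(scored, reverse=True)]

    # Reciprocal rank fusion: each ranking contributes 1 / (60 + position).
    fused = {}
    for ranking in (keyword_ids, vector_ids):
        for pos, rid in enumerate(ranking, start=1):
            fused[rid] = fused.get(rid, 0.0) + 1.0 / (60 + pos)
    top = sorted(fused, key=fused.get, reverse=True)[:k]
    return [
        db.execute("SELECT content FROM memory_fts WHERE rowid = ?", (rid,)).fetchone()[0]
        for rid in top
    ]
```

Because everything sits in one database file, copying that file carries both the text index and the embeddings, which is what makes this layout portable.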

PostgreSQL
  message_relay
  audit_log
  pii_vault
  sessions
  llm_usage
SQLite (per agent)
  semantic_memory
  episodic_memory
  procedural_memory
  vector_embeddings
  dreaming_log
Sync via Orchestrator

Interoperability

Agent Portability

AIMOS agents are portable, compatible, and interoperable through open standards.

OAP Export/Import

The Open Agent Package format enables the complete export of an agent including memory, skills, and configuration as a portable archive.

agent_export.oap
  config.yaml
  memory.sqlite
  skills/
  prompts/
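
The archive layout above can be sketched as a simple pack and unpack routine. The container format of an .oap file is not specified here, so this example assumes a ZIP archive purely for illustration; export_agent, import_agent, and OAP_ENTRIES are hypothetical names, not part of AIMOS.

```python
import zipfile
from pathlib import Path

# Entries taken from the layout above; the ZIP container is an assumption.
OAP_ENTRIES = ("config.yaml", "memory.sqlite", "skills", "prompts")


def export_agent(agent_dir, archive_path):
    """Pack config, memory, skills, and prompts of one agent into a single archive."""
    agent_dir = Path(agent_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for entry in OAP_ENTRIES:
            source = agent_dir / entry
            if source.is_file():
                archive.write(source, entry)
            elif source.is_dir():
                for file in sorted(source.rglob("*")):
                    if file.is_file():
                        archive.write(file, file.relative_to(agent_dir).as_posix())
    return Path(archive_path)


def import_agent(archive_path, target_dir):
    """Unpack an archive into target_dir and return the sorted list of restored entries."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())
```

A transfer between two nodes would then amount to calling export_agent on the source, copying the file, and calling import_agent on the destination.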

MCP Compatibility

The Model Context Protocol enables external LLMs (Claude, GPT, etc.) to access AIMOS skills as an optional additional interface — not the primary communication path.

sql_query, file_read, rest_call, memory_search, +35 more
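
To show what exposing skills over MCP can look like on the wire, the sketch below answers a JSON-RPC tools/list request with one tool entry per skill. The name, description, and inputSchema fields follow the MCP tool definition; the descriptions, the empty input schemas, and the handle_request function are placeholders, not the real AIMOS skill definitions.

```python
import json

# Skill names from the list above; everything else in this sketch is illustrative.
SKILLS = ["sql_query", "file_read", "rest_call", "memory_search"]


def tools_list_result(skills):
    """Build the result payload of an MCP tools/list response."""
    return {
        "tools": [
            {
                "name": name,
                "description": f"AIMOS skill '{name}' (placeholder description)",
                "inputSchema": {"type": "object", "properties": {}},
            }
            for name in skills
        ]
    }


def handle_request(request):
    """Answer tools/list; reject any other method with the JSON-RPC 'method not found' error."""
    if request.get("method") == "tools/list":
        return {"jsonrpc": "2.0", "id": request.get("id"), "result": tools_list_result(SKILLS)}
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "error": {"code": -32601, "message": "Method not found"},
    }


response = handle_request({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
print(json.dumps(response, indent=2))
```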

A2A Agent Cards

Each agent publishes an Agent Card (JSON-LD) according to the Google A2A specification. External systems can query capabilities, input formats, and trust level.

{
  "name": "Engineering Agent",
  "skills": ["cad_read", "bom_gen"],
  "trust_ring": 1
}
SovereignNode A → Export: agent.oap → Transfer OAP (Memory + Skills + Config) → Import → SovereignNode B → Agent active
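
As a small illustration of how an external system might inspect an Agent Card before delegating work, the sketch below parses the fields shown in the A2A Agent Cards fragment. It assumes the card is available as plain JSON with exactly those three fields; the helper names are hypothetical, and how a card is fetched or how trust_ring values are interpreted is not covered.

```python
import json

# Card content mirrors the fragment in the A2A Agent Cards section.
CARD_JSON = """
{
  "name": "Engineering Agent",
  "skills": ["cad_read", "bom_gen"],
  "trust_ring": 1
}
"""


def load_card(raw):
    """Parse an Agent Card and verify that the fields used here are present."""
    card = json.loads(raw)
    for field in ("name", "skills", "trust_ring"):
        if field not in card:
            raise ValueError(f"agent card is missing '{field}'")
    return card


def supports(card, skill):
    """True if the card advertises the given skill name."""
    return skill in card["skills"]


card = load_card(CARD_JSON)
print(card["name"], "supports bom_gen:", supports(card, "bom_gen"))
```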

Technical Highlights

What Sets AIMOS Apart

Native Tool Calling

No text hacks or regex parsing — AIMOS uses the native tool-calling API of the LLM. The agent controls systems directly, instead of just describing actions.
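
The difference from text parsing can be sketched as follows: the model returns a structured tool call, and the agent dispatches it to a registered function. This example assumes the OpenAI-compatible tool-call shape (a function name plus a JSON string of arguments), which many local runtimes emit; the registry, the dispatch function, and the stub handler are hypothetical and stand in for real skills.

```python
import json


def memory_search(query, limit=3):
    """Stub skill for illustration; a real skill would query the agent's memory."""
    return [f"result {i + 1} for '{query}'" for i in range(limit)]


# Registry mapping tool names to callables (hypothetical; one entry for brevity).
TOOLS = {"memory_search": memory_search}


def dispatch(tool_call):
    """Execute one structured tool call and return a result message for the model."""
    name = tool_call["function"]["name"]
    if name not in TOOLS:
        return {"role": "tool", "name": name, "content": json.dumps({"error": "unknown tool"})}
    # Arguments arrive as a JSON string, so no regex over free text is needed.
    arguments = json.loads(tool_call["function"]["arguments"])
    result = TOOLS[name](**arguments)
    return {"role": "tool", "name": name, "content": json.dumps(result)}


example_call = {
    "type": "function",
    "function": {"name": "memory_search", "arguments": '{"query": "open invoices", "limit": 2}'},
}
print(dispatch(example_call))
```

The returned message would be appended to the conversation so the model can continue with the tool output in context.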

Multilingual Voice

Speech recognition (Whisper STT) and speech synthesis (Piper TTS) cover the languages supported by these models. Agents understand voice messages and respond in the user's native language.

Token Tracking

Every LLM call is captured: input/output tokens, latency, context utilization. Full cost transparency per agent, per conversation, per month.
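
A minimal sketch of such a usage record and a per-agent, per-month rollup is shown below. The field names, the LlmCall class, and the monthly_usage function are assumptions for illustration; the actual schema of the llm_usage table is not documented here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LlmCall:
    """One captured LLM call (hypothetical field names)."""
    agent: str
    conversation: str
    started_at: datetime
    input_tokens: int
    output_tokens: int
    latency_ms: float
    context_window: int

    @property
    def context_utilization(self):
        """Share of the context window consumed by this call (0.0 to 1.0)."""
        return (self.input_tokens + self.output_tokens) / self.context_window


def monthly_usage(calls):
    """Aggregate call count and token totals per (agent, 'YYYY-MM')."""
    totals = defaultdict(lambda: {"calls": 0, "input_tokens": 0, "output_tokens": 0})
    for call in calls:
        bucket = totals[(call.agent, call.started_at.strftime("%Y-%m"))]
        bucket["calls"] += 1
        bucket["input_tokens"] += call.input_tokens
        bucket["output_tokens"] += call.output_tokens
    return dict(totals)


calls = [
    LlmCall("erp_agent", "conv-1", datetime(2025, 3, 3, 9, 0), 1200, 300, 850.0, 16000),
    LlmCall("erp_agent", "conv-2", datetime(2025, 3, 9, 14, 30), 800, 200, 620.0, 16000),
    LlmCall("support_agent", "conv-3", datetime(2025, 4, 1, 8, 15), 500, 100, 400.0, 16000),
]
for key, usage in sorted(monthly_usage(calls).items()):
    print(key, usage)
```

The same grouping keyed by conversation instead of month would give the per-conversation view.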

Conversation Threading

Every agent knows who it is talking to on which channel. Telegram, email, and internal messages are cleanly separated — no mix-up between conversation partners.
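
One way to keep conversations separate is to key every history by agent, channel, and conversation partner. The sketch below illustrates that idea with an in-memory store; ThreadKey and ThreadStore are hypothetical names, and a real deployment would presumably persist this in the session data rather than in a Python dictionary.

```python
from collections import defaultdict
from typing import NamedTuple


class ThreadKey(NamedTuple):
    """Identifies one conversation: which agent, on which channel, with whom."""
    agent: str
    channel: str
    peer: str


class ThreadStore:
    """In-memory message histories, one list per ThreadKey."""

    def __init__(self):
        self._threads = defaultdict(list)

    def append(self, agent, channel, peer, role, text):
        self._threads[ThreadKey(agent, channel, peer)].append((role, text))

    def history(self, agent, channel, peer):
        """Return a copy of the history for exactly this agent, channel, and peer."""
        return list(self._threads.get(ThreadKey(agent, channel, peer), []))


store = ThreadStore()
store.append("support_agent", "telegram", "alice", "user", "Where is my order?")
store.append("support_agent", "email", "alice", "user", "Please send the invoice.")
store.append("support_agent", "telegram", "bob", "user", "Can I change my appointment?")
print(store.history("support_agent", "telegram", "alice"))
```

Because the channel is part of the key, the same person writing via Telegram and via email ends up in two separate threads.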