The Individual Agent

The AIMOS Agent

Autonomous specialists with their own memory, their own tools, and personality — each agent is a specialized digital assistant.

Five Building Blocks

Structure of an AIMOS Assistant

An AIMOS assistant is an autonomous specialist that combines five core components:

O · Orchestrator

Controls the thinking process in phases: Observe, Orient, Decide, Act — following the OODA principle.

M · Memory

Dedicated SQLite database with semantic, episodic, and procedural long-term memory.

S · Skills (Tools)

Domain-specific tool collection: SQL queries, REST calls, file operations, custom functions.

C · Connector (Interface)

Communication channel to the user: Telegram, email, voice, or dashboard.

L · LLM (Language Model)

Local model via LLM runtime. The assistant builds the complete prompt from system prompt, memory context, and user query.
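The prompt assembly described above can be sketched as follows. This is a minimal illustration, not the actual AIMOS API; the function name and formatting are assumptions.

```python
def build_prompt(system_prompt: str, memory_context: list[str], user_query: str) -> str:
    """Assemble the full prompt the local LLM receives:
    system instructions first, then retrieved memory snippets,
    then the current user query. (Illustrative sketch.)"""
    memory_block = "\n".join(f"- {fact}" for fact in memory_context)
    return (
        f"{system_prompt}\n\n"
        f"Relevant memory:\n{memory_block}\n\n"
        f"User: {user_query}"
    )

prompt = build_prompt(
    "You are a procurement assistant.",
    ["Steel price Q1: 850 EUR/t", "Supplier X: 14 days lead time"],
    "What did steel cost last quarter?",
)
```

The key design point is that the assistant, not the model, controls what context enters the prompt on each call.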

The key point: The assistant prepares — you decide. Your expertise, your experience, and your judgment remain indispensable. The assistant takes routine work off your plate: gathering data, monitoring deadlines, preparing drafts. The expert work stays with you.

OODA Cycle

The Orchestrator — how an AIMOS assistant thinks

A chatbot reacts to each message individually. An AIMOS assistant opens its inbox, surveys everything, recognizes connections — and then acts in a coordinated way. The principle behind it: the OODA cycle from decision theory.

[Diagram: the OODA cycle on the agent's "desktop"]

Context — read workspace, check data sources
Observe — read all messages; group, categorize; find connections
Orient — research, verify; identify dependencies; build situation report
Decide — create action plan; per stakeholder: what do they need?
Act — respond per thread; emails, tickets; coordinated action
Persist — update todo list, document status, set reminders

Workspace files carry state into the next session. The assistant holds the GPU across all phases — like a clerk closing their door to work.

Conventional Chatbot

Reactive mode

× Processes each message in isolation
× Cannot recognize connections between requests
× Responds immediately — without knowing the big picture
× Cannot inform stakeholder A about impacts from stakeholder B

AIMOS Assistant

OODA mode

✓ Surveys all messages at once
✓ Recognizes cross-thread connections
✓ Builds a situation report before acting
✓ Responds per stakeholder — but informed by the big picture

Example: Three messages, one situation report

Thread tg:smith

Developer Smith

"Unit tests for SG-03 are complete."

Thread email:req-manager

Requirements Manager

"TSR-17 has been upgraded to ASIL D."

Thread int:testmgr

Test Manager

"Integration test for SG-03 is failing."

SITUATION REPORT (cross-thread)

"All three messages concern Safety Goal SG-03. The ASIL D upgrade of TSR-17 changes the verification requirements: Smith's unit tests are no longer sufficient (MC/DC coverage required). The failing integration test is a separate timing issue."

→ To Smith

"Tests completed, but due to ASIL D upgrade, statement coverage is no longer sufficient. MC/DC required."

→ To Requirements Manager

"Upgrade registered. 3 SW requirements affected, unit tests need to be extended. Impact analysis attached."

→ To Test Manager

"Timing issue, not caused by ASIL upgrade. Please send logfile. Integration test must be repeated after unit test extension."

Each stakeholder receives only what is relevant to them — but every response is informed by the big picture.

How does this work technically?

The orchestrator is not an AI model, but deterministic code. It controls the OODA cycle by calling the LLM multiple times with different tasks:

O+O — All messages are presented to the LLM as a block: "Analyze and find connections."
D — The analysis is returned. The orchestrator sets it as context in the next prompt: "Consolidate and build a situation report."
A — Per thread: situation report + individual conversation history + response task. Each stakeholder gets their own response.
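The deterministic phase sequence can be sketched as a plain loop around an LLM callable. Everything here (function names, prompt wording) is an illustrative assumption, not the real orchestrator code:

```python
def run_ooda_cycle(llm, threads: dict[str, list[str]]) -> dict[str, str]:
    """Deterministic orchestrator sketch: the phase order is fixed in code,
    and the LLM only answers the task it is handed in each phase.
    `llm` is any callable mapping a prompt string to a completion string."""
    # Observe + Orient: present all latest messages to the LLM as one block
    inbox = "\n".join(msgs[-1] for msgs in threads.values())
    analysis = llm(f"Analyze these messages and find connections:\n{inbox}")

    # Decide: consolidate the analysis into a situation report
    report = llm(f"Consolidate and build a situation report:\n{analysis}")

    # Act: one response per thread, each informed by the shared report
    replies = {}
    for thread_id, history in threads.items():
        replies[thread_id] = llm(
            f"Situation report:\n{report}\n\n"
            f"Conversation:\n" + "\n".join(history) + "\n\n"
            "Write the response for this stakeholder."
        )
    return replies
```

Because the loop, not the model, advances the phases, the LLM cannot skip or mix them, and thread isolation falls out naturally: each Act call sees only its own conversation history plus the report.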

What makes the situation report special

Ephemeral, not permanent

The situation report lives only during a batch run. It is not stored — next run, new report. Long-term insights migrate to memory.

Thread isolation remains

The assistant knows the big picture, but stakeholder A learns nothing about the conversation with stakeholder B — only about impacts relevant to them.

The orchestrator decides, not the LLM

The phase sequence is deterministic. The LLM cannot skip or mix phases — it receives a clear task for each phase.

Three Modes

Not every assistant needs the full OODA cycle

AIMOS supports three agent types — from quick voice responses to structured case workers. All three can act proactively (cronjobs, reminders, follow-ups).

Voice Assistant

<500ms latency

Immediate reaction to voice input. Whisper transcription runs in parallel with LLM warmup. Short, precise responses.

Reception, voice control, quick queries

Chat Assistant

<5s latency

Quick conversation via Telegram, email, or dashboard. Memory, customer records, delegation to colleagues. Cronjobs for proactive reminders.

Customer support, helpdesk, order intake

OODA Worker

Batch — full OODA cycle

Checks its inbox, surveys all processes, recognizes cross-thread connections, builds a situation report — and then acts in a coordinated way.

Process management, compliance, project assistance

All three types have memory, skills, connectors, and can act proactively. The difference is in the thinking approach: one thread vs. the big picture.

3-Tier Memory

Long-Term Memory

Three memory types, hybrid search, and a Dreaming cycle for consolidation.

Semantic — facts & knowledge: "Steel price Q1: 850 EUR/t", "SAP API: /api/v2/stock", "Supplier X: 14 days lead time"

Episodic — experiences & conversations: "2026-03-15: Stock query", "2026-03-18: Price comparison", "2026-03-20: CAD analysis"

Procedural — workflows & patterns: "Order: Check→Approve→Post", "BOM export: DWG→Parse→CSV", "Inventory: SQL→Diff→Report"

Hybrid Search: FTS5 + vector embeddings + RRF fusion — relevance ranking combines keyword and semantic matches.
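The RRF fusion step can be sketched in a few lines. The constant k=60 follows the common Reciprocal Rank Fusion formulation; how AIMOS tunes it is not stated in the text, so treat this as an illustration:

```python
def rrf_fuse(keyword_ranked: list[str], vector_ranked: list[str], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank).
    A document ranked highly by either the FTS5 (keyword) or the
    embedding (vector) search rises to the top of the fused list."""
    scores: dict[str, float] = {}
    for ranking in (keyword_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs only rank positions, no score normalization, which is why it combines keyword and semantic results so cleanly.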

Dreaming — Memory Consolidation via LLM Call

In idle state, the agent analyzes its conversations with an LLM call, extracts facts, updates notes and todo lists, consolidates its memory, and creates weekly reports.

Like the human brain during sleep — the agent condenses experiences into knowledge, removes redundant entries, and strengthens important connections. The result: more precise answers with lower token consumption.

Dreaming Cycle (Consolidation)

Like the human brain during sleep, AIMOS consolidates memories during idle time:

Collect episodes (all conversations) → Detect patterns (LLM analysis) → Condense (extract facts) → Store semantically (long-term memory)
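A minimal consolidation pass over the SQLite memory could look like this. The table names, column names, and prompt are assumptions for illustration, not the actual AIMOS schema:

```python
import sqlite3

def dream(conn: sqlite3.Connection, llm) -> None:
    """Illustrative dreaming pass: read recent episodes, let the LLM
    extract durable facts (one per line), and store them in semantic
    memory. `llm` is any callable prompt -> completion."""
    episodes = [row[0] for row in conn.execute(
        "SELECT content FROM episodic ORDER BY created_at DESC LIMIT 50")]
    facts = llm("Extract durable facts, one per line:\n" + "\n".join(episodes))
    with conn:  # commit all inserts as one transaction
        for fact in filter(None, (f.strip() for f in facts.splitlines())):
            conn.execute("INSERT INTO semantic(fact) VALUES (?)", (fact,))
```

Running this during idle time is what turns raw conversation logs into the compact facts the hybrid search later retrieves.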

Language Model

The LLM — The Agent's Brain

The Large Language Model (LLM) is the thinking engine behind every agent. It understands language, makes decisions, and controls tools — and runs entirely on your own hardware.

What Does the LLM Do?

  • Understands queries — in natural language, in any language
  • Selects tools — autonomously decides whether a database query, email send, or calculation is needed
  • Formulates responses — technically accurate, in the context of the ongoing conversation
  • Learns from corrections — through long-term memory, not through model training

Why Local Instead of Cloud?

Data Sovereignty

Your queries never leave the network. No cloud provider sees your data.

No Ongoing Costs

No per-query token price. The model runs without usage limits on your GPU.

Availability

No API limit, no rate limiting, no dependency on external services.

Escalation When Needed

For complex tasks: automatic, anonymized escalation to a cloud LLM. → Details

Integration

Available Connectors

The agent communicates via connectors — standardized interfaces to users, systems, and other agents. New connectors are continuously developed and can be added at any time for your specific IT landscape.

Telegram

Text, voice messages, documents. Proactive messages for reminders, alerts, and results. Shared listener for all agents.

Email

IMAP/SMTP for sending and receiving. POP3 monitoring for incoming mailboxes. HTML format and file attachments.

Voice

Whisper STT + Piper TTS — fully local. Speech recognition and synthesis in all languages, without cloud services.

SFTP

File access to workstations via Tailscale VPN. Shared folders for DXF, PDF, Excel — encrypted and without open ports.

SQL Databases

PostgreSQL, MSSQL, Firebird — SELECT queries only. No write access to production data. Read-only by design.
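"Read-only by design" can be enforced before a query ever reaches the database. The sketch below is a naive string-level guard for illustration; a production connector would also use database-level permissions (a read-only DB user) rather than rely on inspection alone:

```python
def is_read_only(sql: str) -> bool:
    """Naive guard sketch: accept only a single SELECT statement.
    Rejects anything else, including multi-statement batches that
    try to smuggle a write after a SELECT."""
    stmt = sql.strip().rstrip(";").strip()
    return stmt.lower().startswith("select") and ";" not in stmt
```

Layering such a check under a read-only database account means neither the LLM nor a malformed prompt can write to production data.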

REST / SOAP API

Universal API integration for ERP, CRM, inventory management. GET, POST, PUT with configurable authentication.

Thread Architecture

One Assistant, Many Customers — Simultaneously

Every conversation gets its own thread ID. The assistant only ever sees the current customer — no matter how many run in parallel.

Isolated Threads

Every customer automatically gets their own thread. Telegram user A never sees the conversation of email customer B.

Cross-Channel

A customer writes via Telegram: “I sent you an email.” The assistant finds the email thread and instantly has the context.

Thread Propagation

When an assistant delegates a task to a colleague, the thread ID travels along. The recipient works in the same customer context.

Automatic Assignment — every channel generates the correct thread ID on arrival

Email Threading — In-Reply-To and References headers for correct matching

Files per Thread — attachments are assigned to the process, cross-channel

Code-Level Isolation — enforced at the database level, not dependent on the AI model

Toolbox

Over 30 Skills — Modularly Configurable

Each AI assistant receives exactly the skills it needs. Custom skills can be added at any time — for any industry, any system, any workflow.

Communication

Email

IMAP/SMTP, POP3 monitoring. Send, receive, attachments, automatic mailbox monitoring.

Telegram

Text, voice messages, documents. Proactive notifications for alerts and results.

Voice

Whisper STT + Piper TTS — fully local. Speech recognition and synthesis in all languages.

MS Teams

Read and send channel messages, create online meetings. Microsoft Graph API.

Project Management

JIRA

Search, create, update issues and change status. JQL queries, sprint overview.

Azure DevOps

Work items, pipelines, boards. Create tasks, track status, monitor CI/CD.

MS Project

Read projects and tasks, track milestones, update deadlines.

Codebeamer (ALM)

Requirements, test cases, traceability links, baseline comparison. For automotive development.

Documentation & Reporting

Confluence / SharePoint

Wiki pages and documents — read, create, update. DMS integration.

Word / Excel / PowerPoint

Create Office documents: reports as Word, data as Excel, presentations as PowerPoint.

Document Recognition (OCR)

Scan invoices, delivery notes, contracts. Auto-detect fields. Processed locally.

KPI Reports

Daily and weekly summaries, CSV export, automatic overviews.

Data & Systems

ERP (SAP / DATEV)

Query articles, customers, orders, inventory levels. Multi-backend: SAP, DATEV, custom.

SQL Databases

PostgreSQL, MSSQL, Firebird — read-only by design. No write access to production data.

GitLab / GitHub

Repositories, merge requests, CI/CD pipelines. Read commits, create issues, comment.

SFTP / Remote Files

File access to workstations via VPN. Encrypted and without open ports.

Compliance & Organization

Deadline Management

Certifications, maintenance intervals, contract terms. Proactive reminders before expiry.

Inventory Management

Monitor stock levels, reorder suggestions, minimum quantity alerts.

Calendar (local + Outlook)

Appointments, deadlines, follow-ups. Send Outlook invitations. Public holidays automatically considered.

Contact Management

People, companies, phone numbers, email addresses. Automatic updates.

Modular Toolbox: Each AI assistant receives only the skills it needs. Custom integrations (specialized ERP systems, industry-specific tools, internal databases) can be developed and added as new skills at any time — without changing the core.