IIMS – Architecture & Design (Target Model)

Information Infrastructure Management System (IIMS)

Unified Operations Coordination Platform

Purpose

This document is the final, consolidated architecture and design reference for IIMS.

It intentionally describes the target design of the system (the expected end-state architecture and workflows), without separating implementation stages in the main body.

The goals are to:

Present a coherent and professional system design to management and stakeholders
Reflect a carefully designed long‑term architecture
Allow the current implementation to be positioned as an incremental delivery of this design

A dedicated section at the end documents the current implementation status and roadmap, so teams clearly understand what is already delivered and what will be implemented next.

1. System Vision

IIMS (Information Infrastructure Management System) is a unified operations coordination platform that connects monitoring, topology, incidents, maintenance, and ticketing into a single operational model.

Instead of replacing existing platforms such as Zabbix, Zammad, or Keycloak, IIMS integrates them and acts as a central operational coordination and intelligence layer.

The purpose of IIMS is simple:

Give operations teams one place to understand what is happening, where it is happening, what is impacted, and what actions are being taken.

IIMS becomes the daily operational control center for infrastructure operations.

2. High-Level System Architecture

The following diagram shows the end-to-end architecture of IIMS, including identity, application, secrets management, persistence, and provider integrations.

Figure: IIMS High-Level Architecture

This diagram illustrates:

OIDC authentication with Keycloak and JWT
Secure API access between UI and IIMS-API
Secret isolation using Vault
Provider-agnostic monitoring and ticketing integrations

3. Core Principles

Integrate, Don’t Replace

IIMS integrates existing monitoring, ticketing, and identity systems rather than replacing them.

One Operational View

All operational information is visible in one place:

Sites and assets
Alerts and incidents
Maintenance windows
Tickets and workflows
Topology and geo maps

Provider Independence

All external systems are abstracted behind adapters. The domain model never depends on vendor APIs.

Correlation Over Raw Data

IIMS focuses on correlating signals into operational problems rather than displaying raw events.

Operations First

The design is driven by operator questions:

Which site is affected?
What is the impact?
Is this planned maintenance?
Who is responsible?

4. Core Domain Model

Infrastructure Model

Site – A physical or logical location (region, data center, branch)
Asset – A monitored object (device, VM, service)
Link – A topology connection between sites and/or assets

Sites provide hierarchy and geo‑context. Assets are the primary monitored entities. Links describe connectivity and dependencies.
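
As an illustration, the three infrastructure entities could be sketched as plain data classes. This is a minimal sketch, not the IIMS schema: every field name (parent_id, lat, lon, kind, a_end, b_end) and the site_path helper are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Site:
    id: str
    name: str
    parent_id: Optional[str] = None  # None marks a root site (assumed convention)
    lat: Optional[float] = None      # optional geo-context
    lon: Optional[float] = None


@dataclass(frozen=True)
class Asset:
    id: str
    name: str
    site_id: str  # assumption: each asset belongs to exactly one site
    kind: str     # e.g. "device", "vm", "service"


@dataclass(frozen=True)
class Link:
    id: str
    a_end: str  # id of a site or an asset
    b_end: str  # id of a site or an asset


def site_path(site_id: str, sites: Dict[str, Site]) -> List[str]:
    """Return site names from the root down to the given site."""
    path: List[str] = []
    seen = set()
    current = sites.get(site_id)
    while current is not None and current.id not in seen:
        seen.add(current.id)  # guard against accidental cycles in the hierarchy
        path.append(current.name)
        current = sites.get(current.parent_id) if current.parent_id else None
    path.reverse()
    return path
```

With a region "EMEA" as the parent of a data center "DC-1", site_path for the data center returns the names in root-first order, which is the kind of hierarchy context a geo or health view would need.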

Monitoring Configuration Model

Monitoring Target – A configured monitoring backend
Collector – A proxy or agent pool for data collection
Pack – A reusable monitoring template bundle
Profile – A concrete monitoring plan applied to assets

This model allows provider‑agnostic provisioning and routing.

Operational Model

Alert – A technical signal from monitoring
Incident – A human‑level operational problem
Ticket – An external ITSM record linked to an incident
Maintenance – A planned suppression window
Activity Event – An audit and operational timeline record

Alerts are technical symptoms. Incidents are human workflows. Tickets represent process and communication.

5. Provider & Integration Architecture

Adapter Pattern

All external systems are integrated through adapters:

Monitoring adapters (Zabbix today; Prometheus planned)
Ticketing adapters (Zammad today; other ITSM systems planned)
Identity adapters (Keycloak)

Adapters translate domain operations into provider‑specific APIs and normalize responses.
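
The adapter idea can be sketched as follows. This is a hedged illustration only: MonitoringAdapter, FakeZabbixAdapter, collect_alerts, and the Alert fields are hypothetical names, and the input dictionaries are canned sample payloads whose keys are illustrative rather than a claim about the exact Zabbix API response shape. The severity table follows Zabbix's numeric 0 to 5 scale, but the mapping onto three domain levels is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Protocol


@dataclass(frozen=True)
class Alert:
    provider: str
    external_id: str
    asset_id: str
    severity: str  # normalized domain value: "info", "warning", or "critical"
    message: str


class MonitoringAdapter(Protocol):
    """Domain-side contract; the domain model only ever sees this interface."""

    def fetch_alerts(self) -> List[Alert]: ...


class FakeZabbixAdapter:
    """Stand-in adapter that normalizes canned payloads instead of calling Zabbix."""

    # Assumed mapping from the provider's numeric severity to domain severity.
    _SEVERITY = {0: "info", 1: "info", 2: "warning", 3: "warning",
                 4: "critical", 5: "critical"}

    def __init__(self, raw_problems: List[Dict[str, str]]):
        self._raw = raw_problems

    def fetch_alerts(self) -> List[Alert]:
        return [
            Alert(
                provider="zabbix",
                external_id=str(p["eventid"]),
                asset_id=p["host"],
                severity=self._SEVERITY.get(int(p["severity"]), "info"),
                message=p["name"],
            )
            for p in self._raw
        ]


def collect_alerts(adapters: Iterable[MonitoringAdapter]) -> List[Alert]:
    """Provider-independent code path: it never touches vendor payloads."""
    alerts: List[Alert] = []
    for adapter in adapters:
        alerts.extend(adapter.fetch_alerts())
    return alerts
```

Because collect_alerts depends only on the protocol, a second provider could be added by writing another class with a fetch_alerts method, without changing the calling code.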

Routing Model

Routing resolves:

Which monitoring target an asset uses
Which collector performs the collection

Routing is:

Policy‑driven
Site‑aware
Provider‑independent

Targets and collectors are selected before any provider call is executed.
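
One way to picture policy-driven, site-aware routing is a lookup that walks up the site hierarchy until a policy matches. The sketch below is an assumption about how such a resolver might look: RoutingPolicy, Route, resolve_route, and the "nearest site wins, then default" rule are illustrative and not taken from the IIMS design.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RoutingPolicy:
    site_id: Optional[str]  # None marks the default policy (assumed convention)
    target_id: str
    collector_id: str


@dataclass(frozen=True)
class Route:
    target_id: str
    collector_id: str


def resolve_route(
    asset_site_id: str,
    site_parents: Dict[str, Optional[str]],
    policies: List[RoutingPolicy],
) -> Route:
    """Pick target and collector before any provider call is made.

    Walks from the asset's site towards the root; the nearest site with a
    policy wins, otherwise the default policy applies.
    """
    by_site = {p.site_id: p for p in policies}
    seen = set()
    current: Optional[str] = asset_site_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against cycles in the hierarchy
        if current in by_site:
            policy = by_site[current]
            return Route(policy.target_id, policy.collector_id)
        current = site_parents.get(current)
    default = by_site.get(None)
    if default is None:
        raise LookupError(f"no routing policy matches site {asset_site_id!r}")
    return Route(default.target_id, default.collector_id)
```

The result is a plain Route value with no provider types in it, so the same resolution step can sit in front of any monitoring adapter.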

6. Operational Workflows

Alert Ingestion

Alerts are ingested from monitoring providers, normalized, cached, and evaluated against maintenance rules.

Maintenance Suppression

While a maintenance window is active, matching alerts are suppressed and do not create incidents or tickets. Suppressed alerts are still recorded, so audit history is preserved.
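
A minimal sketch of the suppression check, assuming windows are scoped by site and by a half-open time interval. MaintenanceWindow, suppressing_window, evaluate_alert, and the audit record layout are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class MaintenanceWindow:
    id: str
    start: datetime           # inclusive
    end: datetime             # exclusive
    site_ids: FrozenSet[str]  # assumption: windows are scoped to sites


def suppressing_window(
    site_id: str, raised_at: datetime, windows: List[MaintenanceWindow]
) -> Optional[MaintenanceWindow]:
    """Return the first active window covering the alert, if any."""
    for window in windows:
        if site_id in window.site_ids and window.start <= raised_at < window.end:
            return window
    return None


def evaluate_alert(
    alert_id: str,
    site_id: str,
    raised_at: datetime,
    windows: List[MaintenanceWindow],
    audit_log: List[Dict[str, str]],
) -> bool:
    """Return True if the alert may create incidents or tickets.

    Every decision is appended to the audit log, including suppressions.
    """
    window = suppressing_window(site_id, raised_at, windows)
    if window is not None:
        audit_log.append(
            {"alert": alert_id, "action": "suppressed", "window": window.id}
        )
        return False
    audit_log.append({"alert": alert_id, "action": "accepted"})
    return True
```

Suppression here only changes what happens downstream; the alert itself still leaves a trace in the audit log, which matches the requirement to preserve history.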

Alert Correlation

Alerts are grouped into incidents by rules and by operator actions. Dedicated correlation engines are planned as a future capability.
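
To make rule-based grouping concrete, the sketch below groups alerts from the same site that arrive close together in time. The rule itself (same site, within a ten-minute gap) is an assumption picked for illustration, as are the Alert and Incident fields and the correlate function; it is not the correlation logic of IIMS.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass(frozen=True)
class Alert:
    id: str
    site_id: str
    raised_at: datetime


@dataclass
class Incident:
    site_id: str
    alerts: List[Alert] = field(default_factory=list)


def correlate(
    alerts: List[Alert], max_gap: timedelta = timedelta(minutes=10)
) -> List[Incident]:
    """Group alerts into incidents per site.

    An alert joins the site's latest incident if it arrives within max_gap
    of that incident's most recent alert; otherwise it opens a new incident.
    """
    incidents: List[Incident] = []
    latest_by_site: Dict[str, Incident] = {}
    for alert in sorted(alerts, key=lambda a: a.raised_at):
        incident = latest_by_site.get(alert.site_id)
        if (
            incident is not None
            and alert.raised_at - incident.alerts[-1].raised_at <= max_gap
        ):
            incident.alerts.append(alert)
        else:
            incident = Incident(site_id=alert.site_id, alerts=[alert])
            incidents.append(incident)
            latest_by_site[alert.site_id] = incident
    return incidents
```

An operator action such as "merge these two incidents" would sit on top of the same data structure; only the grouping rule would differ.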

Incident Lifecycle

Incidents track:

Status and severity
Related alerts
Linked tickets
Activity history

They represent the central unit of human operations.

Ticket Integration

Tickets are created from incidents and synchronized with external ITSM systems.

Topology & Geo Impact

Topology and geo views provide situational awareness and support impact analysis and root cause workflows.
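
Impact analysis over topology can be pictured as a graph traversal. The sketch below assumes links are expressed as directed (upstream, downstream) dependency pairs, which is an assumption of this example, and uses a breadth-first search to list everything that depends on a failed node. The function name impacted_nodes and the node ids are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple


def impacted_nodes(links: Iterable[Tuple[str, str]], failed: str) -> List[str]:
    """Return nodes downstream of a failed node, nearest dependents first.

    Each link is an (upstream, downstream) pair: the downstream node depends
    on the upstream node for connectivity.
    """
    downstream: Dict[str, List[str]] = {}
    for up, down in links:
        downstream.setdefault(up, []).append(down)

    seen = {failed}
    queue = deque([failed])
    impacted: List[str] = []
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in seen:  # visit each node once, even with redundant paths
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted
```

This simple reachability view treats any dependent as impacted. A real impact model would also need to account for redundancy, for example a branch with two uplinks that stays reachable when one of them fails.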

7. Performance & Scalability

IIMS uses cached read models for:

Alert summaries
Incident counters
Site and asset health
GeoMap and topology views

Background workers refresh these caches so that the UI reads precomputed results and stays fast at scale.
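
The cached read model pattern can be sketched with a value that readers fetch instantly and a worker that recomputes it in the background. This is an in-process illustration under stated assumptions: CachedReadModel, start_refresher, and the loader callback are hypothetical names, and a real deployment might use an external cache and a job scheduler instead of a thread.

```python
import threading
from typing import Any, Callable, Optional


class CachedReadModel:
    """Holds the latest precomputed value; reads never run the loader."""

    def __init__(self, loader: Callable[[], Any]):
        self._loader = loader
        self._lock = threading.Lock()
        self._value: Optional[Any] = None

    def refresh(self) -> None:
        value = self._loader()  # slow work happens outside the lock
        with self._lock:
            self._value = value

    def get(self) -> Optional[Any]:
        with self._lock:
            return self._value  # None until the first refresh completes


def start_refresher(
    model: CachedReadModel, interval_seconds: float, stop: threading.Event
) -> threading.Thread:
    """Start a background worker that refreshes the model until stop is set."""

    def loop() -> None:
        while not stop.is_set():
            model.refresh()
            stop.wait(interval_seconds)  # sleeps, but wakes early on stop

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker
```

The trade-off is staleness: a reader may see a value up to one refresh interval old, which is usually acceptable for summaries and counters but should be chosen per view.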

8. Extensibility

The architecture supports:

Multiple monitoring providers
Multiple ticketing systems
Multi‑tenant deployments
Advanced automation and intelligence engines

New providers and capabilities can be added without breaking existing workflows.

9. Implementation Status (Informational)

This section documents the current delivery status and future roadmap. It does not change the target design above.

9.1 Current Implementation (Initial Delivery)

The current implementation delivers:

Full site and asset hierarchy
Geo‑location and topology visualization
Zabbix monitoring integration
Zammad ticket integration
Alert ingestion and maintenance suppression
Manual incident management with lightweight rules
Cached dashboards and operational summaries

This establishes IIMS as a stable unified operations coordination platform.

9.2 Roadmap & Next Capabilities

Planned future capabilities include:

Automated alert correlation and incident creation
Topology‑based impact propagation and root cause inference
SLA timers, ownership, and escalation workflows
Multi‑provider monitoring (Zabbix + Prometheus)
Automatic routing failover and capacity‑aware balancing
Real‑time dashboards and collaborative incident workspaces

These enhancements build directly on the existing foundation without breaking current operations.

10. Summary

This document describes the target architecture and operational design of IIMS as a long‑term platform.

The current implementation represents an incremental delivery of this design, focused on stability and unified visibility.

Future enhancements will gradually evolve IIMS into an intelligent, automated operations platform while preserving architectural continuity and operational safety.