Example code for learning and development
Observability and orchestration stack for network automation. Intended for demonstration, testing, and development environments.
Observability and orchestration for network automation: gNMIc (gNMI streaming), Prometheus, ClickHouse (ClickStack), Grafana — plus GitLab CI/CD and Ansible for config collection, diff, apply, and rollback. Docker Compose with optional syslog and IPFIX. Built for AI/MCP troubleshooting and operational insights.
Demo: "Clients at the fabric site report connectivity issues." The NetOps Assistant runs the troubleshoot-site flow (NetBox, Prometheus, syslog, config drift), then drills into border-1 vs border-2 and the leaf switches. It isolates an EVPN/VXLAN fault, a wrong VNI mapping on border-1 (vlan configuration 965 / NVE), and walks through the GitLab apply (dry-run → apply) and post-checks until the fabric is healthy again.
netops-stack is a composable Docker-based stack that provides:
- Ingestion: gNMIc (gNMI streaming), Vector (syslog UDP 514), optional IPFIX/NetFlow/sFlow
- Storage: ClickStack (ClickHouse + HyperDX UI), Prometheus (scrape and optional remote_write to ClickHouse)
- Visualisation: Grafana, HyperDX (logs, traces, metrics)
- Orchestration: GitLab CI/CD + Ansible — config collection, compare (drift), apply dry-run, and rollback; MCP-triggered pipelines
You choose which pieces to run via Compose overlays. The stack aligns with the NAF (Network Automation Forum) Framework reference architecture.
The stack follows the NAF (Network Automation Forum) Framework: six functional blocks plus network infrastructure.
| Block | Purpose |
|---|---|
| Intent | Desired state (config, expectations). Structured data, API. |
| Observability | Actual state persistence and logic. Historical data, query language, drift. |
| Orchestrator | Coordinates automation. Event-driven, scheduling, dry-run, traceability. |
| Executor | Applies writes to the network (SSH, NETCONF, gNMI/gNOI). Driven by Intent. |
| Collector | Reads from the network (gNMI, SNMP, Syslog, flow). Feeds Observability. |
| Presentation | Dashboards, GUIs, CLI. Interfaces with Intent, Observability, Orchestration. |
Data flow: Collector → Observability; Orchestrator ↔ Collector; Intent + Orchestrator → Executor → Network Infrastructure; Presentation ↔ Intent, Observability, Orchestrator.
| NAF block | Tools | In-repo / external |
|---|---|---|
| Presentation | LibreChat with NetOps Assistant (MCP Client), Grafana, HyperDX | In-repo + LibreChat |
| Observability | Prometheus, Vector, ClickHouse | In-repo (compose) |
| Orchestrator | GitLab CI/CD | In-repo (gitlab/) |
| Intent (SoT) | NetBox, GitLab | External + in-repo |
| Collector | gNMIc, Syslog, IPFIX, SSH | In-repo (gNMIc, Vector) + SSH in Ansible |
| Executor | Ansible | In-repo (gitlab/ansible): apply, rollback |
- Presentation: LibreChat with NetOps Assistant (MCP Client), Grafana (dashboards), HyperDX (logs/traces/metrics).
- Observability: Prometheus (metrics, PromQL), Vector (log pipeline), ClickHouse (logs, e.g. default.syslog, metrics/traces).
- Orchestrator: GitLab CI/CD (gitlab/README.md) for collect, compare, apply dry-run, and rollback; MCP-triggered pipelines.
- Intent (SoT): NetBox, GitLab (config baseline and desired state). Consumed via MCP.
- Collector: gNMIc (gNMI streaming), Syslog (Vector), IPFIX (optional), SSH (Ansible).
- Executor: Ansible in pipeline (gitlab/ansible); MCP triggers dry-run, operator runs manual apply/rollback.
| NAF block | netops-stack | Gap / note |
|---|---|---|
| Intent | NetBox | Aligned; SoT for sites, devices, interfaces, IPs. |
| Collector | gNMIc, Vector, optional IPFIX | Strong fit; SNMP not in stack. |
| Observability | Prometheus, ClickHouse, HyperDX | Strong fit; add drift/events later if needed. |
| Executor | Ansible (SSH) | In-repo; apply/rollback playbooks in gitlab/ansible. |
| Orchestrator | GitLab CI/CD + Ansible + GitLab MCP | Scheduled collect/diff; MCP-triggered dry-run. See gitlab/README.md. |
| Presentation | Grafana, MCP assistant | Aligned. |
Conclusions: Intent, Collector, Observability, and Presentation map to NetBox, gNMIc/syslog, Prometheus/ClickHouse, and Grafana/MCP. Executor is Ansible (SSH) in pipeline (gitlab/ansible). Orchestrator is GitLab CI/CD + Ansible. Reference: NAF Framework.
Documentation by folder below.
- Docker Engine 20.10+
- Docker Compose 2.0+
- (Optional) NetBox, GitLab, and network device access for full orchestration
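The version requirements above can be checked before starting the stack. The following is a minimal sketch, not part of the repository: `version_ge` and `check_prereqs` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (version sort, as in GNU coreutils).

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to version B.
# Assumes `sort -V` is available (GNU coreutils and recent BSD/macOS sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_prereqs: compare the installed Docker Engine and Compose versions
# against the minimums listed above (20.10 and 2.0).
check_prereqs() {
  engine=$(docker version --format '{{.Server.Version}}') || return 1
  compose=$(docker compose version --short) || return 1
  compose=${compose#v}   # some Compose releases print a leading "v"

  version_ge "$engine" 20.10 || {
    echo "Docker Engine $engine is older than 20.10" >&2
    return 1
  }
  version_ge "$compose" 2.0 || {
    echo "Docker Compose $compose is older than 2.0" >&2
    return 1
  }
  echo "prerequisites OK (engine $engine, compose $compose)"
}
```

Running `check_prereqs` on the host that will run the stack reports the first requirement that is not met.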
# 1. Clone repository
git clone https://github.com/pamosima/netops-stack.git
cd netops-stack
# 2. Configure environment
cp .env.example .env
# Set NETOPS_STACK_HOST to the IP or hostname where the stack runs
# 3. Start base + ClickStack (HyperDX, ClickHouse, Prometheus)
docker compose -f compose.yaml -f compose-clickstack.yaml up -d
# 4. Open HyperDX (first use: create a user)
# http://<host>:8080
| Overlay | Purpose |
|---|---|
| compose-syslog.yaml | Syslog → Vector → ClickHouse |
| compose-ipfix.yaml | IPFIX/NetFlow/sFlow |
IPFIX/NetFlow/sFlow: Use compose-ipfix.yaml; collector listens on host UDP 4739 (IPFIX), 2055 (NetFlow v9), 6343 (sFlow). Point devices at the host IP and the relevant port. Prometheus scrapes the IPFIX service on 8081. See compose-ipfix.yaml for the image and service name.
Full stack example (IOS-XE, IPFIX, syslog):
docker compose -f compose.yaml -f compose-ipfix.yaml \
-f compose-clickstack.yaml -f compose-syslog.yaml up -d
See clickhouse/README.md for deployment and ports.
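Because every overlay is just another `-f` file on top of compose.yaml, the selection can be wrapped in a small helper. This is a sketch under assumptions: `compose_args` is a hypothetical function, and it only knows the three overlay files named in this README (clickstack, syslog, ipfix).

```shell
#!/bin/sh
# compose_args OVERLAY...: print the -f flags for the base file plus the
# chosen overlays. Each overlay name maps to compose-<name>.yaml.
compose_args() {
  args="-f compose.yaml"
  for overlay in "$@"; do
    case "$overlay" in
      clickstack|syslog|ipfix)
        args="$args -f compose-$overlay.yaml"
        ;;
      *)
        # Reject names that do not correspond to a file named in this README.
        echo "unknown overlay: $overlay" >&2
        return 1
        ;;
    esac
  done
  printf '%s\n' "$args"
}
```

For example, `docker compose $(compose_args clickstack syslog) up -d` expands to the base file plus the ClickStack and syslog overlays. The command substitution is left unquoted on purpose so that the flags are split into separate arguments.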
- gNMIc — gNMI streaming telemetry from network devices
- Vector — Syslog and log ingestion (UDP 514)
- Prometheus — Metrics scrape and storage
- ClickStack — ClickHouse + HyperDX for logs, metrics, traces
- Grafana — Dashboards and alerting
- Config collection: Ansible collects running config; optional commit/MR to repo
- Compare (drift): Diff running vs repo baseline; optional persist .diff for MCP
- Apply: Dry-run then manual apply of config changes
- Rollback: Restore devices to collected baseline (Cisco configure replace or block replace)
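The compare step is conceptually a diff of the collected running config against the baseline stored in the repo. The sketch below illustrates only that idea with plain `diff`; it is not the pipeline's implementation (that lives in the Ansible playbooks under gitlab/ansible), and `config_drift` and its file arguments are hypothetical.

```shell
#!/bin/sh
# config_drift RUNNING BASELINE
# Prints a unified diff (baseline -> running). Returns 0 and prints
# "no changes" when the files match, 1 on drift, 2 or more if diff itself failed.
config_drift() {
  if diff -u "$2" "$1"; then
    echo "no changes"
  else
    rc=$?
    # diff exits 1 when the files differ and >1 on errors such as a missing file.
    [ "$rc" -eq 1 ] || echo "diff failed (exit $rc)" >&2
    return "$rc"
  fi
}
```

Redirecting the output to a file, for example `config_drift running.txt baseline.txt > sw11-1.diff`, gives a persisted diff of the kind the list above mentions.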
Pipelines are triggered via GitLab API (e.g. from network-mcp-docker-suite GitLab MCP server). See gitlab/README.md.
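For orientation, a pipeline trigger over the GitLab API could look like the sketch below. It assumes the pipeline trigger token endpoint (`POST /api/v4/projects/:id/trigger/pipeline`); whether this repo uses that endpoint or another one is not stated here. The function name, the `GITLAB_TRIGGER_TOKEN` variable, and the pipeline variables in the usage note are hypothetical; the real variable names are documented in gitlab/README.md.

```shell
#!/bin/sh
# trigger_pipeline GITLAB_URL PROJECT_ID REF [KEY=VALUE...]
# Calls GitLab's pipeline trigger endpoint. With DRY_RUN=1 it only prints the
# request (without the token) instead of sending it.
trigger_pipeline() {
  url="$1/api/v4/projects/$2/trigger/pipeline"
  ref=$3
  shift 3

  # Rewrite each KEY=VALUE argument as: --form "variables[KEY]=VALUE"
  n=$#
  while [ "$n" -gt 0 ]; do
    kv=$1
    shift
    set -- "$@" --form "variables[${kv%%=*}]=${kv#*=}"
    n=$((n - 1))
  done
  set -- --form "ref=$ref" "$@"

  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf 'curl --request POST %s %s\n' "$url" "$*"
  else
    # GITLAB_TRIGGER_TOKEN is a hypothetical name for a pipeline trigger token.
    curl --fail --silent --show-error --request POST "$url" \
      --form "token=${GITLAB_TRIGGER_TOKEN:?set GITLAB_TRIGGER_TOKEN}" "$@"
  fi
}
```

A dry run such as `DRY_RUN=1 trigger_pipeline https://gitlab.example.com 42 main PIPELINE_ACTION=compare TARGET_HOST=sw11-1` prints the request that would be sent, which is a convenient way to review the variables before triggering anything.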
Use cases
With the NetOps MCP server (Cursor, LibreChat, etc.), prompt in natural language; the assistant uses flows and tools (compare, apply, rollback). Example prompts:
| Use case | How to prompt (MCP) |
|---|---|
| Check drift | "Compare running config to baseline for sw11-1" or "Is there config drift on device X?" — Triggers compare pipeline; assistant reports diff or "no changes." |
| Apply a config change | "Configure X on device Y" (e.g. "Add VLAN 100 to sw11-1", "Set NTP server on core-01") — Assistant reads netops://flows/configuration, gets running config, builds desired block, uploads to ansible/configs/desired/<host>.txt and triggers apply dry-run; you review the job log and run manual apply_config in GitLab. |
| Refresh baseline | "Refresh the config baseline from devices" or "Collect running configs and update baseline" — Triggers collect pipeline; add "and create a merge request" to open an MR. |
| Roll back to baseline | "Roll back sw11-1 to baseline" or "Restore device X to last collected config" — Triggers rollback pipeline; dry-run shows diff, then you run manual rollback_apply in GitLab. |
| Troubleshoot a device | "Troubleshoot sw11-1" or "Why is device X having issues?" — Assistant runs troubleshoot flow (NetBox + Prometheus + ClickHouse + optional compare/diff); use "run compare pipeline" if you want a fresh drift check. |
Details: netops-mcp-server/README.md (flows, tools, resources). Pipeline variables and manual jobs: gitlab/README.md.
- Prometheus MCP — Query metrics (netops-stack Prometheus)
- ClickHouse MCP — Query syslog and logs (netops-stack ClickHouse)
- GitLab MCP — Trigger compare, apply dry-run, rollback pipelines
Use the netops-stack profile in network-mcp-docker-suite to run MCP servers that connect to this stack.
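To check that the stack's Prometheus is reachable before pointing an MCP server at it, an instant query against the Prometheus HTTP API (`/api/v1/query`) is usually enough. This is a minimal sketch: `prom_query` is a hypothetical helper, and port 9090 in the usage note is the Prometheus default, assumed here rather than taken from the compose files.

```shell
#!/bin/sh
# prom_query BASE_URL PROMQL
# Runs an instant query against the Prometheus HTTP API. With DRY_RUN=1 it
# prints the curl command instead of executing it.
prom_query() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf 'curl --get %s/api/v1/query --data-urlencode query=%s\n' "$1" "$2"
  else
    # --get sends the url-encoded data as a query string on a GET request.
    curl --fail --silent --show-error --get "$1/api/v1/query" \
      --data-urlencode "query=$2"
  fi
}
```

For example, `prom_query "http://$NETOPS_STACK_HOST:9090" up` returns a JSON document listing the scrape targets Prometheus currently sees as up or down.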
Documentation lives in each component folder. Start here:
| Folder | README | Contents |
|---|---|---|
| gitlab/ | README | Orchestrator: Ansible + GitLab CI/CD, pipeline setup, rollback (configure replace, NETCONF), MCP |
| clickhouse/ | README | ClickStack: HyperDX, ClickHouse, syslog table, ports, deploy |
| vector/ | README | Syslog ingestion, device config (Cisco), IPFIX pointer |
| gnmic/ | README | gNMI streaming, IOS-XE targets, subscriptions |
| prometheus/ | README | Scrape config, metrics, MCP |
| grafana/ | README | Dashboards, provisioning |
| nats/ | README | Message bus for gNMIc pipeline |
| cml/ | README | Cisco Modeling Labs topology (BGP-EVPN lab), import and stack alignment |
| netbox/ | README | NetBox bulk-import seed data (regions, sites, roles, types, devices) matching the lab |
| netops-mcp-server/ | README | MCP server: GitLab, Prometheus, ClickHouse, NetBox, IOS-XE, flows |
- Deployment: DEPLOYMENT.md
# Start stack (choose overlays as needed)
docker compose -f compose.yaml -f compose-clickstack.yaml up -d
# View status
docker compose ps
# View logs
docker compose logs -f
# Stop
docker compose down
See SECURITY.md for vulnerability reporting, supported versions, and branch-protection guidance (OpenSSF-aligned). This repo uses CodeQL (.github/workflows/codeql.yml) and Dependabot (.github/dependabot.yml).
- No hardcoded credentials; use .env and GitLab CI/CD variables
- Store ANSIBLE_USER, ANSIBLE_PASSWORD, GITLAB_PUSH_TOKEN, and NetBox tokens in CI variables (masked)
- Python deps: lockfiles (netops-mcp-server/uv.lock, compiled gitlab/requirements.txt / clickhouse/requirements-export.txt); Docker base images pinned by digest
- For production: use secrets management, restrict network access, and follow gitlab/README.md hardening
After CodeQL / Dependabot run: resolve open alerts (gh api repos/<owner>/<repo>/code-scanning/alerts, .../dependabot/alerts) or merge safe update PRs.
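The two alert endpoints mentioned above can be wrapped so that only open alerts are listed. The sketch below assumes the GitHub CLI (`gh`) is installed and authenticated; `alerts_endpoint` and `count_open_alerts` are hypothetical helper names. Note that without pagination the count covers only the first page of results.

```shell
#!/bin/sh
# alerts_endpoint OWNER REPO KIND
# KIND is "code-scanning" or "dependabot"; prints the API path for open alerts.
alerts_endpoint() {
  case "$3" in
    code-scanning|dependabot)
      printf 'repos/%s/%s/%s/alerts?state=open\n' "$1" "$2" "$3"
      ;;
    *)
      echo "unknown alert kind: $3" >&2
      return 1
      ;;
  esac
}

# count_open_alerts OWNER REPO KIND
# Prints the number of open alerts on the first page of results.
count_open_alerts() {
  endpoint=$(alerts_endpoint "$1" "$2" "$3") || return 1
  gh api "$endpoint" --jq 'length'
}
```

For example, `count_open_alerts <owner> <repo> dependabot` prints how many open Dependabot alerts the first page contains.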
- Telemetry: The telemetry idea in this stack draws on gnp-stack — the gNMIc–NATS–Prometheus stack for network telemetry. Thanks to that project for pioneering a “point and shoot” streaming-telemetry experience.
- Presentation: Network Observability Redefined with a Modern Open-Source Pluggable Tech-Stack — Jan Untersander, Ramon Bister, Sascha Häring (OST/INS), CHNUG #1 (4 Dec 2025 @ OST).
Contributions are welcome. See CONTRIBUTING.md for how to report issues and suggest improvements. This project adheres to the Contributor Covenant Code of Conduct.
This project is licensed under the Cisco Sample Code License, Version 1.1 — see the LICENSE file for details.
This project is example code for demonstration and learning. It is not officially supported by Cisco Systems and is not intended for production use without proper testing and customization for your environment.
Third-party components (Grafana plugins, Ansible collections, etc.) are subject to their own licenses; see component documentation and the NOTICE file.

