Contents

SPONSOR VIEW SOURCE
BROKERLESS IoT H3 SPATIAL GRIDS ZEROMQ ANOMALY CORROBORATION CHAOS MONKEY

Synapse Ingestion Engine

Synapse is a brokerless, edge-first IoT ingestion engine utilizing peer-to-peer validation via Uber's H3 spatial index and local statistical corroboration. By replacing centralized message brokers like MQTT or Kafka with direct ZeroMQ PUB/SUB topologies, it removes central single points of failure, localizes anomaly detection, and ensures extreme performance under constrained edge computational footprints.

bash
$ git clone https://github.com/onyks-os/Synapse.git
$ cd Synapse && docker compose -f docker/docker-compose.yml up -d --scale sensor=5
[Synapse] Recreating docker_monitor_1 ... done
[Synapse] Recreating docker_sensor_1 ... done
[Synapse] Scaling sensors ... 5 instances active.
[Synapse] Bootstrapping spatial index: resolution 7 H3 grids activated.
[Synapse] ZmqListener bound to tcp://0.0.0.0:5555. Subscribing to all topics.
[Synapse] Spatial corroboration loaded: median absolute deviation (MAD).
[Synapse] Dashboard started. URL: http://localhost:8080
EDGE INGESTION PRINCIPLE
Synapse operates on the principle that geographic proximity is fundamental to validating edge sensor integrity. Instead of analyzing anomaly signals in isolation or forwarding massive raw datasets to a remote cloud datacenter, data is corroborated locally against immediate geographic neighbors using a pluggable robust estimator.
0x02

Quickstart

1. Local Deployment (Docker Compose)

The easiest way to execute a full multi-sensor simulation is to spin up the containerized network:

bash
$ docker compose -f docker/docker-compose.yml up -d --build --scale sensor=10
$ docker ps
CONTAINER ID IMAGE COMMAND STATUS PORTS
f38e9102c912 synapse-monitor "python src/main_mon…" Up 12 seconds 0.0.0.0:8080->8080/tcp, 0.0.0.0:5555->5555/tcp
a1281bf0528c synapse-sensor "python src/main_sen…" Up 10 seconds
... (9 more sensor replicas running)

2. Manual Development Setup

If running directly in Python, set up the virtual environment and install the development dependencies:

bash
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip install -e ".[dev]"
# Start Monitor in terminal 1
$ python src/main_monitor.py
# Start one or more Sensors in terminal 2+
$ python src/main_sensor.py

3. Navigating the Dashboard

Open your browser and navigate to http://localhost:8080. You will be greeted with an interactive geographic visualization rendering hexagonal H3 cells.

  • Green Hexagons: All active sensor nodes inside this spatial cell are healthy and report congruent values.
  • Yellow/Orange Hexagons: The cell contains anomalous (FAULTY) or silent (DEAD) sensors.
  • Red Hexagons: The majority of the sensors in this coordinate cell have failed local verification.
0x03

System Architecture

Unlike traditional centralized IoT systems that rely on a large queueing middleware cluster, Synapse decouples ingestion completely into point-to-point zero-overhead networks.

Feature Traditional Architectures Synapse Architecture
Ingestion Topology Hub-and-Spoke (Centralized Cloud Broker) Direct ZeroMQ PUB/SUB links
Anomaly Detection Cloud Data lake / Offline stream analytics Local / Edge spatial corroboration
Resilience & SPOF Single point of failure at the broker clustering layer Dynamic autonomous sockets with multi-path recovery
Computing Footprint High (JVM heavy brokers, databases, microservices) Ultra-lightweight Python/ZeroMQ runtime

Core Architectural Components

DATA EMITTER Sensor Node (main_sensor.py)

A daemon executing directly on the physical IoT hardware or microcontrollers. It generates unique identifiers, extracts local data readings, maps coordinates to H3 spatial indices (typically resolution 7), and publishes heartbeats over its ZeroMQ PUB socket at periodic intervals.

STATE ENGINE Monitor (main_monitor.py)

An async central daemon acting as the spatial ingestion target. It houses the following:

  • ZmqListener: A high-throughput async SUB socket that acts as the ingestion boundary.
  • NodeRegistry: An in-memory concurrent state tracker that groups nodes by H3 cell and holds the authoritative state.
  • GarbageCollector: A background loop cleaning up silent (DEAD) sensors after timeout limits are crossed.
  • DashboardServer: A Flask REST API and web application displaying the geographical state map.
0x04

Mathematical Formalism

1. Geospatial Mesh via H3 Grid

Geographic coordinates are transformed into a single 64-bit integer H3 cell representing a hexagonal region. The default resolution of 7 covers an average area of 4.3 square kilometers with an edge length of approximately 1.2 kilometers. Peers are defined strictly as nodes occupying the same cell.

2. Pluggable Outlier Detection (Spatial Corroboration)

When a sensor publishes a telemetry ping, the Monitor triggers a Leave-One-Out (LOO) test. For node n reporting value x_n inside cell C, the peer comparison set is defined as:

P_n = { x_i | i ∈ C, i ≠ n, status(i) == ALIVE }

Classic Z-Score (Method: zscore)

Uses sample mean (μ) and sample standard deviation (σ) computed over the peer set P_n:

Z = |x_n - μ| / σ

Limitation: Non-robust estimators. A single high-variance faulty peer in P_n heavily distorts μ and σ, triggering false negatives or masking other outliers.

Modified Z-Score via Median Absolute Deviation (Method: mad - Default)

Uses the median (m) and the Median Absolute Deviation (MAD) of the peer set, providing a robust estimator unaffected by up to 50% compromised sensors:

MAD = median( { |p - m| for p ∈ P_n } )
Modified Z = 0.6745 × |x_n - m| / MAD

The constant 0.6745 scales the denominator such that the modified Z-score is mathematically equivalent to standard Z-scores when data is normally distributed. If the modified Z exceeds the configured threshold (default 3.5), the node status changes to FAULTY.

0x05

API Reference

The monitor exposes a versioned JSON REST API at http://localhost:8080/api/v1/ to query network topology and metrics.

Endpoint Method Response Description
/api/v1/nodes GET Array of Node Objects List all registered nodes, their coordinates, values, and status.
/api/v1/cells GET Dictionary of H3 Cells Cell summary mapping with alive/faulty/dead counts and centroids.
/metrics GET Prometheus Text Data Scrapable metrics on rate limits, invalid payloads, and node counts.
/live GET {"status":"ok"} Liveness probe.
/ready GET JSON Status Object Readiness probe indicating connection counts.

Example API Node Response

JSON
{
  "node_id": "sensor_42",
  "type": "mock",
  "status": "ALIVE",
  "last_seen": 1709894982.5,
  "h3_cell": "871fb4670ffffff",
  "lat": 45.4642,
  "lon": 9.1895,
  "last_value": 24.5
}
0x06

Messaging Protocol

ZeroMQ transmits structured non-blocking telemetry messages using lightweight framing. Sockets connect via the TCP boundary, delivering JSON telemetry payloads.

Payload Structure Schema (PING)

JSON Schema
{
  "schema_version": 1,
  "node_id": "sensor_0001",
  "type": "mock",
  "timestamp": 1709894981.12,
  "status": "PING",
  "h3_cell": "871fb4670ffffff",
  "payload": {
    "value": 22.45,
    "lat": 45.4642,
    "lon": 9.1895
  }
}
SCHEMALESS FILTERS
The monitor runs rapid schema validation. PING packets lacking mandatory fields (node_id, h3_cell, payload.value) are silently dropped at the ZmqListener loop, triggering the Synapse_invalid_payload_total Prometheus metric to alert operations.
0x07

Security & Encryption

In adversarial environments, edge telemetry must be protected against eavesdropping and injection attacks. Synapse utilizes CurveZMQ (Elliptic Curve Cryptography based on curve25519) to encrypt the transport layer.

Key generation & Environment Setup

Generate Curve keys and container-specific environment configurations automatically:

bash
$ python tools/generate_curve_setup.py --count 5 --out-dir .curve-setup
[Curve] Generating keys...
[Curve] Created server.env with server key pair
[Curve] Exported allowlist containing 5 authorized client public keys
[Curve] Created 5 sensor-specific environment config files in .curve-setup/sensor-envs/
SECURE AUTHENTICATION (ZAP)
ZeroMQ's Zero Access Protocol (ZAP) is executed on the Monitor's SUB socket. When Curve is enabled, unauthenticated connections or clients possessing keys not listed inside the allowed_client_publickeys.txt file are rejected at the TCP handshake phase, protecting the ingestion pool.

Rate Limiting via Token-Bucket

To mitigate denial-of-service attempts by faulty or flooded nodes, the registry enforces an optional token-bucket rate limiter on a per-node basis:

  • Bucket Capacity: Maximum burst allowance (default 10 tokens).
  • Replenish Rate: Rate of token restoration per second (default 1.0).
  • Action: Telemetry packets that deplete the node's bucket are dropped.
0x08

Development & Testing

Synapse was built following strict test-driven development (TDD) protocols. The entire codebase is heavily verified with deterministic, offline unit tests.

Running the Test Suite

To verify the registry state engine, rate limiters, and corroboration statistics:

bash
$ pytest
============================= test session starts ==============================
platform linux -- Python 3.11.4, pytest-7.4.2
collected 42 items
tests/test_registry.py PASSED [ 23%]
tests/test_corroboration.py PASSED [ 52%]
tests/test_rate_limiter.py PASSED [ 78%]
tests/test_http_api.py PASSED [100%]
========================== 42 passed in 1.48 seconds ===========================

Linter & Quality Standards

We leverage ruff and mypy for absolute code quality and strict typing checks:

bash
$ ruff check src/
All checks passed.
$ mypy src/
Success: no issues found in 18 source files.
0x09

Chaos & Resilience

To validate system behavior under adversarial or chaotic network conditions, Synapse ships with an automated Chaos Monkey agent (tools/chaos_monkey.py).

Failure Simulation Modes

KILL MODE

Randomly terminates running Docker sensor containers (SIGKILL). Validates that the Monitor shifts their status to DEAD upon crossing the timeout threshold.

ANOMALY MODE

Overrides sensor data paths to broadcast high-amplitude outlier telemetry. Tests whether the H3 spatial corroboration moves them to FAULTY.

FLOOD MODE

Floods the ingestion port with malformed payloads or high-velocity packet bursts, testing parser isolation and rate limiters.

Executing Chaos Simulations

bash
# Run anomaly injection every 5 seconds
$ python tools/chaos_monkey.py --mode anomaly --intensity high --interval 5
[Chaos] Attacking network with mode: anomaly (intensity: high)
[Chaos] Target selected: sensor_0003 in cell 871fb4670ffffff
[Chaos] Overriding sensor_0003 telemetry to 999.00 (Outlier injected)
0x0A

Roadmap & Future

Synapse is an evolving research project exploring edge ingestion paradigms.

V1 - PROOF OF CONCEPT

Completed basic Python ingestion, ZeroMQ binding infrastructure, in-memory NodeRegistry, and primitive status updates.

V2 - SPATIAL CORROBORATION & SECURE RUNTIME

Completed H3 geometric indexing support, Median Absolute Deviation (MAD) robust outlier metrics, CurveZMQ encryption, token-bucket rates, and web dashboards.

V3 - WASM PREDICTIVE RUNTIMES (IN DEVELOPMENT)

Migrating outlier and anomaly models directly into WebAssembly (WASM) modules to enable sandboxed cryptographic computation directly on microcontroller layers, signing payloads at the sensor core.