Observability
Telemetry, Metrics & Execution Transparency
Forge Pool exposes structured observability across the entire execution lifecycle.
Observability is not optional.
It exists to:
- validate deterministic execution
- detect anomalies
- support replay and audit
- monitor economic flow
- enforce trust boundaries
Observability is divided into:
- Control Plane Telemetry
- Execution Plane Telemetry
1. Control Plane Observability
Control plane telemetry is exposed via HQ.
It includes:
- project usage
- job registry
- credit accounting
- identity events
- policy configuration
- token activity
Control plane answers:
- Who executed?
- Under what policy?
- At what cost?
- Under which identity?
2. Execution Plane Observability
Execution plane telemetry originates from:
- Hub
- Agents
- Aggregation layer
It includes:
- shard planning metadata
- agent execution metrics
- deterministic reduction logs
- verification signals
- replay metadata
Execution plane answers:
- How was execution structured?
- Which agents participated?
- Was integrity preserved?
- Can this run be replayed?
3. Job-Level Transparency
Each Job exposes:
- job_id
- kernel workload (op.name, version, profile)
- shard count
- participating agents
- execution duration
- verification mode
- replay seed
- credit usage
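The job-level fields above can be pictured as a single immutable record. This is an illustrative sketch only; the field names and types are assumptions, not the actual Forge Pool schema.

```python
from dataclasses import dataclass

# Hypothetical job record mirroring the fields listed above.
# frozen=True reflects "Jobs are immutable once completed".
@dataclass(frozen=True)
class JobRecord:
    job_id: str
    op_name: str            # kernel workload operation
    version: str
    profile: str
    shard_count: int
    participating_agents: tuple[str, ...]
    duration_ms: int
    verification_mode: str
    replay_seed: int
    credit_usage: float

job = JobRecord(
    job_id="job-0001",
    op_name="matmul",
    version="1.2.0",
    profile="gpu-large",
    shard_count=8,
    participating_agents=("agent-a", "agent-b"),
    duration_ms=4210,
    verification_mode="spot-check",
    replay_seed=42,
    credit_usage=3.5,
)
```

Freezing the dataclass makes accidental post-completion mutation raise an error, which is one simple way to model audit-grade immutability in client code.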
In HQ → Jobs, you can inspect:
- execution timeline
- shard distribution
- reduction summary
- replay metadata
- billing record
Jobs are immutable once completed.
Immutability is foundational to audit integrity.
4. Shard-Level Telemetry
Each shard reports:
- execution duration
- hardware class (CPU / GPU)
- partial result size
- result hash
- verification participation
Shard telemetry enables:
- anomaly detection
- performance profiling
- reliability scoring
- corruption detection
Shard metadata is bound to job context.
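One way to picture "shard metadata is bound to job context" is a result hash that commits to the job ID and shard position, so a partial result cannot be replayed against a different job. The field layout below is an assumption, not the actual Forge Pool wire format.

```python
import hashlib

def shard_result_hash(job_id: str, shard_index: int, payload: bytes) -> str:
    """Hash a shard's partial result, bound to its job context (illustrative)."""
    h = hashlib.sha256()
    h.update(job_id.encode("utf-8"))          # bind to the owning job
    h.update(shard_index.to_bytes(4, "big"))  # bind to the shard position
    h.update(payload)                         # the partial result itself
    return h.hexdigest()

h1 = shard_result_hash("job-0001", 3, b"partial-result")
h2 = shard_result_hash("job-0002", 3, b"partial-result")
assert h1 != h2  # same payload, different job context -> different hash
```

Because the hash covers job context as well as content, corruption detection and cross-job replay detection fall out of the same check.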
5. Agent-Level Metrics
Providers can monitor their nodes under HQ → Providers → Nodes.
Available signals:
- online status
- heartbeat freshness
- shard throughput
- verification participation ratio
- latency distribution
- hardware classification
- credits earned
Hub tracks:
- historical reliability
- correctness ratio
- tail latency behavior
- scheduling weight
Reliable nodes are prioritized.
6. Scheduler & Tail Signals
Internally, Hub tracks:
- queue depth
- shard dispatch latency
- tail latency outliers
- rebalance events
- agent health drift
These signals influence:
- shard routing
- verification intensity
- workload distribution
This prevents systemic skew.
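A minimal sketch of one such signal: flagging tail-latency outliers so the scheduler can down-weight slow agents. The median-multiple threshold is an illustrative policy, not Forge Pool's actual heuristic.

```python
def tail_outliers(latencies_ms: dict[str, float], factor: float = 3.0) -> set[str]:
    """Return agents whose latency exceeds `factor` times the median (illustrative)."""
    values = sorted(latencies_ms.values())
    median = values[len(values) // 2]
    return {agent for agent, ms in latencies_ms.items() if ms > factor * median}

sample = {"agent-a": 40.0, "agent-b": 45.0, "agent-c": 42.0, "agent-d": 400.0}
# agent-d sits an order of magnitude above the median and gets flagged
```

Signals like this feed back into shard routing: an agent that repeatedly lands in the outlier set receives fewer shards until its latency stabilizes.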
7. Replay Telemetry
Replay observability includes:
- root seed
- shard seed derivation
- workload version binding
- aggregation checksum
- output hash
Replay metadata enables:
- forensic reconstruction
- regulatory defensibility
- reproducibility verification
Replay telemetry is part of the execution artifact.
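The "shard seed derivation" entry above implies each shard's seed is a pure function of the root seed, which is what makes replay bit-exact. The derivation scheme shown here (SHA-256 over root seed plus shard index) is an assumption for illustration.

```python
import hashlib

def derive_shard_seed(root_seed: int, shard_index: int) -> int:
    """Derive a deterministic per-shard seed from the root seed (illustrative)."""
    digest = hashlib.sha256(
        root_seed.to_bytes(8, "big") + shard_index.to_bytes(4, "big")
    ).digest()
    return int.from_bytes(digest[:8], "big")

# Replay reproduces identical seeds because derivation is a pure function
assert derive_shard_seed(42, 0) == derive_shard_seed(42, 0)
assert derive_shard_seed(42, 0) != derive_shard_seed(42, 1)
```

Recording only the root seed in the execution artifact is then sufficient: every shard seed, and therefore every shard's randomness, can be reconstructed on demand.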
8. Studio Run Observability
Each Studio Run includes:
- flow version
- graph hash
- job IDs triggered
- execution timestamps
- artifact references
- final output snapshot
Run history is version-bound.
Flow reproducibility depends on deterministic adapters.
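A graph hash like the one listed above can be computed by hashing a canonical serialization of the flow's nodes and edges, so any structural change yields a new run identity. Canonical JSON is an assumed encoding here, not Studio's actual one.

```python
import hashlib
import json

def graph_hash(nodes: list[str], edges: list[tuple[str, str]]) -> str:
    """Hash a canonical form of the flow graph (illustrative encoding)."""
    canonical = json.dumps(
        {"nodes": sorted(nodes), "edges": sorted(edges)},
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting nodes and edges before hashing makes the hash independent of construction order, while any added node or edge changes it.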
9. Credit & Economic Observability
Credits are recorded per:
- job
- shard
- workload type
- verification overhead
- resource class
HQ exposes:
- credit balance
- historical burn rate
- provider earnings
- per-adapter usage breakdown
Economic observability aligns incentives with execution correctness.
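The per-dimension breakdowns HQ exposes can be derived by aggregating individual credit entries. The record shape below is hypothetical; only the idea of grouping recorded credits by workload type comes from the text above.

```python
from collections import defaultdict

# Hypothetical per-shard credit entries (fields are illustrative).
entries = [
    {"job": "j1", "workload": "matmul", "credits": 1.5},
    {"job": "j1", "workload": "matmul", "credits": 1.5},
    {"job": "j2", "workload": "fft",    "credits": 0.8},
]

# Aggregate into a per-workload-type usage breakdown.
by_workload: dict[str, float] = defaultdict(float)
for entry in entries:
    by_workload[entry["workload"]] += entry["credits"]
```

The same fold over job, shard, or resource class yields the other breakdowns listed above.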
10. Failure Visibility
When execution fails:
- error code recorded
- failure reason stored
- partial shards marked
- verification divergence logged
- billing outcome recorded
Clients should log:
- job_id
- full request payload
- response payload
- retry decision
Failure telemetry supports root cause analysis.
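The client-side logging recommended above might look like the following wrapper. The `submit_job` callable and the response shape are hypothetical placeholders, not the real client API.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("forge-client")

def submit_with_logging(job_id: str, request: dict, submit_job) -> dict:
    """Submit a job and log the fields recommended for root cause analysis."""
    response = submit_job(request)
    retry = response.get("status") == "failed" and response.get("retryable", False)
    log.info(json.dumps({
        "job_id": job_id,
        "request": request,    # full request payload
        "response": response,  # full response payload
        "retry": retry,        # the retry decision taken
    }))
    return response
```

Logging the payloads as structured JSON keeps client-side records joinable with server-side failure telemetry on job_id.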
11. Health & Reliability Model
Node health scoring considers:
- shard completion ratio
- verification consistency
- latency stability
- uptime consistency
- resource reporting accuracy
Reliability influences:
- scheduling weight
- shard volume
- earning potential
Health scoring reduces economic attack surfaces.
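One simple way to combine the signals above into a single score is a weighted sum over normalized inputs. The weights and the 0-to-1 normalization are assumptions for illustration, not Forge Pool's actual model.

```python
# Illustrative weights over the health signals listed above (sum to 1.0).
WEIGHTS = {
    "completion_ratio": 0.35,
    "verification_consistency": 0.30,
    "latency_stability": 0.15,
    "uptime": 0.10,
    "reporting_accuracy": 0.10,
}

def health_score(signals: dict[str, float]) -> float:
    """Weighted health score; each signal is normalized to [0, 1]."""
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)

perfect = {name: 1.0 for name in WEIGHTS}
```

Weighting verification consistency heavily reflects the point that health scoring exists partly to shrink economic attack surfaces: correctness earns more than raw throughput.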
12. Time Filtering & Diagnostics
HQ supports filtering by:
- time range
- project
- workload type
- node
- verification mode
This enables:
- capacity planning
- cost forecasting
- anomaly investigation
- deterministic replay analysis
13. Audit & Export Model
Enterprise environments may require:
- job metadata export
- replay artifact export
- ledger export
- verification logs
- execution trace archives
Forge Pool supports audit-ready data structures.
Observability Philosophy
Distributed compute without transparency is unsafe.
Forge Pool exposes:
- structural telemetry
- deterministic replay metadata
- shard integrity signals
- economic traceability
Execution truth must be observable.
Observability is the foundation of distributed determinism.
