Benchmark Methodology
Forge Pool benchmarks aim to provide transparent and reproducible measurements of distributed compute behavior.
Test Environment
Benchmarks are executed on real Forge Pool networks composed of heterogeneous provider nodes.
Typical characteristics include:
- commodity CPUs
- mixed hardware classes
- geographically distributed nodes
- real network latency conditions
This approach reflects realistic deployment environments rather than idealized lab clusters.
Measurement Definitions
Execution Wall Time
Time from job dispatch to final aggregation.
Includes:
- shard dispatch
- agent execution
- aggregation
Does not include external API latency.
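As a rough illustration, execution wall time can be treated as the sum of the dispatch, agent execution, and aggregation phases. The sketch below is a simplified timing harness, not the Forge Pool API; `dispatch_shards`, `run_shard`, and `aggregate` are hypothetical callables supplied by the caller.

```python
import time

def measure_execution_wall_time(job, dispatch_shards, run_shard, aggregate):
    """Hypothetical sketch: time a job from dispatch to final aggregation.

    External API latency is deliberately not measured here, matching the
    execution wall time definition above.
    """
    start = time.perf_counter()

    shards = dispatch_shards(job)                      # shard dispatch
    results = [run_shard(shard) for shard in shards]   # agent execution
    output = aggregate(results)                        # aggregation

    wall_time = time.perf_counter() - start
    return output, wall_time
```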
Kernel Wall Time
Time measured when executing through the Forge execution kernel.
Includes:
- orchestration
- policy evaluation
- billing path
- replay materialization
- result hashing
Kernel wall time is expected to be higher than raw execution wall time.
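Because kernel wall time wraps execution with orchestration, policy evaluation, billing, replay materialization, and result hashing, it can be reasoned about as execution wall time plus a kernel overhead term. The helper below is an illustrative calculation only; the parameter names are assumptions, not Forge Pool fields.

```python
def kernel_overhead(kernel_wall_time_s: float, execution_wall_time_s: float) -> float:
    """Hypothetical helper: overhead attributable to the execution kernel.

    Kernel wall time is expected to exceed raw execution wall time, so a
    negative result usually indicates mismatched or mislabeled measurements.
    """
    overhead = kernel_wall_time_s - execution_wall_time_s
    if overhead < 0:
        raise ValueError("kernel wall time should not be lower than execution wall time")
    return overhead
```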
Shard Time
Time required for an individual agent to complete its assigned shard.
Warmup and Steady State
Some workloads exhibit warmup effects.
Benchmarks therefore distinguish between:
- cold-start runs
- steady-state execution
Only steady-state runs are used for scaling projections.
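One common way to separate cold starts from steady state is to discard the first few runs and summarize the remainder. The sketch below assumes a fixed warmup count and a median summary, which is a simplification rather than the documented Forge Pool procedure.

```python
from statistics import median

def steady_state_runs(run_times_s: list[float], warmup_runs: int = 2) -> list[float]:
    """Hypothetical filter: drop the first `warmup_runs` measurements as cold starts."""
    return run_times_s[warmup_runs:]

def steady_state_estimate(run_times_s: list[float], warmup_runs: int = 2) -> float:
    """Median of the steady-state runs, suitable as an input to scaling projections."""
    steady = steady_state_runs(run_times_s, warmup_runs)
    if not steady:
        raise ValueError("not enough runs after discarding warmup")
    return median(steady)
```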
Scaling Assumptions
Scaling projections assume:
- comparable per-agent CPU capacity
- similar shard sizing
- no control-plane bottleneck
- stable network latency
Real deployments may deviate from these assumptions.
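Under these assumptions, a first-order projection is the steady-state shard time multiplied by the number of shard waves each agent must process, plus a fixed overhead term. This is an illustrative model only, not a documented Forge Pool formula.

```python
import math

def projected_wall_time(shard_time_s: float, num_shards: int, num_agents: int,
                        fixed_overhead_s: float = 0.0) -> float:
    """Hypothetical first-order scaling projection.

    Assumes comparable per-agent CPU capacity, similar shard sizing, no
    control-plane bottleneck, and stable network latency, as listed above.
    """
    waves = math.ceil(num_shards / num_agents)   # shard waves per agent
    return waves * shard_time_s + fixed_overhead_s
```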
Reproducibility
Forge Pool supports deterministic replay through:
- root seed control
- per-shard seed derivation
- result hashing
- execution manifests
Benchmark runs can therefore be replayed to validate results.
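A minimal sketch of how deterministic replay can be built from a root seed is shown below: each shard seed is derived by hashing the root seed with the shard index, and results are hashed so replays can be compared byte-for-byte. The function names and hashing scheme are illustrative assumptions, not the Forge Pool implementation.

```python
import hashlib
import json

def derive_shard_seed(root_seed: int, shard_index: int) -> int:
    """Hypothetical per-shard seed derivation: hash the root seed with the shard index."""
    material = f"{root_seed}:{shard_index}".encode()
    return int.from_bytes(hashlib.sha256(material).digest()[:8], "big")

def hash_result(result: dict) -> str:
    """Hypothetical result hashing: canonical JSON digested with SHA-256."""
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

# Replaying with the same root seed yields the same per-shard seeds,
# so matching result hashes indicate a faithful replay.
```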
