Benchmark Methodology

Forge Pool benchmarks aim to provide transparent and reproducible measurements of distributed compute behavior.


Test Environment

Benchmarks are executed on real Forge Pool networks composed of heterogeneous provider nodes.

Typical characteristics include:

  • commodity CPUs
  • mixed hardware classes
  • geographically distributed nodes
  • real network latency conditions

This approach reflects realistic deployment environments rather than idealized lab clusters.


Measurement Definitions

Execution Wall Time

Time from job dispatch to final aggregation.

Includes:

  • shard dispatch
  • agent execution
  • aggregation

Does not include external API latency.
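
As a minimal sketch, execution wall time can be bracketed with a single wall clock around the dispatch, execution, and aggregation path. The helper names below (dispatch_shards, collect_results, aggregate) and the per-result external_api_seconds field are illustrative assumptions, not part of the Forge Pool API:

    import time

    def measure_execution_wall_time(job, dispatch_shards, collect_results, aggregate):
        """Bracket shard dispatch, agent execution, and aggregation."""
        start = time.perf_counter()
        handles = dispatch_shards(job)        # shard dispatch (included)
        results = collect_results(handles)    # agent execution (included)
        output = aggregate(results)           # aggregation (included)
        elapsed = time.perf_counter() - start
        # One way to honor the exclusion above: agents report their own
        # external API wait time (hypothetical field) and it is subtracted.
        external = sum(r.get("external_api_seconds", 0.0) for r in results)
        return output, elapsed - external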


Kernel Wall Time

End-to-end time for the same job when it is executed through the Forge execution kernel.

Includes:

  • orchestration
  • policy evaluation
  • billing path
  • replay materialization
  • result hashing

Kernel wall time is expected to exceed raw execution wall time because it includes these additional kernel stages.
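
To make that gap concrete, a per-stage timer can attribute the overhead to individual kernel stages. This is a sketch, not a Forge Pool utility; the stage names simply mirror the list above and the sleeps stand in for real work:

    import time
    from contextlib import contextmanager

    class StageTimer:
        """Accumulate wall time per named kernel stage."""
        def __init__(self):
            self.stages = {}

        @contextmanager
        def stage(self, name):
            start = time.perf_counter()
            try:
                yield
            finally:
                self.stages[name] = (self.stages.get(name, 0.0)
                                     + time.perf_counter() - start)

    timer = StageTimer()
    with timer.stage("orchestration"):
        time.sleep(0.010)    # stand-in for orchestration work
    with timer.stage("policy_evaluation"):
        time.sleep(0.005)    # stand-in for policy evaluation
    # ... billing path, replay materialization, result hashing ...
    kernel_overhead = sum(timer.stages.values())
    print(timer.stages, kernel_overhead)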


Shard Time

Time required for an individual agent to complete its assigned shard.
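
On the agent side, shard time can be recorded around the shard body alone, so that dispatch and aggregation stay out of the measurement. run_shard here is a hypothetical agent entry point:

    import time

    def timed_shard(run_shard, shard):
        # Clock only the agent's own work on its shard; dispatch and
        # aggregation are covered by execution wall time instead.
        start = time.perf_counter()
        result = run_shard(shard)
        return result, time.perf_counter() - start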


Warmup and Steady State

Some workloads exhibit warmup effects, such as cache population or connection establishment, before reaching steady-state performance.

Benchmarks therefore distinguish between:

  • cold start runs
  • steady state execution

Only steady state runs are used for scaling projections.
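
One simple way to apply this distinction, sketched below: run the workload several times, label the first few iterations as cold starts, and compute steady-state statistics from the remainder. The iteration and warmup counts are arbitrary placeholders, not Forge Pool defaults:

    import statistics
    import time

    def benchmark(run_once, iterations=10, warmup=2):
        """Time repeated runs; treat the first `warmup` runs as cold starts."""
        timings = []
        for _ in range(iterations):
            start = time.perf_counter()
            run_once()
            timings.append(time.perf_counter() - start)
        cold, steady = timings[:warmup], timings[warmup:]
        # Only the steady-state samples feed scaling projections.
        return {
            "cold_start_runs": cold,
            "steady_median": statistics.median(steady),
            "steady_stdev": statistics.stdev(steady) if len(steady) > 1 else 0.0,
        }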


Scaling Assumptions

Scaling projections assume:

  • comparable per-agent CPU capacity
  • similar shard sizing
  • no control-plane bottleneck
  • stable network latency

Real deployments may vary.
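
Under these assumptions a projection reduces to proportional scaling, as in the sketch below. This is plain arithmetic for illustration, not a Forge Pool tool, and projected figures should always be checked against measured runs:

    def project_wall_time(measured_seconds, measured_agents, target_agents):
        # Ideal linear scaling: comparable per-agent CPU, similar shard
        # sizing, no control-plane bottleneck, stable network latency.
        return measured_seconds * measured_agents / target_agents

    # Example: a 120 s steady-state run on 8 agents projected to 32 agents.
    print(project_wall_time(120.0, 8, 32))   # -> 30.0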


Reproducibility

Forge Pool supports deterministic replay through:

  • root seed control
  • per-shard seed derivation
  • result hashing
  • execution manifests

Benchmark runs can therefore be replayed to validate results.
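
One plausible shape for these mechanisms, assuming an HMAC-based derivation (the actual scheme Forge Pool uses is not specified here): derive each shard's seed deterministically from the root seed and the shard index, and hash results for the execution manifest. Replaying with the same root seed then reproduces the same seeds, and matching result hashes validate the run:

    import hashlib
    import hmac

    def derive_shard_seed(root_seed: bytes, shard_index: int) -> int:
        # Per-shard seed derivation: HMAC(root_seed, shard_index).
        digest = hmac.new(root_seed, shard_index.to_bytes(8, "big"),
                          hashlib.sha256).digest()
        return int.from_bytes(digest[:8], "big")

    def hash_result(payload: bytes) -> str:
        # Result hashing: content digest recorded in the execution manifest.
        return hashlib.sha256(payload).hexdigest()

    root = b"benchmark-run-seed"                               # root seed control
    seeds = [derive_shard_seed(root, i) for i in range(4)]     # per-shard seeds
    print(seeds)
    print(hash_result(b"aggregated-output"))                   # result hash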