Benchmark Methodology

Forge Pool benchmarks aim to provide transparent and reproducible measurements of distributed compute behavior.


Test Environment

Benchmarks are executed on real Forge Pool networks composed of heterogeneous provider nodes.

Typical characteristics include:

  • commodity CPUs
  • mixed hardware classes
  • geographically distributed nodes
  • real network latency conditions

This approach reflects realistic deployment environments rather than idealized lab clusters.


Measurement Definitions

Execution Wall Time

Time from job dispatch to final aggregation.

Includes:

  • shard dispatch
  • agent execution
  • aggregation

Does not include external API latency.
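
As a minimal sketch, execution wall time can be bracketed with a single wall clock around the dispatch, execution, and aggregation path. The helper names below (dispatch_shards, collect_results, aggregate) and the per-result external_api_seconds field are illustrative assumptions, not part of the Forge Pool API:

    import time

    def measure_execution_wall_time(job, dispatch_shards, collect_results, aggregate):
        """Bracket shard dispatch, agent execution, and aggregation."""
        start = time.perf_counter()
        handles = dispatch_shards(job)        # shard dispatch (included)
        results = collect_results(handles)    # agent execution (included)
        output = aggregate(results)           # aggregation (included)
        elapsed = time.perf_counter() - start
        # One way to honor the exclusion above: agents report their own
        # external API wait time (hypothetical field) and it is subtracted.
        external = sum(r.get("external_api_seconds", 0.0) for r in results)
        return output, elapsed - external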


Kernel Wall Time

End-to-end time for the same job when it is executed through the Forge execution kernel.

Includes:

  • orchestration
  • policy evaluation
  • billing path
  • replay materialization
  • result hashing

Kernel wall time is expected to exceed raw execution wall time because it includes these additional kernel stages.
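
To make that gap concrete, a per-stage timer can attribute the overhead to individual kernel stages. This is a sketch, not a Forge Pool utility; the stage names simply mirror the list above and the sleeps stand in for real work:

    import time
    from contextlib import contextmanager

    class StageTimer:
        """Accumulate wall time per named kernel stage."""
        def __init__(self):
            self.stages = {}

        @contextmanager
        def stage(self, name):
            start = time.perf_counter()
            try:
                yield
            finally:
                self.stages[name] = (self.stages.get(name, 0.0)
                                     + time.perf_counter() - start)

    timer = StageTimer()
    with timer.stage("orchestration"):
        time.sleep(0.010)    # stand-in for orchestration work
    with timer.stage("policy_evaluation"):
        time.sleep(0.005)    # stand-in for policy evaluation
    # ... billing path, replay materialization, result hashing ...
    kernel_overhead = sum(timer.stages.values())
    print(timer.stages, kernel_overhead)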


Shard Time

Time required for an individual agent to complete its assigned shard.
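
On the agent side, shard time can be recorded around the shard body alone, so that dispatch and aggregation stay out of the measurement. run_shard here is a hypothetical agent entry point:

    import time

    def timed_shard(run_shard, shard):
        # Clock only the agent's own work on its shard; dispatch and
        # aggregation are covered by execution wall time instead.
        start = time.perf_counter()
        result = run_shard(shard)
        return result, time.perf_counter() - start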


Warmup and Steady State

Some workloads exhibit warmup effects, such as cache population or connection establishment, before reaching steady-state performance.

Benchmarks therefore distinguish between:

  • cold start runs
  • steady state execution

Only steady state runs are used for scaling projections.
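
One simple way to apply this distinction, sketched below: run the workload several times, label the first few iterations as cold starts, and compute steady-state statistics from the remainder. The iteration and warmup counts are arbitrary placeholders, not Forge Pool defaults:

    import statistics
    import time

    def benchmark(run_once, iterations=10, warmup=2):
        """Time repeated runs; treat the first `warmup` runs as cold starts."""
        timings = []
        for _ in range(iterations):
            start = time.perf_counter()
            run_once()
            timings.append(time.perf_counter() - start)
        cold, steady = timings[:warmup], timings[warmup:]
        # Only the steady-state samples feed scaling projections.
        return {
            "cold_start_runs": cold,
            "steady_median": statistics.median(steady),
            "steady_stdev": statistics.stdev(steady) if len(steady) > 1 else 0.0,
        }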


Scaling Assumptions

Scaling projections assume:

  • comparable per-agent CPU capacity
  • similar shard sizing
  • no control-plane bottleneck
  • stable network latency

Real deployments may vary.
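
Under these assumptions a projection reduces to proportional scaling, as in the sketch below. This is plain arithmetic for illustration, not a Forge Pool tool, and projected figures should always be checked against measured runs:

    def project_wall_time(measured_seconds, measured_agents, target_agents):
        # Ideal linear scaling: comparable per-agent CPU, similar shard
        # sizing, no control-plane bottleneck, stable network latency.
        return measured_seconds * measured_agents / target_agents

    # Example: a 120 s steady-state run on 8 agents projected to 32 agents.
    print(project_wall_time(120.0, 8, 32))   # -> 30.0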


Reproducibility

Forge Pool supports deterministic replay through:

  • root seed control
  • per-shard seed derivation
  • result hashing
  • execution manifests

Benchmark runs can therefore be replayed to validate results.
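
One plausible shape for these mechanisms, assuming an HMAC-based derivation (the actual scheme Forge Pool uses is not specified here): derive each shard's seed deterministically from the root seed and the shard index, and hash results for the execution manifest. Replaying with the same root seed then reproduces the same seeds, and matching result hashes validate the run:

    import hashlib
    import hmac

    def derive_shard_seed(root_seed: bytes, shard_index: int) -> int:
        # Per-shard seed derivation: HMAC(root_seed, shard_index).
        digest = hmac.new(root_seed, shard_index.to_bytes(8, "big"),
                          hashlib.sha256).digest()
        return int.from_bytes(digest[:8], "big")

    def hash_result(payload: bytes) -> str:
        # Result hashing: content digest recorded in the execution manifest.
        return hashlib.sha256(payload).hexdigest()

    root = b"benchmark-run-seed"                               # root seed control
    seeds = [derive_shard_seed(root, i) for i in range(4)]     # per-shard seeds
    print(seeds)
    print(hash_result(b"aggregated-output"))                   # result hash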