
Blob Storage Architecture

Large-Object Storage for Planetary Compute Workloads

Forge Blob is the large-scale binary object storage layer of the Forge Pool architecture.
It provides efficient storage and retrieval of:

  • model inputs
  • media segments
  • scientific datasets
  • intermediate artifacts
  • video/audio chunks
  • adapter-specific binary payloads

Blob Storage is optimized for workloads requiring MB–GB scale data, complementing KV (small metadata) and VMem (medium-size numeric memory).


1. Purpose & Role in the Architecture

Blob Storage:

  • stores large binary objects
  • enables distributed adapters (FFmpeg, BLAS, PCA, CAT)
  • supports chunked upload/download
  • ensures deterministic availability
  • integrates directly with Hub routing
  • minimizes memory pressure on Agents
  • supports streaming and partial range reads

It is the persistent binary backbone of the planetary compute network.


2. Architecture Overview

Forge Blob consists of:

1. Object Store

Backed by a pluggable storage backend (S3, MinIO, GCS, Azure, filesystem).

2. Blob Gateway

Hub-facing REST interface for:

  • write operations
  • read operations
  • multi-part upload
  • signed URLs
  • range reads

3. Chunk Manager

Splits large files into deterministic chunks for distributed processing.
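Deterministic chunking can be sketched as fixed-size splitting with a per-chunk checksum; identical input always yields identical chunk boundaries and hashes, so any node can verify a chunk independently. The 8 MiB chunk size and field names are assumptions for illustration:

```python
import hashlib

CHUNK_SIZE = 8 * 1024 * 1024  # hypothetical default chunk size (8 MiB)


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[dict]:
    """Split a payload into fixed-size chunks, each paired with its SHA-256.

    Fixed-size splitting is deterministic: the same bytes always produce
    the same (index, offset, size, hash) tuples.
    """
    chunks = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        chunks.append({
            "index": offset // chunk_size,
            "offset": offset,
            "size": len(chunk),
            "sha256": hashlib.sha256(chunk).hexdigest(),
        })
    return chunks
```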

4. Metadata Index

Stores:

  • object size
  • hash checksums
  • content-type
  • adapter association
  • lifecycle/expiration

5. Security Layer

Per-project access controls with signed URLs.


3. Blob Lifecycle

  1. Client uploads object (single or multi-part)
  2. Hub records metadata + checksum
  3. Object is stored in Backend Store
  4. Adapters request chunks through signed URLs
  5. Agents stream chunk data via QUIC
  6. Results reference Blob IDs, not raw data
  7. Blob may auto-expire according to lifecycle policy

This keeps distributed compute memory-efficient: Agents never load full objects unless required.
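Steps 1–3 and 6 of the lifecycle can be sketched with a toy in-memory store. The class and method names are illustrative, and the Blob ID here is a simplified truncation of the content hash (the real scheme also mixes in project scope and entropy):

```python
import hashlib


class InMemoryBlobStore:
    """Toy sketch of the upload lifecycle: store bytes, record
    metadata + checksum, and return a Blob ID that results reference
    instead of raw data. Not the real Hub API."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self._meta: dict[str, dict] = {}

    def upload(self, object_name: str, data: bytes, content_type: str) -> str:
        checksum = hashlib.sha256(data).hexdigest()
        blob_id = checksum[:16]  # toy ID; real IDs also include scope + entropy
        self._objects[blob_id] = data
        self._meta[blob_id] = {
            "object_name": object_name,
            "content_type": content_type,
            "size": len(data),
            "sha256": checksum,
        }
        return blob_id

    def read_range(self, blob_id: str, start: int, end: int) -> bytes:
        # Partial reads (inclusive end), so consumers never load the full object
        return self._objects[blob_id][start:end + 1]
```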


4. API Endpoints

POST /v1/blob/upload

```json
{
  "object_name": "video/segment_00013.ts",
  "content_type": "video/mp2t"
}
```

Returns a signed URL for uploading.
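A sketch of building this request from Python using only the standard library; the request is constructed but not sent, and the Hub base URL is a placeholder:

```python
import json
import urllib.request


def build_upload_request(hub: str, object_name: str, content_type: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/blob/upload request.

    Field names follow the documented payload; `hub` is a placeholder
    base URL supplied by the caller.
    """
    body = json.dumps({
        "object_name": object_name,
        "content_type": content_type,
    }).encode()
    return urllib.request.Request(
        f"{hub}/v1/blob/upload",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The caller would then `urlopen` this request, read the signed URL from the response, and PUT the object bytes to that URL.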


GET /v1/blob/{id}

Provides metadata and access instructions.


GET /v1/blob/download?id=...

Returns a signed URL for download.


Range Requests

Agents or adapters may request:

Range: bytes=2000000-4000000

This enables partial consumption of large datasets.
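A minimal helper for turning a chunk index into the corresponding Range header; the 8 MiB default chunk size is an assumption, and the byte range is inclusive per HTTP semantics:

```python
def range_header_for_chunk(index: int, chunk_size: int = 8 * 1024 * 1024) -> dict:
    """Build the HTTP Range header for chunk `index`.

    HTTP byte ranges are inclusive, so chunk 0 of size 100 is bytes 0-99.
    """
    start = index * chunk_size
    end = start + chunk_size - 1
    return {"Range": f"bytes={start}-{end}"}
```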


5. Deterministic Storage Guarantees

Forge Blob enforces:

  • content-hash verification
  • object immutability
  • consistent regional replication (optional)
  • identity-bound access
  • audit logging

Blob IDs are globally unique and derived from project scope and content hash, with system-level entropy to prevent collisions.
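One plausible reading of that derivation, as a sketch only: hash the project scope and content hash together with fresh entropy. The hash function, truncation length, and entropy width below are all assumptions, not the actual Forge scheme:

```python
import hashlib
import secrets


def derive_blob_id(project: str, content_hash: str) -> str:
    """Sketch of a Blob ID: project scope + content hash + random entropy.

    The entropy term keeps IDs unique even when identical content is
    uploaded twice within the same project.
    """
    entropy = secrets.token_hex(8)  # assumed 64 bits of system entropy
    scoped = f"{project}:{content_hash}:{entropy}"
    return hashlib.sha256(scoped.encode()).hexdigest()[:32]
```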


6. Blob Usage Across Adapters

FFmpeg Adapter

  • segment ingestion
  • chunked transcoding
  • deterministic stitching

BLAS / Scientific Compute

  • matrix tiles
  • dataset partitions
  • PCA coefficient storage

PCA / Climate

  • anomaly fields
  • eigenvector chunks

ForgeCAT UTC

  • cyclone track files
  • ensemble members

ETA / Logistics

  • historical baseline files
  • weather-conditioning datasets

Blob Storage is deeply integrated with the entire compute ecosystem.


7. Performance Expectations

Operation            Typical Runtime
Metadata fetch       1–3 ms
Chunk download       10–40 ms
Signed URL issue     <1 ms
Multi-part commit    5–20 ms

Actual throughput depends on the backend storage layer.


8. Security Model

Blob Storage enforces:

  • per-project ACLs
  • time-limited signed URLs
  • mandatory TLS
  • identity verification at Hub
  • blocklist/allowlist support

Agents never receive raw credentials; all access is delegated via signed URLs.


9. Limitations

  • Not suitable for very small data (use KV)
  • Not intended for in-memory numeric structures (use VMem)
  • Uploading very large files depends on backend throughput
  • Cross-region transfer may incur latency

Related Documentation