Blob Storage Architecture
Large-Object Storage for Planetary Compute Workloads
Forge Blob is the large-scale binary object storage layer of the Forge Pool architecture.
It provides efficient storage and retrieval of:
- model inputs
- media segments
- scientific datasets
- intermediate artifacts
- video/audio chunks
- adapter-specific binary payloads
Blob Storage is optimized for workloads requiring MB–GB scale data, complementing KV (small metadata) and VMem (medium-size numeric memory).
1. Purpose & Role in the Architecture
Blob Storage:
- stores large binary objects
- enables distributed adapters (FFmpeg, BLAS, PCA, CAT)
- supports chunked upload/download
- ensures deterministic availability
- integrates directly with Hub routing
- minimizes memory pressure on Agents
- supports streaming and partial range reads
It is the persistent binary backbone of the planetary compute network.
2. Architecture Overview
Forge Blob consists of:
1. Object Store
Backed by a pluggable storage backend (S3, MinIO, GCS, Azure, filesystem).
2. Blob Gateway
Hub-facing REST interface for:
- write operations
- read operations
- multi-part upload
- signed URLs
- range reads
3. Chunk Manager
Splits large files into deterministic chunks for distributed processing (one possible scheme is sketched after this list).
4. Metadata Index
Stores:
- object size
- hash checksums
- content-type
- adapter association
- lifecycle/expiration
5. Security Layer
Per-project access controls with signed URLs.
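The document does not specify the Chunk Manager's exact scheme; the sketch below illustrates one plausible approach, assuming fixed-size chunks with per-chunk SHA-256 digests so that re-chunking the same bytes always yields identical boundaries and hashes. The CHUNK_SIZE constant and chunk_blob helper are illustrative, not part of the Forge Blob API.

```python
import hashlib
from typing import Iterator, Tuple

CHUNK_SIZE = 8 * 1024 * 1024  # hypothetical 8 MiB fixed chunk size

def chunk_blob(data: bytes) -> Iterator[Tuple[int, bytes, str]]:
    """Yield (index, chunk_bytes, sha256_hex) with deterministic boundaries.

    Fixed-size boundaries mean re-chunking the same object always produces
    the same chunks, so distributed adapters can address them by index.
    """
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        yield i // CHUNK_SIZE, chunk, hashlib.sha256(chunk).hexdigest()
```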
3. Blob Lifecycle
- Client uploads object (single or multi-part)
- Hub records metadata + checksum
- Object is stored in Backend Store
- Adapters request chunks through signed URLs
- Agents stream chunk data via QUIC
- Results reference Blob IDs, not raw data
- Blob may auto-expire according to lifecycle policy
This enables memory-efficient distributed compute: Agents never load full objects unless required, as the sketch below illustrates.
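A consumer-side sketch of this bounded-memory pattern, assuming a signed URL that honors the HTTP Range reads described in section 4; the 4 MiB window size and the iter_ranges helper are illustrative, not part of the Forge Blob API.

```python
import requests
from typing import Iterator

WINDOW = 4 * 1024 * 1024  # hypothetical 4 MiB read window

def iter_ranges(signed_url: str, total_size: int) -> Iterator[bytes]:
    """Stream an object window-by-window so the full blob never sits in memory."""
    for start in range(0, total_size, WINDOW):
        end = min(start + WINDOW, total_size) - 1  # Range ends are inclusive
        resp = requests.get(signed_url,
                            headers={"Range": f"bytes={start}-{end}"})
        resp.raise_for_status()
        yield resp.content
```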
4. API Endpoints
POST /v1/blob/upload
```json
{
  "object_name": "video/segment_00013.ts",
  "content_type": "video/mp2t"
}
```
Returns a signed URL for uploading.
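A two-step upload against this endpoint might look like the following sketch; the Hub base URL, the upload_url and blob_id response fields, and the use of HTTP PUT against the signed URL are assumptions rather than documented behavior.

```python
import requests

HUB = "https://hub.example.com"  # hypothetical Hub base URL

# Step 1: register the object; the Hub records metadata and returns
# a signed upload URL (response field names are assumed).
meta = requests.post(f"{HUB}/v1/blob/upload", json={
    "object_name": "video/segment_00013.ts",
    "content_type": "video/mp2t",
}).json()

# Step 2: stream the file to the backing store via the signed URL.
with open("segment_00013.ts", "rb") as f:
    resp = requests.put(meta["upload_url"], data=f,
                        headers={"Content-Type": "video/mp2t"})
resp.raise_for_status()

print("stored as", meta["blob_id"])  # results reference this ID, not raw bytes
```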
GET /v1/blob/{id}
Provides metadata and access instructions.
GET /v1/blob/download?id=...
Returns signed URL for download.
Range Requests
Agents or adapters may request:
```http
Range: bytes=2000000-4000000
```
This enables partial consumption of large datasets.
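Combined with the download endpoint above, a partial read could be issued as follows; the blob ID, the url response field, and the 206 status check are assumptions for illustration.

```python
import requests

HUB = "https://hub.example.com"  # hypothetical Hub base URL

# Resolve a time-limited signed download URL for the object.
signed = requests.get(f"{HUB}/v1/blob/download",
                      params={"id": "blob_9f2c"}).json()

# Fetch only the requested byte window of the object.
resp = requests.get(signed["url"],
                    headers={"Range": "bytes=2000000-4000000"})
assert resp.status_code == 206  # 206 Partial Content for a honored range
chunk = resp.content
```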
5. Deterministic Storage Guarantees
Forge Blob enforces:
- content-hash verification
- object immutability
- consistent regional replication (optional)
- identity-bound access
- audit logging
Blob IDs are globally unique and derived from project scope and content hash, with system-level entropy to prevent collisions.
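The derivation itself is internal to the Hub; below is a minimal sketch, assuming the three inputs named above (project scope, hex-encoded content hash, random entropy) are concatenated and hashed, with the make_blob_id helper and blob_ prefix being purely illustrative.

```python
import hashlib
import os

def make_blob_id(project_id: str, content_hash_hex: str) -> str:
    """Derive a globally unique Blob ID from project scope + content hash.

    A random salt supplies the system-level entropy mentioned above, so
    two uploads of identical bytes still receive distinct IDs.
    """
    salt = os.urandom(16)  # system-level entropy
    digest = hashlib.sha256(
        project_id.encode() + bytes.fromhex(content_hash_hex) + salt
    ).hexdigest()
    return f"blob_{digest[:32]}"
```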
6. Blob Usage Across Adapters
FFmpeg Adapter
- segment ingestion
- chunked transcoding
- deterministic stitching
BLAS / Scientific Compute
- matrix tiles
- dataset partitions
- PCA coefficient storage
PCA / Climate
- anomaly fields
- eigenvector chunks
ForgeCAT UTC
- cyclone track files
- ensemble members
ETA / Logistics
- historical baseline files
- weather-conditioning datasets
Blob Storage is deeply integrated with the entire compute ecosystem.
7. Performance Expectations
| Operation | Typical Latency |
|---|---|
| Metadata fetch | 1–3 ms |
| Chunk download | 10–40 ms |
| Signed URL issue | <1 ms |
| Multi-part commit | 5–20 ms |
Actual throughput depends on the backend storage layer.
8. Security Model
Blob Storage enforces:
- per-project ACLs
- time-limited signed URLs
- mandatory TLS
- identity verification at Hub
- blocklist/allowlist support
Agents never receive raw credentials — all access is delegated via signed URLs.
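The signing mechanism is not specified in this document; what follows is a minimal sketch, assuming HMAC-SHA256 over the blob ID and an expiry timestamp, where the SIGNING_KEY, URL shape, and parameter names are all assumptions.

```python
import hashlib
import hmac
import time

SIGNING_KEY = b"replace-with-hub-secret"  # hypothetical Hub-held secret

def sign_download_url(blob_id: str, ttl_seconds: int = 300) -> str:
    """Build a time-limited signed URL (URL shape is illustrative only)."""
    expires = int(time.time()) + ttl_seconds
    payload = f"{blob_id}:{expires}".encode()
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return f"https://blob.example.com/{blob_id}?expires={expires}&sig={sig}"

def verify(blob_id: str, expires: int, sig: str) -> bool:
    """Reject expired or tampered URLs using a constant-time comparison."""
    if time.time() > expires:
        return False
    payload = f"{blob_id}:{expires}".encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)
```

Because Agents only ever hold signed URLs and never the key, only the Hub can mint or verify them, and the expiry bounds the exposure of a leaked URL.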
9. Limitations
- Not suitable for very small data (use KV)
- Not intended for in-memory numeric structures (use VMem)
- Upload speed for very large files is bounded by backend throughput
- Cross-region transfer may incur latency
