Batch Processing

Distribute neural models at scale by processing hundreds of them in parallel across our global sharding nodes.

Large-scale Ingestion

The Batch API is designed for enterprise users who need to maintain versioned builds of their entire model zoo across multiple hardware targets (Jetson, mobile, and Intel compute).

Parallel Sharding

Process up to 100 models simultaneously per organization.
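
Because each batch is capped at 100 models, larger model zoos have to be split across multiple submissions. A minimal Python sketch of that chunking, where submit_batch() is a hypothetical stand-in for your actual API client call:

def submit_batch(batch_name, models, targets):
    # Placeholder: swap in your real Batch API client call here.
    print(f"submitting {batch_name}: {len(models)} models -> {targets}")
    return batch_name

def chunk(models, size=100):
    # Yield successive groups of at most `size` models (the per-batch cap).
    for i in range(0, len(models), size):
        yield models[i:i + size]

def submit_all(models, targets):
    # Fan the full zoo out as a series of <=100-model batches.
    return [
        submit_batch(f"zoo-shard-{n}", group, targets)
        for n, group in enumerate(chunk(models))
    ]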

Target Matrix

Auto-export one model to all 5+ supported targets in a single call.
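
One request can fan a single model out to every export target at once. A sketch of such a payload; only tensorrt, coreml, and tflite appear in the payload example later on this page, so the remaining target identifiers here are illustrative, not confirmed:

ALL_TARGETS = ["tensorrt", "coreml", "tflite", "onnx", "openvino"]  # last two are assumed names

payload = {
    "batch_name": "resnet50-all-targets",
    "targets": ALL_TARGETS,                    # one call, every target
    "models": [{"model_id": "resnet50-v1"}],   # hypothetical model descriptor
    "callback_url": "https://hooks.ai.com/edge",
}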

Resilient S3 Storage

Direct-to-bucket uploads with zero-knowledge encryption.
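
Zero-knowledge here means the weights are encrypted before they leave your machine, so the service only ever stores ciphertext. A minimal sketch using the cryptography and requests packages; how you obtain the presigned S3 upload URL is left out, and the symmetric Fernet scheme is our assumption, not a documented requirement:

import requests
from cryptography.fernet import Fernet

def encrypted_upload(model_path, presigned_url, key):
    # Encrypt the model client-side, then PUT the ciphertext straight to the bucket.
    cipher = Fernet(key)  # the key never leaves your side
    with open(model_path, "rb") as fh:
        ciphertext = cipher.encrypt(fh.read())
    resp = requests.put(presigned_url, data=ciphertext)
    resp.raise_for_status()

key = Fernet.generate_key()  # store safely: losing the key loses the model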

Webhooks

Fire event notifications to your CI/CD when a batch is ready.
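
On the receiving end, a small HTTP handler can turn the "batch ready" event into a CI/CD trigger. A minimal Flask sketch; the X-Batch-Signature header and the HMAC-SHA256 signing scheme are assumptions for illustration, not a documented contract:

import hashlib
import hmac

from flask import Flask, abort, request

app = Flask(__name__)
WEBHOOK_SECRET = b"replace-with-your-shared-secret"

@app.route("/edge", methods=["POST"])
def batch_ready():
    # Assumed scheme: HMAC-SHA256 over the raw body, hex-encoded in a header.
    expected = hmac.new(WEBHOOK_SECRET, request.data, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, request.headers.get("X-Batch-Signature", "")):
        abort(401)
    event = request.get_json()
    # Kick off the CI/CD job here, e.g. trigger your deploy pipeline.
    print("batch ready:", event.get("batch_name"))
    return "", 204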

JSON Payload Example

{
  "batch_name": "llama-v3-release",
  "targets": ["tensorrt", "coreml", "tflite"],
  "models": [ ... ],
  "callback_url": "https://hooks.ai.com/edge"
}
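
To submit the payload, POST it to the batch endpoint with your API key. A sketch using the requests package; the endpoint URL and bearer-token scheme are assumptions:

import os

import requests

payload = {
    "batch_name": "llama-v3-release",
    "targets": ["tensorrt", "coreml", "tflite"],
    "models": [],  # fill in your model descriptors
    "callback_url": "https://hooks.ai.com/edge",
}

resp = requests.post(
    "https://api.example.com/v1/batches",  # hypothetical endpoint
    json=payload,
    headers={"Authorization": f"Bearer {os.environ['BATCH_API_KEY']}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())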