Edge AI to Orbit: How Data Flows in the Distributed AI Stack
AI compute is distributing across tiers — from edge sensors to regional hubs to terrestrial data centers to orbital platforms. Understanding the data flows between these tiers is the first step to building infrastructure that works.
The Multi-Tier Architecture
AI workloads no longer run in a single place. The emerging architecture has at least four distinct compute tiers, each with different capabilities, constraints, and data characteristics:
```
Distributed AI Compute Stack
══════════════════════════════════════════════════════════
Tier 4: Orbital Compute (future)
├── Passive cooling, solar power, extreme isolation
├── Latency: 5-20 ms to ground (LEO)
└── Use: Bulk training, archival, specialized workloads
          ▲
          │  Optical/RF downlink (1-10 Gbps burst)
Tier 3: Terrestrial Data Centers
├── GPU clusters, high-bandwidth interconnect
├── Latency: 1-50 ms between facilities
└── Use: Model training, large-scale inference
          ▲
          │  WAN / dedicated fiber (1-400 Gbps)
Tier 2: Regional Aggregation
├── Edge data centers, telco PoPs
├── Latency: 1-10 ms to edge devices
└── Use: Data filtering, preprocessing, caching
          ▲
          │  LAN / cellular / satellite (100 Mbps - 10 Gbps)
Tier 1: Edge Devices
├── Sensors, cameras, IoT, phones, vehicles
├── Compute: Limited (NPUs, mobile GPUs)
└── Use: Data capture, local inference, filtering
```
Data flows in both directions through this stack. Raw data moves up from edge to cloud for training. Trained models move down from cloud to edge for inference. The volume, frequency, and latency requirements differ at every boundary.
Tier 1: The Edge — Where Data Is Born
The vast majority of AI training data originates at the edge. Cameras, microphones, LiDAR units, industrial sensors, smartphones — these devices generate the raw material that feeds the entire AI pipeline.
The numbers are staggering. IDC estimates that the global datasphere will reach 291 zettabytes by 2027, with over 70% of data generated at or near the edge. A single autonomous vehicle generates 5-20 TB of sensor data per day. A manufacturing facility with 100 machine vision cameras generates 1-5 TB per shift.
Edge AI exists precisely because you can't move all of this data. Local inference at the edge — running smaller models on device — serves two purposes: it provides real-time responses without network round-trips, and it filters data before it moves upstream.
A well-designed edge inference pipeline reduces upstream data volume by 90-99%. Instead of streaming every frame from a security camera (500 GB/day at 4K), edge inference can extract only the frames containing detected events (5-50 GB/day). The raw footage stays on local storage; only curated data moves to Tier 2.
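The filtering step can be sketched in a few lines. This is a minimal, illustrative simulation, not a real vision pipeline: `detect_events` is a hypothetical stand-in for an on-device detector, and the frame scores are synthetic. The point it demonstrates is the shape of the pipeline — score every frame locally, forward only the frames that clear a threshold:

```python
import random

def detect_events(frame, threshold):
    """Hypothetical stand-in for an on-device detector (e.g. an
    NPU-accelerated model). Here it just checks a precomputed score."""
    return frame["score"] >= threshold

def filter_frames(frames, threshold):
    """Keep only frames with detected events; the rest stay on
    local storage and never cross the network."""
    return [f for f in frames if detect_events(f, threshold)]

# Simulate a day of camera frames; scores are skewed so that only a
# small fraction of frames contain events (an assumption for the demo).
random.seed(7)
frames = [{"id": i, "score": random.random() ** 6} for i in range(100_000)]
kept = filter_frames(frames, threshold=0.9)
reduction = 1 - len(kept) / len(frames)
print(f"kept {len(kept)} of {len(frames)} frames ({reduction:.1%} reduction)")
```

With these synthetic scores, roughly 98% of frames are dropped at the edge — the same order of reduction as the security-camera example above.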
Tier 1 to Tier 2: The First Transfer Boundary
Even after edge filtering, the volume of data moving from edge to regional aggregation is substantial. Consider a retail chain with 1,000 stores, each running computer vision for inventory and customer analytics:
- 50-200 cameras per store
- Edge inference reduces data to ~20 GB/day per store
- Total: 20 TB/day flowing from edge to regional hubs
- Monthly: 600 TB
At cloud egress rates ($0.05-0.12/GB), moving 600 TB/month costs $30,000-72,000. That's before any training compute. This is why direct P2P transfer matters at the edge-to-aggregation boundary — the per-GB model breaks at this scale.
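The egress arithmetic is worth making explicit. A quick sketch using the numbers from the retail example (store count, per-store volume, and egress rates as given above):

```python
stores = 1_000
per_store_gb_per_day = 20
days_per_month = 30

monthly_gb = stores * per_store_gb_per_day * days_per_month  # 600,000 GB = 600 TB

for rate_per_gb in (0.05, 0.12):  # typical cloud egress range, $/GB
    cost = monthly_gb * rate_per_gb
    print(f"${rate_per_gb:.2f}/GB -> ${cost:,.0f}/month")
```

At scale, the per-GB pricing model dominates the bill long before any compute is purchased.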
The connectivity at this boundary is often unreliable. Edge devices connect over cellular (4G/5G), satellite (for remote sites), or limited-bandwidth wired connections. Transfer protocols that depend on stable, low-latency links fail in these conditions. Protocols designed for intermittent connectivity — resumable, tolerant of packet loss, capable of operating over high-latency paths — are necessary infrastructure, not optional features.
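The core property — resumability — can be shown with a toy model. This is a sketch, not Handrive's actual protocol: `FlakyLink` is an invented stand-in for a cellular or satellite hop that drops randomly, and the "ack" is simplified to the receiver's byte count. What matters is that after every drop, the sender resumes from the receiver's acknowledged offset instead of restarting the file:

```python
import random

CHUNK = 64 * 1024  # 64 KiB chunks (illustrative size)

class FlakyLink:
    """Toy channel that randomly drops the connection, standing in
    for an unreliable edge uplink."""
    def __init__(self, fail_rate=0.3):
        self.received = bytearray()
        self.fail_rate = fail_rate

    def send(self, chunk):
        if random.random() < self.fail_rate:
            raise ConnectionError("link dropped")
        self.received.extend(chunk)

    def acked_bytes(self):
        # In a real protocol the receiver reports this offset.
        return len(self.received)

def resumable_send(data, link, max_retries=1000):
    """Resume from the receiver's acked offset after each drop --
    the property that makes transfers survive intermittent links."""
    retries = 0
    while link.acked_bytes() < len(data):
        offset = link.acked_bytes()
        try:
            link.send(data[offset:offset + CHUNK])
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
    return retries

random.seed(1)
payload = bytes(range(256)) * 4096  # 1 MiB test payload
link = FlakyLink(fail_rate=0.3)
drops = resumable_send(payload, link)
print(f"delivered {len(payload)} bytes despite {drops} link drops")
```

A naive sender that restarts on every drop would, at a 30% failure rate, rarely finish at all; the resumable version finishes every time because no acknowledged byte is ever re-sent.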
Tier 2: Regional Aggregation
Regional hubs serve as the first concentration point for edge data. These might be edge data centers (Equinix, EdgeConnex), telco points of presence, or on-premise server rooms in enterprise environments.
At this tier, data undergoes preprocessing: deduplication, format normalization, quality filtering, and annotation pipeline staging. The goal is to transform raw edge data into training-ready datasets.
Data gravity — the tendency of data to attract applications and services — matters here: once data lands in a regional hub, it's costly to move again. Preprocessing infrastructure tends to co-locate with storage. Annotation teams (or annotation AI) run where the data is. The regional tier becomes a natural accumulation point.
This is both useful and dangerous. Useful because it reduces the volume of data that needs to move to Tier 3 (after deduplication and filtering, datasets might be 30-50% of ingested volume). Dangerous because it creates data silos — each regional hub has a different subset of the total dataset, and assembling the full training corpus requires pulling data from many hubs.
Tier 2 to Tier 3: The Training Bottleneck
Moving preprocessed data from regional hubs to centralized training clusters is where most organizations hit their transfer ceiling. The bandwidth between Tier 2 and Tier 3 is typically shared WAN — 1-100 Gbps depending on the facilities and geography.
A concrete example: an ML team at an autonomous driving company needs to assemble a 2 PB training dataset from 8 regional collection hubs across North America and Europe. Each hub holds 200-300 TB of preprocessed sensor data. The training cluster is in a centralized facility.
```
Transfer scenario: 2 PB aggregation from 8 regional hubs
──────────────────────────────────────────────────────────
Per-hub data:        ~250 TB
Link speed:          10 Gbps per hub (shared WAN)
Effective rate:      6 Gbps (60% utilization, realistic)
Per-hub time:        ~3.9 days
Parallel transfers:  all 8 hubs simultaneously
Total time:          ~3.9 days (parallel)
But:                 shared WAN means contention
Realistic total:     5-7 days
Cloud egress cost:   $100,000 - $240,000 (one-time)
Handrive P2P cost:   $0
```
Five to seven days to assemble a training dataset. If the training run itself takes two weeks, data movement represents 25-33% of total pipeline time. And if a data quality issue is discovered during training, requiring re-collection from edge sources, the clock resets.
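The per-hub figure in the scenario above falls out of a one-line calculation — bytes to bits, divided by the effective link rate:

```python
per_hub_tb = 250        # preprocessed data per regional hub
link_gbps = 10          # shared WAN link per hub
utilization = 0.6       # realistic effective share of the link

effective_bps = link_gbps * utilization * 1e9
bits_to_move = per_hub_tb * 1e12 * 8
seconds = bits_to_move / effective_bps
print(f"per-hub transfer: {seconds / 86400:.1f} days")
```

Since all eight hubs transfer in parallel, the aggregate time equals the slowest hub's time — about 3.9 days in the ideal case, before WAN contention pushes it to 5-7.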
Tier 3: Terrestrial Data Centers
This is where the heavy compute happens. GPU clusters running training jobs, inference serving fleets, and the storage systems that feed them. Within a data center, bandwidth is abundant — InfiniBand at 400 Gbps per link, with aggregate bisection bandwidth in the hundreds of terabits per second for large clusters.
But AI workloads increasingly span multiple data centers. A training run might use 10,000 GPUs split across two facilities because no single facility has enough capacity or power. Model weights need to sync between sites. Model checkpoints — typically 1-10 TB for frontier models — need to replicate for fault tolerance.
Data movement between AI data centers is a growing concern. The inter-facility links are fast (100-400 Gbps) but finite. When multiple training runs compete for the same backbone links, checkpoint synchronization delays cascade into GPU idle time. At $2-8 per GPU-hour for H100 equivalents, idle time is expensive.
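To put a rough number on that idle time: the sketch below assumes a mid-range checkpoint size, a contended 100 Gbps inter-facility link, and — pessimistically — that the whole cluster stalls while the checkpoint replicates. (In practice checkpoint transfer can overlap compute, so treat this as an upper bound under those stated assumptions.)

```python
checkpoint_tb = 5       # mid-range frontier-model checkpoint
link_gbps = 100         # contended inter-facility backbone
gpus = 10_000           # GPUs in the training run
gpu_hour_usd = 4        # mid-range H100-class hourly rate

sync_hours = checkpoint_tb * 1e12 * 8 / (link_gbps * 1e9) / 3600
idle_cost = sync_hours * gpus * gpu_hour_usd
print(f"sync: {sync_hours * 60:.1f} min; "
      f"idle cost if training blocks: ${idle_cost:,.0f}")
```

Even a few minutes of stall per checkpoint, repeated across frequent checkpoint intervals and multiple competing runs, adds up quickly.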
Tier 3 to Tier 4: The Orbital Frontier
Orbital compute is the emerging Tier 4. Multiple companies are developing orbital data centers for workloads that benefit from the space environment: free cooling (eliminating a major terrestrial cost), abundant solar power, and physical isolation for secure computation.
The data transfer challenge at this boundary is the most severe in the entire stack. Ground-to-orbit and orbit-to-ground links are constrained by contact windows, atmospheric conditions, and the physics of optical/RF propagation through the atmosphere. We covered these constraints in detail in our analysis of orbital data center transfer challenges.
The key question for Tier 3-to-4 data flow is: what moves up, and what moves down? Bulk training data moves up (large volume, latency-tolerant). Trained model weights move down (moderate volume, moderate latency sensitivity). Real-time inference traffic probably stays terrestrial — the latency overhead of an orbital round-trip (minimum 5-20 ms for LEO, plus processing) is acceptable for batch workloads but not for interactive applications.
The Role of File Transfer in This Architecture
At every tier boundary, data moves. And at every boundary, the transfer characteristics are different:
| Boundary | Volume | Link Quality | Key Requirement |
|---|---|---|---|
| Edge to Regional | TB/day | Unreliable | Resumability |
| Regional to DC | PB/week | Moderate | Throughput |
| DC to DC | PB/day | Good | Low overhead |
| DC to Orbit | TB/day | Intermittent | Window efficiency |
A transfer tool that only works well in one of these regimes is insufficient. Enterprise transfer tools were built for the DC-to-DC case — reliable links, high bandwidth, predictable conditions. They struggle at the edge and are untested at orbital boundaries.
Handrive's protocol was designed for the hardest conditions in this stack: unreliable links, high latency, intermittent connectivity. A protocol that works on a satellite link also works on a stable fiber connection — the reverse isn't true. That's why building for the worst case provides coverage across all tiers.
Reducing Data Movement
The best transfer is the one you don't make. Reducing data movement through architectural choices is as important as making transfers faster:
- Edge inference: Process data locally to reduce what moves upstream. A 99% reduction at the edge turns petabytes into terabytes.
- Federated learning: Train on data where it lives. Only model updates (gradients, not data) move between nodes. Transfer volume drops by orders of magnitude.
- Data deduplication: At the regional tier, deduplicate before forwarding. Cross-hub deduplication can reduce aggregate dataset size by 30-60%.
- Incremental sync: Don't re-transfer unchanged data. Block-level differencing transfers only what's new — critical for datasets that evolve incrementally.
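The incremental-sync idea in the last bullet can be sketched with fixed-size block hashing: hash both versions of a dataset block by block and transfer only the blocks whose hashes differ. (This is a simplified illustration — production tools such as rsync use rolling hashes to also handle insertions that shift block boundaries.)

```python
import hashlib

BLOCK = 4 * 1024 * 1024  # 4 MiB blocks (illustrative size)

def block_hashes(data, block=BLOCK):
    """SHA-256 digest of each fixed-size block."""
    return [hashlib.sha256(data[i:i + block]).hexdigest()
            for i in range(0, len(data), block)]

def changed_blocks(old, new, block=BLOCK):
    """Indices of blocks in `new` that differ from `old` --
    the only blocks an incremental sync needs to transfer."""
    old_h, new_h = block_hashes(old, block), block_hashes(new, block)
    return [i for i, h in enumerate(new_h)
            if i >= len(old_h) or h != old_h[i]]

old = bytes(32 * 1024 * 1024)     # 32 MiB dataset snapshot (all zeros)
new = bytearray(old)
new[10 * 1024 * 1024] = 0xFF      # one byte changed, inside block 2
delta = changed_blocks(old, bytes(new))
print(f"transfer {len(delta)} of {len(new) // BLOCK} blocks")
```

A one-byte change dirties a single 4 MiB block, so the sync moves 4 MiB instead of 32 MiB — and the ratio only improves as datasets grow and changes stay localized.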
But even with all of these optimizations, the remaining data that must move is measured in petabytes. The question isn't whether to invest in transfer infrastructure — it's whether to invest now or scramble later. For more on the macro dimensions of this problem, see the coming data transfer crisis.
Where This Is Heading
The distributed AI stack is still forming. Two years ago, "edge AI" was a buzzword. Now it's a $20+ billion market with real hardware shipping. Orbital compute is at the same stage edge AI was in 2022 — early prototypes, genuine engineering work, and a clear value proposition waiting for the technology to mature.
The common thread across all tiers is data fluidity. The organizations that can move data efficiently between tiers — without per-GB taxes, without protocol limitations, without manual orchestration — will iterate faster, train on more data, and deploy to more endpoints.
Handrive is building the transfer layer for this multi-tier future. A single protocol that handles edge connectivity, data center throughput, and orbital intermittency. One tool that runs on a Raspberry Pi at the edge, a headless server in a data center, and (eventually) a compute node in orbit. For how the pieces connect end to end, see our overview of the secure Earth-to-orbit AI pipeline.
One Protocol, Every Tier
From edge devices to data centers, Handrive's transfer protocol handles unreliable links, high latency, and petabyte-scale volumes.
Download Handrive