How Data in the Cloud Supports Robots in the Sky
Drones, Docks, and Data at Scale
At Skydio, we believe in the promise of autonomous flight—not just as a technological feat, but as a transformative tool for our customers. Our users, whether it’s your local power company or a community police officer keeping your neighborhood safe, trust Skydio drones to be their eye-in-the-sky: safely and reliably capturing critical information in real time.
Ten years ago, drones were mainly for hobbyists. In the last five years, they’ve become an essential tool across both private and public-sector operations. Most public safety agencies (e.g., fire departments, police departments, and national parks) now have at least one drone pilot on scene for critical events. Skydio believes in a future where drones arrive first: deploying from optimally positioned docking stations and flown by remote pilots, so public safety teams can get eyes on the scene in minutes, not hours. In the near future, our autonomy will enable remote pilots to orchestrate multiple drones across multiple locations simultaneously.
This evolution is what we call the Arc of Autonomy: the progression from a single operator with one drone (1:1), to one operator managing many drones (1:many), and ultimately to fleets of drones operating autonomously (many:many) with remote operator oversight. We believe the stage is set for drones to become a critical component in managing large-scale infrastructure, including power, water, and roadways.
If what you’ve read so far has already gotten you fired up, you can learn more about Skydio and the roles we’re hiring for on our About and Careers pages.
Data is the Backbone
Underpinning this whole transformation is data. Typical drone flights generate gigabytes of telemetry and log data, enough for engineers to reproduce nearly every aspect of a mission. That data is piling up fast. As customer adoption of Docks and Remote Operations accelerates, flight volume rises, and the amount of data we ingest grows right along with it.
In this blog post series, we’ll share a technical overview of how we’ve scaled our data lake (Databricks) to sustain that growth, along with what we’ve learned. The rest of this post provides a high-level overview of our data systems and the requirements they’re built around. Future posts will go deeper on specific topics like source-controlled data pipelines (dbt), streaming, CI/CD, data migrations, AI, and cost management.
The Foundation: Scaling for a Global Fleet
Supporting explosive growth in drone usage has required an equally dramatic evolution in our data operations. Architecturally speaking, analyzing a single flight is easy: plug a USB stick into the drone and transfer the logs to your laptop for analysis. From hardware health to user-instrumented events, we built a suite of internal tools that lets engineers across disciplines investigate issues and collaborate on root-cause analysis based on logs from the drone.
But as our customers embrace Docks and Remote Operations, our flight count has reached the millions. To give a sense of scale, since 2024:
- Fleet scale: We have shipped 16,000 X10/X10Ds.
- Data velocity: We ingest 2–3 billion events per day, capturing granular telemetry from every mission.
- Flight volume: We see 60,000 to 70,000 flights per month and expect 100,000 flights per month by the end of the year.
- Storage: Our data lake manages 3.2+ petabytes of data, a foundation for autonomy and reliability research.
By capturing this influx of data at scale, we have made Skydio more data-driven. For example, we track safety and reliability metrics, such as flight success rate, connectivity quality, and parachute deployments, to ensure our customers are successful in the field. Data is also at the heart of Skydio’s manufacturing process, enabling resource planning and production line optimization, and ultimately increasing manufacturing yield.
Beyond volume, ingestion speed is a hard requirement. When responding to a critical failure, the urgent question is:
Does the issue affecting this drone also threaten other drones? Should I ground my fleet, or keep flying?
Engineering quickly pivots to root cause: Was it a kernel panic or a hawk attack? Yes, a literal bird… and yes, it happens.
A recent example is our launch of an improved version of our flight batteries. We received reports of inaccurate remaining-capacity readings. By leveraging the deep telemetry we capture on every flight, we identified the root cause as a hardware issue with a vendor’s chip. We then extended this diagnosis across our entire fleet, identified all affected customers, verified the root cause, and pushed out a software fix that addressed the issue without replacing any hardware.
The sooner we can determine what happened, the sooner customers can trust their fleet again. Skydio aims to respond in hours, not days, which makes ingestion speed a key bottleneck for root-cause analysis.
Lastly, we need to support log data from many drone types, each running different firmware versions and therefore producing data with different schemas. All of this data needs to be reliably consumed and made available to users.
Architecture: The Engine of Autonomy
To move from individual log analysis to a globally scalable platform, we built a modern, distributed stack designed for durability, performance, and cost efficiency.
Raw flight data ingestion
Everything starts with pulling data off the robot. To maximize safety, we reserve all on-board compute and connection bandwidth mid-flight for command and control; our drones upload telemetry only after they have safely landed. It’s a race to upload flight data before the aircraft needs to launch again. You don’t want to re-drive ingestion from the drone, so the goal is to upload raw data as quickly as possible and move the source of truth upstream.
We chose direct upload to AWS S3, using Skydio Cloud only as an authentication broker via ephemeral AWS STS credentials. This lets data uploads scale with S3 rather than routing all traffic through our own systems. The drone and cloud coordinate to ensure every file is hash-verified, so we can trust S3 as the raw source of truth.
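In rough form, that handshake looks something like the sketch below. The broker endpoint, response shape, and bucket layout are hypothetical; the boto3 `ChecksumAlgorithm` option is a real S3 feature that has the service verify a client-computed SHA-256 on write.

```python
import boto3
import requests

# Hypothetical broker endpoint: Skydio Cloud authenticates the drone and
# vends short-lived STS credentials scoped to a per-flight S3 prefix.
CRED_BROKER_URL = "https://cloud.example.com/api/v1/upload-credentials"


def upload_flight_log(flight_id: str, log_path: str, drone_token: str) -> None:
    # 1. Exchange the drone's identity token for ephemeral AWS credentials
    #    (the AssumeRole call happens server-side in Skydio Cloud).
    creds = requests.post(
        CRED_BROKER_URL,
        headers={"Authorization": f"Bearer {drone_token}"},
        json={"flight_id": flight_id},
        timeout=10,
    ).json()

    s3 = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

    # 2. Upload straight to S3. ChecksumAlgorithm makes boto3 compute a
    #    SHA-256 that S3 verifies server-side, rejecting corrupted uploads.
    with open(log_path, "rb") as f:
        s3.put_object(
            Bucket=creds["Bucket"],
            Key=f"raw/{flight_id}/{log_path.rsplit('/', 1)[-1]}",
            Body=f.read(),
            ChecksumAlgorithm="SHA256",
        )
```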
Getting data into the data lake
Once data lands in S3, we use AWS SNS to trigger Temporal workflows (https://temporal.io/) that transform it into Parquet for efficient downstream consumption (both the data lake and Skydio Cloud). Temporal is built for failure by design, supporting automatic retries and clean re-drives, so transient issues rarely require alerts, and non-transient issues are straightforward to fix and reprocess without extensive custom tooling.
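Here is a minimal sketch of what one of those workflows could look like with Temporal’s Python SDK; the activity body and key layout are hypothetical, but the retry behavior is what the SDK’s RetryPolicy provides out of the box.

```python
from datetime import timedelta

from temporalio import activity, workflow
from temporalio.common import RetryPolicy


@activity.defn
async def convert_to_parquet(s3_key: str) -> str:
    # Hypothetical transform: download the raw log, transcode to Parquet,
    # and write it back to S3. Any exception raised here triggers a retry.
    parquet_key = s3_key.replace("raw/", "parquet/") + ".parquet"
    # ... transcoding elided ...
    return parquet_key


@workflow.defn
class IngestFlightLog:
    @workflow.run
    async def run(self, s3_key: str) -> str:
        # Temporal persists workflow progress, so a worker crash or a
        # transient S3 error resumes here rather than paging an engineer.
        return await workflow.execute_activity(
            convert_to_parquet,
            s3_key,
            start_to_close_timeout=timedelta(minutes=15),
            retry_policy=RetryPolicy(maximum_attempts=5),
        )
```

On the consuming side, a small SNS/SQS listener would call `client.start_workflow(IngestFlightLog.run, key, id=key, task_queue=...)` for each new object, giving every file its own durable, retryable execution.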
We then load data into Databricks using Auto Loader plus custom Python. Auto Loader lets us provide an SQS queue and a schema, then continuously ingest telemetry into tables as new data arrives in S3. Subsequent Python jobs move data from a raw “multi-event” table into event-specific tables optimized for analysis.
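In Auto Loader terms, that first hop looks roughly like this, assuming a Databricks notebook where `spark` is ambient; the schema, queue URL, paths, and table names are illustrative. The `cloudFiles.useNotifications` and `cloudFiles.queueUrl` options switch Auto Loader into file-notification mode against an existing SQS queue.

```python
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

# Illustrative schema; real telemetry schemas are far richer and versioned.
event_schema = StructType([
    StructField("flight_id", StringType()),
    StructField("event_type", StringType()),
    StructField("ts", TimestampType()),
    StructField("payload", StringType()),
])

(
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    # File-notification mode: consume S3 event messages from an existing
    # SQS queue instead of repeatedly listing the bucket.
    .option("cloudFiles.useNotifications", "true")
    .option("cloudFiles.queueUrl", "https://sqs.us-west-2.amazonaws.com/123456789012/telemetry")
    .schema(event_schema)
    .load("s3://example-bucket/parquet/")
    .writeStream.option("checkpointLocation", "s3://example-bucket/_checkpoints/multi_event")
    .trigger(availableNow=True)
    .toTable("raw.multi_event")
)
```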
We also ingest cloud-application telemetry. As users fly via web applications, we log usage in Datadog and ingest those logs into our data lake through custom export jobs.
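An export job of this kind can be as simple as paging through Datadog’s Logs Search API (v2) and landing the results in S3; the query string and event names below are made up for illustration.

```python
import os

import requests

# Datadog Logs Search endpoint (v2 API).
DD_SEARCH_URL = "https://api.datadoghq.com/api/v2/logs/events/search"


def fetch_usage_logs(cursor: str | None = None) -> dict:
    body = {
        "filter": {
            "query": "service:skydio-cloud @evt.name:flight_started",  # hypothetical query
            "from": "now-1h",
            "to": "now",
        },
        "page": {"limit": 1000},
    }
    if cursor:
        body["page"]["cursor"] = cursor
    resp = requests.post(
        DD_SEARCH_URL,
        headers={
            "DD-API-KEY": os.environ["DD_API_KEY"],
            "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
        },
        json=body,
        timeout=30,
    )
    resp.raise_for_status()
    # Page through via meta.page.after, then land each batch in S3 as Parquet.
    return resp.json()
```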
When root-causing drone issues, it’s incredibly valuable to incorporate data across the entire lifecycle:
- Manufacturing data: Links parts back to origin, including records of hardware tests.
- CRM data: Enables search across customer support tickets to identify patterns.
- Cloud product data: Captures the pilot experience and ties cloud-side actions to drone-side behavior.
- Internal issue tracking data: Connects symptoms to known issues across software versions.
This is a lot of data from disparate systems. The “magic” happens in the processing layer: we use dbt, SQL, and PySpark for transformation, orchestrated via Databricks Workflows. Our data-team repository hosts hundreds of dbt models, CI/CD checks, and READMEs—helping developers build pipelines safely and consistently, grounded in a shared understanding of existing transformations. dbt has played a major role in helping us add business context to tables in a disciplined way, shifting time from writing repetitive SQL to extracting insights from well-enriched data.
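Since dbt also supports Python models on Databricks, a model in that repository might look something like the following sketch; the model, table, and column names are hypothetical.

```python
# models/marts/flight_success_daily.py: a hypothetical dbt Python model.
from pyspark.sql import functions as F


def model(dbt, session):
    # dbt injects `dbt` (config, refs) and `session` (a SparkSession).
    dbt.config(materialized="table")

    events = dbt.ref("flight_events")  # hypothetical upstream model

    # Roll per-event telemetry up to a daily flight-success metric.
    return (
        events.groupBy(F.to_date("ts").alias("flight_date"))
        .agg(
            F.countDistinct("flight_id").alias("flights"),
            F.avg(F.col("landed_ok").cast("double")).alias("flight_success_rate"),
        )
    )
```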
Turning Data into Action
A data lake is only as valuable as the applications it powers. Here are a few examples of what our data team builds:
- Proactive device health monitoring: We can proactively mitigate 98% of potential issues, ensuring drones are flight-ready before a pilot arrives on site.
- Connectivity maps: Visibility into 5G, LTE, and Wi-Fi signal strength to understand environmental link quality, which we then surface to pilots.
- Logdash: An interactive Python-based application used by hundreds of daily active users to analyze and debug newly uploaded logs in near real time.
- Live flight visibility: Using Spark Structured Streaming and Reverse ETL APIs, we provide a real-time heartbeat of global flight activity (see the sketch after this list).
- AI-driven discovery: We use AI and documentation tools to help non-technical users discover and query the data they need without a middleman.
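As promised above, here is a rough sketch of how a live-flight heartbeat can be built with Structured Streaming plus a reverse-ETL hop, again assuming an ambient `spark` session; the source table, endpoint, and payload shape are all hypothetical.

```python
import requests
from pyspark.sql import functions as F

# Hypothetical internal endpoint that powers the live flight map.
HEARTBEAT_URL = "https://internal.example.com/api/flight-heartbeat"


def push_heartbeats(batch_df, batch_id: int) -> None:
    # Reverse ETL: each micro-batch of updated flight states becomes one
    # POST to the serving API, keeping the map seconds behind reality.
    rows = [row.asDict() for row in batch_df.collect()]  # update batches stay small
    if rows:
        requests.post(
            HEARTBEAT_URL,
            json={"batch_id": batch_id, "flights": rows},
            timeout=10,
        )


(
    spark.readStream.table("telemetry.flight_status")  # hypothetical event table
    .groupBy("flight_id")
    .agg(F.max("ts").cast("string").alias("last_seen"))  # cast for JSON payload
    .writeStream.outputMode("update")
    .foreachBatch(push_heartbeats)
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/heartbeat")
    .start()
)
```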
Over the next year, our vision is to expand the Cloud product by integrating more of our internal data lake, so customers can see the same data we see.
The Road Ahead
The next posts in this series will dive into the projects we’ve delivered over the past couple of years to keep pace with the drone transformation. We’ll go deeper on challenges like schema collisions, automated system-health monitoring, and the concrete strategies we used to slash infrastructure costs while continuing to scale.
The journey to true autonomy is as much a data story as it is a robotics one. We’re excited to share what we built, why we built it, and what we learned along the way.
If you’re interested in helping build the data lake and the applications above, we’re hiring.