How an asynchronous ETL chain works: optimization and flexibility

A classic ETL chain runs in three steps that wait on each other: you extract everything, transform everything, then load everything. As long as extraction isn’t finished, transformation doesn’t start. As long as a single record is stuck, the whole batch waits. It’s easy to draw on a whiteboard, and it’s exactly what derails a reload at real volume.

In a synchronous chain, the whole moves at the pace of its slowest element. A stuck record doesn’t just slow itself down: it freezes every record behind it.

The batch is the problem, not the solution

Processing in batches gives a false sense of control. You launch the nightly batch, you wait for the morning report. Except a batch is an all-or-nothing object: it succeeds whole or it fails, and when it fails 80% of the way through, you often can’t say where. You replay everything. You pay again for the extraction and transformation of the 80% that was already fine, just to reach the remaining 20%.
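To put a number on that replay, here is a minimal sketch assuming a hypothetical batch of 100 records whose transformation fails on record 80. The names CountingTransform and run_batch are illustrative and not taken from any particular tool; the counter only measures how much work gets paid for twice.

```python
class CountingTransform:
    """Hypothetical transformation that counts how many times it is called."""

    def __init__(self, bad_values):
        self.bad_values = set(bad_values)
        self.calls = 0

    def __call__(self, record):
        self.calls += 1
        if record in self.bad_values:
            raise ValueError(f"cannot transform record {record!r}")
        return record * 2


def run_batch(records, transform):
    """All-or-nothing batch: one exception and no output survives."""
    return [transform(record) for record in records]


def demo():
    records = list(range(100))
    transform = CountingTransform(bad_values={80})
    try:
        run_batch(records, transform)       # fails on record 80, output is lost
    except ValueError:
        pass
    transform.bad_values.clear()            # the edge case gets "fixed"
    result = run_batch(records, transform)  # full replay, records 0 to 79 included
    return len(result), transform.calls
```

Under these assumptions, demo() returns (100, 181): 181 transformation calls to deliver 100 records, because the 80 records that had already succeeded are processed a second time.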

The higher the volume, the more this logic turns against you. A million records stuck behind one mishandled edge case, and the cutover window is gone.

Asynchronous decouples the steps

Going asynchronous means cutting the rigid links between extract, transform and load. Each step becomes an independent station, connected to the others by a queue:

extraction drops records into a queue, at its own pace;
transformation consumes that queue as soon as a record is available, without waiting for extraction to finish;
loading takes over the same way, downstream.

No step waits for the previous one to finish all of its work. They work in parallel on different records. While the last records are still being extracted, the first ones are already loaded.
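The three stations above can be sketched with Python's standard queue and threading modules. This is only an illustration under simple assumptions: in-memory queues stand in for a real message broker, the transformation is a placeholder (upper-casing a string), and the SENTINEL object is a hypothetical end-of-stream marker.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


def extract(source, out_q):
    """Drop records into the queue at the source's own pace."""
    for record in source:
        out_q.put(record)
    out_q.put(SENTINEL)


def transform(in_q, out_q):
    """Consume each record as soon as it is available."""
    while True:
        record = in_q.get()
        if record is SENTINEL:
            out_q.put(SENTINEL)  # pass the end-of-stream marker downstream
            return
        out_q.put(record.upper())  # placeholder transformation


def load(in_q, target):
    """Take over downstream, the same way."""
    while True:
        record = in_q.get()
        if record is SENTINEL:
            return
        target.append(record)


def run_pipeline(source):
    # Bounded queues: a fast station blocks on put() instead of flooding memory.
    extracted = queue.Queue(maxsize=10)
    transformed = queue.Queue(maxsize=10)
    target = []
    threads = [
        threading.Thread(target=extract, args=(source, extracted)),
        threading.Thread(target=transform, args=(extracted, transformed)),
        threading.Thread(target=load, args=(transformed, target)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return target
```

All three threads run at the same time, so a record can reach the target while extraction is still producing the next ones. With one consumer per queue, order is preserved; that stops being guaranteed once a station has several consumers.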

Optimization comes from parallelism and isolation

Once decoupled, each station is sized for what it does. Transformation is CPU-heavy? You multiply its consumers without touching extraction. The target system only accepts a limited number of connections? You cap the loading stage alone, without throttling the rest of the chain.
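As a rough sketch of sizing each station separately, the example below runs four transformation workers but only two loader workers, so the stand-in target never sees more than two concurrent loads. Everything here is an assumption made for illustration: the Target class, the worker counts and the doubling "transformation". Threads are used for brevity; for genuinely CPU-heavy transformation in Python, separate processes or machines would be the more likely choice.

```python
import queue
import threading
import time

STOP = object()  # sentinel telling a worker to shut down


class Target:
    """Stand-in target system; records the peak number of concurrent loads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0
        self.rows = []

    def load(self, record):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.001)  # simulated network round trip
        with self._lock:
            self.rows.append(record)
            self._active -= 1


def worker(in_q, handle):
    """Consume in_q until STOP, passing each record to handle."""
    while True:
        record = in_q.get()
        if record is STOP:
            return
        handle(record)


def run(records, transform_workers=4, max_connections=2):
    to_transform, to_load = queue.Queue(), queue.Queue()
    target = Target()
    transformers = [
        threading.Thread(target=worker, args=(to_transform, lambda r: to_load.put(r * 2)))
        for _ in range(transform_workers)
    ]
    # The cap on the target is simply the number of loader workers.
    loaders = [
        threading.Thread(target=worker, args=(to_load, target.load))
        for _ in range(max_connections)
    ]
    for t in transformers + loaders:
        t.start()
    for record in records:
        to_transform.put(record)
    for _ in transformers:
        to_transform.put(STOP)  # one sentinel per transformer
    for t in transformers:
        t.join()
    for _ in loaders:
        to_load.put(STOP)  # transformers are done, loaders drain and stop
    for t in loaders:
        t.join()
    return target
```

Changing transform_workers leaves the loading stage untouched, and changing max_connections leaves transformation untouched: the queue between them is the only contract.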

Above all, a failing record no longer blocks the queue. It goes to a dedicated error queue, where it is diagnosed, corrected and replayed individually, while the others keep moving forward. Throughput is no longer dictated by the worst case, but by the normal pace of the flow. It’s the same principle as per-record qualification: what passes moves forward, what fails is handled on the side.
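A minimal sketch of that routing, assuming in-memory queues and a placeholder transformation (parsing a string as an integer). The helper names drain and transform_stage are hypothetical. It runs single-threaded to keep the routing logic visible.

```python
import queue


def drain(q):
    """Return everything currently in q as a list."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


def transform_stage(in_q, out_q, error_q, transform):
    """Process every record in in_q; a failure is routed aside, not raised."""
    for record in drain(in_q):
        try:
            out_q.put(transform(record))
        except Exception as exc:  # broad on purpose: any failure goes to the error queue
            # Keep the record and the reason together so it can be diagnosed later.
            error_q.put({"record": record, "error": repr(exc)})


def demo():
    in_q, out_q, error_q = queue.Queue(), queue.Queue(), queue.Queue()
    for raw in ["10", "oops", "30"]:
        in_q.put(raw)
    transform_stage(in_q, out_q, error_q, int)
    return drain(out_q), [entry["record"] for entry in drain(error_q)]
```

Here demo() returns ([10, 30], ["oops"]): the record after the failing one still gets through, and the failing one waits in the error queue with its error message attached.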

Flexibility comes from the same break

Decoupling pays off a second time when the project changes — and it always changes. Adding a transformation rule, inserting an enrichment step, plugging in a second target: in an asynchronous chain, that’s one more consumer on an existing queue. You don’t rebuild the pipeline, you graft a station onto it.
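To illustrate grafting a station, the sketch below uses one generic station function and wires an enrichment step plus a second target into an existing chain. All names and the "region" field are hypothetical. One caveat worth stating: on a plain work queue, two consumers compete for the same records, so a second target that needs every record requires fan-out. Here that is modelled with one output queue per target; message brokers typically provide it through publish/subscribe mechanisms.

```python
import queue
import threading

STOP = object()  # hypothetical end-of-stream marker


def station(in_q, out_qs, fn):
    """Apply fn to each record and forward the result to every output queue."""
    while True:
        record = in_q.get()
        if record is STOP:
            for q in out_qs:
                q.put(STOP)
            return
        result = fn(record)
        for q in out_qs:
            q.put(result)


def sink(in_q, target):
    """Terminal station: append each record to a list standing in for a target."""
    while True:
        record = in_q.get()
        if record is STOP:
            return
        target.append(record)


def run(records):
    raw, transformed, to_main, to_archive = (queue.Queue() for _ in range(4))
    main_target, archive_target = [], []
    threads = [
        # Existing transformation station.
        threading.Thread(target=station, args=(raw, [transformed], lambda r: {"id": r})),
        # Grafted enrichment station; it also fans out to a second target.
        threading.Thread(
            target=station,
            args=(transformed, [to_main, to_archive], lambda r: {**r, "region": "EU"}),
        ),
        threading.Thread(target=sink, args=(to_main, main_target)),
        threading.Thread(target=sink, args=(to_archive, archive_target)),
    ]
    for t in threads:
        t.start()
    for record in records:
        raw.put(record)
    raw.put(STOP)
    for t in threads:
        t.join()
    return main_target, archive_target
```

The transformation station is unchanged by the graft: it still reads from one queue and writes to another, unaware of what was added downstream.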

Replay benefits just as much. A rule fixed mid-course doesn’t force a rerun of the entire reload: you reinject only the affected records, from their queue. The chain absorbs the correction instead of starting from scratch.
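A small sketch of such a targeted replay, assuming a hypothetical parsing rule that initially rejects decimal commas and is then corrected. The names parse_v1, parse_v2 and run_stage are illustrative only.

```python
import queue


def drain(q):
    """Return everything currently in q as a list."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


def run_stage(in_q, out_q, error_q, rule):
    """Apply rule to every queued record; return how many records were processed."""
    processed = 0
    for record in drain(in_q):
        processed += 1
        try:
            out_q.put(rule(record))
        except ValueError:
            error_q.put(record)  # set aside, the rest keeps moving
    return processed


def parse_v1(raw):
    return float(raw)  # initial rule: rejects "2,5"


def parse_v2(raw):
    return float(raw.replace(",", "."))  # corrected rule: accepts decimal commas


def demo():
    in_q, out_q, error_q = queue.Queue(), queue.Queue(), queue.Queue()
    for raw in ["1.0", "2,5", "4.0"]:
        in_q.put(raw)
    run_stage(in_q, out_q, error_q, parse_v1)
    # Reinject only the records that failed, then rerun with the fixed rule.
    for record in drain(error_q):
        in_q.put(record)
    replayed = run_stage(in_q, out_q, error_q, parse_v2)
    return drain(out_q), replayed
```

In this sketch demo() returns ([1.0, 4.0, 2.5], 1): the second pass touches a single record, and the two that had already passed are never reprocessed.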

What it changes for a migration

A migration isn’t a one-off transfer you succeed or fail at in one block. It’s a flow of records, each with its own verdict. A synchronous chain forces that flow to behave like a monolithic batch, with all the fragility that implies.

Asynchronous gives the flow back its true nature: independent units that move forward, fail and get corrected without blocking one another. This is the foundation that makes the Run / Fix loops of our methodology possible. Optimization and flexibility aren’t two options you bolt on afterward — they are the direct consequences of having stopped making everything wait on everyone.