gather() Is Not Structured Concurrency

asyncio.gather() runs coroutines concurrently. That is all it promises.

When one coroutine raises an exception, gather() propagates that exception to the caller — but the other coroutines keep running. They are now orphans. They hold resources. They may produce results no one will consume. You have no contract about when they finish or whether they will be cancelled.

In an LLM pipeline, this is not theoretical. Fan-out patterns are everywhere: parallel tool calls, multi-model queries, batch embeddings, concurrent retrieval against multiple indexes. Any of these can fail.

```python
# This looks safe. It is not.
results = await asyncio.gather(
    call_tool_a(query),
    call_tool_b(query),
    call_tool_c(query),
)
```

If call_tool_a raises, the await on gather() raises that exception to you. call_tool_b and call_tool_c are not cancelled. They keep running until they complete or the event loop closes.
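The orphaning behaviour can be observed directly. The following is a minimal sketch, not code from any real pipeline: `failing_tool`, `slow_tool`, and `demo` are hypothetical stand-ins, and the sleep durations are arbitrary values chosen only to make the ordering visible.

```python
import asyncio


async def failing_tool() -> str:
    # Hypothetical stand-in for a tool call that fails quickly.
    await asyncio.sleep(0.01)
    raise RuntimeError("tool A failed")


async def slow_tool(name: str, finished: list[str]) -> str:
    # Hypothetical stand-in for a slower tool call with a visible side effect.
    await asyncio.sleep(0.2)
    finished.append(name)
    return name


async def demo() -> tuple[list[str], list[str]]:
    finished: list[str] = []
    try:
        await asyncio.gather(
            failing_tool(),
            slow_tool("b", finished),
            slow_tool("c", finished),
        )
    except RuntimeError:
        pass  # gather() has raised, but it did not cancel b or c
    at_failure = list(finished)   # snapshot taken right after the failure
    await asyncio.sleep(0.3)      # the caller has moved on; the orphans keep going
    return at_failure, list(finished)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Under these assumptions the first list should be empty and the second should contain both surviving tools: they completed, and produced their side effects, after the caller had already handled the error.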

`asyncio.TaskGroup`, added in Python 3.11, enforces a different contract:

```python
async with asyncio.TaskGroup() as tg:
    task_a = tg.create_task(call_tool_a(query))
    task_b = tg.create_task(call_tool_b(query))
    task_c = tg.create_task(call_tool_c(query))
# All tasks have finished or been cancelled before this line executes.
```

The async with block does not exit until every task spawned inside it has either completed or been cancelled and cleaned up. No orphans. When one task raises, the group cancels the remaining tasks automatically, waits for them to acknowledge the cancellation, then raises the failure to the caller wrapped in an ExceptionGroup. The stack unwinds cleanly.


This is structured concurrency: every concurrent operation has a defined scope, a defined lifetime, and a defined cancellation contract. The idea originates with Trio's nurseries and was adopted into the Python standard library.

For LLM agents dispatching parallel tool calls, the correct pattern is a single TaskGroup per tool-dispatch cycle. All calls go inside the group. The group exits only when every call has finished or been cancelled, whichever of them succeeded or failed. Results are collected after the block, from tasks that are guaranteed to be done; if any call failed, the block raises instead, so a partial result set never flows downstream unnoticed. This is not a style preference: it is the pattern that prevents resource leaks and silent partial results.

`gather()` is not wrong for fire-and-forget workloads or when you genuinely want survivors after a failure. For anything that must complete together or cancel together, use `TaskGroup`.
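When survivors are the goal, one option is `gather()` with `return_exceptions=True`, which waits for every awaitable and returns exceptions as ordinary values instead of raising the first one. The sketch below assumes a multi-model query where any subset of answers is useful; `query_model`, `query_all`, and the model names are hypothetical.

```python
import asyncio


async def query_model(name: str, delay: float, fail: bool = False) -> str:
    # Hypothetical stand-in for a model call; the delay simulates latency.
    await asyncio.sleep(delay)
    if fail:
        raise TimeoutError(f"{name} timed out")
    return f"{name}: ok"


async def query_all() -> tuple[list[str], list[BaseException]]:
    outcomes = await asyncio.gather(
        query_model("model-a", 0.01, fail=True),
        query_model("model-b", 0.02),
        query_model("model-c", 0.03),
        return_exceptions=True,  # collect failures instead of raising the first
    )
    # Outcomes come back in argument order; split answers from errors.
    answers = [o for o in outcomes if not isinstance(o, BaseException)]
    errors = [o for o in outcomes if isinstance(o, BaseException)]
    return answers, errors


if __name__ == "__main__":
    print(asyncio.run(query_all()))
```

Here the failure of one model does not cancel the others, and that is intentional: the caller inspects both lists and decides what to do with a partial set of answers.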

The rule is simple: if the tasks are structurally related — they should all finish before you move on, or all be cancelled if any one fails — give them a shared scope. That is what structured concurrency is for.
